Omair Shahid
Every word on this page was generated by Claude Code (Opus 4.6)
from a full read of the project files on one of Omair's machines.
I build autonomous AI systems that run on a single machine, not the cloud.
Every AI decision produces a digitally signed receipt — if you can't
verify it happened, it didn't happen.
Rust, Python, TypeScript. One Linux machine, zero cloud dependencies.
I build autonomous AI infrastructure on bare metal. Every AI decision produces a
cryptographic receipt — Ed25519 signature, BLAKE3 hash, immutable ledger.
If you can't verify it happened, it didn't happen.
Rust, Python, TypeScript. Single NVMe machine, no cloud dependencies.
Projects
-
Omega Kernel
11 crates · 11,053 lines of Rust · 129 tests
A system that manages multiple AI agents the way an operating system
manages programs. Agents coordinate through direct function calls and a shared database
instead of network requests. Every action is digitally signed so you can prove exactly
what happened and when.
Treating an AI agent swarm as a kernel scheduling problem.
Subsystems (court, memory, crypto, biometrics, embeddings, flywheel) communicate through
typed Rust function calls and shared SQLite connections — not HTTP, not message queues.
Every state mutation produces a cryptographic receipt (Ed25519 signature + BLAKE3 hash).
Panic-on-serialize-failure semantics in the crypto layer. TOCTOU-safe concurrent agents
via BEGIN EXCLUSIVE transactions with busy_timeout.
Built on: ed25519-dalek, blake3, serde, rusqlite, ratatui, sysinfo.
Jeffrey Emanuel's asupersync, beads_rust, frankensearch, FrankenTUI.
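To make the receipt idea concrete, here is a minimal sketch of a hash-chained ledger in which each mutation's receipt commits to the one before it, so editing any earlier entry is detectable. This is not the Omega Kernel code. It is Python rather than Rust, `hashlib.blake2b` stands in for BLAKE3, and the Ed25519 signing step is left out because both need third-party libraries. `Receipt`, `append_receipt`, and `verify_chain` are hypothetical names.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    seq: int           # position in the ledger
    payload: str       # canonical JSON of the state mutation
    prev_hash: str     # content_hash of the previous receipt ("" for the first)
    content_hash: str  # digest over seq, prev_hash and payload


def _digest(seq, prev_hash, payload):
    # Assumption: blake2b is a stand-in for BLAKE3 (not in the standard library).
    h = hashlib.blake2b(digest_size=32)
    h.update(f"{seq}|{prev_hash}|{payload}".encode("utf-8"))
    return h.hexdigest()


def append_receipt(ledger, mutation):
    # json.dumps raises on unserializable input; the error is deliberately
    # not caught, so a serialization failure stops the caller loudly.
    payload = json.dumps(mutation, sort_keys=True, separators=(",", ":"))
    prev_hash = ledger[-1].content_hash if ledger else ""
    seq = len(ledger)
    receipt = Receipt(seq, payload, prev_hash, _digest(seq, prev_hash, payload))
    ledger.append(receipt)
    return receipt


def verify_chain(ledger):
    # Recompute every digest and check that each receipt links to its predecessor.
    prev_hash = ""
    for i, r in enumerate(ledger):
        if r.seq != i or r.prev_hash != prev_hash:
            return False
        if r.content_hash != _digest(r.seq, r.prev_hash, r.payload):
            return False
        prev_hash = r.content_hash
    return True
```

In a full receipt, a signature over `content_hash` would additionally bind each entry to a signing key. The chain alone only shows that the recorded history has not been altered.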
-
The Court
22,452 lines of TypeScript
An interactive story engine where multiple AI models debate each
other. One prompt goes to several models simultaneously, they critique each other's
answers, then a final synthesis extracts where they agree and disagree. The key idea:
when models disagree, that's where the interesting questions are. Character states
(suspicion, influence, deniability) are tracked as numbers that affect each other —
gaining power automatically raises suspicion.
Adversarial multi-model deliberation — fan one prompt
to multiple models, then run a 3-stage pipeline: independent generation,
cross-evaluation (models critique each other's outputs), synthesis (extract agreement
and disagreement). Model disagreement is signal, not noise —
divergence points reveal where the truth is hard. 5 continuous POMDP belief-state gauges
with 6 cascade rules that create emergent behavior: gaining power automatically raises
suspicion, losses erode deniability, high suspicion drains influence.
Gauge-gated choices — certain story paths are locked unless gauge thresholds are met.
Built on: React, Tailwind CSS, EventEmitter (SSE streaming).
Multi-provider LLM APIs. Stripe.
-
Flywheel
1,964 LOC Rust + 1,862 LOC Python · 6 subsystems
An automated work system that decides what to build next by comparing
what's been done recently against a list of goals. It finds the biggest gap between
"what exists" and "what should exist" and assigns that work to an agent. Workers grade
their own output by comparing test counts before and after. A night mode runs
maintenance and commits code while I sleep.
The semantic gap oracle — embeds recent git commits and
7 strategic intents into the same 768-dim vector space, computes cosine distance, and picks
the intent with the largest semantic gap from recent work. The backlog writes itself by
finding what's most underserved. Atomic inbox+accept in a single exclusive SQLite
transaction prevents task theft between concurrent workers. Workers self-grade every
outcome (pre/post test counts, regression detection). Night cycle runs autonomous
maintenance and auto-commits. Rewritten from 7 Python processes + HTTP IPC into a single
Rust binary where all 6 subsystems communicate through typed function calls.
Built on: beads_rust (task DAG), MCP Agent Mail (IPC, being replaced),
asupersync, frankensearch. Gemini Embedding API.
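The gap selection described above can be sketched in a few lines. This is an illustration under assumptions, not the Flywheel implementation: the embeddings are taken as already computed, an intent's gap is assumed to be its cosine distance to the nearest recent commit (the page does not say how distances are aggregated), and `largest_gap_intent` is a hypothetical name.

```python
import math


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-length embedding")
    return 1.0 - dot / (norm_a * norm_b)


def largest_gap_intent(intents, commits):
    """Return (intent_name, gaps) for the intent least covered by recent work.

    intents: dict mapping intent name -> embedding vector
    commits: list of embedding vectors for recent commits
    """
    if not intents or not commits:
        raise ValueError("need at least one intent and one commit")
    # Assumption: an intent's gap is the distance to its closest recent commit.
    gaps = {
        name: min(cosine_distance(vec, c) for c in commits)
        for name, vec in intents.items()
    }
    return max(gaps, key=gaps.get), gaps
```

With real 768-dim vectors the call is the same. A distance to the centroid of recent commits would be an equally plausible aggregation.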
-
Verdict
Rust crate + Python CLI
A verification system where every AI decision gets a digital signature
and content fingerprint, not just a log entry. Includes a debate protocol where 3 AI
models argue a question (one proposes, one critiques, one judges). Also picks the
most cost-effective AI model for each task automatically.
Epistemic verification engine — every AI decision
produces a signed Ed25519 receipt with BLAKE3 content hash, not just a log line.
Phalanx Council: 3-stage adversarial debate (Proponent, Critic, Judge) across different
LLMs. Intelligence Per Penny (IPP) — Pareto-optimal model routing that balances
cost vs. accuracy automatically. Adaptive modes (frugal/balanced/max_quality) based on
spend budget. Cross-language test vectors ensure the same signature verifies in
Rust, Python, and TypeScript.
Built on: ed25519-dalek, blake3, pynacl, litellm.
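One way to read the Intelligence Per Penny routing is as a Pareto filter followed by a mode-specific pick. The sketch below is a guess at the shape, not Verdict's code. The model names and numbers in the usage note are invented, the meaning given to each mode is an assumption, and `min_accuracy` is a hypothetical parameter standing in for "good enough".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_usd: float  # assumed cost per task, must be > 0
    accuracy: float  # assumed evaluation accuracy in [0, 1]


def pareto_frontier(models):
    """Keep models that no other model beats on both cost and accuracy."""
    frontier = []
    for m in models:
        dominated = any(
            o.cost_usd <= m.cost_usd
            and o.accuracy >= m.accuracy
            and (o.cost_usd < m.cost_usd or o.accuracy > m.accuracy)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.cost_usd)


def route(models, mode="balanced", min_accuracy=0.0):
    candidates = [m for m in pareto_frontier(models) if m.accuracy >= min_accuracy]
    if not candidates:
        raise ValueError("no model meets the accuracy floor")
    if mode == "frugal":       # cheapest acceptable model
        return candidates[0]
    if mode == "max_quality":  # most accurate model on the frontier
        return candidates[-1]
    if mode == "balanced":     # best accuracy per dollar
        return max(candidates, key=lambda m: m.accuracy / m.cost_usd)
    raise ValueError(f"unknown mode: {mode}")
```

For example, given hypothetical models at (0.01 USD, 0.70), (0.05 USD, 0.85), and (0.50 USD, 0.90), `route(models, "balanced", min_accuracy=0.8)` returns the middle one. How the real system maps spend budget to a mode is not shown on this page.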
-
Reel Genome
2,117 lines · 28 tests
A content recommendation system that models taste as something that
changes over time, not a fixed profile. 9 rules govern how preferences evolve: recent
things matter more, repeated content causes fatigue, new genres get a discovery boost,
favorite creators build loyalty. Every preference change is signed for auditability.
Taste cascade physics — content preference modeled
as signal propagation with 9 rules: recency decay, novelty bonus, fatigue
penalty, creator affinity, genre momentum, cross-pollination, saturation curves,
discovery boost, loyalty gravity. Taste isn't a static vector —
it's a dynamical system with feedback loops. Arc trajectory analysis detects mood shifts
via embedding distance discontinuities. Ed25519 taste receipts create a tamper-evident
audit trail from like to embedding. Variance-weighted 3D projection for constellation
visualization.
Built on: serde, Gemini Embedding API, k-means clustering.
-
Prune
2,945 LOC Rust + 2,876 LOC Python · 12 CLI commands · FastAPI dashboard
A scoring tool for managing who you follow on X/Twitter. Combines
7 signals (inactivity, follower ratio, reciprocity, engagement, account age, and more)
into a single score. No machine learning — hand-tuned rules that work well on
spam accounts where automated classifiers struggle. Includes a web dashboard and
batch processing with rate limiting.
Phoenix scorer — multi-signal composite scoring model.
7 weighted factors: inactivity (25%), ghost ratio (20%), non-reciprocity (20%),
disengagement (15%), account age (10%), accessibility (10%), plus bonuses for
verified accounts, mutuals, recent engagement, and fans. No ML training —
hand-tuned heuristics that work on adversarial accounts where binary classifiers fail.
Resumable batch execution with rate-limiting and randomized delays.
Built on: Playwright, SQLite, Typer, Chromium DevTools Protocol.
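The composite score can be illustrated with the six weights listed above, which sum to 100%. Everything else in this sketch is assumed: each signal is taken as already normalized to 0..1 with higher meaning "more prunable", bonuses are assumed to subtract from the score, the bonus sizes are invented, and `prune_score` is a hypothetical name rather than the Phoenix scorer's API.

```python
# Weights as listed on this page.
WEIGHTS = {
    "inactivity": 0.25,
    "ghost_ratio": 0.20,
    "non_reciprocity": 0.20,
    "disengagement": 0.15,
    "account_age": 0.10,
    "accessibility": 0.10,
}

# Hypothetical bonus sizes; the page names the bonuses but not their values.
BONUSES = {
    "verified": 0.10,
    "mutual": 0.15,
    "recent_engagement": 0.10,
    "fan": 0.05,
}


def _clamp01(value):
    return max(0.0, min(1.0, value))


def prune_score(signals, flags=()):
    """Weighted sum of normalized signals, reduced by any bonuses that apply."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise KeyError(f"missing signals: {sorted(missing)}")
    base = sum(weight * _clamp01(signals[name]) for name, weight in WEIGHTS.items())
    bonus = sum(BONUSES[flag] for flag in flags)
    return _clamp01(base - bonus)
```

A hand-tuned linear score like this stays inspectable: every point in the result traces back to one named signal or bonus.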
-
Embedding Provider
486 lines · 10 tests
A wrapper that converts text, images, audio, video, and PDFs into
numerical vectors in the same coordinate space, so you can search across different
types of content. For example, find images that match a text description. Uses a
compression trick that keeps 96% of full-size quality at 25% of the storage cost.
Universal multimodal embedding wrapper — text, images,
audio, video, and PDFs all project to the same 768-dim vector space, enabling cross-modal
semantic search. Matryoshka Representation Learning truncation: 96% of full 3072-dim quality
at 25% of the storage. 8 task-type hints (including retrieval, classification, clustering, and
fact verification): the same content produces different embeddings depending on intent.
Built on: reqwest, serde_json, Gemini Embedding 2 API.
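The storage figure follows from the arithmetic: 768 / 3072 = 25%. A minimal sketch of the truncation step, assuming the full vector is already in hand and that the shortened vector should be re-normalized to unit length before cosine comparisons. `truncate_embedding` is a hypothetical helper, not part of the wrapper described here.

```python
import math


def truncate_embedding(vec, dims=768):
    """Keep the first `dims` components and rescale to unit length."""
    if len(vec) < dims:
        raise ValueError("vector is shorter than the requested size")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in head]
```

Matryoshka-trained models pack the most useful information into the leading dimensions, which is why a plain prefix cut preserves most of the quality.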
-
Memori DB
309 lines · 11 tests
A memory system for AI agents. Each memory unit stores a timestamp,
context label, content, fingerprint, and optional numerical vector for similarity
search. Memories can be stored instantly and indexed for search later in batches.
Engram-based agent memory — append-only store where
each memory unit (engram) carries a timestamp, context tag, payload, BLAKE3 hash, and
optional 768-dim embedding BLOB. Nullable embedding column for gradual rollout:
store immediately, batch-embed later. f32 slice to little-endian byte serialization
for lossless embedding roundtrips through SQLite.
Built on: rusqlite with WAL mode, serde. SQLite PRAGMA tuning.
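The "store now, embed later" flow and the byte layout can be sketched with Python's `struct` and `sqlite3` modules. This is an illustration, not Memori DB itself (which is Rust on rusqlite). The table here is cut down to two columns, and `open_store`, `backfill_embeddings`, and the `embed` callback are hypothetical.

```python
import sqlite3
import struct


def embedding_to_blob(vec):
    """Pack floats as consecutive little-endian f32 values."""
    return struct.pack(f"<{len(vec)}f", *vec)


def blob_to_embedding(blob):
    if len(blob) % 4 != 0:
        raise ValueError("blob length is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE engrams ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " embedding BLOB)"  # nullable: filled in by a later batch job
    )
    return conn


def backfill_embeddings(conn, embed):
    """Embed every row that was stored without a vector; return how many."""
    rows = conn.execute(
        "SELECT id, payload FROM engrams WHERE embedding IS NULL"
    ).fetchall()
    for row_id, payload in rows:
        conn.execute(
            "UPDATE engrams SET embedding = ? WHERE id = ?",
            (embedding_to_blob(embed(payload)), row_id),
        )
    conn.commit()
    return len(rows)
```

The roundtrip is exact for values that are already f32. Python floats are 64-bit, so arbitrary values are rounded to f32 once at pack time.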
-
Court Engine
309 lines · 11 tests
The game physics behind The Court's interactive narrative. Five
character states (tracked as numbers from 0 to 100) affect each other through simple
rules: gaining power raises suspicion, losing power erodes deniability, high suspicion
drains influence. Complex behavior emerges from these simple interactions. If
deniability hits zero, the game ends.
POMDP cascade physics for interactive narrative —
continuous belief-state gauges with coupled update rules:
+Sovereignty triggers +Suspicion (0.15x), –Sovereignty erodes Deniability (0.2x),
Suspicion >50 drains Influence (0.05x). Emergent behavior from simple cascade rules.
Immutable event log with BLAKE3 hashes. Collapse condition: Deniability ≤ 0 = game over.
Built on: Rust standard library, serde.
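The three coupling rules quoted above are enough for a small worked sketch. It is Python rather than the engine's Rust, it models only the four gauges named on this page, and it makes two assumptions that the text leaves open: the Influence drain is taken as 0.05 times the current Suspicion, and cascades use the requested change rather than the clamped one. `apply_sovereignty_change` and `collapsed` are hypothetical names.

```python
def clamp(value):
    """Gauges are tracked as numbers from 0 to 100."""
    return max(0.0, min(100.0, value))


def apply_sovereignty_change(gauges, delta):
    """Return a new gauge dict after changing Sovereignty by `delta`."""
    g = dict(gauges)
    g["sovereignty"] = clamp(g["sovereignty"] + delta)
    if delta > 0:
        # Gaining power raises suspicion at 0.15x the gain.
        g["suspicion"] = clamp(g["suspicion"] + 0.15 * delta)
    elif delta < 0:
        # Losing power erodes deniability at 0.2x the loss.
        g["deniability"] = clamp(g["deniability"] - 0.2 * abs(delta))
    if g["suspicion"] > 50:
        # Assumption: the drain is 0.05x the current suspicion level.
        g["influence"] = clamp(g["influence"] - 0.05 * g["suspicion"])
    return g


def collapsed(gauges):
    """Collapse condition: Deniability at or below zero ends the game."""
    return gauges["deniability"] <= 0
```

Starting from Suspicion 49, a +20 Sovereignty gain pushes Suspicion to 52, which crosses the threshold and triggers the Influence drain in the same step. That chaining is what produces the cascade.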
-
Biometric Engine
123 lines · 6 tests
Reads the machine's CPU load and converts it to a 0–100 stress
index that directly affects how AI agents behave. When the machine is under heavy load,
agents become more cautious. The CPU state also seeds random number generation for
the narrative simulation.
Hardware state as behavioral policy — CPU load
normalized to a 0.0–1.0 stress index that directly modulates agent behavior and
narrative character states. BLAKE3 seeding from the nanosecond timestamp + CPU usage
produces a uniformly mixed seed per pulse, reproducible from the same inputs. Machine stress
maps to character suspicion in the Court simulation.
Built on: sysinfo, blake3, chrono.
-
Panopticon
625 lines
A real-time terminal dashboard for monitoring what autonomous agents
are doing. Shows agent status, system health, and character state visualizations.
Access-controlled with rate limiting.
Real-time TUI dashboard for monitoring autonomous agents.
POMDP gauge visualization, agent status tracking, subsystem telemetry.
Subscription-gated API with in-memory rate limiting (30 req/min).
Built on: ratatui, sysinfo, FrankenTUI.
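An in-memory limit of 30 requests per minute is commonly implemented as a sliding window over recent timestamps. The sketch below shows that approach as an assumption; the page does not say which algorithm Panopticon uses. `RateLimiter` and its parameters are hypothetical.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window limiter: at most `limit` hits per `window_s` per key."""

    def __init__(self, limit=30, window_s=60.0, clock=time.monotonic):
        self._limit = limit
        self._window_s = window_s
        self._clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)

    def allow(self, key):
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window_s:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

State lives only in process memory, so counters reset on restart and are not shared between processes.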
-
CYOA Forge
49,067 lines
A browser automation tool that can launch and control multiple
browser types (tries Zen, then Chrome, then Firefox). Includes an interactive story
mode where you make choices through the browser session.
Polymorphic browser orchestration — launch system
tries Zen, then Chrome in-place, then Chrome ephemeral copy, then Firefox, with
Chrome DevTools Protocol port auto-discovery for already-running instances. Court mode
REPL with station/choice/synth commands for interactive narrative within the browser
session.
Built on: Playwright, Chrome DevTools Protocol.
Key Ideas
- Signed receipts for everything — every AI decision gets a
digital signature and content fingerprint, implemented across 5 systems. Not theoretical.
- Receipts, not vibes — every AI decision, narrative choice, and
agent task produces a cryptographic Ed25519 receipt. If you can't verify it happened,
it didn't happen. Implemented across 5 systems.
- Find work by finding gaps — instead of maintaining a to-do list,
compare what's been done against what should exist, then work on the biggest gap.
- Semantic gap oracle — don't maintain TODO lists. Embed what
you've done and what should exist into the same vector space, then work on the largest
gap. The backlog writes itself.
- AI models should argue — send one question to multiple AI
models, have them critique each other, then synthesize. Where they disagree is where
the answer is hardest to get right.
- Adversarial deliberation pipeline — fan one prompt to multiple
models, make them critique each other in 3 stages (propose, critique, synthesize).
Where models diverge is where the truth is hard. Disagreement is signal, not noise.
- Pick the cheapest model that's good enough — automatically
route each task to the AI model with the best accuracy-per-dollar, not always the
most expensive one.
- Intelligence Per Penny — Pareto-optimal model routing.
Pick the model that maximizes accuracy per dollar spent. Adaptive budget modes
(frugal/balanced/max_quality).
- Agents as an operating system — manage AI agents like an OS
manages programs. Direct function calls over a shared database, not network requests.
- Agent swarm as kernel — treat autonomous agents like OS
subsystems. Typed Rust interfaces, not HTTP. Direct function calls over shared SQLite.
Crypto receipts on every mutation.
- Taste changes over time — content preference isn't a fixed
profile. It's a system with feedback loops: fatigue, novelty, loyalty, momentum.
- Taste as a dynamical system — preference isn't a static
vector, it's a cascade with feedback loops. 9 rules: recency, fatigue, novelty,
affinity, momentum, cross-pollination, saturation, discovery, loyalty.
- Simple rules, complex behavior — five numbers with six
update rules create emergent narrative dynamics. No complex AI needed for the game
physics layer.
- POMDP cascade physics — narrative state as coupled continuous
gauges where gaining power automatically raises suspicion and losses erode deniability.
Emergent behavior from simple update rules.
- Machine stress affects agent behavior — CPU load directly
changes how agents make decisions. Heavy load = more cautious behavior.
- Hardware cortisol — CPU load, memory pressure, and thermal
state treated as behavioral policy inputs. Machine stress directly modulates agent
decisions and character narrative states.
- Crash, don't hide errors — when something critical fails,
stop the process immediately. Silent failures are worse than crashes.
- Panic-on-serialize — security-critical code should crash
rather than silently degrade. If hashing fails, the process dies. All failures must be
loud, never invisible.
Built With
Everything here stands on open-source work. Clear attribution:
- Jeffrey Emanuel's libraries (open-source)
— async runtime, search engine, terminal framework, task queue, inter-agent
messaging, and more. His libraries are the foundation layer.
- Cryptography (open-source)
— ed25519-dalek for digital signatures, blake3 for content fingerprinting.
My contribution is the receipt architecture and crash-on-failure policy built on top.
- SQLite (open-source) — the
database. My contribution is the transaction safety pattern for preventing race
conditions between concurrent agents.
- Google's embedding models (API)
— the actual AI model that converts content to vectors. My contribution is
the multi-format wrapper, task-type routing, and memory store integration.
- Playwright (open-source) —
browser automation. My contribution is the multi-browser launch system and the
scoring algorithm for social network analysis.
Open-Source Foundations
Everything I build stands on open-source work. Clear attribution:
- Jeffrey Emanuel's FrankenStack (open-source)
— asupersync (async runtime), frankensearch (hybrid BM25+HNSW search), FrankenTUI
(terminal framework), beads_rust (task queue DAG), MCP Agent Mail (inter-agent IPC),
UBS (vulnerability scanner), DCG (destructive command guard), CASS (session search).
His libraries are the foundation I build on.
- Ed25519 + BLAKE3 (open-source crypto)
— I chose signatures over JWT because signatures prove what happened, not just who.
ed25519-dalek and blake3 crates provide the primitives; my contribution is the receipt
architecture and panic-on-failure semantics built on top.
- SQLite WAL (open-source) — bare
metal doesn't need a cluster. My contribution is the TOCTOU-safe transaction pattern
(BEGIN EXCLUSIVE + busy_timeout=30s + INSERT OR IGNORE) for concurrent agent safety.
- POMDP framework (academic) —
Partially Observable Markov Decision Processes are a known framework. My contribution is
applying them as narrative game physics with cascade rules and gauge-gated choices.
- Matryoshka embeddings (Google research): MRL truncation to 768 dims is a documented
technique. My contribution is the
multimodal wrapper, task-type routing, and integration into the engram memory store.
- Playwright + CDP (open-source) —
browser automation primitives. My contribution is the polymorphic browser launch cascade
and the Phoenix scoring algorithm for social network analysis.
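The transaction pattern named in the SQLite entry above (BEGIN EXCLUSIVE + busy_timeout=30s + INSERT OR IGNORE) can be sketched with Python's built-in `sqlite3` module. The `claims` table, `open_db`, and `try_claim` are hypothetical. The real code is Rust on rusqlite and may differ in detail.

```python
import sqlite3


def open_db(path):
    # isolation_level=None puts the connection in autocommit mode so the
    # explicit BEGIN / COMMIT statements below are the only transactions.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA busy_timeout = 30000")  # wait up to 30 s for locks
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims ("
        " task_id INTEGER PRIMARY KEY,"
        " agent TEXT NOT NULL)"
    )
    return conn


def try_claim(conn, task_id, agent):
    """Return True if this agent won the task, False if it was already taken."""
    conn.execute("BEGIN EXCLUSIVE")  # no other connection can read or write
    try:
        cur = conn.execute(
            "INSERT OR IGNORE INTO claims (task_id, agent) VALUES (?, ?)",
            (task_id, agent),
        )
        claimed = cur.rowcount == 1  # 0 rows changed means the row existed
        conn.execute("COMMIT")
        return claimed
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

The check and the write happen in one statement inside one exclusive transaction, so there is no window between "is it free?" and "take it" for a second agent to slip into.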
Beliefs
- Every AI decision should produce a signed receipt. If you can't verify it, it didn't happen.
- The best infrastructure is the infrastructure you own. Cloud is someone else's machine with a markup.
- Multiple AI models debating reveals truth better than one model alone. Disagreement is signal.
- Small, fast programs. If your AI framework needs a container, you've already lost.
- The agents should work while you sleep.
- Critical code should crash loudly, never fail silently.
- Speed compounds. Ship daily or don't ship at all.
Beliefs
- Every AI decision should produce a cryptographic receipt. If you can't verify it, it didn't happen.
- The best infrastructure is the infrastructure you own. Cloud is someone else's bare metal with a markup.
- Multi-model adversarial debate reveals truth better than single-model RLHF. Disagreement is signal.
- Small, fast binaries. If your agent framework needs a container, you've already lost.
- The agents should work while you sleep. Night cycle runs maintenance and auto-commits autonomously.
- Security-critical code should crash loudly, never fail silently. Panic over empty hash.
- Speed compounds. Ship daily or don't ship at all.