Agentic Coding
Systems that can read large repositories, plan implementation paths, edit code, run tools, validate behavior, and present changes for human review.
Research
CodexDeus studies the full stack required for autonomous coding agents: models, memory, tools, test execution, browser validation, source control, deployment awareness, and human review.
Tracks
Systems that can read large repositories, plan implementation paths, edit code, run tools, validate behavior, and present changes for human review.
Benchmarks and workflows that measure whether an agent completed useful work, not just whether it produced plausible text.
Data pipelines built around valuable human work patterns, source-grounded knowledge, software history, domain expertise, and operator feedback.
Agents that can safely use shells, browsers, APIs, documents, design systems, data sources, deployment targets, and other real-world interfaces.
Research into memory, error recovery, critique, replay, regression checks, and feedback loops that improve agent performance over time.
Turning lab work into reliable AI agent products, beginning with Stobon as the first applied CodexDeus project.
Validation
A coding agent is useful only if it can close the loop between intention and behavior. CodexDeus treats builds, tests, browser checks, linting, screenshots, source review, and operator feedback as first-class parts of the model environment.
The goal is agents that do not merely generate code but learn to prove that the code works inside the systems people actually use.
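As a rough illustration of treating checks as part of the agent's environment, the sketch below runs a list of named commands and records which ones passed. It is a minimal example under stated assumptions, not a description of CodexDeus internals: the names `CheckResult`, `run_checks`, and `all_passed` are hypothetical, and the commands in the demo are placeholders for a project's real build, test, and lint steps.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one validation step (hypothetical structure for illustration)."""
    name: str
    passed: bool
    output: str


def run_checks(checks, cwd=None, timeout=600):
    """Run each (name, command) pair and record whether it exited with status 0.

    `checks` is a list of (name, argv-list) tuples. A command that cannot be
    started or that exceeds the timeout is recorded as a failed check.
    """
    results = []
    for name, command in checks:
        try:
            proc = subprocess.run(
                command,
                cwd=cwd,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            results.append(
                CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            results.append(CheckResult(name, False, str(exc)))
    return results


def all_passed(results):
    """True only when every recorded check passed."""
    return all(r.passed for r in results)


if __name__ == "__main__":
    # Placeholder commands; a real project would list its own build, test,
    # and lint invocations here.
    demo_checks = [
        ("smoke", [sys.executable, "-c", "print('ok')"]),
        ("failing", [sys.executable, "-c", "import sys; sys.exit(1)"]),
    ]
    for result in run_checks(demo_checks):
        print(f"{result.name}: {'pass' if result.passed else 'fail'}")
```

An agent loop built on something like this could feed the failing checks' output back into its next edit instead of stopping at generated code, which is the kind of loop-closing the paragraph above describes.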
Stobon extends that research posture into personal intelligence: an applied product built around local Apple-device context, relationship memory, and approved action rather than cloud-first data extraction.