Research

Research for agents that improve by doing validated work.

CodexDeus studies the full stack required for autonomous coding agents: models, memory, tools, test execution, browser validation, source control, deployment awareness, and human review.

Tracks

The research agenda is coding-agent-first.

Agentic Coding

Systems that can read large repositories, plan implementation paths, edit code, run tools, validate behavior, and present changes for human review.

Evaluation and Validation

Benchmarks and workflows that measure whether an agent completed useful work, not just whether it produced plausible text.

Human Knowledge Data

Data pipelines built around valuable human work patterns, source-grounded knowledge, software history, domain expertise, and operator feedback.

Tool and World Interaction

Agents that can safely use shells, browsers, APIs, documents, design systems, data sources, deployment targets, and other real-world interfaces.

Self-Improving Loops

Research into memory, error recovery, critique, replay, regression checks, and feedback loops that improve agent performance over time.

Productization

Turning lab work into reliable AI agent products, beginning with Stobon as the first applied CodexDeus project.

Validation

Tests are part of the intelligence system.

A coding agent is only useful if it can close the loop between intention and behavior. CodexDeus treats builds, tests, browser checks, linting, screenshots, source review, and operator feedback as first-class parts of the model environment.

The goal is to build agents that do not merely generate code but learn to prove that it works inside the systems people actually use.
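As a rough illustration of what closing that loop could look like, the sketch below gates a proposed change behind a list of checks and keeps every result so that failures can be fed back to the agent. It is a minimal sketch under stated assumptions, not a description of CodexDeus internals: the names `CheckResult`, `validate_change`, `is_accepted`, `compiles`, and `behaves` are hypothetical, and the two toy checks merely stand in for real builds, tests, browser checks, or linting.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    name: str          # which check ran, e.g. "build" or "test"
    passed: bool       # whether the change satisfied the check
    detail: str = ""   # failure message that can be handed back to the agent


# Assumption: a check is any callable that inspects a proposed change
# (here, a string of Python source) and reports a CheckResult.
Check = Callable[[str], CheckResult]


def validate_change(change: str, checks: List[Check]) -> List[CheckResult]:
    """Run every check and return all results, including failures."""
    return [check(change) for check in checks]


def is_accepted(results: List[CheckResult]) -> bool:
    """A change is accepted only if every check passed."""
    return all(result.passed for result in results)


def compiles(change: str) -> CheckResult:
    """Toy stand-in for a build step: does the source parse at all?"""
    try:
        compile(change, "<change>", "exec")
        return CheckResult("build", True)
    except SyntaxError as exc:
        return CheckResult("build", False, str(exc))


def behaves(change: str) -> CheckResult:
    """Toy stand-in for a test step: does add(2, 3) return 5?"""
    namespace: dict = {}
    try:
        # exec is used only to keep this sketch self-contained.
        exec(change, namespace)
        ok = namespace["add"](2, 3) == 5
        return CheckResult("test", ok, "" if ok else "add(2, 3) != 5")
    except Exception as exc:
        return CheckResult("test", False, repr(exc))


if __name__ == "__main__":
    plausible_but_wrong = "def add(a, b):\n    return a - b\n"
    for result in validate_change(plausible_but_wrong, [compiles, behaves]):
        print(result)
```

In this toy case the change parses, so a build-only gate would accept it; only the behavioral check exposes that the code is plausible but wrong, which is the distinction the section draws between generating code and proving that it works.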

Stobon extends that research posture into personal intelligence. It is an applied product built around local Apple-device context, relationship memory, and approved action rather than cloud-first data extraction.