OpenAI Codex 5.5: Not Just for Coders. An OS for Knowledge Work.

The name is misleading. Codex sounds like a tool for programmers, and it started as one. But what OpenAI has built with Codex and GPT 5.5 is something closer to what Every.to's team calls an operating system for knowledge work: a persistent agent workspace that runs goals across sessions, connects to your tools, and handles tasks across email, research, writing, planning, and meetings alongside code.

Fig. 1: Named for code, built for work. An operating system for knowledge work that happens to write code.

The companies using it well are not treating it as a faster way to write code. They are treating it as a team member. Katie Parrott, head of content at Every.to, describes it as an agent you can brief on what done looks like, point at your relevant data sources, and leave working while you do other things. You check back when it needs you. Not the other way around.

This article explains what Codex actually is, how GPT 5.5 differs from previous OpenAI models, what the key capabilities are, and how companies are using it across their business, not just in engineering.

What Codex Actually Is

A tool-using agentic workspace. Every word in that phrase matters.

Tool-using: Codex can read and write files, connect to external services through plugins and integrations, run scripts, control a browser, and take actions in apps. It is not just generating text. It is doing things.

Agentic: It runs multi-step tasks without asking for guidance at every step. You give it a goal, not a prompt. It plans, executes, checks its own work, revises, and reports back when it needs a decision or is done.

Workspace: It holds context across sessions. It remembers your files, your preferences, your ongoing work. When you return to a task the next day, it knows where things stand. This is fundamentally different from a chat interface that starts fresh with every conversation.
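The agentic loop described above (plan, execute, check, revise, report back) can be sketched in a few lines. This is an illustration only, not how Codex is implemented: `run_goal`, `RunReport`, and the three callables are hypothetical names, and the toy functions stand in for a real model and real tools.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RunReport:
    """What the loop hands back when it stops."""
    status: str    # "done" or "needs_decision"
    attempts: int
    log: List[str] = field(default_factory=list)


def run_goal(
    plan: Callable[[str], List[str]],
    execute: Callable[[str], str],
    check: Callable[[List[str]], bool],
    goal: str,
    max_attempts: int = 3,
) -> RunReport:
    """Plan, execute, check, and revise until the check passes or attempts run out."""
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        steps = plan(goal)                            # plan
        results = [execute(step) for step in steps]   # execute each step
        log.extend(results)
        if check(results):                            # check its own work
            return RunReport("done", attempt, log)
        # Revise: feed the failure back into the next round of planning.
        goal = f"{goal} (revise: attempt {attempt} failed the check)"
    # Out of attempts: hand control back to a human for a decision.
    return RunReport("needs_decision", max_attempts, log)


# Toy stand-ins (assumptions for the demo, not real model or tool calls).
def toy_plan(goal: str) -> List[str]:
    return ["draft summary", "cite sources"] if "revise" in goal else ["draft summary"]


def toy_execute(step: str) -> str:
    return f"did: {step}"


def toy_check(results: List[str]) -> bool:
    return any("cite sources" in r for r in results)


report = run_goal(toy_plan, toy_execute, toy_check, "summarise competitor news")
print(report.status, report.attempts)
```

The point of the sketch is the shape: the human supplies the goal and the check, and only hears back when the work is done or a decision is needed.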

ChatGPT / traditional AI chat: Chat. Respond. Repeat.

Session model: One conversation at a time. Starts fresh every session with no memory of previous work.
Context: You provide all context in every message. Nothing carries over.
Work style: Stops when you stop asking. Waits for your next message.
Output: Generates text you then act on yourself.
Best for: Single questions, short tasks, drafts you refine yourself.

OpenAI Codex: Goals. Actions. Results.

Session model: Multiple tasks running in parallel. Persistent workspace with memory across sessions.
Context: Holds your goals, files, and preferences between sessions. Set once, applies everywhere.
Work style: Works continuously toward goals while you do other things.
Output: Takes actions in your tools and systems directly.
Best for: Ongoing work, multi-step projects, tasks that span days.

GPT 5.5: What Changed

GPT 5.5 is the model powering Codex. It is a step change from GPT-4o and the model line that came before it.

"For almost any topic, the top AIs now give better answers than the actual world-class experts I could call on the phone. And I can call basically anyone."

Marc Andreessen (@pmarca), May 2026

The specific improvements that matter for Codex's knowledge work use cases:

Multi-step reasoning: Better at holding a complex goal across many steps without losing track of constraints or requirements.
Tool use accuracy: More reliable at connecting to external systems, reading files correctly, and taking precise actions.
Self-correction: More likely to notice when a result does not match what was asked and iterate without being told.
Context handling: Better at working with large amounts of information without losing the thread.

What GPT 5.5 unlocks in Codex

GPT 5.5: The underlying model in Codex. Part of the generation Marc Andreessen described as crossing an AGI threshold alongside Claude Opus 4.8 and Gemini 3.
Parallel tasks: Run multiple goals simultaneously. Codex works on all of them while you focus elsewhere.
Persistent goals: Set a goal once with /goal. Codex keeps working toward it across session breaks.
Mobile control: Control Codex from your phone via ChatGPT mobile. Kick off tasks, approve decisions, review results from anywhere.

Goals and Skills: The Key Concepts

Two concepts define how Codex works differently from a standard AI assistant. Understanding them explains most of what Codex can do.

Goals are persistent objectives. You set a goal with /goal and describe what done looks like, how success gets checked, and what constraints to respect. Codex then keeps working toward that outcome across interruptions and session breaks.

The test for when to use a goal: if you would type the same instruction in three separate messages ("always cite your sources, match our house style, never send without my review"), make it a goal instead. It then applies to everything Codex does toward that goal, including after a session break.

Skills are reusable instructions. A skill teaches Codex how to handle a recurring kind of task well. Once you have defined how you want weekly competitive analysis reports formatted, or how to process incoming sales enquiries, or how to handle end-of-day meeting summaries, that skill runs automatically for every instance of that task.

Together: Goals tell Codex what you are trying to accomplish. Skills tell it how to do recurring tasks within that goal. The combination is what makes it a workspace rather than a chat tool.
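One way to picture the split between goals and skills is as two small data structures that get combined into a brief for each task. The sketch below is a mental model only: `Goal`, `Skill`, and `build_brief` are hypothetical names, and nothing here reflects how Codex stores goals or skills internally.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Goal:
    outcome: str            # what "done" looks like
    success_check: str      # how success gets checked
    constraints: List[str]  # what to respect along the way


@dataclass(frozen=True)
class Skill:
    task_type: str     # the recurring kind of task this covers
    instructions: str  # how to do that task well


def build_brief(goal: Goal, skills: Dict[str, Skill], task_type: str) -> str:
    """Combine the standing goal with the skill for this task type, if one exists."""
    lines = [f"Goal: {goal.outcome}", f"Done when: {goal.success_check}"]
    lines += [f"Constraint: {c}" for c in goal.constraints]
    skill = skills.get(task_type)
    if skill is not None:
        lines.append(f"Skill ({skill.task_type}): {skill.instructions}")
    return "\n".join(lines)


# Illustrative values only.
goal = Goal(
    outcome="Surface anything I should know before Monday standup",
    success_check="Summary is ready before the Monday standup",
    constraints=["Never send without my review"],
)
skills = {
    "competitor_news": Skill(
        "competitor_news",
        "Include what changed, why it matters, and one question for the team.",
    )
}
brief = build_brief(goal, skills, "competitor_news")
print(brief)
```

The goal lines appear in every brief, while the skill line appears only when the task type matches. That is the difference between "what we are trying to accomplish" and "how to do this kind of task".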

Goals and Skills in practice

A CEO's Codex setup

Goal: "My company is preparing for Series B. Track all relevant signals (competitor moves, customer feedback, market news) and surface anything I should know before my Monday morning standup."
Skill: "When summarising competitor news, always include: what changed, why it matters, and one question I should ask my team about it."
Skill: "Meeting notes format: attendees, decisions made, action items with owners, open questions. Post to Notion within 30 minutes of the meeting ending."
Result: Codex runs continuously. Monday morning summary is ready before you open your laptop.

What Every.to Learned Running Four Agents

Katie Parrott and the team at Every.to, a 25-person media company, documented what happened when they moved their core coordination work to four Codex agents:

Anton (Prioritisation): Routes daily priorities to team members and posts company-wide summaries to Slack. Synthesises launch calendars, strategy documents, and task lists. Answers the question every employee has every morning: "What should I work on today?"

Max (Meetings to Tasks): Extracts action items from meeting transcripts, posts them to Slack as numbered lists, and converts selected items into tasks linked to relevant projects.

Strategy Interviewer (OKR Planning): Conducts quarterly goal interviews with team members, pushing for specificity and measurable outcomes while ensuring alignment with company strategy. Reduced planning from weeks to two days.

Campaign Reporter (Growth Tracking): Delivers daily scorecards showing key metrics, pace indicators, and whether targets are being met.
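The "pace indicator" in a scorecard like the Campaign Reporter's is simple arithmetic: compare progress so far with where a straight-line path to the target would be today. The sketch below assumes linear pacing, and the function name and figures are invented for illustration; Every.to's actual calculation is not described in the source.

```python
from dataclasses import dataclass


@dataclass
class Scorecard:
    metric: str
    actual: float
    expected_by_now: float
    pace: float      # actual / expected_by_now; 1.0 means exactly on pace
    on_track: bool


def daily_scorecard(metric: str, actual: float, target: float,
                    days_elapsed: int, days_total: int) -> Scorecard:
    """Compare progress so far with a straight-line path to the target."""
    if not 0 < days_elapsed <= days_total:
        raise ValueError("days_elapsed must be between 1 and days_total")
    if target <= 0:
        raise ValueError("target must be positive")
    expected = target * days_elapsed / days_total
    pace = actual / expected
    return Scorecard(metric, actual, expected, pace, pace >= 1.0)


# Made-up numbers: 440 of a 1,000 target, 12 days into a 30-day campaign.
card = daily_scorecard("new subscribers", actual=440, target=1000,
                       days_elapsed=12, days_total=30)
print(card)
```

A real campaign rarely paces linearly (launch days spike), so a production version would likely swap in an expected curve rather than a straight line.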

"We moved from relying on the COO as a human router to automating core coordination work."

Every.to team, 2026

The COO stopped being the conduit for information. The agents became the conduit.

Three things the Every.to team found essential:

Interconnected databases. Agents gain power by querying linked information (strategy, calendar, tasks, people, notes) rather than isolated files.
Outcome-focused prompts. Describe what you want, not the steps.
Progressive complexity. Start simple, then build subsequent agents on existing foundations.
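To see why linked data matters, consider the question "What should I work on today?". It can only be answered by following links from tasks to people and to strategy. The records and field names below are hypothetical, and a real setup would query a database or workspace tool rather than in-memory dictionaries.

```python
from typing import List

# Hypothetical records: each task links to a person and a strategy item by id.
strategy = {
    "s1": {"name": "Launch newsletter redesign", "priority": 1},
    "s2": {"name": "Grow podcast audience", "priority": 2},
}
people = {"p1": {"name": "Ana"}, "p2": {"name": "Ben"}}
tasks = [
    {"title": "Book podcast guest", "owner": "p1", "strategy": "s2", "done": False},
    {"title": "Draft launch email", "owner": "p1", "strategy": "s1", "done": False},
    {"title": "QA redesign on mobile", "owner": "p2", "strategy": "s1", "done": False},
    {"title": "Archive old templates", "owner": "p1", "strategy": "s2", "done": True},
]


def todays_priorities(person_id: str) -> List[str]:
    """Open tasks for one person, ordered by the priority of the linked strategy item."""
    open_tasks = [t for t in tasks if t["owner"] == person_id and not t["done"]]
    open_tasks.sort(key=lambda t: strategy[t["strategy"]]["priority"])
    return [f'{t["title"]} ({strategy[t["strategy"]]["name"]})' for t in open_tasks]


for line in todays_priorities("p1"):
    print(f'{people["p1"]["name"]}: {line}')
```

With isolated files, the task list alone could not say which task matters most. The ordering comes entirely from the link to strategy.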

The Folder Is the Agent

Kieran Klaassen at Every.to discovered something surprising when managing 44 concurrent agents: you do not need complex orchestration. You need well-organised context.

A well-structured project folder, with CLAUDE.md or similar context files encoding conventions and institutional knowledge, turns the same AI model into a different specialist depending on which folder it reads. Point it at your codebase folder and it behaves like a Rails engineer. Point it at your monitoring folder and it behaves like an operations engineer.

Klaassen runs 44 concurrent agents through a file-based dispatch system with two commands: /hey generates status reports across all projects, and /orchestrate breaks tasks into subtasks and spawns workers in appropriate folders.
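The folder-as-agent idea can be sketched as follows: the same dispatch code produces different "specialists" purely because each folder carries its own context file. This is not Klaassen's /hey or /orchestrate implementation, which the source does not show. The function names, the job format, and the `AGENTS.md` fallback filename are all assumptions.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Assumed filenames; use whatever context file your tool actually reads.
CONTEXT_FILENAMES = ("CLAUDE.md", "AGENTS.md")


def load_context(folder: Path) -> str:
    """Return the text of the first context file found in the folder, else ''."""
    for name in CONTEXT_FILENAMES:
        candidate = folder / name
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    return ""


def dispatch(subtasks: Dict[str, str], root: Path) -> List[dict]:
    """Pair each subtask with the context of the folder it belongs to."""
    jobs = []
    for folder_name, task in subtasks.items():
        jobs.append({
            "folder": folder_name,
            "task": task,
            "context": load_context(root / folder_name),
        })
    return jobs


# Demo with throwaway folders so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, text in [
        ("codebase", "Rails app. Follow house conventions."),
        ("monitoring", "On-call runbooks and alert thresholds."),
    ]:
        (root / name).mkdir()
        (root / name / "CLAUDE.md").write_text(text, encoding="utf-8")
    jobs = dispatch(
        {"codebase": "fix failing test", "monitoring": "review last night's alerts"},
        root,
    )

for job in jobs:
    print(job["folder"], "->", job["context"])
```

Notice how little the dispatch code does. All the specialisation lives in the context files, which is the claim the section is making.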

The implication: the barrier to running multiple agents is lower than it looks. The orchestration infrastructure is not the hard part. The hard part is having clean, well-structured context in each domain.

Codex vs Claude Code: What to Use When

For companies looking at both tools: both are agentic coding and knowledge work platforms. Both can run parallel tasks, use tools, and connect to systems. The differences are in emphasis.

Codex (GPT 5.5): Knowledge work. Operations. Mobile.

Primary strength: Knowledge work alongside coding: email, research, writing, ops tasks.
Mobile: First-class mobile control via ChatGPT app. Run and approve tasks from your phone.
Persistence: Persistent goals that survive session breaks. Set once, runs indefinitely.
Ecosystem: Strong integration with OpenAI's broader tool and plugin ecosystem.

Claude Code (Opus 4.8): Engineering. Scale. Depth.

Primary strength: Deep coding and engineering focus. Long-horizon tasks: migrations, audits, refactoring.
Scale: Dynamic workflows: scale to hundreds of agents for large codebase tasks.
Effort control: Ultracode mode for maximum-effort engineering tasks when you need it done right.
Best for: Engineering-heavy workflows where code quality and depth matter most.

Neither is objectively better. The companies running at Wave 3 typically use both: Claude Code for engineering-heavy workflows, Codex for knowledge work and business operations. The nativefirst stack, Opus 4.8 and Codex GPT 5.5, reflects this.

Codex and Opus 4.8 are the tools. The architecture is what makes them work.

Book a free Diagnostic: 30–45 minutes, no deck, no pitch. It maps which knowledge work functions in your company are most ready for an agent setup, what the data access looks like, which goals to set first, and how to structure the context layer so agents actually produce useful output.


Sources

1. Katie Parrott, "Codex for Knowledge Work", Every.to, May 2026. Complete guide to using Codex as an operating system for non-engineering work.
2. Brandon Gell, "How We Run a 25-person Company on Four AI Agents", Every.to, 2026. Every.to's four-agent setup and what changed.
3. Kieran Klaassen, "The Folder Is the Agent", Every.to, 2026. On managing 44 concurrent agents through folder-based context.
4. Marc Andreessen on GPT 5.5 and AGI thresholds, via @itsolelehmann summary on X, May 2026.
5. @gdb (OpenAI), Codex real-time meeting transcription and Q&A, X, May 2026.

John Tan

Founder and CEO of nativefirst.ai. Embeds with scaling founders and CEOs to ship Level-3 agents and AI workflows in production.

