Feature · Workflows · Journaled · Resumable
Agent orchestration, built in.
Define multi-phase agentic workflows as ordinary Sema code. Phases are markers, agents are LLM leaves, and every step is journaled to a frozen JSONL run directory, so you can resume, replay, or fork without losing state.
Frozen JSONL journal · content-keyed resume · budget caps · parallel & pipeline fan-out
A workflow is just a function
Define, run, journal — no daemon, no YAML.
defworkflow is a prelude macro that expands to workflow/run. The body is ordinary Sema: phases are markers, agents are LLM calls, and the runtime journals everything to a run directory under .sema/runs/.
;; (helpers omitted: article schema, slug, prompt, good-article?)
(defworkflow content-pipeline
  "Generate + verify explainer articles."
  {:phases ["Topics" "Write" "Verify" "Publish"]
   :budget {:tokens 50000 :usd 1.0}}

  (phase "Topics")
  (def chosen topics)

  (phase "Write")
  ;; Fan out: one typed step per topic,
  ;; at most 4 model calls in flight.
  (def articles
    (pipeline chosen
      (fn (t)
        (step (article-prompt t)
              {:name "writer" :schema article}))))

  (phase "Verify")
  (def verified (filter good-article? articles))

  (phase "Publish")
  (for-each write-file verified)

  {:status :success :published (count verified)})
- defworkflow: a macro that expands to workflow/run. The body is a thunk; the form is the run.
- phase: a marker, not a wrapper. (phase "Write") opens a phase; forms after it belong to that phase until the next marker or run end.
- step: a journaled LLM leaf. With :schema it returns typed data (validated via llm/extract); without, it returns the completion text. With :tools it runs the real tool loop. With :agent it runs a configured defagent.
- checkpoint: (checkpoint :k v) records and returns v; (checkpoint :k) reads it back. Threads state between phases.
- parallel / pipeline: bounded-concurrency fan-out. parallel runs thunks concurrently (barrier); pipeline flows each item through stages independently (no barrier between stages).
- {:status …}: the return envelope. :success or :failed; the runtime forces :failed if a budget cap trips.
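To make the barrier distinction concrete, here is a minimal Python sketch of the two fan-out shapes. It models the semantics described in the list above and is not Sema's implementation: the function names, the `limit` parameter, and the thread-pool approach are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel(thunks, limit=4):
    """Run zero-argument callables concurrently and wait for all (barrier)."""
    with ThreadPoolExecutor(max_workers=limit) as pool:
        futures = [pool.submit(thunk) for thunk in thunks]
        # Results come back in submission order, and nothing is returned
        # until every thunk has finished.
        return [f.result() for f in futures]


def pipeline(items, *stages, limit=4):
    """Send each item through all stages on its own (no barrier between stages)."""
    def run_one(item):
        for stage in stages:
            item = stage(item)  # the item moves on as soon as its own stage ends
        return item

    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(run_one, items))


if __name__ == "__main__":
    print(parallel([lambda: 1, lambda: 2, lambda: 3]))
    print(pipeline(["a", "b"], str.upper, lambda s: s + "!"))
```

In this model a slow item in `pipeline` never holds back a fast one at a stage boundary, while `parallel` returns only once the slowest thunk is done. `limit` plays the role of the "at most 4 model calls in flight" bound.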
Run it
One command. A run directory. A live viewer.
sema workflow run evaluates the file, journals every event to .sema/runs/<run-id>/, and writes result.json. The --view flag starts a live web viewer so you can watch the run progress in real time.
- events.jsonl: the system of record. Append-only, one JSON event per line. Frozen vocabulary.
- memo/: per-leaf resume cache. A file's existence means that leaf completed with this value.
- metadata.json: workflow name, code version, budget, args.
- result.json: the final {:status …} envelope.
- sema workflow view: a read-only web viewer that polls the journal and renders the live tree.
The dashboard
Watch the run unfold, live.
sema workflow view (or --view on run) serves a read-only web viewer that polls the journal and renders the run as a tree: phases nest agents, budget events show per-leaf spend, checkpoints show their digests.
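As a rough picture of what "renders the run as a tree" involves, the Python sketch below folds journal lines into phases with their agents nested underneath. It is not the viewer's code. The event and field names are taken from the sample journal on this page; the `build_tree` function and the row layout are hypothetical.

```python
import json


def build_tree(lines):
    """Fold journal events into {phase: [agent rows]} in arrival order."""
    tree, current, agents = {}, None, {}
    for line in lines:
        ev = json.loads(line)
        kind = ev["event"]
        if kind == "phase.started":
            current = ev["phase"]
            tree[current] = []
        elif kind == "agent.started":
            row = {"id": ev["agent_id"], "status": "running", "tokens": 0}
            agents[ev["agent_id"]] = row
            tree.setdefault(current, []).append(row)  # nest under open phase
        elif kind == "agent.result":
            row = agents.get(ev["agent_id"])
            if row is not None:
                row["status"] = ev["status"]
        elif kind == "budget":
            row = agents.get(ev["agent_id"])
            if row is not None:  # per-leaf spend, attributed by agent_id
                row["tokens"] += ev["input_tokens"] + ev["output_tokens"]
    return tree


if __name__ == "__main__":
    sample = [
        '{"event":"phase.started","seq":3,"phase":"Write"}',
        '{"event":"agent.started","seq":4,"agent_id":"writer_1","agent_name":"writer"}',
        '{"event":"agent.result","seq":5,"agent_id":"writer_1","status":"ok"}',
    ]
    print(build_tree(sample))
```

Because the fold only reads events in order, re-running it on a longer journal gives the updated tree, which is all a polling viewer needs.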
The run directory is the system of record
Every event, journaled. Frozen. Append-only.
The event vocabulary is a frozen public contract: new fields may be added (append-only, always optional so readers can skip them), but existing fields never change. Old runs stay readable forever. The journal is flushed per event, so a crash mid-run leaves a valid prefix.
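Since the journal is plain JSONL, any language can consume it. The Python sketch below reads a journal and totals spend from the budget events. The field names come from the sample events in this section; the function names are hypothetical, and skipping a torn final line is a defensive assumption on the reader's side, not documented runtime behavior.

```python
import json
from pathlib import Path


def read_journal(path):
    """Return the parsed events of an events.jsonl file, in order.

    Assumption: if a line cannot be parsed (for example a torn final line),
    reading stops there and the caller gets the valid prefix.
    """
    events = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        if not raw.strip():
            continue
        try:
            events.append(json.loads(raw))
        except json.JSONDecodeError:
            break  # keep everything before the unreadable line
    return events


def total_spend(events):
    """Sum tokens and cost over all budget events."""
    budget = [e for e in events if e["event"] == "budget"]
    tokens = sum(e["input_tokens"] + e["output_tokens"] for e in budget)
    usd = sum(e.get("cost_usd", 0.0) for e in budget)
    return tokens, usd
```

The reader looks fields up by name and ignores everything else, so fields added later do not break it.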
{"event":"run.started","seq":0,"ts":"2026-06-27T…","workflow":"content-pipeline","run_id":"wf_…"}
{"event":"phase.started","seq":1,"ts":"…","phase":"Topics"}
{"event":"phase.ended","seq":2,"ts":"…","phase":"Topics","status":"success","dur_ms":0}
{"event":"phase.started","seq":3,"ts":"…","phase":"Write"}
{"event":"agent.started","seq":4,"ts":"…","agent_id":"writer_1","agent_name":"writer"}
{"event":"agent.result","seq":5,"ts":"…","agent_id":"writer_1","status":"ok","output":"…","dur_ms":1240,"model":"gpt-5.4-mini"}
{"event":"budget","seq":6,"ts":"…","agent_id":"writer_1","input_tokens":3120,"output_tokens":880,"cost_usd":0.0041,"budget_limit":50000}
{"event":"checkpoint","seq":7,"ts":"…","key":"files","content_key":"ck_4d2f8a1c","value_digest":"abc123"}
{"event":"phase.ended","seq":8,"ts":"…","phase":"Write","status":"success","dur_ms":2100}
{"event":"run.ended","seq":9,"ts":"…","status":"success","dur_ms":5400}
- run.started / run.ended: workflow name, run id, status, duration
- phase.started / phase.ended: label, status, duration; nest agents under phases
- agent.started / agent.result: agent_id, role, model, output, duration
- agent.tool_call: tool name, argument digest; journaled tool invocations
- checkpoint: key, content-key (resume hash), value digest
- budget: per-leaf token + cost attribution; never double-charges
Resume
Crash, edit, re-run — pick up where you left off.
--resume <run-id> reuses the run directory and short-circuits any leaf whose content-key is in the prior run's memo/ dir. The model is not called for memoized leaves — they replay for free. A fresh events.resume-N.jsonl segment is written so the frozen invariants hold in each file.
- Content-keyed. Each leaf's key is a hash of (kind, code-version, phase, prompt, schema). Same inputs → same key → memo hit → no re-call.
- Automatic invalidation. Edit the workflow → different code version → different keys → no memo hits → full re-run. No guard files to maintain.
- Per-leaf granularity. Delete one memo file → that leaf re-runs while others still replay. Conservative: a missing memo always re-runs.
- Resume doesn't double-charge. Spend starts at zero on resume. Memoized leaves don't re-call the model and don't recharge the budget.
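The content-key idea from the list above can be sketched in a few lines of Python. This is an illustrative model only: the hash function (SHA-256 over canonical JSON), the 16-character truncation, the memo file naming, and both function names are assumptions, not Sema's actual key format.

```python
import hashlib
import json
from pathlib import Path


def content_key(kind, code_version, phase, prompt, schema=None):
    """Hash the five inputs into a stable key (assumed: SHA-256, 16 hex chars)."""
    payload = json.dumps([kind, code_version, phase, prompt, schema],
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def run_leaf(memo_dir, key, call_model):
    """Replay a memoized leaf, or call the model and record the result."""
    memo = Path(memo_dir) / f"{key}.json"
    if memo.exists():                 # memo hit: the model is not called
        return json.loads(memo.read_text(encoding="utf-8"))
    value = call_model()              # a missing memo always re-runs
    memo.parent.mkdir(parents=True, exist_ok=True)
    memo.write_text(json.dumps(value), encoding="utf-8")
    return value
```

Changing any input, including the code version, yields a different key and therefore a miss, which is why no separate invalidation step is needed. Deleting one memo file forces only that leaf to re-run.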
(defworkflow audit
  "Audit with a token cap."
  {:phases ["Scan" "Report"]
   :budget {:tokens 5000}}

  (phase "Scan")
  (def a (step "Find files." {}))   ;; a burns 5200 tokens → cap trips

  (phase "Report")
  (def b (step "Summarize." {}))    ;; b refused: over_budget latch

  {:status :success :a a :b b})
;; → {:status :failed
;;    :reason "budget exceeded"}
Budget enforcement
Spend caps that actually trip.
Declare :budget {:tokens N :usd M} in the workflow metadata. The runtime charges each step leaf and latches a sticky over_budget flag when a cap is exceeded — further step leaves are refused and the run ends {:status :failed :reason "budget exceeded"}.
- Per-leaf attribution. Each budget event records the agent_id, token counts, and cost, so the dashboard shows per-leaf spend, not just a total.
- Sticky latch. Once a cap trips, the latch stays set. No step leaf launches after it, even under concurrent parallel fan-out.
- Resume doesn't double-charge. A --resume run starts spend at zero. Memoized leaves replay for free.
- USD is best-effort. Token caps are deterministic; USD caps depend on the pricing table being available.
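The charge-then-latch behavior can be modeled in Python as below. This is a sketch under stated assumptions, not the runtime's code: the `Budget` class, `run_steps`, and the choice to treat "exceeded" as strictly greater than the cap are all hypothetical. The flag name `over_budget` is the one used on this page.

```python
import threading


class Budget:
    """Token cap with a sticky over-budget latch (illustrative model)."""

    def __init__(self, token_cap):
        self.token_cap = token_cap
        self.spent = 0
        self.over_budget = False
        self._lock = threading.Lock()  # keeps the latch safe under fan-out

    def admit(self):
        """Return True if a new leaf may launch."""
        with self._lock:
            return not self.over_budget

    def charge(self, tokens):
        """Record a finished leaf's usage and latch if the cap is exceeded."""
        with self._lock:
            self.spent += tokens
            if self.spent > self.token_cap:
                self.over_budget = True  # never cleared for this run


def run_steps(budget, costs):
    """Run leaves in order; each entry in costs is one leaf's token usage."""
    results = []
    for tokens in costs:
        if not budget.admit():
            results.append("refused")
            continue
        budget.charge(tokens)
        results.append("ran")
    status = "failed" if budget.over_budget else "success"
    return status, results
```

With a cap of 5000 and leaf costs of 5200 and 300, the first leaf runs and trips the latch, the second is refused, and the run status is forced to failed, mirroring the audit workflow.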
The papers are just control flow
Classic agent architectures as composable macros.
The workflow DSL is homoiconic — agent patterns from the literature are macros that expand into parallel, pipeline, and step forms. No framework, no runtime tax. These are from examples/workflows/cookbook.sema — load and use them.
ReAct
think → act → observe → repeat
The step reasons about each tool result before deciding its next action. The loop is a named let over an accumulator, bounded by max-rounds.
(defmacro react (question tools max-rounds)
  `(let loop ((round 1) (scratch ""))
     (let ((answer (step (str "Q: " ,question "\n" scratch)
                         {:name "react" :tools ,tools})))
       (if (or (>= round ,max-rounds)
               (not (contains? (lower answer) "next:")))
           answer
           (loop (+ round 1) (str scratch "\n" answer))))))
Reflexion
try → self-critique → retry
The step critiques its own output and retries with the feedback. A critic reply starting with "OK" short-circuits. Bounded by max-tries.
(defmacro reflexion (task max-tries)
  `(let loop ((try 1) (note ""))
     ;; note carries the previous critique into the next attempt
     (let ((attempt (step (str ,task "\n" note) {:name "actor"})))
       (if (>= try ,max-tries)
           attempt
           (let ((critique (step (str "Critique. Reply OK if good.\n\n" attempt)
                                 {:name "critic"})))
             (if (starts-with? (trim critique) "OK")
                 attempt
                 (loop (+ try 1) critique)))))))
Tree-of-Thought
branch → score → keep best
Fork N candidate solutions in parallel, score each, keep the best. parallel handles the fan-out; the workflow journals every branch.
(defmacro tree-of-thought (prompt n scorer)
  `(let ((cands (filter (fn (c) (not (nil? c)))
                        (parallel
                          (map (fn (i)
                                 (fn () (step (str ,prompt "\n(candidate #" i ")")
                                              {:name "thought"})))
                               (range ,n))))))
     (foldl (fn (best c) (if (> (,scorer c) (,scorer best)) c best))
            (first cands)
            (rest cands))))
Debate
N agents → judge
Two personas argue R rounds; a judge reads the transcript and decides. Each round is two step leaves; the judge is a third.
(defmacro debate (topic a b rounds)
  ;; topic is unquoted so the caller's expression seeds the transcript
  `(let loop ((r 1) (transcript ,topic))
     (let* ((arg-a (step (str "You are " ,a ".\n" transcript) {:name ,a}))
            (t1 (str transcript "\n\n" ,a ": " arg-a))
            (arg-b (step (str "You are " ,b ".\n" t1) {:name ,b}))
            (t2 (str t1 "\n\n" ,b ": " arg-b)))
       (if (>= r ,rounds)
           (step (str "Judge the debate:\n" t2) {:name "judge"})
           (loop (+ r 1) t2)))))
Catch errors before a run
sema workflow check — instant, no LLM.
Statically validate a workflow file without evaluating it or calling any model. Catches arity traps (e.g. (phase "x" body) — phase is a one-arg marker), bad step options, and layout issues before you spend a token.
- Pure static analysis. Parses the AST and walks it — never evaluates, never configures a provider, never emits a journal event.
- Human or JSON output. --json emits machine-readable diagnostics with source spans. --strict treats warnings as errors (CI gate).
- Workflow-only checks. phase / checkpoint / parallel / pipeline arity is checked only inside a defworkflow body, so a bare (parallel …) in a library file never trips.
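The parse-and-walk approach can be illustrated with a toy checker in Python. It is a sketch of the technique, not the sema workflow check implementation: the tokenizer is deliberately naive (it assumes balanced parentheses and strips anything after a semicolon, even inside strings), it reports no source spans, and it covers only the phase arity rule. All names in it are hypothetical.

```python
import re

TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[()]|[^\s()"]+')


def parse(src):
    """Parse s-expressions into nested lists; atoms stay as strings."""
    src = re.sub(r";[^\n]*", "", src)  # naive comment stripping
    out, stack = [], []
    cur = out
    for tok in TOKEN.findall(src):
        if tok == "(":
            stack.append(cur)
            new = []
            cur.append(new)
            cur = new
        elif tok == ")":
            cur = stack.pop()
        else:
            cur.append(tok)
    return out


def check(forms, in_workflow=False, problems=None):
    """Walk the tree without evaluating it; flag bad (phase ...) arity."""
    problems = [] if problems is None else problems
    for form in forms:
        if not isinstance(form, list) or not form:
            continue  # atoms and empty lists carry nothing to check
        head = form[0]
        if in_workflow and head == "phase" and len(form) != 2:
            problems.append(f"phase takes 1 argument, got {len(form) - 1}")
        # Only forms nested inside a defworkflow body are subject to the rule.
        check(form, in_workflow or head == "defworkflow", problems)
    return problems


if __name__ == "__main__":
    bad = '(defworkflow w "doc" {} (phase "A") (phase "B" (step "x" {})))'
    print(check(parse(bad)))
```

The walk never calls a model or evaluates a form, so it runs instantly, and the same `(phase "x" body)` form outside a defworkflow produces no diagnostic.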
Run your first workflow.
The workflow DSL is homoiconic — the plan, the program, and the trace are all s-expressions. No YAML, no daemon, no separate runtime.