What is Sema?
Sema is a Scheme-like Lisp with a Clojure-flavored surface and first-class LLM/agent primitives, compiled to a NaN-boxed bytecode VM. Single-threaded, reference-counted, embeddable. Implemented in Rust.
v1.27.1 · MIT · Rust 2021 · 16 crates · ~125k lines
The language
A Lisp you can hold in your head.
One syntax rule: everything is an s-expression. The surface borrows from Clojure — keywords, maps, vectors, short lambdas, f-strings — while the semantics stay Scheme at the core: tail-call optimization, quasiquote macros, define/set!, lexical scope.
(define greet (fn (name) f"Hello, ${name}!"))
(define person {:name "Ada" :age 36})
(:name person)              ; keyword as getter
(map #(* % %) (range 1 6))  ; short lambda
(match (:status res)
  :ok    (:data res)
  :error (throw (:message res)))
:keywords, {:k v} maps, [1 2 3] vectors, #(* % %) lambdas, f"..." strings, #"regex" literals.
define, set!, lambda/fn, let/let*/letrec, if/cond/case, begin, and/or, tail-call optimization.
Threading macros (->, ->>), pattern matching (match), destructuring in let/define/fn params, when-let/if-let.
try/catch/throw — caught errors are structured maps with :type, :message, :value, and :stack-trace.
async/await + channels — a deterministic cooperative scheduler, not OS threads.
Data types
Classic types, plus first-class LLM types.
Prompts, messages, conversations, tools, and agents are values — the same as integers and strings. They can be bound, passed, inspected, and stored. That's the defining difference.
The toolchain
What's in the box.
One binary, sema, gives you the REPL, script runner, bytecode compiler, standalone executable builder, formatter, LSP, DAP debugger, notebook server, and MCP server.
- REPL. sema with no args.
- Script runner. .sema files, inline expressions with --eval, and shebang scripts.
- Bytecode compiler. sema compile / sema disasm.
- Standalone executable builder. sema build traces imports, bundles assets, emits a self-contained binary. No toolchain needed at runtime.
- Formatter. sema fmt: opinionated code formatter for .sema files.
- LSP. sema lsp.
- DAP debugger. sema dap.
- Notebook server. .sema-nb format, shared-env cells, REST API, browser UI. sema notebook serve.
- MCP server. sema mcp.
- wasm-bindgen.
- sema-lang with a builder API. Interpreter::new().eval_str("(+ 1 2)").
The LLM layer
Not an SDK. A language.
LLM operations are forms and values, not library calls wrapped in boilerplate. The runtime handles retries, caching, cost tracking, provider fallback, and rate limiting — so your code stays the size of its idea.
- Eight chat providers. Anthropic, OpenAI, Gemini, Groq, xAI, Mistral, Moonshot, Ollama — auto-configured from environment variables. Plus Jina, Voyage, and Cohere for embeddings.
- Tools & agents. deftool defines a function with a schema. defagent defines a system prompt + tools + turn limit. agent/run handles the loop.
- Conversations as data. Immutable, forkable, inspectable. conversation/say returns a new value; the old one is untouched.
- Cassettes. Record LLM calls to a file, replay them forever. Deterministic tests without API keys.
- Cost & budgets. llm/with-budget caps spend for a scope. Token usage tracked per call and per session.
- Observability. Built-in OpenTelemetry tracing with GenAI conventions. Off by default.
(deftool get-weather
  "Get weather for a city"
  {:city {:type :string}}
  (lambda (city) (format "~a: 22°C" city)))

(defagent bot
  {:system "Weather assistant."
   :tools [get-weather]
   :max-turns 3})

(llm/with-budget {:max-cost-usd 0.10}
  (lambda () (agent/run bot "Weather in Oslo?")))
;; => "It's 22°C in Oslo."
;; $0.003 · 1 tool call · 2 turns
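"The old one is untouched" describes value semantics, which can be modeled outside Sema. The Rust sketch below is only an illustration under assumptions: Conversation, Message, and say are hypothetical names, and nothing here is taken from Sema's implementation. It shows one base conversation forked into two independent continuations.

```rust
// Hypothetical model of an immutable conversation value.
// None of these names come from Sema's source.

#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: String,
    content: String,
}

#[derive(Clone, Debug, PartialEq, Default)]
struct Conversation {
    messages: Vec<Message>,
}

impl Conversation {
    /// Returns a new conversation with one more message appended.
    /// `self` is borrowed immutably, so the original cannot change.
    fn say(&self, role: &str, content: &str) -> Conversation {
        let mut next = self.clone();
        next.messages.push(Message {
            role: role.to_string(),
            content: content.to_string(),
        });
        next
    }
}
```

Calling say twice on the same base yields two forks, and the base keeps its original length. A real implementation would likely share structure between forks rather than copy the whole message vector.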
How it differs from other Lisps
The things that surprise people.
Only #f and nil are falsy. 0, "", and () are all truthy. In CL, the empty list is false.
Lists are Rc<Vec<Value>> — O(1) nth, O(n) cons. Prefer map/filter/fold and vector for hot paths.
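Why nth is O(1) and cons is O(n) follows from the vector-backed representation. The Rust sketch below assumes the simplest version of that design: the element type is a plain i64 stand-in rather than Sema's Value, and List, nth, and cons are illustrative names, not Sema's internals.

```rust
use std::rc::Rc;

// Stand-in element type; the real `Value` is a NaN-boxed u64.
type Value = i64;

/// A list modeled as the text describes: a reference-counted vector.
#[derive(Clone, Debug, PartialEq)]
struct List(Rc<Vec<Value>>);

impl List {
    fn from_vec(items: Vec<Value>) -> List {
        List(Rc::new(items))
    }

    /// O(1): index straight into the backing vector.
    fn nth(&self, i: usize) -> Option<Value> {
        self.0.get(i).copied()
    }

    /// O(n): build a fresh vector with `head` in front, copying every
    /// existing element. The original list is left untouched.
    fn cons(&self, head: Value) -> List {
        let mut items = Vec::with_capacity(self.0.len() + 1);
        items.push(head);
        items.extend_from_slice(&self.0);
        List(Rc::new(items))
    }
}
```

Building a list by repeated cons copies the tail each time, so it costs O(n²) overall under this model. That is the reason to prefer bulk operations and vectors on hot paths.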
Mutable state is (define x 0) + (set! x 1). No atom/swap!/reset!.
{:k v} literals are sorted BTreeMaps — deterministic order, usable as keys. (hashmap/new) is faster and unordered.
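The "usable as keys" property comes from the sorted representation. The Rust sketch below uses the standard library's BTreeMap and HashMap to show it; the helper names are hypothetical, and String keys with i64 values stand in for Sema values.

```rust
use std::collections::{BTreeMap, HashMap};

/// Builds a sorted map from key/value pairs, whatever order they arrive in.
fn sorted_map(pairs: &[(&str, i64)]) -> BTreeMap<String, i64> {
    pairs.iter().map(|(k, v)| (k.to_string(), *v)).collect()
}

/// Keys come back in sorted order, independent of insertion order.
fn keys_in_order(m: &BTreeMap<String, i64>) -> Vec<String> {
    m.keys().cloned().collect()
}

/// BTreeMap implements Hash and Eq when its contents do, so a whole map
/// can serve as a key in another map. std's HashMap does not implement Hash.
fn index_by_map(
    maps: Vec<BTreeMap<String, i64>>,
) -> HashMap<BTreeMap<String, i64>, usize> {
    maps.into_iter().enumerate().map(|(i, m)| (m, i)).collect()
}
```

Two maps built from the same pairs in different orders compare equal and hash identically, so either one finds the entry stored under the other.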
Macros use defmacro with quasiquote. Symbols ending in # inside quasiquote are auto-unique — no variable capture.
Prompt, Message, Conversation, Tool, Agent are values — alongside integers and strings. This is why Sema exists.
Under the hood
NaN-boxed values, bytecode VM.
Every value is a single 8-byte struct Value(u64) — encoded in IEEE 754 quiet-NaN payload space. The sole evaluator is a stack-based bytecode VM with intrinsic opcodes and NaN-boxed fast paths. No tree-walking interpreter.
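NaN-boxing is easier to see in code. The Rust sketch below shows one generic way to do it, not Sema's actual layout: the tag values, the 3-bit tag field, the 48-bit payload, and every method name are assumptions made for illustration. Ordinary floats are stored as their raw bits; everything else lives in the payload of a negative quiet NaN.

```rust
/// A value packed into 64 bits. The layout is an assumption for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Value(u64);

// Sign bit + all exponent bits + quiet bit: marks "this is not a float".
const BOX_MASK: u64 = 0xFFF8_0000_0000_0000;
// Three tag bits sit just below the quiet bit; 48 payload bits remain.
const TAG_SHIFT: u32 = 48;
const TAG_MASK: u64 = 0x7 << TAG_SHIFT;
const PAYLOAD_MASK: u64 = (1 << TAG_SHIFT) - 1;

const TAG_NIL: u64 = 0;
const TAG_BOOL: u64 = 1;
const TAG_INT: u64 = 2;

impl Value {
    fn from_float(x: f64) -> Value {
        // Canonicalize every NaN to the positive quiet NaN so a real NaN
        // can never be mistaken for a boxed value.
        if x.is_nan() {
            Value(0x7FF8_0000_0000_0000)
        } else {
            Value(x.to_bits())
        }
    }

    fn boxed(tag: u64, payload: u64) -> Value {
        Value(BOX_MASK | (tag << TAG_SHIFT) | (payload & PAYLOAD_MASK))
    }

    fn nil() -> Value {
        Value::boxed(TAG_NIL, 0)
    }

    fn from_bool(b: bool) -> Value {
        Value::boxed(TAG_BOOL, b as u64)
    }

    /// Small integers only: the value must fit in 48 bits, two's complement.
    fn from_int(i: i64) -> Value {
        Value::boxed(TAG_INT, i as u64)
    }

    fn is_boxed(self) -> bool {
        (self.0 & BOX_MASK) == BOX_MASK
    }

    fn tag(self) -> u64 {
        (self.0 & TAG_MASK) >> TAG_SHIFT
    }

    fn is_nil(self) -> bool {
        self.is_boxed() && self.tag() == TAG_NIL
    }

    fn as_float(self) -> Option<f64> {
        if self.is_boxed() { None } else { Some(f64::from_bits(self.0)) }
    }

    fn as_bool(self) -> Option<bool> {
        if self.is_boxed() && self.tag() == TAG_BOOL {
            Some(self.0 & 1 == 1)
        } else {
            None
        }
    }

    fn as_int(self) -> Option<i64> {
        if self.is_boxed() && self.tag() == TAG_INT {
            // Shift left, then arithmetic-shift right, to sign-extend 48 bits.
            Some((((self.0 & PAYLOAD_MASK) << 16) as i64) >> 16)
        } else {
            None
        }
    }
}
```

The fast path is a single mask-and-compare: a float needs no decoding at all, and a type check on a boxed value is one more shift. Heap types would store a pointer in the payload in the same way, which this sketch leaves out.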
- Single-threaded. Rc-based values, no cross-thread sharing. Parallelism is at the LLM-call level, not the compute level.
- No GC. Deterministic destruction via reference counting. Memory is freed the moment the last reference drops.
- 16 crates. Strict dependency ordering: sema-core ← sema-reader ← sema-vm ← sema-eval ← sema. Stdlib and LLM depend on core, not eval: dependency inversion via callbacks.
- Bytecode format. .semac files with a 24-byte header, string table, function table, main chunk. sema build embeds the runtime + bytecode into a standalone binary.
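The "No GC" point can be observed directly with Rust's Rc, which the runtime is described as using. The sketch below is a minimal demonstration with hypothetical names (Tracked, drop_in_steps): it logs the moment of destruction to show that nothing is freed while a reference remains, and that the destructor runs as soon as the last one goes.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A heap object that records when it is destroyed.
struct Tracked {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("freed {}", self.name));
    }
}

/// Creates two references to one object and drops them one at a time,
/// returning what the log looked like after each drop.
fn drop_in_steps() -> (Vec<String>, Vec<String>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let a = Rc::new(Tracked { name: "conversation", log: Rc::clone(&log) });
    let b = Rc::clone(&a); // second reference, same allocation

    drop(a); // one reference remains: nothing is freed yet
    let after_first = log.borrow().clone();

    drop(b); // last reference gone: Drop runs immediately
    let after_second = log.borrow().clone();

    (after_first, after_second)
}
```

The trade-off of plain reference counting is that cycles are never collected; the source does not say how Sema handles them.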
struct Value(u64): every type in 8 bytes
Naming conventions
Slash-namespaced. Predicates with ?. Arrows for conversions.
The conventions are the API contract — get these right and the stdlib falls into place.
- Slash-namespaced. file/read, string/split, http/get, regex/match?, json/encode. Never read-file or split-string.
- Predicates with ?. empty?, null?, list?, file/exists?, equal?.
- Arrows for conversions. string->symbol, keyword->string, list->vector, string->number.
- string-append, string-length, string-ref, substring. No string/ prefix on these.
Now go build something with it.
Install it — or skip the tutorial and hand the docs to your agent.