Evaluator Internals
Sema is a tree-walking interpreter. There is no bytecode compiler, no JIT, no CPS transform — just an AST represented as Value nodes that get recursively evaluated. The key trick is a trampoline that turns tail calls into a loop, giving us tail-call optimization without growing the native Rust stack. This page documents how the evaluator works, from the top-level entry point down to function application and stack traces.
If you're familiar with the metacircular evaluator from SICP (Abelson & Sussman, 1996), Sema's evaluator is structurally similar — eval dispatches on expression type, apply handles function calls — but with two critical differences: the trampoline for TCO, and a mutable environment model instead of substitution.
The Trampoline
The central abstraction is a two-variant enum:
// crates/sema-eval/src/eval.rs
pub enum Trampoline {
Value(Value), // done, here's the result
Eval(Value, Env), // tail call: evaluate this expr in this env
}Every evaluation step returns a Trampoline instead of a Value. When a function wants to make a tail call — evaluate an expression as its final action — it returns Trampoline::Eval(expr, env) instead of recursing into eval_value. The caller's trampoline loop picks this up and continues iterating:
// crates/sema-eval/src/eval.rs — eval_value_inner
fn eval_value_inner(expr: &Value, env: &Env) -> EvalResult {
let mut current_expr = expr.clone();
let mut current_env = env.clone();
let entry_depth = call_stack_depth();
let guard = CallStackGuard { entry_depth };
loop {
match eval_step(¤t_expr, ¤t_env) {
Ok(Trampoline::Value(v)) => return Ok(v),
Ok(Trampoline::Eval(next_expr, next_env)) => {
// TCO: replace expr and env, continue loop
current_expr = next_expr;
current_env = next_env;
}
Err(e) => { /* attach stack trace if missing */ }
}
}
}This is the same technique described in Guy Steele's "Rabbit" compiler paper (1978), where he showed that tail calls in Scheme can be compiled as jumps. In a native-code compiler (like Chez Scheme), the tail call becomes an actual jmp instruction. In a tree-walking interpreter, we can't jump — but we can return a "please evaluate this next" token and loop. The trampoline bounces between returning tokens and evaluating them, hence the name.
Concrete Example
Consider this tail-recursive loop:
(define (loop n)
(if (= n 0)
"done"
(loop (- n 1))))
(loop 1000000)Without TCO, (loop 1000000) would push one million Rust stack frames and crash. With the trampoline:
eval_stepsees(loop 1000000), evaluatesloopto aLambda, evaluates1000000, callsapply_lambdaapply_lambdacreates a new env withn = 1000000, evaluates theifbody- The
ifspecial form tests(= n 0)→ false, so it returnsTrampoline::Eval((loop (- n 1)), env)— not a recursive call - Back in
eval_value_inner, the loop picks up(loop 999999)as the newcurrent_exprand continues - This repeats 1,000,000 times in constant Rust stack space
The key insight: if is a special form that returns Trampoline::Eval for its branches, and apply_lambda returns Trampoline::Eval for its last body expression. These are the two main sources of tail calls.
How This Differs From Other Approaches
- CPS (Continuation-Passing Style): Some Scheme compilers (such as earlier versions of Orbit and Rabbit) use CPS as an intermediate representation, making every call a tail call. This supports
call/ccnaturally but requires a whole-program transformation. Modern compilers like Chez Scheme use other strategies (nanopass compilation with explicit stack frames). Sema doesn't do CPS; only calls in tail position are optimized. - Bytecode VM: CPython and Lua compile to bytecode and run a
switch-dispatch loop. The VM's instruction pointer replaces the trampoline. Bytecode is faster (smaller dispatch overhead, better cache behavior) but more complex to implement. - Direct recursion: The metacircular evaluator in SICP chapter 4 just calls
evalrecursively. Simple, but blows the stack on deep tail recursion. Sema started this way and migrated to the trampoline.
eval_step: A Single Step
eval_step takes an expression and an environment, and returns one Trampoline:
// crates/sema-eval/src/eval.rs
fn eval_step(expr: &Value, env: &Env) -> Result<Trampoline, SemaError> {
match expr {
// Self-evaluating forms
Value::Nil | Value::Bool(_) | Value::Int(_) | Value::Float(_)
| Value::String(_) | Value::Char(_) | Value::Keyword(_)
| Value::Thunk(_) | Value::Bytevector(_) | Value::HashMap(_)
=> Ok(Trampoline::Value(expr.clone())),
// Symbol lookup
Value::Symbol(spur) => env.get(*spur)
.map(Trampoline::Value)
.ok_or_else(|| SemaError::Unbound(resolve(*spur))),
// Vector/Map: evaluate elements recursively
Value::Vector(items) => { /* eval each element */ }
Value::Map(map) => { /* eval each key and value */ }
// List: special forms, then function application
Value::List(items) => { /* see below */ }
}
}The dispatch order matters:
- Self-evaluating: Numbers, strings, booleans, nil, keywords, etc. return themselves immediately. These are the leaves of every evaluation.
- Symbols: Look up the binding in the environment chain. If not found, return
SemaError::Unbound. - Vectors and maps: Evaluate each element/entry recursively (these are not in tail position, so they call
eval_valuedirectly). - Lists: The interesting case — check for special forms first, then evaluate the head and dispatch on its type.
List Evaluation
When eval_step encounters a list (head arg1 arg2 ...):
// Check head for a special form (O(1) integer comparison)
if let Value::Symbol(spur) = head {
if let Some(result) = special_forms::try_eval_special(*spur, args, env) {
return result;
}
}
// Not a special form — evaluate the head to get the callable
let func = eval_value(head, env)?;
// Dispatch on callable type
match &func {
Value::NativeFn(native) => { /* evaluate args, call Rust fn */ }
Value::Lambda(lambda) => { /* evaluate args, apply_lambda */ }
Value::Macro(mac) => { /* pass unevaluated args, expand */ }
Value::Keyword(spur) => { /* keyword-as-function: (:key map) */ }
other => Err("not callable")
}Note: the head is checked for special forms before it's evaluated. This is essential — if, define, lambda, etc. must not evaluate their arguments eagerly.
Special Form Dispatch
Special form dispatch is the hottest path in the evaluator. Every list expression checks whether its head symbol is one of ~33 special forms. Sema optimizes this with pre-interned Spur constants:
// crates/sema-eval/src/special_forms.rs
struct SpecialFormSpurs {
quote: Spur,
if_: Spur,
cond: Spur,
define: Spur,
lambda: Spur,
// ... 28 more
}These are initialized once and leaked into a &'static reference via Box::leak:
thread_local! {
static SF: Cell<Option<&'static SpecialFormSpurs>> = const { Cell::new(None) };
}
fn special_forms() -> &'static SpecialFormSpurs {
SF.with(|cell| match cell.get() {
Some(sf) => sf,
None => {
let sf: &'static SpecialFormSpurs = Box::leak(Box::new(SpecialFormSpurs::init()));
cell.set(Some(sf));
sf
}
})
}The Box::leak trick converts a heap allocation into a &'static reference that lives for the rest of the program. This is safe because the special form names never change. The result: try_eval_special is a chain of integer comparisons (head_spur == sf.if_), with no string resolution, no hashing, and no allocation.
pub fn try_eval_special(
head_spur: Spur,
args: &[Value],
env: &Env,
) -> Option<Result<Trampoline, SemaError>> {
let sf = special_forms();
if head_spur == sf.quote {
Some(eval_quote(args))
} else if head_spur == sf.if_ {
Some(eval_if(args, env))
} else if head_spur == sf.define {
Some(eval_define(args, env))
}
// ... 30 more branches
else {
None // not a special form
}
}The function returns Option<Result<...>> — None means "not a special form, proceed with function application." This two-level return type avoids the need for a separate "is this a special form?" check.
Literature: This is essentially the same optimization that McCarthy's LISP 1.5 used in 1962 — special forms were identified by their atom pointer, not by string comparison. The Spur is Sema's version of the atom pointer.
Function Application
There are four callable types, each handled differently:
NativeFn
Native functions are Rust closures wrapped in Value::NativeFn. They receive evaluated arguments and return a Value:
Value::NativeFn(native) => {
let mut eval_args = Vec::with_capacity(args.len());
for arg in args {
eval_args.push(eval_value(arg, env)?);
}
push_call_frame(CallFrame { name: native.name.to_string(), ... });
match (native.func)(&eval_args) {
Ok(v) => {
truncate_call_stack(call_stack_depth().saturating_sub(1));
Ok(Trampoline::Value(v))
}
Err(e) => Err(e), // leave frame for stack trace
}
}Native functions always return Trampoline::Value — they don't participate in TCO because their API returns Result<Value, SemaError> rather than Trampoline, so they can't request a tail-eval step. The call frame is pushed before the call and popped on success; on error, it's left in place so the stack trace captures it.
Lambda
Lambdas are the heart of TCO. After evaluating arguments, apply_lambda binds parameters and returns Trampoline::Eval for the last body expression:
// crates/sema-eval/src/eval.rs
fn apply_lambda(lambda: &Lambda, args: &[Value]) -> Result<Trampoline, SemaError> {
let new_env = Env::with_parent(Rc::new(lambda.env.clone()));
// Bind parameters (with rest-param support)
for (param, arg) in lambda.params.iter().zip(args.iter()) {
new_env.set(sema_core::intern(param), arg.clone());
}
// Self-reference for recursion
if let Some(ref name) = lambda.name {
new_env.set(sema_core::intern(name), Value::Lambda(Rc::new(lambda.clone())));
}
// Evaluate all body exprs except the last
for expr in &lambda.body[..lambda.body.len() - 1] {
eval_value(expr, &new_env)?;
}
// TCO: return the last body expr for the trampoline to handle
Ok(Trampoline::Eval(lambda.body.last().unwrap().clone(), new_env))
}This is where TCO actually happens. The last expression is not evaluated by apply_lambda — it's returned as Trampoline::Eval so the trampoline loop can evaluate it without growing the stack. Non-tail body expressions (everything except the last) are evaluated normally via eval_value.
Named lambdas bind themselves in the new environment, enabling direct recursion without requiring letrec or Y-combinator gymnastics.
Macro
Macros receive unevaluated arguments, evaluate their body to produce an expansion, and then return Trampoline::Eval to evaluate the expansion in the caller's environment:
Value::Macro(mac) => {
let expanded = apply_macro(mac, args, env)?;
Ok(Trampoline::Eval(expanded, env.clone()))
}This means macro expansion is always in tail position — the expanded code gets the full benefit of TCO. A macro that expands to a tail call will be optimized just like a direct tail call.
Keyword-as-Function
Keywords can be used as functions for map lookup: (:name person) is equivalent to (get person :name). This is syntactic sugar evaluated directly in eval_step, not a general callable mechanism.
Stack Traces
Sema maintains a thread-local call stack for error reporting:
// crates/sema-eval/src/eval.rs
thread_local! {
static CALL_STACK: RefCell<Vec<CallFrame>> = const { RefCell::new(Vec::new()) };
}Each CallFrame captures the function name, current file, and source span:
// crates/sema-core
pub struct CallFrame {
pub name: String,
pub file: Option<String>,
pub span: Option<Span>,
}Source Span Tracking
Source spans are tracked using a pointer-identity trick. The reader stores Span information in a HashMap<usize, Span> keyed by the Rc::as_ptr address of each list's backing Vec:
fn span_of_expr(expr: &Value) -> Option<Span> {
match expr {
Value::List(items) => {
let ptr = Rc::as_ptr(items) as usize;
lookup_span(ptr)
}
_ => None,
}
}This avoids storing spans inside Value (which would increase the size of every value). The trade-off: spans are only available for list expressions (and vectors, which the reader also registers). Spans remain valid across Rc::clone() calls since cloning an Rc preserves the inner pointer. They only become stale if a list is rebuilt into a new Rc<Vec<_>> (e.g., by map or filter constructing a fresh vector). In practice this works well because error-producing expressions are typically the original parsed forms.
RAII Stack Management
The call stack is managed with a guard struct that truncates on drop:
struct CallStackGuard {
entry_depth: usize,
}
impl Drop for CallStackGuard {
fn drop(&mut self) {
truncate_call_stack(self.entry_depth);
}
}This ensures the call stack is cleaned up even if evaluation panics or returns early via ?. The guard is created at the top of eval_value_inner and persists for the duration of the trampoline loop.
Tail Call Frame Trimming
When the trampoline handles a tail call (Trampoline::Eval), it trims accumulated call frames above the entry depth, keeping only the most recent one:
Ok(Trampoline::Eval(next_expr, next_env)) => {
CALL_STACK.with(|s| {
let mut stack = s.borrow_mut();
if stack.len() > entry_depth + 1 {
let top = stack.last().cloned();
stack.truncate(entry_depth);
if let Some(frame) = top {
stack.push(frame);
}
}
});
current_expr = next_expr;
current_env = next_env;
}Without this trimming, a tail-recursive loop of 1M iterations would accumulate 1M call frames — the stack trace wouldn't overflow the Rust stack, but it would consume unbounded memory. The trim keeps exactly one frame for the current function, which is all you need to know where execution is.
Guardrails
MAX_EVAL_DEPTH (1024)
The evaluator tracks nesting depth via the thread-local EVAL_DEPTH counter, incremented on every eval_value call and decremented on return:
const MAX_EVAL_DEPTH: usize = 1024;
pub fn eval_value(expr: &Value, env: &Env) -> EvalResult {
let depth = EVAL_DEPTH.with(|d| {
let v = d.get();
d.set(v + 1);
v
});
if depth > MAX_EVAL_DEPTH {
return Err(SemaError::eval("maximum eval depth exceeded (1024)"));
}
// ...
}This protects against non-tail recursion that the trampoline can't help with — for example, (f (f (f ...))) nests eval_value calls for each argument evaluation. The limit of 1024 is generous enough for real programs but prevents native stack overflow.
Note that tail-recursive loops do not increase the depth counter, because each iteration returns from eval_value_inner via the trampoline loop rather than nesting a new eval_value call.
EVAL_STEP_LIMIT (Fuzzing)
For fuzz testing, an optional step counter limits the total number of trampoline iterations:
thread_local! {
static EVAL_STEP_LIMIT: std::cell::Cell<usize> = const { std::cell::Cell::new(0) };
static EVAL_STEPS: std::cell::Cell<usize> = const { std::cell::Cell::new(0) };
}When the limit is non-zero, every trampoline step increments EVAL_STEPS and checks against the limit. This catches infinite loops that the depth limit can't — a tail-recursive (define (f) (f)) uses O(1) stack depth but infinite steps. The step counter is reset at each top-level eval_value call (when EVAL_DEPTH is 0).
Environment Model
Sema uses a linked-list scope chain, where each scope is a BTreeMap keyed by Spur:
// crates/sema-core/src/value.rs
pub struct Env {
pub bindings: Rc<RefCell<BTreeMap<Spur, Value>>>,
pub parent: Option<Rc<Env>>,
}Rc<RefCell<...>> makes each scope mutable and reference-counted. BTreeMap<Spur, Value> provides deterministic ordering (important for printing and testing) and is faster than HashMap for the very small maps (1–5 entries) typical of let and lambda scopes — at that size, a few integer comparisons in the B-tree node are cheaper than computing a hash (see the Performance Internals page for the benchmark).
Operations
| Method | Behavior |
|---|---|
get(spur) | Walk the parent chain, return first match |
set(spur, val) | Insert into the current (innermost) scope |
set_existing(spur, val) | Walk the chain, update where found (for set!) |
take(spur) | Remove from current scope only (for COW optimization) |
take_anywhere(spur) | Remove from any scope in the chain |
update(spur, val) | Overwrite in current scope without re-hashing (for hot loops) |
The take method is critical for the copy-on-write map optimization described in the Performance page — by removing a value from the environment before passing it to a function, the Rc reference count drops to 1, enabling in-place mutation.
Literature: This is the standard lexical environment model described in Lisp in Small Pieces (Queinnec, 1996, Chapter 6) — a chain of frames linked by static (lexical) pointers. The alternative for lexical scoping — flat closures that copy all free variables into each closure — is faster for lookup but uses more memory when closures share large environments. Sema uses the chained model because closures are pervasive and lookup cost is dominated by the Spur integer comparison, not chain traversal.
Adding a Special Form
If you need to add a new special form to Sema:
- Add a
Spurfield toSpecialFormSpursincrates/sema-eval/src/special_forms.rsand initialize it ininit()withintern("your-form-name") - Add a match arm in
try_eval_special()— returnSome(eval_your_form(args, env)) - Implement
eval_your_form— it receivesargs: &[Value](unevaluated) andenv: &Env, and must returnResult<Trampoline, SemaError>. If the form has a natural tail position (likeif's branches), returnTrampoline::Evalfor TCO; otherwise returnTrampoline::Value - Add an integration test in
crates/sema/tests/integration_test.rs - Document the form in
website/docs/language/special-forms.md
Further Reading
- Christian Queinnec, Lisp in Small Pieces (Cambridge, 1996) — the canonical deep-dive into Lisp interpreter and compiler implementation, covering environment models, continuations, and compilation strategies
- Guy Lewis Steele Jr., "Rabbit: A Compiler for Scheme" (MIT AI Memo 474, 1978) — introduces the trampoline concept and proves that tail calls can be implemented as jumps
- Abelson & Sussman, Structure and Interpretation of Computer Programs (MIT Press, 1996) — Chapter 4's metacircular evaluator is the template for Sema's
eval_step/apply_lambdasplit; Chapter 5 shows how to compile to a register machine - R. Kent Dybvig, "Three Implementation Models for Scheme" (PhD thesis, 1987) — compares heap-based, stack-based, and string-based models; Sema uses heap-based (Rc+RefCell scopes)