Cost Tracking & Budgets
Usage Tracking
llm/last-usage
Get token usage from the most recent LLM call.
(llm/last-usage) ; => {:prompt-tokens 42 :completion-tokens 15 ...}

llm/session-usage
Get cumulative usage across all LLM calls in the current session.
(llm/session-usage)

llm/reset-usage
Reset session usage counters.
(llm/reset-usage)

Pricing Sources
Sema tracks LLM costs using pricing data from multiple sources, checked in this order:
- Custom pricing — set via (llm/set-pricing "model" input output), always wins
- Dynamic pricing — fetched from llm-prices.com during (llm/auto-configure), cached locally at ~/.sema/pricing-cache.json
- Built-in estimates — hardcoded fallback table (may be outdated)
- Unknown — if no source matches, cost tracking returns nil and budget enforcement is best-effort
Dynamic pricing is fetched with a short timeout (2s) and failures are silently ignored. The language works fully offline — the cache persists between sessions.
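Taken together, the precedence means a custom entry set during the session always shadows both the dynamic cache and the built-in fallback. A minimal sketch, using a placeholder model name and illustrative rates:

```lisp
; "my-model" and the rates below are placeholders, not real pricing.
(llm/set-pricing "my-model" 2.0 8.0) ; $2.00/M input, $8.00/M output
; from here on, cost tracking for "my-model" uses this custom entry,
; regardless of what the cache or fallback table holds
```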
llm/pricing-status
Check which pricing source is active and when it was last updated.
(llm/pricing-status)
; => {:source fetched :updated-at "2025-10-10"}
; or {:source hardcoded} if no dynamic pricing is available

Budget Enforcement
Note: If pricing is unknown for a model (not in any source), budget enforcement operates in best-effort mode — the call proceeds with a one-time warning. Use (llm/set-pricing) to set pricing for unlisted models.
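For a model missing from every source, one pattern is to pin pricing before enabling a budget, so enforcement is exact rather than best-effort (the model name and rates here are illustrative):

```lisp
(llm/set-pricing "local-llama" 0.50 1.50) ; illustrative rates, per million tokens
(llm/set-budget 2.00)                     ; calls past $2.00 now fail instead of warning once
```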
llm/set-budget
Set a spending limit (in dollars) for the session. LLM calls that would exceed the budget will fail.
(llm/set-budget 1.00) ; set $1.00 spending limit

llm/budget-remaining
Check current budget status.
(llm/budget-remaining) ; => {:limit 1.0 :spent 0.05 :remaining 0.95}

llm/clear-budget
Remove the spending limit.
(llm/clear-budget)

llm/set-pricing
Set custom pricing for a model (overrides both dynamic and built-in pricing). Costs are per million tokens.
(llm/set-pricing "my-model" 1.0 3.0) ; $1.00/M input, $3.00/M output

Batch & Parallel
llm/batch
Send multiple prompts concurrently and collect all results.
(llm/batch ["Translate 'hello' to French"
           "Translate 'hello' to Spanish"
           "Translate 'hello' to German"])

llm/pmap
Map a function over items, sending all resulting prompts in parallel.
(llm/pmap
  (fn (word) (format "Define: ~a" word))
  '("serendipity" "ephemeral" "ubiquitous")
  {:max-tokens 50})
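A common pattern is to bracket a parallel run with the usage calls from the top of this page, so the cost of the whole batch can be read off afterwards (the prompts here are illustrative):

```lisp
(llm/reset-usage)              ; start from a clean counter
(llm/batch ["Summarize report A"
            "Summarize report B"])
(llm/session-usage)            ; cumulative tokens for the batch just sent
```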