Changelog
All notable changes to Noumenon, mirrored from CHANGES.md in the source repo.
Unreleased
0.12.3
Fixes
- `noum ask` and `noum introspect` now report a clean 400 when `NOUMENON_LLM_BASE_URL` isn't a real URL: saving a bare alias like `claude` in `~/.noumenon/credentials` (instead of `https://api.anthropic.com`) used to slip past the existing blank-check in `llm/require-base-url!`, so the value flowed all the way into the http-kit call that builds `<base-url>/v1/messages`. http-kit then failed URL parsing with `host is null: claude/v1/messages`, the daemon's route catch-all rewrote that as a generic 500, and the launcher displayed `Error: Internal server error`, with nothing pointing at credentials. Every command that touches the LLM (`ask`, `introspect`, `analyze`, `synthesize`, `enrich --analyze`, `digest`, `benchmark`, `update --analyze`) shared this failure mode because they all route through `llm/make-messages-fn-from-opts`. `valid-base-url?` now parses the value as a `java.net.URI` and requires an http(s) scheme plus a non-blank host; `require-base-url!`, `require-api-key!`, and `require-model!` all tag their ex-info with `:status 400` so the HTTP route catch-all serves them as `400 "Invalid NOUMENON_LLM_BASE_URL: \"claude\". Expected an absolute URL with scheme and host (e.g. https://api.anthropic.com)."` instead of swallowing the message into a 500.
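
The validation rule in this fix (an http(s) scheme plus a non-blank host, rejected with a 400-tagged error) can be sketched as follows. This is an illustrative Python analogue, not the project's Clojure code: `ConfigError`, `valid_base_url`, and `require_base_url` are hypothetical names, and `urllib.parse.urlsplit` stands in for `java.net.URI`.

```python
from urllib.parse import urlsplit


class ConfigError(ValueError):
    """Hypothetical error type that carries an HTTP status (like ex-info with :status)."""

    def __init__(self, message, status=400):
        super().__init__(message)
        self.status = status


def valid_base_url(value):
    """Return True only for an absolute http(s) URL with a non-blank host."""
    if not isinstance(value, str) or not value.strip():
        return False
    try:
        parts = urlsplit(value.strip())
    except ValueError:  # e.g. a malformed IPv6 literal
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)


def require_base_url(value):
    """Return the value unchanged, or raise a 400-tagged ConfigError."""
    if not valid_base_url(value):
        raise ConfigError(
            f'Invalid NOUMENON_LLM_BASE_URL: "{value}". '
            "Expected an absolute URL with scheme and host "
            "(e.g. https://api.anthropic.com)."
        )
    return value
```

A bare alias such as `claude` parses with an empty scheme and no host, so it is rejected before any HTTP call is attempted.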
0.12.2
Fixes
- `noum update` and `noum watch` now stream progress instead of going silent: both commands posted to `/api/update` without SSE, and the server's `handle-update` had no streaming path, so the launcher's HTTP request blocked until the daemon finished a fresh import + LLM analysis pass (potentially many minutes) with no per-file feedback. The output looked indistinguishable from a hang: only the JRE-selection line printed, then dead silence until completion. The server now routes `handle-update` through `mw/with-sse` (mirroring `/api/import` / `/api/analyze` / `/api/enrich` / `/api/digest`) and threads a `:progress-fn` through `sync/update-repo!` into `git/import-commits!`, `imports/enrich-repo!`, and `analyze/analyze-repo!`. The launcher adds `"update"` to its `progress-commands` set so it requests SSE and renders a TUI spinner/bar, and `watch-loop!` now builds a fresh progress handler per polling iteration so each in-flight iteration shows live progress instead of going dark for the duration of the call.
0.12.1
Fixes
- `noum` reuses an existing system Java when one is available: the launcher used to unconditionally download a ~200MB JRE to `~/.noumenon/jre/` on first run, even when the user already had Java 21+ installed. It now checks `$JAVA_HOME` and `java` on `PATH` first; if either points at a Java 21+ runtime (the minimum the uberjar targets; older JVMs fail with `UnsupportedClassVersionError` at class-load time), the launcher uses it and skips the download. The bundled JRE remains the fallback for users on Java 17 or older, or with no Java at all.
- `noum` JRE bootstrap no longer fails on WSL with a cross-filesystem extraction error: the launcher staged the downloaded JRE in `/tmp` (which on WSL is `tmpfs`) before moving it into `~/.noumenon/jre/` (which lives on ext4). `java.nio.file.Files/move` can rename single files across filesystems but throws `FileSystemException` on non-empty directories, so the move blew up on the JRE's `legal/` / `bin/` / `lib/` subdirs with `Error: /tmp/noum-jre-…/jdk-21…/legal`. Staging now happens under `~/.noumenon/jre-staging-…/` so the move is always intra-filesystem; a copy-tree fallback covers any other cross-fs configuration.
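
The "rename first, copy the tree on a cross-filesystem failure" pattern from the WSL fix can be sketched as follows. This is a Python analogue under stated assumptions, not the launcher's code: `move_tree` is a hypothetical name, and `errno.EXDEV` is how a cross-device rename surfaces on POSIX, playing the role of the `FileSystemException` mentioned above.

```python
import errno
import os
import shutil


def move_tree(src, dst):
    """Move directory src to dst, copying the tree if a plain rename cannot cross filesystems."""
    try:
        # Fast path: an atomic rename, which only works within one filesystem.
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # unrelated failure (permissions, missing source, ...)
        # Fallback: copy every file and subdirectory, then remove the source.
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)
```

Staging next to the destination, as the fix does, keeps the common case on the fast path; the fallback only matters for unusual mount layouts. Python's own `shutil.move` applies a similar strategy.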
0.12.0
BREAKING CHANGES
- LLM configuration collapsed to two env vars (plus one optional). Noumenon used to carry a multi-provider router: an EDN-encoded provider map (`NOUMENON_LLM_PROVIDERS_EDN`), a default-provider selector (`NOUMENON_DEFAULT_PROVIDER`), per-provider env keys (`NOUMENON_ZAI_TOKEN`, `ANTHROPIC_API_KEY`), a runtime-mode toggle (`NOUMENON_RUNTIME_MODE`), an HTTPS allowlist (`NOUMENON_LLM_BASE_URL_ALLOWLIST_EDN`), model aliases (`sonnet`/`haiku`/`opus`), a `--provider` flag on every LLM-touching subcommand, `llm-providers`/`llm-models` CLI subcommands, and matching `noumenon_llm_providers`/`noumenon_llm_models` MCP tools. All of that is gone.
What changed
- Replaced: `NOUMENON_LLM_PROVIDERS_EDN`, `NOUMENON_DEFAULT_PROVIDER`, `NOUMENON_LLM_PROVIDER`, `NOUMENON_ZAI_TOKEN`, `ANTHROPIC_API_KEY`, `NOUMENON_RUNTIME_MODE`, and `NOUMENON_LLM_BASE_URL_ALLOWLIST_EDN` → `NOUMENON_LLM_BASE_URL` (required), `NOUMENON_LLM_API_KEY` (required), and `NOUMENON_LLM_MODEL` (optional default for `--model`).
- Removed: the `--provider` flag on every subcommand; the `llm-providers` and `llm-models` CLI subcommands; the `noumenon_llm_providers` and `noumenon_llm_models` MCP tools; the `provider` property on every other MCP tool input schema; the `sonnet`/`haiku`/`opus` model aliases. `--model` now takes a raw model id and passes it through to the upstream endpoint verbatim.
- Credentials file fallback: `~/.noumenon/credentials` is now read directly by Noumenon as a fallback to env vars, so no `source` step is needed. The fallback is automatically disabled when the HTTP daemon binds to anything other than `127.0.0.1`, so a shared-service deployment cannot pick up a user's on-disk credentials.
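
The credential-resolution policy in the last item (env var first, credentials file only when bound to loopback) can be sketched as follows. This is a rough Python illustration, not Noumenon's code: `resolve_setting` and its arguments are hypothetical, and the credentials file is assumed to be already parsed into a dict because its on-disk format is not described here.

```python
def resolve_setting(name, env, file_credentials, bind_host):
    """Resolve one setting: env var first, then the credentials file on loopback only."""
    value = env.get(name)
    if value:
        return value
    # The file fallback applies only to a daemon bound to 127.0.0.1, so a
    # shared-service deployment never reads a user's on-disk credentials.
    if bind_host == "127.0.0.1":
        return file_credentials.get(name)
    return None
```

For example, `resolve_setting("NOUMENON_LLM_API_KEY", {}, {"NOUMENON_LLM_API_KEY": "k"}, "0.0.0.0")` yields `None`, while the same call with `"127.0.0.1"` yields `"k"`.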
Why
Noumenon is not an LLM router. The provider-map config was an in-house mini-router that duplicated what dedicated tools — OpenRouter, LiteLLM, and any Anthropic-Messages-API-compatible gateway — do far better, with broader provider coverage and active maintenance. For multi-model flexibility, point NOUMENON_LLM_BASE_URL at one of those instead. This was the highest-friction surface in the codebase and the collapse removes ~300 lines of routing, validation, allowlisting, and discovery code with no loss of supported capability. The local single-user and headless shared-service deployment shapes are both still first-class — the daemon's bind address picks which credential-resolution policy applies.
How to upgrade
Pick the upstream and set three shell variables (or run `noum setup` to populate `~/.noumenon/credentials` interactively):
```sh
# Anthropic direct
export NOUMENON_LLM_BASE_URL=https://api.anthropic.com
export NOUMENON_LLM_API_KEY=sk-ant-...
export NOUMENON_LLM_MODEL=claude-sonnet-4-6-20250514

# OpenRouter (Anthropic-compatible route)
export NOUMENON_LLM_BASE_URL=https://openrouter.ai/api/v1
export NOUMENON_LLM_API_KEY=sk-or-...
export NOUMENON_LLM_MODEL=anthropic/claude-sonnet-4-5

# Local LiteLLM (Anthropic-format proxy)
export NOUMENON_LLM_BASE_URL=http://localhost:4000
export NOUMENON_LLM_API_KEY=sk-litellm-master-...
export NOUMENON_LLM_MODEL=
```
If you previously used `--provider claude --model sonnet`, drop `--provider` entirely and pass the full upstream-recognized model id via `--model` (or set `NOUMENON_LLM_MODEL`). Old env vars are not consulted; the launcher's `noum setup` wizard will prompt for the new ones.
0.11.1
Fixes
- Docker image build adds `git` to the build stage: the v0.11.0 release pushed clj-p4 from `:local/root` to `{:git/tag "v0.6.1-alpha"}`, so `clojure -T:build uber` now needs to clone the dependency via git inside the build container. The build stage's `apk add --no-cache curl bash` did not include git, so the docker job in the release workflow failed at `clojure -T:build uber` with a tools.deps git-clone error. The runtime stage already had git for noumenon's own clone-and-import workflow; the build stage now matches.
0.11.0
Changed
- clj-p4 upgraded to v0.6.1-alpha and pinned to a public Git tag: `deps.edn` previously resolved clj-p4 via `:local/root "../clj-p4"`, which made noumenon non-buildable on any machine without a sibling clone of the library. The coordinate is now `{:git/tag "v0.6.1-alpha" :git/sha "332b280"}`, matching the format `cognitect-labs/test-runner` already uses in the same file. clj-p4 v0.4.0-alpha renamed several namespaces (`clj-p4.exclude` → `clj-p4.excludes`, `clj-p4.spec` → `clj-p4.predicates`, `clj-p4.shell.proc` → `clj-p4.io.subprocess`) and renamed `api/sync!` to `api/fetch!`; v0.6.0-alpha removed the pre-compiled `:exclude` escape hatch outright (now throws `:legacy-exclude-removed` at the boundary). `noumenon.p4` was still calling all four; the adapter is rewritten against the post-0.4.0-alpha API surface. No public-API change for `noumenon.p4`'s callers (`noumenon.repo`, `noumenon.repo-manager`, `noumenon.sync`).
- Binary filtering is delegated to clj-p4: clj-p4 v0.5.0+ ships its own curated binary-category set (nine categories, ~77 patterns, including Wwise `.bnk`/`.wem` and Unreal cooked content `.uasset`/`.umap`/`.upk`/`.ubulk` that the noumenon-side list also covered) plus a Perforce-type catch-all that drops any revision whose `:rev/type` is `:binary`/`:apple`/`:resource`. Both `noumenon.p4/clone!` and `noumenon.p4/sync!` now pass `:exclude-binaries? true :exclude-categories :all`, replacing the noumenon-side `:exclude` pattern vector compiled from `resources/p4-excludes.edn`. This subsumes the post-0.10.3 fix that made `sync!` forward the same exclude vector as `clone!`; clone-vs-sync symmetry is now structural (identical fixed policy at both call sites) rather than a bug that had to be tracked.
Removed
- `resources/p4-excludes.edn`: 42-line categorised extension blocklist that duplicated clj-p4's own built-in list. clj-p4 owns the responsibility now; the file is gone, the resource loader (`excludes-resource` delay in `noumenon.p4`) is gone, and the `compile-excludes` private helper is gone.
- `:no-default-excludes?`, `:extra-excludes`, `:includes` options on `noumenon.p4/clone!`: these were documented in the function's docstring but had no actual producer in `src/` or `test/`. If a future caller needs path-level carve-outs, plumb `:excludes`/`:includes` through `p4-opts` directly to `clj-p4.api/clone!`/`fetch!`.
0.10.3
Fixes
- Introspect `extra_repos` resolver and target-set are no longer duplicated: `resolve-extra-repos` had near-identical inline copies in CLI introspect (`cli/commands/introspect.clj`) and HTTP introspect (`http/handlers/introspect.clj`); they used different conn-open helpers (`db/connect-and-ensure-schema` vs `db/get-or-create-conn`) but were otherwise the same shape. Lifted to `noumenon.repo/resolve-extra-repos` so both transports go through one definition. The redundant `mw/allowed-introspect-targets` set definition in `http/middleware.clj` is removed in favor of the canonical `util/valid-introspect-targets` that CLI already used.
- `noum query` clamps result sets like the HTTP API: CLI returned the full `:ok` seq from `query/run-named-query`, so `noum query recent-commits .` against a large repo dumped every row to stdout. HTTP capped at 500 by default and 10000 max via `clamp-limit`. CLI now accepts `--limit <n>` (default 500, clamped to [1, 10000]) and applies it via the same `query/clamp-limit` helper, lifted into `noumenon.query` so CLI and HTTP share one definition.
- `noum introspect --git-commit` is silently disabled on bare repos: HTTP `/api/introspect` already gated `:git-commit?` on `(not (bare-repo?))` so a bare clone (no working tree) wouldn't hit a confusing failure deep in the commit step. The CLI passed the flag through unchecked. CLI now matches: when `--git-commit` is set against a bare repo, the flag is silently downgraded and a `[--git-commit ignored] target is a bare git repo` message is logged so the user sees what happened.
- HTTP `digest` runs the same pipeline as CLI `digest`: CLI digest was `update → analyze → resolve calls → synthesize → embed → benchmark`. HTTP digest was missing the `resolve calls` and `embed` steps entirely, so a daemon-mode digest produced a graph with no cross-segment call edges (`:code/calls`) and no TF-IDF index. `noumenon_search` returned no results, and `segment-callers`/`uncalled-segments` queries came back empty. HTTP now runs both steps in the same order: calls is gated on `skip_analyze` (matching CLI), embed runs unconditionally (matching CLI). Result map exposes `:calls` and `:embed` keys.
- HTTP enrich/analyze/synthesize refuse to silently create empty DBs: calling `/api/enrich`, `/api/analyze`, or `/api/synthesize` on a never-imported repo would let `db/get-or-create-conn` create an empty Datomic DB on disk and the handler would report "0 files processed" success, leaving the user with a phantom database and the impression that nothing was wrong. CLI's `with-existing-db` already errored out in the same scenario; HTTP now matches via a new `with-imported-repo` middleware wrapper that returns `404 "Database not yet imported … Run /api/import first"` when the on-disk db dir doesn't exist. Endpoints that legitimately establish the DB (import, update, digest) keep using the create-on-demand `with-repo`.
- `noum ask` persists session records to the meta DB: only the HTTP `/api/ask` handler called `ask-store/save-session!` after each ask invocation. The CLI returned the answer to stdout but left no trace in the meta DB, so CLI ask sessions were invisible to the introspect loop's feedback/training signal and `noum ask --continue-from <id>` couldn't reference its own prior sessions. CLI now saves the session with `:channel :cli`, `:caller :human`, the resolved repo db-name, and accurate wall-clock duration. Save failures are logged and swallowed so the answer always reaches the user.
- `noum ask --max-iterations` is now capped at 50: the HTTP `/api/ask` handler clamped `:max_iterations` to `[1, 50]` so an LLM agent (or careless caller) couldn't run away with cost. The CLI's `do-ask` passed the user-supplied value straight through, so `noum ask -q "…" --max-iterations 10000 .` would let the agent loop run 10000 iterations regardless of `--max-cost`/`--stop-after`. CLI now applies the same `[1, 50]` clamp; the default (10) is unchanged.
- HTTP synthesize and digest reseed artifacts before running: CLI synthesize and CLI digest both call `artifacts/reseed!` before constructing the LLM, so an updated prompt/query/rules seed is picked up automatically. The HTTP handlers skipped this step, so a daemon-mode `noum digest` produced different results than the CLI version after a seed-source edit. `run-synthesize` and `run-digest` now run reseed up front, matching the CLI contract. `reseed!` is identity-attribute-driven and idempotent, so the cost is zero on the steady state.
- HTTP synthesize and HTTP digest's synth step honor `:max-tokens 16384`: `POST /api/synthesize` and the synthesize step inside `POST /api/digest` were calling `wrap-as-prompt-fn-from-opts` with no max-tokens override, so synth output (long architectural component descriptions) was getting truncated at the provider default (typically 4096) on the daemon path. The CLI synthesize command already raised the cap to 16384; HTTP now matches. The digest handler builds a separate `synth-llm` for the synthesize step (mirroring the CLI digest pattern) so the analyze and benchmark steps keep using the unraised cap.
- `noum list-databases --delete` refuses to wipe the meta DB: the CLI's `--delete` branch had no reserved-name guard, so a typo could destroy `noumenon-internal` and take every prompt, query, rules artifact, benchmark/introspect run record, ask session, token, and setting with it. The HTTP `DELETE /api/databases/:name` handler already rejected this with a 400; the CLI now matches with a `Cannot delete reserved database: <name>` error and exit 1. The CLI delete path also now calls `db/evict-conn!` after a successful delete (matching the HTTP handler), so a stale cached connection doesn't survive past the deletion.
- `noum analyze --no-promote` now actually bypasses the promotion cache: the `--no-promote` flag was only exposed on the HTTP and MCP transports; the CLI didn't even define it, so users who wanted to force a fresh LLM call had no way to do so via `noum analyze`. Added the flag to the analyze CLI spec and plumbed it through `build-analyze-opts` to `analyze-repo!`'s `:no-promote?` parameter, matching HTTP's `(boolean (:no_promote params))` shape so missing/false/true all produce a definite boolean. CLI, HTTP, and MCP now agree on the contract.
Removed
- Dead `mcp.handlers.*` namespace tree: `fd43977 refactor(mcp): make the MCP server a pure proxy` (2026-04-30) made the bridge forward every `tools/call` to the daemon over HTTP and removed the in-process handler dispatch. The five handler namespaces (`mcp.handlers.{query,mutation,benchmark,introspect,meta}`) and their support helpers in `mcp/util.clj` (`with-conn`, `lookup-repo-uri`, `resolve-extra-repos`, `selector-opts`, `validate-llm-inputs!`, `provider+model`, `validate-layers`, length-cap defs, `allowed-layers`/`allowed-introspect-targets` sets) have been carrying no live callers since then. Deletion is no behavior change: the actual MCP behavior is whatever the HTTP daemon does. Audit findings about MCP-vs-HTTP drift in those handlers were false positives against dead code; this removes the surface area so future audits see only live code paths.
- `--no-auto-update` flag on `noum serve`: the only consumer of the `:auto-update` setting was `mcp.util/with-conn`'s auto-update branch, deleted alongside the handler tree. The serve command's epilog text is updated to reflect that the bridge is stateless and forwards to the daemon.
Fixes
- HTTP `POST /api/analyze` honors the `reanalyze` parameter: `noum analyze . --reanalyze stale` (and any other scope) silently produced zero analyzed files because the daemon's HTTP handler never called the retraction step that the CLI and MCP handlers both did. The `:reanalyze` field on the JSON body was dropped on the floor; the user only saw `digest` work because its `update` step retracted analysis on changed files via a different code path. The two near-identical local copies of `prepare-reanalysis!` (one in `cli/commands/pipeline.clj`, one in `mcp/handlers/mutation.clj`) are lifted to a single `noumenon.sync/prepare-reanalysis!` plus a shared `valid-reanalyze-scopes` set; the HTTP handler now calls it before `analyze-repo!` and returns 400 on an invalid scope. CLI, HTTP, and MCP now agree on the contract.
0.10.2
Fixes
- `benchmark` no longer eagerly resolves a provider when there's no model to use: `load-run-context` always called `llm/make-isolated-prompt-fn` for any run that included the `:raw` layer, which forced full provider resolution (model lookup, API-key validation) at construction time even if the caller had supplied an `invoke-llm` mock. Tests that didn't bother passing `:model-config` (defaulted to `{:provider "glm"}`) crashed with "No model selected for provider glm"; tests that did pass a `:provider "claude"` config crashed on the missing API key. The construction is now gated on `(:model model-config)`: explicit model present means the user wants the isolated path; without it, `select-llm-fn` falls back to the main `invoke-llm` for `:raw` stages, matching its existing fallback contract. The `with-bench-mocks` test helper also stubs `make-isolated-prompt-fn` so tests that DO pass an explicit model config (without a real API key) don't trigger the API-key check.
- `elixir-test` skips cleanly when Erlang runtime is missing: the test guard checked `which elixir` only. On a machine with `elixir` on PATH but no Erlang (`erl: not found`), the `which` test passed, downstream import-extraction silently produced zero edges, and the count-query assertion `(pos? (ffirst edges))` NPEd because `ffirst` of an empty Datalog result is nil. The guard now runs `elixir --version` so the four Jason tests skip cleanly when the runtime is broken; the assertion is also nil-safe (`(or (ffirst edges) 0)`) so a mis-configured extraction surfaces as "expected positive edge count, got 0" instead of an NPE.
- `clojure -M:test` runs the suite again: moving the per-language fixture sources to `test/fixtures/` brought files like `test/fixtures/clojure/test/myapp/core_test.clj` under cognitect.test-runner's namespace-discovery walk. The fixture's declared ns (`myapp.core-test`) doesn't match its filesystem location, so `(require 'myapp.core-test)` failed with `FileNotFoundException` and the runner aborted before any test ran. The clj-kondo exclusion was added in `f741235` but the equivalent test-runner exclusion was not. The `:test` alias now restricts discovery to `test/noumenon` via `-d` (for `:main-opts`) and `:dirs` (for `:exec-fn`); CI's `clojure -M:test 2>&1 | tail -20` shows real test output instead of the FileNotFoundException startup error.
- `noum stop` adopts an orphan daemon when `daemon.edn` is missing: when an earlier failed stop or a partial cleanup left the daemon JVM running but removed `~/.noumenon/daemon.edn`, the launcher had no record of the daemon and reported "No managed daemon to stop." while the JVM kept holding the meta-db lock indefinitely. The user's only recourse was `kill -9` from outside the tool. `noum stop` now falls through to the `lsof` lock-holder probe (the same one already used by `noum start` to name the conflicting PID) and adopts that PID through the existing SIGTERM-then-SIGKILL fallback. Output names the path taken so it's auditable: "No daemon.edn; adopting orphan PID N (cmdline)." → "Orphan daemon stopped (PID N)." or "Orphan daemon force-killed (PID N)."
- Duplicate query-string keys return 400 instead of silently last-wins: `parse-query-params` collapsed repeated keys via `(into {} …)`, so `?repo_path=/safe&repo_path=/etc` reached the handler with `/etc` and a defender filtering on the first occurrence would miss the actual input. The dispatcher now rejects requests with any duplicated key with `400 "duplicate query parameter <name>"`. Single-value query strings keep their existing first-and-only semantics.
- Negative `limit` no longer returns silently empty results: `POST /api/query`, `/api/query-raw`, `/api/query-as-of`, and `/api/query-federated` clamped only the upper bound (`(min limit 10000)`), so `limit:-5` flowed through to `(take -5 …)` which returned `[]` while `:total` reported a non-zero count; a scripted client would misread the empty `:results` as "no data". A new `clamp-limit` middleware helper floors at 1 and caps at 10000; missing or unparseable values still default to 500.
- Readers can attach feedback to their own ask sessions: `auth/admin-only-prefixes` listed `/api/ask/sessions` as a bare prefix and `requires-admin?` matched it via `starts-with?`, so all three sub-routes (list, detail-get, feedback-post) gated on admin. That meant a reader who created an ask session via `/api/ask` could not then post feedback on it, defeating the introspect loop's signal-harvesting hook. A new `reader-allowed-patterns` list overrides the prefix match for `POST /api/ask/sessions/:id/feedback` specifically; the list and detail endpoints stay admin-only.
- LLM model-resolution errors return 400 with the cause, not bare 500: `resolve-model-id` rejects three configuration-class shapes ("No model selected for provider X", "Configured :default-model is not listed in :models", "Model X is not configured") via `ex-info` with no `:status`, so any HTTP route that ran an LLM (`/api/ask`, `/api/analyze`, `/api/synthesize`, `/api/digest`) lost the actionable message inside the routes-handler 500 fallback; the user only saw "Internal server error". All three throws now carry `:status 400 :message <reason> :user-message <reason>`. The same shapes are also user-actionable (the client can pass `model` in the request body to override), so 400 is the right code.
- Missing `query_name` returns a clean 400, not the 89-query registry: `POST /api/query` and `/api/query-as-of` length-capped `query_name` but didn't reject blank/missing values; they forwarded an empty string to `query/run-named-query`, which built an "Unknown query: " error string and concatenated all 89 query names, a ~4 KB response for what should be `"query_name is required"`. Both handlers now reject the missing field up front, before `with-repo`, so the error is also independent of repo state.
- `noum daemon` and `noum serve` honor `NOUMENON_DB_DIR`: both CLI commands hardcoded a `~/.noumenon/data` fallback before the env-var lookup ever ran, so `NOUMENON_DB_DIR=/some/path noum daemon` silently kept locking `~/.noumenon/data`. The lookup is hoisted into a single `resolve-db-dir` helper with documented precedence: `--db-dir` flag → `NOUMENON_DB_DIR` env → `~/.noumenon/data`. `http.server/resolve-server-config`'s own env-var support remains unchanged for callers that bypass the CLI; the CLI just stops shadowing it.
- `/api/artifacts/history?type=prompt` (no `name`) returns 400, not 500: the HTTP handler forwarded a nil name into the Datalog query inside `artifacts/prompt-history`, which barfed with "Unable to find data source: $__in__2" and surfaced to the client as a generic "Internal server error". The MCP handler already validated this branch; the HTTP handler now matches with a clean 400 "name is required when type is 'prompt'", plus a length cap on `name`.
- `params` as a JSON array (or any non-object) returns 400, not 500: `POST /api/query`, `/api/query-as-of`, and `/api/query-federated` keywordized `params` with `(into {} (map (fn [[k v]] …)) raw)` *before* any validator ran. Passing `{"params":[1,2]}` made the transducer try to destructure a Long as a `[k v]` pair, which threw `IllegalArgumentException` and surfaced to the client as a generic 500. `validate-params!` now type-checks first (rejects non-maps with a clear "params must be an object" message), and the three handlers run that check before keywordizing, so a non-map shape always 400s, regardless of whether the repo resolves.
- Bad URLs to `POST /api/repos` now return 400 instead of 500: `git/validate-clone-url!` and `validate-url-host!` rejected unsafe URLs (`file://`, `http://localhost`, `http://127.0.0.1`, unresolvable hosts) by throwing `ex-info` with no `:status`, so the HTTP routes handler's `(or status 500)` fallback turned every kind of bad URL into a generic 500 with the body `{"ok":false,"error":"Internal server error"}`. The actual rejection reason was only visible in the daemon log. Both validators now set `:status 400 :message <reason>`, matching the pattern every other input validator already follows.
- `schema-summary` renders value-type and cardinality as keywords, not raw entity IDs: `query/list-attributes` was binding `?vt`/`?card` directly to the value-type/cardinality entity refs. Datomic Local does not auto-resolve refs in `:find`, so the JSON returned by `GET /api/schema/<db-name>` (and the MCP `noumenon_get_schema` tool that wraps it, and the CLI `show-schema` output) printed lines like `:arch/component 20 35 — …` instead of `:arch/component :db.type/ref :db.cardinality/one — …`. The endpoint's whole point is to give an LLM/agent something it can read, so the numeric IDs made the surface effectively useless. The query now joins to `:db/ident` for both refs.
- HTTP `/api/import` now persists `:repo/head-sha`: the CLI's `do-import` already wrote `:repo/uri` and `:repo/head-sha` after a fresh import, but the HTTP handler's `run-import` did not, so `GET /api/status/<db-name>` (and the MCP `noumenon_status` tool that wraps it) returned `head-sha: null` after the documented first-step workflow. The MCP description tells callers to "compare with `git rev-parse HEAD` to check if the knowledge graph is up to date"; that comparison was useless on a freshly-imported repo. `run-import` now mirrors the CLI and transacts `{:repo/uri repo-path :repo/head-sha (git/head-sha repo-path)}` after the import is complete; the SHA is also returned in the response body.
- `derive-db-name` disambiguates same-basename repos with a path hash: two filesystem paths that happened to share a basename (e.g. monorepo subdirs both named `repo`, or two clones of the same project at different locations) silently collapsed to the same Datomic database. The user's knowledge graph for one repo was getting merged with another's commits, files, and analyses with no warning. db-name format is now `<sanitized-basename>-<12-hex-of-canonical-path>`; same canonical path always yields the same db-name (so re-running on the same repo is still idempotent), but different paths now never collide. Existing databases under the old bare-basename names will not be recognized after upgrade; re-import the affected repos. Tests that hardcoded `"ring"`/`"jason"`/`"mino"` as db-names now derive the name through `util/derive-db-name` so they continue to track the CLI's actual derivation.
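
The `<sanitized-basename>-<12-hex-of-canonical-path>` naming scheme can be sketched as follows. This is a Python illustration only. The hash function (SHA-256 here), the sanitisation rule, and the `"repo"` fallback for an empty basename are all assumptions, since the entry above does not specify them.

```python
import hashlib
import os
import re


def derive_db_name(repo_path):
    """Build '<sanitized-basename>-<12 hex chars of a hash of the canonical path>'."""
    canonical = os.path.realpath(repo_path)
    # Assumed fallback for a path with no basename, such as the filesystem root.
    basename = os.path.basename(canonical) or "repo"
    # Assumed sanitisation: keep letters, digits, '_' and '-', replace the rest.
    sanitized = re.sub(r"[^A-Za-z0-9_-]", "-", basename)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"{sanitized}-{digest}"
```

Hashing the canonical path rather than the path as typed is what keeps re-runs idempotent: `./repo` and its absolute form resolve to the same name, while two different directories that are both called `repo` do not.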
0.10.1
Fixes
- CI lint excludes test fixtures: moving `test-fixtures/` to `test/fixtures/` in 0.10.0 brought the per-language fixture tree under the `clojure -M:lint` scan path (`src test`). One Clojure fixture (`test/fixtures/clojure/test/myapp/core_test.clj`) requires `myapp.core` without using it (intentional, since the fixture imitates an "imports unused namespace" pattern that the import-extraction logic is supposed to detect), but clj-kondo flagged it as a warning and exited with code 2. New `.clj-kondo/config.edn` adds `{:output {:exclude-files ["test/fixtures/.*"]}}` so all language fixtures are skipped uniformly.
0.10.0
Changed
- MCP server is now a pure proxy: `mcp/serve!` no longer falls back to opening local Datomic when no daemon is reachable. The in-process `tool-handlers` map and `handle-tools-call` (along with the `suppress-datomic-logging!` call and the dependency on `mcp.handlers.*` from the bridge) are gone; every `tools/call` forwards to whatever daemon `proxy/resolve-conn` picks at that exact moment, with the lookup happening per call so a daemon that comes up mid-session is used immediately. When no daemon is reachable, the bridge returns a structured MCP error pointing the caller at `noum start` rather than racing the daemon for the Datomic file lock and silently winning. The HTTP daemon's routes still use the `h-mut`/`h-query`/`h-meta`/`h-bench`/`h-intro` handler namespaces; only the bridge stopped touching them.
- `do-serve` ensures a daemon before booting the MCP bridge: new namespace `noumenon.daemon-control` exposes `ensure-spawned!`, a one-shot bootstrap that returns the existing daemon connection if one is reachable, otherwise `ProcessBuilder`-spawns a sibling JVM running the same code via the `daemon` subcommand and polls every 500ms (up to 15s) for it to become healthy. The argv is built from `java.home` + `java.class.path`, so it works for both uberjar runs (single jar on classpath) and dev `clj -M:run` runs (long classpath) without branching. The spawned daemon is detached: it outlives the MCP bridge, so the next Claude Code session reuses it. Composition lives in `do-serve` (the CLI subcommand handler), not in `mcp/serve!` itself; `daemon-control` is intentionally not an Integrant component because the daemon must outlive its caller and Integrant has no idiomatic "attach to existing instance" pattern.
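
The "poll every 500ms, up to 15s" health wait used by `ensure-spawned!` can be sketched as a small bounded polling loop. This is a Python illustration; `wait_until_healthy` and its parameters are hypothetical names, and the health check itself is passed in as a callable because the real probe is not described here.

```python
import time


def wait_until_healthy(is_healthy, interval_s=0.5, timeout_s=15.0):
    """Poll is_healthy() every interval_s seconds; give up once timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_healthy():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Checking health before checking the deadline means an already-running daemon returns immediately, which matches the "return the existing connection if one is reachable" behaviour described above.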
Fixes
- `noum start` surfaces the real daemon-start error instead of a 30-second timeout: `start!` now detects an early JVM exit via `.isAlive` on the spawned `:proc`, so a daemon that crashes in the first second fails fast with the actual cause rather than waiting out the full 30 seconds and reporting a meaningless "failed to start within 30 seconds." The launcher also switches `:out`/`:err` from `io/writer (... :append true)` to `[:append paths/daemon-log]` so the OS handles the redirect; the buffered Java writer was never being flushed before the launcher threw, eating the daemon JVM's stderr. Failure messages now include the last 30 lines of `~/.noumenon/daemon.log`, and a true 30-second timeout `.destroy`s the JVM instead of leaking it.
- `noum start` names the lock holder when daemon-start fails: when the daemon JVM exits before becoming healthy, the failure message now includes "Lock currently held by PID N (cmdline)" if `lsof` finds a process holding `~/.noumenon/data/noumenon/noumenon-internal/.lock`. The cmdline is truncated to ~120 characters with a middle ellipsis so a Java classpath doesn't drown the message. Both `lsof` and `ps -p X -o command=` behave identically on macOS and Linux. If `lsof` is missing or the lock is free, the helper silently returns nil and the failure message is unchanged (never block the error path).
- `noum stop` confirms the kill and falls back to SIGKILL. The previous `stop!` had three independent bugs that compounded into orphan JVMs: only sent SIGTERM with no fallback when the JVM hung in shutdown, printed "Daemon stopped." regardless of whether kill actually worked, and deleted `daemon.edn` unconditionally, so a failed stop left an unkillable daemon with no record, invisible to future stop attempts. Rewritten as a bounded sequence: SIGTERM, poll up to 5s, SIGKILL, poll up to 2s, only delete `daemon.edn` once the process is confirmed gone. If the daemon refuses to die even after SIGKILL, throw with the file left in place. Output now reports which path was taken: "Daemon stopped (PID N)." on a clean TERM, "Daemon force-killed (PID N)." when KILL was needed. And explicitly says "No managed daemon to stop." when `daemon.edn` is absent; the previous silent no-op confused users into thinking stop wasn't doing anything.
- Daemon shutdown hook is bounded so SIGTERM always wins: `do-daemon`'s shutdown hook called `system/halt!` synchronously with no upper bound on how long Integrant teardown could take. http-kit's `(srv)` close has no timeout option, and a hung halt-key would prevent the hook from returning, defeating SIGTERM and turning the daemon into something that only SIGKILL could stop. Wrap halt in a future and `deref` with a 5-second deadline. Combined with `noum stop`'s SIGKILL fallback, a hung daemon now reliably exits within ~7 seconds total instead of waiting for someone to notice and `kill -9` it.
- `bin/noumenon.bat` finds any `noumenon-*.jar` instead of hardcoded `0.1.0`: the Windows launcher hardcoded `noumenon-0.1.0.jar` and stopped finding the jar after the version bumped. Replaced with a `dir` loop that picks the lexicographically last `noumenon-*.jar` in `target/`, falling back to `bin/noumenon.jar` if no target build exists.
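
The bounded-shutdown idea (run teardown on a separate thread and stop waiting after a deadline) can be sketched as follows. This is a Python analogue of "wrap halt in a future and deref with a 5-second deadline", not the daemon's code: `halt_with_deadline` is a hypothetical name, and a daemon thread stands in for the Clojure future.

```python
import threading


def halt_with_deadline(halt, deadline_s=5.0):
    """Run halt() on a worker thread; return True if it finished within deadline_s."""
    # A daemon thread cannot keep the process alive if halt() hangs forever.
    worker = threading.Thread(target=halt, daemon=True)
    worker.start()
    worker.join(deadline_s)
    return not worker.is_alive()
```

The caller can then log a warning on `False` and return from the shutdown hook anyway, so a hung teardown step delays exit by at most the deadline.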
Chore
- test-fixtures/ moved to test/fixtures/ — pull the per-language fixture trees under test/ so all test artifacts live in one place instead of straddling test-fixtures/ and test/. Slurp paths in imports_test.clj updated to match.
- Drop unused test-repos/ gitignore entry and credentials.example — test-repos/ is no longer used; the gitignore entry was the only remaining reference. credentials.example was a stale template — the launcher's setup flow no longer points users at a copy-this-file workflow, so the example was just dead documentation.
0.9.0
Fixes
- MCP proxy renders friendly 401/403 messages again — interpret-response was reading the HTTP status from the parsed JSON body ((get parsed "status")) instead of the http-kit response's top-level :status. The daemon's error-response builder doesn't echo the code into the body — only :ok and :error — so the case branch never matched and a user with an expired token saw the bare "Unauthorized — bearer token required" rather than "Authentication failed. Run noum connect --token …". The bug pre-dated the http-kit migration and was preserved verbatim through it. interpret-response now destructures :status from the response map and uses that for the special-case branches.
- benchmark/judge-prompt escapes template metacharacters in question/rubric/answer — judge prompts include the answer text returned from the answer-phase model. With no escape on the substituted values, an answer like "Answer with {{rubric}} injection" ended up in the rendered judge prompt verbatim — looking, to the judge model, like a placeholder it should resolve. The single-pass str/replace already prevented cascading expansion of the bindings themselves, but didn't sanitize their content. All three bindings now pass through escape-double-mustache, mirroring the pattern in analyze/render-prompt.
- analyze/render-prompt escapes every user-controlled binding, not just :content — only :content was being passed through escape-double-mustache; :repo-name, :file-path, :imports, and :imported-by were inserted verbatim. A repo whose path or name contains literal {{content}} (or any other template variable name) would land that text in the rendered prompt, opening a small prompt-injection surface for repo metadata. All five user-controlled bindings now go through the escape; :lang/:line-count are derived from internally validated values and are left as-is.
- cli/parse-args error envelopes always carry :subcommand — error returns from benchmark (:no-repo-path, etc.) and ask (:ask-missing-question, :no-repo-path) used to come back as bare {:error <kw>} maps without a :subcommand key, even though digest/status/analyze/etc. errors carried it. Callers (main/handle-parse-error, future tooling) had to special-case the gap when routing contextual help. The two override parsers now (assoc :subcommand "benchmark"|"ask") on every terminal branch, matching parse-with-registry and parse-simple-args.
- introspect --target rejects typos instead of silently expanding to all targets — when a user passed --target foobar (or target: "foobar" via the MCP/HTTP surfaces), the parser silently dropped the unknown keyword. With nothing left in the resulting set, the cond-> guard skipped the assoc, and :allowed-targets never made it into the run options — meaning the introspect loop ran *unrestricted* across all targets. The user's intent to scope the run was invisibly turned into the opposite. Validation now happens up front in noumenon.util/validate-introspect-targets!, called from the MCP introspect handlers, the HTTP /api/introspect handler, and the CLI do-introspect command — so a typo'd target produces a clear error listing the valid set on every surface.
- MCP proxy tests actually exercise the http-kit code paths — proxy-tool-call-surfaces-curl-failure-clearly and proxy-tool-call-surfaces-empty-body-on-zero-exit were redefing clojure.java.shell/sh to mock curl, but proxy-tool-call no longer shells out — it goes through org.httpkit.client/request. The redefs intercepted nothing; the tests passed coincidentally because the real TCP connect() to 127.0.0.1:7892 failed in a way that happened to match the assertions. They are now keyed off with-redefs [http/request …] returning http-kit-shaped response promises ({:status N :body s :error e}), so the network-failure and empty-body branches in interpret-response are actually executed.
- Daemon LLM semaphore honors NOUMENON_MAX_LLM_CONCURRENCY — when the new Integrant lifecycle landed, the LLM semaphore was being initialized twice during noum daemon boot: once by http/start! (which read the env var) and once by the :noumenon/llm-semaphore Integrant init-key (which used the system-config default of 10). Integrant happened to run last, so the env var was silently clobbered and every daemon ran with permits = 10 regardless of configuration. The system/config builder now reads NOUMENON_MAX_LLM_CONCURRENCY itself and http/start! no longer touches the semaphore, so there is one source of truth and the env var actually takes effect.
- Daemon Integrant graph declares dependencies explicitly — :noumenon/http-server now references :noumenon/datomic-conns, :noumenon/llm-semaphore, :noumenon/embed-cache, :noumenon/completion-cache, and :noumenon/agent-sessions via ig/ref. Without this, the dependency graph was empty and Integrant fell back to map iteration order (stable for ≤8 entries, undefined past that), so :noumenon/http-server actually initialized before :noumenon/llm-semaphore despite appearing later in the config — which was what caused the semaphore double-init to manifest as "env var ignored." The graph is now self-documenting and a 9th component, or any future component that genuinely needs ordering, won't silently regress.
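The up-front target validation described for introspect --target can be sketched as below. This is an illustration only: `valid-targets` holds made-up target names, and `validate-targets!` is a hypothetical stand-in for noumenon.util/validate-introspect-targets!, whose real signature and error shape are not shown in this changelog.

```clojure
(require '[clojure.set :as set]
         '[clojure.string :as str])

;; Hypothetical target names; the real valid set lives in Noumenon.
(def valid-targets #{:prompts :queries :rules})

(defn validate-targets!
  "Returns the requested targets as a set of keywords, or throws an ex-info
  naming every unknown target together with the valid set."
  [requested]
  (let [wanted  (set (map keyword requested))
        unknown (set/difference wanted valid-targets)]
    (when (seq unknown)
      ;; Fail loudly instead of dropping the typo and running unrestricted.
      (throw (ex-info (str "Unknown introspect target(s): "
                           (str/join ", " (sort (map name unknown)))
                           ". Valid targets: "
                           (str/join ", " (sort (map name valid-targets))))
                      {:unknown unknown})))
    wanted))
```

Throwing before the options map is built means an empty set can no longer be confused with "no restriction requested", which is the failure mode the entry describes.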
Refactoring
- Function-level cleanup across imports.clj, sync.clj, llm.clj, analyze.clj — enrich-repo! decomposes into prepare-c-context, run-c-extraction, log-selection-stats!, and ensure-enrich-tx! so the orchestrator body reads as named pipeline stages instead of an 18-binding let with _-bound log! side-effects threaded through the bindings. update-repo! does the same (auto-sync-p4!, compute-changes, apply-retractions!, run-pipeline-stages!, update-result); the result builder is a pure helper. validate-segment now drives off a segment-rules data table that mirrors the existing analysis-sanitizers pattern (key + predicate + cleaner per row), replacing a 16-clause cond-> ladder with destructured args. files-for-reanalysis collapses four near-identical d/q blocks into one query assembled from reanalysis-scope-clauses + a base :where. invoke-api's 4-branch retry cond is now case over a pure classify-attempt (:retry/:fail/:ok); error and retry messages live in pure failure-ex/retry-reason helpers. changed-files extracts parse-status-fields and apply-status-line so the diff-line interpretation is pure data. imports/sh-with-timeout cleans up its args-detection by splitting via split-cmd+opts rather than take-while + drop-while over keyword?. No behavior change.
- mcp/proxy uses http-kit instead of curl — proxy-tool-call no longer shells out to curl with a stdin-fed config file; the request goes through org.httpkit.client/request, which noumenon.llm already uses. Splits into pure build-request-map (URL, headers, body construction with URLEncoder) and pure interpret-response (status → tool-result/tool-error); the clojure.java.shell require drops, eliminating one process boundary per remote call. case on HTTP status replaces the cond ladder for 401/403 messages.
- MCP handlers — data over atoms, shared opts builder — handle-digest no longer threads pipeline outputs through an (atom {}) with five swap! calls; each step is a digest-*-step helper that returns its result (or nil on opt-out / soft failure), and the final summary is built via cond->, mirroring the pure shape of http/handlers/pipeline.clj:run-digest. handle-introspect and handle-introspect-start share build-introspect-opts, introspect-llms, and parse-allowed-targets instead of duplicating a 15-line cond-> opts builder twice; the async variant additionally factors track-introspect-future! and ensure-session-capacity!. handle-ask extracts a pure format-ask-result so the budget-exhausted branching is testable without an LLM mock. New helper mu/provider+model consolidates four copies of the (or (args "provider") (:provider defaults)) boilerplate across mutation handlers.
- cli/parse-args is now data-driven — the 50-line top-level case collapsed seven near-identical "parse-then-tag" branches plus a cond fallback, all looking up the same spec from command-registry. The new shape is a 3-line cond: parser-overrides (a 2-entry map for benchmark and ask — the only subcommands with bespoke parsing), simple-subcommands set, or parse-with-registry. Per-subcommand post-processing (digest layer string → keyword vector, introspect error envelope filter) lives in a parse-post-fns map so each rule sits with its data, not buried in a let. Adding a new subcommand no longer requires editing parse-args at all when a registry entry suffices.
- introspect.clj loop and formatting cleanup — run-loop! no longer drives a 4-accumulator loop/recur with side-effects interleaved into the loop body; one iteration is step-iteration (pure — returns the next accumulator), and record-iteration! owns the transact + progress-event side-effects. The outer reduce uses reduced for budget-exhaustion. run-iteration! decomposes into request-proposal!, skip-with-error, code-gate-for, and apply-and-evaluate!, replacing two layers of nested if-let with a flat if/if/do over named outcomes. format-ask-insights collapses six near-identical (when (seq …) (str <header> <preamble> <items>)) blocks into one format-section driven by a [coll spec] data table. No behavior change.
- benchmark.clj function-level cleanup — run-benchmark! decomposes into load-run-context, build-shared-state, and log-run-start! so the body reads as a flat pipeline instead of a 19-binding let interleaving config, atom allocation, raw-context shelling, and a 15-line log-format. benchmark-run->tx-data drives optional :bench.run/* fields from a small optional-run-fields data table (with sub-helpers for usage and scoring-method blocks); the prior 130-line cond-> pyramid that repeated (<key> aggregate) (assoc <attr> (cast …)) for 12 fields collapses into one reduce. aggregate-scores lifts five inner let lambdas (mean, wmean, layer-key, layer-mean, layer-wmean) to private defn-s and threads through assoc-layer-aggregates / per-category-aggregate / empty-context-count. generate-report replaces a 110-line (doto (StringBuilder.) (.append …)) mutation with str/join over a vector of pure section renderers (header-section, summary-section, scoring-method-section, per-category-section, per-question-section, context-efficiency-section, usage-section, validity-section, reproducibility-section); the Java StringBuilder import goes away. raw-context swaps a manual loop/recur with stage-by-stage truncation logic for reduce + reduced over ls-tree-files, with escape-html-attr and render-file-content extracted as pure helpers. No behavior change.
- benchmark no longer depends on cli — the benchmarking subsystem pulled in noumenon.cli solely to interpolate the program name into one "Resume with: …" log line. The literal is inlined and the require dropped, breaking a small layering inversion (subsystem→api).
- mcp.clj split by responsibility — the 1440-line god namespace is now a 346-line declarative tool schema + dispatch, with sibling namespaces for transport (mcp/protocol), remote-proxy mode (mcp/proxy), shared infra (mcp/util), and per-cluster tool handlers (mcp/handlers/{query,mutation,benchmark,introspect,meta}). No behavior change; one file to grep was the bottleneck.
- http.clj split by responsibility — the 1404-line god namespace is now a 10-line public-surface re-export, backed by http/middleware (validation, JSON, auth, repo resolution, SSE/CORS), http/routes (route table + ring entry), http/server (lifecycle), and per-cluster handlers under http/handlers/{pipeline,query,benchmark,introspect,admin}. The shape mirrors the MCP split — same boundaries, same names — so the read/write/admin axis is consistent across both transports.
- main.clj split by subcommand cluster — the 973-line CLI dispatcher is now a 156-line -main + dispatch + parse-error table, with shared helpers in cli/util and per-cluster command handlers under cli/commands/{pipeline,query,ask,inspect,benchmark,digest,introspect,artifact,daemon}. Three transports (CLI, HTTP, MCP) now decompose along the same axes, making cross-cluster invariants (e.g. "every benchmark handler validates layers via …") obvious at a glance.
- Minimal Integrant lifecycle for the daemon — the HTTP daemon now boots through noumenon.system, an Integrant config that owns the start/stop graph for Datomic connections, the LLM semaphore, in-memory caches (embed, completion), session stores, and the HTTP server itself. Pre-existing accessor APIs (db/get-or-create-conn, embed/get-cached-index, sessions/register!, …) keep working unchanged; subsystems are not rewritten as Integrant components. The daemon shutdown hook now calls system/halt!, which clears caches, cancels in-flight introspect futures, drops the LLM semaphore, and releases the Datomic conn cache — so a daemon restart in the same JVM doesn't observe stale state. New dependency: integrant/integrant 0.13.1.
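The "pure step plus reduce with reduced" shape mentioned for the introspect loop can be illustrated with a small sketch. The accumulator keys (`:budget`, `:spent`, `:completed`, `:stopped-by`) and the iteration shape are assumptions made for this example; only the name `step-iteration` comes from the entry above, and the body here is not Noumenon's.

```clojure
(defn step-iteration
  "Pure step: folds one iteration into the accumulator and returns the next
  accumulator, wrapped in `reduced` once the budget is used up."
  [{:keys [budget] :as acc} {:keys [id cost]}]
  (let [next-acc (-> acc
                     (update :spent + cost)
                     (update :completed conj id))]
    (if (>= (:spent next-acc) budget)
      ;; `reduced` stops the surrounding reduce early; later items are never visited.
      (reduced (assoc next-acc :stopped-by :budget-exhausted))
      next-acc)))

(defn run-loop
  "Drives step-iteration over the iterations, starting from an empty accumulator."
  [budget iterations]
  (reduce step-iteration
          {:budget budget :spent 0 :completed []}
          iterations))
```

Keeping the step free of side effects is what makes the early-exit rule testable with plain data; any transact or progress reporting would sit in a separate function that consumes the accumulator.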
0.8.1
Fixes
- MCP server follows daemon-port changes — serve! captured (or (load-connection-config) (detect-local-daemon)) once at startup, so an MCP server spawned while the daemon was on one OS-assigned port (the noum daemon --port 0 path) kept proxying to that address forever. After a daemon restart on a new port, every tool call surfaced as Cannot reach daemon at <stale host> until Claude Code itself was restarted. The connection is now resolved per tools/call via the same expression, so daemon.edn is re-read each request and a daemon bounce no longer requires a client restart. Both reads are cheap EDN slurps; explicit remote setups via load-connection-config still win over auto-detect.
- synthesize no longer creates "zombie" components — the hierarchical-merge step resolves a merged component's :files by looking up each :source-components name in part-comp-index. When the LLM hallucinates source-component names that don't exist in the partition results, every lookup returns [] and the merged component went downstream with empty :files. components->tx-data then wrote the component entity (and any :component/depends-on edges) but emitted no file attribution, producing a phantom component visible in component-dep-drift (which joins on :component/name) but invisible in components (which joins on :arch/component). Adversarial diagnosis on the live noumenon db: 19 visible components vs 19 zombies, 84 of 107 dep-drift edges involving a zombie, inflating the over-declared count and making synthesis quality look ~88% wrong when the real signal among real components was 13/23 (~57%) import-grounded — a healthy synthesis ratio. Merged components with empty :files are now filtered out before tx-data, and the dropped names are logged so the rate stays observable.
- Cost telemetry survives non-Anthropic models — llm-cost-by-model and llm-cost-total returned empty against a fully analyzed db with 358 analyze txes. Three compounding bugs: (1) llm/model-pricing keyed by date-stamped ids (claude-sonnet-4-6-20250514) while provider responses now carry undated names (claude-sonnet-4-6 from the LevelInfinite/Tencent gateway, glm-4.6 from Z.ai), so prefix-only lookup missed every response and estimate-cost returned 0.0; (2) :tx/cost-usd was guarded by (pos? cost) and never written for 0-cost runs (which, after bug 1, was every run); (3) the cost queries used bare [?tx :tx/cost-usd ?cost] clauses that silently excluded txes without the attr. Switched model-pricing to undated keys with prefix-match (so both bare and date-stamped ids hit the same entry), added claude-opus-4-7, dropped the (pos? cost) guard so 0.0 is written explicitly, and switched llm-cost-by-model / llm-cost-total to get-else defaults. llm-cost-total also gained a :tx/op #{:analyze :synthesize} anchor so it doesn't pull in import/enrich/seed rows that never had token attributes. Both providers were probed directly: neither GLM nor Tencent return cost in usage — the fix is local pricing and local query hygiene, nothing provider-side.
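The undated-key prefix match from the cost-telemetry entry can be sketched like this. It is an assumption-laden illustration: the prices are placeholders rather than real rates, `lookup-pricing` is a hypothetical helper, and the real `estimate-cost` signature is not given in this changelog.

```clojure
(require '[clojure.string :as str])

;; Placeholder prices in USD per million tokens. These are NOT real rates.
(def model-pricing
  {"claude-sonnet-4-6" {:input 1.0 :output 2.0}
   "glm-4.6"           {:input 0.5 :output 1.0}})

(defn lookup-pricing
  "Finds the pricing entry whose undated key is a prefix of `model-id`,
  preferring the longest matching key. Returns nil when nothing matches."
  [model-id]
  (some->> model-pricing
           (filter (fn [[k _]] (str/starts-with? model-id k)))
           (sort-by (comp count key) >)
           first
           val))

(defn estimate-cost
  "Cost in USD for the given token counts, or 0.0 for an unknown model."
  [model-id input-tokens output-tokens]
  (if-let [{:keys [input output]} (lookup-pricing model-id)]
    (/ (+ (* input-tokens input) (* output-tokens output)) 1e6)
    0.0))
```

With undated keys, a bare id such as claude-sonnet-4-6 and a date-stamped id such as claude-sonnet-4-6-20250514 resolve to the same entry, which is the property the fix relies on.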
0.8.0
Added
- Branch-aware graph foundation — every database now records which branch it represents. New :branch/name, :branch/kind (:trunk/:feature/:release/:unknown), :branch/vcs, and a tuple identity :branch/repo+name are populated automatically on every update. Repos point to their current branch via :repo/branch.
- Content-addressed file identity — :file/blob-sha is now imported from git ls-tree for every file, enabling content-based comparisons and cache lookups. Existing files lazy-fill on next sync.
- Local delta databases — noum delta-ensure <repo> --basis-sha <sha> (or POST /api/delta/ensure) materializes a sparse Datomic DB at ~/.noumenon/deltas/<repo>__<branch>__<basis> containing only files added/modified/deleted between the trunk basis and the current HEAD. Deletions are recorded as :file/deleted? true tombstones. Delta DBs link back to their parent via :branch/basis-sha, :branch/parent-host, and :branch/parent-db-name.
- Federated trunk + delta queries — a subset of named queries declare a :federation-mode and accept :exclude_paths so the daemon can return trunk results minus rows the launcher will overlay from a delta DB. New endpoint POST /api/query-federated does the merge in a single roundtrip; new flag noum query <name> <repo> --federate --basis-sha <sha> opts in. Two modes are supported: :tombstone-only (trunk minus tombstoned paths; no delta rows — the safe default for queries that join on commits, imports, analysis, or segments, none of which the sparse delta carries) and :added-files-merge (trunk plus delta rows for files added in the branch — opt-in for queries that join only on stable attrs like :file/path / :file/lang, validated at seed time). Federation-aware queries seeded so far: orphan-files, complex-hotspots, import-hotspots, hotspots, ai-authored-segments, bug-hotspots, files-by-churn — all :tombstone-only. Non-federation-aware queries return trunk-only with a banner.
- Auto-federation in noum query / noum ask — when the active connection is hosted and local HEAD has diverged from the trunk DB's :repo/head-sha, the launcher transparently rewrites a plain noum query <name> <repo> into the federated path against a local delta and emits a yellow banner. noum ask emits the banner but does not federate (no federated ask endpoint in v1). Per-call opt-out with --no-auto-federate; global opt-out with noum settings set federation/auto-route false.
- noumenon_query_federated MCP tool — exposes /api/query-federated to MCP clients. Materializes the delta on demand from (repo_path, basis_sha) and returns the merged result.
- noum analyze --no-promote (and MCP no_promote) — bypasses the content-addressed promotion cache and always invokes the LLM. Useful when re-validating the cache itself.
- bb prune-deltas — interactive GC for stale local delta DBs under ~/.noumenon/deltas/noumenon/. Lists each delta with size, classifies as :live/:trunk-missing/:unparseable, and prompts before deleting trunk-missing entries.
- Content-addressed analysis promotion — when a file's :file/blob-sha equals a previously-analyzed blob in the same DB whose :prov/prompt-hash and :prov/model-version match the current run, noum analyze copies the donor's :sem/* and :arch/* attrs onto the recipient and skips the LLM call. Donor lineage is preserved via :prov/promoted-from. Pass --no-promote to bypass the cache. The analyze summary surfaces a :files-promoted counter alongside :files-analyzed.
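The :tombstone-only federation mode described above amounts to "trunk rows minus paths the delta marks deleted". A minimal sketch, assuming both sides are plain sequences of maps keyed by :file/path (the row shape and both function names are hypothetical, not the daemon's merge code):

```clojure
(defn tombstoned-paths
  "Set of paths the delta DB records as deleted (:file/deleted? true)."
  [delta-files]
  (into #{}
        (comp (filter :file/deleted?) (map :file/path))
        delta-files))

(defn merge-tombstone-only
  "Trunk rows minus any row whose path is tombstoned in the delta.
  Delta rows themselves are never appended in this mode."
  [trunk-rows delta-files]
  (let [dead (tombstoned-paths delta-files)]
    (vec (remove #(contains? dead (:file/path %)) trunk-rows))))
```

A file that was only modified on the branch keeps its trunk row, so history-based results such as churn stay intact, while a file deleted on the branch drops out.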
Changed
- No :file/deleted? in trunk transactions — trunk DBs hard-retract deleted files as before; only delta DBs use tombstones. A guard asserts this in retract-deleted-files!.
- Schema files — added resources/schema/branch.edn and resources/schema/federation.edn (which defines :noumenon/scope). New attrs :file/blob-sha, :file/deleted?, :prov/promoted-from in existing schema files. :tx/op doc lists :promote; :tx/source doc lists :promoted. Every data attribute carries an explicit :noumenon/scope :stable | :trunk-only tag.
- Centralized input length caps — noumenon.util now exports the shared length limits (max-repo-path-len, max-question-len, max-query-name-len, max-branch-name-len, max-host-len, max-db-name-len, max-run-id-len, max-params-count, max-param-key-len, max-param-value-len) plus a validate-params! helper. HTTP handlers and the MCP layer now consume them so the two surfaces stay in lockstep.
- Schema-scoped federation modes — replaced the boolean :federation-safe? query flag with an enum :federation-mode. A seed-time validator (noumenon.artifacts/validate-federation-mode!) rejects any :added-files-merge query that touches an attribute not tagged :noumenon/scope :stable, so a contributor cannot accidentally re-introduce the orphan-files-style false-positive merge by mistakenly opting a query that joins on :file/imports or :commit/* into the more permissive mode. The legacy :artifact.query/federation-safe? boolean is preserved as a derived value so the launcher's banner logic keeps working.
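The seed-time rule for :added-files-merge can be illustrated with a small check. This sketch assumes the attributes a query touches are already available as a collection; how Noumenon extracts them from a query is not described here. `attr-scope` and `check-federation-mode!` are hypothetical names, not the real noumenon.artifacts/validate-federation-mode!.

```clojure
;; Scope tags for a few attributes named in the changelog. The map itself
;; is illustrative; the real tags live in the schema files.
(def attr-scope
  {:file/path    :stable
   :file/lang    :stable
   :file/imports :trunk-only})

(defn check-federation-mode!
  "Throws when a query opts into :added-files-merge while touching any
  attribute that is not tagged :stable. Returns true otherwise."
  [{:keys [query-name federation-mode attrs]}]
  (let [non-stable (remove #(= :stable (attr-scope %)) attrs)]
    (when (and (= federation-mode :added-files-merge) (seq non-stable))
      (throw (ex-info (str "Query " query-name " cannot use :added-files-merge")
                      {:query query-name
                       :non-stable-attrs (set non-stable)})))
    true))
```

Attributes missing from the table are treated as non-stable here, which errs on the side of rejecting a query rather than allowing a permissive merge by omission.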
Fixes
- Incremental sync was returning empty diffs —
git diff --name-statuswas given a--separator before the SHAs, which told git "no more revisions follow," so old-sha and HEAD were interpreted as pathspecs and the diff returned empty for every poll. Modified/deleted files were never retracted, so stale:sem/summary,:file/imports, segment, and analysis attrs lingered while only new commits got imported. Switched to--end-of-options, which keeps the flag-injection defense without breaking revision parsing. noum watchcorrectly distinguishes no-op ticks from real syncs —clojure.data.jsonon the daemon side serializes keywords as strings, so:up-to-datearrived as the string"up-to-date"while the launcher compared against the keyword constant — the check silently always-true'd, printing "Updated: 0 added, 0 modified" on every poll.noum.api/parse-bodynow keywordizes both keys and a declaredenum-keysset of known enum values viaclojure.walk/postwalk, so all HTTP read paths produce idiomatic Clojure maps. Watch also surfaces deletion counts alongside additions and modifications.noum settingslisting truncates long values — a deeply nested or very long single-line value (>120 chars) used to wrap the terminal into noise. Listings now clip to 120 chars with a trailing…;noum settings <key>still shows the un-truncated value.--insecurealways parses as boolean — the flag was missing fromcli/boolean-flags, so--insecure foo(with a non-flag value following) used to swallowfooas the flag's value ({:insecure "foo"}). Always boolean now, regardless of what follows.noum connectrejects non-http(s) URL schemes up front —noum connect ftp://example.com,file:///tmp/foo,ssh://host:22used to flow through the SSRF check (which gave a misleading "private/internal address" message) or surface as a generic "invalid URI scheme" wrapped in the network catch. Now produces a cleanError: --host scheme must be http or https. 
Got: <scheme>://(exit 1) before any network call.ask-secretno longer echoes any prefix of the secret — the previous mask wrote(subs input 0 (min 4 (count input))) "****", so short tokens (≤ 4 chars) were displayed in clear and longer ones leaked the first 4 characters. Always shows a fixed********mask now.- Confirm prompts re-prompt on garbage input —
tui.confirm/askreturneddefault-valon any non-y/n input. Withdefault-val=true(no current caller, but future ones) a typo would silently confirm a destructive action. Garbage now triggers a re-prompt with a "Please answer y/n." hint; empty input still falls back to the default as before. noum settingsrejects extra positionals —noum settings retry/limit 5 typo-extraused to silently discard the third positional and POST(key, value)as if only two args were passed. Now producesError: Too many arguments.(exit 1).noum help <unknown>exits 1 — previously printed "Unknown command: …" but exited 0, inconsistent withnoum <unknown>which correctly exits 1.noum connect <ip-literal>derives a useful saved-connection name —noum connect 127.0.0.1:7895used to save the connection as'127'(the first dot-segment), so127.0.0.1and127.0.0.2collided. The auto-naming now detects IP literals (andlocalhost) and keepshost:portjoined by-(e.g.127.0.0.1-7895,localhost-7895); real hostnames still use the first dot-segment (api.example.com→api).noum introspectrejects mutually exclusive flag combinations —--status,--stop, and--historytarget different sub-actions, but the cond order silently picked the first match.noum introspect --status run-a --stop run-bacted on--status run-aonly without warning. Two or more of the three flags now produceError: --status, --stop, and --history are mutually exclusive.(exit 1).noum introspect --status(no value) gives a clean error instead of a misleading databases hint —--status/--stopwithout a following value booleanized totrue, fell through(string? …)checks todo-api-command, and emitted "Usenoum databasesto see imported repositories" — the user wanted a run-id, not a repo. 
Both flags now produceError: --status requires a run-id./Error: --stop requires a run-id.(exit 1) at the boundary.--as-of,--raw, and--basis-shavalidate at the launcher boundary —noum query --as-of "",query --raw "",delta-ensure --basis-sha(no value → boolean true),query --federate --basis-sha not-hexall used to flow through to the server, which rejected after a network round-trip. The launcher now rejects blank--as-of/--rawand enforces a 40-char lowercase hex shape on--basis-sha(used by bothquery --federateanddelta-ensure) up front; valid inputs still pass through unchanged.noum serve --host Xnow produces a clean error instead of silently ignoring--host—do-serveonly forwarded--db-dir,--provider,--model,--tokento the spawned MCP process;--hostand--insecurewere dropped, so users targeting a remote ended up running an MCP server colocated with the local daemon. Reject the combination explicitly with a hint to runnoum connect <url>first; the MCP server already proxies to the saved active connection.noum pingandnoum versionhonor--hostand the active named connection — both used to calldaemon/connectiondirectly, which only consults~/.noumenon/daemon.ednand ignores every other connection signal.noum ping --host X --token Ysilently checked the local daemon instead. Both commands now go through a newping-targethelper that checks--hostfirst, then the saved active connection, then the local daemon — without spawning anything as a side effect.- Interactive menus no longer hang on a bare ESC press —
choose/selectmatched the ESC byte and then unconditionally read two more bytes to consume the CSI follow-up ([ <arrow-code>). With no follow-up bytes (the user pressed ESC alone, not as part of an arrow sequence), the second.readblocked forever — the only escape was Ctrl-C, which killed the JVM mid-cleanup and left the terminal in raw mode. The newread-arrow!helper sleeps 20ms after the ESC byte and only consumes the next two bytes ifInputStream/availableis positive; bare ESC returns nil, which the caller treats as cancel (same as Q / Ctrl-C). Arrow keys behave identically to before. noum settings <key> <huge-int>no longer silently nulls the value —parse-setting-valuematched any digit-only string with#"-?\d+"and calledparse-long, which returnsnilon overflow. The cond returned the nil result, so the daemon got:value niland replied "key and value are required" — the user typed a value, the launcher silently erased it. Now falls back to the raw string whenparse-longoverflows; the daemon's settings store accepts strings as-is.- Uncaught launcher exceptions surface as clean errors, not stack traces — when
daemon/start!timed out (e.g. slow JVM startup, missing JRE, port conflict) the bb runtime dumped a 30-line Clojure stack trace with internal source paths. Same shape for any other uncaught exception inside a handler.-mainnow wraps(handler parsed)in a newrun-handler!helper that catches everyException, printsError: <message>(in red) with a hint to setNOUM_DEBUG=1for the full trace, and exits 1. - Network failures surface as clean errors instead of stack traces —
noum <cmd> --host <unreachable>(no listener, DNS failure, timeout) used to dump a rawjava.net.ConnectExceptionstack trace fromapi/get!/api/post!. The HTTP client's:throw falseonly converts HTTP error responses into result maps; pre-response exceptions still bubbled. Both helpers now catch any exception and emitError: Could not reach <host>: <exception-class> — <message>(exit 1) so the user knows where to look without seeing source paths. - Auth-failure path no longer leaks Datalog clauses or crashes the launcher — when the meta-DB token query threw (e.g. a closed channel mid-request, transient backend error), the daemon's auth middleware bubbled the exception into the generic 500 handler, which echoed
processing clause: [?t :token/hash ?h], …back astext/plain. The launcher then crashed with a rawJsonParseExceptionbecauseparse-bodydidn't tolerate non-JSON. Two-layer fix: (1)check-authnow wrapsauth/validate-tokenin try/catch and returns a clean 401 JSON response on any internal failure, never leaking schema details; (2) the launcher'sparse-bodyreturns nil instead of throwing on non-JSON input, andget!/post!fall back to a{:ok false :error "HTTP <status>: <body>"}shape (truncated to 200 chars) so callers see a sensible message. noum watch --intervalrejects non-positive / non-numeric values up front —--interval -5used to print "polling every -5s", attempt one update, then crashThread/sleepwith rawIllegalArgumentException;--interval abcsilently fell back to 30 with no warning. The newparse-watch-intervalhelper validates beforeensure-backend!runs and emitsError: --interval must be a positive integer (got <value>)with exit 1.noum history prompt(no name) no longer NPEs — the no-name branch tried to enumerate prompt files via(io/resource "prompts/"), but the daemon'sresources/prompts/lives in the JVM-sidenoumenon.jar, not the launcher's bb classpath. The resource lookup returned nil and(io/file nil)threw NPE. Same NPE in the interactivecollect-historymenu. Both now skip the listing: the one-shot path emits a Usage message with the common prompt names, and the interactive path asks the user to type the name as free text. (No daemon endpoint exists today to enumerate prompts; if one is added later, the launcher can call it.)- Blank / NUL-only repo args no longer silently resolve to the cwd —
noum status "", status " ", status "\x00", and ask "" "<question>" all used to flow through path->db-name / canonicalize-path, which delegated to (java.io.File. "") — the JDK normalized that to the current working directory and last produced the cwd basename. Users running noum from a directory whose basename happened to collide with a real DB silently got the wrong DB. cli/parse-args now drops blank positionals (after stripping NUL bytes) so the existing min-args / Usage-error paths fire as intended. noum status . still works as a current-directory shorthand; only empty / whitespace-only / NUL strings are dropped.
- noum query --param is repeatable as documented — cli/extract-flags stored every flag in a single-valued map, so a second --param k2=v2 overwrote the first. Only the last key=value reached the daemon, despite the help text claiming "(repeat command as needed)". --param now accumulates into a vector of strings; build-api-body (and the --as-of / --federate branches of do-query) merge the vector into the request's :params map. Single --param k=v calls still work — they produce a 1-element vector that flattens to the same body as before.
- SSRF check no longer crashes on IPv6-resolving hosts — bb's native-image build doesn't carry reflection metadata for Inet6Address's instance methods (.isLoopbackAddress, .getAddress, etc.), so any host whose DNS lookup returned an IPv6 address — including everyday public hostnames like google.com — surfaced as MissingReflectionRegistrationError with a 30-line stack trace instead of either connecting or returning a clean blocked-private response. The classifier now reads getHostAddress (the one Inet6Address method bb does carry) and matches the canonical full IPv6 form (0:0:0:0:0:0:0:1) alongside the compressed form (::1); the ^fe80: / ^fc00: / ^fd prefix patterns expand to cover all of fe80::/10 and fc00::/7 as well, so the regex-only path is at least as strict as the prior .isLinkLocalAddress / .isSiteLocalAddress calls. IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) is auto-converted to Inet4Address by the JDK, so the existing IPv4 patterns catch it.
- noum connect http://localhost:N no longer SSRF-blocked — base-url's loopback allowlist regex was anchored at the start of the host string, so the bare form localhost:7895 matched but the scheme-prefixed http://localhost:7895 (and http://127.0.0.1:7895, https://localhost:7895) didn't, falling through to private-address?. That helper then split on :, took "http" as the host, failed DNS, and the catch-all returned true — every scheme-prefixed local URL got rejected as "private/internal". base-url now strips the scheme up front, applies the loopback check on the bare host, and reuses an explicit scheme when present so https://… and http://… both round-trip cleanly. Identical inputs with and without scheme produce the same final URL.
- Delta DB collision on look-alike branch names — feat/foo and feat-foo both sanitize to feat-foo for the on-disk db-name; without disambiguation, branch-switching between the two would silently overwrite the same delta DB. The db-name now appends sha256(branch-name)[0..6] so collisions resolve to different DBs.
- Federation merge keeps trunk history for modified files — the earlier "exclude all delta paths from trunk + append delta rows" merge made modified files vanish from churn-based queries because the delta DB has no commits to carry their history. Tombstone-only merge keeps trunk's authoritative history while still respecting branch deletions.
- bb prune-deltas walks the right directory — Datomic-Local stores DBs under <storage>/<system>/<db-name>/, so the actual deltas live under ~/.noumenon/deltas/noumenon/. The previous parent-dir walk would have surfaced the system dir itself as a single "unparseable" entry — and a y at the prompt would have nuked every delta DB on the machine.
- Empty / dot-only branch names — delta-db-name falls back to detached for nil, blank, or dot-only branch inputs (which would otherwise produce empty / . / .. directory names — the latter resolve to parent dirs in tools that aren't expecting them as literal path components).
- ensure-private! uses 700 on directories — the launcher's owner-only permission helper was applying 600 (no execute bit) to directories, making them unenterable. Now rwx------ for dirs, rw------- for files.
- validate-string-length! returns 400, not 500 — the :status key on the thrown ex-info now lets the HTTP handler surface a clean 400 Bad Request instead of falling through to a generic 500.
- Idempotent branch upsert — update-head-and-branch! resolves the existing repo + branch eids before transacting, so re-running delta-ensure (or any sync) doesn't trip the :branch/repo+name unique constraint with a fresh tempid.
- Branch / parent_host / parent_db_name / query_name length caps on /api/delta/ensure and /api/query-federated — overlong values now return 400 instead of being persisted or echoed back unchecked.
- Bogus basis_sha is now a clean error — a 40-char-hex SHA that doesn't resolve to a real commit used to silently produce an empty diff, and delta-ensure / query-federated would respond synced with zero counts. changed-files now throws on non-zero git diff exit so HTTP surfaces a 400 with the actual git error; update-repo! catches the throw and falls back to a fresh sync, so a force-pushed trunk DB still recovers.
- bb prune-deltas parses branch names containing __ — the old split-on-__ parser misclassified delta DBs whose branch contained a double underscore (e.g. feat__under) as :trunk-missing and offered them for deletion. Anchored regex on the trailing -<hash6>__<basis7> suffix preserves the branch correctly. Pre-disambiguator on-disk names (no -<hash6> suffix) are not parsed by the new code; re-create them by running delta-ensure or query-federated against the same basis.
- bb prune-deltas classifies repo basenames containing __ — the parser's branch-favoring heuristic attributes every __-segment in the on-disk name to the branch, which is the right call for the common case (noumenon__feat__under-...) but misclassified the symmetric one: a real repo basename like my__repo parsed as repo=my, branch=repo__feat, and ~/.noumenon/data/noumenon/my/ doesn't exist, so classify flagged the delta :trunk-missing and offered it for deletion. classify now walks the __ boundaries between parsed repo and branch and reports :live if any candidate split has an existing trunk dir; the displayed row shows the resolved repo/branch instead of the misparse.
- :added-files-merge queries must put ?path first in :find — the merge code filters delta rows by (first row) matched against added paths, so the first column has to bind :file/path. The contract was implicit; a query whose :find started with anything else (e.g. [:find ?lang ?path …]) would have silently lost every delta row, and a non-path column that coincidentally equalled a path string would have leaked false rows through.
The seed-time validator now rejects an :added-files-merge query whose first :find element isn't the symbol ?path. No shipped query was affected — the new check makes the contract explicit at the boundary instead of relying on an undocumented convention.
- connect recovers from a stale system-catalog entry — Datomic-Local's system catalog persists database entries to its own metadata files; if the on-disk db directory is removed externally (e.g. bb prune-deltas wipes a delta, or the user deletes one while the daemon is down) the catalog still says the db exists. create-database is then a no-op (catalog says exists) and connect throws :cognitect.anomalies/not-found — surfacing as a 500 like Db not found: <name> even though ensure-delta-db! had just logged success. create-db now catches that exact anomaly, drops the stale catalog entry, and recreates cleanly so the next caller gets a working connection. Centralized in one place so cache misses, fresh connects, and schema-ensure paths all share the same recovery.
- validate-string-length! rejects non-strings with 400, not 500 — the validator's old guard (when (and (string? s) ...)) silently let any non-string value through, so a JSON request like {"branch": ["a"]} or {"branch": 123} flowed past every HTTP and MCP boundary that relied on it as a type+length check. Downstream code (e.g. sanitize-branch calling str/trim on a vector) then crashed as a 500. Non-nil non-strings now produce a clean 400 "X must be a string" at the validation boundary; nil still passes silently for optional fields.
- validate-repo-path checks type and length — the validator went straight to (io/file repo-path), so a JSON request like {"repo_path": 42} threw IllegalArgumentException (no as-file impl for Long) and surfaced as a 500. Long strings also walked all the way through .exists / .isDirectory without a cap. Non-nil non-strings now return must be a string, strings over max-repo-path-len (4096) return an exceeds maximum length reason, and the existing FS-shape reasons are unchanged.
Callers' when-let + throw pattern produces a clean 400 in every case.
- MCP proxy surfaces unreachable-daemon errors clearly — when ~/.noumenon/config.edn pointed at a host with no daemon listening, every MCP tool call came back with Remote proxy error: JSON error (end-of-file) because curl exited non-zero with empty stdout and the proxy then json/read-str'd the empty string. The proxy now checks curl's exit code first and surfaces Cannot reach daemon at <host>. Start it with noum daemon, or update ~/.noumenon/config.edn to point at a running host. — including curl's stderr when present. Empty bodies on a zero exit (204 / premature close) get a similar host-naming message instead of bubbling up the JSON parse error.
- Branch-name cap is FS-derived, not human-name-derived — max-branch-name-len was 256, so a 256-char branch passed validation and then crashed Datomic-Local's mkdir with File name too long because the synthesized db-name (<repo>__<safe-branch>-<hash6>__<basis7>) overflowed the POSIX 255-byte path-component limit. Cap is now 200 — leaves ~37 bytes of headroom for the repo basename, which covers virtually every real-world case. delta-db-name does a final 255-byte check too so the long-repo edge case (where repo + cap can still overflow) surfaces as a 400 at the boundary instead of a 500 from the FS layer.
- analyze truncates long strings at the writer boundary — Datomic-Local rejects single string values around the 4 KB mark with Item too large. The parse-time clamp already limits :summary / :purpose to 4096 chars, but :sem/synthesis-hints is a pr-str of purpose + architectural-notes + patterns + layer + category, so the result can easily overflow even when each input was clamped. build-file-tx now caps every string attribute it writes at 4000 chars (matching artifacts/chunk-size's headroom for UTF-8 multi-byte chars), so a verbose LLM response can't blow up the transact and lose the analysis.
- query-federated is now a no-op when basis + HEAD haven't moved — every call used to write a head/branch tx (and re-transact every schema file via ensure-delta-db!), growing the delta's db.log ~2.3 KB per 5 read-shaped requests. The handler short-circuits to :up-to-date when the stored basis-sha and HEAD already match, and ensure-delta-db! now routes through the cached connection helper.
- query-federated records parent metadata on the delta — auto-derives :branch/parent-db-name from the resolved trunk DB and :branch/parent-host from the request's Host header (HTTP) or "local" (MCP). Previously, only the explicit delta-ensure path set these, so deltas materialized via auto-federation lost the lineage breadcrumb.
- Trim branch name before disambiguator hash — sanitize-branch already trimmed before producing the on-disk label, but the disambiguator hashed the raw input, so "foo" and "foo " ended up in different delta DBs for one logical branch. Both paths now agree on the canonical branch.
- Uniform query_name length cap across query endpoints — POST /api/query and POST /api/query-as-of now reject overlong query_name with the same 400 the federated endpoint already produced. The shared util/max-query-name-len is the single source.
- OpenAPI doc reflects the actual delta-DB path — /api/delta/ensure description now shows ~/.noumenon/deltas/noumenon/<repo>__<safe-branch>-<hash6>__<basis7>/ (was missing both the noumenon/ Datomic system subdir and the -<hash6> disambiguator).
- Cross-DB promotion guard — find-cached-analysis rejects :donor-db without a matching :donor-db-name, and now also the symmetric :donor-db-name without :donor-db. The two predicates that decide same-DB vs cross-DB used to disagree; either partial form would have written a dangling :prov/promoted-from ref or fabricated cross-DB provenance for a donor that actually came from the recipient. Currently dormant (no production caller wires either) but lands defensively before the cross-DB-promotion path gets enabled.
- DELETE /api/databases/noumenon-internal now rejected — the meta database stores tokens, settings, prompts, rules, ask sessions, and benchmark/introspect history, and :meta-conn is cached at daemon startup. Letting a caller delete it via the public delete endpoint silently corrupted the daemon: the cached connection became a closed channel, so /api/tokens, /api/repos, etc. all 500'd until restart, while lazy-ensure-meta-db callers (settings, queries) silently re-seeded into a fresh DB and hid the breakage. The meta-DB name lives as noumenon.db/meta-db-name, and handle-delete-database rejects it up front with a 400 "Cannot delete reserved database".
- file:// and other non-network URL schemes blocked at clone — validate-clone-url! only ran the SSRF private-IP check, and that check short-circuited on URLs without a host (file://, ssh://, raw paths). An authenticated admin posting {"url":"file:///some/local/repo"} to /api/repos could clone an arbitrary readable git repo on the daemon's filesystem and then query it via /api/ask. The validator now rejects anything that doesn't match git-url? (https?:// or git@host:path) before the host lookup runs; Perforce depot paths still go through their own clone path and are unaffected.
- Uniform repo_path validation across the bulk endpoints — with-repo (the wrapper used by /api/import, /api/analyze, /api/enrich, /api/update, /api/synthesize, /api/digest, /api/ask, /api/query, /api/query-raw, /api/query-as-of, /api/query-federated, /api/benchmark, /api/introspect, /api/completions, etc.) only checked for the field's presence, so a JSON body like {"repo_path": 42} / ["a"] / {"a":1} / true reached (io/file …) and surfaced as a 500 with a leaking ClassCastException. Empty string fell through to the bare-db-name branch and shelled out to git log against db://. A 5MB string walked the FS-shape checks and got reflected back in the 404 message (small request-amplification vector). New util/validate-repo-path-input! does the type+length+blank gate; with-repo now calls it after the missing-field check, so all of the above surface as a clean 400. The earlier delta-ensure-only fix is now uniform.
- Malformed JSON body returns 400, not 500 — parse-json-body let json/read-str's parse exception flow up into make-handler's catch-all, so a typo'd request body became a generic "Internal server error" even though the fault was entirely client-side. The parser is now wrapped in a try/catch that re-throws an ex-info with :status 400 and a clean "Invalid JSON body" message; the underlying parser detail (offset, character) is logged to stderr for daemon-side debugging but not exposed in the response.
- ask-session-feedback rejects unknown session ids — POST /api/ask/sessions/<unknown>/feedback used to call set-feedback! regardless and return 200, writing feedback attrs against a non-existent session id and lying to the caller about it. The handler now looks up the session via ask-store/get-session first and returns 404 "Session not found" on miss, matching the shape of handle-ask-session-detail.
- /api/repos/:name remove and refresh return 404 for unknown names — both handlers used to call into repo-mgr without an existence check, so an unknown name surfaced as a generic 500 "Internal server error" (with a leaked filesystem path on the daemon side). A shared registered-repo! helper does the meta-DB lookup and 404s with "Repo not registered:" before any disk-touching work, so callers can distinguish "not registered" from a real server error and stop seeing the 500.
- /api/ask rejects empty / whitespace question with 400 — the (when-not (:question params)) gate only caught nil; an empty string passed all of nil-check + length-cap and reached agent/ask, where the LLM loop crashed and surfaced as 500 "Internal server error". The gate now also rejects blank strings (after trim), so {"question":""} and {"question":" "} both produce 400 "question is required" the same as omitting the field.
- validate-db-name! is now a positive allowlist — the validator only rejected /, \, blank, and pure-dot names, so null bytes / newlines / tabs / spaces / non-ASCII slipped through and propagated into (io/file …) lookups.
Tightened to [a-zA-Z0-9._-]+ (matching derive-db-name's sanitizer and the synthesized delta-DB naming), so exotic characters fail at the boundary with a 400 instead of leaking into the storage layer. Pure-dot names still get the explicit dot-only rejection.
- /api/query-as-of no longer leaks JVM class names in as_of errors — sending {"as_of": [123]} / true / {} triggered the parse path's (long …) cast, and the catch wrapped the JVM ClassCastException message ("class clojure.lang.PersistentVector cannot be cast to class java.lang.Number …") into the 400 body. The handler now type-checks as_of (string or number) up front and produces a clean "as_of must be an ISO-8601 string or epoch milliseconds". The string-but-unparseable branch ("not-a-date") still surfaces the actual Instant/parse complaint so users see the real reason. Validation also now runs before with-repo, so a bad as_of fails fast even when the repo doesn't exist.
- /api/settings strings stay strings — the handler used to run every string value through edn/read-string, silently re-typing "42" to 42, "true" to true, ":foo" to a keyword, etc. Cross-language callers (Electron UI, future GUIs) had no way to store an actual string that happened to parse as EDN. The handler now stores values as-is: typed callers send JSON-typed values ({"value": 42} for an int), string callers send strings ({"value": "42"} for a string). The noum settings <k> <v> CLI keeps its existing UX by pre-parsing the CLI string in the launcher (parse-setting-value now also runs for daemon-side settings, symmetric with how launcher-local settings already worked).
- Daemon logs no longer leak absolute db-dir paths in clone errors — repo-mgr/refresh-repo!'s "Clone not found: /abs/path/.git" message, the "Cloning into /abs/path/ .git ..." log line, the "Removing clone /abs/path/ .git" log, and the git stderr embedded in clone-failure errors all referenced the absolute filesystem path. Logs and surfaced messages now reference only the db-name (e.g. "Clone not found for ghost", "Cloning into ghost.git ..."); the absolute path stays in :ex-data for daemon-side debugging. git/clone! and git/clone-bare! redact the target-dir absolute path from git's own stderr before embedding it in the error message.
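The delta-DB naming rules from the entries above (sanitize the branch, trim before hashing, append a six-character SHA-256 disambiguator, fall back to detached, and enforce a final 255-byte check) can be sketched as follows. This is a minimal Python illustration, not the project's Clojure code: the function names mirror the changelog, but the exact sanitizing rule, the fallback handling, and the error type are assumptions.

```python
import hashlib
import re

MAX_COMPONENT_BYTES = 255  # POSIX path-component limit cited in the changelog


def sanitize_branch(branch):
    """Trim, then replace every character outside [a-zA-Z0-9._-] with '-'."""
    trimmed = (branch or "").strip()
    safe = re.sub(r"[^a-zA-Z0-9._-]", "-", trimmed)
    # Assumed fallback: nil, blank, and dot-only inputs become "detached".
    if not safe.strip("."):
        return "detached"
    return safe


def delta_db_name(repo, branch, basis_sha):
    """Build <repo>__<safe-branch>-<hash6>__<basis7> for a delta DB."""
    trimmed = (branch or "").strip()
    # Hash the trimmed name so "foo" and "foo " map to the same delta DB,
    # while "feat/foo" and "feat-foo" still get different suffixes.
    hash6 = hashlib.sha256(trimmed.encode("utf-8")).hexdigest()[:6]
    name = f"{repo}__{sanitize_branch(branch)}-{hash6}__{basis_sha[:7]}"
    if len(name.encode("utf-8")) > MAX_COMPONENT_BYTES:
        # The daemon is described as returning a 400 here; ValueError stands in.
        raise ValueError("db-name exceeds the 255-byte path-component limit")
    return name
```

Because the hash covers the unsanitized (but trimmed) branch, two names that sanitize to the same label still land in different directories.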
Notes
- Datomic schema is additive: no migrations runner, no version stamps.
ensure-schema re-transacts every connect; existing DBs pick up new attrs and queries pick up :federation-safe? on next start.
- Delta DBs require a co-located daemon in this release. Cross-machine federation (remote daemon, launcher-side delta) is deferred.
- Promotion is same-DB only in this release. Cross-DB promotion (delta lookups against trunk's history) is deferred.
0.7.0
Changed
- Repo split — The Electron desktop app moved to leifericf/noumenon-app; the website moved to leifericf/noumenon-site. This repo keeps the daemon, noum CLI, and OpenAPI spec. History was preserved on both new repos via git filter-repo.
- OpenAPI spec relocated — Canonical source moved from docs/openapi.yaml to resources/openapi.yaml so it ships inside the daemon JAR and can be served via io/resource. The website mirrors it daily via a cron-pull workflow.
- noum ui auto-updater — Now downloads the packaged Electron app from leifericf/noumenon-app releases (was leifericf/noumenon).
- noum ui dev mode — Resolves a noumenon-app source checkout in this order: $NOUMENON_APP_ROOT → $NOUMENON_ROOT/../noumenon-app sibling → fall back to installed app. The previous ui/ child directory is no longer valid.
- Core CI/Release workflows — Removed the ui job and the build-electron / deploy-pages release jobs. CLI distribution (update-homebrew, update-scoop) and Docker publish remain in this repo.
0.6.2
Changed
- HTTP-only provider support — Removed Claude CLI provider support entirely. Supported providers are now API-based (glm, claude-api, with claude aliasing to claude-api).
- Strict model selection — LLM operations now require an explicit model source: pass --model or configure provider :default-model; no implicit fallback model is selected.
- Provider credential policy — Removed legacy file-based credential fallback; provider credentials now resolve from NOUMENON_LLM_PROVIDERS_EDN and process environment variables.
- Analysis/synthesis provenance — LLM transactions now record provider and model provenance via :tx/provider and :tx/model-source metadata.
Fixes
- Provider migration errors — Using removed claude-cli now fails with explicit migration guidance to claude-api / claude.
- API schema/docs alignment — OpenAPI and provider-config docs now reflect API-only provider support.
0.6.1
New
- Provider/model catalog commands — Added noum llm-providers and noum llm-models for discovering configured providers, provider defaults, and available models.
- MCP provider/model catalog tools — Added noumenon_llm_providers and noumenon_llm_models with help/schema metadata so agents can inspect defaults and model availability without reading config files.
Changed
- Provider default selection — Noumenon now resolves one global default provider via NOUMENON_DEFAULT_PROVIDER, then :default-provider in NOUMENON_LLM_PROVIDERS_EDN, then built-in fallback.
- Provider model policy — Each provider can declare :models plus a single :default-model; model selection now resolves per-provider defaults when --model is omitted.
- Dynamic model discovery — llm-models / noumenon_llm_models prefer provider API discovery (:models-path, with known defaults) and fall back to configured :models when discovery is unavailable.
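The provider and model precedence described above can be sketched in a few lines. This is a hedged Python illustration under stated assumptions: the dictionary stands in for the parsed NOUMENON_LLM_PROVIDERS_EDN map, its shape and the model names are hypothetical, and raising when no model can be resolved follows the 0.6.2 "strict model selection" entry rather than anything confirmed for 0.6.1.

```python
def resolve_provider(env, config, built_in_fallback):
    """Env var first, then :default-provider in the config, then the built-in."""
    return (
        env.get("NOUMENON_DEFAULT_PROVIDER")
        or config.get("default-provider")
        or built_in_fallback
    )


def resolve_model(cli_model, provider_entry):
    """Prefer an explicit --model, then the provider's single default model."""
    model = cli_model or provider_entry.get("default-model")
    if not model:
        raise ValueError("no model selected: pass --model or set :default-model")
    return model


# Hypothetical stand-in for the parsed provider map; names are placeholders.
config = {
    "default-provider": "claude-api",
    "providers": {
        "claude-api": {"models": ["model-a", "model-b"], "default-model": "model-a"},
        "glm": {"models": ["model-c"]},
    },
}

provider = resolve_provider({}, config, built_in_fallback="glm")
model = resolve_model(None, config["providers"][provider])
```

With an empty environment the config's default provider wins, and its default model is used because no --model was given.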
0.6.0
New
- Provider-agnostic LLM config — Added NOUMENON_LLM_PROVIDERS_EDN support for API providers, allowing per-provider :base-url and :api-key configuration (for example: :glm, :claude-api, gateway-backed providers) through one canonical EDN map.
Changed
- Runtime mode policy for secrets — Added NOUMENON_RUNTIME_MODE=local|service (default local). In service mode, file-based credential fallback is disabled and only process env secrets are used.
- Provider resolution precedence — API providers now resolve config in this order: canonical EDN map entry, legacy env var fallback, then built-in default base URL (API keys are never defaulted).
- Centralized provider resolution — API-provider invocation now routes through a normalized resolver in src/noumenon/llm.clj returning {:base-url :api-key} to reduce provider-specific branching.
Fixes
- Service URL hardening — API provider base URLs are now validated as absolute URLs, and service mode requires https.
- Safe error handling for credentials — Missing-key failures are explicit while avoiding secret value leakage in error messages.
- Optional base URL allowlist — Added NOUMENON_LLM_BASE_URL_ALLOWLIST_EDN support to restrict provider base URL hosts/patterns.
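The three base-URL checks listed above (absolute URL, https in service mode, optional allowlist) might look like this. It is a minimal Python sketch under assumptions: the function name, the error wording, and the exact-host allowlist match are illustrative, and the real allowlist is described as supporting patterns as well.

```python
from urllib.parse import urlparse


def validate_base_url(url, runtime_mode="local", allowlist=None):
    """Return url unchanged if it passes every check, else raise ValueError."""
    parsed = urlparse(url)
    # Absolute URL: an http(s) scheme plus a non-empty host.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError("base URL must be an absolute http(s) URL with a host")
    # Service mode refuses plain http.
    if runtime_mode == "service" and parsed.scheme != "https":
        raise ValueError("service mode requires an https base URL")
    # Assumed allowlist shape: a set of exact host names.
    if allowlist is not None and parsed.hostname not in allowlist:
        raise ValueError("base URL host is not in the allowlist")
    return url
```

The error messages name the failing rule but never echo credentials, in line with the "safe error handling" entry.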
0.5.6
New
- Pipeline selectors — analyze, enrich, update, and digest now accept --path, --include, --exclude, and --lang to scope work to selected files/directories/languages. Added parity across JVM CLI, launcher (noum), HTTP API, and MCP tool schemas.
- OpenAPI selector schema — Added PathSelectors to docs/openapi.yaml and wired it into analyze/enrich/update/digest endpoint request bodies.
Changed
- Prompt/model drift behavior — Drift is now advisory by default. Noumenon logs recommended re-analysis counts but does not auto re-analyze unless you explicitly pass --reanalyze prompt-changed or --reanalyze model-changed.
Fixes
- MCP repo path mapping — Remote MCP proxy now derives database names from local path semantics (e.g. mino) instead of org-repo remote URL synthesis (e.g. leifericf-mino), preventing status/query failures on path-to-db translation.
- Launcher command help — noum help <command> now renders command options (including analyze) so users can discover flags without leaving the CLI.
0.5.5
New
- Multi-repo introspect evaluation — extra_repos parameter on MCP, HTTP, and CLI introspect commands. Evaluates prompt changes across multiple repos to reduce overfitting. Averages scores from primary + extra repos.
- introspect-skipped query — New named query exposes skipped iterations (parse failures, validation errors, gate failures) for diagnosing introspect issues.
- Introspect status progress — noumenon_introspect_status now shows current iteration number and last outcome message, not just elapsed time.
Fixes
- Cascading template expansion — All prompt renderers (agent, introspect, benchmark, analyze, synthesize) switched from sequential str/replace to single-pass regex substitution. Previously, inserting a template that contained {{placeholder}} strings caused subsequent replacements to cascade, bloating prompts from 5K to 924K.
- Stale chunked prompts — reseed / bootstrap now uses save-prompt! which properly retracts old chunks before writing. Previously, a prompt bloated by introspect and stored as chunks survived reseeds because the raw upsert added :template without retracting stale :chunks.
- EDN extraction from prose — Introspect proposal parser now extracts the outermost {...} EDN map from optimizer responses that wrap the proposal in explanatory prose. Previously, the entire response was parsed as EDN, failing on any surrounding text.
- Git commit on Datomic-only changes — git-commit-improvement! no longer throws when introspect improves a Datomic-only target (examples, system-prompt, rules) that produces no filesystem changes. Previously, git commit exited 1 with "nothing to commit", which propagated as an exception inside with-modification, reverting the improvement.
- Introspect error persistence — Skipped iterations now store the raw optimizer response or error message in :introspect.iter/error for post-hoc diagnosis.
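The cascading-expansion bug in the first entry above is easy to reproduce in miniature. The Python sketch below contrasts sequential replacement with a single regex pass; the placeholder syntax follows the changelog's {{placeholder}} form, while the function names and the character class for placeholder names are assumptions for illustration.

```python
import re

PLACEHOLDER = re.compile(r"\{\{([a-zA-Z0-9_-]+)\}\}")


def render_sequential(template, values):
    """Old behaviour: each replacement also rewrites text inserted earlier."""
    out = template
    for key, value in values.items():
        out = out.replace("{{" + key + "}}", value)
    return out


def render_single_pass(template, values):
    """New behaviour: scan the template once; inserted text is never rescanned."""
    return PLACEHOLDER.sub(
        lambda m: values.get(m.group(1), m.group(0)),  # unknown names stay as-is
        template,
    )


template = "{{examples}} | {{question}}"
values = {"examples": "literal {{question}} text", "question": "Q"}

cascaded = render_sequential(template, values)   # "literal Q text | Q"
stable = render_single_pass(template, values)    # "literal {{question}} text | Q"
```

In the sequential version the inserted examples text is itself expanded by the later question replacement, which is how nested templates can multiply a prompt's size.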
0.5.4
Fixes
- MCP daemon lock contention — noum serve now auto-detects a running local daemon via daemon.edn and proxies tool calls to it instead of opening the database directly. Previously, the daemon's exclusive file lock caused every MCP tool call to fail with a generic "unexpected internal error."
- MCP error messages — Tool call errors now include the actual cause and tool name instead of "An unexpected internal error occurred." Database lock errors include actionable kill instructions and explicitly tell AI agents not to retry.
- MCP proxy auth header — Proxy mode no longer sends Authorization: Bearer null when connecting to a local daemon without a token.
- Setup binary path — noum setup code now resolves the noum binary via PATH (e.g. Homebrew at /opt/homebrew/bin/noum) instead of always hardcoding ~/.local/bin/noum.
- Demo release fallback — noum demo now searches the 5 most recent GitHub releases for a demo tarball instead of only checking the latest. Prevents "not found" errors when a patch release ships without a new demo database.
- Progress bar lifecycle — The launcher's progress handler now resets the bar on completion and creates a new bar when the total changes. Fixes the flashing green bar during digest benchmark and spurious "✓ digest done." lines between steps.
- Progress bar step labels — Digest sub-steps (analyze, benchmark) tag their SSE progress events with :step, so the bar shows "✓ analyze done." instead of "✓ digest done."
- Synthesize progress event — Added missing :current / :total keys to the synthesize progress event, preventing NPE in the launcher handler.
- Digest output formatting — Nested result maps (analyze, benchmark, synthesize) are now printed as an indented tree with floats rounded to 2 decimal places, instead of raw EDN.
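As a rough picture of the digest output change in the last entry, a nested result map can be flattened into indented lines with floats rounded to two places. The Python sketch below is only an illustration: the key names, the two-space indent, and the "key: value" layout are assumptions, not the launcher's actual format.

```python
def format_tree(result, indent=0):
    """Return a list of lines rendering a nested dict as an indented tree."""
    lines = []
    pad = "  " * indent
    for key, value in result.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(format_tree(value, indent + 1))  # recurse one level deeper
        elif isinstance(value, float):
            lines.append(f"{pad}{key}: {value:.2f}")       # round floats to 2 places
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Hypothetical digest result; real keys may differ.
digest = {"analyze": {"files": 3, "cost": 0.12345}, "benchmark": {"score": 0.5}}
tree = format_tree(digest)
```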
0.5.3
Fixes
- Stale JAR auto-update — jar/ensure! now reads version.edn from the installed JAR and compares against the launcher version. On mismatch, stops the daemon, downloads the matching release, and restarts fresh. Previously, an existing JAR was never re-checked, so Homebrew launcher updates silently ran against an old backend.
- Daemon bounce on upgrade — noum upgrade now stops the running daemon after downloading a new JAR, so the next command starts with the updated code.
- Version def shared — Moved from main.clj (private) to paths.clj so both main and api pass it to jar/ensure!.
0.5.2
Security hardening, bug fixes, and UX polish.
Security
- EDN read-eval disabled — *read-eval* bound to false in introspect code verification; {:readers {}} added to all edn/read-string calls parsing LLM responses, checkpoints, and external data
- CORS restricted — file:// origins now require explicit NOUMENON_ALLOW_FILE_ORIGIN env var
- Admin-only endpoints — /api/query-raw and /api/ask/sessions added to admin-only prefixes
- SSRF hardening — CGN range 100.64.0.0/10 added to blocked IP patterns; -- separator in git clone commands; proxy host URL validation
- Subprocess timeouts — Python, Node, C, and Elixir import extractors now time out after 30 seconds
- Hook state directory — Moved from world-writable /tmp to user-private ~/.noumenon/tmp/
- CI tag validation — GITHUB_REF_NAME validated as semver before shell substitution in release workflow
- Credential handling — Directory permissions set before writing config; warning on --token + --insecure
- MCP proxy — Admin tool forwarding logged; read-only flag respected for git_commit; SSRF check on proxy host
- Electron navigation — Restricted to exact daemon port instead of any localhost port
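To make the CGN addition in the SSRF entry concrete, the sketch below checks an address against a blocklist that includes 100.64.0.0/10. It is a Python illustration using the standard ipaddress module rather than the regex patterns the changelog describes, and every range other than 100.64.0.0/10, fe80::/10, fc00::/7, and loopback is an assumed, typical private-range entry.

```python
import ipaddress

# 100.64.0.0/10 is the carrier-grade NAT range named in the entry above;
# the RFC 1918 and link-local IPv4 ranges are assumed companions.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
        "169.254.0.0/16", "100.64.0.0/10", "::1/128", "fe80::/10", "fc00::/7",
    )
]


def is_blocked(ip_text):
    """True when the address falls inside any blocked network."""
    ip = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) so IPv4 ranges apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    # Membership is always False when address and network versions differ.
    return any(ip in network for network in BLOCKED_NETWORKS)
```

The /10 prefix covers 100.64.0.0 through 100.127.255.255, so 100.128.0.1 is the first address outside it.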
Fixes
- MCP digest skip flag — Synthesize step was gated on skip_analyze instead of skip_synthesize
- Merge retry usage — invoke-merge now accumulates LLM token usage from both attempts
- Agent nil dispatch — Guard against nil tool dispatch when LLM sends only :reflect
- Benchmark stop-flag — run-benchmark! accepts external stop-flag for HTTP introspect sessions
- Database deletion — Removed post-Datomic filesystem deletion that could corrupt shared storage
- Session limit race — register-ask-session! enforced atomically via single swap!
- Leaf file re-enrichment — Files with no imports now get empty [] for :file/imports to prevent redundant re-processing
- Test speed — 429 retry test binds *max-retries* to avoid 6-second sleep
- Limit param coercion — HTTP query endpoints coerce string :limit to long
- History help text — Replaced hardcoded prompt names with dynamic hint
UX Improvements
- CLI — Spinner cleanup on API errors; actionable watch failure messages; dynamic prompt listing; post-setup instructions; upgrade progress spinner; explicit "Daemon: not running" message
- TUI — Non-interactive auto-select warns to stderr; confirm defaults to false for safety
- UI — Feedback polarity from event data; in-app delete confirmation; active nav indicator; flex layout for ask results; theme cached in localStorage; graph loading skeleton; empty table/history states; truncation with tooltips; formatted introspect deltas; error state on network failure
- MCP — Digest description lists all pipeline steps; skip_synthesize in schema; search clarifies embed prerequisite; list_queries mentions required parameters
- Sidebar — Unicode icons replace ambiguous single letters
- Benchmark — "Select 2 runs to compare" hint text
0.5.1
TUI hotfix.
Fixes
- Arrow key navigation — Menu selector now uses cond instead of case for escape sequence matching (Babashka's case doesn't resolve var references)
- Menu line breaks — Raw terminal mode uses \r\n instead of \n for correct vertical layout
- Back navigation — Selecting "← Back" no longer leaves a stray line in the console
- Key input — Reads from /dev/tty directly instead of System/in for reliable raw-mode input
New
- embed command in launcher — help text and Pipeline menu entry
0.5.0
TF-IDF vector search, hierarchical synthesis, and cross-repo benchmarks.
New
- TF-IDF vector search — embed pipeline stage builds a vocabulary and vector index from file and component summaries. Pure Clojure, no external dependencies beyond Nippy for serialization.
- noumenon_search MCP tool — Semantic file/component search without the agent loop. Zero LLM calls, millisecond responses.
- Ask agent seeding — The ask agent is seeded with TF-IDF search results before querying the knowledge graph, giving it a warm start on relevant files and components.
- embedded benchmark layer — Measures TF-IDF retrieval quality alongside raw and full KG layers.
- :full layer enriched — Benchmark's full layer now includes both KG query results and TF-IDF search results when available — representing everything Noumenon has.
- Hierarchical map-reduce synthesis — Repos with 250+ files are synthesized per directory partition, then merged. Fixes guava (3,333 files) and redis (1,754 files) which previously returned 0 components.
- Session seed logging — Ask sessions persist TF-IDF seed results to Datomic for analytics.
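For readers unfamiliar with the technique behind the embed stage, here is a small, dependency-free TF-IDF search sketch in Python. It is a generic illustration, not Noumenon's Clojure implementation: whitespace tokenization, the smoothed IDF formula, cosine scoring, and the sample summaries are all assumptions.

```python
import math
from collections import Counter


def vectorize(tokens, idf):
    """Build an L2-normalized TF-IDF vector, ignoring out-of-vocabulary terms."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}


def build_index(summaries):
    """Return (idf, vectors) for a {name: summary-text} mapping."""
    tokenized = {name: text.lower().split() for name, text in summaries.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    # Smoothed IDF so terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {name: vectorize(tokens, idf) for name, tokens in tokenized.items()}
    return idf, vectors


def search(query, idf, vectors, top_k=3):
    """Rank documents by cosine similarity to the query; drop zero scores."""
    q = vectorize(query.lower().split(), idf)
    scored = [
        (sum(w * vec.get(term, 0.0) for term, w in q.items()), name)
        for name, vec in vectors.items()
    ]
    return [name for score, name in sorted(scored, reverse=True)[:top_k] if score > 0]


# Hypothetical file summaries standing in for analyzed knowledge-graph data.
summaries = {
    "auth.clj": "bearer token auth middleware",
    "graph.cljs": "force directed graph layout",
    "llm.clj": "provider base url and api key resolution",
}
idf, vectors = build_index(summaries)
hits = search("token auth", idf, vectors)
```

Because both vectors are normalized, the dot product is the cosine similarity, and no LLM call is involved at query time.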
Changed
- Neural net input — Query routing model now uses TF-IDF vectors instead of bag-of-words, giving it term-importance weighting. Existing trained models require retraining.
- MCP digest handler — Now includes synthesize and embed steps (was missing both).
- Raw context limit — Reduced from 800K to 500K chars to stay within the ~200K token API limit.
- Default benchmark provider — Falls back to GLM instead of Claude CLI.
Fixes
- MCP benchmark handler wasn't passing model-config, causing raw layer to silently fail via Claude CLI
- Synthesize retraction + creation in same Datomic transaction caused datoms-conflict on re-synthesis
- MCP synthesize and digest handlers weren't seeding new prompt templates
- Recursive directory partitioning caused StackOverflowError on flat directory structures (redis)
- Merge synthesis validator rejected components with :source-components instead of :files
Benchmarks
Cross-repo benchmark (8 repos, 7 languages, 22 deterministic questions each):
| Metric | Without Noumenon | With Noumenon |
|---|---|---|
| Accuracy | 20% | 53% |
| Token cost | 37K | 7K |
| Speed | 13.6s | 6.1s |
0.4.0
Architectural synthesis, visual desktop UI, and interactive CLI.
New
- Interactive TUI — noum with no arguments enters a menu-driven interface. Browse commands by category, select repositories/sessions/queries from live data. Smart arg collection for all commands including introspect sub-actions.
- Visual desktop UI — Electron + ClojureScript app with force-directed graph visualization. Three-level drill-down (components, files, segments), floating Ask overlay with streamed reasoning, @-mention autocomplete. Launch with noum open.
- synthesize command — Identifies logical components from file summaries, import edges, and directory structure. Maps component dependencies, layers, and categories. Language-agnostic.
- Component entities — component/name, component/summary, component/layer, component/category, component/depends-on. Files link via arch/component.
- 9 new named queries — components, component-files, component-dependencies, component-dependents, component-authors, component-churn, component-bus-factor, cross-component-imports, subsystems.
- noum demo — Pre-built knowledge graph for instant querying without credentials.
- Top-down query strategy — Ask agent starts at component level for architectural questions.
Security
- Electron renderer uses contextBridge (no executeJavaScript)
- CORS restricted to Electron origin
- Bounded `edn/read-string` on LLM-sourced strings
Fixes
- Inline markdown parser duplication bugs
- Concurrent SSE submission guard
- Electron namespace collision with Replicant fragments
- Unbounded memoize memory leak in graph builders
0.3.1
Security and UX hardening.
- Path traversal fix on `DELETE /api/databases/:name`
- Constant-time token comparison for auth
- HTTPS by default for remote `--host` connections
- Token via env var instead of CLI arg (hidden from `ps aux`)
- SSE error propagation — errors surface correctly instead of wrapping as success
- Delete confirmation with `--force` to skip
- Relative path resolution for all commands
- 20+ additional security, UX, and robustness fixes — see git history
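
For readers unfamiliar with the constant-time token comparison item, the idea can be sketched as follows. This is a generic illustration using Python's standard library, not the daemon's actual code; `token_matches` is a hypothetical name.

```python
import hmac


def token_matches(presented: str, expected: str) -> bool:
    """Compare bearer tokens without short-circuiting on the first mismatch."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

An ordinary `==` on strings can return as soon as it finds a differing character, so response time can leak how much of a guessed token was correct. `hmac.compare_digest` is designed to avoid that content-dependent early exit.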
0.3.0
noum CLI launcher, HTTP daemon, and Docker image.
- `noum` binary — Self-contained Babashka launcher. Auto-downloads JRE and backend. 30 commands with custom TUI (spinner, menu, progress bar, table).
- HTTP daemon — 22 REST endpoints, bearer token auth, SSE progress streaming.
- Docker image — 167MB Alpine, non-root, auth required for network access.
- `noum setup` — Auto-configures MCP for Claude Desktop and Claude Code.
- OpenAPI spec at `docs/openapi.yaml`
0.2.0
Introspect (autonomous self-improvement) and ML query routing.
- Introspect loop — LLM optimizer proposes changes to prompts, examples, rules, and code. Keeps improvements, reverts regressions. Multi-repo evaluation to prevent overfitting.
- ML query routing — On-device neural network predicts which Datalog queries to try. Trained locally at zero token cost.
- Issue reference extraction from commits (`#123`, `PROJ-456`)
- Scoped re-analysis — `--reanalyze` with `all`, `prompt-changed`, `model-changed`, `stale` modes
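
The issue reference extraction item names two shapes of reference, `#123` and `PROJ-456`. A rough sketch of matching both in a commit message is shown below; the regular expression and the `extract_issue_refs` name are assumptions, and Noumenon's actual patterns may differ.

```python
import re

# Assumed patterns: "#123" (GitHub-style) and "PROJ-456" (tracker-style key).
ISSUE_REF = re.compile(r"(?<![\w#])#\d+\b|\b[A-Z][A-Z0-9]+-\d+\b")


def extract_issue_refs(message: str) -> list[str]:
    """Return issue references in order of appearance, without duplicates."""
    seen = []
    for ref in ISSUE_REF.findall(message):
        if ref not in seen:
            seen.append(ref)
    return seen
```

A pattern this loose also matches strings such as `UTF-8`, so a production extractor would probably need an allow-list of project keys or similar filtering.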
0.1.0
First public release.
- Import pipeline (git history into Datomic knowledge graph)
- LLM analysis (semantic metadata: complexity, safety, purpose)
- Import graph extraction (10+ languages)
- Named Datalog queries with parameterization and rules
- Agentic `ask` command (natural-language via iterative Datalog)
- MCP server (`noum serve`)
- Benchmark framework with checkpointing and resume
- Concurrent processing (configurable parallelism)