Experimental, early beta. Data model and interfaces are unstable; expect breaking changes between releases.

Changelog

All notable changes to Noumenon, mirrored from CHANGES.md in the source repo.

Unreleased

0.12.3

Fixes

  • noum ask and noum introspect now report a clean 400 when NOUMENON_LLM_BASE_URL isn't a real URL — saving a bare alias like claude in ~/.noumenon/credentials (instead of https://api.anthropic.com) used to slip past the existing blank-check in llm/require-base-url!, so the value flowed all the way into the http-kit call that builds <base-url>/v1/messages. http-kit then failed URL parsing with host is null: claude/v1/messages, the daemon's route catch-all rewrote that as a generic 500, and the launcher displayed Error: Internal server error — nothing pointing at credentials. Every command that touches the LLM (ask, introspect, analyze, synthesize, enrich --analyze, digest, benchmark, update --analyze) shared this failure mode because they all route through llm/make-messages-fn-from-opts. valid-base-url? now parses the value as a java.net.URI and requires an http(s) scheme plus a non-blank host; require-base-url!, require-api-key!, and require-model! all tag their ex-info with :status 400 so the HTTP route catch-all serves them as 400 "Invalid NOUMENON_LLM_BASE_URL: \"claude\". Expected an absolute URL with scheme and host (e.g. https://api.anthropic.com)." instead of swallowing the message into a 500.
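The validation rule described above (an http(s) scheme plus a non-blank host, with the error tagged as a 400) can be sketched as follows. This is a Python illustration only, not the project's Clojure code: the names `valid_base_url` and `require_base_url` are stand-ins, and `urllib.parse.urlsplit` plays the role that `java.net.URI` plays in the entry above.

```python
from urllib.parse import urlsplit


def valid_base_url(value):
    """Return True only for an absolute http(s) URL with a non-blank host."""
    if not isinstance(value, str) or not value.strip():
        return False
    try:
        parts = urlsplit(value.strip())
    except ValueError:  # e.g. a malformed IPv6 literal
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)


def require_base_url(value):
    """Raise an error carrying status 400 (an assumed attribute) for unusable values."""
    if not valid_base_url(value):
        err = ValueError(
            f'Invalid NOUMENON_LLM_BASE_URL: "{value}". Expected an absolute URL '
            "with scheme and host (e.g. https://api.anthropic.com)."
        )
        err.status = 400  # lets a route catch-all serve a 400 instead of a 500
        raise err
    return value
```

A bare alias such as `claude` parses with no scheme and no host, so it is rejected before any HTTP call is attempted.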

0.12.2

Fixes

  • noum update and noum watch now stream progress instead of going silent — both commands posted to /api/update without SSE, and the server's handle-update had no streaming path, so the launcher's HTTP request blocked until the daemon finished a fresh import + LLM analysis pass (potentially many minutes) with no per-file feedback. The output looked indistinguishable from a hang: only the JRE-selection line printed, then dead silence until completion. The server now routes handle-update through mw/with-sse (mirroring /api/import / /api/analyze / /api/enrich / /api/digest) and threads a :progress-fn through sync/update-repo! into git/import-commits!, imports/enrich-repo!, and analyze/analyze-repo!. The launcher adds "update" to its progress-commands set so it requests SSE and renders a TUI spinner/bar, and watch-loop! now builds a fresh progress handler per polling iteration so each in-flight iteration shows live progress instead of going dark for the duration of the call.
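The idea of threading a progress callback through a long-running update and emitting each report as a Server-Sent Events frame can be sketched roughly as below. This is a Python illustration under stated assumptions: the event names (`progress`, `result`), the payload fields, and the `update_repo` stand-in are hypothetical and are not taken from the Noumenon source.

```python
import json


def sse_frame(event, payload):
    """Format one SSE frame: an event line, a JSON data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def update_repo(files, progress_fn):
    """Hypothetical stand-in for the update pipeline; reports after each file."""
    for index, path in enumerate(files, start=1):
        # ... the real import / enrich / analyze work would happen here ...
        progress_fn({"step": "update", "current": index, "total": len(files), "file": path})
    return {"ok": True, "files": len(files)}


def stream_update(files, write):
    """Write a frame as soon as each progress report arrives, then the final result."""
    result = update_repo(files, lambda evt: write(sse_frame("progress", evt)))
    write(sse_frame("result", result))
```

Because `write` is called inside the callback, the client sees per-file output while the work is still running rather than a single response at the end.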

0.12.1

Fixes

  • noum reuses an existing system Java when one is available — the launcher used to unconditionally download a ~200MB JRE to ~/.noumenon/jre/ on first run, even when the user already had Java 21+ installed. It now checks $JAVA_HOME and java on PATH first; if either points at a Java 21+ runtime (the minimum the uberjar targets — older JVMs fail with UnsupportedClassVersionError at class-load time), the launcher uses it and skips the download. The bundled JRE remains the fallback for users on Java 17 or older, or with no Java at all.
  • noum JRE bootstrap no longer fails on WSL with a cross-filesystem extraction error — the launcher staged the downloaded JRE in /tmp (which on WSL is tmpfs) before moving it into ~/.noumenon/jre/ (which lives on ext4). java.nio.file.Files/move can rename single files across filesystems but throws FileSystemException on non-empty directories, so the move blew up on the JRE's legal/, bin/, and lib/ subdirs with Error: /tmp/noum-jre-…/jdk-21…/legal. Staging now happens under ~/.noumenon/jre-staging-…/ so the move is always intra-filesystem; a copy-tree fallback covers any other cross-fs configuration.
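The staging fix above comes down to two ideas: stage next to the final location so the rename stays on one filesystem, and fall back to copy-then-delete if the rename still crosses a filesystem boundary. A minimal Python sketch, assuming hypothetical helper names (`move_tree`, `stage_and_install`) and a hypothetical `populate` callback that fills the staging directory:

```python
import errno
import os
import shutil
import tempfile


def move_tree(src, dst):
    """Rename src to dst; on a cross-filesystem error, copy the tree and remove src."""
    try:
        os.rename(src, dst)  # succeeds when both paths share a filesystem
    except OSError as exc:
        if exc.errno != errno.EXDEV:  # only handle the cross-device case
            raise
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)


def stage_and_install(populate, install_dir):
    """Stage in a sibling of install_dir, fill it via populate, then move it into place."""
    parent = os.path.dirname(os.path.abspath(install_dir))
    os.makedirs(parent, exist_ok=True)
    staging = tempfile.mkdtemp(prefix="jre-staging-", dir=parent)
    populate(staging)
    move_tree(staging, install_dir)
    return install_dir
```

The sketch assumes `install_dir` does not exist yet; a real launcher would also need to decide what to do with a stale or partial install.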

0.12.0

BREAKING CHANGES

  • LLM configuration collapsed to two env vars (plus one optional) — Noumenon used to carry a multi-provider router: an EDN-encoded provider map (NOUMENON_LLM_PROVIDERS_EDN), a default-provider selector (NOUMENON_DEFAULT_PROVIDER), per-provider env keys (NOUMENON_ZAI_TOKEN, ANTHROPIC_API_KEY), a runtime-mode toggle (NOUMENON_RUNTIME_MODE), an HTTPS allowlist (NOUMENON_LLM_BASE_URL_ALLOWLIST_EDN), model aliases (sonnet/haiku/opus), a --provider flag on every LLM-touching subcommand, llm-providers/llm-models CLI subcommands, and matching noumenon_llm_providers/noumenon_llm_models MCP tools. All of that is gone.

What changed

- Replaced: NOUMENON_LLM_PROVIDERS_EDN, NOUMENON_DEFAULT_PROVIDER, NOUMENON_LLM_PROVIDER, NOUMENON_ZAI_TOKEN, ANTHROPIC_API_KEY, NOUMENON_RUNTIME_MODE, and NOUMENON_LLM_BASE_URL_ALLOWLIST_EDN → with NOUMENON_LLM_BASE_URL (required), NOUMENON_LLM_API_KEY (required), and NOUMENON_LLM_MODEL (optional default for --model).
- Removed: the --provider flag on every subcommand; the llm-providers and llm-models CLI subcommands; the noumenon_llm_providers and noumenon_llm_models MCP tools; the provider property on every other MCP tool input schema; the sonnet/haiku/opus model aliases. --model now takes a raw model id and passes it through to the upstream endpoint verbatim.
- Credentials file fallback: ~/.noumenon/credentials is now read directly by Noumenon as a fallback to env vars — no source step is needed. The fallback is automatically disabled when the HTTP daemon binds to anything other than 127.0.0.1, so a shared-service deployment cannot pick up a user's on-disk credentials.

Why

Noumenon is not an LLM router. The provider-map config was an in-house mini-router that duplicated what dedicated tools — OpenRouter, LiteLLM, and any Anthropic-Messages-API-compatible gateway — do far better, with broader provider coverage and active maintenance. For multi-model flexibility, point NOUMENON_LLM_BASE_URL at one of those instead. This was the highest-friction surface in the codebase and the collapse removes ~300 lines of routing, validation, allowlisting, and discovery code with no loss of supported capability. The local single-user and headless shared-service deployment shapes are both still first-class — the daemon's bind address picks which credential-resolution policy applies.

How to upgrade

Pick the upstream and set three shell variables (or run noum setup to populate ~/.noumenon/credentials interactively):

```sh
# Anthropic direct
export NOUMENON_LLM_BASE_URL=https://api.anthropic.com
export NOUMENON_LLM_API_KEY=sk-ant-...
export NOUMENON_LLM_MODEL=claude-sonnet-4-6-20250514

# OpenRouter (Anthropic-compatible route)
export NOUMENON_LLM_BASE_URL=https://openrouter.ai/api/v1
export NOUMENON_LLM_API_KEY=sk-or-...
export NOUMENON_LLM_MODEL=anthropic/claude-sonnet-4-5

# Local LiteLLM (Anthropic-format proxy)
export NOUMENON_LLM_BASE_URL=http://localhost:4000
export NOUMENON_LLM_API_KEY=sk-litellm-master-...
export NOUMENON_LLM_MODEL=
```

If you previously used --provider claude --model sonnet, drop --provider entirely and pass the full upstream-recognized model id via --model (or set NOUMENON_LLM_MODEL). Old env vars are not consulted — the launcher's noum setup wizard will prompt for the new ones.
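The resolution policy described in this release (env vars first, the credentials file only as a fallback, and no file fallback unless the daemon binds to 127.0.0.1) can be sketched as below. This is a Python illustration, not the project's code: `resolve_llm_config` is a hypothetical name, the credentials file is assumed to be already parsed into a dict, and the error wording is invented for the example.

```python
LLM_KEYS = ("NOUMENON_LLM_BASE_URL", "NOUMENON_LLM_API_KEY", "NOUMENON_LLM_MODEL")
REQUIRED_KEYS = LLM_KEYS[:2]  # the model is an optional default for --model


def resolve_llm_config(env, file_credentials, bind_host):
    """Env vars win; file credentials are consulted only on a 127.0.0.1 bind."""
    allow_file = bind_host == "127.0.0.1"
    config = {}
    for key in LLM_KEYS:
        value = env.get(key)
        if not value and allow_file:
            value = file_credentials.get(key)
        if value:
            config[key] = value
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("Missing required LLM setting(s): " + ", ".join(missing))
    return config
```

With the same on-disk credentials, a daemon bound to 127.0.0.1 resolves a configuration, while one bound to any other address fails unless the env vars are set explicitly.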

0.11.1

Fixes

  • Docker image build adds git to the build stage — the v0.11.0 release pushed clj-p4 from :local/root to {:git/tag "v0.6.1-alpha"}, so clojure -T:build uber now needs to clone the dependency via git inside the build container. The build stage's apk add --no-cache curl bash did not include git, so the docker job in the release workflow failed at clojure -T:build uber with a tools.deps git-clone error. The runtime stage already had git for noumenon's own clone-and-import workflow; the build stage now matches.

0.11.0

Changed

  • clj-p4 upgraded to v0.6.1-alpha and pinned to a public Git tag — deps.edn previously resolved clj-p4 via :local/root "../clj-p4", which made noumenon non-buildable on any machine without a sibling clone of the library. The coordinate is now {:git/tag "v0.6.1-alpha" :git/sha "332b280"}, matching the format cognitect-labs/test-runner already uses in the same file. clj-p4 v0.4.0-alpha renamed several namespaces (clj-p4.exclude → clj-p4.excludes, clj-p4.spec → clj-p4.predicates, clj-p4.shell.proc → clj-p4.io.subprocess) and renamed api/sync! to api/fetch!; v0.6.0-alpha removed the pre-compiled :exclude escape hatch outright (now throws :legacy-exclude-removed at the boundary). noumenon.p4 was still calling all four; the adapter is rewritten against the post-0.4.0-alpha API surface. No public-API change for noumenon.p4's callers (noumenon.repo, noumenon.repo-manager, noumenon.sync).
  • Binary filtering is delegated to clj-p4 — clj-p4 v0.5.0+ ships its own curated binary-category set (nine categories, ~77 patterns, including Wwise .bnk/.wem and Unreal cooked content .uasset/.umap/.upk/.ubulk that the noumenon-side list also covered) plus a Perforce-type catch-all that drops any revision whose :rev/type is :binary/:apple/:resource. Both noumenon.p4/clone! and noumenon.p4/sync! now pass :exclude-binaries? true :exclude-categories :all, replacing the noumenon-side :exclude pattern vector compiled from resources/p4-excludes.edn. This subsumes the post-0.10.3 fix that made sync! forward the same exclude vector as clone! — clone-vs-sync symmetry is now structural (identical fixed policy at both call sites) rather than a bug that had to be tracked.

Removed

  • resources/p4-excludes.edn — 42-line categorised extension blocklist that duplicated clj-p4's own built-in list. clj-p4 owns the responsibility now; the file is gone, the resource loader (excludes-resource delay in noumenon.p4) is gone, and the compile-excludes private helper is gone.
  • :no-default-excludes?, :extra-excludes, :includes options on noumenon.p4/clone! — these were documented in the function's docstring but had no actual producer in src/ or test/. If a future caller needs path-level carve-outs, plumb :excludes/:includes through p4-opts directly to clj-p4.api/clone!/fetch!.

0.10.3

Fixes

  • Introspect extra_repos resolver and target-set are no longer duplicated — resolve-extra-repos had near-identical inline copies in CLI introspect (cli/commands/introspect.clj) and HTTP introspect (http/handlers/introspect.clj); they used different conn-open helpers (db/connect-and-ensure-schema vs db/get-or-create-conn) but were otherwise the same shape. Lifted to noumenon.repo/resolve-extra-repos so both transports go through one definition. The redundant mw/allowed-introspect-targets set definition in http/middleware.clj is removed in favor of the canonical util/valid-introspect-targets that CLI already used.
  • noum query clamps result sets like the HTTP API — CLI returned the full :ok seq from query/run-named-query, so noum query recent-commits . against a large repo dumped every row to stdout. HTTP capped at 500 by default and 10000 max via clamp-limit. CLI now accepts --limit <n> (default 500, clamped to [1, 10000]) and applies it via the same query/clamp-limit helper — lifted into noumenon.query so CLI and HTTP share one definition.
  • noum introspect --git-commit is silently disabled on bare repos — HTTP /api/introspect already gated :git-commit? on (not (bare-repo?)) so a bare clone (no working tree) wouldn't hit a confusing failure deep in the commit step. The CLI passed the flag through unchecked. CLI now matches: when --git-commit is set against a bare repo, the flag is silently downgraded and a [--git-commit ignored] target is a bare git repo message is logged so the user sees what happened.
  • HTTP digest runs the same pipeline as CLI digest — CLI digest was: update → analyze → resolve calls → synthesize → embed → benchmark. HTTP digest was missing the resolve calls and embed steps entirely, so a daemon-mode digest produced a graph with no cross-segment call edges (:code/calls) and no TF-IDF index. noumenon_search returned no results, and segment-callers / uncalled-segments queries came back empty. HTTP now runs both steps in the same order: calls is gated on skip_analyze (matching CLI), embed runs unconditionally (matching CLI). Result map exposes :calls and :embed keys.
  • HTTP enrich/analyze/synthesize refuse to silently create empty DBs — calling /api/enrich, /api/analyze, or /api/synthesize on a never-imported repo would let db/get-or-create-conn create an empty Datomic DB on disk and the handler would report "0 files processed" success — leaving the user with a phantom database and the impression that nothing was wrong. CLI's with-existing-db already errored out in the same scenario; HTTP now matches via a new with-imported-repo middleware wrapper that returns 404 "Database not yet imported … Run /api/import first" when the on-disk db dir doesn't exist. Endpoints that legitimately establish the DB (import, update, digest) keep using the create-on-demand with-repo.
  • noum ask persists session records to the meta DB — only the HTTP /api/ask handler called ask-store/save-session! after each ask invocation. The CLI returned the answer to stdout but left no trace in the meta DB, so CLI ask sessions were invisible to the introspect loop's feedback/training signal and noum ask --continue-from <id> couldn't reference its own prior sessions. CLI now saves the session with :channel :cli, :caller :human, the resolved repo db-name, and accurate wall-clock duration. Save failures are logged and swallowed so the answer always reaches the user.
  • noum ask --max-iterations is now capped at 50 — the HTTP /api/ask handler clamped :max_iterations to [1, 50] so an LLM agent (or careless caller) couldn't run away with cost. The CLI's do-ask passed the user-supplied value straight through, so noum ask -q "…" --max-iterations 10000 . would let the agent loop run 10000 iterations regardless of --max-cost / --stop-after. CLI now applies the same [1, 50] clamp; the default (10) is unchanged.
  • HTTP synthesize and digest reseed artifacts before running — CLI synthesize and CLI digest both call artifacts/reseed! before constructing the LLM, so an updated prompt/query/rules seed is picked up automatically. The HTTP handlers skipped this step, so a daemon-mode noum digest produced different results than the CLI version after a seed-source edit. run-synthesize and run-digest now run reseed up front, matching the CLI contract. reseed! is identity-attribute-driven and idempotent, so the cost is zero on the steady state.
  • HTTP synthesize and HTTP digest's synth step honor :max-tokens 16384 — POST /api/synthesize and the synthesize step inside POST /api/digest were calling wrap-as-prompt-fn-from-opts with no max-tokens override, so synth output (long architectural component descriptions) was getting truncated at the provider default (typically 4096) on the daemon path. The CLI synthesize command already raised the cap to 16384; HTTP now matches. The digest handler builds a separate synth-llm for the synthesize step (mirroring the CLI digest pattern) so the analyze and benchmark steps keep using the unraised cap.
  • noum list-databases --delete refuses to wipe the meta DB — the CLI's --delete branch had no reserved-name guard, so a typo could destroy noumenon-internal and take every prompt, query, rules artifact, benchmark/introspect run record, ask session, token, and setting with it. The HTTP DELETE /api/databases/:name handler already rejected this with a 400; the CLI now matches with a Cannot delete reserved database: <name> error and exit 1. The CLI delete path also now calls db/evict-conn! after a successful delete (matching the HTTP handler), so a stale cached connection doesn't survive past the deletion.
  • noum analyze --no-promote now actually bypasses the promotion cache — the --no-promote flag was only exposed on the HTTP and MCP transports; the CLI didn't even define it, so users who wanted to force a fresh LLM call had no way to do so via noum analyze. Added the flag to the analyze CLI spec and plumbed it through build-analyze-opts to analyze-repo!'s :no-promote? parameter, matching HTTP's (boolean (:no_promote params)) shape so missing/false/true all produce a definite boolean. CLI, HTTP, and MCP now agree on the contract.
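Several entries in this list apply the same clamp on both transports: a default when the value is missing or unparseable, a floor, and a ceiling. A minimal Python sketch of that shape, assuming a hypothetical `clamp_limit` signature (the bounds and defaults are the ones quoted in the changelog entries):

```python
def clamp_limit(raw, default=500, lo=1, hi=10000):
    """Parse a limit: missing/unparseable values get the default, others are clamped."""
    try:
        value = int(raw)
    except (TypeError, ValueError):  # None, "", "abc", ...
        return default
    return max(lo, min(hi, value))


# Query result limits: default 500, clamped to [1, 10000].
query_limit = clamp_limit("50000")

# Ask iterations: default 10, clamped to [1, 50].
max_iterations = clamp_limit(10000, default=10, lo=1, hi=50)
```

Sharing one helper between CLI and HTTP is what keeps the two transports from drifting, which is the recurring theme of this release.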

Removed

  • Dead mcp.handlers.* namespace tree — fd43977 refactor(mcp): make the MCP server a pure proxy (2026-04-30) made the bridge forward every tools/call to the daemon over HTTP and removed the in-process handler dispatch. The five handler namespaces (mcp.handlers.{query,mutation,benchmark,introspect,meta}) and their support helpers in mcp/util.clj (with-conn, lookup-repo-uri, resolve-extra-repos, selector-opts, validate-llm-inputs!, provider+model, validate-layers, length-cap defs, allowed-layers/allowed-introspect-targets sets) have been carrying no live callers since then. Deletion is no behavior change — the actual MCP behavior is whatever the HTTP daemon does. Audit findings about MCP-vs-HTTP drift in those handlers were false positives against dead code; this removes the surface area so future audits see only live code paths.
  • --no-auto-update flag on noum serve — the only consumer of the :auto-update setting was mcp.util/with-conn's auto-update branch, deleted alongside the handler tree. The serve command's epilog text is updated to reflect that the bridge is stateless and forwards to the daemon.

Fixes

  • HTTP POST /api/analyze honors the reanalyze parameter — noum analyze . --reanalyze stale (and any other scope) silently produced zero analyzed files because the daemon's HTTP handler never called the retraction step that the CLI and MCP handlers both did. The :reanalyze field on the JSON body was dropped on the floor; the user only saw digest work because its update step retracted analysis on changed files via a different code path. The two near-identical local copies of prepare-reanalysis! (one in cli/commands/pipeline.clj, one in mcp/handlers/mutation.clj) are lifted to a single noumenon.sync/prepare-reanalysis! plus a shared valid-reanalyze-scopes set; the HTTP handler now calls it before analyze-repo! and returns 400 on an invalid scope. CLI, HTTP, and MCP now agree on the contract.

0.10.2

Fixes

  • benchmark no longer eagerly resolves a provider when there's no model to use — load-run-context always called llm/make-isolated-prompt-fn for any run that included the :raw layer, which forced full provider resolution (model lookup, API-key validation) at construction time even if the caller had supplied an invoke-llm mock. Tests that didn't bother passing :model-config (defaulted to {:provider "glm"}) crashed with "No model selected for provider glm"; tests that did pass a :provider "claude" config crashed on the missing API key. The construction is now gated on (:model model-config) — explicit model present means the user wants the isolated path; without it, select-llm-fn falls back to the main invoke-llm for :raw stages, matching its existing fallback contract. The with-bench-mocks test helper also stubs make-isolated-prompt-fn so tests that DO pass an explicit model config (without a real API key) don't trigger the API-key check.
  • elixir-test skips cleanly when Erlang runtime is missing — the test guard checked which elixir only. On a machine with elixir on PATH but no Erlang (erl: not found), the which test passed, downstream import-extraction silently produced zero edges, and the count-query assertion (pos? (ffirst edges)) NPEd because ffirst of an empty Datalog result is nil. The guard now runs elixir --version so the four Jason tests skip cleanly when the runtime is broken; the assertion is also nil-safe ((or (ffirst edges) 0)) so a mis-configured extraction surfaces as "expected positive edge count, got 0" instead of an NPE.
  • clojure -M:test runs the suite again — moving the per-language fixture sources to test/fixtures/ brought files like test/fixtures/clojure/test/myapp/core_test.clj under cognitect.test-runner's namespace-discovery walk. The fixture's declared ns (myapp.core-test) doesn't match its filesystem location, so (require 'myapp.core-test) failed with FileNotFoundException and the runner aborted before any test ran. The clj-kondo exclusion was added in f741235 but the equivalent test-runner exclusion was not. The :test alias now restricts discovery to test/noumenon via -d (for :main-opts) and :dirs (for :exec-fn); CI's clojure -M:test 2>&1 | tail -20 shows real test output instead of the FileNotFoundException startup error.
  • noum stop adopts an orphan daemon when daemon.edn is missing — when an earlier failed stop or a partial cleanup left the daemon JVM running but removed ~/.noumenon/daemon.edn, the launcher had no record of the daemon and reported "No managed daemon to stop." while the JVM kept holding the meta-db lock indefinitely. The user's only recourse was kill -9 from outside the tool. noum stop now falls through to the lsof lock-holder probe (the same one already used by noum start to name the conflicting PID) and adopts that PID through the existing SIGTERM-then-SIGKILL fallback. Output names the path taken so it's auditable: "No daemon.edn; adopting orphan PID N (cmdline)." → "Orphan daemon stopped (PID N)." or "Orphan daemon force-killed (PID N)."
  • Duplicate query-string keys return 400 instead of silently last-wins — parse-query-params collapsed repeated keys via (into {} …), so ?repo_path=/safe&repo_path=/etc reached the handler with /etc and a defender filtering on the first occurrence would miss the actual input. The dispatcher now rejects requests with any duplicated key with 400 "duplicate query parameter <name>". Single-value query strings keep their existing first-and-only semantics.
  • Negative limit no longer returns silently empty results — POST /api/query, /api/query-raw, /api/query-as-of, and /api/query-federated clamped only the upper bound ((min limit 10000)), so limit:-5 flowed through to (take -5 …) which returned [] while :total reported a non-zero count — a scripted client would misread the empty :results as "no data". A new clamp-limit middleware helper floors at 1 and caps at 10000; missing or unparseable values still default to 500.
  • Readers can attach feedback to their own ask sessions — auth/admin-only-prefixes listed /api/ask/sessions as a bare prefix and requires-admin? matched it via starts-with?, so all three sub-routes (list, detail-get, feedback-post) gated on admin. That meant a reader who created an ask session via /api/ask could not then post feedback on it — defeating the introspect loop's signal-harvesting hook. A new reader-allowed-patterns list overrides the prefix match for POST /api/ask/sessions/:id/feedback specifically; the list and detail endpoints stay admin-only.
  • LLM model-resolution errors return 400 with the cause, not bare 500 — resolve-model-id rejects three configuration-class shapes ("No model selected for provider X", "Configured :default-model is not listed in :models", "Model X is not configured") via ex-info with no :status, so any HTTP route that ran an LLM (/api/ask, /api/analyze, /api/synthesize, /api/digest) lost the actionable message inside the routes-handler 500 fallback — the user only saw "Internal server error". All three throws now carry :status 400 :message <reason> :user-message <reason>. The same shapes are also user-actionable (the client can pass model in the request body to override), so 400 is the right code.
  • Missing query_name returns a clean 400, not the 89-query registry — POST /api/query and /api/query-as-of length-capped query_name but didn't reject blank/missing values; they forwarded an empty string to query/run-named-query, which built an "Unknown query: " error string and concatenated all 89 query names — a ~4 KB response for what should be "query_name is required". Both handlers now reject the missing field up front, before with-repo, so the error is also independent of repo state.
  • noum daemon and noum serve honor NOUMENON_DB_DIR — both CLI commands hardcoded a ~/.noumenon/data fallback before the env-var lookup ever ran, so NOUMENON_DB_DIR=/some/path noum daemon silently kept locking ~/.noumenon/data. The lookup is hoisted into a single resolve-db-dir helper with documented precedence: --db-dir flag → NOUMENON_DB_DIR env → ~/.noumenon/data. http.server/resolve-server-config's own env-var support remains unchanged for callers that bypass the CLI; the CLI just stops shadowing it.
  • /api/artifacts/history?type=prompt (no name) returns 400, not 500 — the HTTP handler forwarded a nil name into the Datalog query inside artifacts/prompt-history, which barfed with "Unable to find data source: $__in__2" and surfaced to the client as a generic "Internal server error". The MCP handler already validated this branch; the HTTP handler now matches with a clean 400 "name is required when type is 'prompt'", plus a length cap on name.
  • params as a JSON array (or any non-object) returns 400, not 500 — POST /api/query, /api/query-as-of, and /api/query-federated keywordized params with (into {} (map (fn [[k v]] …)) raw) *before* any validator ran. Passing {"params":[1,2]} made the transducer try to destructure a Long as a [k v] pair, which threw IllegalArgumentException and surfaced to the client as a generic 500. validate-params! now type-checks first (rejects non-maps with a clear "params must be an object" message), and the three handlers run that check before keywordizing — so a non-map shape always 400s, regardless of whether the repo resolves.
  • Bad URLs to POST /api/repos now return 400 instead of 500 — git/validate-clone-url! and validate-url-host! rejected unsafe URLs (file://, http://localhost, http://127.0.0.1, unresolvable hosts) by throwing ex-info with no :status, so the HTTP routes handler's (or status 500) fallback turned every kind of bad URL into a generic 500 with the body {"ok":false,"error":"Internal server error"}. The actual rejection reason was only visible in the daemon log. Both validators now set :status 400 :message <reason>, matching the pattern every other input validator already follows.
  • schema-summary renders value-type and cardinality as keywords, not raw entity IDs — query/list-attributes was binding ?vt/?card directly to the value-type/cardinality entity refs. Datomic Local does not auto-resolve refs in :find, so the JSON returned by GET /api/schema/<db-name> (and the MCP noumenon_get_schema tool that wraps it, and the CLI show-schema output) printed lines like :arch/component 20 35 — … instead of :arch/component :db.type/ref :db.cardinality/one — …. The endpoint's whole point is to give an LLM/agent something it can read, so the numeric IDs made the surface effectively useless. The query now joins to :db/ident for both refs.
  • HTTP /api/import now persists :repo/head-sha — the CLI's do-import already wrote :repo/uri and :repo/head-sha after a fresh import, but the HTTP handler's run-import did not, so GET /api/status/<db-name> (and the MCP noumenon_status tool that wraps it) returned head-sha: null after the documented first-step workflow. The MCP description tells callers to "compare with git rev-parse HEAD to check if the knowledge graph is up to date" — that comparison was useless on a freshly-imported repo. run-import now mirrors the CLI and transacts {:repo/uri repo-path :repo/head-sha (git/head-sha repo-path)} after the import is complete; the SHA is also returned in the response body.
  • derive-db-name disambiguates same-basename repos with a path hash — two filesystem paths that happened to share a basename (e.g. monorepo subdirs both named repo, or two clones of the same project at different locations) silently collapsed to the same Datomic database. The user's knowledge graph for one repo was getting merged with another's commits, files, and analyses with no warning. db-name format is now <sanitized-basename>-<12-hex-of-canonical-path>; same canonical path always yields the same db-name (so re-running on the same repo is still idempotent), but different paths now never collide. Existing databases under the old bare-basename names will not be recognized after upgrade — re-import the affected repos. Tests that hardcoded "ring" / "jason" / "mino" as db-names now derive the name through util/derive-db-name so they continue to track the CLI's actual derivation.
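The new db-name format, `<sanitized-basename>-<12-hex-of-canonical-path>`, can be illustrated with a short Python sketch. The changelog does not say which hash function or sanitization rules Noumenon uses, so SHA-256, the allowed-character set, and the `repo` fallback below are all assumptions made for the example; only the overall shape of the name comes from the entry above.

```python
import hashlib
import os
import re


def derive_db_name(repo_path):
    """Build '<sanitized-basename>-<12 hex chars>' from the canonical repo path."""
    canonical = os.path.realpath(repo_path)  # resolves symlinks and trailing slashes
    base = os.path.basename(canonical) or "repo"  # hypothetical fallback for "/"
    # Assumed sanitization: collapse anything outside [A-Za-z0-9_-] into "-".
    sanitized = re.sub(r"[^A-Za-z0-9_-]+", "-", base).strip("-") or "repo"
    # Assumed hash: SHA-256 of the canonical path, truncated to 12 hex characters.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"{sanitized}-{digest}"
```

Two paths that share a basename now differ in the hash suffix, while re-running against the same canonical path yields the same name, which is what keeps repeated imports idempotent.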

0.10.1

Fixes

  • CI lint excludes test fixtures — moving test-fixtures/ to test/fixtures/ in 0.10.0 brought the per-language fixture tree under the clojure -M:lint scan path (src test). One Clojure fixture (test/fixtures/clojure/test/myapp/core_test.clj) requires myapp.core without using it — intentional, since the fixture imitates an "imports unused namespace" pattern that the import-extraction logic is supposed to detect — but clj-kondo flagged it as a warning and exited with code 2. New .clj-kondo/config.edn adds {:output {:exclude-files ["test/fixtures/.*"]}} so all language fixtures are skipped uniformly.

0.10.0

Changed

  • MCP server is now a pure proxy — mcp/serve! no longer falls back to opening local Datomic when no daemon is reachable. The in-process tool-handlers map and handle-tools-call (along with the suppress-datomic-logging! call and the dependency on mcp.handlers.* from the bridge) are gone; every tools/call forwards to whatever daemon proxy/resolve-conn picks at that exact moment, with the lookup happening per call so a daemon that comes up mid-session is used immediately. When no daemon is reachable, the bridge returns a structured MCP error pointing the caller at noum start rather than racing the daemon for the Datomic file lock and silently winning. The HTTP daemon's routes still use the h-mut/h-query/h-meta/h-bench/h-intro handler namespaces — only the bridge stopped touching them.
  • do-serve ensures a daemon before booting the MCP bridge — new namespace noumenon.daemon-control exposes ensure-spawned!, a one-shot bootstrap that returns the existing daemon connection if one is reachable, otherwise ProcessBuilder-spawns a sibling JVM running the same code via the daemon subcommand and polls every 500ms (up to 15s) for it to become healthy. The argv is built from java.home + java.class.path, so it works for both uberjar runs (single jar on classpath) and dev clj -M:run runs (long classpath) without branching. The spawned daemon is detached — it outlives the MCP bridge, so the next Claude Code session reuses it. Composition lives in do-serve (the CLI subcommand handler), not in mcp/serve! itself; daemon-control is intentionally not an Integrant component because the daemon must outlive its caller and Integrant has no idiomatic "attach to existing instance" pattern.
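The ensure-then-spawn bootstrap described above (reuse a reachable daemon, otherwise spawn one and poll every 500ms for up to 15s) has roughly the following shape. This is a Python sketch with injected callables so it can run without a real daemon: `find_daemon` and `spawn` are hypothetical stand-ins for the health probe and the ProcessBuilder launch, and the error wording is invented.

```python
import time


def ensure_spawned(find_daemon, spawn, interval=0.5, timeout=15.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Return a daemon connection, spawning a detached daemon first if none is reachable."""
    existing = find_daemon()
    if existing is not None:
        return existing  # a daemon is already up; never spawn a second one
    spawn()
    deadline = clock() + timeout
    while clock() < deadline:
        conn = find_daemon()
        if conn is not None:
            return conn
        sleep(interval)
    raise TimeoutError(f"daemon not healthy after {timeout:g}s")
```

Checking for an existing daemon before spawning is the step that avoids two processes racing for the same Datomic file lock.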

Fixes

  • noum start surfaces the real daemon-start error instead of a 30-second timeout — start! now detects an early JVM exit via .isAlive on the spawned :proc, so a daemon that crashes in the first second fails fast with the actual cause rather than waiting out the full 30 seconds and reporting a meaningless "failed to start within 30 seconds." The launcher also switches :out/:err from io/writer (... :append true) to [:append paths/daemon-log] so the OS handles the redirect — the buffered Java writer was never being flushed before the launcher threw, eating the daemon JVM's stderr. Failure messages now include the last 30 lines of ~/.noumenon/daemon.log, and a true 30-second timeout .destroys the JVM instead of leaking it.
  • noum start names the lock holder when daemon-start fails — when the daemon JVM exits before becoming healthy, the failure message now includes "Lock currently held by PID N (cmdline)" if lsof finds a process holding ~/.noumenon/data/noumenon/noumenon-internal/.lock. The cmdline is truncated to ~120 characters with a middle ellipsis so a Java classpath doesn't drown the message. Both lsof and ps -p X -o command= behave identically on macOS and Linux. If lsof is missing or the lock is free, the helper silently returns nil and the failure message is unchanged — never block the error path.
  • noum stop confirms the kill and falls back to SIGKILL — the previous stop! had three independent bugs that compounded into orphan JVMs: only sent SIGTERM with no fallback when the JVM hung in shutdown, printed "Daemon stopped." regardless of whether kill actually worked, and deleted daemon.edn unconditionally — so a failed stop left an unkillable daemon with no record, invisible to future stop attempts. Rewritten as a bounded sequence: SIGTERM, poll up to 5s, SIGKILL, poll up to 2s, only delete daemon.edn once the process is confirmed gone. If the daemon refuses to die even after SIGKILL, throw with the file left in place. Output now reports which path was taken: "Daemon stopped (PID N)." on a clean TERM, "Daemon force-killed (PID N)." when KILL was needed. And explicitly says "No managed daemon to stop." when daemon.edn is absent — the previous silent no-op confused users into thinking stop wasn't doing anything.
  • Daemon shutdown hook is bounded so SIGTERM always wins — do-daemon's shutdown hook called system/halt! synchronously with no upper bound on how long Integrant teardown could take. http-kit's (srv) close has no timeout option, and a hung halt-key would prevent the hook from returning — defeating SIGTERM and turning the daemon into something that only SIGKILL could stop. The hook now wraps halt in a future and derefs it with a 5-second deadline. Combined with noum stop's SIGKILL fallback, a hung daemon now reliably exits within ~7 seconds total instead of waiting for someone to notice and kill -9 it.
  • bin/noumenon.bat finds any noumenon-*.jar instead of hardcoded 0.1.0 — the Windows launcher hardcoded noumenon-0.1.0.jar and stopped finding the jar after the version bumped. Replaced with a dir loop that picks the lexicographically last noumenon-*.jar in target/, falling back to bin/noumenon.jar if no target build exists.
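
The bounded shutdown hook described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the daemon's actual code: halt-with-deadline! and its return keywords are hypothetical names, and the deadline is passed in as a parameter.

```clojure
(defn halt-with-deadline!
  "Runs halt-fn on a separate thread and waits at most timeout-ms.
  Returns :halted if halt-fn finished in time, :timed-out otherwise.
  Hypothetical helper; the real shutdown hook may be shaped differently."
  [halt-fn timeout-ms]
  (let [f      (future (halt-fn) :halted)
        ;; deref with a timeout returns the fallback value instead of blocking forever
        result (deref f timeout-ms :timed-out)]
    ;; On timeout, interrupt the worker thread so it does not linger.
    (when (= :timed-out result)
      (future-cancel f))
    result))
```

A hook built this way would call something like (halt-with-deadline! stop-system 5000), where stop-system is whatever zero-argument teardown function the caller has. The hook returns after at most the deadline even if teardown hangs.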

Chore

  • test-fixtures/ moved to test/fixtures/ — pull the per-language fixture trees under test/ so all test artifacts live in one place instead of straddling test-fixtures/ and test/. Slurp paths in imports_test.clj updated to match.
  • Drop unused test-repos/ gitignore entry and credentials.example — test-repos/ is no longer used; the gitignore entry was the only remaining reference. credentials.example was a stale template — the launcher's setup flow no longer points users at a copy-this-file workflow, so the example was just dead documentation.

0.9.0

Fixes

  • MCP proxy renders friendly 401/403 messages again — interpret-response was reading the HTTP status from the parsed JSON body ((get parsed "status")) instead of the http-kit response's top-level :status. The daemon's error-response builder doesn't echo the code into the body — only :ok and :error — so the case branch never matched and a user with an expired token saw the bare Unauthorized — bearer token required rather than Authentication failed. Run \noum connect --token \. The bug pre-dated the http-kit migration and was preserved verbatim through it. interpret-response now destructures :status from the response map and uses that for the special-case branches.
  • benchmark/judge-prompt escapes template metacharacters in question/rubric/answer — judge prompts include the answer text returned from the answer-phase model. With no escape on the substituted values, an answer like "Answer with {{rubric}} injection" ended up in the rendered judge prompt verbatim — looking, to the judge model, like a placeholder it should resolve. The single-pass str/replace already prevented cascading expansion of the bindings themselves, but didn't sanitize their content. All three bindings now pass through escape-double-mustache, mirroring the pattern in analyze/render-prompt.
  • analyze/render-prompt escapes every user-controlled binding, not just :content — only :content was being passed through escape-double-mustache; :repo-name, :file-path, :imports, and :imported-by were inserted verbatim. A repo whose path or name contains literal {{content}} (or any other template variable name) would land that text in the rendered prompt, opening a small prompt-injection surface for repo metadata. All five user-controlled bindings now go through the escape; :lang / :line-count are derived from internally validated values and are left as-is.
  • cli/parse-args error envelopes always carry :subcommand — error returns from benchmark (:no-repo-path, etc.) and ask (:ask-missing-question, :no-repo-path) used to come back as bare {:error <kw>} maps without a :subcommand key, even though digest/status/analyze/etc. errors carried it. Callers (main/handle-parse-error, future tooling) had to special-case the gap when routing contextual help. The two override parsers now (assoc :subcommand "benchmark"|"ask") on every terminal branch, matching parse-with-registry and parse-simple-args.
  • introspect --target rejects typos instead of silently expanding to all targets — when a user passed --target foobar (or target: "foobar" via the MCP/HTTP surfaces), the parser silently dropped the unknown keyword. With nothing left in the resulting set, the cond-> guard skipped the assoc, and :allowed-targets never made it into the run options — meaning the introspect loop ran *unrestricted* across all targets. The user's intent to scope the run was invisibly turned into the opposite. Validation now happens up front in noumenon.util/validate-introspect-targets!, called from the MCP introspect handlers, the HTTP /api/introspect handler, and the CLI do-introspect command — so a typo'd target produces a clear error listing the valid set on every surface.
  • MCP proxy tests actually exercise the http-kit code paths — proxy-tool-call-surfaces-curl-failure-clearly and proxy-tool-call-surfaces-empty-body-on-zero-exit were redefing clojure.java.shell/sh to mock curl, but proxy-tool-call no longer shells out — it goes through org.httpkit.client/request. The redefs intercepted nothing; the tests passed coincidentally because the real TCP connect() to 127.0.0.1:7892 failed in a way that happened to match the assertions. They are now keyed off with-redefs [http/request …] returning http-kit-shaped response promises ({:status N :body s :error e}), so the network-failure and empty-body branches in interpret-response are actually executed.
  • Daemon LLM semaphore honors NOUMENON_MAX_LLM_CONCURRENCY — when the new Integrant lifecycle landed, the LLM semaphore was being initialized twice during noum daemon boot: once by http/start! (which read the env var) and once by the :noumenon/llm-semaphore Integrant init-key (which used the system-config default of 10). Integrant happened to run last, so the env var was silently clobbered and every daemon ran with permits = 10 regardless of configuration. The system/config builder now reads NOUMENON_MAX_LLM_CONCURRENCY itself and http/start! no longer touches the semaphore, so there is one source of truth and the env var actually takes effect.
  • Daemon Integrant graph declares dependencies explicitly — :noumenon/http-server now references :noumenon/datomic-conns, :noumenon/llm-semaphore, :noumenon/embed-cache, :noumenon/completion-cache, and :noumenon/agent-sessions via ig/ref. Without this, the dependency graph was empty and Integrant fell back to map iteration order (stable for ≤8 entries, undefined past that), so :noumenon/http-server actually initialized before :noumenon/llm-semaphore despite appearing later in the config — which was what caused the semaphore double-init to manifest as "env var ignored." The graph is now self-documenting and a 9th component, or any future component that genuinely needs ordering, won't silently regress.
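
The up-front --target validation described in the list above might look roughly like this. It is a sketch only: the target names in valid-targets are placeholders, and validate-targets! is a simplified stand-in for noumenon.util/validate-introspect-targets!, whose real signature is not shown in this changelog.

```clojure
(require '[clojure.set :as set]
         '[clojure.string :as str])

;; Placeholder target names, for illustration only.
(def valid-targets #{:prompts :examples :rules})

(defn validate-targets!
  "Returns the requested targets as a set of keywords, or throws if any
  requested target is unknown. Throwing (rather than dropping unknowns)
  prevents a typo from producing an empty set that means 'unrestricted'."
  [requested]
  (let [requested (set (map keyword requested))
        unknown   (set/difference requested valid-targets)]
    (when (seq unknown)
      (throw (ex-info (str "Unknown introspect target(s): "
                           (str/join ", " (sort (map name unknown)))
                           ". Valid targets: "
                           (str/join ", " (sort (map name valid-targets))))
                      {:unknown unknown :valid valid-targets})))
    requested))
```

The design point is that validation fails loudly before any options map is built, so every surface (CLI, HTTP, MCP) sees the same error for the same typo.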

Refactoring

  • Function-level cleanup across imports.clj, sync.clj, llm.clj, analyze.clj — enrich-repo! decomposes into prepare-c-context, run-c-extraction, log-selection-stats!, and ensure-enrich-tx! so the orchestrator body reads as named pipeline stages instead of an 18-binding let with _-bound log! side-effects threaded through the bindings. update-repo! does the same (auto-sync-p4!, compute-changes, apply-retractions!, run-pipeline-stages!, update-result); the result builder is a pure helper. validate-segment now drives off a segment-rules data table that mirrors the existing analysis-sanitizers pattern (key + predicate + cleaner per row), replacing a 16-clause cond-> ladder with destructured args. files-for-reanalysis collapses four near-identical d/q blocks into one query assembled from reanalysis-scope-clauses + a base :where. invoke-api's 4-branch retry cond is now case over a pure classify-attempt (:retry/:fail/:ok); error and retry messages live in pure failure-ex/retry-reason helpers. changed-files extracts parse-status-fields and apply-status-line so the diff-line interpretation is pure data. imports/sh-with-timeout cleans up its args-detection by splitting via split-cmd+opts rather than take-while+drop-while over keyword?. No behavior change.
  • mcp/proxy uses http-kit instead of curl — proxy-tool-call no longer shells out to curl with a stdin-fed config file; the request goes through org.httpkit.client/request, which noumenon.llm already uses. Splits into pure build-request-map (URL, headers, body construction with URLEncoder) and pure interpret-response (status → tool-result/tool-error); the clojure.java.shell require drops, eliminating one process boundary per remote call. case on HTTP status replaces the cond ladder for 401/403 messages.
  • MCP handlers — data over atoms, shared opts builder — handle-digest no longer threads pipeline outputs through an (atom {}) with five swap! calls; each step is a digest-*-step helper that returns its result (or nil on opt-out / soft failure), and the final summary is built via cond->, mirroring the pure shape of http/handlers/pipeline.clj:run-digest. handle-introspect and handle-introspect-start share build-introspect-opts, introspect-llms, and parse-allowed-targets instead of duplicating a 15-line cond-> opts builder twice; the async variant additionally factors track-introspect-future! and ensure-session-capacity!. handle-ask extracts a pure format-ask-result so the budget-exhausted branching is testable without an LLM mock. New helper mu/provider+model consolidates four copies of the (or (args "provider") (:provider defaults)) boilerplate across mutation handlers.
  • cli/parse-args is now data-driven — the 50-line top-level case collapsed seven near-identical "parse-then-tag" branches plus a cond fallback, all looking up the same spec from command-registry. The new shape is a 3-line cond: parser-overrides (a 2-entry map for benchmark and ask — the only subcommands with bespoke parsing), simple-subcommands set, or parse-with-registry. Per-subcommand post-processing (digest layer string → keyword vector, introspect error envelope filter) lives in a parse-post-fns map so each rule sits with its data, not buried in a let. Adding a new subcommand no longer requires editing parse-args at all when a registry entry suffices.
  • introspect.clj loop and formatting cleanup — run-loop! no longer drives a 4-accumulator loop/recur with side-effects interleaved into the loop body; one iteration is step-iteration (pure — returns the next accumulator), and record-iteration! owns the transact + progress-event side-effects. The outer reduce uses reduced for budget-exhaustion. run-iteration! decomposes into request-proposal!, skip-with-error, code-gate-for, and apply-and-evaluate!, replacing two layers of nested if-let with a flat if/if/do over named outcomes. format-ask-insights collapses six near-identical (when (seq …) (str <header> <preamble> <items>)) blocks into one format-section driven by a [coll spec] data table. No behavior change.
  • benchmark.clj function-level cleanup — run-benchmark! decomposes into load-run-context, build-shared-state, and log-run-start! so the body reads as a flat pipeline instead of a 19-binding let interleaving config, atom allocation, raw-context shelling, and a 15-line log-format. benchmark-run->tx-data drives optional :bench.run/* fields from a small optional-run-fields data table (with sub-helpers for usage and scoring-method blocks); the prior 130-line cond-> pyramid that repeated (<key> aggregate) (assoc <attr> (cast …)) for 12 fields collapses into one reduce. aggregate-scores lifts five inner let lambdas (mean, wmean, layer-key, layer-mean, layer-wmean) to private defn-s and threads through assoc-layer-aggregates/per-category-aggregate/empty-context-count. generate-report replaces a 110-line (doto (StringBuilder.) (.append …)) mutation with str/join over a vector of pure section renderers (header-section, summary-section, scoring-method-section, per-category-section, per-question-section, context-efficiency-section, usage-section, validity-section, reproducibility-section); the Java StringBuilder import goes away. raw-context swaps a manual loop/recur with stage-by-stage truncation logic for reduce + reduced over ls-tree-files, with escape-html-attr and render-file-content extracted as pure helpers. No behavior change.
  • benchmark no longer depends on cli — the benchmarking subsystem pulled in noumenon.cli solely to interpolate the program name into one "Resume with: …" log line. The literal is inlined and the require dropped, breaking a small layering inversion (subsystem → api).
  • mcp.clj split by responsibility — the 1440-line god namespace is now a 346-line declarative tool schema + dispatch, with sibling namespaces for transport (mcp/protocol), remote-proxy mode (mcp/proxy), shared infra (mcp/util), and per-cluster tool handlers (mcp/handlers/{query,mutation,benchmark,introspect,meta}). No behavior change; one file to grep was the bottleneck.
  • http.clj split by responsibility — the 1404-line god namespace is now a 10-line public-surface re-export, backed by http/middleware (validation, JSON, auth, repo resolution, SSE/CORS), http/routes (route table + ring entry), http/server (lifecycle), and per-cluster handlers under http/handlers/{pipeline,query,benchmark,introspect,admin}. The shape mirrors the MCP split — same boundaries, same names — so the read/write/admin axis is consistent across both transports.
  • main.clj split by subcommand cluster — the 973-line CLI dispatcher is now a 156-line -main + dispatch + parse-error table, with shared helpers in cli/util and per-cluster command handlers under cli/commands/{pipeline,query,ask,inspect,benchmark,digest,introspect,artifact,daemon}. Three transports (CLI, HTTP, MCP) now decompose along the same axes, making cross-cluster invariants (e.g. "every benchmark handler validates layers via …") obvious at a glance.
  • Minimal Integrant lifecycle for the daemon — the HTTP daemon now boots through noumenon.system, an Integrant config that owns the start/stop graph for Datomic connections, the LLM semaphore, in-memory caches (embed, completion), session stores, and the HTTP server itself. Pre-existing accessor APIs (db/get-or-create-conn, embed/get-cached-index, sessions/register!, …) keep working unchanged; subsystems are not rewritten as Integrant components. The daemon shutdown hook now calls system/halt!, which clears caches, cancels in-flight introspect futures, drops the LLM semaphore, and releases the Datomic conn cache — so a daemon restart in the same JVM doesn't observe stale state. New dependency: integrant/integrant 0.13.1.
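
The reduce-with-reduced shape mentioned for run-loop! in the list above can be illustrated with a small, self-contained sketch. The accumulator keys and the idea of a per-iteration numeric cost are assumptions made for the example; the real step-iteration works on introspect proposals rather than plain numbers.

```clojure
(defn step-iteration
  "Pure step: takes the accumulator and one iteration's cost and returns
  the next accumulator. Wraps the result in `reduced` to stop the reduce
  early once the budget would be exceeded."
  [acc cost]
  (let [spent (+ (:spent acc) cost)]
    (if (> spent (:budget acc))
      (reduced (assoc acc :stopped :budget-exhausted))
      (-> acc
          (assoc :spent spent)
          (update :iterations inc)))))

(defn run-loop
  "Folds step-iteration over a sequence of per-iteration costs."
  [budget costs]
  (reduce step-iteration
          {:budget budget :spent 0 :iterations 0}
          costs))
```

Because the step is pure, early termination can be tested without any database or LLM: (run-loop 10 [4 4 4 4]) stops after two iterations with :stopped :budget-exhausted in the result.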

0.8.1

Fixes

  • MCP server follows daemon-port changes — serve! captured (or (load-connection-config) (detect-local-daemon)) once at startup, so an MCP server spawned while the daemon was on one OS-assigned port (the noum daemon --port 0 path) kept proxying to that address forever. After a daemon restart on a new port, every tool call surfaced as Cannot reach daemon at <stale host> until Claude Code itself was restarted. The connection is now resolved per tools/call via the same expression, so daemon.edn is re-read each request and a daemon bounce no longer requires a client restart. Both reads are cheap EDN slurps; explicit remote setups via load-connection-config still win over auto-detect.
  • synthesize no longer creates "zombie" components — the hierarchical-merge step resolves a merged component's :files by looking up each :source-components name in part-comp-index. When the LLM hallucinates source-component names that don't exist in the partition results, every lookup returns [] and the merged component went downstream with empty :files. components->tx-data then wrote the component entity (and any :component/depends-on edges) but emitted no file attribution, producing a phantom component visible in component-dep-drift (which joins on :component/name) but invisible in components (which joins on :arch/component). Adversarial diagnosis on the live noumenon db: 19 visible components vs 19 zombies, 84 of 107 dep-drift edges involving a zombie, inflating the over-declared count and making synthesis quality look ~88% wrong when the real signal among real components was 13/23 (~57%) import-grounded — a healthy synthesis ratio. Merged components with empty :files are now filtered out before tx-data, and the dropped names are logged so the rate stays observable.
  • Cost telemetry survives non-Anthropic models — llm-cost-by-model and llm-cost-total returned empty against a fully analyzed db with 358 analyze txes. Three compounding bugs: (1) llm/model-pricing keyed by date-stamped ids (claude-sonnet-4-6-20250514) while provider responses now carry undated names (claude-sonnet-4-6 from the LevelInfinite/Tencent gateway, glm-4.6 from Z.ai), so prefix-only lookup missed every response and estimate-cost returned 0.0; (2) :tx/cost-usd was guarded by (pos? cost) and never written for 0-cost runs (which, after bug 1, was every run); (3) the cost queries used bare [?tx :tx/cost-usd ?cost] clauses that silently excluded txes without the attr. Switched model-pricing to undated keys with prefix-match (so both bare and date-stamped ids hit the same entry), added claude-opus-4-7, dropped the (pos? cost) guard so 0.0 is written explicitly, and switched llm-cost-by-model / llm-cost-total to get-else defaults. llm-cost-total also gained a :tx/op #{:analyze :synthesize} anchor so it doesn't pull in import/enrich/seed rows that never had token attributes. Both providers were probed directly: neither GLM nor Tencent return cost in usage — the fix is local pricing and local query hygiene, nothing provider-side.
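
The undated-key prefix match described for model-pricing above can be sketched as follows. The prices are placeholder numbers chosen for easy arithmetic, not real pricing, and lookup-pricing is a hypothetical helper name; only the idea (undated keys matched as prefixes of the reported model id) comes from the entry above.

```clojure
(require '[clojure.string :as str])

;; Placeholder prices in USD per million tokens. NOT real pricing.
(def model-pricing
  {"claude-sonnet-4-6" {:input 3.0 :output 15.0}
   "glm-4.6"           {:input 1.0 :output 2.0}})

(defn lookup-pricing
  "Finds the pricing entry whose undated key is a prefix of model-id.
  The longest matching key wins, so overlapping keys stay unambiguous.
  Returns nil when nothing matches."
  [model-id]
  (->> model-pricing
       (filter (fn [[k _]] (str/starts-with? model-id k)))
       (sort-by (fn [[k _]] (- (count k))))
       first
       second))

(defn estimate-cost
  "Estimated cost in USD; 0.0 when the model has no pricing entry."
  [model-id input-tokens output-tokens]
  (if-let [{:keys [input output]} (lookup-pricing model-id)]
    (/ (+ (* input-tokens input) (* output-tokens output)) 1e6)
    0.0))
```

With this shape, both "claude-sonnet-4-6" and "claude-sonnet-4-6-20250514" resolve to the same entry, and an unknown model yields an explicit 0.0 that can still be written to the transaction.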

0.8.0

Added

  • Branch-aware graph foundation — every database now records which branch it represents. New :branch/name, :branch/kind (:trunk / :feature / :release / :unknown), :branch/vcs, and a tuple identity :branch/repo+name are populated automatically on every update. Repos point to their current branch via :repo/branch.
  • Content-addressed file identity — :file/blob-sha is now imported from git ls-tree for every file, enabling content-based comparisons and cache lookups. Existing files lazy-fill on next sync.
  • Local delta databases — noum delta-ensure <repo> --basis-sha <sha> (or POST /api/delta/ensure) materializes a sparse Datomic DB at ~/.noumenon/deltas/<repo>__<branch>__<basis> containing only files added/modified/deleted between the trunk basis and the current HEAD. Deletions are recorded as :file/deleted? true tombstones. Delta DBs link back to their parent via :branch/basis-sha, :branch/parent-host, and :branch/parent-db-name.
  • Federated trunk + delta queries — a subset of named queries declare a :federation-mode and accept :exclude_paths so the daemon can return trunk results minus rows the launcher will overlay from a delta DB. New endpoint POST /api/query-federated does the merge in a single roundtrip; new flag noum query <name> <repo> --federate --basis-sha <sha> opts in. Two modes are supported: :tombstone-only (trunk minus tombstoned paths; no delta rows — the safe default for queries that join on commits, imports, analysis, or segments, none of which the sparse delta carries) and :added-files-merge (trunk plus delta rows for files added in the branch — opt-in for queries that join only on stable attrs like :file/path / :file/lang, validated at seed time). Federation-aware queries seeded so far: orphan-files, complex-hotspots, import-hotspots, hotspots, ai-authored-segments, bug-hotspots, files-by-churn — all :tombstone-only. Non-federation-aware queries return trunk-only with a banner.
  • Auto-federation in noum query / noum ask — when the active connection is hosted and local HEAD has diverged from the trunk DB's :repo/head-sha, the launcher transparently rewrites a plain noum query <name> <repo> into the federated path against a local delta and emits a yellow banner. noum ask emits the banner but does not federate (no federated ask endpoint in v1). Per-call opt-out with --no-auto-federate; global opt-out with noum settings set federation/auto-route false.
  • noumenon_query_federated MCP tool — exposes /api/query-federated to MCP clients. Materializes the delta on demand from (repo_path, basis_sha) and returns the merged result.
  • noum analyze --no-promote (and MCP no_promote) — bypasses the content-addressed promotion cache and always invokes the LLM. Useful when re-validating the cache itself.
  • bb prune-deltas — interactive GC for stale local delta DBs under ~/.noumenon/deltas/noumenon/. Lists each delta with size, classifies as :live / :trunk-missing / :unparseable, and prompts before deleting trunk-missing entries.
  • Content-addressed analysis promotion — when a file's :file/blob-sha equals a previously-analyzed blob in the same DB whose :prov/prompt-hash and :prov/model-version match the current run, noum analyze copies the donor's :sem/* and :arch/* attrs onto the recipient and skips the LLM call. Donor lineage is preserved via :prov/promoted-from. Pass --no-promote to bypass the cache. The analyze summary surfaces a :files-promoted counter alongside :files-analyzed.
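
The content-addressed promotion rule from the list above can be sketched over plain maps. This is an illustration, not the real implementation, which works against Datomic: find-donor and promote are hypothetical names, and storing the donor's path in :prov/promoted-from is an assumption about what that attribute holds.

```clojure
(defn find-donor
  "Returns the first already-analyzed file whose blob SHA, prompt hash,
  and model version all match the recipient and the current run, or nil."
  [analyzed-files recipient {:keys [prompt-hash model-version]}]
  (->> analyzed-files
       (filter #(and (= (:file/blob-sha %) (:file/blob-sha recipient))
                     (= (:prov/prompt-hash %) prompt-hash)
                     (= (:prov/model-version %) model-version)))
       first))

(defn promote
  "Copies the donor's :sem/* and :arch/* attributes onto the recipient
  and records lineage. Assumption: lineage is the donor's file path."
  [donor recipient]
  (let [analysis (into {}
                       (filter (fn [[k _]] (#{"sem" "arch"} (namespace k))))
                       donor)]
    (merge recipient
           analysis
           {:prov/promoted-from (:file/path donor)})))
```

A caller would try find-donor first and fall back to the LLM only when it returns nil, which is also where a --no-promote style switch would short-circuit the lookup.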

Changed

  • No :file/deleted? in trunk transactions — trunk DBs hard-retract deleted files as before; only delta DBs use tombstones. A guard asserts this in retract-deleted-files!.
  • Schema files — added resources/schema/branch.edn and resources/schema/federation.edn (which defines :noumenon/scope). New attrs :file/blob-sha, :file/deleted?, :prov/promoted-from in existing schema files. :tx/op doc lists :promote; :tx/source doc lists :promoted. Every data attribute carries an explicit :noumenon/scope :stable | :trunk-only tag.
  • Centralized input length caps — noumenon.util now exports the shared length limits (max-repo-path-len, max-question-len, max-query-name-len, max-branch-name-len, max-host-len, max-db-name-len, max-run-id-len, max-params-count, max-param-key-len, max-param-value-len) plus a validate-params! helper. HTTP handlers and the MCP layer now consume them so the two surfaces stay in lockstep.
  • Schema-scoped federation modes — replaced the boolean :federation-safe? query flag with an enum :federation-mode. A seed-time validator (noumenon.artifacts/validate-federation-mode!) rejects any :added-files-merge query that touches an attribute not tagged :noumenon/scope :stable, so a contributor cannot accidentally re-introduce the orphan-files-style false-positive merge by mistakenly opting a query that joins on :file/imports or :commit/* into the more permissive mode. The legacy :artifact.query/federation-safe? boolean is preserved as a derived value so the launcher's banner logic keeps working.
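
The seed-time check described above can be sketched in a simplified form. The real noumenon.artifacts/validate-federation-mode! presumably inspects the query itself; here the query is assumed to arrive as a map that already lists the attributes it touches (:attrs), and the attr-scope table is a small illustrative subset. The function name check-federation-mode! is hypothetical.

```clojure
;; Illustrative subset of the :noumenon/scope tags.
(def attr-scope
  {:file/path    :stable
   :file/lang    :stable
   :file/imports :trunk-only})

(defn check-federation-mode!
  "Throws when a query opts into :added-files-merge but touches any
  attribute that is not tagged :stable. Returns the mode otherwise.
  Assumption: the query map carries a pre-extracted :attrs collection."
  [{:keys [federation-mode attrs] :as query}]
  (when (= :added-files-merge federation-mode)
    (let [unstable (remove #(= :stable (attr-scope %)) attrs)]
      (when (seq unstable)
        (throw (ex-info "added-files-merge query touches non-stable attributes"
                        {:query    (:query-name query)
                         :unstable (set unstable)})))))
  federation-mode)
```

Attributes missing from the table are treated as non-stable, so a new attribute has to be tagged explicitly before a permissive query can use it.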

Fixes

  • Incremental sync was returning empty diffs — git diff --name-status was given a -- separator before the SHAs, which told git "no more revisions follow," so old-sha and HEAD were interpreted as pathspecs and the diff returned empty for every poll. Modified/deleted files were never retracted, so stale :sem/summary, :file/imports, segment, and analysis attrs lingered while only new commits got imported. Switched to --end-of-options, which keeps the flag-injection defense without breaking revision parsing.
  • noum watch correctly distinguishes no-op ticks from real syncs — clojure.data.json on the daemon side serializes keywords as strings, so :up-to-date arrived as the string "up-to-date" while the launcher compared against the keyword constant — the check silently always-true'd, printing "Updated: 0 added, 0 modified" on every poll. noum.api/parse-body now keywordizes both keys and a declared enum-keys set of known enum values via clojure.walk/postwalk, so all HTTP read paths produce idiomatic Clojure maps. Watch also surfaces deletion counts alongside additions and modifications.
  • noum settings listing truncates long values — a deeply nested or very long single-line value (>120 chars) used to wrap the terminal into noise. Listings now clip to 120 chars with a trailing …; noum settings <key> still shows the un-truncated value.
  • --insecure always parses as boolean — the flag was missing from cli/boolean-flags, so --insecure foo (with a non-flag value following) used to swallow foo as the flag's value ({:insecure "foo"}). Always boolean now, regardless of what follows.
  • noum connect rejects non-http(s) URL schemes up front — noum connect ftp://example.com, file:///tmp/foo, ssh://host:22 used to flow through the SSRF check (which gave a misleading "private/internal address" message) or surface as a generic "invalid URI scheme" wrapped in the network catch. Now produces a clean Error: --host scheme must be http or https. Got: <scheme>:// (exit 1) before any network call.
  • ask-secret no longer echoes any prefix of the secret — the previous mask wrote (subs input 0 (min 4 (count input))) "****", so short tokens (≤ 4 chars) were displayed in clear and longer ones leaked the first 4 characters. Always shows a fixed ******** mask now.
  • Confirm prompts re-prompt on garbage input — tui.confirm/ask returned default-val on any non-y/n input. With default-val=true (no current caller, but future ones) a typo would silently confirm a destructive action. Garbage now triggers a re-prompt with a "Please answer y/n." hint; empty input still falls back to the default as before.
  • noum settings rejects extra positionals — noum settings retry/limit 5 typo-extra used to silently discard the third positional and POST (key, value) as if only two args were passed. Now produces Error: Too many arguments. (exit 1).
  • noum help <unknown> exits 1 — previously printed "Unknown command: …" but exited 0, inconsistent with noum <unknown> which correctly exits 1.
  • noum connect <ip-literal> derives a useful saved-connection name — noum connect 127.0.0.1:7895 used to save the connection as '127' (the first dot-segment), so 127.0.0.1 and 127.0.0.2 collided. The auto-naming now detects IP literals (and localhost) and keeps host:port joined by - (e.g. 127.0.0.1-7895, localhost-7895); real hostnames still use the first dot-segment (api.example.com → api).
  • noum introspect rejects mutually exclusive flag combinations — --status, --stop, and --history target different sub-actions, but the cond order silently picked the first match. noum introspect --status run-a --stop run-b acted on --status run-a only without warning. Two or more of the three flags now produce Error: --status, --stop, and --history are mutually exclusive. (exit 1).
  • noum introspect --status (no value) gives a clean error instead of a misleading databases hint — --status / --stop without a following value booleanized to true, fell through (string? …) checks to do-api-command, and emitted "Use noum databases to see imported repositories" — the user wanted a run-id, not a repo. Both flags now produce Error: --status requires a run-id. / Error: --stop requires a run-id. (exit 1) at the boundary.
  • --as-of, --raw, and --basis-sha validate at the launcher boundary — noum query --as-of "", query --raw "", delta-ensure --basis-sha (no value → boolean true), query --federate --basis-sha not-hex all used to flow through to the server, which rejected after a network round-trip. The launcher now rejects blank --as-of / --raw and enforces a 40-char lowercase hex shape on --basis-sha (used by both query --federate and delta-ensure) up front; valid inputs still pass through unchanged.
  • noum serve --host X now produces a clean error instead of silently ignoring --host — do-serve only forwarded --db-dir, --provider, --model, --token to the spawned MCP process; --host and --insecure were dropped, so users targeting a remote ended up running an MCP server colocated with the local daemon. The combination is now rejected explicitly with a hint to run noum connect <url> first; the MCP server already proxies to the saved active connection.
  • noum ping and noum version honor --host and the active named connection — both used to call daemon/connection directly, which only consults ~/.noumenon/daemon.edn and ignores every other connection signal. noum ping --host X --token Y silently checked the local daemon instead. Both commands now go through a new ping-target helper that checks --host first, then the saved active connection, then the local daemon — without spawning anything as a side effect.
  • Interactive menus no longer hang on a bare ESC presschoose/select matched the ESC byte and then unconditionally read two more bytes to consume the CSI follow-up ([ <arrow-code>). With no follow-up bytes (the user pressed ESC alone, not as part of an arrow sequence), the second .read blocked forever — the only escape was Ctrl-C, which killed the JVM mid-cleanup and left the terminal in raw mode. The new read-arrow! helper sleeps 20ms after the ESC byte and only consumes the next two bytes if InputStream/available is positive; bare ESC returns nil, which the caller treats as cancel (same as Q / Ctrl-C). Arrow keys behave identically to before.
  • noum settings <key> <huge-int> no longer silently nulls the valueparse-setting-value matched any digit-only string with #"-?\d+" and called parse-long, which returns nil on overflow. The cond returned the nil result, so the daemon got :value nil and replied "key and value are required" — the user typed a value, the launcher silently erased it. Now falls back to the raw string when parse-long overflows; the daemon's settings store accepts strings as-is.
  • Uncaught launcher exceptions surface as clean errors, not stack traces — when daemon/start! timed out (e.g. slow JVM startup, missing JRE, port conflict) the bb runtime dumped a 30-line Clojure stack trace with internal source paths. Same shape for any other uncaught exception inside a handler. -main now wraps (handler parsed) in a new run-handler! helper that catches every Exception, prints Error: <message> (in red) with a hint to set NOUM_DEBUG=1 for the full trace, and exits 1.
  • Network failures surface as clean errors instead of stack traces — noum <cmd> --host <unreachable> (no listener, DNS failure, timeout) used to dump a raw java.net.ConnectException stack trace from api/get! / api/post!. The HTTP client's :throw false only converts HTTP error responses into result maps; pre-response exceptions still bubbled. Both helpers now catch any exception and emit Error: Could not reach <host>: <exception-class> — <message> (exit 1) so the user knows where to look without seeing source paths.
  • Auth-failure path no longer leaks Datalog clauses or crashes the launcher — when the meta-DB token query threw (e.g. a closed channel mid-request, transient backend error), the daemon's auth middleware bubbled the exception into the generic 500 handler, which echoed processing clause: [?t :token/hash ?h], … back as text/plain. The launcher then crashed with a raw JsonParseException because parse-body didn't tolerate non-JSON. Two-layer fix: (1) check-auth now wraps auth/validate-token in try/catch and returns a clean 401 JSON response on any internal failure, never leaking schema details; (2) the launcher's parse-body returns nil instead of throwing on non-JSON input, and get!/post! fall back to a {:ok false :error "HTTP <status>: <body>"} shape (truncated to 200 chars) so callers see a sensible message.
  • noum watch --interval rejects non-positive / non-numeric values up front — --interval -5 used to print "polling every -5s", attempt one update, then crash Thread/sleep with raw IllegalArgumentException; --interval abc silently fell back to 30 with no warning. The new parse-watch-interval helper validates before ensure-backend! runs and emits Error: --interval must be a positive integer (got <value>) with exit 1.
  • noum history prompt (no name) no longer NPEs — the no-name branch tried to enumerate prompt files via (io/resource "prompts/"), but the daemon's resources/prompts/ lives in the JVM-side noumenon.jar, not the launcher's bb classpath. The resource lookup returned nil and (io/file nil) threw NPE. Same NPE in the interactive collect-history menu. Both now skip the listing: the one-shot path emits a Usage message with the common prompt names, and the interactive path asks the user to type the name as free text. (No daemon endpoint exists today to enumerate prompts; if one is added later, the launcher can call it.)
  • Blank / NUL-only repo args no longer silently resolve to the cwd — noum status "", status " ", status "\x00", and ask "" "<question>" all used to flow through path->db-name / canonicalize-path, which delegated to (java.io.File. "") — the JDK normalized that to the current working directory, so the derived db-name ended up being the cwd basename. Users running noum from a directory whose basename happened to collide with a real DB silently got the wrong DB. cli/parse-args now drops blank positionals (after stripping NUL bytes) so the existing min-args / Usage-error paths fire as intended. noum status . still works as a current-directory shorthand; only empty / whitespace-only / NUL strings are dropped.
  • noum query --param is repeatable as documented — cli/extract-flags stored every flag in a single-valued map, so a second --param k2=v2 overwrote the first. Only the last key=value reached the daemon, despite the help text claiming "(repeat command as needed)". --param now accumulates into a vector of strings; build-api-body (and the --as-of / --federate branches of do-query) merge the vector into the request's :params map. Single --param k=v calls still work — they produce a 1-element vector that flattens to the same body as before.
  • SSRF check no longer crashes on IPv6-resolving hosts — bb's native-image build doesn't carry reflection metadata for Inet6Address's instance methods (.isLoopbackAddress, .getAddress, etc.), so any host whose DNS lookup returned an IPv6 address — including everyday public hostnames like google.com — surfaced as MissingReflectionRegistrationError with a 30-line stack trace instead of either connecting or returning a clean blocked-private response. The classifier now reads getHostAddress (the one Inet6Address method bb does carry) and matches the canonical full IPv6 form (0:0:0:0:0:0:0:1) alongside the compressed form (::1); the ^fe80:/^fc00:/^fd prefix patterns expand to cover all of fe80::/10 and fc00::/7 as well, so the regex-only path is at least as strict as the prior .isLinkLocalAddress/.isSiteLocalAddress calls. IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) is auto-converted to Inet4Address by the JDK, so the existing IPv4 patterns catch it.
  • noum connect http://localhost:N no longer SSRF-blocked — base-url's loopback allowlist regex was anchored at the start of the host string, so the bare form localhost:7895 matched but the scheme-prefixed http://localhost:7895 (and http://127.0.0.1:7895, https://localhost:7895) didn't, falling through to private-address?. That helper then split on :, took "http" as the host, failed DNS, and the catch-all returned true — every scheme-prefixed local URL got rejected as "private/internal". base-url now strips the scheme up front, applies the loopback check on the bare host, and reuses an explicit scheme when present so https://… and http://… both round-trip cleanly. Identical inputs with and without scheme produce the same final URL.
  • Delta DB collision on look-alike branch names — feat/foo and feat-foo both sanitize to feat-foo for the on-disk db-name; without disambiguation, branch-switching between the two would silently overwrite the same delta DB. The db-name now appends sha256(branch-name)[0..6] so collisions resolve to different DBs.
  • Federation merge keeps trunk history for modified files — the earlier "exclude all delta paths from trunk + append delta rows" merge made modified files vanish from churn-based queries because the delta DB has no commits to carry their history. Tombstone-only merge keeps trunk's authoritative history while still respecting branch deletions.
  • bb prune-deltas walks the right directory — Datomic-Local stores DBs under <storage>/<system>/<db-name>/, so the actual deltas live under ~/.noumenon/deltas/noumenon/. The previous parent-dir walk would have surfaced the system dir itself as a single "unparseable" entry — and a y at the prompt would have nuked every delta DB on the machine.
  • Empty / dot-only branch names — delta-db-name falls back to detached for nil, blank, or dot-only branch inputs (which would otherwise produce empty / . / .. directory names — the latter resolve to parent dirs in tools that aren't expecting them as literal path components).
  • ensure-private! uses 700 on directories — the launcher's owner-only permission helper was applying 600 (no execute bit) to directories, making them unenterable. Now rwx------ for dirs, rw------- for files.
  • validate-string-length! returns 400, not 500 — the :status key on the thrown ex-info now lets the HTTP handler surface a clean 400 Bad Request instead of falling through to a generic 500.
  • Idempotent branch upsert — update-head-and-branch! resolves the existing repo + branch eids before transacting, so re-running delta-ensure (or any sync) doesn't trip the :branch/repo+name unique constraint with a fresh tempid.
  • Branch / parent_host / parent_db_name / query_name length caps on /api/delta/ensure and /api/query-federated — overlong values now return 400 instead of being persisted or echoed back unchecked.
  • Bogus basis_sha is now a clean error — a 40-char-hex SHA that doesn't resolve to a real commit used to silently produce an empty diff, and delta-ensure / query-federated would respond synced with zero counts. changed-files now throws on non-zero git diff exit so HTTP surfaces a 400 with the actual git error; update-repo! catches the throw and falls back to a fresh sync, so a force-pushed trunk DB still recovers.
  • bb prune-deltas parses branch names containing __ — the old split-on-__ parser misclassified delta DBs whose branch contained a double underscore (e.g. feat__under) as :trunk-missing and offered them for deletion. Anchored regex on the trailing -<hash6>__<basis7> suffix preserves the branch correctly. Pre-disambiguator on-disk names (no -<hash6> suffix) are not parsed by the new code; re-create them by running delta-ensure or query-federated against the same basis.
  • bb prune-deltas classifies repo basenames containing __ — the parser's branch-favoring heuristic attributes every __-segment in the on-disk name to the branch, which is the right call for the common case (noumenon__feat__under-...) but misclassified the symmetric one: a real repo basename like my__repo parsed as repo=my, branch=repo__feat, and ~/.noumenon/data/noumenon/my/ doesn't exist, so classify flagged the delta :trunk-missing and offered it for deletion. classify now walks the __ boundaries between parsed repo and branch and reports :live if any candidate split has an existing trunk dir; the displayed row shows the resolved repo/branch instead of the misparse.
  • :added-files-merge queries must put ?path first in :find — the merge code filters delta rows by (first row) matched against added paths, so the first column has to bind :file/path. The contract was implicit; a query whose :find started with anything else (e.g. [:find ?lang ?path …]) would have silently lost every delta row, and a non-path column that coincidentally equalled a path string would have leaked false rows through. The seed-time validator now rejects an :added-files-merge query whose first :find element isn't the symbol ?path. No shipped query was affected — the new check makes the contract explicit at the boundary instead of relying on an undocumented convention.
  • connect recovers from a stale system-catalog entry — Datomic-Local's system catalog persists database entries to its own metadata files; if the on-disk db directory is removed externally (e.g. bb prune-deltas wipes a delta, or the user deletes one while the daemon is down) the catalog still says the db exists. create-database is then a no-op (catalog says exists) and connect throws :cognitect.anomalies/not-found — surfacing as a 500 like Db not found: <name> even though ensure-delta-db! had just logged success. create-db now catches that exact anomaly, drops the stale catalog entry, and recreates cleanly so the next caller gets a working connection. Centralized in one place so cache misses, fresh connects, and schema-ensure paths all share the same recovery.
  • validate-string-length! rejects non-strings with 400, not 500 — the validator's old guard (when (and (string? s) ...)) silently let any non-string value through, so a JSON request like {"branch": ["a"]} or {"branch": 123} flowed past every HTTP and MCP boundary that relied on it as a type+length check. Downstream code (e.g. sanitize-branch calling str/trim on a vector) then crashed as a 500. Non-nil non-strings now produce a clean 400 "X must be a string" at the validation boundary; nil still passes silently for optional fields.
  • validate-repo-path checks type and length — the validator went straight to (io/file repo-path), so a JSON request like {"repo_path": 42} threw IllegalArgumentException (no as-file impl for Long) and surfaced as a 500. Long strings also walked all the way through .exists/.isDirectory without a cap. Non-nil non-strings now return must be a string, strings over max-repo-path-len (4096) return an exceeds maximum length reason, and the existing FS-shape reasons are unchanged. Callers' when-let + throw pattern produces a clean 400 in every case.
  • MCP proxy surfaces unreachable-daemon errors clearly — when ~/.noumenon/config.edn pointed at a host with no daemon listening, every MCP tool call came back with Remote proxy error: JSON error (end-of-file) because curl exited non-zero with empty stdout and the proxy then json/read-str'd the empty string. The proxy now checks curl's exit code first and surfaces Cannot reach daemon at <host>. Start it with noum daemon, or update ~/.noumenon/config.edn to point at a running host. — including curl's stderr when present. Empty bodies on a zero exit (204 / premature close) get a similar host-naming message instead of bubbling up the JSON parse error.
  • Branch-name cap is FS-derived, not human-name-derived — max-branch-name-len was 256, so a 256-char branch passed validation and then crashed Datomic-Local's mkdir with File name too long because the synthesized db-name (<repo>__<safe-branch>-<hash6>__<basis7>) overflowed the POSIX 255-byte path-component limit. Cap is now 200 — leaves ~37 bytes of headroom for the repo basename, which covers virtually every real-world case. delta-db-name does a final 255-byte check too so the long-repo edge case (where repo + cap can still overflow) surfaces as a 400 at the boundary instead of a 500 from the FS layer.
  • analyze truncates long strings at the writer boundary — Datomic-Local rejects single string values around the 4 KB mark with Item too large. The parse-time clamp already limits :summary / :purpose to 4096 chars, but :sem/synthesis-hints is a pr-str of purpose + architectural-notes + patterns + layer + category, so the result can easily overflow even when each input was clamped. build-file-tx now caps every string attribute it writes at 4000 chars (matching artifacts/chunk-size's headroom for UTF-8 multi-byte chars), so a verbose LLM response can't blow up the transact and lose the analysis.
  • query-federated is now a no-op when basis + HEAD haven't moved — every call used to write a head/branch tx (and re-transact every schema file via ensure-delta-db!), growing the delta's db.log ~2.3 KB per 5 read-shaped requests. The handler short-circuits to :up-to-date when the stored basis-sha and HEAD already match, and ensure-delta-db! now routes through the cached connection helper.
  • query-federated records parent metadata on the delta — auto-derives :branch/parent-db-name from the resolved trunk DB and :branch/parent-host from the request's Host header (HTTP) or "local" (MCP). Previously, only the explicit delta-ensure path set these, so deltas materialized via auto-federation lost the lineage breadcrumb.
  • Trim branch name before disambiguator hash — sanitize-branch already trimmed before producing the on-disk label, but the disambiguator hashed the raw input, so "foo" and "foo " ended up in different delta DBs for one logical branch. Both paths now agree on the canonical branch.
  • Uniform query_name length cap across query endpoints — POST /api/query and POST /api/query-as-of now reject overlong query_name with the same 400 the federated endpoint already produced. The shared util/max-query-name-len is the single source of truth for the cap.
  • OpenAPI doc reflects the actual delta-DB path — /api/delta/ensure description now shows ~/.noumenon/deltas/noumenon/<repo>__<safe-branch>-<hash6>__<basis7>/ (was missing both the noumenon/ Datomic system subdir and the -<hash6> disambiguator).
  • Cross-DB promotion guard — find-cached-analysis rejects :donor-db without a matching :donor-db-name, and now also the symmetric :donor-db-name without :donor-db. The two predicates that decide same-DB vs cross-DB used to disagree; either partial form would have written a dangling :prov/promoted-from ref or fabricated cross-DB provenance for a donor that actually came from the recipient. Currently dormant (no production caller wires either) but lands defensively before the cross-DB-promotion path gets enabled.
  • DELETE /api/databases/noumenon-internal now rejected — the meta database stores tokens, settings, prompts, rules, ask sessions, and benchmark/introspect history, and :meta-conn is cached at daemon startup. Letting a caller delete it via the public delete endpoint silently corrupted the daemon: the cached connection became a closed channel, so /api/tokens, /api/repos, etc. all 500'd until restart, while lazy-ensure-meta-db callers (settings, queries) silently re-seeded into a fresh DB and hid the breakage. The meta-DB name lives as noumenon.db/meta-db-name, and handle-delete-database rejects it up front with a 400 "Cannot delete reserved database".
  • file:// and other non-network URL schemes blocked at clone — validate-clone-url! only ran the SSRF private-IP check, and that check short-circuited on URLs without a host (file://, ssh://, raw paths). An authenticated admin posting {"url":"file:///some/local/repo"} to /api/repos could clone an arbitrary readable git repo on the daemon's filesystem and then query it via /api/ask. The validator now rejects anything that doesn't match git-url? (https?:// or git@host:path) before the host lookup runs; Perforce depot paths still go through their own clone path and are unaffected.
  • Uniform repo_path validation across the bulk endpoints — with-repo (the wrapper used by /api/import, /api/analyze, /api/enrich, /api/update, /api/synthesize, /api/digest, /api/ask, /api/query, /api/query-raw, /api/query-as-of, /api/query-federated, /api/benchmark, /api/introspect, /api/completions, etc.) only checked for the field's presence, so a JSON body like {"repo_path": 42} / ["a"] / {"a":1} / true reached (io/file …) and surfaced as a 500 with a leaking ClassCastException. Empty string fell through to the bare-db-name branch and shelled out to git log against db://. A 5MB string walked the FS-shape checks and got reflected back in the 404 message (small request-amplification vector). New util/validate-repo-path-input! does the type+length+blank gate; with-repo now calls it after the missing-field check, so all of the above surface as a clean 400. The earlier delta-ensure-only fix is now uniform.
  • Malformed JSON body returns 400, not 500 — parse-json-body let json/read-str's parse exception flow up into make-handler's catch-all, so a typo'd request body became a generic "Internal server error" even though the fault was entirely client-side. The parser is now wrapped in a try/catch that re-throws an ex-info with :status 400 and a clean "Invalid JSON body" message; the underlying parser detail (offset, character) is logged to stderr for daemon-side debugging but not exposed in the response.
  • ask-session-feedback rejects unknown session ids — POST /api/ask/sessions/<unknown>/feedback used to call set-feedback! regardless and return 200, writing feedback attrs against a non-existent session id and lying to the caller about it. The handler now looks up the session via ask-store/get-session first and returns 404 "Session not found" on miss, matching the shape of handle-ask-session-detail.
  • /api/repos/:name remove and refresh return 404 for unknown names — both handlers used to call into repo-mgr without an existence check, so an unknown name surfaced as a generic 500 "Internal server error" (with a leaked filesystem path on the daemon side). A shared registered-repo! helper does the meta-DB lookup and 404s with "Repo not registered: " before any disk-touching work, so callers can distinguish "not registered" from a real server error and stop seeing the 500.
  • /api/ask rejects empty / whitespace question with 400 — the (when-not (:question params)) gate only caught nil; an empty string passed all of nil-check + length-cap and reached agent/ask, where the LLM loop crashed and surfaced as 500 "Internal server error". The gate now also rejects blank strings (after trim), so {"question":""} and {"question":" "} both produce 400 "question is required" the same as omitting the field.
  • validate-db-name! is now a positive allowlist — the validator only rejected /, \, blank, and pure-dot names, so null bytes / newlines / tabs / spaces / non-ASCII slipped through and propagated into (io/file …) lookups. Tightened to [a-zA-Z0-9._-]+ (matching derive-db-name's sanitizer and the synthesized delta-DB naming), so exotic characters fail at the boundary with a 400 instead of leaking into the storage layer. Pure-dot names still get the explicit dot-only rejection.
  • /api/query-as-of no longer leaks JVM class names in as_of errors — sending {"as_of": [123]} / true / {} triggered the parse path's (long …) cast, and the catch wrapped the JVM ClassCastException message ("class clojure.lang.PersistentVector cannot be cast to class java.lang.Number …") into the 400 body. The handler now type-checks as_of (string or number) up front and produces a clean "as_of must be an ISO-8601 string or epoch milliseconds". The string-but-unparseable branch ("not-a-date") still surfaces the actual Instant/parse complaint so users see the real reason. Validation also now runs before with-repo, so a bad as_of fails fast even when the repo doesn't exist.
  • /api/settings strings stay strings — the handler used to run every string value through edn/read-string, silently re-typing "42" to 42, "true" to true, ":foo" to a keyword, etc. Cross-language callers (Electron UI, future GUIs) had no way to store an actual string that happened to parse as EDN. The handler now stores values as-is: typed callers send JSON-typed values ({"value": 42} for an int), string callers send strings ({"value": "42"} for a string). The noum settings <k> <v> CLI keeps its existing UX by pre-parsing the CLI string in the launcher (parse-setting-value now also runs for daemon-side settings, symmetric with how launcher-local settings already worked).
  • Daemon logs no longer leak absolute db-dir paths in clone errors — repo-mgr/refresh-repo!'s "Clone not found: /abs/path/.git" message, the "Cloning into /abs/path/.git ..." log line, the "Removing clone /abs/path/.git" log, and the git stderr embedded in clone-failure errors all referenced the absolute filesystem path. Logs and surfaced messages now reference only the db-name (e.g. "Clone not found for ghost", "Cloning into ghost.git ..."); the absolute path stays in :ex-data for daemon-side debugging. git/clone! and git/clone-bare! redact the target-dir absolute path from git's own stderr before embedding it in the error message.
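
The delta-DB naming fixes in this list (the hash disambiguator, trimming before hashing, the detached fallback, and the final 255-byte check) can be sketched together. This is an illustrative Python sketch, not the project's Clojure code: the function names only mirror sanitize-branch and delta-db-name, and the sanitizer's exact character mapping and the UTF-8 hash input are assumptions.

```python
import hashlib
import re

MAX_COMPONENT_BYTES = 255  # POSIX path-component limit cited in the entries above


def sanitize_branch(branch):
    """Hypothetical sanitizer: trim, then map characters outside [a-zA-Z0-9._-] to '-'."""
    trimmed = (branch or "").strip()
    safe = re.sub(r"[^a-zA-Z0-9._-]", "-", trimmed)
    # nil, blank, or dot-only inputs fall back to "detached"
    if not safe or set(safe) <= {"."}:
        return "detached", trimmed
    return safe, trimmed


def delta_db_name(repo, branch, basis_sha):
    """Build <repo>__<safe-branch>-<hash6>__<basis7> and enforce the byte limit."""
    safe, trimmed = sanitize_branch(branch)
    # Hash the trimmed (canonical) branch so "foo" and "foo " agree.
    hash6 = hashlib.sha256(trimmed.encode("utf-8")).hexdigest()[:6]
    name = f"{repo}__{safe}-{hash6}__{basis_sha[:7]}"
    if len(name.encode("utf-8")) > MAX_COMPONENT_BYTES:
        raise ValueError("db-name exceeds the 255-byte path-component limit")
    return name
```

With this shape, feat/foo and feat-foo share the same readable label but get different hash suffixes, so they land in different directories, while "foo" and "foo " resolve to one name.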

Notes

  • Datomic schema is additive: no migrations runner, no version stamps. ensure-schema re-transacts on every connect; existing DBs pick up new attrs and queries pick up :federation-safe? on next start.
  • Delta DBs require a co-located daemon in this release. Cross-machine federation (remote daemon, launcher-side delta) is deferred.
  • Promotion is same-DB only in this release. Cross-DB promotion (delta lookups against trunk's history) is deferred.

0.7.0

Changed

  • Repo split — The Electron desktop app moved to leifericf/noumenon-app; the website moved to leifericf/noumenon-site. This repo keeps the daemon, noum CLI, and OpenAPI spec. History was preserved on both new repos via git filter-repo.
  • OpenAPI spec relocated — Canonical source moved from docs/openapi.yaml to resources/openapi.yaml so it ships inside the daemon JAR and can be served via io/resource. The website mirrors it daily via a cron-pull workflow.
  • noum ui auto-updater — Now downloads the packaged Electron app from leifericf/noumenon-app releases (was leifericf/noumenon).
  • noum ui dev mode — Resolves a noumenon-app source checkout in this order: $NOUMENON_APP_ROOT → $NOUMENON_ROOT/../noumenon-app sibling → fall back to installed app. The previous ui/ child directory is no longer valid.
  • Core CI/Release workflows — Removed the ui job and the build-electron/deploy-pages release jobs. CLI distribution (update-homebrew, update-scoop) and Docker publish remain in this repo.

0.6.2

Changed

  • HTTP-only provider support — Removed Claude CLI provider support entirely. Supported providers are now API-based (glm, claude-api, with claude aliasing to claude-api).
  • Strict model selection — LLM operations now require an explicit model source: pass --model or configure provider :default-model; no implicit fallback model is selected.
  • Provider credential policy — Removed legacy file-based credential fallback; provider credentials now resolve from NOUMENON_LLM_PROVIDERS_EDN and process environment variables.
  • Analysis/synthesis provenance — LLM transactions now record provider and model provenance via :tx/provider and :tx/model-source metadata.

Fixes

  • Provider migration errors — Using removed claude-cli now fails with explicit migration guidance to claude-api/claude.
  • API schema/docs alignment — OpenAPI and provider-config docs now reflect API-only provider support.

0.6.1

New

  • Provider/model catalog commands — Added noum llm-providers and noum llm-models for discovering configured providers, provider defaults, and available models.
  • MCP provider/model catalog tools — Added noumenon_llm_providers and noumenon_llm_models with help/schema metadata so agents can inspect defaults and model availability without reading config files.

Changed

  • Provider default selection — Noumenon now resolves one global default provider via NOUMENON_DEFAULT_PROVIDER, then :default-provider in NOUMENON_LLM_PROVIDERS_EDN, then built-in fallback.
  • Provider model policy — Each provider can declare :models plus a single :default-model; model selection now resolves per-provider defaults when --model is omitted.
  • Dynamic model discovery — llm-models/noumenon_llm_models prefer provider API discovery (:models-path, with known defaults) and fall back to configured :models when discovery is unavailable.
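
The precedence rules in the entries above can be read as two small lookups. The following is a rough Python sketch under stated assumptions, not the project's resolver: the parsed NOUMENON_LLM_PROVIDERS_EDN map is modelled as a plain dict with string keys, and "builtin-provider" is a placeholder because the changelog does not name the built-in fallback.

```python
def resolve_default_provider(env, providers_cfg, builtin="builtin-provider"):
    """Env var first, then :default-provider from the config map, then the built-in."""
    return (
        env.get("NOUMENON_DEFAULT_PROVIDER")
        or providers_cfg.get("default-provider")
        or builtin  # placeholder name; the real built-in is not stated in the changelog
    )


def resolve_model(provider_cfg, cli_model=None):
    """An explicit --model wins; otherwise use the provider's single :default-model."""
    return cli_model or provider_cfg.get("default-model")
```

For example, with an empty environment and a config map whose default-provider is claude-api, the first function returns claude-api; setting NOUMENON_DEFAULT_PROVIDER overrides it.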

0.6.0

New

  • Provider-agnostic LLM config — Added NOUMENON_LLM_PROVIDERS_EDN support for API providers, allowing per-provider :base-url and :api-key configuration (for example: :glm, :claude-api, gateway-backed providers) through one canonical EDN map.

Changed

  • Runtime mode policy for secrets — Added NOUMENON_RUNTIME_MODE=local|service (default local). In service mode, file-based credential fallback is disabled and only process env secrets are used.
  • Provider resolution precedence — API providers now resolve config in this order: canonical EDN map entry, legacy env var fallback, then built-in default base URL (API keys are never defaulted).
  • Centralized provider resolution — API-provider invocation now routes through a normalized resolver in src/noumenon/llm.clj returning {:base-url :api-key} to reduce provider-specific branching.

Fixes

  • Service URL hardening — API provider base URLs are now validated as absolute URLs, and service mode requires https.
  • Safe error handling for credentials — Missing-key failures are explicit while avoiding secret value leakage in error messages.
  • Optional base URL allowlist — Added NOUMENON_LLM_BASE_URL_ALLOWLIST_EDN support to restrict provider base URL hosts/patterns.
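
The three checks above (absolute URL, https in service mode, optional allowlist) compose naturally into one validator. This is a minimal Python sketch, not the code in src/noumenon/llm.clj: the function name, the error strings, and the exact-host allowlist are assumptions, and the real allowlist also accepts patterns.

```python
from urllib.parse import urlsplit


def validate_base_url(value, runtime_mode="local", allowed_hosts=None):
    """Return None when the URL is acceptable, else a short reason string."""
    try:
        parts = urlsplit(value.strip())
        host = parts.hostname
    except ValueError:  # e.g. malformed IPv6 brackets
        return "not a parseable URL"
    scheme = parts.scheme.lower()
    # Absolute URL: an http(s) scheme plus a non-blank host.
    if scheme not in ("http", "https") or not host:
        return "expected an absolute http(s) URL with a host"
    if runtime_mode == "service" and scheme != "https":
        return "service mode requires https"
    # Simplification: exact host match only.
    if allowed_hosts is not None and host not in allowed_hosts:
        return "host is not in the allowlist"
    return None
```

A bare alias such as claude has neither scheme nor host, so it is rejected at this boundary rather than inside the HTTP client.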

0.5.6

New

  • Pipeline selectors — analyze, enrich, update, and digest now accept --path, --include, --exclude, and --lang to scope work to selected files/directories/languages. Added parity across JVM CLI, launcher (noum), HTTP API, and MCP tool schemas.
  • OpenAPI selector schema — Added PathSelectors to docs/openapi.yaml and wired it into analyze/enrich/update/digest endpoint request bodies.

Changed

  • Prompt/model drift behavior — Drift is now advisory by default. Noumenon logs recommended re-analysis counts but does not auto re-analyze unless you explicitly pass --reanalyze prompt-changed or --reanalyze model-changed.

Fixes

  • MCP repo path mapping — Remote MCP proxy now derives database names from local path semantics (e.g. mino) instead of org-repo remote URL synthesis (e.g. leifericf-mino), preventing status/query failures on path-to-db translation.
  • Launcher command help — noum help <command> now renders command options (including analyze) so users can discover flags without leaving the CLI.

0.5.5

New

  • Multi-repo introspect evaluation — Added an extra_repos parameter to the MCP, HTTP, and CLI introspect commands. It evaluates prompt changes across multiple repos to reduce overfitting, averaging scores from the primary and extra repos.
  • introspect-skipped query — New named query exposes skipped iterations (parse failures, validation errors, gate failures) for diagnosing introspect issues.
  • Introspect status progress — noumenon_introspect_status now shows current iteration number and last outcome message, not just elapsed time.

Fixes

  • Cascading template expansion — All prompt renderers (agent, introspect, benchmark, analyze, synthesize) switched from sequential str/replace to single-pass regex substitution. Previously, inserting a template that contained {{placeholder}} strings caused subsequent replacements to cascade, bloating prompts from 5K to 924K.
  • Stale chunked prompts — reseed / bootstrap now uses save-prompt! which properly retracts old chunks before writing. Previously, a prompt bloated by introspect and stored as chunks survived reseeds because the raw upsert added :template without retracting stale :chunks.
  • EDN extraction from prose — Introspect proposal parser now extracts the outermost {...} EDN map from optimizer responses that wrap the proposal in explanatory prose. Previously, the entire response was parsed as EDN, failing on any surrounding text.
  • Git commit on Datomic-only changes — git-commit-improvement! no longer throws when introspect improves a Datomic-only target (examples, system-prompt, rules) that produces no filesystem changes. Previously, git commit exited 1 with "nothing to commit", which propagated as an exception inside with-modification, reverting the improvement.
  • Introspect error persistence — Skipped iterations now store the raw optimizer response or error message in :introspect.iter/error for post-hoc diagnosis.
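
The cascading-expansion bug described in the first entry of this list is easy to reproduce in a few lines. The sketch below is in Python rather than the project's Clojure, and the {{name}} placeholder grammar and function names are assumptions for illustration only.

```python
import re

# Assumed placeholder grammar: {{name}} with letters, digits, '_' and '-'.
PLACEHOLDER = re.compile(r"\{\{([a-zA-Z0-9_-]+)\}\}")


def render_sequential(template, values):
    """Old behaviour: one replace per key, so inserted text is rescanned by later keys."""
    out = template
    for key, value in values.items():
        out = out.replace("{{" + key + "}}", value)
    return out


def render_single_pass(template, values):
    """New behaviour: one regex pass over the original template only."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), template)


template = "{{examples}} | {{question}}"
values = {"examples": "mentions {{question}} literally", "question": "Q"}
cascaded = render_sequential(template, values)
stable = render_single_pass(template, values)
```

Here cascaded ends up with the question substituted inside the inserted examples text, while stable leaves the inserted text untouched; with large inserted templates the sequential form is what lets prompts balloon.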

0.5.4

Fixes

  • MCP daemon lock contention — noum serve now auto-detects a running local daemon via daemon.edn and proxies tool calls to it instead of opening the database directly. Previously, the daemon's exclusive file lock caused every MCP tool call to fail with a generic "unexpected internal error."
  • MCP error messages — Tool call errors now include the actual cause and tool name instead of "An unexpected internal error occurred." Database lock errors include actionable kill instructions and explicitly tell AI agents not to retry.
  • MCP proxy auth header — Proxy mode no longer sends Authorization: Bearer null when connecting to a local daemon without a token.
  • Setup binary path — noum setup code now resolves the noum binary via PATH (e.g. Homebrew at /opt/homebrew/bin/noum) instead of hardcoding ~/.local/bin/noum.
  • Demo release fallback — noum demo now searches the 5 most recent GitHub releases for a demo tarball instead of only checking the latest. Prevents "not found" errors when a patch release ships without a new demo database.
  • Progress bar lifecycle — The launcher's progress handler now resets the bar on completion and creates a new bar when the total changes. Fixes the flashing green bar during digest benchmark and spurious "✓ digest done." lines between steps.
  • Progress bar step labels — Digest sub-steps (analyze, benchmark) tag their SSE progress events with :step, so the bar shows "✓ analyze done." instead of "✓ digest done."
  • Synthesize progress event — Added missing :current/:total keys to the synthesize progress event, preventing NPE in the launcher handler.
  • Digest output formatting — Nested result maps (analyze, benchmark, synthesize) are now printed as an indented tree with floats rounded to 2 decimal places, instead of raw EDN.

0.5.3

Fixes

  • Stale JAR auto-update — jar/ensure! now reads version.edn from the installed JAR and compares against the launcher version. On mismatch, stops the daemon, downloads the matching release, and restarts fresh. Previously, an existing JAR was never re-checked, so Homebrew launcher updates silently ran against an old backend.
  • Daemon bounce on upgrade — noum upgrade now stops the running daemon after downloading a new JAR, so the next command starts with the updated code.
  • Version def shared — Moved from main.clj (private) to paths.clj so both main and api pass it to jar/ensure!.

0.5.2

Security hardening, bug fixes, and UX polish.

Security

  • EDN read-eval disabled — *read-eval* bound to false in introspect code verification; {:readers {}} added to all edn/read-string calls parsing LLM responses, checkpoints, and external data
  • CORS restricted — file:// origins now require explicit NOUMENON_ALLOW_FILE_ORIGIN env var
  • Admin-only endpoints — /api/query-raw and /api/ask/sessions added to admin-only prefixes
  • SSRF hardening — CGN range 100.64.0.0/10 added to blocked IP patterns; -- separator in git clone commands; proxy host URL validation
  • Subprocess timeouts — Python, Node, C, and Elixir import extractors now time out after 30 seconds
  • Hook state directory — Moved from world-writable /tmp to user-private ~/.noumenon/tmp/
  • CI tag validation — GITHUB_REF_NAME validated as semver before shell substitution in release workflow
  • Credential handling — Directory permissions set before writing config; warning on --token + --insecure
  • MCP proxy — Admin tool forwarding logged; read-only flag respected for git_commit; SSRF check on proxy host
  • Electron navigation — Restricted to exact daemon port instead of any localhost port
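
To illustrate the SSRF hardening entry above, here is a small Python sketch of an address classifier that treats the CGN range 100.64.0.0/10 as blocked alongside loopback, private, and link-local addresses. It uses Python's ipaddress module purely for illustration; the function name is hypothetical and the project's own classifier is not reproduced here.

```python
import ipaddress

# Carrier-grade NAT shared address space (RFC 6598).
CGN = ipaddress.ip_network("100.64.0.0/10")


def blocked_address(text):
    """True when the literal IP address should be refused as internal."""
    addr = ipaddress.ip_address(text)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) so the IPv4 rules apply.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return bool(
        addr.is_loopback
        or addr.is_private
        or addr.is_link_local
        or (addr.version == 4 and addr in CGN)
    )
```

The explicit CGN test matters because Python's is_private does not cover 100.64.0.0/10, which mirrors why the range had to be added to the blocked patterns as its own rule.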

Fixes

  • MCP digest skip flag — Synthesize step was gated on skip_analyze instead of skip_synthesize
  • Merge retry usage — invoke-merge now accumulates LLM token usage from both attempts
  • Agent nil dispatch — Guard against nil tool dispatch when LLM sends only :reflect
  • Benchmark stop-flag — run-benchmark! accepts external stop-flag for HTTP introspect sessions
  • Database deletion — Removed post-Datomic filesystem deletion that could corrupt shared storage
  • Session limit race — register-ask-session! enforced atomically via single swap!
  • Leaf file re-enrichment — Files with no imports now get empty [] for :file/imports to prevent redundant re-processing
  • Test speed — 429 retry test binds *max-retries* to avoid 6-second sleep
  • Limit param coercion — HTTP query endpoints coerce string :limit to long
  • History help text — Replaced hardcoded prompt names with dynamic hint

UX Improvements

  • CLI — Spinner cleanup on API errors; actionable watch failure messages; dynamic prompt listing; post-setup instructions; upgrade progress spinner; explicit "Daemon: not running" message
  • TUI — Non-interactive auto-select warns to stderr; confirm defaults to false for safety
  • UI — Feedback polarity from event data; in-app delete confirmation; active nav indicator; flex layout for ask results; theme cached in localStorage; graph loading skeleton; empty table/history states; truncation with tooltips; formatted introspect deltas; error state on network failure
  • MCP — Digest description lists all pipeline steps; skip_synthesize in schema; search clarifies embed prerequisite; list_queries mentions required parameters
  • Sidebar — Unicode icons replace ambiguous single letters
  • Benchmark — "Select 2 runs to compare" hint text

0.5.1

TUI hotfix.

Fixes

  • Arrow key navigation — Menu selector now uses cond instead of case for escape sequence matching (Babashka's case doesn't resolve var references)
  • Menu line breaks — Raw terminal mode uses \r\n instead of \n for correct vertical layout
  • Back navigation — Selecting "← Back" no longer leaves a stray line in the console
  • Key input — Reads from /dev/tty directly instead of System/in for reliable raw-mode input

New

  • embed command in launcher — help text and Pipeline menu entry

0.5.0

TF-IDF vector search, hierarchical synthesis, and cross-repo benchmarks.

New

  • TF-IDF vector search — embed pipeline stage builds a vocabulary and vector index from file and component summaries. Pure Clojure, no external dependencies beyond Nippy for serialization.
  • noumenon_search MCP tool — Semantic file/component search without the agent loop. Zero LLM calls, millisecond responses.
  • Ask agent seeding — The ask agent is seeded with TF-IDF search results before querying the knowledge graph, giving it a warm start on relevant files and components.
  • embedded benchmark layer — Measures TF-IDF retrieval quality alongside raw and full KG layers.
  • :full layer enriched — Benchmark's full layer now includes both KG query results and TF-IDF search results when available — representing everything Noumenon has.
  • Hierarchical map-reduce synthesis — Repos with 250+ files are synthesized per directory partition, then merged. Fixes guava (3,333 files) and redis (1,754 files), which previously returned 0 components.
  • Session seed logging — Ask sessions persist TF-IDF seed results to Datomic for analytics.
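
To make the TF-IDF search entries above concrete, here is a minimal sketch of the general technique in Python: build a vocabulary with inverse document frequencies from short summaries, turn each summary into a normalized weighted vector, and rank by cosine similarity. It is not Noumenon's Clojure implementation. The tokenizer, the smoothed IDF formula, and every name below are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real index would likely do more.
    return text.lower().split()


def vectorize(tokens, idf):
    """Map tokens to a unit-length {term: weight} vector using idf."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    """docs maps a name to its summary text. Returns (idf, vectors)."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n = len(tokenized)
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    # Smoothed IDF so that no term gets a zero or negative weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = {name: vectorize(toks, idf) for name, toks in tokenized.items()}
    return idf, vectors


def search(query, idf, vectors, top_k=3):
    """Return up to top_k names ranked by cosine similarity to the query."""
    q = vectorize(tokenize(query), idf)
    scored = [
        (sum(w * vec.get(t, 0.0) for t, w in q.items()), name)
        for name, vec in vectors.items()
    ]
    return [name for score, name in sorted(scored, reverse=True)[:top_k] if score > 0]
```

Because both indexing and querying are plain arithmetic over term counts, a search of this kind needs no LLM call, which is consistent with the "zero LLM calls" claim for noumenon_search.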

Changed

  • Neural net input — Query routing model now uses TF-IDF vectors instead of bag-of-words, giving it term-importance weighting. Existing trained models require retraining.
  • MCP digest handler — Now includes synthesize and embed steps (was missing both).
  • Raw context limit — Reduced from 800K to 500K chars to stay within the ~200K token API limit.
  • Default benchmark provider — Falls back to GLM instead of Claude CLI.
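
The raw context limit change above is easier to follow with the arithmetic spelled out. The sketch below estimates tokens from a character count; the characters-per-token ratio is an assumed average that varies by tokenizer and content, not a figure from the changelog.

```python
TOKEN_LIMIT = 200_000  # the "~200K token API limit" quoted above


def estimated_tokens(chars, chars_per_token):
    """Rough token estimate; chars_per_token is an assumed average."""
    return round(chars / chars_per_token)


def headroom(chars, chars_per_token):
    """Tokens left under TOKEN_LIMIT (a negative value means over the limit)."""
    return TOKEN_LIMIT - estimated_tokens(chars, chars_per_token)
```

Under an assumed 4 characters per token, 800K characters is roughly 200K tokens, which leaves no room for the prompt or the answer, while 500K characters is roughly 125K tokens. If the content tokenizes more densely, say 3 characters per token, 800K characters would exceed the limit and 500K would still fit.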

Fixes

  • MCP benchmark handler wasn't passing model-config, causing raw layer to silently fail via Claude CLI
  • Synthesize retraction + creation in same Datomic transaction caused datoms-conflict on re-synthesis
  • MCP synthesize and digest handlers weren't seeding new prompt templates
  • Recursive directory partitioning caused StackOverflowError on flat directory structures (redis)
  • Merge synthesis validator rejected components with :source-components instead of :files
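
The StackOverflowError entry above concerns directory partitioning on repositories where thousands of files sit in a single directory. The changelog does not say how the fix was implemented. The sketch below shows one possible approach in Python, under that caveat: use an explicit work stack instead of recursion, and fall back to fixed-size chunks when a group can no longer be split by directory. All names are hypothetical.

```python
from collections import defaultdict


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def partition_by_directory(paths, max_size):
    """Group paths into lists of at most max_size, preferring directory
    boundaries and never recursing."""
    done = []
    pending = [(0, sorted(paths))]  # (path depth, files still to split)
    while pending:
        depth, group = pending.pop()
        if len(group) <= max_size:
            done.append(group)
            continue
        buckets = defaultdict(list)
        for p in group:
            parts = p.split("/")
            # Files that live directly at this depth have no deeper
            # directory to split on; they go into the "" bucket.
            key = parts[depth] if depth < len(parts) - 1 else ""
            buckets[key].append(p)
        for key, bucket in buckets.items():
            if key == "":
                # Flat directory: directory splitting cannot help,
                # so fall back to plain size-based chunks.
                done.extend(chunk(bucket, max_size))
            else:
                pending.append((depth + 1, bucket))
    return done
```

Each pass either finishes a group or moves one directory level deeper, so the loop terminates even when every file shares the same parent directory.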

Benchmarks

Cross-repo benchmark (8 repos, 7 languages, 22 deterministic questions each):

Metric        Without Noumenon    With Noumenon
Accuracy      20%                 53%
Token cost    37K                 7K
Speed         13.6s               6.1s

0.4.0

Architectural synthesis, visual desktop UI, and interactive CLI.

New

  • Interactive TUI — noum with no arguments enters a menu-driven interface. Browse commands by category, select repositories/sessions/queries from live data. Smart arg collection for all commands including introspect sub-actions.
  • Visual desktop UI — Electron + ClojureScript app with force-directed graph visualization. Three-level drill-down (components, files, segments), floating Ask overlay with streamed reasoning, @-mention autocomplete. Launch with noum open.
  • synthesize command — Identifies logical components from file summaries, import edges, and directory structure. Maps component dependencies, layers, and categories. Language-agnostic.
  • Component entities — component/name, component/summary, component/layer, component/category, component/depends-on. Files link via arch/component.
  • 9 new named queries — components, component-files, component-dependencies, component-dependents, component-authors, component-churn, component-bus-factor, cross-component-imports, subsystems.
  • noum demo — Pre-built knowledge graph for instant querying without credentials.
  • Top-down query strategy — Ask agent starts at component level for architectural questions.

Security

  • Electron renderer uses contextBridge (no executeJavaScript)
  • CORS restricted to Electron origin
  • Bounded edn/read-string on LLM-sourced strings

Fixes

  • Inline markdown parser duplication bugs
  • Concurrent SSE submission guard
  • Electron namespace collision with Replicant fragments
  • Unbounded memoize memory leak in graph builders

0.3.1

Security and UX hardening.

  • Path traversal fix on DELETE /api/databases/:name
  • Constant-time token comparison for auth
  • HTTPS by default for remote --host connections
  • Token via env var instead of CLI arg (hidden from ps aux)
  • SSE error propagation — errors surface correctly instead of wrapping as success
  • Delete confirmation with --force to skip
  • Relative path resolution for all commands
  • 20+ additional security, UX, and robustness fixes — see git history
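
The constant-time token comparison item above refers to a standard defense against timing attacks: an ordinary string comparison can return as soon as the first differing character is found, which leaks how much of a guess was correct. A minimal Python sketch of the idea follows, using the standard library's hmac.compare_digest. The function name token_matches is hypothetical, and this is not the daemon's actual code.

```python
import hmac


def token_matches(presented, expected):
    """Compare two bearer tokens without short-circuiting on the
    first mismatching byte."""
    return hmac.compare_digest(
        presented.encode("utf-8"), expected.encode("utf-8")
    )
```

Encoding both sides to bytes first lets the comparison accept tokens containing non-ASCII characters.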

0.3.0

noum CLI launcher, HTTP daemon, and Docker image.

  • noum binary — Self-contained Babashka launcher. Auto-downloads JRE and backend. 30 commands with custom TUI (spinner, menu, progress bar, table).
  • HTTP daemon — 22 REST endpoints, bearer token auth, SSE progress streaming.
  • Docker image — 167MB Alpine, non-root, auth required for network access.
  • noum setup — Auto-configures MCP for Claude Desktop and Claude Code.
  • OpenAPI spec at docs/openapi.yaml

0.2.0

Introspect (autonomous self-improvement) and ML query routing.

  • Introspect loop — LLM optimizer proposes changes to prompts, examples, rules, and code. Keeps improvements, reverts regressions. Multi-repo evaluation to prevent overfitting.
  • ML query routing — On-device neural network predicts which Datalog queries to try. Trained locally at zero token cost.
  • Issue reference extraction from commits (#123, PROJ-456)
  • Scoped re-analysis — --reanalyze with all, prompt-changed, model-changed, stale modes
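
The issue reference extraction item above names two reference shapes, #123 and PROJ-456. A minimal Python sketch of matching both in a commit message is shown below. The regular expression is an assumption made for illustration and is not taken from the Noumenon source.

```python
import re

# "#123" style references (not preceded by a word character or another
# "#"), or "PROJ-456" style keys (uppercase project key, dash, digits).
ISSUE_REF = re.compile(r"(?<![\w#])#\d+\b|\b[A-Z][A-Z0-9]+-\d+\b")


def extract_issue_refs(message):
    """Return issue references in the order they appear in the message."""
    return ISSUE_REF.findall(message)
```

A pattern this simple can produce false positives: a string such as "UTF-8" has the same shape as a project key, so a real extractor would probably need a deny-list or a list of known project keys.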

0.1.0

First public release.

  • Import pipeline (git history into Datomic knowledge graph)
  • LLM analysis (semantic metadata: complexity, safety, purpose)
  • Import graph extraction (10+ languages)
  • Named Datalog queries with parameterization and rules
  • Agentic ask command (natural-language via iterative Datalog)
  • MCP server (noum serve)
  • Benchmark framework with checkpointing and resume
  • Concurrent processing (configurable parallelism)