Experimental, early beta. Data model and interfaces are unstable; expect breaking changes between releases.

Source Control

Noumenon ingests git history into the knowledge graph. Native git works out of the box, and Perforce works through git-p4. Once the tree is on disk, everything downstream is identical.

Four Kinds of Repo Identifier

Every Noumenon command that takes a repo argument (noum digest, noum ask, noumenon_query, and the HTTP endpoints) accepts any of the four identifier kinds in the table below. The dispatch is a small cond that checks Perforce depot syntax first, then git URL shape, then a filesystem path, and falls back to looking the name up as an existing database.

Input                | Example                            | What happens
Local path           | /path/to/repo                      | Use the working tree as-is. No clone, no network.
Git URL              | https://github.com/owner/repo.git  | Cloned to data/repos/<repo-name> on first use, reused thereafter.
Perforce depot path  | //depot/Project/main/...           | Bridged through git-p4. Cloned to data/repos/Project-main.
Database name        | noumenon                           | Already-imported repo, looked up by canonical name.
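
The dispatch order described above can be sketched as follows. This is an illustration in Python, not the Clojure cond in src/noumenon/repo.clj: the function name, the returned labels, the URL regex, and the use of an existence check for the filesystem step are all assumptions.

```python
import re
from pathlib import Path

# Assumed URL shapes; the real check may accept more or fewer forms.
GIT_URL = re.compile(r"^(https?://|ssh://|git://|git@[^:]+:).+")


def classify_repo(arg: str, path_exists=lambda p: Path(p).exists()) -> str:
    """Classify a repo argument in the order the text describes."""
    if arg.startswith("//"):   # Perforce depot syntax, e.g. //depot/Project/main/...
        return "perforce"
    if GIT_URL.match(arg):     # git URL shape
        return "git-url"
    if path_exists(arg):       # local working tree (assumed: path must exist)
        return "local-path"
    return "database-name"     # fallback: look the name up as an existing database
```

Checking the depot prefix first matters because a string such as //depot/Project/main/... could otherwise be mistaken for a filesystem path.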

Git

Native. Noumenon shells out to the git CLI (no libgit2 or JGit) to extract commits, authors, file paths, parent edges, and per-commit diffs, and writes them into Datomic. Local repos and remote URLs follow the same code path; URLs are cloned once into data/repos/ and reused on subsequent runs.

What gets imported:

  • Every commit on the configured branch (default: HEAD).
  • Author name and email as written in the commit. No alias merging or identity unification yet.
  • Files touched per commit, including renames and deletions.
  • Per-commit additions, deletions, and changed-file lists from git log --numstat.
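
For reference, each file line in git log --numstat output has the form added<TAB>deleted<TAB>path, and git prints "-" for both counts when the file is binary. A minimal parsing sketch in Python follows; the function name and the returned dictionary shape are hypothetical, and rename notation in the path field is left unparsed.

```python
def parse_numstat_line(line: str) -> dict:
    """Parse one `added<TAB>deleted<TAB>path` line from `git log --numstat`."""
    added, deleted, path = line.rstrip("\n").split("\t", 2)
    # Git reports "-" for both counts when the file is binary.
    binary = added == "-" and deleted == "-"
    return {
        "path": path,
        "added": None if binary else int(added),
        "deleted": None if binary else int(deleted),
        "binary": binary,
    }
```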

Subsequent noum update calls are incremental. Only commits newer than the last imported HEAD are processed.
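
One way to express "only commits newer than the last imported HEAD" is a git revision range. The sketch below builds such a command line in Python; the function name and the choice of --reverse (oldest first) are assumptions, not a description of what Noumenon actually runs.

```python
def incremental_log_args(last_imported_sha, branch="HEAD"):
    """Build `git log` arguments covering only commits not yet imported."""
    args = ["git", "log", "--numstat", "--reverse"]
    # `A..B` selects commits reachable from B but not from A.
    # With no previous import, fall back to the full history of the branch.
    rev = f"{last_imported_sha}..{branch}" if last_imported_sha else branch
    return args + [rev]
```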

Perforce

Bridged through git-p4. A depot path that starts with // triggers a git p4 clone into data/repos/<derived-name>. Once cloned, it's a git repo and the rest of the pipeline runs unchanged.

noum digest //depot/ProjectA/main/...
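
The exact rule for <derived-name> is not spelled out here. The Python sketch below is a guess that happens to reproduce the one documented example (//depot/Project/main/... becomes Project-main): it drops the depot segment and joins the rest with hyphens. Treat the function and its rule as hypothetical.

```python
def derive_clone_name(depot_path: str) -> str:
    """Guess a clone directory name from a Perforce depot path."""
    # Strip the leading "//" and a trailing "/..." wildcard.
    trimmed = depot_path.removeprefix("//").removesuffix("/...")
    parts = [p for p in trimmed.split("/") if p]
    # Assumption: the first segment is the depot name and is dropped.
    return "-".join(parts[1:]) if len(parts) > 1 else "-".join(parts)
```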

Requirements:

  • git p4 on the PATH (ships with most Git for Windows builds; on macOS via brew install git plus Python; on Linux often a separate package).
  • A working P4PORT / P4USER environment, the same as you'd use for the p4 CLI.

Tuning the clone:

  • --use-client-spec clones exactly the workspace view configured in your P4 client. Ignores Noumenon's default exclusions.
  • --max-changes N limits history depth. Useful for huge depots; defaults to full history.
  • --p4-include "*.uasset" / --p4-exclude "*.custom" adjust the binary-asset exclusion list.
  • --no-default-excludes skips the built-in exclusions entirely.

The default exclusions strip common game-engine binaries: Unreal .uasset/.umap, Unity .prefab/.asset, and the .fbx/.png/.wav/.mp4 families. The clone stays small and the LLM sees source code rather than assets. Override per-import when you need a specific binary type indexed.
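
How the flags combine can be pictured with a small Python sketch. It assumes that --p4-include removes a pattern from the exclusion list and --p4-exclude appends one, which is one reading of "adjust"; the pattern list is an illustrative subset of the defaults named above, and both function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Illustrative subset of the defaults named in the text, not the full built-in list.
DEFAULT_EXCLUDES = ["*.uasset", "*.umap", "*.prefab", "*.asset",
                    "*.fbx", "*.png", "*.wav", "*.mp4"]


def effective_excludes(include=(), exclude=(), no_default_excludes=False):
    """Combine the defaults with per-import overrides."""
    base = [] if no_default_excludes else list(DEFAULT_EXCLUDES)
    # Assumption: an include pattern cancels the matching default exclusion.
    base = [p for p in base if p not in include]
    # Assumption: an exclude pattern is appended if not already present.
    return base + [p for p in exclude if p not in base]


def is_excluded(path: str, patterns) -> bool:
    """Match the file name (last path segment) against each glob pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, p) for p in patterns)
```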

noum update on a git-p4 clone runs git p4 sync followed by git p4 rebase before the incremental import, so new changelists land in the graph the same way new git commits do.

Branches and Local Deltas

Experimental: interfaces may change between releases. A Noumenon database tracks one branch at a time. Each database carries branch metadata (:branch/name; :branch/kind, one of :trunk, :feature, or :release; and :branch/vcs), and the repo entity points to the current branch via :repo/branch. Trunk is hosted (one shared knowledge graph for the team); a developer's working branch lives in a sparse delta DB on the developer's own machine.

When the local git rev-parse HEAD diverges from the trunk DB's :repo/head-sha, noum delta-ensure (or POST /api/delta/ensure) materializes a delta DB at ~/.noumenon/deltas/<repo>__<safe-branch>__<basis7>/ containing only the files added, modified, or deleted relative to the trunk basis SHA. Deletions are stored as :file/deleted? true tombstones rather than retracted, so federated queries can subtract them cleanly.
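
The directory layout can be illustrated with a short Python sketch. The sanitization rule for <safe-branch> and the reading of <basis7> as the first seven characters of the basis SHA are assumptions; the function name is hypothetical.

```python
import re
from pathlib import PurePosixPath


def delta_dir(repo: str, branch: str, basis_sha: str,
              root: str = "~/.noumenon/deltas") -> str:
    """Build a delta DB directory path of the form <repo>__<safe-branch>__<basis7>."""
    # Assumption: characters unsafe in a directory name collapse to "-".
    safe_branch = re.sub(r"[^A-Za-z0-9._-]+", "-", branch)
    # Assumption: <basis7> is the 7-character abbreviated basis SHA.
    basis7 = basis_sha[:7]
    return str(PurePosixPath(root) / f"{repo}__{safe_branch}__{basis7}")
```

Including the basis in the name means a rebase onto a newer trunk yields a different directory instead of silently reusing a stale delta.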

Federated answers. A federation-safe named query (those flagged :federation-safe? true in their EDN) can run merged across trunk and a delta in a single HTTP roundtrip via /api/query-federated. The server overlays the delta on trunk by injecting (not [?file :file/path "<p>"]) clauses for each delta path, then concatenates the delta's own rows on top. The launcher detects divergence automatically and reroutes noum query transparently — a yellow banner makes the rerouting observable. Disable per-call with --no-auto-federate or persistently with noum settings federation/auto-route false.
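
The overlay semantics can be modelled on plain in-memory rows. The Python sketch below is not the server's Datalog rewrite: it assumes each row is a dict with a "path" key, that tombstones carry "deleted": True, and that tombstoned files contribute no rows of their own.

```python
def federate(trunk_rows, delta_rows):
    """Overlay delta rows on trunk rows, keyed by file path."""
    # Every path the delta knows about, tombstones included, shadows trunk.
    # This mirrors the injected (not [?file :file/path "<p>"]) clauses.
    delta_paths = {r["path"] for r in delta_rows}
    kept_trunk = [r for r in trunk_rows if r["path"] not in delta_paths]
    # Assumption: tombstoned files are dropped from the delta's own rows.
    live_delta = [r for r in delta_rows if not r.get("deleted", False)]
    return kept_trunk + live_delta
```

With trunk rows for a.clj, b.clj, and c.clj and a delta that modifies b.clj, deletes c.clj, and adds d.clj, the merged result contains a.clj from trunk plus b.clj and d.clj from the delta.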

Why server-side. Federation lives in the daemon because the Babashka launcher does not carry datomic-client. One HTTP roundtrip beats coordinating several from a launcher that has no direct DB access. Delta DBs require a co-located daemon for the same reason; full remote-mode support is a future iteration.

Throwaway by design. Delta DBs are wiped and rebuilt on schema mismatch or basis drift; there is no migration runner. bb prune-deltas interactively garbage-collects orphan directories whose trunk DB has been deleted.

Content addressing for free. Files now carry :file/blob-sha from git ls-tree. The analyze stage uses this for content-addressed promotion: when a file's blob has been analyzed before with the current prompt + model, the prior analysis is copied across instead of paying the LLM again.
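
A minimal sketch of content-addressed promotion in Python. The blob hash follows git's documented object format (SHA-1 over "blob <size>", a NUL byte, then the content); the cache key layout, the function names, and the run_llm callback are hypothetical stand-ins for whatever the analyze stage actually does.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """SHA-1 that git assigns to a blob with this content."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


def analyze_with_promotion(blob_sha, prompt, model, cache, run_llm):
    """Reuse a prior analysis when blob, prompt, and model all match."""
    # Hypothetical cache key: blob identity plus the prompt and model used.
    prompt_hash = hashlib.sha256(prompt.encode()).hexdigest()
    key = (blob_sha, prompt_hash, model)
    if key not in cache:
        cache[key] = run_llm()  # only pay for the LLM on a cache miss
    return cache[key]
```

Keying on the prompt and model as well as the blob means a prompt change or model upgrade invalidates old analyses without any explicit cleanup.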

Same Graph, Regardless of Source

Once the working tree is on disk, the rest of the pipeline (enrich, analyze, synthesize, embed) does not know or care which source-control system the repo came from. Queries, the Ask agent, MCP tools, and the knowledge graph schema are identical in either case.

The four-input dispatch lives in src/noumenon/repo.clj; the Perforce bridge details are in src/noumenon/git.clj.