Indexing is the process that reads your Markdown files and builds the data structures smolbren uses for full-text search and graph queries. It stores notes, typed edges, and an ontology summary on disk, so subsequent search and query commands read from the index instead of re-reading your files.
Running the indexer
Incremental (default) — only processes files that have changed since the last index run:
Full rebuild — drops the existing index and re-processes every file from scratch:
Incremental indexing
smolbren uses a two-stage diffing strategy to minimise work on each run:
- mtime/size pre-filter: files whose modification timestamp and byte size are identical to the stored values are skipped immediately, without reading the file contents. This makes the common case (most files unchanged) very fast.
- Content hash comparison: files that are not skipped by the pre-filter (mtime or size changed) are read and hashed. If the hash matches the stored value, the file is treated as a metadata-only change (for example, a filesystem copy that updated the modification time) and the note row is refreshed without re-parsing frontmatter or edges.
Only files whose content genuinely differs are fully re-parsed, their edges recomputed, and their rows replaced in the index.
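The two stages above can be sketched in Python. This is a minimal illustration, not smolbren's actual code: the `StoredEntry` record, the `classify` function, its return labels, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredEntry:
    """Hypothetical record of what the index remembers about one file."""
    mtime_ns: int
    size: int
    content_hash: str


def classify(path: str, stored: Optional[StoredEntry]) -> str:
    """Return 'added', 'unchanged', 'metadata_only', or 'updated'."""
    if stored is None:
        return "added"

    st = os.stat(path)
    # Stage 1: cheap metadata pre-filter; the file is never opened.
    if st.st_mtime_ns == stored.mtime_ns and st.st_size == stored.size:
        return "unchanged"

    # Stage 2: metadata differs, so read and hash the contents.
    # SHA-256 is an assumption; the source does not name the hash function.
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest == stored.content_hash:
        return "metadata_only"  # refresh the stored mtime/size, skip re-parsing
    return "updated"  # content genuinely differs: re-parse and replace
```

Ordering the checks this way means the expensive step (reading and hashing) only runs for the small set of files whose metadata changed.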
IndexStats output
After indexing, smolbren prints a single-line JSON object (pretty-printed here for readability):
{
"scanned": 142,
"unchanged": 138,
"added": 2,
"updated": 1,
"removed": 1,
"edges": 87,
"unresolved_edges": 4,
"duration_ms": 312
}
| Field | Meaning |
|---|---|
| scanned | Total .md files found in the vault directory |
| unchanged | Files skipped because no content change was detected |
| added | New notes not previously in the index |
| updated | Existing notes whose content changed |
| removed | Notes deleted from the vault since the last index run |
| edges | Total edges (resolved + unresolved) written to the index |
| unresolved_edges | Edges whose wikilink target could not be matched to a vault note |
| duration_ms | Total wall-clock time for the index run, in milliseconds |
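Because the stats are plain JSON, they are easy to consume from a script. The sketch below parses the sample output and derives two figures from the documented fields; the `summarize` helper and its output keys are hypothetical names chosen for this example.

```python
import json

# Sample IndexStats output, copied from the example above.
raw = (
    '{"scanned": 142, "unchanged": 138, "added": 2, "updated": 1, '
    '"removed": 1, "edges": 87, "unresolved_edges": 4, "duration_ms": 312}'
)


def summarize(stats_json: str) -> dict:
    """Derive a few convenience numbers from an IndexStats JSON string."""
    stats = json.loads(stats_json)
    return {
        # "edges" counts resolved + unresolved, so subtract to get resolved.
        "resolved_edges": stats["edges"] - stats["unresolved_edges"],
        # Notes whose index rows were touched by this run.
        "changed_notes": stats["added"] + stats["updated"] + stats["removed"],
    }


print(summarize(raw))  # {'resolved_edges': 83, 'changed_notes': 4}
```

A non-zero `unresolved_edges` count is a useful signal to look for wikilinks that point at missing or misspelled notes.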
What gets indexed
For each Markdown file, smolbren extracts and stores:
- Note type: the type frontmatter key, used as the graph node label.
- Title: the first # Heading in the note body; falls back to the filename stem if no heading is found.
- All frontmatter keys: stored as a JSON blob for smolbren get and future filtering.
- Note body: the full text after the frontmatter fence, indexed for BM25 full-text search.
- Typed edges: every frontmatter key (other than type) whose value contains one or more [[wikilinks]], stored as directed edges with the key name as the relationship type.
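The title fallback and typed-edge rules can be illustrated with a short Python sketch. It assumes the frontmatter has already been parsed into a dictionary, and it treats everything between `[[` and `]]` as the link target; how smolbren handles aliases or heading anchors inside a wikilink is not covered by this page. The function names and regular expressions are hypothetical.

```python
import re
from pathlib import Path

# Assumption: a wikilink is any bracket-free text wrapped in [[ ]].
WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")
# A level-1 heading: a single '#' followed by at least one space.
HEADING = re.compile(r"^# +(.+?)\s*$", re.MULTILINE)


def extract_title(body: str, filename: str) -> str:
    """First '# Heading' in the body, else the filename stem."""
    match = HEADING.search(body)
    return match.group(1) if match else Path(filename).stem


def extract_edges(frontmatter: dict) -> list:
    """Return (relationship, target) pairs for every wikilink-valued key."""
    edges = []
    for key, value in frontmatter.items():
        if key == "type":  # 'type' is the node label, never an edge
            continue
        values = value if isinstance(value, list) else [value]
        for item in values:
            for target in WIKILINK.findall(str(item)):
                edges.append((key, target.strip()))
    return edges


frontmatter = {
    "type": "project",
    "owner": "[[Alice]]",
    "depends_on": ["[[Search]]", "[[Graph]]"],
    "status": "active",  # no wikilink, so no edge
}
print(extract_edges(frontmatter))
# [('owner', 'Alice'), ('depends_on', 'Search'), ('depends_on', 'Graph')]
print(extract_title("Intro\n# My Note\n", "my-note.md"))  # My Note
print(extract_title("no heading here", "my-note.md"))     # my-note
```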
Hidden files and ignored paths
smolbren uses the ignore library for directory traversal. The following are automatically excluded without any configuration:
- Hidden directories: .obsidian/, .git/, .trash/, and any path component starting with .
- Files and directories listed in .gitignore (if present in the vault root)
- Any non-.md files
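For intuition, the hidden-path and extension rules can be approximated with the Python standard library. This is only a rough sketch: it does not use the ignore library that smolbren relies on, and it deliberately leaves out .gitignore handling. The `find_markdown_files` name is hypothetical.

```python
import os


def find_markdown_files(vault_root: str) -> list:
    """List .md files under vault_root, skipping hidden path components.

    Note: .gitignore rules are NOT applied in this simplified sketch.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(vault_root):
        # Prune hidden directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            # Skip hidden files and anything that is not Markdown.
            if name.startswith(".") or not name.endswith(".md"):
                continue
            full_path = os.path.join(dirpath, name)
            found.append(os.path.relpath(full_path, vault_root))
    return found
```

Pruning `dirnames` in place is what keeps the walk from ever entering directories such as .git/ or .obsidian/, which can contain many files.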
Use --full when you have made major structural changes to your vault — such as moving many files, renaming folders, or bulk-editing frontmatter across notes — or if search results seem stale and you suspect index corruption. For normal day-to-day editing, the default incremental mode is faster and sufficient.