# How smolbren indexes your vault

> smolbren uses incremental indexing with content hashing to keep your vault's search index fast and up to date with minimal overhead.

Indexing is the process that reads your Markdown files and builds the data structures smolbren uses for full-text search and graph queries. It stores notes, typed edges, and an ontology summary on disk, so subsequent `search` and `query` commands read from the index instead of re-reading your files.

## Running the indexer

**Incremental** (default) — only processes files that have changed since the last index run:

```bash theme={null}
smolbren index
```

**Full rebuild** — drops the existing index and re-processes every file from scratch:

```bash theme={null}
smolbren index --full
```

## Incremental indexing

smolbren uses a two-stage diffing strategy to minimise work on each run:

1. **mtime/size pre-filter**: files whose modification timestamp and byte size are identical to the stored values are skipped immediately, without reading the file contents. This keeps the common case (most files unchanged) fast, because no file contents are read.

2. **Content hash comparison**: files that the pre-filter does not skip (mtime or size changed) are read and hashed. If the hash matches the stored value, the change is treated as metadata-only (for example, a filesystem copy that updated the modification time), and the note row is refreshed without re-parsing frontmatter or edges.

Only files whose content genuinely differs are fully re-parsed, their edges recomputed, and their rows replaced in the index.
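
The two stages above can be sketched in a few lines of Python. This is an illustration of the technique, not smolbren's actual code: `StoredEntry`, `hash_file`, and `classify` are hypothetical names, and SHA-256 is an assumption because this page does not say which hash function is used.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredEntry:
    """Hypothetical record of what the index remembered about one file."""
    mtime_ns: int
    size: int
    content_hash: str


def hash_file(path: str) -> str:
    # SHA-256 is an assumption; the real hash function is not documented here.
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(path: str, stored: Optional[StoredEntry]) -> str:
    """Return 'added', 'unchanged', 'metadata_only', or 'changed'."""
    if stored is None:
        return "added"
    st = os.stat(path)
    # Stage 1: identical mtime and size, so skip without reading the file.
    if st.st_mtime_ns == stored.mtime_ns and st.st_size == stored.size:
        return "unchanged"
    # Stage 2: read and hash only the files that stage 1 could not rule out.
    if hash_file(path) == stored.content_hash:
        return "metadata_only"
    return "changed"
```

Only a `'changed'` or `'added'` result would trigger a full re-parse. A `'metadata_only'` result needs nothing more than the stored mtime and size to be brought up to date.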

## IndexStats output

After indexing, smolbren prints a single-line JSON object (shown here pretty-printed for readability):

```json theme={null}
{
  "scanned": 142,
  "unchanged": 138,
  "added": 2,
  "updated": 1,
  "removed": 1,
  "edges": 87,
  "unresolved_edges": 4,
  "duration_ms": 312
}
```

| Field              | Meaning                                                          |
| ------------------ | ---------------------------------------------------------------- |
| `scanned`          | Total `.md` files found in the vault directory                   |
| `unchanged`        | Files skipped because no content change was detected             |
| `added`            | New notes not previously in the index                            |
| `updated`          | Existing notes whose content changed                             |
| `removed`          | Notes deleted from the vault since the last index run            |
| `edges`            | Total edges (resolved + unresolved) written to the index         |
| `unresolved_edges` | Edges whose wikilink target could not be matched to a vault note |
| `duration_ms`      | Total wall-clock time for the index run, in milliseconds         |
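
Because the output is plain JSON, a script can consume it directly. The sketch below parses the sample object from this page and derives two convenience values. The `summarize` helper and its output keys are hypothetical; the only field meanings it relies on are those in the table above.

```python
import json

# The sample IndexStats object from this page, on one line.
SAMPLE = (
    '{"scanned": 142, "unchanged": 138, "added": 2, "updated": 1, '
    '"removed": 1, "edges": 87, "unresolved_edges": 4, "duration_ms": 312}'
)


def summarize(stats_json: str) -> dict:
    """Derive a few convenience values from an IndexStats JSON string."""
    stats = json.loads(stats_json)
    return {
        # `edges` counts resolved + unresolved, so subtract to get resolved.
        "resolved_edges": stats["edges"] - stats["unresolved_edges"],
        # Notes whose index rows were created, replaced, or deleted.
        "touched": stats["added"] + stats["updated"] + stats["removed"],
        "has_unresolved": stats["unresolved_edges"] > 0,
    }


summary = summarize(SAMPLE)
```

A check like `has_unresolved` could, for example, gate a pre-commit hook that warns about wikilinks pointing at notes that do not exist.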

## What gets indexed

For each Markdown file, smolbren extracts and stores:

* **Note type** — the `type` frontmatter key, used as the graph node label.
* **Title** — the first `# Heading` in the note body; falls back to the filename stem if no heading is found.
* **All frontmatter keys** — stored as a JSON blob for `smolbren get` and future filtering.
* **Note body** — the full text after the frontmatter fence, indexed for BM25 full-text search.
* **Typed edges** — every frontmatter key (other than `type`) whose value contains one or more `[[wikilinks]]`, stored as directed edges with the key name as the relationship type.
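
As a rough illustration of the title and typed-edge rules above, here is a minimal Python sketch that works on an already-parsed frontmatter dictionary. The names `note_title` and `typed_edges` are hypothetical. Treating `[[Target|alias]]` and `[[Target#heading]]` as links to `Target` is an assumption about wikilink handling, not documented smolbren behavior. The heading scan is also simplified and does not skip fenced code blocks.

```python
import re
from pathlib import Path

# Captures the link target, dropping an optional "|alias" or "#heading" part
# (an assumption about how aliases and heading anchors are treated).
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def note_title(body: str, filename: str) -> str:
    """First '# Heading' in the body, else the filename stem."""
    for line in body.splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return Path(filename).stem


def typed_edges(frontmatter: dict) -> list:
    """Return (relationship, target) pairs for every key except `type`."""
    edges = []
    for key, value in frontmatter.items():
        if key == "type":
            continue
        # A value may be a single string or a list of strings.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, str):
                for target in WIKILINK.findall(item):
                    edges.append((key, target.strip()))
    return edges
```

For a note with `works_on: "[[Project X]]"` in its frontmatter, this yields one directed edge of type `works_on` pointing at `Project X`.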

## Hidden files and ignored paths

smolbren uses the [`ignore`](https://docs.rs/ignore) crate for directory traversal. The following are excluded automatically, without any configuration:

* Hidden directories: `.obsidian/`, `.git/`, `.trash/`, and any path component starting with `.`
* Files and directories listed in `.gitignore` (if present in the vault root)
* Any non-`.md` files
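
The hidden-path and extension rules can be approximated as below. This is a sketch only: `is_candidate` is a hypothetical helper, it expects vault-relative paths with forward slashes, it assumes the `.md` match is case-sensitive, and it does not model `.gitignore` patterns, which the `ignore` crate handles during traversal.

```python
from pathlib import PurePosixPath


def is_candidate(rel_path: str) -> bool:
    """True if a vault-relative path would be considered for indexing."""
    path = PurePosixPath(rel_path)
    # Any component starting with "." hides the whole path (.git/, .obsidian/, ...).
    if any(part.startswith(".") for part in path.parts):
        return False
    # Only Markdown files are indexed; case-sensitive matching is an assumption.
    return path.suffix == ".md"
```

With this rule, `projects/alpha.md` is accepted, while `.obsidian/workspace.md` and `assets/diagram.png` are rejected.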

<Tip>
  Use `--full` when you have made major structural changes to your vault — such as moving many files, renaming folders, or bulk-editing frontmatter across notes — or if search results seem stale and you suspect index corruption. For normal day-to-day editing, the default incremental mode is faster and sufficient.
</Tip>
