Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

CLI Commands

All commands accept a package name as a positional argument. The package name resolves to data/{name}/{name}.toml. Use --db <path> globally to override the database path.

forager [--db <path>] <command> [args]

new

Create a new crawl package.

forager new <name> [--config <path>] [--reference <text>] [--from <pkg>]
  • --config copies an existing TOML as the starting config.
  • --reference generates a config from a natural language description (no --config).
  • --from clones an existing package (config + DB state).
forager new ecology-masters --reference "European masters in ecology and conservation"
forager new phil-v2 --from process-philosophy-masters

import

Import terms from a CSV file into the package database.

forager import <pkg> --terms <csv> [--replace]
  • CSV format (full): group,text,weight with header row.
  • CSV format (simple): one term per line, assigned to group “imported” with weight 1.0.
  • --replace clears existing DB terms before importing.
  • Config terms always take priority over imported terms at merge time.
forager import my-crawl --terms keywords.csv
forager import my-crawl --terms extra.csv --replace

run

Run or resume a crawl.

forager run <pkg> [--resume <run-id>] [--new] [--verbose]
  • Default behavior: resumes the latest run if one exists, otherwise starts a new run.
  • --resume resumes a specific run by its UUID.
  • --new forces a fresh run even if a previous one exists.
  • --verbose prints per-round timing breakdowns.
forager run my-crawl
forager run my-crawl --new --verbose

status

Show crawl state, learned parameters, and statistics.

forager status <pkg>
forager status my-crawl

tune

Override a learned parameter value.

forager tune <pkg> <param> <value>
  • <value> can be a number, "auto", or "fixed".
forager tune my-crawl relevance_threshold 0.15
forager tune my-crawl semantic_weight auto

log

Show crawl run history.

forager log [<pkg>]
  • Without a package name, shows all runs across all packages.
forager log my-crawl
forager log

list

List all packages in data/.

forager list

query

Run a raw GQL query against the package database.

forager query <pkg> <gql>
forager query my-crawl "MATCH (p:Page) WHERE p.score > 0.5 RETURN p.url, p.score ORDER BY p.score DESC LIMIT 10"