Technical Reference
Complete specification of every computation, formula, and parameter in the forager pipeline.
Pipeline Overview
flowchart TB
subgraph ROUND["Each Round"]
direction TB
SELECT["<b>Select</b><br/>DQN ranks frontier candidates<br/>Filter: already fetched, domain capped, unreliable"]
FETCH["<b>Fetch</b><br/>Concurrent HTTP with stealth headers<br/>Per-request timeout + jitter from params"]
PARSE["<b>Parse</b><br/>Parallel HTML → title, headings, body<br/>Adaptive truncation per domain"]
SCORE["<b>Score</b><br/>Semantic + keyword blend<br/>Language + domain weight adjustments"]
DISCOVER["<b>Discover</b><br/>Extract links, batch anchor embeddings<br/>Skip GPU for low-scoring pages"]
PROCESS["<b>Process</b><br/>Compute 11-dim feature vectors<br/>Build children if score ≥ propagation_threshold"]
INTEGRATE["<b>Integrate</b><br/>DB write, learner observations<br/>Domain profiles, DQN training"]
LEARN["<b>Learn</b><br/>All param groups update<br/>Reference embedding refines"]
PERSIST["<b>Persist</b><br/>Param groups, frontier tree<br/>DQN model, domain profiles"]
SELECT --> FETCH --> PARSE --> SCORE --> DISCOVER --> PROCESS --> INTEGRATE --> LEARN --> PERSIST
end
PERSIST -.->|"next round"| SELECT
Scoring System
Total Score
flowchart LR
subgraph SEMANTIC["Semantic Similarity"]
REF["Reference<br/>embedding<br/>(384-dim)"]
ANTI["Anti-reference<br/>embedding"]
TITLE["title embedding"]
HEAD["heading embedding"]
BODY["body embedding"]
REF --> |"cosine sim"| TA["title_aff"]
REF --> |"cosine sim"| HA["heading_aff"]
REF --> |"cosine sim"| BA["body_aff"]
ANTI --> |"penalty"| TA & HA & BA
end
subgraph KEYWORD["Keyword Matching"]
TERMS["term groups<br/>(required + optional)"]
TERMS --> DENSITY["keyword_density"]
end
TA & HA & BA --> BLEND["signal blend"]
BLEND --> AFFINITY["affinity"]
AFFINITY & DENSITY --> TOTAL["total_score"]
Formulas
Per-signal affinity (with anti-reference penalty):
affinity(signal) = max(0, cos(reference, signal_emb) - anti_w × cos(anti_ref, signal_emb))
Multi-signal blend (learnable weights, sum to 1.0):
semantic_affinity = tw × title_aff + hw × heading_aff + bw × body_aff
Total score (semantic + keyword blend):
total = clamp(sem_w × semantic_affinity + (1 - sem_w) × keyword_density, 0, 1)
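The three formulas above can be sketched in Python. This is a minimal illustration, not the pipeline's API: the function names are assumed, and the default weights (tw=0.4, hw=0.3, bw=0.3, sem_w=0.7) come from the parameter table below.

```python
import math

def cos(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def affinity(reference, anti_ref, signal_emb, anti_w=0.3):
    """Per-signal affinity with the anti-reference penalty, floored at 0."""
    return max(0.0, cos(reference, signal_emb) - anti_w * cos(anti_ref, signal_emb))

def total_score(title_aff, heading_aff, body_aff, keyword_density,
                tw=0.4, hw=0.3, bw=0.3, sem_w=0.7):
    """Blend the three signal affinities, then blend semantic vs keyword."""
    semantic_affinity = tw * title_aff + hw * heading_aff + bw * body_aff
    blended = sem_w * semantic_affinity + (1 - sem_w) * keyword_density
    return min(1.0, max(0.0, blended))  # clamp to [0, 1]
```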
Keyword density — flat terms:
density = min(1.0, (Σ(count_i × weight_i) / word_count) × 100)
Keyword density — with term groups (required groups use geometric mean):
required_score = exp(Σ(group_weight × ln(group_density)) / Σ(group_weight))
total = min(1.0, required_score + optional_sum × 0.1)
If any required group has zero density → entire score = 0.
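A sketch of the term-group density under assumed data shapes: `required_groups` pairs each group weight with its per-term `(count, term_weight)` list. The names are illustrative only; the geometric-mean and zero-gate behavior follow the formulas above.

```python
import math

def group_density(counts_weights, word_count):
    """Flat density for one group: capped weighted count per 100 words."""
    raw = (sum(c * w for c, w in counts_weights) / word_count) * 100
    return min(1.0, raw)

def keyword_density(required_groups, optional_densities, word_count):
    """Weighted geometric mean over required groups plus optional bonus."""
    total_w = sum(gw for gw, _ in required_groups)
    log_sum = 0.0
    for gw, counts in required_groups:
        d = group_density(counts, word_count)
        if d == 0.0:
            return 0.0  # any empty required group zeroes the whole score
        log_sum += gw * math.log(d)
    required_score = math.exp(log_sum / total_w)
    return min(1.0, required_score + sum(optional_densities) * 0.1)
```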
Score Adjustments (pipeline)
adjusted_score = raw_score × lang_factor × domain_factor
lang_factor   = lang_penalty           (if page language ∉ accepted languages), else 1.0
domain_factor = domain_weights[domain] (if configured for the domain),          else 1.0
Reference Blending (end of round)
centroid = mean(relevant_page_embeddings)
reference = (1 - blend) × reference + blend × centroid
reference = reference / ‖reference‖
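The blend-and-renormalize step can be sketched in a few lines of Python (function and argument names are assumed; the update itself follows the two formulas above):

```python
import math

def blend_reference(reference, centroid, blend=0.1):
    """EMA-style pull toward the round's relevant-page centroid, then renormalize."""
    mixed = [(1 - blend) * r + blend * c for r, c in zip(reference, centroid)]
    norm = math.sqrt(sum(x * x for x in mixed))
    return [x / norm for x in mixed] if norm else reference
```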
Feature Vector
An 11-dimensional feature vector is computed for each discovered link; it is the DQN’s input.
flowchart LR
subgraph FEATURES["Feature Vector [0..10]"]
direction TB
F0["<b>[0]</b> parent_relevant<br/>1.0 if parent scored above threshold, else 0.0"]
F1["<b>[1]</b> inverse_distance_to_relevant<br/>1 / (hops_since_relevant_ancestor + 1)"]
F2["<b>[2]</b> path_relevance_ratio<br/>relevant_ancestors / total_ancestors"]
F3["<b>[3]</b> keyword_in_url<br/>1.0 if any term appears in URL, else 0.0"]
F4["<b>[4]</b> keyword_in_anchor<br/>1.0 if any term appears in anchor text, else 0.0"]
F5["<b>[5]</b> anchor_relevance<br/>cosine sim of anchor embedding vs reference"]
F6["<b>[6]</b> domain_reward<br/>avg reward for this domain (0.0 if unknown)"]
F7["<b>[7]</b> domain_novelty<br/>1.0 if domain never seen, else 0.0"]
F8["<b>[8]</b> ancestor_depth<br/>min(total_ancestors + 1, 20) / 20"]
F9["<b>[9]</b> required_signal_proximity<br/>1 / (min(hops_since_required_match, 20) + 1)"]
F10["<b>[10]</b> parent_semantic_distance<br/>cosine sim of parent body embedding vs reference"]
end
All features are pre-scaled to approximately [0, 1]. No additional normalization.
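The eleven slots above can be assembled in one small helper. This is a Python sketch with assumed argument names, mirroring the per-slot definitions in the diagram:

```python
def link_features(parent_relevant, hops_since_relevant, relevant_ancestors,
                  total_ancestors, kw_in_url, kw_in_anchor, anchor_sim,
                  domain_reward, domain_seen, hops_since_required, parent_sim):
    """Assemble the 11-dim DQN input; every slot lands in roughly [0, 1]."""
    return [
        1.0 if parent_relevant else 0.0,                                   # [0]
        1.0 / (hops_since_relevant + 1),                                   # [1]
        relevant_ancestors / total_ancestors if total_ancestors else 0.0,  # [2]
        1.0 if kw_in_url else 0.0,                                         # [3]
        1.0 if kw_in_anchor else 0.0,                                      # [4]
        anchor_sim,                                                        # [5]
        domain_reward,                # 0.0 for unknown domains            # [6]
        0.0 if domain_seen else 1.0,                                       # [7]
        min(total_ancestors + 1, 20) / 20,                                 # [8]
        1.0 / (min(hops_since_required, 20) + 1),                          # [9]
        parent_sim,                                                        # [10]
    ]
```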
DQN Agent
Network Architecture
flowchart LR
IN["features<br/>[batch, 11]"] --> L1["Linear(11 → 30)"] --> A1["LeakyReLU<br/>α = 0.1"] --> L2["Linear(30 → 15)"] --> A2["LeakyReLU<br/>α = 0.1"] --> L3["Linear(15 → 1)"] --> Q["Q-value"]
LeakyReLU: f(x) = max(x, 0.1 × x)
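The forward pass of this 11 → 30 → 15 → 1 network can be sketched with NumPy. The weights below are random placeholders, not a trained model; only the shapes and the LeakyReLU follow the architecture above.

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    """f(x) = max(x, alpha * x), elementwise."""
    return np.maximum(x, alpha * x)

def q_forward(features, w1, b1, w2, b2, w3, b3):
    """Q-network forward pass: [batch, 11] -> [batch, 1]."""
    h1 = leaky_relu(features @ w1 + b1)   # [batch, 30]
    h2 = leaky_relu(h1 @ w2 + b2)         # [batch, 15]
    return h2 @ w3 + b3                   # [batch, 1] Q-values

# Shape check with random (untrained) weights:
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 11))
q = q_forward(x,
              rng.normal(size=(11, 30)), np.zeros(30),
              rng.normal(size=(30, 15)), np.zeros(15),
              rng.normal(size=(15, 1)), np.zeros(1))
```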
Double DQN Training
flowchart TB
SAMPLE["Sample batch from PER buffer<br/>(stratified by priority)"]
SAMPLE --> ARGMAX["<b>Online net:</b> find best next action<br/>a* = argmax_a Q_online(s', a)<br/>(batched forward pass)"]
ARGMAX --> TARGET["<b>Target net:</b> evaluate best action<br/>Q_target(s', a*)<br/>(batched forward pass)"]
TARGET --> TD["<b>TD target:</b><br/>y = r + γ × Q_target(s', a*)"]
TD --> LOSS["<b>Weighted MSE loss:</b><br/>L = mean(w_i × (Q_online(s) - y)²)<br/>w_i = importance sampling weights"]
LOSS --> BACKWARD["Backward + Adam optimizer step"]
BACKWARD --> PRIORITY["Update priorities:<br/>priority_i = |td_error_i| + ε"]
TD target formula:
y_i = r_i + γ × Q_target(s', argmax_a Q_online(s', a))
y_i = r_i (if no next actions)
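The TD-target computation can be sketched as follows (a minimal illustration with assumed names; `None` stands in for a transition with no next actions):

```python
import numpy as np

def td_targets(rewards, next_q_online, next_q_target, gamma=0.99):
    """Double DQN targets: the online net selects a*, the target net evaluates it.

    next_q_online / next_q_target: per-transition arrays of Q-values for s',
    or None when the transition has no next actions.
    """
    ys = []
    for r, qo, qt in zip(rewards, next_q_online, next_q_target):
        if qo is None:
            ys.append(r)                       # y_i = r_i
        else:
            a_star = int(np.argmax(qo))        # selection by the online net
            ys.append(r + gamma * qt[a_star])  # evaluation by the target net
    return np.array(ys)
```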
Training schedule:
train every: replay_period steps (default: 3)
min buffer size: min_replay_size (default: 64)
batch size: batch_size (default: 60)
target sync: every target_update_freq steps (default: 500)
LR decay: current_lr *= lr_decay at each target sync
Epsilon schedule:
if steps < decay_steps:
ε = ε_start + (ε_end - ε_start) × steps / decay_steps
else:
ε = ε_end
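The schedule is a plain linear anneal; in Python (the defaults here are placeholders, not the pipeline's configured values):

```python
def epsilon(steps, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Linear anneal from eps_start to eps_end over decay_steps, then flat."""
    if steps < decay_steps:
        return eps_start + (eps_end - eps_start) * steps / decay_steps
    return eps_end
```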
Prioritised Experience Replay
Sampling probability:
P(i) = priority_i^α / Σ priority_j^α
Stratified sampling: divide the cumulative distribution into batch_size equal segments, sample one from each.
Importance sampling weights (bias correction):
w_i = (N × P(i))^(-β) / max_j(w_j)
β = min(1.0, 0.4 + 0.6 × steps / decay_steps)
β anneals from 0.4 → 1.0 over training. At β=1.0, full bias correction.
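Stratified sampling and the importance-sampling weights can be sketched together (a hedged illustration; the real buffer presumably uses a sum-tree, while this linear scan keeps the math visible):

```python
import random

def per_sample(priorities, batch_size, alpha=0.6, beta=0.4):
    """One draw from each equal slice of the cumulative priority mass."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    cum, acc = [], 0.0
    for s in scaled:                      # cumulative distribution
        acc += s
        cum.append(acc)
    n = len(priorities)
    idxs, weights = [], []
    for k in range(batch_size):
        lo = total * k / batch_size       # segment boundaries
        hi = total * (k + 1) / batch_size
        target = random.uniform(lo, hi)
        i = next((j for j, c in enumerate(cum) if c >= target), n - 1)
        idxs.append(i)
        prob = scaled[i] / total          # P(i)
        weights.append((n * prob) ** (-beta))
    w_max = max(weights)
    return idxs, [w / w_max for w in weights]  # normalized by max weight
```

With uniform priorities every P(i) = 1/N, so all importance weights collapse to 1.0, which is the expected no-correction case.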
TRES Tree Frontier
flowchart TB
ROOT["Root leaf<br/>(all URLs)"]
ROOT -->|"split on feature[k] at threshold t"| LEFT["Left leaf<br/>feature[k] ≤ t"]
ROOT --> RIGHT["Right leaf<br/>feature[k] > t"]
LEFT -->|"further split"| LL["..."] & LR["..."]
SAMPLE["<b>Candidate selection:</b><br/>1 random URL per leaf<br/>O(leaves) instead of O(frontier)"]
Split Criterion (CART regression on reward)
variance(values) = Σ(v - mean)² / n
variance_reduction = var(parent) - (n_L/n) × var(left) - (n_R/n) × var(right)
Split on the (feature, threshold) pair that maximizes variance reduction.
Constraints:
min child size = max(min_samples_per_split, ⌊0.15 × n_parent⌋)
max tree depth = max_depth (from config)
Domain Capping
domain_fetch_count[d] += 1 per fetch
capped = domain_fetch_count[d] ≥ domain_max_pages
Capped domains are excluded from candidate selection.
Adaptive Parameter Learning
All learning requires a minimum number of observations before updating. Each group runs update() once per round.
[score] Parameters
flowchart LR
subgraph SCORE_LEARN["Score Learning (per round)"]
OBS["Page observations<br/>(reservoir sampled, max 2000)"]
OBS --> RT["<b>relevance_threshold</b><br/>P25 of nonzero scores<br/>(needs ≥ 20 obs, ≥ 10 nonzero)"]
OBS --> SW["<b>signal weights</b><br/>proportional to which signal<br/>best predicts relevance<br/>(needs ≥ 10 signal hits)"]
OBS --> SEM["<b>semantic_weight</b><br/>avg_semantic / (avg_semantic + avg_keyword)<br/>among relevant pages<br/>(needs ≥ 5 relevant)"]
end
Signal weight normalization:
tw = title_hits / total_signal_hits
hw = heading_hits / total_signal_hits
bw = body_hits / total_signal_hits
normalize: tw, hw, bw = tw/sum, hw/sum, bw/sum
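In Python, with a fallback to the defaults from the parameter table when no signal has hit yet (the fallback behavior is an assumption, not documented above):

```python
def signal_weights(title_hits, heading_hits, body_hits):
    """Hit counts -> normalized (tw, hw, bw) summing to 1.0."""
    total = title_hits + heading_hits + body_hits
    if total == 0:
        return 0.4, 0.3, 0.3  # assumed fallback: the configured defaults
    return title_hits / total, heading_hits / total, body_hits / total
```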
[fetch] Parameters
| Parameter | Formula | Conditions |
|---|---|---|
| timeout_ms | clamp(P95(durations) × 1.5, 1000, 30000) | ≥ 20 fetches, ≥ 10 durations |
| connect_timeout_ms | clamp(P50(durations) × 0.75, 500, 10000) | ≥ 20 fetches, ≥ 10 durations |
| jitter_max_ms | current × 1.5 if >5% rate-limited, × 1.2 if >1%, × 0.9 if safe | ≥ 20 fetches |
| max_links_per_page | clamp(avg_links × 2, 50, 500) | ≥ 10 pages, >10% change |
| embed_threshold_factor | linear map: kw_ratio 0.05 → 0.3, 0.30 → 0.7 | ≥ 10 pages |
| propagation_threshold | 0.0 if >30% yield, 0.02 if >10%, 0.05 otherwise | ≥ 10 low-score pages |
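The two timeout rules can be sketched with a simple nearest-rank percentile (the percentile method is an assumption; only the multipliers and clamp bounds come from the table):

```python
def percentile(values, p):
    """Nearest-rank percentile over a sorted copy (simple approximation)."""
    s = sorted(values)
    idx = min(len(s) - 1, int(p / 100 * len(s)))
    return s[idx]

def clamp(x, lo, hi):
    return max(lo, min(hi, x))

def learn_timeouts(durations_ms):
    """[fetch] timeout rules: P95 × 1.5 and P50 × 0.75, clamped."""
    timeout = clamp(percentile(durations_ms, 95) * 1.5, 1000, 30000)
    connect = clamp(percentile(durations_ms, 50) * 0.75, 500, 10000)
    return timeout, connect
```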
[select] Parameters
| Parameter | Formula | Conditions |
|---|---|---|
| domain_max_pages | median exhaustion point: rolling-5 avg drops below 50% of mean | ≥ 3 domains with ≥ 5 pages |
| slow_domain_ms | clamp(P90(latencies), 500, 15000) | ≥ 10 observations, ≥ 10 latencies |
| unreliable_threshold | clamp(P10(success_rates), 0.1, 0.6) | ≥ 10 observations, ≥ 5 rates |
| html_limit_factor | not yet learned (config knob) | — |
[tune] Parameters
| Parameter | Formula | Conditions |
|---|---|---|
| per_alpha | clamp(0.3 + (CV - 0.5) / 1.5 × 0.5, 0.3, 0.8), where CV = σ/μ of TD errors | ≥ 20 steps, ≥ 10 errors |
| per_epsilon | not learned (fixed priority floor) | — |
Domain Profiling
flowchart LR
FETCH_EVENT["fetch event"] --> PROFILE["DomainProfile"]
PROFILE --> AVG_F["avg_fetch_ms<br/>running mean"]
PROFILE --> AVG_P["avg_parse_ms<br/>running mean"]
PROFILE --> AVG_H["avg_html_bytes<br/>running mean"]
PROFILE --> SR["success_rate<br/>successes / fetches"]
PROFILE --> AR["avg_reward<br/>reward_sum / fetches"]
AVG_F & AVG_P --> SLOW{"is_slow?<br/>fetch+parse > slow_domain_ms<br/>(after ≥ 3 fetches)"}
SR --> UNRELIABLE{"is_unreliable?<br/>success_rate < threshold<br/>(after ≥ 5 fetches)"}
AVG_H --> HTML{"html_limit<br/>avg_bytes × factor<br/>clamped [50KB, 2MB]<br/>(default 512KB until ≥ 3 fetches)"}
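The html_limit rule in the diagram can be sketched directly (a hedged illustration; the 512 KB warm-up default and the [50 KB, 2 MB] clamp come from the diagram, the function name is assumed):

```python
def html_limit(avg_html_bytes, factor=2.0, fetches=0):
    """Adaptive truncation limit: avg size × factor, clamped to [50 KB, 2 MB]."""
    if fetches < 3:
        return 512 * 1024  # default until the profile has >= 3 fetches
    limit = int(avg_html_bytes * factor)
    return max(50 * 1024, min(2 * 1024 * 1024, limit))
```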
Data Flow
flowchart TB
CONFIG["TOML Config<br/>user intent + defaults"] --> INIT["init()"]
DB_RESTORE["DB: ParamGroup nodes<br/>DQN model, frontier tree"] --> INIT
INIT --> STATE["PipelineState"]
STATE --> |"each round"| PIPELINE["Pipeline Loop"]
PIPELINE --> |"observations"| SCORE_G["ScoreParams.observe()"]
PIPELINE --> |"observations"| FETCH_G["FetchParams.observe()"]
PIPELINE --> |"observations"| FRONT_G["SelectParams.observe()"]
PIPELINE --> |"TD errors"| DQN_G["TuneParams.observe()"]
SCORE_G & FETCH_G & FRONT_G & DQN_G --> |"group.update()"| LEARN_STEP["End-of-round learning"]
LEARN_STEP --> |"group.to_json()"| DB_SAVE["DB: save_param_group()"]
DB_SAVE --> |"next run"| DB_RESTORE
All 23 Adaptive Parameters
| Section | Parameter | Default | Mode | What it controls |
|---|---|---|---|---|
| select | domain_max_pages | 100 | auto | Per-domain page cap |
| select | slow_domain_ms | 3000 | auto | Latency threshold for “slow” |
| select | unreliable_threshold | 0.3 | auto | Success-rate floor for “unreliable” |
| select | html_limit_factor | 2.0 | auto | Multiplier on avg HTML for truncation |
| fetch | timeout_ms | 8000 | auto | HTTP request timeout |
| fetch | connect_timeout_ms | 3000 | auto | TCP connect timeout |
| fetch | max_redirects | 3 | fixed | Redirect limit |
| fetch | pool_idle_per_host | 8 | fixed | Connection pool size |
| fetch | jitter_max_ms | 50 | auto | Random delay before requests |
| parse | max_links_per_page | 200 | auto | Link extraction cap |
| parse | anchor_batch_size | 512 | fixed | GPU batch cap for anchor embeddings |
| parse | embed_threshold_factor | 0.5 | auto | Min page score for anchor GPU work |
| parse | propagation_threshold | 0.0 | auto | Min score to propagate children |
| score | relevance_threshold | 0.1 | auto | Score cutoff for “relevant” |
| score | semantic_weight | 0.7 | range [0.3, 0.9] | Semantic vs keyword blend |
| score | anti_weight | 0.3 | range [0.1, 0.5] | Anti-reference penalty strength |
| score | title_weight | 0.4 | auto | Title signal importance |
| score | heading_weight | 0.3 | auto | Heading signal importance |
| score | body_weight | 0.3 | auto | Body signal importance |
| score | reference_blend | 0.1 | range [0.0, 0.3] | Reference adaptation speed |
| score | lang_penalty | 0.0 | auto | Score multiplier for wrong language |
| tune | per_alpha | 0.6 | auto | PER prioritisation exponent |
| tune | per_epsilon | 1e-4 | fixed | PER priority floor |