# askETC

`askETC` is a distilled local corpus for answering questions about Ethereum Classic (ETC) philosophy, ETC history, and adjacent cryptocurrency theory, built from materials already present in this workspace.

## Layout

- `corpus/`: copied source documents with original relative paths preserved
- `core-canon/`: smaller first-pass canon built from `config/core_canon.tsv`
- `config/core_canon.tsv`: explicit canon definition with section and inclusion reason
- `indexes/`: generated indexes for `corpus/` and `core-canon/`
- `packed/`: LLM-ready mega-context exports in `TXT` and `HTML`
- `manifests/manifest.tsv`: one row per copied file with category, provenance, and inclusion reason
- `manifests/core_canon_manifest.tsv`: one row per canon file with section, provenance, and reason
- `manifests/all_files.txt`: full current file list for the `asketc` subtree
- `manifests/summary.json`: corpus counts by category
- `manifests/core_canon_summary.json`: canon counts by section and source category
- `query_etc.py`: local query and index rebuild tool

## Included

- Primary doctrine: Declaration of Independence, Classic Guide, FAQ, redesign text, and a few high-value framing docs
- Governance specs: English ECIP markdown specs and the ECIPs README
- Blog history: official ETC English blog posts from 2015-2020 plus later history/governance posts
- Blog philosophy: official ETC philosophy-tagged English blog posts
- Blog course: ETC course and proof-of-work course posts covering ETC, Bitcoin, PoW, monetary policy, and related theory

## Core Canon

The core canon is intentionally narrower and English-first. It keeps the primary ETC doctrine, the founding history, the key monetary-policy and governance references, and a selected set of modern philosophy/course essays that best explain ETC's recurring claims about:

- Code Is Law
- immutability
- decentralization
- trust minimization
- proof of work
- monetary policy
- ETC versus ETH

The canon definition is editable in `config/core_canon.tsv`. Rebuilding does not touch `corpus/`; it only refreshes `core-canon/`, the core manifest, and the generated indexes.
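One way to confirm that a rebuild leaves `corpus/` untouched is to fingerprint the directory before and after running it. The sketch below is not part of `query_etc.py`; `tree_digest` is a hypothetical helper that hashes each file's relative path and bytes in sorted order.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return a SHA-256 hex digest over every file under root.

    Hypothetical helper, not part of query_etc.py. Both the relative
    path and the file bytes are hashed, so renames and edits both
    change the result.
    """
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


# Intended use (run from the directory that contains asketc/):
#   before = tree_digest("asketc/corpus")
#   ... run: python3 asketc/query_etc.py rebuild ...
#   after = tree_digest("asketc/corpus")
#   assert before == after
```

If the two digests differ, something other than the documented rebuild targets was modified.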

## Core Vs Full

`core-canon/` is the curated first-pass reading set. It is the smaller, higher-signal subset you would keep in standing LLM context when you want ETC's main philosophical claims, origin story, and monetary-policy thesis without hauling the whole archive into the prompt.

`corpus/` is the broader research set. It contains the canon material plus a much larger supporting body of official ETC blog history, course material, ECIPs, reference texts, translations, and adjacent historical documents. Use it when you need depth, corroboration, or retrieval beyond the main canon.

In practice:

- use `core-canon/` for direct prompt stuffing, default answers, and high-signal summaries
- use `corpus/` for retrieval, corroboration, niche topics, and longer historical or governance questions
- use the packed `core` files when a model has a smaller context window
- use the packed `full` files only on very large-context models or when you explicitly want a single mega-context artifact
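To decide whether a packed file fits a given model, a rough size check can help. The sketch below assumes about four characters per token, which is only a common rule of thumb for English text; real tokenizers differ. The helper names, the `reserve_tokens` default, and the window size in the usage section are all illustrative assumptions.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumed rule of thumb; real tokenizers will differ


def estimate_tokens(text):
    """Rough token estimate: character count divided by 4, rounded up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def fits_window(text, window_tokens, reserve_tokens=2000):
    """True if text fits in the window with room left for the question and answer.

    reserve_tokens is an illustrative default, not a recommendation.
    """
    return estimate_tokens(text) <= window_tokens - reserve_tokens


if __name__ == "__main__":
    window = 128_000  # illustrative context size; substitute your model's limit
    for name in ("core", "full"):
        pack = Path(f"asketc/packed/{name}-context-packed.txt")
        if pack.exists():
            text = pack.read_text(encoding="utf-8", errors="replace")
            print(name, estimate_tokens(text), fits_window(text, window))
```

Treat the result as a first filter only; confirm with the target model's own tokenizer before relying on it.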

## Packed Context

The packed exports are meant for direct LLM ingestion when you want a single deterministic context artifact instead of retrieval at query time.

- `packed/core-context-packed.txt`
- `packed/core-context-packed.html`
- `packed/full-context-packed.txt`
- `packed/full-context-packed.html`

Each packed file contains:

- a small pack header with generation time and rough size estimates
- one metadata block per document
- cleaned body text with source path preserved

The `TXT` version is better for direct prompt injection. The `HTML` version is better for browser inspection, archival, or feeding into systems that prefer a single HTML document.

## Excluded

- Source code and site/build code
- Images, media, and other non-text assets
- Most translation index files and site metadata
- Duplicate content where a cleaner canonical text source already existed

Two historical PDFs are kept as references under `corpus/ethereumclassic.org-redesign-text/history/`, but their contents are not searchable with plain-text tools such as `rg`.
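To see which files need separate handling, the PDFs can be located by their file signature rather than by extension. This is a minimal sketch; `find_pdfs` is a hypothetical helper, and it only lists the files. Extracting their text would need a separate PDF tool that is not shown here.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # PDF files begin with this signature


def find_pdfs(root):
    """Return sorted relative paths of files under root that start with the PDF signature."""
    root = Path(root)
    found = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        with path.open("rb") as handle:
            if handle.read(len(PDF_MAGIC)) == PDF_MAGIC:
                found.append(path.relative_to(root).as_posix())
    return found


if __name__ == "__main__":
    corpus = Path("asketc/corpus")
    if corpus.exists():
        for name in find_pdfs(corpus):
            print(name)
```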

## Querying

Search the full corpus:

```sh
rg -n -i "code is law|immutab|decentral" asketc/corpus
```

Search governance and monetary policy material:

```sh
rg -n -i "monetary policy|supply cap|ecip-1017|block reward" asketc/corpus/ECIPs
```

Search official ETC blog philosophy/history/course material:

```sh
rg -n -i "social consensus|proof of work|censorship resistance|history" asketc/corpus/ethereumclassic.github.io/content/blog
```

Inspect the manifest when you want to narrow by category before querying:

```sh
sed -n '1,40p' asketc/manifests/manifest.tsv
```
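For a quicker overview than paging through rows, the manifest can be tallied by column. The sketch below assumes the TSV has a header row and that one column is named `category`; neither is confirmed here, so check the first line of the file and adjust. `count_by_column` is a hypothetical helper, not part of `query_etc.py`.

```python
import csv
from collections import Counter
from pathlib import Path


def count_by_column(tsv_path, column):
    """Count rows per distinct value of `column` in a tab-separated file.

    Assumes the first line is a header row naming the columns.
    """
    with Path(tsv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if column not in (reader.fieldnames or []):
            raise KeyError(f"{column!r} not in header: {reader.fieldnames}")
        return Counter(row[column] for row in reader)


if __name__ == "__main__":
    # "category" is an assumed column name; verify it against the real header.
    manifest = Path("asketc/manifests/manifest.tsv")
    if manifest.exists():
        for value, n in count_by_column(manifest, "category").most_common():
            print(f"{n:6d}  {value}")
```

If the column name is wrong, the `KeyError` message lists the header names that were actually found.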

Build or refresh the canon mirror and both indexes:

```sh
python3 asketc/query_etc.py rebuild
```

Generate all packed context files, in both `TXT` and `HTML`:

```sh
python3 asketc/query_etc.py pack --scope all
```

Generate only the core canon packed files:

```sh
python3 asketc/query_etc.py pack --scope core
```

Search only the core canon:

```sh
python3 asketc/query_etc.py search "code is law" --scope core
```

Search the broader corpus:

```sh
python3 asketc/query_etc.py search "social consensus" --scope full
```

Search both scopes without duplicating core files:

```sh
python3 asketc/query_etc.py search "monetary policy" --scope both
```
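The three search invocations above differ only in the `--scope` value, so they are easy to drive from a script. The sketch below builds the same command line and captures its output; `search_command` and `run_search` are hypothetical wrappers, and nothing is assumed about the format of what the tool prints.

```python
import subprocess
import sys

SEARCH_SCOPES = ("core", "full", "both")  # the scope values shown for `search` above


def search_command(query, scope, script="asketc/query_etc.py"):
    """Build the argv list for a `search` call, rejecting undocumented scopes."""
    if scope not in SEARCH_SCOPES:
        raise ValueError(f"scope must be one of {SEARCH_SCOPES}, got {scope!r}")
    return [sys.executable, script, "search", query, "--scope", scope]


def run_search(query, scope):
    """Run the search and return its stdout as text, unparsed."""
    result = subprocess.run(
        search_command(query, scope),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Intended use (run from the directory that contains asketc/):
#   print(run_search("monetary policy", "both"))
```

Passing the query as a single list element avoids shell quoting issues with multi-word phrases.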

List canon files by section:

```sh
python3 asketc/query_etc.py files --scope core --section philosophy
```

Show index counts:

```sh
python3 asketc/query_etc.py stats
```
