Provenance

The differentiator: every datapoint is traced to its origin. Most results carry a provenance block so your agent can cite the primary source. Anti-hallucination grounding, built in.

Inline provenance

Most results include a compact provenance block alongside the data, naming the source it came from, the filing period, and the license:

json
{
  "fund": "BERKSHIRE HATHAWAY INC",
  "quarter": "2026-03-31",
  "holdings": [ /* ... */ ],
  "provenance": {
    "source": "SEC EDGAR 13F",
    "filing_quarter": "2026-03-31",
    "license": "public domain"
  }
}
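
As a minimal sketch of how an agent might turn that block into a citation, the Python below reads the fields shown above and formats a one-line attribution. The helper name `format_citation` and the wording of its output are illustrative and not part of the API. The `holdings` list is left empty because the example above elides it.

```python
import json

# Sample result shaped like the example above; holdings are elided there,
# so an empty list stands in for them here.
RESULT_JSON = """
{
  "fund": "BERKSHIRE HATHAWAY INC",
  "quarter": "2026-03-31",
  "holdings": [],
  "provenance": {
    "source": "SEC EDGAR 13F",
    "filing_quarter": "2026-03-31",
    "license": "public domain"
  }
}
"""


def format_citation(result: dict) -> str:
    """Build a one-line attribution from a result's inline provenance block.

    Hypothetical helper: field names follow the example above. Not every
    result carries the block, so a missing one is reported, not guessed.
    """
    prov = result.get("provenance")
    if not prov:
        return "No provenance attached; treat this value as unverified."
    return (
        f"Source: {prov['source']}, filing quarter {prov['filing_quarter']} "
        f"(license: {prov['license']})"
    )


result = json.loads(RESULT_JSON)
citation = format_citation(result)
print(citation)
```

Reporting a missing block explicitly, instead of returning an empty string, keeps an unverified number from being presented as if it were sourced.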

Full lineage

For the complete chain of custody, call provenance.get with a record type and a record id. It returns the exact origin URL, the time we fetched it, the parser version, and a content hash:

bash
curl -X POST https://arkolith.com/api/mcp \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "provenance.get",
    "arguments": {
      "record_type": "filing",
      "record_id": "clr4f9k2a0001"
    }
  }
}'

The response lists one provenance entry per fetch:

json
{
  "record_type": "filing",
  "record_id": "clr4f9k2a0001",
  "provenance": [
    {
      "source": "SEC EDGAR",
      "source_url": "https://www.sec.gov/Archives/edgar/data/1067983/...",
      "fetched_at": "2026-05-16T03:11:42Z",
      "parser_version": "13f-v3",
      "raw_hash": "sha256:9f2c…"
    }
  ]
}
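
The same request and response can be handled from Python. The sketch below builds the JSON-RPC body from the curl call, picks the most recent fetch from a response shaped like the one above, and checks a content hash. All three helper names are hypothetical, the sample data is illustrative, and the hash check assumes that `raw_hash` is `sha256:` followed by the hex digest of the raw fetched bytes, which the example above does not confirm.

```python
import hashlib
import json
from datetime import datetime


def build_provenance_request(record_type: str, record_id: str, request_id: int = 1) -> str:
    """Serialize the JSON-RPC body used in the curl call above."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "provenance.get",
            "arguments": {"record_type": record_type, "record_id": record_id},
        },
    })


def latest_fetch(lineage: dict) -> dict:
    """Return the provenance entry with the most recent fetched_at."""
    def fetched(entry: dict) -> datetime:
        # fromisoformat() rejects a trailing "Z" before Python 3.11.
        return datetime.fromisoformat(entry["fetched_at"].replace("Z", "+00:00"))
    return max(lineage["provenance"], key=fetched)


def matches_raw_hash(raw_bytes: bytes, raw_hash: str) -> bool:
    """Assumption: raw_hash is 'sha256:' plus the hex digest of the raw bytes."""
    algo, _, digest = raw_hash.partition(":")
    return algo == "sha256" and hashlib.sha256(raw_bytes).hexdigest() == digest


# Illustrative data only: two fetches of one record, shaped like the response above.
sample = {
    "record_type": "filing",
    "record_id": "clr4f9k2a0001",
    "provenance": [
        {"fetched_at": "2026-05-16T03:11:42Z", "parser_version": "13f-v3"},
        {"fetched_at": "2026-05-17T03:11:42Z", "parser_version": "13f-v3"},
    ],
}
body = json.loads(build_provenance_request("filing", "clr4f9k2a0001"))
print(body["params"]["name"])
print(latest_fetch(sample)["fetched_at"])
```

Sending the body is left out on purpose: any HTTP client can POST it with the same `Authorization` and `Content-Type` headers as the curl call.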

Why it matters for agents

An LLM asked for a holding value will happily invent one. With provenance attached, your agent can show the number and the SEC filing it came from, so a human (or another agent) can verify it. That's the difference between data you can look at and data you can act on.

Licensing is part of provenance
The license field tells you what you may do with each datapoint. Most SEC and FRED data is public domain; congressional STOCK Act data carries a commercial-use restriction and is served free. Always read the license field before redistributing.
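
One way to enforce that check in an agent is a default-deny gate on the license field. The sketch below is an assumption-laden illustration: the allowlist name, the helper `may_redistribute`, and the restricted license string are hypothetical, and the only license value taken from the examples on this page is "public domain".

```python
# Assumption: only the license string shown in the inline example is listed.
# Extend this set after reviewing the terms of any other value you encounter.
REDISTRIBUTABLE_LICENSES = {"public domain"}


def may_redistribute(result: dict) -> bool:
    """Return True only when the result carries a license on the allowlist.

    Results with no provenance block or an unrecognized license are refused,
    so a new or restricted license never passes by default.
    """
    license_value = result.get("provenance", {}).get("license")
    if not isinstance(license_value, str):
        return False
    return license_value.strip().lower() in REDISTRIBUTABLE_LICENSES


filing = {"provenance": {"source": "SEC EDGAR 13F", "license": "public domain"}}
# Hypothetical license string, used only to show the refusal path.
restricted = {"provenance": {"license": "hypothetical-restricted-license"}}
print(may_redistribute(filing), may_redistribute(restricted))
```

An allowlist is used instead of a blocklist so that a license value the agent has never seen is treated as restricted until someone reviews it.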