How to Give Claude Access to SEC Filings

A practical tutorial: connect Claude to SEC filings over MCP in one command, the tools your agent gets, example prompts, and why provenance is the trust layer.

The short version

Claude cannot read SEC filings on its own: its training data has a cutoff, and EDGAR's raw XML is painful to parse mid-conversation, so an unassisted model will often produce plausible but wrong holdings. The fix is the Model Context Protocol (MCP): connect a server that exposes parsed 13F, Form 4, and ownership data as tools Claude can call. With Arkolith that is one shell command plus an API key, and Claude immediately gets tools like fund.holdings, fund.holdings.diff, and insider.transactions. Every datapoint carries the SEC accession number it came from, so any claim can be verified against the original filing.

Why Claude needs help with SEC filings

A model has a knowledge cutoff and no native pipe into EDGAR. Ask a bare model what a hedge fund holds and you get a blend of stale training data and confident interpolation. It often looks right, which is the dangerous part.

The raw filings are public and free. The SEC even runs a usable full-text search. The problem is the shape of the data. A 13F information table is XML keyed by CUSIP, not ticker. Filers spell issuer names however they like. Amendments arrive as restatements or new-holdings patches and must be merged with the original under specific rules. Form 4 filings get superseded by 4/A corrections. Parsing all of this on the fly inside a chat session burns context, runs slowly, and fails on edge cases that take months of production traffic to catalog.
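
The amendment-merging step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Arkolith's pipeline: the `Filing13F` type, the `merge_quarter` helper, and the CUSIP placeholders are hypothetical, and each filing is assumed to be already parsed into a CUSIP-to-shares mapping. It treats a restatement as replacing everything filed before it, and a new-holdings amendment as adding rows on top of what is already there.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Filing13F:
    kind: str                  # "ORIGINAL", "RESTATEMENT", or "NEW HOLDINGS"
    filed: str                 # ISO date the filing reached EDGAR
    positions: Dict[str, int]  # CUSIP -> shares (hypothetical parsed shape)


def merge_quarter(filings: List[Filing13F]) -> Dict[str, int]:
    """Fold one quarter's original 13F and its amendments into one table."""
    merged: Dict[str, int] = {}
    # ISO dates sort correctly as strings, so this applies filings in order.
    for filing in sorted(filings, key=lambda f: f.filed):
        if filing.kind in ("ORIGINAL", "RESTATEMENT"):
            # A restatement supersedes everything filed before it.
            merged = dict(filing.positions)
        elif filing.kind == "NEW HOLDINGS":
            # Assumption: rows for an already-reported CUSIP are summed.
            for cusip, shares in filing.positions.items():
                merged[cusip] = merged.get(cusip, 0) + shares
        else:
            raise ValueError(f"unknown amendment type: {filing.kind}")
    return merged


original = Filing13F("ORIGINAL", "2026-02-17", {"CUSIP_A": 100, "CUSIP_B": 50})
patch = Filing13F("NEW HOLDINGS", "2026-03-01", {"CUSIP_C": 10})
restated = Filing13F("RESTATEMENT", "2026-03-10", {"CUSIP_A": 90})

print(merge_quarter([original, patch]))
print(merge_quarter([original, patch, restated]))
```

Even this toy version has to make a judgment call (summing duplicate CUSIPs), which is the kind of edge case that is better settled once in a pipeline than improvised inside a chat session.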

Timeliness compounds the problem. A 13F is due 45 days after quarter end, rolling to the next business day when that date lands on a weekend or federal holiday (the 2026 deadlines are Feb 17, May 15, Aug 14, and Nov 16). Form 4 is due within 2 business days of the trade, and Schedule 13D within 5 business days of crossing the 5% threshold. Even a freshly trained model is structurally behind the filing calendar. If you want Claude to reason over what was actually filed, it needs a live, parsed source it can query at answer time. That is the general case for tool grounding, covered in why AI models hallucinate market data.
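
The 2026 dates above can be reproduced with a short calculation: add 45 calendar days to the quarter end, then roll forward past weekends and federal holidays. The sketch below is illustrative only. The helper names are made up for this example, and the holiday set is a hardcoded list of 2026 US federal holidays rather than a general calendar.

```python
from datetime import date, timedelta

# US federal holidays in 2026 (observed dates), hardcoded for this example.
FEDERAL_HOLIDAYS_2026 = {
    date(2026, 1, 1), date(2026, 1, 19), date(2026, 2, 16), date(2026, 5, 25),
    date(2026, 6, 19), date(2026, 7, 3), date(2026, 9, 7), date(2026, 10, 12),
    date(2026, 11, 11), date(2026, 11, 26), date(2026, 12, 25),
}


def roll_to_business_day(d: date, holidays=FEDERAL_HOLIDAYS_2026) -> date:
    """Return d itself, or the next day that is not a weekend or holiday."""
    while d.weekday() >= 5 or d in holidays:
        d += timedelta(days=1)
    return d


def form_13f_deadline(quarter_end: date) -> date:
    """45 calendar days after quarter end, rolled to a business day."""
    return roll_to_business_day(quarter_end + timedelta(days=45))


quarter_ends = [date(2025, 12, 31), date(2026, 3, 31),
                date(2026, 6, 30), date(2026, 9, 30)]
for q in quarter_ends:
    print(q.isoformat(), "->", form_13f_deadline(q).isoformat())
# 2025-12-31 -> 2026-02-17
# 2026-03-31 -> 2026-05-15
# 2026-06-30 -> 2026-08-14
# 2026-09-30 -> 2026-11-16
```

The February date shows why the roll matters: day 45 is Saturday, Feb 14, and the following Monday is a federal holiday, so the deadline lands on Tuesday, Feb 17.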

Connect Claude to SEC filings in one command

First, get an API key: signing up creates one for you, preloaded with free starter credits. Then register the server with Claude Code:

claude mcp add --transport http arkolith \
  https://arkolith.com/api/mcp \
  --header "Authorization: Bearer YOUR_API_KEY"

That is the whole integration. If you have never used MCP, think of it as a universal connector for agent tools: one wire format for discovery, invocation, and results, so any compliant client can use any compliant server. On connect, Claude performs tool discovery: the server advertises each tool with a JSON schema and a plain-language description, and Claude decides when to call them based on your prompt. There is no SDK to install and no glue code to write.
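
For the curious, discovery and invocation are ordinary JSON-RPC 2.0 messages. The sketch below only builds the two payloads an MCP client sends, `tools/list` and `tools/call`, without contacting any server. You never write these by hand; Claude Code sends them for you after its initialization handshake. The `cik` argument name is an assumption for illustration: the real argument schema is whatever the server advertises in its `tools/list` response.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def jsonrpc_request(method: str, params: dict = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope as used by MCP."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invocation: call one tool by name with JSON arguments.
# "cik" is a hypothetical argument name, not taken from the Arkolith schema.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "fund.holdings", "arguments": {"cik": "0001067983"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

Because every compliant server speaks this same envelope, the only server-specific parts are the tool names and their argument schemas, which the client learns at connect time.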

Two properties of this setup are worth understanding. First, the reasoning stays on your side. Arkolith serves data; your own Claude does the thinking; the data layer never runs inference. Second, billing is prepaid credits metered per tool call rather than a seat license, which fits agents that make bursts of calls and then go quiet for days. The same pattern works in Claude Desktop and other MCP-capable clients; the step-by-step walkthrough lives in connect market data to Claude over MCP.

The tools Claude gets

Once connected, Claude sees a catalog of SEC-cluster tools. The dataset underneath, as of Q1 2026: 1,824 institutional filers, 1.87 million long positions worth $53.7 trillion in reported value, and 51,000+ insider transactions from Form 4.

Tool                    The question it answers
search                  Resolve a fund, manager, or ticker to its identifiers
fund.list               Which institutional filers are tracked, ranked by reported book
fund.holdings           What a fund reported in its most recent 13F
fund.holdings.diff      What changed between quarters: new, added, trimmed, exited
fund.position.history   One position traced across every reported quarter
fund.holdings.as_of     The portfolio as it was publicly known on a specific date
stock.owners            Which tracked funds hold a given ticker
insider.transactions    Form 4 buys and sells for a company or insider
insider.clusters        Multiple insiders at one company buying in the same window
watchlist.poll          New filings for entities you follow since the last check
provenance.get          The exact source filing behind any datapoint

Three of these do work raw EDGAR will not do for you. fund.holdings.diff compares position to position across quarters after amendment merging, so an increased share count is a real add and not a restatement artifact. fund.holdings.as_of answers point-in-time questions: what was publicly known on a date, the only honest basis for a backtest. And insider.clusters aggregates the Form 4 pattern that academic work has generally found most informative: several insiders buying the same stock in the same window.
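
To make the four diff categories concrete, here is a minimal sketch of a quarter-over-quarter comparison. It is not the implementation behind fund.holdings.diff. It assumes both quarters have already been amendment-merged into CUSIP-to-shares mappings, uses placeholder CUSIPs, and ignores complications such as stock splits and CUSIP changes.

```python
from typing import Dict


def diff_holdings(prev: Dict[str, int],
                  curr: Dict[str, int]) -> Dict[str, Dict[str, int]]:
    """Classify each position's share change between two quarters."""
    changes = {"new": {}, "added": {}, "trimmed": {}, "exited": {}}
    for cusip in sorted(set(prev) | set(curr)):
        before = prev.get(cusip, 0)
        after = curr.get(cusip, 0)
        delta = after - before
        if before == 0 and after > 0:
            changes["new"][cusip] = delta
        elif before > 0 and after == 0:
            changes["exited"][cusip] = delta
        elif delta > 0:
            changes["added"][cusip] = delta
        elif delta < 0:
            changes["trimmed"][cusip] = delta
        # Unchanged positions are deliberately left out of the result.
    return changes


q4 = {"CUSIP_A": 100, "CUSIP_B": 50, "CUSIP_C": 25}
q1 = {"CUSIP_A": 150, "CUSIP_B": 20, "CUSIP_D": 10}
print(diff_holdings(q4, q1))
# {'new': {'CUSIP_D': 10}, 'added': {'CUSIP_A': 50},
#  'trimmed': {'CUSIP_B': -30}, 'exited': {'CUSIP_C': -25}}
```

The comparison itself is trivial; the value is in the inputs. Run this on unmerged filings and a restatement shows up as a phantom trade.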

Example prompts that actually work

Treat Claude plus these tools like a junior analyst with a terminal. Prompts that map cleanly onto tool calls:

  • "Pull Berkshire Hathaway's latest 13F and diff it against the prior quarter. Which positions are new and which were exited?"
  • "Which tracked funds opened new positions in NVDA last quarter, and how large is each position relative to that fund's book?"
  • "Check TSM for insider activity over the last two quarters. Is any of the buying clustered?"
  • "Trace one fund's position in a single name across every reported quarter and summarize the timing."
  • "Watch these three filers and tell me what is new since yesterday."

Two habits make these dramatically better. State the as-of expectation: a 13F shows quarter-end positions and can be filed up to 45 days later, so in early June the freshest institutional snapshot is Q1, filed by May 15. Claude with tools will respect that lag; Claude without tools will happily imply it knows the current portfolio. Also distinguish filing types in the ask. 13F is the quarterly portfolio, Schedule 13D/13G is the 5%-ownership event, Form 4 is the insider trade, Form 3 is the initial insider disclosure (due within 10 days of becoming an insider), and Form 5 is the annual catch-up (due 45 days after fiscal year end). A prompt that names the right instrument gets a sharper answer, and adding "include the accession number for each claim" turns a chat answer into something you can check line by line.
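
The as-of rule is simple enough to state in code: a quarter counts as known only once its filing is public. The sketch below assumes a hypothetical list of (period end, filed on) pairs for one fund and picks the freshest quarter visible on a given date. The dates are illustrative, set to the 2026 deadlines, and are not real filing records.

```python
from datetime import date
from typing import List, Optional, Tuple

# (period_end, filed_on) pairs for one fund; illustrative values only.
FilingRecord = Tuple[date, date]


def latest_known_period(filings: List[FilingRecord],
                        as_of: date) -> Optional[date]:
    """Return the most recent quarter end whose 13F was public by as_of."""
    public = [period for period, filed_on in filings if filed_on <= as_of]
    return max(public) if public else None


filings = [
    (date(2025, 12, 31), date(2026, 2, 17)),
    (date(2026, 3, 31), date(2026, 5, 15)),
    (date(2026, 6, 30), date(2026, 8, 14)),
]

print(latest_known_period(filings, date(2026, 6, 5)))   # 2026-03-31
print(latest_known_period(filings, date(2026, 5, 14)))  # 2025-12-31
```

In early June the answer is the Q1 snapshot even though Q2 is nearly over, which is exactly the lag a well-phrased prompt should acknowledge.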

Why provenance matters more than coverage

Every Arkolith response carries a provenance block: the SEC accession number of the filing each datapoint was derived from, plus the filing quarter and source system. The chain terminates at EDGAR, so any number Claude gives you can be verified against the SEC's own 13F documentation and the underlying filing in under a minute.
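
Verification mostly comes down to turning an accession number into a link. EDGAR serves each filing from a folder whose path is the filer's CIK without leading zeros followed by the accession number without dashes. The helper below is a small sketch of that mapping; the function name is made up for this example and the accession number is a placeholder, not a real filing.

```python
import re

# Accession numbers look like 0000000000-26-000001: 10, 2, and 6 digits.
ACCESSION_RE = re.compile(r"^\d{10}-\d{2}-\d{6}$")


def edgar_filing_url(cik: str, accession: str) -> str:
    """Build the EDGAR archive folder URL for a filing."""
    if not ACCESSION_RE.match(accession):
        raise ValueError(f"not an accession number: {accession!r}")
    cik_number = int(cik)                      # drops leading zeros
    folder = accession.replace("-", "")        # EDGAR paths omit the dashes
    return f"https://www.sec.gov/Archives/edgar/data/{cik_number}/{folder}/"


# Placeholder accession number for illustration only.
print(edgar_filing_url("0001067983", "0000000000-26-000001"))
# https://www.sec.gov/Archives/edgar/data/1067983/000000000026000001/
```

Opening that folder for a real accession number lists the filing's documents, including the information table the datapoint was parsed from.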

This matters because language models do not fail loudly. They produce fluent, internally consistent tables with wrong numbers in them, and a fabricated holdings table is indistinguishable at a glance from a real one. The only structural defense is grounding: force every figure to come from a tool call, and make every tool result carry its citation. When you can click through to the actual information table, trust stops being a vibe and becomes a property of the system.
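
One way to turn that defense into a habit is a mechanical check: collect the accession numbers returned by tool calls in a session, then flag any accession number in the final answer that is not among them. The sketch below assumes you have the answer text and the set of tool-returned accession numbers on hand; both inputs and the function name are hypothetical, and the numbers are placeholders.

```python
import re
from typing import Iterable, Set

ACCESSION_PATTERN = re.compile(r"\b\d{10}-\d{2}-\d{6}\b")


def unsupported_citations(answer_text: str,
                          tool_accessions: Iterable[str]) -> Set[str]:
    """Return accession numbers cited in the answer but never returned by a tool."""
    cited = set(ACCESSION_PATTERN.findall(answer_text))
    return cited - set(tool_accessions)


# Placeholder values for illustration only.
from_tools = {"0000000000-26-000001"}
answer = (
    "The fund added to its stake (0000000000-26-000001) "
    "and opened a new position (0000000000-26-000099)."
)

print(unsupported_citations(answer, from_tools))
# {'0000000000-26-000099'}
```

This catches invented citations, not uncited figures, so it complements the "include the accession number for each claim" instruction rather than replacing it.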

Provenance also keeps you honest about the data's real limits, which belong to the SEC's regime, not to any pipeline. A 13F covers long US-listed equity positions for managers above the $100M reporting threshold, as of quarter end, and can be filed up to 45 days afterward. It generally omits short positions and most non-US holdings. Form 4 is fast but covers only insiders; 13D/13G covers only large stakes. An agent that knows the lag and scope of each source can caveat its own answers instead of overclaiming. For a deeper treatment of what this data can and cannot tell you, see how accurate is 13F data; the official definitions live at investor.gov.

The same data over REST when you need it

MCP is for interactive reasoning. For pipelines, crons, and deterministic code paths, the identical data is available as a REST API with the same key and the same credit meter:

# List tracked institutional filers
curl -H "Authorization: Bearer YOUR_API_KEY" "https://arkolith.com/api/v1/funds"

# Resolve a name or ticker to identifiers
curl -H "Authorization: Bearer YOUR_API_KEY" "https://arkolith.com/api/v1/search?q=berkshire"

# Pull a fund's holdings by CIK
curl -H "Authorization: Bearer YOUR_API_KEY" "https://arkolith.com/api/v1/funds/0001067983/holdings"

A useful rule: if a human or an agent is deciding what to ask next, use MCP; if a machine asks the same question every morning, use REST. Many real systems use both. A nightly REST job materializes a watchlist, and an MCP-connected Claude session investigates whatever the job flags. The tradeoffs between the two surfaces are covered in MCP vs REST API.
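
For a scheduled job, the same REST calls translate directly to Python's standard library. The sketch below is a starting point under stated assumptions: it reuses the endpoint paths from the curl examples, the helper names are made up, and it returns the decoded JSON as-is because the response schema is not described here.

```python
import json
import urllib.request

BASE_URL = "https://arkolith.com/api/v1"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Prepare an authenticated GET request without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


def fetch_json(path: str, api_key: str, timeout: float = 30.0):
    """Send the request and decode the JSON body.

    urllib raises HTTPError on non-2xx responses, so failures such as an
    exhausted credit balance surface as exceptions instead of partial data.
    """
    with urllib.request.urlopen(build_request(path, api_key),
                                timeout=timeout) as response:
        return json.load(response)
```

A nightly job would call something like `fetch_json("/funds/0001067983/holdings", api_key)` with the key read from an environment variable, store the result, and leave the follow-up questions to an MCP-connected session.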

Authentication is one bearer key across both. Keys are minted and rotated from the dashboard, and a leaked key can be revoked without disturbing your other keys. The API behavior is boring on purpose: predictable JSON, stable field names, provenance on every response, and explicit errors when credits run out rather than silent truncation. Boring is what you want underneath an agent.

Frequently asked questions about giving Claude access to SEC filings

Does Claude need a plugin to read SEC filings?

It needs a tool source, and MCP is the standard way to provide one. Claude Code and Claude Desktop both support remote MCP servers over HTTP with a bearer token. Once registered, the tools appear automatically and Claude decides when to call them; there is no plugin marketplace step.

How fresh is the SEC data Claude gets?

As fresh as the SEC's deadlines allow. Form 4 insider trades arrive within 2 business days of the trade, Schedule 13D within 5 business days, and 13F portfolios up to 45 days after quarter end. Ingestion follows the filing calendar, so the binding constraint is regulatory lag, not pipeline lag.

Can Claude still hallucinate with the MCP server connected?

It can misread or over-summarize a tool result, so grounding has to be checked, not assumed. Every datapoint carries its SEC accession number, and asking Claude to cite accession numbers in its answers makes fabricated figures stand out immediately. Treat provenance as a verification habit, not a guarantee.

What does it cost to connect Claude to SEC filings?

Signup includes free credits, and usage is metered per tool call from a prepaid balance rather than a subscription. Because the server only serves data (your own Claude does the reasoning), there is no inference charge on the data side. Heavy users top up; light users may never pay.

This article explains public filings and data concepts. It is not investment advice.