TL;DR: Language models hallucinate numbers because they generate plausible text, not verified facts. Prompts like "don't make things up" barely help. The real fix is grounding: give the model live, sourced data through a tool, and require it to cite the source. If a number can't be traced, it shouldn't be stated.

Why it happens

An LLM predicts the next token. Asked for "Apple's latest reported holdings value," it will produce a confident, well-formatted number whether or not it knows the answer. There is no internal "I'm not sure" gate unless you build one. For financial work, a wrong number that looks right is worse than no number.

What doesn't work

"Be accurate" prompts. They reduce nothing reliably.
Bigger models. Better, still not grounded.
Trusting training data. It's stale and uncited by definition.

What works: grounding + provenance

Give the model a tool, not a memory. Let it fetch the datapoint at answer time (via an API or MCP server) instead of recalling it.
Return the source with the data. Every value should arrive with its origin, a timestamp, and ideally a link.
Require citation. Instruct the agent to answer only with fetched, sourced numbers and to show the source. If it can't fetch it, it says so.

This is exactly why provenance is a feature, not a footnote. When the data layer hands back { value, source, timestamp, url }, the agent can cite primary evidence and a human can verify it.

A concrete pattern

Connect an MCP server that exposes sourced data (what is an MCP server?).
The agent calls a tool, gets a value plus its provenance.
The agent answers: "X, per [source], as of [date]" with a link.
Anything it can't source, it flags rather than invents.

Frequently asked questions

Does RAG solve hallucination?

Retrieval helps, but only if what you retrieve is sourced and the model is required to cite it. Retrieving unsourced text just moves the problem.

How is this different from a normal API?

A normal API returns a value. A provenance-first one returns the value and where it came from, so the answer is verifiable.

Arkolith stamps every datapoint with its source and timestamp so your agent cites ground truth. Get a key or read the docs.