Kenneth W. Bingham

Reliability, measured.

I don't promise zero hallucinations. I measure yours and cut them - and you see the number.

Every language model can make things up, and anyone who guarantees none is selling you a story. Here is the honest version, and it is the one that survives contact with your customers: I ground the AI in your own data, make it cite and verify, let it say "I don't know," and then I measure the hallucination rate on your real questions and show it drop.

How I make it trustworthy

The meter - your number, from your data

Run your test set before and after the work and enter the counts; the meter computes the reduction you can defend.

Before (baseline)

Test questions
Hallucinated (wrong / unsupported)
Honest "I don't know"
Grounded & correct: 28

After (your setup)

Test questions
Hallucinated (wrong / unsupported)
Honest "I don't know"
Grounded & correct: 42
Results shown: hallucination rate (before, after, and the cut between them), relative reduction, grounded & correct (after), honest "I don't know" (after).
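
The arithmetic behind the meter can be sketched in a few lines of Python. This is a minimal sketch, not the meter's actual code. It assumes the hallucination rate is hallucinated answers divided by all test questions, that the "cut" is the percentage-point difference between the two rates, and that the relative reduction is that cut divided by the baseline rate. The names `Counts` and `reduction`, and the counts in the usage lines, are hypothetical and for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Counts:
    """Scored results for one run of the test set (hypothetical structure)."""
    questions: int      # total test questions
    hallucinated: int   # wrong / unsupported answers
    dont_know: int      # honest "I don't know" answers

    def __post_init__(self):
        if self.questions <= 0:
            raise ValueError("need at least one test question")
        if min(self.hallucinated, self.dont_know) < 0:
            raise ValueError("counts cannot be negative")
        if self.hallucinated + self.dont_know > self.questions:
            raise ValueError("counts exceed the number of questions")

    @property
    def grounded(self) -> int:
        # Assumption: every answer falls in exactly one of three categories.
        return self.questions - self.hallucinated - self.dont_know

    @property
    def hallucination_rate(self) -> float:
        # Assumption: refusals stay in the denominator and are not errors.
        return self.hallucinated / self.questions


def reduction(before: Counts, after: Counts) -> Tuple[float, Optional[float]]:
    """Return (cut as a fraction, relative reduction or None)."""
    cut = before.hallucination_rate - after.hallucination_rate
    if before.hallucination_rate == 0:
        return cut, None  # relative reduction is undefined with a zero baseline
    return cut, cut / before.hallucination_rate


# Illustrative counts only, not measured results.
before = Counts(questions=50, hallucinated=15, dont_know=5)
after = Counts(questions=50, hallucinated=3, dont_know=7)
cut, relative = reduction(before, after)
print(f"Hallucination rate: {before.hallucination_rate:.0%} -> {after.hallucination_rate:.0%}")
print(f"Cut: {cut * 100:.0f} points, relative reduction: {relative:.0%}")
```

Keeping refusals in the denominator means a model cannot lower its rate by dropping questions, and a zero baseline returns no relative figure rather than a misleading one.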

How the number is made (so it holds up)

Grounded & correct ✓ - right, and supported by a real cited source. This is what we grow.
Hallucination ✗ - states something false, or presents a claim or citation that is not in the sources, as if it were fact. This is what we cut.
Honest "I don't know" ✓ - declines or asks for more when there is no solid source. Not an error - a feature.
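
Once each answer has been scored into one of the three categories above, the counts can be tallied mechanically. A minimal sketch, assuming every answer receives exactly one label; the `Label` enum and `tally` function are hypothetical names, and the sample labels are made up for illustration.

```python
from collections import Counter
from enum import Enum


class Label(Enum):
    GROUNDED = "grounded & correct"
    HALLUCINATION = "hallucination"
    DONT_KNOW = "honest I-don't-know"


def tally(labels):
    """Count how many scored answers fall into each category."""
    labels = list(labels)
    if not labels:
        raise ValueError("no scored answers")
    for label in labels:
        if not isinstance(label, Label):
            raise TypeError(f"unscored or unknown label: {label!r}")
    counts = Counter(labels)
    return {
        "questions": len(labels),
        "grounded": counts[Label.GROUNDED],
        "hallucinated": counts[Label.HALLUCINATION],
        "dont_know": counts[Label.DONT_KNOW],
    }


# Made-up labels for ten answers, for illustration only.
scored = [Label.GROUNDED] * 7 + [Label.HALLUCINATION] * 2 + [Label.DONT_KNOW]
print(tally(scored))
```

Rejecting anything that is not one of the three labels keeps unscored answers from silently disappearing from the total.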
The honest scope. The number is real but bounded: it is measured on this test set, a representative sample of your questions rather than every possible question, and it is not a guarantee of zero. Refusals count as honest, not as errors. Have your own expert validate the scoring, or at least spot-check a sample, so the figure is yours and credible. That is the version I will stand behind in front of your customers.
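
For the spot-check, one simple approach is to draw a reproducible random sample of scored answers, have the expert re-score them, and report how often the two agree. This is a sketch under assumptions: the `spot_check_sample` and `agreement` helpers, the question ids, the label strings, and the sample size are all hypothetical.

```python
import random


def spot_check_sample(ids, k, seed=0):
    """Pick up to k ids at random; a fixed seed makes the sample reproducible."""
    ids = list(ids)
    rng = random.Random(seed)
    return sorted(rng.sample(ids, min(k, len(ids))))


def agreement(original, expert):
    """Share of expert-reviewed answers where the expert's label matches."""
    if not expert:
        raise ValueError("expert reviewed no answers")
    matches = sum(1 for qid, label in expert.items() if original[qid] == label)
    return matches / len(expert)


# Made-up scoring for four questions, for illustration only.
original = {1: "grounded", 2: "hallucination", 3: "dont_know", 4: "grounded"}
expert = {2: "hallucination", 4: "hallucination"}  # expert disagrees on question 4

print(spot_check_sample(original, k=2))
print(agreement(original, expert))
```

Low agreement on the sample is a signal to re-score the full set before quoting the figure.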
Kenneth W. Bingham · reliability and AI enablement

Concept and direction: Kenneth W. Bingham. Built with the help of Claude AI under a standing directive to be skeptical, to insist on proof, and to allow no claim that is not demonstrated.