Measure how unstable your LLM's confidence really is.
Perturbation-driven hallucination detection — black-box, logprob-friendly, CLI-first.
pip install yuragi
Releases → v0.4.1 — Apr 14, 2026
README → Usage guide
Theory → Mathematical foundation
Paper draft → Confidence inversion on 8B models
yuragi generates 13 semantics-preserving prompt perturbations (typos, tone shifts, paraphrases, authority framing, counterfactual context) and compares the model's confidence distribution across the resulting responses. When the answer text stays the same but confidence moves, that's fragility: a measurable property of prompt wording rather than model knowledge.

Known negative result: perturbation features add no statistically significant Δ over baseline_confidence (p = 0.35), and fragility_score alone reaches AUC ≈ 0.50 across 6 datasets.
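The fragility idea above can be sketched in a few lines. This is a minimal illustration, not yuragi's actual API: the function name, the choice of population standard deviation as the spread measure, and the sample confidences are all assumptions for the sake of the example.

```python
# Hypothetical sketch: fragility as the spread of a model's confidence
# for the *same* answer under semantics-preserving prompt perturbations.
# Not yuragi's real interface; names and numbers are illustrative.
import statistics


def fragility_score(confidences: list[float]) -> float:
    """Spread (population std-dev) of per-perturbation confidences in [0, 1].

    High spread while the answer text stays identical suggests confidence
    is a property of prompt wording, not of model knowledge.
    """
    if len(confidences) < 2:
        return 0.0
    return statistics.pstdev(confidences)


# Same answer every time; only the prompt wording differs.
stable = [0.91, 0.90, 0.92, 0.91]   # confidence barely moves
fragile = [0.95, 0.40, 0.88, 0.52]  # confidence swings with wording

print(fragility_score(stable))   # small spread -> robust
print(fragility_score(fragile))  # large spread -> fragile
```

In a real pipeline each confidence would come from the model's logprobs (e.g. the mean token logprob of the answer span), collected once per perturbed prompt.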