General AI tools retrieve what they can find in abstract-level text. For rare disease drug development and clinical variant interpretation, that is not enough. The patient-level evidence that reshapes prevalence estimates, informs trial design, and supports variant classification often lives in full-text publications and supplemental datasets that AI retrieval tools consistently miss.
AI tools are fast. They are not complete.
This poster, presented at Bio-IT World, demonstrates the gap between AI-estimated results and expert-curated results across two complementary rare disease use cases. In the GLA gene/Fabry disease analysis, systematic literature mining and expert curation expanded clinically actionable variant coverage by 129%, adding 622 previously absent pathogenic and likely pathogenic variants to ClinVar. In the PRKAG2 syndrome patient landscape analysis, expert curation identified 548 patients characterized across 79 curated variables, spanning clinical presentation, genomic findings, laboratory metrics, and treatment data. That is 83% more patients than the highest estimate produced by any leading AI tool (ChatGPT, OpenEvidence, or openscilm.allen.ai).
The result: a structured, multidimensional dataset that enables genotype-phenotype analysis at a depth and clinical resolution that AI retrieval alone cannot replicate.



