From Literature to AI-Ready Data via Autonomous Knowledge Distillation
Mathematics Special Seminar
Title: From Literature to AI-Ready Data via Autonomous Knowledge Distillation
Speaker: Wang Yangshuai (王阳帅), National University of Singapore
Place: Room 1124, Rear Main Building (后主楼1124)
Time: Thursday, April 23, 2026, 10:00-11:00
Inviter: Chen Huajie (陈华杰)
Abstract
While large language models offer new ways to process scientific literature, their high hallucination rates and rigid workflows often limit their utility in complex domains. In this talk, we present a high-fidelity pipeline designed to overcome these limitations by autonomously extracting data from raw, multimodal PDFs. Using rare-earth luminescence as a test case, our system decodes the complex relationships between material synthesis, properties, and applications with 95.6% accuracy. More importantly, this structured data empowers the rational prediction of novel phosphor candidates and kinetic pathways. We will discuss how transitioning from static document archives to dynamic, cross-disciplinary databases lays the groundwork for automated knowledge graphs and the next generation of self-driving laboratories.