Data – Knowledge Environment and Knowledge Landmarks in Machine Learning
Lecture Series for the 110th Anniversary of the Founding of the Mathematics Discipline
Title: Data – Knowledge Environment and Knowledge Landmarks in Machine Learning
Speaker: Prof. Witold Pedrycz (University of Alberta)
Place: Room 1220, Rear Main Building (后主楼)
Time: April 27, 2025, 16:00-17:00
Inviter: Yu Fusheng (于福生)
Abstract
The unprecedented progress in Machine Learning (ML) can be attributed to the efficient use of masses of data, as recently exemplified by numerous constructs of LLMs and foundation models.
It is intriguing, though, that despite this heavy reliance on data, the role of knowledge in ML has not been clearly considered. In this talk, we advocate the ultimate importance of synthesizing a unified knowledge-data (KD) design of Machine Learning, or KD-ML for short. As a new paradigm, KD-ML focuses on a prudent and orchestrated engagement of data and knowledge in the design practices of the area.
The fundamentals of the KD environment are formulated along with a historical perspective, and the key highlights are identified. The origin of problem-oriented knowledge, the taxonomy of knowledge, and its main features are discussed.
Data and knowledge arise at very different levels of abstraction, with knowledge being formalized and represented at the symbolic level. This constitutes a genuine challenge, as data are predominantly numeric. We stress that in developing a cohesive and unified framework for coping with data and knowledge in learning processes, one needs to reconcile highly distinct levels of abstraction (numeric versus qualitative), and in this regard information granules play a pivotal role.
We offer a taxonomy of knowledge by distinguishing between scientific and common-sense knowledge and elaborate on a spectrum of ensuing knowledge representation schemes. Subsequently, the main categories of knowledge-oriented ML design are discussed, including physics-informed ML (which relies on scientific knowledge), the augmentation of data-driven models through knowledge-oriented constraints (regularization), the development of a granular expansion of the data-driven model, and ways of building ML models in the presence of knowledge conveyed by rules. In analyzing the proposed categories, it is also explained how the new ML environment helps avoid the detrimental effect of data blinding. Selected schemes of the unified KD environment and the ensuing learning schemes are discussed, including a study on LLM-based knowledge acquisition.
About the Speaker
Witold Pedrycz (IEEE Life Fellow) is a Professor in the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada. He is also with the Systems Research Institute of the Polish Academy of Sciences, Warsaw, Poland. Dr. Pedrycz is a foreign member of the Polish Academy of Sciences and a Fellow of the Royal Society of Canada. He is a recipient of several awards, including the Norbert Wiener Award from the IEEE Systems, Man, and Cybernetics Society, the IEEE Canada Computer Engineering Medal, a Cajastur Prize for Soft Computing from the European Centre for Soft Computing, a Killam Prize, a Fuzzy Pioneer Award from the IEEE Computational Intelligence Society, and the 2019 Meritorious Service Award from the IEEE Systems, Man, and Cybernetics Society.
His main research directions involve Computational Intelligence, Granular Computing, and Machine Learning.
Professor Pedrycz serves as Editor-in-Chief of WIREs Data Mining and Knowledge Discovery (Wiley) and Co-Editor-in-Chief of the Journal of Data, Information and Management (Springer).