Welcome to the Workshop on Trustworthy Knowledge Discovery and Data Mining (TrustKDD) at CIKM 2025!
The explosion of data and the widespread adoption of AI techniques, especially the success of foundation models and generative AI, have transformed knowledge discovery and data mining (KDD), making it integral to real-world decision-making. For both traditional AI methods and generative AI, issues such as data noise, algorithmic bias, lack of interpretability, and privacy concerns can significantly impact the quality and reliability of extracted knowledge, thereby affecting downstream decision-making. This workshop aims to bring together researchers and practitioners from information and knowledge management, data mining, and intelligent systems to explore trustworthy KDD across diverse settings in the generative AI era. We welcome contributions on robust data preprocessing, explainable learning algorithms, bias detection and mitigation, secure and privacy-preserving mining, trustworthy knowledge graph construction, resource-efficient deployment, alignment of foundation models, and applications for social good. Special emphasis is placed on emerging challenges posed by large-scale, pre-trained models in dynamic, multi-source, and user-centric environments. By fostering dialogue between traditional KDD approaches and innovations in the foundation model era, TrustKDD seeks to advance trustworthy methodologies that align with CIKM’s mission of developing reliable, scalable, and intelligent information and knowledge systems.
Call for Papers
Submission Website
Please submit your papers via OpenReview: https://openreview.net/group?id=ACM.org/CIKM/2025/Workshop/TrustKDD
Submission Requirements
Authors are invited to submit full-length research papers, including those that have already been published elsewhere. Submissions should be relevant to the workshop theme and meet the standards of top-tier international research conferences.
Manuscripts must be submitted in PDF format and follow the official ACM two-column sigconf template.
- Full papers: maximum 9 pages (including appendix), plus unlimited pages for references and GenAI usage disclosure, presenting mature research results.
- Short papers: maximum 4 pages, plus unlimited pages for references and GenAI usage disclosure, presenting ongoing work, demos, or position and opinion papers.
- Any appendix must be included within the 9-page limit (excluding references).
- Submitted papers will undergo double-blind review: all submissions must be properly anonymized. Non-anonymized papers will be desk-rejected without review.
- LLM-generated text is prohibited unless part of the experimental analysis. AI tools may only be used for light editing (e.g., grammar checks).
- Accepted papers will not appear in formal proceedings, so future submission to other venues will not be affected.
Workshop Theme and Topics
Topics of interest for the Workshop on Trustworthy Knowledge Discovery and Data Mining (TrustKDD) include, but are not limited to, the following:
- Data Processing, Integration, and Generation Mechanisms: Methods for efficient data preprocessing, fusion, labeling, and valuation to reduce preparation costs and mitigate the risks associated with low-quality or AI-generated content.
- Robustness, Fairness, and Privacy in KDD: Techniques and frameworks that enhance the trustworthiness of KDD systems, including robustness to adversarial inputs, fairness-aware learning, transparency, privacy preservation, and mitigation of harmful or toxic outputs.
- Sustainable and Continual KDD Deployment: Approaches for energy-efficient training and deployment of KDD models, continual and life-long learning, automation with minimal human intervention, and adaptability to resource-constrained environments.
- Trustworthy KDD for Social Good: Applications in recommendation systems, knowledge graph construction, intelligent education, scientific discovery, healthcare, and other areas that contribute to societal well-being.
- Trustworthiness in Generative AI: Methods for addressing trust-related challenges in prompting, alignment, fine-tuning, and post-training of generative models, focusing on reducing hallucination, ensuring transparency, and preventing bias and misuse in generative KDD tasks.
Workshop Objectives, Goals, Target Audience, and Expected Outcomes
The TrustKDD workshop aims to advance research on trustworthy KDD in the generative AI era, focusing on how the entire KDD pipeline can be deployed in ways that support long-term, socially beneficial growth. Specifically, this workshop seeks to foster interdisciplinary collaboration, promote innovative solutions to trust-related challenges, and bridge the gap between classical KDD methods and emerging generative AI technologies. It also aims to accelerate exchange between researchers and engineers, especially around the practical deployment of trustworthy foundation models. By bringing together academic and industry perspectives from across the KDD community, the workshop supports the development of responsible and robust information systems aligned with CIKM’s mission.
Workshop Relevance
The TrustKDD workshop closely aligns with CIKM’s core themes of data mining, knowledge management, and artificial intelligence. As KDD becomes central to decision-making—especially with the rise of generative AI—trustworthiness is a growing concern for system developers, deployers, and end users alike. TrustKDD offers a timely venue to explore methods that improve the reliability, fairness, interpretability, and robustness of data-driven systems. By addressing key challenges from data processing to real-world deployment, the workshop supports CIKM’s mission to advance cutting-edge research and foster academic-industry collaboration.
Important Dates
| Milestone | Date |
|---|---|
| Paper Submission Deadline | September 15, 2025 |
| Paper Acceptance Notification | October 5, 2025 (AoE) |
| Workshop Date | November 14, 2025 |
Workshop Length and Schedule
The workshop will be a half-day event. We plan to invite keynote speakers from both academia and industry, and to feature presentations of original research papers that align with the workshop’s theme.
| Time | Activity |
|---|---|
| 09:00-09:05 | Start |
| 09:05-09:50 | Invited talk 1: Generalizable Generative Retrieval (Zhaochun Ren) |
| 09:50-10:30 | Invited talk 2: Towards Responsible and Trustworthy Foundation Models (Xiaoyuan Yi) |
| 10:30-11:00 | Coffee break + poster session (Location: 3F lobby, Auditorium) |
| 11:00-11:20 | Oral presentation 1: Reliable and Adaptive Node Classification in Multiplex Heterophilic Graphs |
| 11:20-11:40 | Oral presentation 2: ChemE-MTDS: Multi-Turn Dialogue Data Synthesis and Quality-Controlled Fine-Tuning for Chemical Engineering Large Language Models |
| 11:40-12:00 | Oral presentation 3: On the Diminishing Returns of Complex Robust RAG Training in the Era of Powerful LLMs |
| 12:00-12:15 | Best paper award |
| 12:15-12:20 | Closing |
Invited Talks
- Title: Generalizable Generative Retrieval
Abstract: Generative retrieval (GR) redefines information retrieval as generating document identifiers with large language models, enabling end-to-end optimization but facing challenges in generalization and adaptability. In this talk, I will present two complementary advances toward generalizable GR. First, ZeroGR introduces a scalable, instruction-driven framework for zero-shot retrieval, featuring a docID generator for heterogeneous documents, an instruction-tuned query generator for diverse corpus indexing, and a reverse-annealing decoding strategy to balance precision and recall. Second, I will discuss DOME, a docID-oriented model editing method that efficiently integrates new documents without full retraining by applying hybrid-label adaptive updates to key decoder layers. Together, ZeroGR and DOME demonstrate how instruction tuning and model editing jointly enhance GR’s robustness, scalability, and cross-domain generalization, paving the way for retrieval systems that continuously evolve with changing knowledge.
Bio: Dr. Zhaochun Ren is an Associate Professor at Leiden University, the Netherlands. He works on information retrieval and natural language processing, with an emphasis on conversational AI and recommender systems. He aims to develop intelligent agents that can address complex user requests and solve core challenges in NLP and IR toward that goal. His research has been recognized with multiple awards at RecSys, SIGIR, WSDM, EMNLP, and CIKM. Prior to joining Leiden, he was a Professor at Shandong University and a Research Scientist at JD.com.
- Title: On the Risks and Benefits of Human Values for LLMs
Abstract: The advent of generative LLMs has transformed how information is produced and consumed. Through frequent, deep interaction with human users, LLMs not only generate and spread traditional harmful content, e.g., social bias, hate speech, but can also embed biased value orientations in more implicit ways within their outputs. This talk focuses on LLM value orientations and their potential impacts, and delves into two questions: (1) Do LLMs inherently exhibit certain values, and how are these related to risky outputs? (2) If we proactively align AI with human values, what benefits follow? These investigations aim to inform better alignment methods and clearer alignment goals for the future.
Bio: Xiaoyuan Yi is a Senior Researcher at Microsoft Research Asia. He obtained his bachelor’s and doctoral degrees in computer science from Tsinghua University, and his research focuses on natural language generation and Societal AI. He has published 40+ papers at top-tier AI venues such as ICLR, NeurIPS, ACL, EMNLP, and AAAI, with 5000+ Google Scholar citations. His honors include the Best Paper Award and Best System Demonstration Award of the Chinese Conference on Computational Linguistics, the Rising Star Award of the IJCAI Young Elite Symposium, the Outstanding Doctoral Dissertation Award of the China Computer Federation (CCF), and recognition as a Rising Star in Social Computing by the Chinese Association for Artificial Intelligence (CAAI).
Organizers
Le Wu
Hefei University of Technology, China
Jindong Wang
College of William & Mary, USA
Ling Chen
University of Technology Sydney, Australia
Xiangyu Zhao
City University of Hong Kong, China
Kui Yu
Hefei University of Technology, China
Defu Lian
University of Science and Technology of China, China
Participation and Selection Process
We welcome participation from researchers, practitioners, and students in both academia and industry. Submissions will be evaluated through a rigorous peer-review process based on originality, technical quality, relevance to the workshop theme, and clarity of presentation. Accepted papers will be presented at the workshop, with opportunities for interactive discussion.
Contact
- Name: Le Wu
- Website: https://le-wu.com/
- Postal Address: Kejiao Building, No. 485, Danxia Road, Hefei University of Technology
- Email: lewu@hfut.edu.cn