Research Topics
Our research topics include (but are not limited to):
Search and Ranking
Natural Language Processing
Multimedia Retrieval
Artificial Intelligence
News
Recent
2024-05 "Exploring Memorization in Fine-tuned Language Models" is accepted by ACL 2024!
2024-05 "The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)" is accepted by ACL 2024!
2024-03 "Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method" is accepted by NAACL 2024!
2024-03 "A Robust Semantics-based Watermark for Large Language Model against Paraphrasing" is accepted by NAACL 2024!
2024-03 "MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion" is accepted by NAACL 2024!
2023-10 "Text-Video Retrieval via Multi-Modal Hypergraph Networks" is accepted by WSDM 2024!
2023-10 "Large Language Models for Data Augmentation in Recommendation" is accepted by WSDM 2024!
2023-10 "Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents" is accepted by EMNLP 2023!
2023-10 "DiQAD: A Benchmark Dataset for Open-domain Dialogue Quality Assessment" is accepted by EMNLP 2023!
2023-10 "GS2P: A Generative Pre-trained Learning to Rank Model with Over-parameterization for Web-Scale Search" has received IEEE DSAA 2023 Best Paper Award!
2023-08 "Learning to Tokenize for Generative Retrieval" is accepted by NeurIPS 2023!
2023-08 "I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval" is accepted by CIKM 2023!
2023-05 "Boosting Event Extraction with Denoised Structure-to-Text Augmentation" is accepted by ACL 2023!
2023-05 "Are Message Passing Neural Networks Really Helpful for Knowledge Graph Completion?" is accepted by ACL 2023!
2023-05 Three papers are accepted by KDD 2023!
2023-04 Two papers are accepted by SIGIR 2023!
Recent Publications
Qian Li, Lixin Su, Jiashu Zhao, Long Xia, Hengyi Cai, Suqi Cheng, Hengzhu Tang, Junfeng Wang, Dawei Yin. Text-Video Retrieval via Multi-Modal Hypergraph Networks. Accepted by WSDM 2024.
Wei Wei, Xubin Ren, Jiabin Tang, Qinyong Wang, Lixin Su, Suqi Cheng, Junfeng Wang, Dawei Yin and Chao Huang. Large Language Models for Data Augmentation in Recommendation. Accepted by WSDM 2024.
Weiwei Sun, Lingyong Yan, Xinyu Ma, Shuaiqiang Wang, Pengjie Ren, Zhumin Chen, Dawei Yin, Zhaochun Ren. Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents. Accepted by EMNLP 2023.
Yukun Zhao*, Lingyong Yan*, Weiwei Sun, Chong Meng, Shuaiqiang Wang, Zhicong Cheng, Zhaochun Ren, Dawei Yin. DiQAD: A Benchmark Dataset for Open-domain Dialogue Quality Assessment. Accepted by EMNLP 2023 Findings. (* equal contribution)
Weiwei Sun, Lingyong Yan, Zheng Chen, Shuaiqiang Wang, Haichao Zhu, Pengjie Ren, Zhumin Chen, Dawei Yin, Maarten de Rijke, Zhaochun Ren. Learning to Tokenize for Generative Retrieval. Accepted by NeurIPS 2023.
Qian Dong, Yiding Liu, Qingyao Ai, Haitao Li, Shuaiqiang Wang, Yiqun Liu, Dawei Yin, Shaoping Ma. I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval. In CIKM 2023.
Yuchen Li, Haoyi Xiong, Linghe Kong, Zeyi Sun, Hongyang Chen, Shuaiqiang Wang, and Dawei Yin. MPGraf: a Modular and Pre-trained Graphformer for Learning to Rank at Web-scale. In ICDM 2023.
Yubao Tang, Ruqing Zhang, Jiafeng Guo, Jiangui Chen, Zuowei Zhu, Shuaiqiang Wang, Dawei Yin, Xueqi Cheng. Semantic-Enhanced Differentiable Search Index Inspired by Learning Strategies. In KDD 2023 (Applied Data Science track).
Rong Huang, Danfeng Zhang, Weixue Lu, Han Li, Meng Wang, Daiting Shi, Jun Fan, Zhicong Cheng, Simiu Gu, Dawei Yin. Learning Discrete Document Representations in Web Search. In KDD 2023 (Applied Data Science track).
Yuchen Li, Haoyi Xiong, Linghe Kong, Qingzhong Wang, Shuaiqiang Wang, Guihai Chen, Dawei Yin. S2phere: Semi-Supervised Pre-training for Web Search over Heterogeneous Learning to Rank Data. In KDD 2023 (Applied Data Science track).
Bo Wang, Heyan Huang, Xiaochi Wei, Ge Shi, Xiao Liu, Chong Feng, Tong Zhou, Shuaiqiang Wang, Dawei Yin. Boosting Event Extraction with Denoised Structure-to-Text Augmentation. In ACL 2023 Findings.
Juanhui Li, Harry Shomer, Jiayuan Ding, Yiqi Wang, Yao Ma, Neil Shah, Jiliang Tang, Dawei Yin. Are Message Passing Neural Networks Really Helpful for Knowledge Graph Completion? In ACL 2023.
Xubin Ren, Chao Huang, Lianghao Xia, Jiashu Zhao and Dawei Yin. Disentangled Contrastive Collaborative Filtering. In SIGIR 2023.
Juanhui Li, Wei Zeng, Suqi Cheng, Yao Ma, Jiliang Tang, Shuaiqiang Wang and Dawei Yin. Graph Enhanced BERT for Query Understanding. In SIGIR 2023 (Industry Track).
Dan Luo, Lixin Zou, Qingyao Ai, Zhiyu Chen, Dawei Yin, Brian D Davison. Model-based Unbiased Learning to Rank. In WSDM 2023.
Yuchen Li, Haoyi Xiong, Qingzhong Wang, Linghe Kong, Hao Liu, Haifang Li, Jiang Bian, Shuaiqiang Wang, Guihai Chen, Dejing Dou, Dawei Yin. COLTR: Semi-supervised Learning to Rank with Co-training and Over-parameterization for Web Search. In TKDE 2023.
Recruiting
We are hiring researchers who are driven to innovate in the areas of Information Retrieval, Natural Language Processing, Multimedia, Data Mining, Machine Learning and Artificial Intelligence. If interested, please send your CV to search_science@baidu.com.
Job Requirements
- Master's or PhD degree in Computer Science or equivalent.
- Experience in Information Retrieval & Web Search, Natural Language Processing, Multimedia, and other related areas.
- Publications at top-tier peer-reviewed conferences or journals.
- Proven track record of innovation in creating novel algorithms and advancing the state of the art.
Location:
Baidu Technology Park,
Haidian District,
Beijing, China
Email:
search_science@baidu.com
Call:
(+86 10)5992 8888