Matching between two information objects is at the core of many information retrieval (IR) applications, including Web search, question answering, and recommendation. Recently, deep learning methods have achieved immense success in speech recognition, computer vision, and natural language processing, significantly advancing the state of the art in these areas. In the IR community, deep learning has also attracted much attention, and researchers have proposed a large number of deep matching models to tackle the matching problem for different IR applications. Although deep matching models have made significant progress in these areas, many challenges remain when applying these models to real IR scenarios. In this workshop, we focus on the applicability of deep matching models to practical applications. We aim to discuss the issues of applying deep matching models to production systems, as well as to shed some light on the fundamental characteristics of different matching tasks in IR.
|9:00 am||Opening Remarks|
|9:05-09:50||Keynote Speaker: Maarten de Rijke|
|Title: Neural Outfit Recommendation [pdf]|
|9:50-10:20||Invited talk: Rui Yan|
|Title: Recent Advances and Challenges on Human-Computer Conversational Systems [pdf]|
|10:20-10:30||Release Dataset Walkthrough [pdf]|
|10:30-11:00||Coffee break and Poster session|
|11:00-11:45||Keynote Speaker: Hang Li|
|Title: Framework and Principles of Matching Technologies [pdf]|
|11:45-12:20||Invited talk: Yiwei Song|
|Title: Deep Semantic Matching in Amazon Product Search [pdf]|
|12:20-12:30||Matching toolkit walkthrough [pdf]|
Maarten de Rijke is a full professor of Information Retrieval in the Informatics Institute at the University of Amsterdam. He holds MSc degrees in Philosophy and Mathematics (both cum laude), and a PhD in Theoretical Computer Science. He worked as a postdoc at CWI before becoming a Warwick Research Fellow at the University of Warwick, UK. He joined the University of Amsterdam in 1998 and was appointed full professor in 2004. He is a member of the Royal Dutch Academy of Sciences (KNAW). De Rijke leads the Information and Language Processing Systems group, one of the world’s leading academic research groups in information retrieval. His research focuses on intelligent information access, with projects on self-learning search engines, semantic search, and social media analytics. He is the director of Amsterdam Data Science. He is a former director of the Intelligent Systems Lab (ISLA), of the Center for Creation, Content and Technology (CCCT), and of the University of Amsterdam’s Ad de Jonge Center for Intelligence and Security Studies.
Title: Neural Outfit Recommendation
Most previous work on outfit recommendation focuses on designing visual features to enhance recommendations. Existing work neglects user comments on fashion items, which have been shown to be effective in generating explanations along with better recommendation results. We propose a novel neural network framework, neural outfit recommendation (NOR), that simultaneously provides outfit recommendations and generates abstractive comments. NOR consists of two parts: outfit matching and comment generation. For outfit matching, we propose a convolutional neural network with a mutual attention mechanism to extract visual features. The visual features are then decoded into a rating score for the matching prediction. For abstractive comment generation, we propose a gated recurrent neural network with a cross-modality attention mechanism to transform visual features into a concise sentence. The two parts are jointly trained based on a multi-task learning framework in an end-to-end back-propagation paradigm. Extensive experiments conducted on an existing dataset and a collected real-world dataset show that NOR achieves significant improvements over state-of-the-art baselines for outfit recommendation.
Hang Li is a director of the AI Lab at Bytedance Technology (also known as Toutiao) and an adjunct professor at Peking University and Nanjing University. He is an IEEE Fellow and an ACM Distinguished Scientist. His research areas include natural language processing, information retrieval, machine learning, and data mining. Hang graduated from Kyoto University in 1988 and earned his PhD from the University of Tokyo in 1998. He worked at NEC Research as a researcher from 1990 to 2001, at Microsoft Research Asia as a senior researcher and research manager from 2001 to 2012, and at Huawei Noah’s Ark Lab as chief scientist and director from 2012 to 2017. He joined Bytedance in 2017. Hang has published three technical books and more than 120 technical papers at top international conferences, including SIGIR, WWW, WSDM, ACL, EMNLP, ICML, NIPS, SIGKDD, AAAI, and IJCAI, and in top international journals, including CL, NLE, JMLR, TOIS, IRJ, IPM, TKDE, TWEB, and TIST. Papers by him and his colleagues received the SIGKDD’08 best application paper award, the SIGIR’08 best student paper award, and the ACL’12 best student paper award.
Title: Concepts and Principles of Matching Technologies
Many application problems can be formalized as matching between two sets of objects or two sequences of objects. Examples of the former include user-item matching in recommendation, and examples of the latter include query-title matching in search. Although matching techniques have been widely used in practice, they still lack a thorough study from a general viewpoint. In this talk, I will give an overview of matching technologies, particularly those used in search and recommendation. I will describe a formulation of the matching problem, summarize the major concepts with regard to matching, and compare matching with other tasks. I will propose several principles for the development of matching technologies. I will also introduce related work at the AI Lab of Bytedance Technology.
Rui Yan is an assistant professor at Peking University and an adjunct professor at Central China Normal University and Central University of Finance and Economics. He was a Senior Researcher at Baidu Inc. He has investigated several open-domain conversational systems and dialog systems in vertical domains. To date, he has published more than 50 highly competitive peer-reviewed papers. He serves as a (senior) program committee member of several top-tier venues (such as KDD, SIGIR, ACL, WWW, IJCAI, AAAI, CIKM, EMNLP).
Title: Recent Advances and Challenges on Human-Computer Conversational Systems
Abstract: Nowadays, automatic human-computer conversational systems have attracted great attention from both industry and academia. Intelligent products such as XiaoIce (by Microsoft) have been released, and many Artificial Intelligence companies have been established. We see that the technology behind conversational systems is accumulating and is gradually becoming open to the public. Through the work of researchers, conversational systems are more than science fiction: they have become real. It would be interesting to review the recent development of human-computer conversational systems, especially the significant changes brought by deep learning techniques.
Yiwei Song is a senior applied scientist at Amazon and currently focuses on building a semantic matching model for Amazon product search. He previously worked on Click Through Rate (CTR) modeling and semantic matching on the sponsored products team. He has published 3 papers on semantic matching in product search and CTR modeling at Amazon internal machine learning conferences. He obtained his Ph.D. in Electrical Engineering from the University of Illinois at Chicago and has published more than 10 papers in top information theory conferences and journals, including IEEE Transactions on Information Theory. He also has several patents granted in the areas of data storage systems and wireless communications.
Title: Deep Semantic Matching in Amazon Product Search
Abstract: In this talk, we will describe the journey of building a semantic matching engine for helping customers effortlessly search and shop at Amazon. We will present various experiments of using neural networks to compute the semantic similarity between a query and product, represented as a combination of text and behavioral features (such as review rating, number of reviews, sales, etc.). We will report results from various neural architectures, loss functions, training data, and tokenization methods including strategies for handling novel queries. Overall, we are going to share some learnings from training a neural network-based information retrieval engine that significantly improved offline metrics, customer experience and key business metrics. The talk closes with a description of the challenges that require further innovation in science and engineering to build a deep learning-based semantic search engine.
Institute of Computing Technology, CAS
CICS, UMass Amherst
Amherst, MA, USA
Senior Research Manager
Head of Data Science
New York City, USA
Senior Director of Research
Institute of Computing Technology, CAS
Chen Qu (University of Massachusetts Amherst)
Marc Najork (Google)
Daniel Hill (Amazon)
Daniel Cohen (University of Massachusetts Amherst)
Xuanhui Wang (Google)
Liu Yang (University of Massachusetts Amherst)
Shangsong Liang (Sun Yat-sen University)
Choo Hui (Amazon)
Jun Xu (Renmin University of China)
Keping Bi (University of Massachusetts Amherst)
Wayne Xin-Zhao (Renmin University of China)