Robust training objectives improve embedding-based retrieval in industrial recommendation systems
arXiv preprint arXiv:2409.14682, 2024
Improving recommendation systems (RS) can greatly enhance the user experience across many domains, such as social media. Many RS use embedding-based retrieval (EBR) to retrieve candidates for recommendation. In an EBR system, embedding quality is key. According to recent literature, self-supervised multitask learning (SSMTL) has shown strong performance on academic benchmarks in embedding learning and has yielded an overall improvement across multiple downstream tasks, demonstrating greater resilience to adverse conditions across downstream tasks and thereby increased robustness and task generalization through the training objective. However, whether the academic success of SSMTL as a robust training objective translates to large-scale industrial RS (i.e., hundreds of millions of users and the interactions between them) still requires verification. Simply adopting academic setups in industrial RS might entail two issues. First, many self-supervised objectives require data augmentations (e.g., embedding masking/corruption) over a large portion of users and items, which is prohibitively expensive in industrial RS. Second, some self-supervised objectives might not align with the recommendation task, which might lead to redundant computational overhead or negative transfer. In light of these two challenges, we evaluate a robust training objective, specifically SSMTL, in a large-scale friend recommendation system on a social media platform in the tech sector, to determine whether this increase in robustness carries over at scale and enhances retrieval in the production setting. Through online A/B testing with SSMTL-based EBR, we observe statistically significant increases in key friend recommendation metrics, with up to 5.45% improvement in new friends made and 1.91% improvement in new friends made with cold-start users.
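To make the retrieval step in the abstract concrete, the following is a minimal sketch of embedding-based retrieval: candidates are scored against a user embedding and the top k are returned. It is an illustration under stated assumptions, not the system described in the paper. The function names (`normalize`, `retrieve_top_k`), the use of cosine similarity, the exhaustive scan, and the toy embeddings are all hypothetical; a production system would typically use an approximate nearest-neighbor index instead.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve_top_k(query, candidates, k):
    """Return the ids of the k candidates most similar to the query.

    Similarity is cosine similarity, an assumption made for this sketch.
    """
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(emb))), cand_id)
        for cand_id, emb in candidates.items()
    )
    # nlargest keeps only the k best (score, id) pairs, best first.
    return [cand_id for _, cand_id in heapq.nlargest(k, scored)]


# Toy, made-up embeddings for illustration only.
user_embedding = [1.0, 0.0]
candidate_embeddings = {
    "alice": [0.9, 0.1],
    "bob": [0.0, 1.0],
    "carol": [0.7, 0.7],
}
top = retrieve_top_k(user_embedding, candidate_embeddings, k=2)
print(top)
```

Because ranking depends only on the embeddings, any gain in embedding quality from the training objective shows up directly in which candidates are retrieved, which is why the abstract treats embedding quality as the key factor.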