eBay has developed a new recommendation model based on natural language processing (NLP) techniques, in particular the BERT model. The new model, referred to as the "ranker", uses the distance between title embeddings as a feature, allowing it to analyze the information in product titles from a semantic point of view.
The ranker increased purchases, clicks and ad revenue by 3.76%, 2.74% and 4.06% respectively, compared with the previous model on the native apps (Android and iOS) and the web platform.
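The article does not specify which distance measure the ranker applies to the title embeddings; the minimal sketch below assumes cosine similarity over BERT-sized (768-dimensional) vectors, with randomly generated vectors standing in for real title encodings.

```python
import numpy as np

def title_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two title embeddings, usable as a ranker feature."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 768-dimensional title embeddings (BERT-sized) for a seed item
# and a candidate recommendation; real vectors would come from the encoder.
rng = np.random.default_rng(0)
seed_embedding = rng.normal(size=768)
candidate_embedding = rng.normal(size=768)

semantic_feature = title_similarity(seed_embedding, candidate_embedding)
print(f"semantic similarity feature: {semantic_feature:.3f}")
```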
A buyer's shopping journey passes through different stages, often referred to as the "funnel". In the "lower funnel" stage, buyers have already identified the product or type of item they are interested in on the eBay marketplace and clicked on a product listing. The listing page contains several modules that recommend items to the buyer around different themes. The top module, "Similar Sponsored Items", recommends items similar to the main item on the page, which is called the "seed" item.
Machine learning ranking engine
The recommendation engine works in three stages:
1. Retrieve a subset of candidate PLS items (the "recall set") that are most relevant to the seed item.
2. Apply a trained machine learning ranker to order the listings in the recall set by their probability of purchase.
3. Reorder the listings by incorporating the seller's ad rate, balancing the advertising rate the seller chose when promoting the listing against the relevance of the recommendations (see the sketch after this list).
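Putting the three stages together, a minimal sketch of the pipeline might look like the following. The `Listing` fields, the additive blending of relevance score with ad rate, and the `ad_rate_weight` parameter are illustrative assumptions, not eBay's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Listing:
    item_id: str
    title: str
    ad_rate: float  # seller-chosen promotion rate, e.g. 0.05 == 5%

def recommend(
    seed: Listing,
    inventory: List[Listing],
    recall: Callable[[Listing, List[Listing]], List[Listing]],
    rank_score: Callable[[Listing, Listing], float],
    ad_rate_weight: float = 0.1,   # assumed blending weight
    top_k: int = 10,
) -> List[Listing]:
    # Stage 1: recall a candidate set of promoted listings relevant to the seed item.
    candidates = recall(seed, inventory)

    # Stage 2: score each candidate with the trained ranker (probability of purchase).
    scored = [(rank_score(seed, c), c) for c in candidates]

    # Stage 3: re-rank by blending relevance with the seller's ad rate.
    reranked = sorted(
        scored,
        key=lambda sc: sc[0] + ad_rate_weight * sc[1].ad_rate,
        reverse=True,
    )
    return [c for _, c in reranked[:top_k]]
```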
The ranking model in stage 2 of the engine is trained offline on historical data. Its features, or variables, include data such as the following (see the sketch after this list):
- Historical product recommendation data
- Similarity between the recommended item and the seed item
- Context (country, product category)
- User personalization features
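As an illustration of how these signals might be combined into a single feature vector for one (seed item, candidate) pair, here is a sketch; the feature names and the particular signals chosen are assumptions, not eBay's actual feature set.

```python
from typing import Dict

def build_ranker_features(
    title_similarity: float,      # semantic similarity between the two title embeddings
    historical_ctr: float,        # historical click-through rate of the recommended item
    country: str,                 # marketplace context, e.g. "US"
    same_leaf_category: bool,     # seed and candidate share a product category
    user_price_affinity: float,   # example personalization signal
) -> Dict[str, float]:
    """Assemble the feature vector fed to the offline-trained ranking model."""
    return {
        "title_embedding_similarity": title_similarity,
        "historical_ctr": historical_ctr,
        "same_leaf_category": float(same_leaf_category),
        "country_is_us": float(country == "US"),
        "user_price_affinity": user_price_affinity,
    }

features = build_ranker_features(0.87, 0.042, "US", True, 0.6)
print(features)
```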
A fine-tuned Siamese MicroBERT network
The well-known BERT model demonstrates robust performance across a wide range of language-understanding tasks. BERT is pre-trained on an extensive amount of unlabeled data – sentences from the Wikipedia and Toronto Books corpora – which allows it to recognize semantically similar words (i.e., synonyms) and to generate contextual word embeddings. For example, the embedding of the word "server" differs between the sentence "The server crashed" and the sentence "Can you ask the server for the bill?". A pre-trained BERT model can be fine-tuned for different NLP problems by adding task-specific layers that map the contextualized word embeddings to the desired output.
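As a concrete illustration of contextual embeddings (using the public `bert-base-uncased` checkpoint from the Hugging Face Transformers library, not eBay's internal fine-tuned model), the sketch below extracts the vector for "server" in the two example sentences and compares them.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def contextual_embedding(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual vector of `word`'s token within `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # shape: (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

a = contextual_embedding("The server crashed", "server")
b = contextual_embedding("Can you ask the server for the bill?", "server")
similarity = torch.nn.functional.cosine_similarity(a, b, dim=0).item()
print(f"cosine similarity between the two 'server' embeddings: {similarity:.2f}")
```

Because the vectors are context-dependent, the similarity printed here is well below 1.0, unlike a static word embedding where both occurrences would be identical.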
This new ranking model achieves a 3.5% improvement in purchase rank (the average rank of the item sold), but its complexity makes real-time recommendations difficult. Therefore, title embeddings are generated by a daily batch job and stored in NuKV (eBay's cloud key-value store), with item titles as keys and embeddings as values.
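NuKV's client API is internal to eBay, so the sketch below uses a plain dictionary to illustrate the batch layout (title as key, embedding as value); the `embed_title` placeholder stands in for the fine-tuned encoder.

```python
from typing import Dict, Iterable
import numpy as np

# Stand-in for the key-value store: title -> embedding.
kv_store: Dict[str, np.ndarray] = {}

def embed_title(title: str) -> np.ndarray:
    """Placeholder for the fine-tuned title encoder; returns a fixed-size vector."""
    rng = np.random.default_rng(abs(hash(title)) % (2**32))
    return rng.normal(size=768).astype(np.float32)

def daily_batch_job(titles: Iterable[str]) -> None:
    """Recompute embeddings for all listing titles and upsert them into the store."""
    for title in titles:
        kv_store[title] = embed_title(title)

daily_batch_job(["Apple iPhone 13 128GB Blue", "Samsung Galaxy S22 Ultra 256GB"])
print(len(kv_store), "titles embedded and stored")
```

At serving time, the ranker can then look up precomputed embeddings by title instead of running the encoder on every request.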