Historically, more than half of the sales through eBay Inc.'s online auction site are made overseas. There was a problem, though: Users on the site saw only listings posted in their preferred language.
People could change their settings and browse the site in different languages, one at a time. But they had little visibility into items listed in languages other than the selected one. Because different sellers post the same items in a variety of languages, individual users were seeing only a fraction of the listings for a given product. Also, some languages weren't well represented on the site, so users who preferred them saw very few listings.
To make the platform more of a unified global marketplace, eBay has turned to machine learning techniques to translate pages automatically. "By enabling translation of user queries, we could help those people who have insufficient English knowledge," said Evgeny Matusov, senior research manager of machine translation at the San Jose, Calif., company.
This kind of natural language processing (NLP) has come to be one of the primary use cases for deep learning -- advanced machine learning techniques that tackle some of the most challenging computing problems. Leading tech companies like Google, Facebook, Microsoft and IBM are using deep learning to improve computer vision, speech recognition and artificial intelligence.
Translation is a tricky undertaking at eBay, where the company traffics in brand names and hosts a world of user-generated jargon. Should the word apple be translated in a listing? In some cases, perhaps, but most often it probably refers to the electronics company, in which case it should not be translated. How do you handle an abbreviation like NIB, which is often used in eBay listings to stand for new in box? Translating the underlying phrase into another language would produce a different acronym, but many eBay users recognize NIB even if they aren't using English.
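One common way to keep such terms intact is to swap them for placeholders before translation and restore them afterward. The following is a minimal Python sketch of that idea, not a description of eBay's system, which the article does not detail. The term list, the function names and the `translate` callable are all hypothetical, and the case-sensitive match is only a crude stand-in for deciding whether "apple" means the brand.

```python
import re

# Hypothetical do-not-translate list (assumption, not eBay's actual data).
PROTECTED_TERMS = ["Apple", "NIB"]


def mask_protected(text, terms=PROTECTED_TERMS):
    """Replace protected terms with placeholder tokens.

    Returns the masked text and a mapping from token to original term.
    Matching is case-sensitive, so lowercase "apple" is left alone.
    """
    mapping = {}

    def substitute(match):
        token = f"__TERM{len(mapping)}__"
        mapping[token] = match.group(0)
        return token

    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(substitute, text), mapping


def unmask(text, mapping):
    """Put the original terms back in place of their placeholder tokens."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def translate_listing(title, translate):
    """Translate a title while leaving protected terms untouched.

    `translate` is any callable that maps a string to a string; it is
    assumed to pass the placeholder tokens through unchanged.
    """
    masked, mapping = mask_protected(title)
    return unmask(translate(masked), mapping)
```

Any translation function can be passed in as `translate`. A production system would need context to tell the brand from the fruit, which is the kind of determination the article says eBay's algorithms are built to make.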
Matusov and his team have developed a set of statistical algorithms that use machine learning techniques and natural language processing to make these determinations. The process starts with a team of translators who manually translate sample listings into different languages. These translations, along with other data scraped from the web and news clippings, make up the data used to train the algorithms.
The team uses a manual approach to training and building the NLP algorithms because it gives them more control over the output, Matusov said. In the past few years, neural networks have become the most common approach to NLP and many other deep learning projects. But Matusov said neural networks involve a degree of self-directed learning that could lead to unfavorable results in this particular setting, given the challenges of brand names and esoteric vocabulary.
The system uses a mix of homegrown tools, as well as some open source and proprietary software. (Matusov declined to specify exactly which tools are in use, citing corporate policy.) It has been live since its original launch in Russia in 2014, and usage has since spread to Latin America and Europe.
Matusov said the translation algorithms have been partially responsible for an increase in items sold from the U.S. and Europe to Russia, Mexico, Argentina and Brazil. "We definitely had a big impact with both buyers and sellers in those places," he said.