Yandex, the leading Russian search engine, has introduced a new method of predicting user intent in order to improve the relevancy of its results. The Yandex team advise that the system, known as “Spectrum”, is based on analysing user queries statistically, categorising them, and then making sure that a range or “spectrum” of possible answers to the query is delivered in the search results pages based on popularity.
The issue with keywords that can have multiple user intentions behind them is that it is too easy for a search engine’s results to be dominated by the single most popular meaning associated with a particular keyword. Yandex says that around 20% of user queries on its engine are ambiguous. A query for “Apple” might mean the fruit or the company; likewise, someone searching for “pizza” might be looking for a restaurant or a recipe.
The new search technology has been developed by the Yandex team and focuses on allowing the search engine to return a whole “spectrum” of results matching a variety of user intents based on the frequency of user searches.
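As a rough illustration of what returning a “spectrum” based on frequency could look like, the sketch below splits a page of ten results across the meanings of a query in proportion to how often each meaning is searched for. Yandex has not published its method in this form: the function name `allocate_slots`, the page size of ten, the rounding rule and the counts for “apple” are all assumptions made up for the example.

```python
def allocate_slots(intent_counts, slots=10):
    """Split `slots` result positions across intents in proportion to
    how often each intent was observed (largest-remainder rounding).

    Hypothetical helper for illustration only; not Yandex's algorithm.
    """
    total = sum(intent_counts.values())
    if total == 0 or slots <= 0:
        return {}
    # Integer quota and remainder for each intent, using exact integer maths.
    shares = {
        intent: divmod(slots * count, total)
        for intent, count in intent_counts.items()
    }
    allocation = {intent: quota for intent, (quota, _) in shares.items()}
    leftover = slots - sum(allocation.values())
    # Hand the remaining slots to the intents with the largest remainders.
    by_remainder = sorted(shares, key=lambda i: shares[i][1], reverse=True)
    for intent in by_remainder[:leftover]:
        allocation[intent] += 1
    return allocation


# Made-up search counts for the ambiguous query "apple".
apple_counts = {"company": 620, "fruit": 270, "record label": 110}
apple_slots = allocate_slots(apple_counts)
print(apple_slots)
```

With these invented counts the less popular meanings still receive a share of the page instead of being crowded out by the dominant one, which is the effect the article describes.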
Yandex is attempting to analyse user searches regularly and to identify the most popular meanings behind them, based on the subsequent search paths of the users, then create categories which become associated with the particular keyword. This meaning-based categorisation means the search engine “knows” the range of most likely meanings and can cover that broader range within the immediate results, reducing the amount of time the user needs to spend hunting for answers.
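One simple way to picture how subsequent search paths can reveal meaning is to look at the query a user types straight after the ambiguous one. The sketch below is only a toy under stated assumptions: the session data, the `REFINEMENT_TO_MEANING` table and the function `meaning_shares` are hypothetical, and the real system presumably works from far richer signals.

```python
from collections import Counter

# Hypothetical table linking follow-up words to a meaning of "apple".
REFINEMENT_TO_MEANING = {
    "iphone": "company",
    "macbook": "company",
    "pie": "fruit",
    "calories": "fruit",
}


def meaning_shares(sessions, keyword):
    """Estimate how often each meaning of `keyword` is intended, using
    the query that immediately follows it in each search session."""
    counts = Counter()
    for session in sessions:
        for current, following in zip(session, session[1:]):
            if current != keyword:
                continue
            # Credit the first word in the follow-up that signals a meaning.
            for word in following.split():
                meaning = REFINEMENT_TO_MEANING.get(word)
                if meaning:
                    counts[meaning] += 1
                    break
    total = sum(counts.values())
    if total == 0:
        return {}
    return {meaning: count / total for meaning, count in counts.items()}


# Made-up search sessions, each a list of queries in the order typed.
sessions = [
    ["apple", "apple iphone price"],
    ["apple", "apple macbook review"],
    ["weather", "apple", "apple iphone case"],
    ["apple", "apple pie recipe"],
]
shares = meaning_shares(sessions, "apple")
print(shares)
```

In this invented log three of the four follow-up searches point to the company and one to the fruit, so the estimated shares are 75% and 25%.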
Each ambiguous keyword is analysed for objects such as personal names, films, books or cars. Each object is then classified into one or more categories. So, in the search query [panadol dosage] the medicine’s brand name ‘Panadol’ will be categorized as ‘medicine’, while the search term [casablanca] will be classified both into the ‘city’ and the ‘film’ categories. Currently, Spectrum uses about 60 (and counting) pre-defined categories.
Each category has a range of search intents, that is, the purposes for which users search. The ‘product’ category, for example, covers users who want to buy something or read customer reviews, so its search intents include ‘buy’, ‘reviews’ and ‘feedback’. A category may have anywhere from two or three search intents to dozens.
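The relationship between objects, categories and search intents can be pictured as two lookup tables. The sketch below is a minimal illustration, not Spectrum’s actual data model: the ‘panadol’ and ‘casablanca’ categories and the ‘product’ intents come from the examples above, while the intents listed for ‘medicine’, ‘city’ and ‘film’, and the names `OBJECT_CATEGORIES`, `CATEGORY_INTENTS` and `candidate_intents`, are invented placeholders.

```python
# Objects recognised in queries, mapped to one or more categories.
OBJECT_CATEGORIES = {
    "panadol": ["medicine"],
    "casablanca": ["city", "film"],
}

# Search intents per category.
CATEGORY_INTENTS = {
    "product": ["buy", "reviews", "feedback"],  # from the article
    "medicine": ["dosage", "side effects"],     # hypothetical
    "city": ["map", "weather"],                 # hypothetical
    "film": ["watch", "reviews"],               # hypothetical
}


def candidate_intents(query):
    """Return the categories found in `query` and the search intents
    associated with each of them."""
    found = {}
    for token in query.lower().split():
        for category in OBJECT_CATEGORIES.get(token, []):
            found[category] = CATEGORY_INTENTS.get(category, [])
    return found


print(candidate_intents("panadol dosage"))
print(candidate_intents("Casablanca"))
```

The first query resolves to a single category, while the second resolves to both ‘city’ and ‘film’, which is the kind of ambiguity the results page then has to cover.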
The query analysis undertaken by Yandex is fully automated, the company says. “Using the power of over a thousand processor cores, Spectrum analyses over 5 billion search queries on each analysis. The resulting database is kept up to date by repeating each analysis several times each week.”
Effectively, Yandex is using the historical experience of many searchers and searches to “guess” more accurately the range of popular intentions that are not visible in the query itself. In addition to search log statistics, Spectrum also uses information from reference sources and encyclopedias, such as Wikipedia. This helps the search engine to recognise new objects, learn about new meanings that do not fit any of the existing categories and add new categories.