Multilingual web search taking a step up in Scandinavia

A multilingual web search project has received a research grant of DKK 2.5 million (roughly EUR 335,000) to investigate multilingual web search covering Scandinavian languages including Danish, Swedish, Norwegian, Finnish and Icelandic.

The grant will let the university research group work part time on the project for about a year without worrying about funding. The project is called Tvärsök, roughly translated as ‘cross search’: users will be able to search across all of the languages using phrases from only one of them.

Hercules Dalianis from the University in Stockholm is the project manager and also CEO of Euroling, the company behind the search engine technology to be used in the project. Euroling has developed the search engine Siteseeker, which is already in use on more than 80 public sector websites in Sweden. According to an article in Computer Sweden, Mr. Dalianis sees no problem in being both project manager and CEO of the company delivering the technology behind the search engine. He points out that Euroling has covered the cost of developing the technology and that, if the project succeeds, the results from the engine will be free to use and a demo of the search engine will be available for two years. He sees combining dictionaries and glossaries from the different languages as one of the major obstacles for the group to overcome, and expects that they might even have to do quite a bit of manual work to cross-reference the local language data.

The project is supposed to give the user the unique option of searching in, for example, Danish and receiving results drawn from all the languages. Technically, when a user enters a search query, the phrase is sent to a translation program, which then sends the translated search phrase to the different local language databases. For now, the results will be presented in separate frames, one per language, to give a better overview.
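The flow described above (translate the query, send it to each language database, show results per language) can be illustrated with a small Python sketch. This is not Tvärsök's or Siteseeker's actual code: the glossary entries, the sample documents and the function names are all made up for illustration, and a real system would use full dictionaries and proper search indexes instead of word-by-word lookup in a list of titles.

```python
from typing import Dict, List

SOURCE_LANG = "da"  # assumption: the user types the query in Danish

# Hypothetical mini glossary: Danish word -> translation per target language.
GLOSSARY: Dict[str, Dict[str, str]] = {
    "is": {"sv": "glass", "no": "is"},
    "bil": {"sv": "bil", "no": "bil"},
}

# Hypothetical stand-ins for the local language databases.
INDEXES: Dict[str, List[str]] = {
    "da": ["Hjemmelavet is til sommeren", "Brugt bil til salg"],
    "sv": ["Hemgjord glass med jordgubbar", "Begagnad bil till salu"],
    "no": ["Hjemmelaget is med jordbær", "Brukt bil til salgs"],
}


def translate_query(query: str, target_lang: str) -> str:
    """Translate word by word; unknown words are passed through unchanged."""
    words = query.lower().split()
    if target_lang == SOURCE_LANG:
        return " ".join(words)
    return " ".join(GLOSSARY.get(w, {}).get(target_lang, w) for w in words)


def search_index(documents: List[str], query: str) -> List[str]:
    """Return the documents that contain every query term as a whole word."""
    terms = query.split()
    if not terms:
        return []
    return [d for d in documents if all(t in d.lower().split() for t in terms)]


def cross_search(query: str) -> Dict[str, List[str]]:
    """Run the translated query against each language; group hits by language."""
    return {
        lang: search_index(docs, translate_query(query, lang))
        for lang, docs in INDEXES.items()
    }


if __name__ == "__main__":
    # One result group per language, similar to the separate frames in the demo.
    for lang, hits in cross_search("is").items():
        print(lang, hits)
```

The Danish query "is" (ice cream) shows why translation matters even between closely related languages: the Norwegian word is the same, but a Swedish page would use "glass", so an untranslated query would miss it.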

Many Scandinavians are able to read and understand their neighbours' languages, but their ability to actually search in other Scandinavian languages is very limited, because searching requires knowing which words and phrases to use. The languages of the region are very similar in many cases, but after more than 1,000 years of development there are some important differences, including grammar rules, the number of words used to make up sentences, pronunciation and word variations. Tvärsök seeks to solve some of these problems and make a universal Scandinavian search engine possible.

The future for Scandinavian web searchers certainly looks a little brighter.

Read more on Information retrieval with human language technology
Tvärsök Homepage
