In German SEO blogs and forums, few questions recur as often as how to deal with umlauts when optimising web pages. Umlauts are special characters in the German language that are not part of the ASCII character set originally used for web pages and URLs.
During the 25 years of the World Wide Web, several ways of representing umlauts within a limited set of characters have been created, from HTML entities such as &Auml; (Ä) to percent-encoded UTF-8 replacements such as %C3%84 (Ä). But even before that, since the early days of typewriters, there has been an easy way of replacing umlauts that is still widely used by German speakers when forced to type on a keyboard that lacks the correct keys: dropping the dots and adding an “e” after the letter, as in “AE” (Ä).
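The three representations mentioned above can be reproduced with Python's standard library. This is only an illustrative sketch: the function name `representations` and the `TYPEWRITER` table are made up for this example and are not something a browser or search engine exposes.

```python
from html.entities import codepoint2name
from urllib.parse import quote

# Hypothetical lookup table for the typewriter-style substitution.
TYPEWRITER = {"Ä": "AE", "Ö": "OE", "Ü": "UE", "ä": "ae", "ö": "oe", "ü": "ue"}


def representations(char):
    """Return common ASCII-only stand-ins for a single umlaut character."""
    return {
        "html_entity": "&" + codepoint2name[ord(char)] + ";",
        "percent_encoded": quote(char),  # UTF-8 bytes, percent-encoded
        "typewriter": TYPEWRITER[char],
    }


print(representations("Ä"))
# {'html_entity': '&Auml;', 'percent_encoded': '%C3%84', 'typewriter': 'AE'}
```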
When it comes to on-site keyword use, umlauts are mainly a concern for URLs, as it is still common to form these using only ASCII characters for ease of use and to avoid the risk of misrepresentation in certain web browsers as well as non-functioning links. But what are we to do with the sometimes significant search volume that we might miss if we ignore the alternative spellings and character representations? We had a look at some examples in order to shed some light on how Google handles these search queries.
First things first, though: at no point should we use alternatives to the correct umlauts in places visible to the user other than URLs, simply because it would be a misspelling and look truly awful to a German eye. In fact, one of the most common ways of getting round this is to use the full URL in titles, descriptions and on the webpage itself. But is it really necessary?
We looked at three keywords containing umlauts and the search results they returned. We also looked at the top ranking pages in order to find out how the keywords were used on them.
The first keyword is Rätsel (Puzzle). We compared the SERPs for the correct spelling (Rätsel), the substituted version ”Raetsel” and the plain misspelling ”Ratsel”. All these keywords have search volume: 60,500 average monthly searches for the correct spelling, 1,300 for the substituted version and 320 for the misspelt one.
We can see that 90% of the top 10 search results for ”Raetsel” are the same as for the correct spelling. However, for ”Ratsel” it’s only 40%.
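The article does not say exactly how the percentages were calculated. One plausible reading is that two top-10 lists are compared as sets of URLs, ignoring ranking position. A minimal sketch under that assumption, with a hypothetical function name `overlap_share` and placeholder URLs rather than real search results:

```python
def overlap_share(reference_results, other_results):
    """Share of reference URLs that also appear in the other result list."""
    reference = set(reference_results)
    if not reference:
        return 0.0
    return len(reference & set(other_results)) / len(reference)


# Placeholder URLs, not real search results.
correct = [f"https://example.com/page-{i}" for i in range(10)]
substituted = correct[:9] + ["https://example.com/other"]

print(overlap_share(correct, substituted))  # 0.9
```

Because sets are used, two lists containing the same pages in a different order still count as a 100% match.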
Looking at the actual search results, we can see that the dominating spelling in the URL is the substitution ”ae” and not the now often recommended UTF code for the ”ä”. Intrigued by this, we had a closer look at how the keywords were used on the actual pages. Of the top four pages, all had the ”Raetsel” in the URL. One actually had it in the page title, and one even in the H1 (in the form of the URL).
All pages had the correct ”Rätsel” in the H1 and page title as well as, of course, in the page content. More interestingly though, none of these pages had the misspelt ”Ratsel” anywhere at all, but Google still showed 40% of them for that query. The remaining 60% of results for ”Ratsel” either had the misspelling in the URL (50%) or did not contain it anywhere at all. Google is clearly getting quite good at this.
Does Google get confused by plurals?
Our next example is the keyword ”Äpfel” (Apples). The interesting aspect here is that, apart from the usual substitution ”Aepfel”, the third alternative ”Apfel” is actually a valid keyword, namely the singular form of the same fruit. It also has a very high search volume, making it definitely worth targeting. And in this case it’s even possible without upsetting any grammar or spelling rules.
Interestingly, here the top 10 search results for ”Äpfel” and ”Aepfel” are exactly the same, even though not all of them appear in exactly the same ranking positions.
Looking at the use of keywords on the top three pages, we find something interesting: this time, the substitution ”ae” is not present anywhere at all. The first two pages, being Wikipedia pages, use the UTF character replacement option to present the correct ”Ä” in the URL instead. The third-ranking page uses the singular ”Apfel” keyword. In fact, of the 10 top-ranking pages, only two use the ”Aepfel” form of representing the umlaut in the URL. Clearly, the webmasters of the other pages have rightly decided that the singular form ”Apfel” is the better choice.
When it comes to the on-page elements, we notice that ”Aepfel” isn’t represented anywhere at all. Again, the singular form ”Apfel” is predominant in H1s, titles and the content of the top-ranking pages, followed by ”Äpfel”, which is present in the content of all of the pages. We can also note that in several cases the ”apfel” is part of a compound noun, but still recognised and highlighted as a keyword by Google, supporting what I have said in my previous blog post.
Can skipping the dots make a keyword English?
The third and final keyword we looked at is ”Übermensch”. This one is interesting because dropping the dots turns it into a word commonly used in English, where this philosophical term usually isn’t translated. As English keyboards lack a key for the umlaut, it’s often spelled simply ”ubermensch”. In Germany, this form still has 140 monthly searches, as opposed to 1,300 for the correct spelling. The substitution ”Uebermensch”, on the other hand, is only searched for 20 times.
A look at the search result pages shows a similar picture to the first keyword we examined. The substitution ”Uebermensch” returns 90% identical pages. However, when searching for ”ubermensch”, only 30% of the results are the same. We suspect that 22,200 monthly searches in the US might lead Google to see this as an English keyword. The fact that, on Google.de, only 2 of the top 10 search results are in German, with everything else in English, reinforces this theory.
But how are the keywords used on the pages? Well, three of the English pages listed have the keyword without dots in content and URL, another one in the URL only. Otherwise the correct spelling dominates, even on the English pages listed. And the ”Uebermensch” is nowhere to be seen.
So how are we supposed to treat those umlauts then?
There are a few useful conclusions from this experiment that can help us to decide how to handle German words with umlauts as keywords better in our search engine optimisation efforts.
Firstly, it’s clear that the classic substitution of umlauts by skipping the dots and adding an ”e” instead works well in URLs. We can also conclude that Google does understand these keywords to be the same as the correct spelling. Thus we should use these in URLs and file names wherever we are unable to use UTF characters.
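For URLs and file names, the classic substitution is easy to automate. The following is a minimal sketch rather than a definitive implementation: `slugify_german` and `UMLAUT_MAP` are hypothetical names, and the handling of ”ß” as ”ss” is an addition not covered by the examples in this article.

```python
import re

# Classic substitution: drop the dots and add an "e".
# The "ß" -> "ss" entry is an extra assumption, not part of the experiment.
UMLAUT_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def slugify_german(text):
    """Build an ASCII-only, lower-case URL slug from German text."""
    substituted = text.translate(UMLAUT_MAP)
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", substituted.lower())
    return slug.strip("-")


print(slugify_german("Rätsel für Übermenschen"))
# raetsel-fuer-uebermenschen
```

Any other non-ASCII character is simply replaced by a hyphen here, so a real project would probably want to extend the table.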
Secondly, when the German word without the dots is a correct word, such as the singular of the same thing, we should consider targeting this as a separate keyword for a separate page. Combining it with the umlauted form could mean missed opportunities.
And lastly, we should avoid simply removing the dots and assuming Google will still recognise the keyword. While this often works for accents in French or Spanish, umlauts are not accents and Google will not treat them the same. In fact, simply dropping the dots is the least advisable way of using these keywords, yet it is often recommended to do just that for URLs and file names.
Additionally, doing this risks attracting English web pages to your search results whenever the dot-less version is widely used as a term by English speakers.
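One way the dot-less form can end up in URLs unintentionally is generic accent stripping, which treats umlauts just like French or Spanish accents. A minimal sketch, assuming a routine that decomposes characters with Unicode normalisation and drops the combining marks; `strip_diacritics` is a hypothetical helper and no particular CMS or library is implied:

```python
import unicodedata


def strip_diacritics(text):
    """Remove all combining marks, which also drops the umlaut dots."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for word in ("Rätsel", "Äpfel", "Übermensch", "café"):
    print(word, "->", strip_diacritics(word))
# Rätsel -> Ratsel
# Äpfel -> Apfel
# Übermensch -> Ubermensch
# café -> cafe
```

The output for the three German words is exactly the dot-less spelling discussed above, so it is worth checking what a slug generator does before relying on it.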