Software Imagineer's blog

Hacking my language learning process

Sat Oct 07 2017

Learning languages is good for your brain. Scientists say it can help prevent Alzheimer’s disease. Others say it changes your perception of the world. But most of all, learning languages is fun. Well… It is fun when you are in a bar with a multicultural group of friends and you learn how to say “Cheers!” and “Thank you” in five different languages. Or when you learn how to say “my bear drinks beer” in Russian on Duolingo. But past the basics, learning languages is quite hard. I am definitely feeling the struggle while learning Danish, and this motivates me to look for ways to hack the learning process.

One of the hacks that has caught my attention is the “1000 words theory”. The idea is that you should be able to read an article in a foreign language by knowing its 1000 most common words. I honestly think it is a great sales pitch for learning languages, no matter if it is true or not. It is much better than showing someone a French dictionary with 270,000 words, or a Lithuanian dictionary with 500,000 words. I was curious how the “1000 words theory” would fare in practice, so I decided to create a Chrome browser extension which would visualize it for me.

The extension allows me to mark words as known or unknown on any website. These words are saved, so the next time I am reading an article I will see which words I have already marked. You can see how it works in the screenshot below.
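To make the idea of saved marks concrete, here is a minimal sketch in Python. It is not the extension's code: the `WordStore` class, its method names and the JSON file are all hypothetical, and a real Chrome extension would more likely keep this data in browser storage. It only shows the shape of the data, a mapping from each word to "known" or "unknown" that survives between reading sessions.

```python
import json
from pathlib import Path
from typing import Optional


class WordStore:
    """Hypothetical store that maps each marked word to 'known' or 'unknown'."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load marks from an earlier session if the file exists, else start empty.
        if path.exists():
            self.marks = json.loads(path.read_text(encoding="utf-8"))
        else:
            self.marks = {}

    def mark(self, word: str, status: str) -> None:
        """Record a word as 'known' or 'unknown' and save immediately."""
        if status not in ("known", "unknown"):
            raise ValueError("status must be 'known' or 'unknown'")
        # Lowercase so that "Bjørn" and "bjørn" count as the same word.
        self.marks[word.lower()] = status
        self.path.write_text(
            json.dumps(self.marks, ensure_ascii=False), encoding="utf-8"
        )

    def status(self, word: str) -> Optional[str]:
        """Return the saved status of a word, or None if it was never marked."""
        return self.marks.get(word.lower())
```

Saving on every mark keeps the sketch simple. Anything that needs to handle many marks per second would probably batch the writes instead.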

Danish article after learning 1000 words

The screenshot was taken after marking 1000 Danish words. The words I have marked are not exactly the most common in the language, but rather words commonly used in news articles. In articles covering daily life topics, I have seen up to 80% known words, and I rarely see a number below 50%. You could argue that a lot of complex words are still unmarked in the article. But I think that if you know English and basic Danish grammar rules, you would get close to 90% of the meaning.
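The percentage of known words can be computed in a few lines. The sketch below is in Python and rests on assumptions: the function name `known_word_percentage` is made up for illustration, and it counts every occurrence of a word (so a repeated known word counts each time), which may or may not match how the extension counts.

```python
import re


def known_word_percentage(text: str, known_words: set) -> float:
    """Return the percentage of word occurrences in `text` found in `known_words`."""
    # [^\W\d_]+ matches runs of letters only, including Danish æ, ø and å.
    tokens = [token.lower() for token in re.findall(r"[^\W\d_]+", text)]
    if not tokens:
        return 0.0
    known_count = sum(1 for token in tokens if token in known_words)
    return 100.0 * known_count / len(tokens)


known = {"jeg", "er", "en", "og"}
sentence = "Jeg er en bjørn og jeg drikker øl."
print(known_word_percentage(sentence, known))  # 5 of 8 words are known: 62.5
```

Counting unique words instead of occurrences would give a lower number for most articles, because the most common words repeat a lot.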

I think the “1000 words theory” is a good approximation, but your mileage will vary depending on the language you are trying to learn. Knowing basic grammar rules is also very important, because it multiplies the number of words you are able to recognize.

What about mutually intelligible languages?

Written Danish and Norwegian are pretty similar. Could you read an article in Norwegian by knowing 1000 Danish words? Not really, but recognizing 38% of the words in an article is a good start.

Norwegian article with Danish words marked

What’s next

I have noticed a pleasant side effect of using my browser extension. I have become more motivated to read articles in Danish. I am excited to find new words and see the percentage of known words increasing. I have also started thinking about what to do with all the collected words. Specifically, I am looking into memorization techniques like spaced repetition. I will post a follow-up article once I have made some progress in applying them.