What is the best order to learn Chinese Characters?

TL;DR

  • The order depends on your objective
  • If you are studying HSK or TOCFL, use their official vocab lists
  • If you want the most efficient, easiest-to-remember order, use a hybrid approach that considers both a character’s frequency and its component composition

Why does the order matter?

To read Chinese at a basic level, you need to recognize at least 2000 characters.

This is a major challenge in learning the language and what makes Chinese one of the most intimidating languages to learn.

If characters were just random combinations of strokes, the order wouldn’t matter. But they are not: Chinese characters are often built from components that are themselves characters. Because of that, the order in which you learn them can make memorization easier or harder.

木 + 木 + 木 = 森
日 + 月 = 明

etc…

So which order is the best?

Before we can answer that, we need to consider why we are learning because our priority affects the best order to choose. 

  • Is it because we want to take an HSK or TOCFL exam?
  • Is it because we want to be able to read as much as possible?
  • Is it that we want to learn in the most efficient way possible?

     

I will go through these in turn. 

The best order when preparing for HSK or TOCFL exams

HSK and TOCFL publish official word lists that split the vocabulary by level. Unfortunately, the official lists are not in a very user-friendly format, and they also contain typos and mistakes! To help with that, we have prepared and checked vocab lists that can be downloaded in multiple flashcard formats, including Anki, Pleco, Quizlet, Excel, and CSV. They are also immediately available for free inside the WeHanzi App.
HSK wordlists
TOCFL wordlists

What is the best order to be able to read as much as possible?

If you want to be able to read as much as possible, the most efficient order is to learn based on word frequency. 

Word frequency measures how often a word appears in text. For example, in English, the top 5 words by frequency are:

  1. “the”
  2. “of”
  3. “and”
  4. “to”
  5. “a”


In Chinese, the top 5 words are: 

  1. 的 (de) – of 
  2. 了 (le) – completed action marker
  3. 在 (zài) – (located) at
  4. 是 (shì) – to be
  5. 我 (wǒ) – I


Because higher frequency characters appear more often in text, learning in frequency order will give you the best bang for your buck. 
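As a rough illustration, here is a minimal Python sketch of how a frequency ranking can be computed. It counts individual characters rather than words (counting words would need a word segmenter, which I leave out here). The sample sentence and the function names are made up for this example; a real ranking needs a large corpus.

```python
from collections import Counter


def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def character_frequency(text):
    """Return (character, count) pairs, most frequent first."""
    counts = Counter(ch for ch in text if is_cjk(ch))
    return counts.most_common()


def coverage(ranked, top_n):
    """Share of all character occurrences covered by the top_n characters."""
    total = sum(count for _, count in ranked)
    covered = sum(count for _, count in ranked[:top_n])
    return covered / total if total else 0.0


# Tiny made-up sample; punctuation is ignored by is_cjk().
sample = "我是学生。我的朋友也是学生。我在学校。"
ranked = character_frequency(sample)

print(ranked[:4])            # [('我', 3), ('学', 3), ('是', 2), ('生', 2)]
print(coverage(ranked, 4))   # 0.625
```

In this tiny sample, the 4 most frequent characters already account for 62.5% of everything on the page, which is the "bang for your buck" effect in miniature.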

However, there is another consideration. Chinese characters are not random. More complex characters are built up from simpler components. For example: 

的 is a combination of: 

  • 白 (bái) – white (word rank #733)
  • 勺 (sháo) – spoon  (word rank #9030)

As you can see, the component parts are far less common than the character itself (a bigger rank number means a less frequent character). This means that when you learn in frequency order, you will often have to learn a complex character with many components before you have learned those components individually.

This makes it harder because you have to memorize the whole character by rote, with nothing to anchor it to. When you already recognize the components, memorizing the character is significantly easier.

As an example, consider the character: 

  • 看 (kàn) – to look (word rank #52)


This may look difficult by itself, but when we break it down into its components we get: 

  • 手 (shǒu) – hand (word rank #124)
  • 目 (mù) – eye (word rank #2209)

So the original character shows a person shielding their eyes with their hand. In many cases, knowing the components of a character makes it easier to learn the character itself.
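To make the problem concrete, the small Python sketch below checks which components a pure frequency order would introduce only after the character that uses them. The ranks and decompositions are just the ones quoted in the examples above; the table and function names are my own, and a real tool would load a full frequency list and a decomposition dataset instead.

```python
# Word ranks quoted in the examples above (1 = most frequent).
RANK = {"的": 1, "白": 733, "勺": 9030, "看": 52, "手": 124, "目": 2209}

# Character -> components, as described above.
COMPONENTS = {"的": ["白", "勺"], "看": ["手", "目"]}


def components_met_later(char):
    """Components that a pure frequency order would teach after `char`."""
    return [c for c in COMPONENTS.get(char, []) if RANK[c] > RANK[char]]


print(components_met_later("的"))  # ['白', '勺']
print(components_met_later("看"))  # ['手', '目']
```

For both characters, every component shows up later in the frequency list than the character itself.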

Words vs Characters

In Chinese, some words are a single character, but most are made up of two or more. Here again, it’s easier to learn a word if you already know the characters it’s made up of.

Learning based on character components

Because it’s easier to learn a character when you know the components, it’s possible to develop a whole approach based on that effect. In terms of ease of remembering, this is the most efficient way to learn.

If you are interested in this approach, I would recommend a book called Remembering the Hanzi, which is available for both Simplified and Traditional Chinese:

Book Recommendations:

  1. Remembering the Traditional Hanzi by James Heisig
  2. Remembering the Simplified Hanzi by James Heisig


There is also a free spreadsheet resource with this information. 

With this approach, you would start off with some basic numbers like: 

  1. 一 (yī) – one
  2. 二 (èr) – two
  3. 三 (sān) – three


Then move on to common components: 

  1. 口 (kǒu) – mouth
  2. 日 (rì) – sun
  3. 月 (yuè) – moon


Then, one by one, new characters are introduced that are made up of these components, for example:

  • 明 (míng) – bright


This character is made up of the sun and the moon. “Compared to the moon, the sun is very bright.” Again, knowing the components makes it much easier to remember. The author of this book did a lot of research to work out the best order to learn in this way. I also know it works from experience: I used this book to learn 1200 characters in 4 months.
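The "components before characters" idea can be sketched as a dependency ordering. The Python example below is only an illustration of that principle using a topological sort (via the standard library’s graphlib, Python 3.9+); it is not the ordering used in the book, and the small decomposition table is taken from the examples in this article.

```python
from graphlib import TopologicalSorter

# Character -> components it is built from (toy data from this article).
COMPONENTS = {
    "明": ["日", "月"],
    "森": ["木"],
    "看": ["手", "目"],
}


def component_first_order(components):
    """Return an order in which every component comes before the
    characters that are built from it."""
    # TopologicalSorter expects a mapping of node -> predecessors,
    # which is exactly "character -> components to learn first".
    sorter = TopologicalSorter(components)
    return list(sorter.static_order())


order = component_first_order(COMPONENTS)
print(order)
```

Several valid orders exist, so the exact output can vary, but 日 and 月 always come before 明, 木 before 森, and 手 and 目 before 看.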

However, this method is not without its issues. The main problem is that you end up learning a lot of low-frequency characters before some very common ones.

For example:

  1. 人 (rén) – person (Remembering the Hanzi no. 736)
  2. 来 | 來 (lái) – to come (Remembering the Hanzi no. 789)


You would cover 735 characters before you learn “person”, which is one of the most fundamental and useful characters.

This approach is elegant and efficient from a pure memorization perspective but very inefficient if you want to be able to read or speak… 

Also, you only learn characters, so even when you’ve learned 1000 of them, you probably can’t speak, because most Chinese words are made up of multiple characters!

The best of both worlds - a hybrid approach

I think an ideal approach would need to include the following elements:

  • Consideration of character frequency
  • Consideration of character composition and components
  • Inclusion of words as well as characters


This is because:

It’s easier to learn a character when you know its components

Learning a character is easier when you know its components. It’s also easier to learn a word when you know the characters that make it up. 

For example, 袋鼠 looks difficult. But suppose you already know:

  1. 袋 (dài) – pocket
  2. 鼠 (shǔ) – mouse


If I then told you that a “pocket mouse” is a kangaroo, the word becomes much easier to remember! Mandarin is very much like this: more complex words are often made up of simple character combinations.

High frequency words are more important than low frequency words

If you want to stay motivated, it’s better if you can immediately apply what you learn. I memorized over 1200 characters using Remembering the Hanzi, but I couldn’t use them to read because I didn’t learn any words. Then I got busy and forgot most of what I had learned 🤦.

Combining the component-first approach with character frequency means you are also learning characters in the most efficient order to maximise your reading.

What does it look like in practice? 

When James Heisig first developed his method in the 1970s, technology was much less advanced than it is today. Now we can use computers to help us determine an efficient learning order.

So how can we work out the best order using modern technology? First, we take a subset of characters (this can be modified based on specific learning goals). It could be: 

  • All characters / words in the top 2000 by frequency
  • HSK 1-6
  • TOCFL 1-6 


We start with the highest-frequency word, e.g. 的, then we use software to determine which components make up the character. In this case:

  1. 白 (bái) – white
  2. 勺 (sháo) – spoon
  3. 勹 (bāo) – (no meaning)
  4. 一 (yī) – one, 1

Then we check to see if there are any other characters or words that can be constructed out of these components. 

The software provides us with:

  • 不 (bù) – negative prefix, not (#11)
  • 目的 (mù dì) – purpose, aim (#992)
  • 亚 | 亞 (yà) – Asia, Asian (#1418)


These characters are all constructed from many 一 strokes. Since we can see the word rank (the number after the #), we can decide if we want to learn them at this time (or skip them). 

After that, we move on to the next most frequent characters:

  • 了 (le) – completed action marker
  • 为 | 為 (wèi) – because of, for, to (#18, HSK2)


And any words we can make with the characters and components we have learned already:

  • 为了 | 為了 (wèi le) – in order to, for the purpose of (#230, HSK3)
  • 不了 (bù liǎo) – no thanks (#1133)


As you can see, each time we learn a new character, it opens up possibilities of learning new components and new words. 

We are learning useful characters and words but we are not overwhelmed with new components. It’s a very step-by-step process that makes learning much simpler. 
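The walkthrough above can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions, not the finished method: the function names are hypothetical, the target list is a handful of items from this article sorted by word rank, and the decomposition table only covers 的 and 看 (every other single character is treated as having no components). Words are simply split into their characters.

```python
def parts_of(item, decomposition):
    """Characters of a multi-character word, or the listed components
    of a single character (empty list if none are listed)."""
    if len(item) > 1:
        return list(item)
    return decomposition.get(item, [])


def hybrid_order(targets, decomposition):
    """targets: characters and words sorted from most to least frequent.
    Assumes the decomposition table contains no cycles."""
    order = []
    known = set()

    def learn(item):
        if item in known:
            return
        for part in parts_of(item, decomposition):
            learn(part)  # components (or characters of a word) come first
        known.add(item)
        order.append(item)

    for target in targets:
        learn(target)
        # Single pass: pull forward anything that can now be built
        # entirely from parts we already know.
        for other in targets:
            parts = parts_of(other, decomposition)
            if other not in known and parts and all(p in known for p in parts):
                learn(other)
    return order


# Toy data from this article, sorted by word rank.
TARGETS = ["的", "了", "不", "为", "看", "为了", "目的", "不了"]
DECOMPOSITION = {"的": ["白", "勺"], "看": ["手", "目"]}

result = hybrid_order(TARGETS, DECOMPOSITION)
print(result)
# ['白', '勺', '的', '了', '不', '不了', '为', '为了', '手', '目', '看', '目的']
```

Notice how 不了 and 为了 are pulled forward as soon as their characters are known, even though their own ranks are much lower. A fuller version could add a rank cutoff so that very rare unlocked items are skipped for later, as described above.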

I’m only at the very start of researching this method and in the future I will release an update with a full list. Another interesting element is that we can also research the ancient meaning of each component. For example, originally 為 represented an elephant! 

In the screenshot from the WeHanzi App on the right, you can see the ancient form of 為. You can see how it resembles an elephant reaching up to eat something from a branch. 

Mandarin is a difficult language to read (if you use a brute force approach), but with care, that difficulty can be gradually reduced. Learning characters in the right order and learning about their origin can make it substantially quicker, easier, and more fun!

If you are interested in getting an update when the final word order is released, you can sign up using the form below.