Thursday, September 11, 2008


Kanji are the Chinese characters that are used in the modern Japanese writing system along with hiragana, katakana, Arabic numerals, and the occasional use of the Latin alphabet. The term ''kanji'' literally means "Han characters".


Chinese characters first came to Japan on articles imported from China. An early instance of such an import was a gold seal given by Emperor Guangwu of the Eastern Han Dynasty in 57 AD. It is not clear when Japanese people started to gain a command of Classical Chinese by themselves. The first Japanese documents were probably written by Chinese immigrants. For example, the diplomatic correspondence from King Bu of Wa to Emperor Shun of the Liu Song Dynasty in 478 has been praised for its skillful use of allusion. Later, groups of people called ''fuhito'' were organized under the monarch to read and write Classical Chinese. From the 6th century onwards, Chinese documents written in Japan tended to show interference from Japanese, suggesting the wide acceptance of Chinese characters in Japan.

The Japanese language itself had no written form at the time kanji were introduced. Originally, texts were written in the Chinese language and would have been read as such. Over time, however, a system known as ''kanbun'' emerged, which used Chinese text with diacritical marks to allow Japanese speakers to restructure and read Chinese sentences, changing word order and adding particles and verb endings in accordance with the rules of Japanese grammar.

Chinese characters also came to be used to write Japanese words, resulting in the modern kana syllabaries. A writing system called ''man'yōgana'' evolved that used a limited set of Chinese characters for their sound, rather than for their meaning. ''Man'yōgana'' written in cursive style became ''hiragana'', a writing system that was accessible to women. Major works of Heian era literature by women were written in hiragana. ''Katakana'' emerged via a parallel path: monastery students simplified ''man'yōgana'' to a single constituent element. Thus the two other writing systems, hiragana and katakana, referred to collectively as ''kana'', are actually descended from kanji.

In modern Japanese, kanji are used to write parts of the language such as nouns and adjective and verb stems, while hiragana are used to write verb and adjective endings (''okurigana''), particles, native Japanese words, and words where the kanji is too difficult to read or remember. Katakana is used for representing onomatopoeia, loanwords, certain types of names, and for emphasis on certain words.

Local developments

While kanji are essentially Chinese ''hanzi'' used to write Japanese, there are now significant differences between kanji and hanzi, including the use of characters created in Japan, characters that have been given different meanings in Japanese, and post-World War II simplifications of the kanji.


''Kokuji'' are characters peculiar to Japan. ''Kokuji'' are also known as ''wasei kanji''. There are hundreds of ''kokuji''. Many are rarely used, but a number have become important additions to the written Japanese language. Some of them, like "腺", have been introduced to China.


In addition to ''kokuji'', there are kanji that have been given meanings in Japanese different from their original Chinese meanings. These kanji are not considered ''kokuji'' but are instead called ''kokkun'' and include characters such as:

*沖 ''oki''
*椿 ''tsubaki''

Old characters and new characters

Before the end of World War II, the Chinese characters used in Japan were mostly, if not completely, the same as the Traditional Chinese characters. After the war, the government introduced the simplified "Tōyō Kanji Form List". The older forms are now known as 旧字体 (''kyūjitai'') and the simplified forms as 新字体 (''shinjitai''). The following are some examples of ''kyūjitai'' simplifications to ''shinjitai'':

*國 → 国 ''kuni'', ''koku''
*號 → 号 ''gō''
*變 → 変 ''hen'', ''ka''

Some of the new characters are similar to the simplified characters later adopted in the People's Republic of China. Also, as in the simplification process in China, some of the ''shinjitai'' were once abbreviated forms used in handwriting. In contrast with the "proper" unsimplified characters, these were originally acceptable only in colloquial contexts. Handwritten abbreviations identical to their modern ''shinjitai'' forms are attested from the pre-World War II era.

There are also handwritten simplifications today that are significantly simpler than their standard forms. Despite their wide usage and popularity, they are not considered orthographically correct and are only used in handwriting.

Theoretically, however, any Chinese character can also be a Japanese character—the ''Daikanwa Jiten'', one of the largest dictionaries of kanji ever compiled, has about 50,000 entries, even though most of the entries have never been used in Japanese.


Because of the way they have been adopted into Japanese, a single kanji may be used to write one or more different words. From the point of view of the reader, kanji are said to have one or more different "readings". Deciding which reading is meant depends on context, intended meaning, use in compounds, and even location in the sentence. Some common kanji have ten or more possible readings. These readings are normally categorized as either ''on'yomi'' or ''kun'yomi''.


The ''on'yomi'', the Sino-Japanese reading, is a Japanese approximation of the Chinese pronunciation of the character at the time it was introduced. Some kanji were introduced from different parts of China at different times, and so have multiple ''on'yomi'', and often multiple meanings. Kanji invented in Japan would not normally be expected to have ''on'yomi'', but there are exceptions, such as the character 働 "to work", which has the ''kun'yomi'' ''hataraku'' and the ''on'yomi'' ''dō'', and 腺 "gland", which has only the ''on'yomi'' ''sen''.

Generally, ''on'yomi'' are classified into four types:
*''Go-on'' readings are from the pronunciation during the Southern and Northern Dynasties or Baekje, an ancient state on the Korean Peninsula, during the 5th and 6th centuries. ''Go'' means the Wu region.
*''Kan-on'' readings are from the pronunciation during the Tang Dynasty in the 7th to 9th centuries, primarily from the standard speech of the capital, Chang'an.
*''Tō-on'' readings are from the pronunciations of later dynasties, such as the Song and Ming. They cover all readings adopted from the Heian era to the Edo period.
*''Kan'yō-on'' readings are mistaken or changed readings of the kanji that have become accepted into the language.


The most common form of readings is the ''kan-on'' one. The ''go-on'' readings are especially common in Buddhist terminology such as ''gokuraku'' 極楽 "paradise". The ''tō-on'' readings occur in some words such as ''isu'' 椅子 "chair" or ''futon'' 布団 "mattress".

In Chinese, most characters are associated with a single Chinese syllable. However, some homographs called 多音字, such as 行, have more than one reading in Chinese representing different meanings, which is reflected in the carryover to Japanese as well. In addition, most Chinese syllables did not fit the largely consonant-vowel phonotactics of classical Japanese. Thus most ''on'yomi'' are composed of two morae, the second of which is either a lengthening of the vowel in the first mora, or one of the syllables ''ku'', ''ki'', ''tsu'', ''chi'', or syllabic ''n'', chosen for their approximation to the final consonants of Middle Chinese. In fact, palatalized consonants before vowels other than ''i'', as well as syllabic ''n'', were probably added to Japanese to better simulate Chinese; none of these features occur in words of native Japanese origin.

''On'yomi'' primarily occur in multi-kanji compound words, many of which are the result of the adoption, along with the kanji themselves, of Chinese words for concepts that either did not exist in Japanese or could not be articulated as elegantly using native words. This borrowing process is often compared to the English borrowings from Latin and Norman French, since Chinese-borrowed terms are often more specialized, or considered to sound more erudite or formal, than their native counterparts. The major exception to this rule is family names, in which the native ''kun'yomi'' reading is usually used.


The ''kun'yomi'', Japanese reading, or native reading, is a reading based on the pronunciation of a native word, or ''yamatokotoba'', that closely approximated the meaning of the character when it was introduced. As with ''on'yomi'', there can be multiple ''kun'' readings for the same kanji, and some kanji have no ''kun'yomi'' at all.

For instance, the kanji for east, 東, has the ''on'' reading ''tō''. However, Japanese already had two words for "east": ''higashi'' and ''azuma''. Thus the kanji 東 had the latter readings added as ''kun'yomi''. In contrast, the kanji 寸, denoting a Chinese unit of measurement, has no native equivalent; it has only an ''on'yomi'', ''sun'', with no native ''kun'' reading. Most ''kokuji'', Japanese-created Chinese characters, only have ''kun'' readings.

''Kun'yomi'' are characterized by the strict (C)V syllable structure of ''yamatokotoba''. Most noun or adjective ''kun'yomi'' are two to three syllables long, while verb ''kun'yomi'' are usually between one and three syllables in length, not counting trailing hiragana called ''okurigana''. ''Okurigana'' are not considered to be part of the internal reading of the character, although they are part of the reading of the word. A beginner in the language will rarely come across characters with long readings, but readings of three or even four syllables are not uncommon. 承る ''uketamawaru'' and 志 ''kokorozashi'' have five syllables represented by a single kanji, the longest readings of any kanji in the ''jōyō'' character set.

In a number of cases, multiple kanji were assigned to cover a single word. Typically when this occurs, the different kanji refer to specific shades of meaning. For instance, the word なおす, ''naosu'', when written 治す, means "to heal an illness or sickness". When written 直す it means "to fix or correct something". Sometimes the distinction is very clear, although not always. Differences of opinion among reference works are not uncommon; one dictionary may say the kanji are equivalent, while another dictionary may draw distinctions of use. As a result, native speakers of the language may have trouble knowing which kanji to use and resort to personal preference or to writing the word in hiragana. This latter strategy is frequently employed with more complex cases such as もと ''moto'', which has at least five different kanji: 元, 基, 本, 下 and 素, three of which have only very subtle differences.

Local dialectal readings of kanji are also classified under ''kun'yomi'', most notably readings for words in Ryukyuan languages.

Other readings

There are many kanji compounds that use a mixture of ''on'yomi'' and ''kun'yomi'', known as ''jūbako'' or ''yutō'' words, which are themselves examples of this kind of compound: the first character of ''jūbako'' is read using ''on'yomi'' and the second using ''kun'yomi'', while it is the other way around with ''yutō''. These are the Japanese form of hybrid words. Other examples include 場所 ''basho'' "place" (''kun-on''), 金色 ''kin'iro'' "golden" (''on-kun'') and 合気道 ''aikidō'' "the martial art Aikido" (''kun-on-on'').

Some kanji also have lesser-known readings called ''nanori'', which are mostly used for names and are generally closely related to the ''kun'yomi''. Place names sometimes also use ''nanori'' or, occasionally, unique readings not found elsewhere.

''Gikun'' or ''jukujikun'' are readings of kanji combinations that have no direct correspondence to the characters' individual ''on'yomi'' or ''kun'yomi''. For example, 今朝 is read neither as ''*ima'asa'', the ''kun'yomi'' of the characters, nor as ''*konchō'', the ''on'yomi'' of the characters. Instead it is read as ''kesa'', a native Japanese word with two syllables.

Many ''ateji'' have meanings derived from their usage: for example, the now-archaic 亜細亜 ''ajia'' was formerly used to write "Asia" in kanji; the character 亜 now means ''Asia'' in such compounds as 東亜 ''tōa'', "East Asia". From the written 亜米利加 ''amerika'', the second character was taken, resulting in the semi-formal coinage 米国 ''beikoku'', which literally translates to "rice country" but means "United States of America".

When to use which reading

Although there are general rules for when to use ''on'yomi'' and when to use ''kun'yomi'', the language is littered with exceptions, and it is not always possible for even a native speaker to know how to read a character without prior knowledge.

The rule of thumb is that kanji occurring in isolation, such as a character representing a single word unit, are typically read using their ''kun'yomi''. They may be written with okurigana to mark the inflected ending of a verb or adjective, or by convention. For example: 情け ''nasake'' "sympathy", 赤い ''akai'' "red", 新しい ''atarashii'' "new", 見る ''miru'' "to see", 必ず ''kanarazu'' "invariably". Okurigana is an important aspect of kanji usage in Japanese and of ''kun'yomi'' orthography in particular.

Kanji occurring in compounds, called 熟語 ''jukugo'' in Japanese, are generally read using ''on'yomi''. For example, 情報 ''jōhō'' "information", 学校 ''gakkō'' "school", and 新幹線 ''shinkansen'' "bullet train" all follow this pattern. This distinction between isolated kanji and compounds gives words for similar concepts completely different pronunciations. 東 "east" and 北 "north" use the ''kun'' readings ''higashi'' and ''kita'', being stand-alone characters, while 北東 "northeast", as a compound, uses the ''on'' reading ''hokutō''. This is further complicated by the fact that many kanji have more than one ''on'yomi'': 生 is read as ''sei'' in 先生 ''sensei'' "teacher" but as ''shō'' in 一生 ''isshō'' "one's whole life". Meaning can also be an important indicator of reading; 易 is read ''i'' when it means "simple", but ''eki'' when it means "divination", both being ''on'yomi'' for this character.
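The rule of thumb can be sketched in a few lines of Python. This is only an illustration: the `READINGS` table and the `guess_reading` function are hypothetical names introduced here, the table holds just the two characters used as examples above, and real text has far too many exceptions for such a rule to work in general.

```python
# Hypothetical mini-dictionary; the readings are the ones quoted in the text.
READINGS = {
    "東": {"kun": "higashi", "on": "tō"},
    "北": {"kun": "kita", "on": "hoku"},
}


def guess_reading(word: str) -> str:
    """Apply the rule of thumb: kun'yomi in isolation, on'yomi in compounds."""
    kind = "kun" if len(word) == 1 else "on"
    return "".join(READINGS[ch][kind] for ch in word)


if __name__ == "__main__":
    for word in ("東", "北", "北東"):
        print(word, guess_reading(word))
```

A lookup that always picks the first listed reading, as here, cannot handle characters such as 生 with several ''on'yomi''; a practical system would need a word-level dictionary.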

This rule of thumb has many exceptions. ''Kun'yomi'' compound words are not as numerous as those with ''on'yomi'', but neither are they rare. Examples include 手紙 ''tegami'' "letter", 日傘 ''higasa'' "parasol", and the famous 神風 ''kamikaze'' "divine wind". Such compounds may also have okurigana, such as 空揚げ ''karaage'' "fried food" and 折り紙 ''origami'', although many of these can also be written with the okurigana omitted.

Similarly, some ''on'yomi'' characters can also be used as words in isolation: 愛 ''ai'' "love", 禅 ''Zen'', 点 ''ten'' "mark, dot". Most of these cases involve kanji that have no ''kun'yomi'', so there can be no confusion, although exceptions do occur. A lone 金 may be read as ''kin'' "gold" or as ''kane'' "money, metal"; only context can determine the writer's intended reading and meaning.

Multiple readings have given rise to a number of homographs, in some cases having different meanings depending on how they are read. One example is 上手, which can be read in three different ways: ''jōzu'' "skilled", ''uwate'' "upper part", or ''kamite'' "upper part". In addition, 上手い has the reading ''umai'' "skilled". Furigana is often used to clarify any potential ambiguities.

As stated above, 重箱 ''jūbako'' and 湯桶 ''yutō'' readings are also not uncommon. Indeed, all four combinations of reading are possible: ''on-on'', ''kun-kun'', ''kun-on'' and ''on-kun''.

Some famous place names, including those of Tokyo and Japan itself, are read with ''on'yomi''; however, the majority of Japanese place names are read with ''kun'yomi'': 大阪 ''Ōsaka'', 青森 ''Aomori'', 箱根 ''Hakone''. When characters are used as abbreviations of place names, their reading may not match that in the original. The Osaka and Kobe baseball team, the Hanshin Tigers, take their name from the ''on'yomi'' of the second kanji of ''Ōsaka'' and the first of ''Kōbe''. The name of the Keisei railway line, linking Tokyo and Narita, is formed similarly, although the reading of 京 from 東京 is ''kei'', despite ''kyō'' already being an ''on'yomi'' in the word ''Tōkyō''.

Family names are also usually read with ''kun'yomi'': 山田 ''Yamada'', 田中 ''Tanaka'', 鈴木 ''Suzuki''. Given names, although they are not typically considered ''jūbako'' or ''yutō'', often contain mixtures of ''kun'yomi'', ''on'yomi'' and ''nanori'': 大助 ''Daisuke'' (''on-kun''), 夏美 ''Natsumi'' (''kun-on''). Being chosen at the discretion of the parents, the readings of given names do not follow any set rules, and it is impossible to know with certainty how to read a person's name without independent verification. Parents can be quite creative, and rumours abound of children called 地球 ''Āsu'' and 天使 ''Enjeru'', quite literally "Earth" and "Angel"; neither is a common name, and the characters normally have the readings ''chikyū'' and ''tenshi'' respectively. Common patterns do exist, however, allowing experienced readers to make a good guess for most names.

Pronunciation assistance

Because of the ambiguities involved, kanji sometimes have their pronunciation for the given context spelled out in ruby characters known as ''furigana'', or ''kumimoji''. This is especially true in texts for children or foreign learners and in ''manga''. It is also used in newspapers for rare or unusual readings and for characters not included in the officially recognized set of ''jōyō kanji''.
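As one illustration of how a reading can be attached to a word in digital text, the sketch below builds HTML ruby markup. The choice of HTML's `<ruby>` and `<rt>` elements as the output format is an assumption made for this example, and the helper name `ruby` is hypothetical.

```python
from html import escape


def ruby(base: str, reading: str) -> str:
    """Wrap base text and its furigana reading in HTML ruby markup."""
    return f"<ruby>{escape(base)}<rt>{escape(reading)}</rt></ruby>"


if __name__ == "__main__":
    # The reading is rendered in small type above (or beside) the base text.
    print(ruby("漢字", "かんじ"))
```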

Total number of kanji

The number of possible characters is disputed. The "Daikanwa Jiten" contains about 50,000 characters, and this was thought to be comprehensive, but more recent mainland Chinese dictionaries contain 80,000 or more characters, many consisting of obscure variants. Most of these are not in common use in either Japan or China.

Orthographic reform and lists of kanji

In 1946, following World War II, the Japanese government instituted a series of reforms. This was done with the goal of facilitating learning for children and simplifying kanji use in literature and periodicals.
The number of characters in circulation was reduced, and formal lists of characters to be learned during each grade of school were established.
Some characters were given simplified glyphs, called ''shinjitai''. Many variant forms of characters and obscure alternatives for common characters were officially discouraged.

These are simply guidelines, so many characters outside these standards are still widely known and commonly used; these are known as ''hyōgaiji''.

Kyōiku kanji

The ''Kyōiku kanji'' 教育漢字 are 1,006 characters that Japanese children learn in elementary school. The number was 881 until 1981. The grade-level breakdown of the education kanji is known as the ''Gakunen-betsu kanji haitōhyō'', or the ''gakushū kanji''.

Jōyō kanji

The ''Jōyō kanji'' 常用漢字 are 1,945 characters consisting of all the ''Kyōiku kanji'', plus an additional 939 kanji taught in junior high and high school. In publishing, characters outside this category are often given ''furigana''. The ''Jōyō kanji'' were introduced in 1981. They replaced an older list of 1,850 characters known as the General-use kanji, introduced in 1946. The Japanese National Kanji Conference will add 11 new characters to the list, bringing the total to 1,956, to be enforced by 2010. These new characters are currently ''Jinmeiyō kanji'' that were previously not included in the ''Jōyō kanji'', and are used to write prefecture names: 阪, 熊, 奈, 岡, 鹿, 梨, 阜, 埼, 茨, 栃 and 媛.

Jinmeiyō kanji

The ''Jinmeiyō kanji'' 人名用漢字 are 2,928 characters consisting of the ''Jōyō kanji'', plus an additional 983 kanji found in people's names. Over the years, the Minister of Justice has on several occasions added to this list. Sometimes the phrase ''Jinmeiyō kanji'' refers to all 2928, and sometimes it only refers to the 983 that are only used for names.
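The list sizes quoted above are cumulative, which a short check makes explicit. The constant names are hypothetical; the figures are the ones given in the text.

```python
KYOIKU = 1006          # taught in elementary school
JOYO_EXTRA = 939       # added in junior high and high school
JINMEIYO_EXTRA = 983   # additional characters permitted in names
PLANNED_ADDITIONS = 11 # prefecture-name characters to be added

joyo_total = KYOIKU + JOYO_EXTRA
jinmeiyo_total = joyo_total + JINMEIYO_EXTRA
planned_joyo_total = joyo_total + PLANNED_ADDITIONS

if __name__ == "__main__":
    print(joyo_total, jinmeiyo_total, planned_joyo_total)
```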


''Hyōgaiji'' are any kanji not contained in the ''jōyō kanji'' and ''jinmeiyō kanji'' lists. These are generally written using traditional characters, but extended ''shinjitai'' forms exist.

Japanese Industrial Standards for kanji

The Japanese Industrial Standards for kanji and kana define character code-points for each kanji and kana, as well as other forms of writing such as the Latin alphabet, Cyrillic alphabet, Greek alphabet, Hindu-Arabic numerals, etc. for use in information processing. They have had numerous revisions. The current standards are:
* JIS X 0208:1997, the most recent version of the main standard. It has 6,355 kanji.
* JIS X 0212:1990, a supplementary standard containing a further 5,801 kanji. This standard is rarely used, mainly because the common Shift JIS encoding system could not use it, and it is effectively obsolete;
* JIS X 0213:2000, a further revision which extended the JIS X 0208 set with 3,625 additional kanji, of which 2,741 were in JIS X 0212. The standard is in part designed to be compatible with Shift JIS encoding;
* JIS X 0221:1995, the Japanese version of the ISO 10646/Unicode standard.
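To make the idea of code points and encodings concrete, the following sketch encodes the word 漢字 with several codecs from Python's standard library. The codec names (`shift_jis`, `euc_jp`, `iso2022_jp`) are Python's own; treating them as representative of JIS X 0208-based encodings is an assumption of this example, and `encode_all` is a hypothetical helper.

```python
# Python codec names; their relation to the JIS standards is assumed here.
CODECS = ("shift_jis", "euc_jp", "iso2022_jp", "utf-8")


def encode_all(text: str) -> dict[str, bytes]:
    """Encode the same text with several encodings for comparison."""
    return {codec: text.encode(codec) for codec in CODECS}


if __name__ == "__main__":
    for codec, data in encode_all("漢字").items():
        # Same two characters, different byte sequences and lengths.
        print(f"{codec:11} {len(data)} bytes  {data.hex(' ')}")
```

The same two characters come out as different byte sequences under each codec, which is why text must be decoded with the encoding it was written in.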


''Gaiji'', literally meaning "external characters", are kanji that are not represented in existing Japanese encoding systems. These include variant forms of common kanji that need to be represented alongside the more conventional glyph in reference works, and can include non-kanji symbols as well.

''Gaiji'' can be either user-defined characters or system-specific characters. Both are a problem for information interchange, as the codepoint used to represent an external character will not be consistent from one computer or operating system to another.

''Gaiji'' were nominally prohibited in JIS X 0208-1997, and JIS X 0213-2000 used the range of code-points previously allocated to ''gaiji'', making them completely unusable. Nevertheless, they persist today with NTT DoCoMo's "i-mode" service, where they are used for emoji.

Unicode allows for optional encoding of ''gaiji'' in private use areas. Adobe's SING technology allows the creation of customized gaiji. The Text Encoding Initiative uses a <g> element to encode any non-standard character or glyph, including gaiji.
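A program that receives text can at least detect whether a character lies in a Unicode private use area, and therefore has no agreed meaning between systems. The sketch below uses the general category `Co` reported by Python's `unicodedata` module; the helper name `is_private_use` is hypothetical.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the character has Unicode general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


if __name__ == "__main__":
    for ch in ("漢", "\ue000"):
        print(f"U+{ord(ch):04X}", is_private_use(ch))
```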

Types of Kanji: by category

The Chinese scholar Xu Shen, in the ''Shuōwén Jiězì'' (ca. 100 CE), classified Chinese characters into six categories. The traditional classification is still taught but is problematic and no longer the focus of modern lexicographic practice, as some categories are not clearly defined, nor are they mutually exclusive: the first four refer to structural composition, while the last two refer to usage.


''Shōkei-moji'' are pictograms, sketches of the object they represent. For example, 目 is an eye and 木 is a tree. The current forms of the characters are very different from the originals, and it is now hard to see the origin in many of these characters. It is somewhat easier to see in seal script. These make up a small fraction of modern characters.


''Shiji-moji'' are ideograms, often called "simple ideograms" or "simple indicatives" to distinguish them from compound ideograms. They are usually simple graphically and represent an abstract concept such as 上 "up" or "above" and 下 "down" or "below". These make up a tiny fraction of modern characters.


''Kaii-moji'' are compound ideograms, often called "compound indicatives", "associative compounds", or just "ideograms". These are usually a combination of pictograms that combine iconically to present an overall meaning. An example is the ''kokuji'' 峠 "mountain pass", made from 山 "mountain", 上 "up" and 下 "down". Another is 休 "rest", from 人 "person" and 木 "tree". These make up a tiny fraction of modern characters.


''Keisei-moji'' are phono-semantic or radical-phonetic compounds, sometimes called "semantic-phonetic", "semasio-phonetic", or "phonetic-ideographic" characters. They are by far the largest category, making up about 90% of characters. Typically they are made up of two components, one of which suggests the general category of the meaning or semantic context, while the other approximates the pronunciation.

As examples of this, consider the kanji with the 言 shape: 語, 記, 訳, 説, etc. All are related to word, language or meaning. Similarly, kanji with the 雨 shape are almost invariably related to weather. Kanji with the 寺 shape on the right usually have an ''on'yomi'' of "shi" or "ji". Sometimes one can guess the meaning or reading simply from the components. However, exceptions do exist; for example, neither 需 nor 霊 has anything to do with weather, and 待 has an ''on'yomi'' of "tai". That is, a component may play a semantic role in one compound, but a phonetic role in another.


''Tenchū-moji'' have variously been called "derivative characters" or "derivative cognates", or translated as "mutually explanatory" or "mutually synonymous" characters; this is the most problematic of the six categories, as it is vaguely defined. It may refer to kanji where the meaning or application has become extended. For example, 楽 is used for 'music' and 'comfort, ease', with different pronunciations in Chinese reflected in the two different ''on'yomi'', ''gaku'' 'music' and ''raku'' 'pleasure'.


''Kashaku-moji'' are rebuses, sometimes called "phonetic loans". The etymology of the characters follows one of the patterns above, but the present-day meaning is completely unrelated to this. A character was appropriated to represent a similar-sounding word. For example, 来 in ancient Chinese was originally a pictograph for 'wheat'. Its syllable was homophonous with the verb meaning 'to come', and the character is used for that verb as a result, without any embellishing 'meaning' element attached. Interestingly, the character for wheat, 麦, originally meant 'to come', being a ''keisei-moji'' with 'foot' at the bottom for its meaning part and 'wheat' at the top for sound. The two characters swapped meaning, so today the more common word has the simpler character. This borrowing of sounds has a very long history. 東 'east' is a pictograph of a bag on a stick, but it was used to mean 'east' very early in the history of the Chinese written language; not one example of it meaning 'bag on a stick' has survived.

Related symbols

The iteration mark (々) is used to indicate that the preceding kanji is to be repeated, functioning similarly to a ditto mark in English. It is pronounced as though the kanji were written twice in a row, for example 色々 (''iroiro'' "various") and 時々 (''tokidoki'' "sometimes"). This mark also appears in personal and place names, as in the surname Sasaki (佐々木). This symbol is a simplified version of the kanji 仝.
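The repetition rule is mechanical enough to sketch in code. The function below, with the hypothetical name `expand_iteration_marks`, replaces each 々 with the character immediately before it; it assumes the mark always repeats exactly one character and says nothing about pronunciation changes in the repeated syllable.

```python
ITERATION_MARK = "々"


def expand_iteration_marks(text: str) -> str:
    """Replace each iteration mark with a copy of the preceding character."""
    out: list[str] = []
    for ch in text:
        if ch == ITERATION_MARK and out:
            out.append(out[-1])  # repeat the character just emitted
        else:
            out.append(ch)       # ordinary character (or a leading mark)
    return "".join(out)


if __name__ == "__main__":
    for word in ("色々", "時々", "佐々木"):
        print(word, "->", expand_iteration_marks(word))
```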

Another frequently used symbol is ヶ, pronounced "ka" when used to indicate quantity or "ga" in place names like Kasumigaseki. This symbol is a simplified version of the kanji 箇.

Radical-and-stroke sorting

Kanji, whose thousands of symbols defy ordering by a convention such as the one used with the Roman alphabet, are ordered by radical-and-stroke sorting. In this system, common components of characters are identified; these are called radicals in Chinese and in logographic systems derived from Chinese, such as kanji.

Characters are then grouped by their primary radical, then ordered by number of pen strokes within radicals. When there is no obvious radical or more than one radical, convention governs which is used for collation. For example, the Chinese character for "mother" (媽) is sorted as a thirteen-stroke character under the three-stroke primary radical meaning "woman" (女).
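A minimal sketch of this ordering follows. It assumes a hand-made lookup table, `INDEX`, giving each character's Kangxi radical number and the number of strokes outside the radical; the table, its six entries and the function names are illustrative choices for this example, whereas a real dictionary would draw on a complete radical index.

```python
# character -> (Kangxi radical number, strokes in addition to the radical)
INDEX = {
    "語": (149, 7),   # radical 言
    "休": (9, 4),     # radical 人
    "記": (149, 3),   # radical 言
    "森": (75, 8),    # radical 木
    "林": (75, 4),    # radical 木
    "媽": (38, 10),   # radical 女 (3 strokes) + 10 = 13 strokes in total
}


def radical_stroke_key(ch: str) -> tuple[int, int]:
    """Sort key: primary radical first, then residual stroke count."""
    return INDEX[ch]


def radical_stroke_sort(chars: list[str]) -> list[str]:
    """Order characters by radical, then by strokes within the radical."""
    return sorted(chars, key=radical_stroke_key)


if __name__ == "__main__":
    print(radical_stroke_sort(list(INDEX)))
```

Because Kangxi radical numbers are themselves arranged by the stroke count of the radical, sorting on the radical number also places simpler radicals first.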

Kanji education

Japanese schoolchildren are expected to learn 1,006 basic kanji characters, the ''kyōiku kanji'', before finishing the sixth grade. The order in which these characters are learned is fixed. The ''kyōiku kanji'' list is a subset of a larger list of 1,945 kanji characters known as the ''jōyō kanji'', characters required for the level of fluency necessary to read newspapers and literature in Japanese. This larger list of characters is to be mastered by the end of the ninth grade. Schoolchildren learn the characters by repetition and by radical.

Students studying Japanese as a foreign language are often required to acquire kanji without having first learned the vocabulary associated with them. Strategies for these learners vary from copying-based methods to mnemonic-based methods such as those used in James Heisig's series ''Remembering the Kanji''. Other textbooks use methods based on the etymology of the characters, such as Mathias and Habein's ''The Complete Guide to Everyday Kanji'' and Henshall's ''A Guide to Remembering Japanese Characters''. Pictorial mnemonics, as in the text ''Kanji Pict-o-graphix'', are also seen.

The Japanese government provides the ''Kanji kentei'' which tests the ability to read and write kanji. The highest level of the ''Kanji kentei'' tests about 6,000 kanji.