CJK characters

[Figure: Translation of "That old man is 72 years old" in Vietnamese, Cantonese, Mandarin (in simplified and traditional characters), Japanese, and Korean.]

In internationalization, CJK characters is a collective term for graphemes used in the Chinese, Japanese, and Korean writing systems, which each include Chinese characters. The term CJKV also includes chữ Nôm, the Chinese-origin logographic script formerly used for the Vietnamese language.

Character repertoire

Standard Mandarin Chinese and Standard Cantonese are written almost exclusively in Chinese characters. Over 3,000 characters are required for general literacy, with up to 40,000 characters for reasonably complete coverage. Japanese uses fewer characters: general literacy in Japanese can be expected with 2,136 characters. The use of Chinese characters in Korea is increasingly rare, although idiosyncratic use of Chinese characters in proper names requires knowledge (and therefore availability) of many more characters. Even today, however, South Korean students are taught 1,800 characters.

Other scripts used for these languages, such as bopomofo and the Latin-based pinyin for Chinese, hiragana and katakana for Japanese, and hangul for Korean, are not strictly "CJK characters", although CJK character sets almost invariably include them as necessary for full coverage of the target languages.

The sinologist Carl Leban (1971) produced an early survey of CJK encoding systems.

Until the early 20th century, Classical Chinese was the written language of government and scholarship in Vietnam. Popular literature in Vietnamese was written in the chữ Nôm script, consisting of Chinese characters with many characters created locally. Since the 1920s, literature has been recorded in the Latin-based Vietnamese alphabet.

Encoding

The number of characters required for complete coverage of all these languages' needs cannot fit in the 256-character code space of 8-bit character encodings, so they require at least a 16-bit fixed-width encoding or a multi-byte variable-length encoding. The 16-bit fixed-width encodings, such as those from Unicode up to and including version 2.0, are now deprecated for two reasons: more characters must be encoded than a 16-bit encoding can accommodate (Unicode 5.0 has some 70,000 Han characters), and the Chinese government requires that software in China support the GB 18030 character set.

Although CJK encodings have common character sets, the encodings often used to represent them have been developed separately by different East Asian governments and software companies, and are mutually incompatible. Unicode has attempted, with some controversy, to unify the character sets in a process known as Han unification.
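The code-space argument can be checked directly. The following is a minimal Python sketch, not taken from the cited sources; the two sample characters and the helper names `fits_in_8_bit` and `encoded_sizes` are illustrative assumptions. It suggests why an 8-bit encoding such as Latin-1 cannot hold a Han character, and why a character outside the Basic Multilingual Plane no longer fits in a single 16-bit unit.

```python
# Illustrative sketch: sample characters and helper names are assumptions,
# chosen only to demonstrate the code-space limits described in the text.

def fits_in_8_bit(ch):
    """Return True if the character can be stored in the 8-bit Latin-1 encoding."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def encoded_sizes(ch):
    """Return the number of bytes one character occupies in each Unicode encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # big-endian, no byte-order mark
        "utf-32": len(ch.encode("utf-32-be")),
    }


samples = {
    "U+4E2D (plane 0)": "\u4e2d",       # a Han character in the Basic Multilingual Plane
    "U+20000 (plane 2)": "\U00020000",  # first code point of CJK Unified Ideographs Extension B
}

for label, ch in samples.items():
    print(label, "fits in 8 bits:", fits_in_8_bit(ch), encoded_sizes(ch))
```

The plane 2 character takes four bytes in UTF-16 because it is stored as a surrogate pair, which is the practical consequence of the 16-bit limit mentioned above.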
CJK character encodings should consist minimally of Han characters plus language-specific phonetic scripts such as pinyin, bopomofo, hiragana, katakana and hangul.

CJK character encodings include:

Big5 (the most prevalent encoding before Unicode was implemented)
CCCII
CNS 11643 (official standard of Republic of China)
EUC-JP
EUC-KR
GB 2312 (subset and predecessor of GB 18030)
GB 18030 (mandated standard in the People's Republic of China)
Giga Character Set (GCS)
ISO 2022-JP
KS C 5861
Shift-JIS
TRON
Unicode

The CJK character sets take up the bulk of the assigned Unicode code space. There is much controversy among Japanese experts of Chinese characters about the desirability and technical merit of the Han unification process used to map multiple Chinese and Japanese character sets into a single set of unified characters.

All three languages can be written both left-to-right and top-to-bottom (right-to-left and top-to-bottom in ancient documents), but are usually considered left-to-right scripts when discussing encoding issues.
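The mutual incompatibility of the legacy CJK encodings can be seen by encoding one character with several of them. This is a rough Python sketch under stated assumptions: the sample character, the selection of codecs, and the helper names `encode_all` and `cross_decode` are illustrative, and the codec names are those of Python's standard library rather than anything prescribed by the standards themselves.

```python
# Illustrative sketch: the sample character and helper names are assumptions.
# Codec names are Python standard-library aliases for the encodings named in the text.

CODECS = ["big5", "gb2312", "gb18030", "shift_jis", "euc_jp", "euc_kr", "iso2022_jp", "utf-8"]


def encode_all(text, codecs=CODECS):
    """Encode the same text with every codec and return a name -> bytes mapping."""
    return {name: text.encode(name) for name in codecs}


def cross_decode(data, wrong_codec):
    """Decode bytes with a codec other than the one that produced them.

    Undecodable bytes are replaced rather than raising, so the garbled
    result can be inspected.
    """
    return data.decode(wrong_codec, errors="replace")


sample = "\u4e2d"  # U+4E2D, a Han character present in all of the character sets above
encoded = encode_all(sample)

for name, data in encoded.items():
    print(f"{name:12} {data.hex()}")

# Bytes written as Big5 do not read back as the same character under Shift-JIS.
print(repr(cross_decode(encoded["big5"], "shift_jis")))
```

The same abstract character maps to different byte sequences in most of these encodings, so text must be labelled with its encoding or converted to a common one such as Unicode before it is exchanged.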
Legal status

Libraries cooperated on encoding standards for JACKPHY characters in the early 1980s. According to Ken Lunde, the abbreviation "CJK" was a registered trademark of Research Libraries Group (which merged with OCLC in 2006). The trademark owned by OCLC between 1987 and 2009 has now expired.

See also

Chinese character description languages
Chinese character encoding
Chinese input methods for computers
CJK Compatibility Ideographs
Chinese character strokes
CJK Unified Ideographs
Complex Text Layout languages (CTL)
Input method editor
Japanese language and computers
Korean language and computers
List of CJK fonts
Sinoxenic
Variable-width encoding
Vietnamese language and computers

References

Coulmas (1991), pp. 113–115.
DeFrancis (1977).
This article is based on material taken from CJK at the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
Ken Lunde, 1996.
Justia listing.

Works cited

Coulmas, Florian (1991). The writing systems of the world. Blackwell. ISBN 978-0-631-18028-9.
DeFrancis, John (1977). Colonialism and language policy in Viet Nam. The Hague: Mouton. ISBN 978-90-279-7643-7.

Sources

DeFrancis, John. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press, 1990. ISBN 0-8248-1068-6.
Hannas, William C. Asia's Orthographic Dilemma. Honolulu: University of Hawaii Press, 1997. ISBN 0-8248-1892-X (paperback); ISBN 0-8248-1842-3 (hardcover).
Lemberg, Werner. The CJK package for LATEX2ε: Multilingual support beyond babel. TUGboat, Volume 18 (1997), No. 3, Proceedings of the 1997 Annual Meeting.
Leban, Carl. Automated Orthographic Systems for East Asian Languages (Chinese, Japanese, Korean), State-of-the-art Report, Prepared for the Board of Directors, Association for Asian Studies. 1971.
Lunde, Ken. CJKV Information Processing. Sebastopol, Calif.: O'Reilly & Associates, 1998. ISBN 1-56592-224-7.

External links

CJKV: A Brief Introduction
Lemberg CJK article from above, TUGboat18-3
On "CJK Unified Ideograph", from Wenlin.com
FGA: Unicode CJKV character set rationalization

CJK ideographs in Unicode

| Block name | Plane | Chart range | Characters | Han unification | Scripts contained in block |
| CJK Unified Ideographs | 0 BMP | 4E00–9FFF | 20,992 | Unified | Han |
| CJK Unified Ideographs Extension A | 0 BMP | 3400–4DBF | 6,592 | Unified | Han |
| CJK Unified Ideographs Extension B | 2 SIP | 20000–2A6DF | 42,720 | Unified | Han |
| CJK Unified Ideographs Extension C | 2 SIP | 2A700–2B73F | 4,154 | Unified | Han |
| CJK Unified Ideographs Extension D | 2 SIP | 2B740–2B81F | 222 | Unified | Han |
| CJK Unified Ideographs Extension E | 2 SIP | 2B820–2CEAF | 5,762 | Unified | Han |
| CJK Unified Ideographs Extension F | 2 SIP | 2CEB0–2EBEF | 7,473 | Unified | Han |
| CJK Unified Ideographs Extension G | 3 TIP | 30000–3134F | 4,939 | Unified | Han |
| CJK Unified Ideographs Extension H | 3 TIP | 31350–323AF | 4,192 | Unified | Han |
| CJK Unified Ideographs Extension I | 2 SIP | 2EBF0–2EE5F | 622 | Unified | Han |
| CJK Radicals Supplement | 0 BMP | 2E80–2EFF | 115 | Not unified | Han |
| Kangxi Radicals | 0 BMP | 2F00–2FDF | 214 | Not unified | Han |
| Ideographic Description Characters | 0 BMP | 2FF0–2FFF | 16 | Not unified | Common |
| CJK Symbols and Punctuation | 0 BMP | 3000–303F | 64 | Not unified | Han, Hangul, Common, Inherited |
| CJK Strokes | 0 BMP | 31C0–31EF | 39 | Not unified | Common |
| Enclosed CJK Letters and Months | 0 BMP | 3200–32FF | 255 | Not unified | Hangul, Katakana, Common |
| CJK Compatibility | 0 BMP | 3300–33FF | 256 | Not unified | Katakana, Common |
| CJK Compatibility Ideographs | 0 BMP | F900–FAFF | 472 | 12 are unified | Han |
| CJK Compatibility Forms | 0 BMP | FE30–FE4F | 32 | Not unified | Common |
| Enclosed Ideographic Supplement | 1 SMP | 1F200–1F2FF | 64 | Not unified | Hiragana, Common |
| CJK Compatibility Ideographs Supplement | 2 SIP | 2F800–2FA1F | 542 | Not unified | Han |
| Totals: 21 blocks | | | 99,737 | | |

As of Unicode version 16.0.
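The chart ranges given for the CJK blocks can be turned into a simple lookup. The sketch below is one possible approach in Python, assuming the Unicode 16.0 ranges listed in this article; the names `CJK_BLOCKS` and `cjk_block` are hypothetical and do not come from any library.

```python
# Illustrative sketch: CJK_BLOCKS and cjk_block are hypothetical names.
# Ranges are the chart ranges for the 21 blocks as of Unicode 16.0.

CJK_BLOCKS = [
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF),
    ("CJK Unified Ideographs Extension A", 0x3400, 0x4DBF),
    ("CJK Unified Ideographs Extension B", 0x20000, 0x2A6DF),
    ("CJK Unified Ideographs Extension C", 0x2A700, 0x2B73F),
    ("CJK Unified Ideographs Extension D", 0x2B740, 0x2B81F),
    ("CJK Unified Ideographs Extension E", 0x2B820, 0x2CEAF),
    ("CJK Unified Ideographs Extension F", 0x2CEB0, 0x2EBEF),
    ("CJK Unified Ideographs Extension G", 0x30000, 0x3134F),
    ("CJK Unified Ideographs Extension H", 0x31350, 0x323AF),
    ("CJK Unified Ideographs Extension I", 0x2EBF0, 0x2EE5F),
    ("CJK Radicals Supplement", 0x2E80, 0x2EFF),
    ("Kangxi Radicals", 0x2F00, 0x2FDF),
    ("Ideographic Description Characters", 0x2FF0, 0x2FFF),
    ("CJK Symbols and Punctuation", 0x3000, 0x303F),
    ("CJK Strokes", 0x31C0, 0x31EF),
    ("Enclosed CJK Letters and Months", 0x3200, 0x32FF),
    ("CJK Compatibility", 0x3300, 0x33FF),
    ("CJK Compatibility Ideographs", 0xF900, 0xFAFF),
    ("CJK Compatibility Forms", 0xFE30, 0xFE4F),
    ("Enclosed Ideographic Supplement", 0x1F200, 0x1F2FF),
    ("CJK Compatibility Ideographs Supplement", 0x2F800, 0x2FA1F),
]


def cjk_block(ch):
    """Return the name of the listed block containing the character, or None."""
    code_point = ord(ch)
    for name, start, end in CJK_BLOCKS:
        if start <= code_point <= end:
            return name
    return None


for ch in ["\u4e2d", "\U00020000", "\u3001", "A"]:
    print(f"U+{ord(ch):04X}", cjk_block(ch))
```

A range covers every code point in the block, including unassigned ones, so a match indicates only that the code point falls inside the block, not that a character is defined there. Scripts such as hiragana, katakana and hangul live in other blocks and return None here.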

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.