Knowledge

Unicode and HTML


The use of the BOM character (U+FEFF) means that the encoding automatically declares itself to any processing application. Processing applications need only look for an initial 0x0000FEFF, 0xFEFF or 0xEFBBBF in the byte stream to identify the document as UTF-32, UTF-16 or UTF-8 encoded respectively (the values shown for UTF-32 and UTF-16 are the big-endian forms; little-endian documents begin with the same bytes in reverse order). No additional metadata mechanisms are required for these encodings since the byte-order mark includes all of the information necessary for processing applications. In most circumstances, editing applications handle the byte-order mark separately from the other characters, so there is little risk of an author removing or otherwise changing it to indicate the wrong encoding (as can happen when the encoding is declared in English/Latin script). If the document lacks a byte-order mark, the fact that the first non-blank printable character in an HTML document is supposed to be "<" (U+003C) can be used to distinguish between UTF-8, UTF-16 and UTF-32.
Older browsers, such as Netscape Navigator and Internet Explorer 6, can only display text supported by the current font associated with the character encoding of the page, and may misinterpret numeric character references as being references to code values within the current character encoding, rather than references to Unicode code points. When you are using such a browser, it is unlikely that your computer has all of those fonts, or that the browser can use all available fonts on the same page. As a result, the browser will not display the text in the examples above correctly, though it may display a subset of them. Because they are encoded according to the standard, though, they
Consequently, many HTML authors are unaware of encoding issues and may not have any idea what encoding their documents actually use. Misunderstandings, such as the belief that the encoding declaration effects a change in the actual encoding (whereas it is actually just a label that could be inaccurate), are also a reason for this editor attitude. Another contributing factor is the arrival of UTF-8, which greatly diminishes the need for other encodings; modern editors thus tend to default to UTF-8, as recommended by the HTML5 specification.
, manual encoding override is not permitted. Overriding the encoding of such an XML document would mean that the document stopped being XML, as it is a fatal error for XML documents to have an encoding declaration with detectable errors. Currently, Gecko browsers such as Firefox abide by this rule, whereas the bulk of the other common browsers that support HTML as XML, such as WebKit browsers (Chrome/Safari), do allow the encoding of XHTML documents to be manually overridden.
It does not vary between documents of different languages or created on different platforms. The external character encoding is chosen by the author of the document (or the software the author uses to create the document) and determines how the bytes used to store and/or transmit the document map to characters from the document character set. Characters not present in the chosen external character encoding may be represented by character entity references.
) may contain multilingual text represented with the Unicode universal character set. Key to the relationship between Unicode and HTML is the relationship between the "document character set", which defines the set of characters that may be present in an HTML document and assigns numbers to them, and the "external character encoding", or "charset", used to encode a given document as a sequence of bytes.
The support for hexadecimal in this context is more recent, so older browsers might have problems displaying characters referenced with hexadecimal numbers – but they will probably have a problem displaying Unicode characters above code point 255 anyway. To ensure better compatibility
Many HTML documents are served with inaccurate encoding information, or no encoding information at all. In order to determine the encoding in such cases, many browsers allow the user to manually select an encoding name from a list. They may also employ an encoding auto-detection algorithm that works
For both serializations of HTML (content-type "text/html" and content-type "application/xhtml+xml"), the byte order mark (BOM) is an effective way to transmit encoding information within an HTML document. For UTF-8, the BOM is optional, while it is required for the UTF-16 and UTF-32 encodings.
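As a rough illustration of how a processing application might use the byte order mark, the following Python sketch compares the start of a byte stream with the BOM constants from the standard `codecs` module. The function name `sniff_bom` and its return convention are assumptions made for this example; it is not the sniffing algorithm any particular browser implements.

```python
import codecs

# Check the 4-byte UTF-32 BOMs before the 2-byte UTF-16 ones:
# the UTF-32LE BOM (FF FE 00 00) starts with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + "<p>h\u00e9llo</p>".encode("utf-8")
    encoding, skip = sniff_bom(raw)
    # Decode the remainder of the stream with the encoding the BOM announced.
    print(encoding, raw[skip:].decode(encoding))
```

Ordering the table by BOM length is the main design point: a naive check for the UTF-16 little-endian mark first would misidentify UTF-32 little-endian input.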
for characters – some common, some obscure – that are either not found in certain character encodings or are markup-sensitive in some contexts (for example angle brackets and quotation marks). Although any Unicode character can be referenced by its numeric code point, some HTML document authors
serialized, manual override may apply to all documents, or only those for which the encoding cannot be ascertained by looking at declarations and/or byte patterns. The fact that the manual override is present and widely used hinders the adoption of accurate encoding declarations on the Web;
characters, such as English letters, digits, and some other common characters, are encoded exactly as they are in ASCII. As a result, HTML markup (such as <br> and </div>) is byte-for-byte identical to its ASCII form. Characters outside the ASCII range are stored in 2–4 bytes. It is also possible to use
and the desire to avoid burdening users with the need to understand the nuances of encoding, many text editors used by HTML authors are unable or unwilling to offer a choice of encodings when saving files to disk and often do not even allow input of characters beyond a very limited range.
An encoding default applies when there is no external or internal encoding declaration and also no byte order mark. While the encoding default for HTML pages served as XML is required to be UTF-8, the encoding default for a regular Web page (that is: for HTML pages serialized as
, nevertheless relies upon a similar definition of permissible characters that covers most, but not all, of the Unicode/UCS character definitions. The sets used by HTML and XHTML/XML are slightly different, but these differences have little effect on the average document author.
can be used. For HTML pages serialized as XML, the declaration options are either to rely on the encoding default (which for XML documents is UTF-8) or to use an XML encoding declaration. The meta element plays no role in HTML served as XML.
, which also establishes the syntax (allowable sequences of characters) that can produce a valid HTML document. The HTML document character set for HTML 4.0 consists of most, but not all, of the characters jointly defined by
In order to correctly process HTML, a web browser must ascertain which Unicode characters are represented by the encoded form of an HTML document. In order to do this, the web browser must know what encoding was used.
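One common source of that information is a Content-Type header carrying a charset parameter, as in the `text/html; charset=UTF-8` example used in this article. The sketch below, which assumes Python and its standard `email.message` module, shows one way such a parameter could be read; the helper name `charset_from_content_type` is hypothetical.

```python
from email.message import Message


def charset_from_content_type(header_value: str):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() parses the header parameters and lowercases the result.
    return msg.get_content_charset()


if __name__ == "__main__":
    print(charset_from_content_type("text/html; charset=UTF-8"))
    print(charset_from_content_type("text/html"))
```

A missing parameter yields `None`, which corresponds to the case where the processing application has to fall back on other declarations or on a default.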
display correctly on any system that is compliant and does have the characters available. Further, those characters given names for use in named entity references are likely to be more commonly available than others.
In order to work around the limitations of legacy encodings, HTML is designed such that it is possible to represent characters from the whole of Unicode inside an HTML document by using a
(from version 7 on), are able to display multilingual web pages by intelligently choosing a font to display each individual character on the page. They will correctly display any mix of
Authors are encouraged to use UTF-8. Conformance checkers may advise authors against using legacy encodings. Authoring tools should default to using UTF-8 for newly created documents.
For displaying characters outside the Basic Multilingual Plane, such as the Gothic letter faihu, which is a variant of the runic letter fehu in the table above, some systems (like Windows 2000) need manual adjustments of their settings.
In order to support all Unicode characters without resorting to numeric character references, a web page must have an encoding covering all of Unicode. The most popular is
(Note: UTF-16 and UTF-32 without the BOM are formally known under different names; they are different encodings, and thus need some form of encoding declaration – see UTF-16BE, UTF-16LE, UTF-32BE and UTF-32LE.)
Many browsers are only capable of displaying a small subset of the full Unicode repertoire. Here is how your browser displays various Unicode code points:
: a sequence of characters that explicitly spell out the Unicode code point of the character being represented. A character reference takes the form
. The characters that compose the numeric character reference are universally representable in every encoding approved for use on the Internet.
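To make the idea concrete, here is a minimal Python sketch that turns characters outside ASCII into decimal numeric character references using the built-in `xmlcharrefreplace` error handler, plus a small helper for the hexadecimal form. Both function names are hypothetical and chosen for this example only.

```python
def to_ascii_with_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#9786;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def hex_char_ref(ch: str) -> str:
    """Build the hexadecimal form of a numeric character reference for one character."""
    return f"&#x{ord(ch):X};"


if __name__ == "__main__":
    print(to_ascii_with_char_refs("smile \u263a"))   # smile &#9786;
    print(hex_char_ref("\u263a"))                    # &#x263A;
    print(to_ascii_with_char_refs("\U0001F602"))     # &#128514;
```

Because the output contains only ASCII characters, it can be stored in a legacy encoding without losing the characters it refers to.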
An HTML document is a sequence of Unicode characters. More specifically, HTML 4.0 documents are required to consist of characters in the HTML
) varies depending on the localization of the browser. For a system set up mainly for Western European languages, it will generally be
, that cannot. However, even when using encodings that do not support all Unicode characters, the encoded document may make use of
therefore the problem is likely to persist. But note that Internet Explorer, Chrome and Safari – for both XML and
prefer to use these named entities instead, where possible, as they are less cryptic and were better supported by early browsers.
In RFC 1866, the initial HTML 2.0 standard, the document character set was defined as ISO-8859-1 (later HTML standard defaults to
with older browsers, it is still a common practice to convert the hexadecimal code point into a decimal value (for example
serializations – do not permit the encoding to be overridden whenever the page includes the BOM.
Like HTML documents, an XHTML document is a sequence of Unicode characters. However, an XHTML document is an
To display all of the characters above, you may need to install one or more large multilingual fonts, like
. Other external means of declaring encoding are permitted but rarely used. If the document uses a
multi-byte character encodings are prevalent, some form of auto-detection is likely to be applied.
753: 3016: 182:
that states a Knowledge editor's personal feelings or presents an original argument about a topic.
: a character repertoire wherein each character is assigned a unique, non-negative integer
and HTML tends to be a difficult topic for many computer professionals, document authors, and
Unicode encoding became the most frequently used encoding on web pages, overtaking both
character "—" even if the character encoding used doesn't contain that character.
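The equivalence between a named entity and the numeric forms can be checked with Python's standard `html` module, as in the hedged sketch below. It uses the `mdash` entity, whose code point is U+2014, and assumes nothing beyond the standard library; the helper name `describe_entity` is hypothetical.

```python
import html
from html.entities import name2codepoint


def describe_entity(name: str):
    """Return (code point, decoded character) for an HTML 4 entity name."""
    codepoint = name2codepoint[name]
    return codepoint, html.unescape(f"&{name};")


if __name__ == "__main__":
    codepoint, char = describe_entity("mdash")
    print(codepoint, hex(codepoint))
    # The named, decimal and hexadecimal references all decode to the same character.
    print(html.unescape("&mdash;") == html.unescape("&#8212;") == html.unescape("&#x2014;"))
```

Since all three spellings consist only of ASCII characters, the document can carry the character regardless of which external encoding it is saved in.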
response, the message may signal the encoding via a Content-Type header, such as
(☺) is used to indicate a smiling face character in the Unicode character set.
For HTML documents serialized with the preferred XML label –
, that can directly encode any Unicode character, or a legacy encoding, like
document, which, while not having an explicit "document character" layer of
3435: 3169: 3164: 3119: 3021: 2936: 2926: 2836: 2811: 2751: 2666: 2646: 2641: 2621: 2500: 2447: 1699: 1066: 1015:(BOM). Finally, the encoding can be declared via the HTML syntax. For the 962: 808: 655: 433: 428: 423: 374: 2045: 3491: 3330: 3315: 3229: 3099: 3076: 3041: 2911: 2891: 2866: 2686: 2149: 2139: 1037:<meta http-equiv="content-type" content="text/html; charset=UTF-8"> 885: 779: 772: 706: 582: 577: 467: 411: 2421: 3370: 2821: 2711: 2347: 2055: 1403: 1297: 841: 725:
documents. Both types of documents consist, at a fundamental level, of
648: 526: 521: 399: 389: 1349: 1019:
serialisation then, as long as the page is encoded in an extension of
Character entities can be included in an HTML document via the use of
Regardless of whether the document is HTML or XHTML, when stored on a
1323: 968: 881: 757: 674: 844:, which is supported by modern browsers but less commonly used. 3445: 3224: 3036: 3026: 2756: 2342: 2337: 2307: 1683: 1528: 1028: 999:
message or a transport that uses MIME content types such as an
, the encoding info might also be present in the form of a
or transmitted over a network, the document's characters are
and grapheme-like units, independent of how they manifest in
where most characters are stored as two bytes with varying
. For Cyrillic alphabet locales, the default is typically
in the case of the BOM and in case of HTML served as XML
Because of the legacy of 8-bit text representations in
users alike. The accurate representation of text in
Web pages authored using HyperText Markup Language (
Help file for using special characters on Knowledge
1167:Example web browser support for Unicode characters 1093: 981: 250:. Unsourced material may be challenged and removed. 2565: 1856:may not follow Knowledge's policies or guidelines 3586: 1922:named character entity definitions for HTML 4.01 976:List of XML and HTML character entity references 918:In HTML 4, there is a standard set of 252 named 16:Relationship between Unicode characters and HTML 1932:SIL's freeware fonts, editors and documentation 847: 1945:The International Phonetic Alphabet in Unicode 662:(which is basically equivalent to Unicode) by 2006: 888:number, in which case it must be prefixed by 628: 1534:Tteolp (Korean "Ssangtikeut Eo Rieulbieup") 1961:Table of Unicode characters from 1 to 65535 1745: 907: 712: 53:Learn how and when to remove these messages 3508:Cultural, political, and religious symbols 2013: 1999: 635: 621: 1906:Unicode in XML and other Markup Languages 1892:Learn how and when to remove this message 328:Learn how and when to remove this message 310:Learn how and when to remove this message 208:Learn how and when to remove this message 150:Learn how and when to remove this message 948:is the name of the entity. For example, 884:number for the Unicode code point, or a 2041:ISO/IEC 10646 (Universal Character Set) 1727:Unicode character reference (wikibooks) 990: 355:question marks, boxes, or other symbols 3587: 1970:- shows how they look in one's browser 1670:For displaying characters outside the 1158: 1118: 1069:. For a browser from a location where 1005:Content-Type: text/html; charset=UTF-8 752:. 
This set is defined in the HTML 4.0 86:Please improve this article by adding 2564: 1994: 1677: 1612: 1027:, and thus, not if the page is using 995:When a document is transmitted via a 822: 2542:International Components for Unicode 2491:Common Locale Data Repository (CLDR) 1836: 1051: 248:adding citations to reliable sources 219: 161: 59: 18: 1979:Multi-lingual web pages and Unicode 1686:'s web index, in December 2007 the 705:, and varying levels of support by 13: 3523:Mathematical operators and symbols 1076: 14: 3611: 1832: 693:is complicated by the details of 34:This article has multiple issues. 3564: 3563: 3553: 3552: 3535:Phonetic symbols (including IPA) 1939:- Unicode fonts and information. 1920:Mathematical, Greek and Symbolic 1841: 1825:Official Google blog, 5 May 2008 1682:According to internal data from 1094:Byte order mark/Unicode sniffing 982:Character encoding determination 224: 166: 64: 23: 235:needs additional citations for 42:or discuss these issues on the 1955:http://www.unicode.org/charts/ 1812: 1788: 1764: 1739: 658:encoding). 
It was extended to 1: 2475:International Ideographs Core 2285:International Ideographs Core 2226:Alias names and abbreviations 1981:- how to fix display problems 1937:Alan Wood’s Unicode Resources 1732: 1139:For HTML documents which are 801:Unicode Transformation Format 609:Comparison of browser engines 88:secondary or tertiary sources 2697:CJK Unified Ideographs (Han) 2547:People involved with Unicode 1951:CJK Compatibility Ideographs 1604:Face with Tears of Joy emoji 1045:<meta charset="UTF-8"> 848:Numeric character references 813:numeric character references 7: 2020: 1717:Character encodings in HTML 1705: 1623:Some web browsers, such as 1181:What your browser displays 860:numeric character reference 854:Numeric character reference 10: 3616: 2537:Ideographic Research Group 2532:ConScript Unicode Registry 1985:w3.org via web.archive.org 914:character entity reference 911: 851: 3548: 3500: 3479: 3090: 2614: 2574: 2560: 2519: 2483: 2430: 2417:Regional indicator symbol 2360: 2293: 2250: 2243: 2173: 2126:Combining grapheme joiner 2111: 2104: 2054: 2028: 1643:, as long as appropriate 1590: 1567: 1541: 1515: 1488: 1461: 1438: 1412: 1384: 1358: 1332: 1306: 1283: 1257: 1234: 1211: 1185: 1180: 1177: 1174: 1171: 961: 673:The relationship between 604:Document markup languages 3570:Category: Unicode blocks 2375:Compatibility characters 1672:Basic Multilingual Plane 1654:Older browsers, such as 974:For the full list, see: 908:Named character entities 717:Web pages are typically 713:HTML document characters 2295:Comparison of encodings 2221:Halfwidth and fullwidth 2076:Universal Character Set 762:Universal Character Set 760:and ISO/IEC 10646: the 3220:Inscriptional Parthian 2907:Nyiakeng Puachue Hmong 2569:and symbols in Unicode 2186:CJK Unified Ideographs 930:, which take the form 746:document character set 343:This article contains 188:by rewriting it in an 75:relies excessively on 3356:Old Persian cuneiform 3215:Inscriptional Pahlavi 3110:Ancient North Arabian 3105:Anatolian 
hieroglyphs 2395:Precomposed character 2231:Whitespace characters 2160:Zero-width non-joiner 1823:Moving to Unicode 5.1 1502:CJK Unified Ideograph 1475:CJK Unified Ideograph 1296:Latin capital letter 1153:application/xhtml+xml 1136:the manual override. 1128:or – 1083:programming languages 495:Document Object Model 3175:Egyptian hieroglyphs 2380:Duplicate characters 2196:Duplicate characters 1862:improve this article 1746:Ian Hickson (2011). 1702:(Western European). 1454:letter A (Japanese) 991:Encoding information 500:Browser Object Model 244:improve this article 3240:Khitan small script 2677:Canadian Aboriginal 2412:Variation sequences 2370:Combining character 2280:Variation sequences 2191:Combining character 1874:footnote references 1660:Internet Explorer 6 1647:are present in the 1506:Traditional Chinese 1247:Latin small letter 1224:Latin small letter 1168: 1159:Web browser support 1132: – 1119:Encoding overriding 473:Character encodings 3480:Notational scripts 3431:Tagalog (Baybayin) 3140:Caucasian Albanian 2463:numeric references 2438:Domain names (IDN) 2258:Bidirectional text 2135:Right-to-left mark 2131:Left-to-right mark 2086:Character property 2036:Unicode Consortium 1966:2007-11-03 at the 1678:Frequency of usage 1656:Netscape Navigator 1479:Simplified Chinese 1166: 1039:or (starting with 920:character entities 823:Character encoding 695:character encoding 345:special characters 259:"Unicode and HTML" 190:encyclopedic style 177:is written like a 99:"Unicode and HTML" 3582: 3581: 3578: 3577: 3559:Category: Unicode 2596:Punctuation marks 2578:inherited scripts 2484:Related standards 2458:entity references 2356: 2355: 2239: 2238: 2155:Zero-width joiner 1902: 1901: 1894: 1722:Charset detection 1637:Internet Explorer 1621: 1620: 1087:operating systems 1052:Encoding defaults 928:entity references 786:as a sequence of 687:natural languages 645: 644: 351:rendering support 338: 337: 330: 320: 319: 312: 294: 218: 217: 210: 160: 159: 152: 134: 57: 3607: 3567: 3566: 
3556: 3555: 3518:Control Pictures 3471:Zanabazar Square 3210:Imperial Aramaic 3093:historic scripts 2562: 2561: 2422:Emoji skin color 2248: 2247: 2165:Zero-width space 2109: 2108: 2096:Private Use Area 2081:Character charts 2015: 2008: 2001: 1992: 1991: 1897: 1890: 1886: 1883: 1877: 1845: 1844: 1837: 1826: 1816: 1810: 1809: 1807: 1806: 1792: 1786: 1785: 1783: 1782: 1768: 1762: 1761: 1756: 1754: 1743: 1649:operating system 1600: 1596: 1577: 1573: 1551: 1547: 1525: 1521: 1498: 1494: 1471: 1467: 1448: 1444: 1422: 1418: 1394: 1390: 1368: 1364: 1342: 1338: 1316: 1312: 1293: 1289: 1267: 1263: 1244: 1240: 1221: 1217: 1195: 1191: 1169: 1165: 1154: 1147: 1142: 1060: 1046: 1038: 1034: 1018: 1009:Unicode encoding 1006: 966: 959: 955: 951: 942: 934: 903: 899: 891: 874: 866: 818: 735:computer storage 637: 630: 623: 588:Rendering engine 478:named characters 362: 361: 333: 326: 315: 308: 304: 301: 295: 293: 252: 228: 220: 213: 206: 202: 199: 193: 170: 169: 162: 155: 148: 144: 141: 135: 133: 92: 68: 60: 49: 27: 26: 19: 3615: 3614: 3610: 3609: 3608: 3606: 3605: 3604: 3585: 3584: 3583: 3574: 3544: 3528:List by subject 3501:Symbols, emojis 3496: 3475: 3391:Psalter Pahlavi 3092: 3086: 2947:Pracalit (Newa) 2762:Hanifi Rohingya 2610: 2586:Combining marks 2577: 2570: 2556: 2552:Han unification 2515: 2479: 2426: 2362: 2352: 2289: 2235: 2169: 2113:Special purpose 2100: 2050: 2024: 2019: 1968:Wayback Machine 1898: 1887: 1881: 1878: 1859: 1850:This article's 1846: 1842: 1835: 1830: 1829: 1817: 1813: 1804: 1802: 1800:bugs.webkit.org 1794: 1793: 1789: 1780: 1778: 1770: 1769: 1765: 1752: 1750: 1744: 1740: 1735: 1708: 1680: 1625:Mozilla Firefox 1598: 1594: 1583:letter ഷ (ṣha) 1575: 1571: 1549: 1545: 1523: 1519: 1496: 1492: 1469: 1465: 1446: 1442: 1420: 1416: 1392: 1388: 1366: 1362: 1340: 1336: 1322:capital letter 1314: 1310: 1291: 1287: 1273:capital letter 1265: 1261: 1242: 1238: 1219: 1215: 1201:capital letter 1193: 1189: 1161: 1152: 1145: 1140: 1121: 1096: 1079: 1077:Encoding trends 1058: 
1054: 1044: 1036: 1032: 1016: 1013:byte order mark 1004: 993: 984: 957: 953: 949: 947: 940: 938: 932: 916: 910: 901: 897: 889: 879: 872: 870: 864: 856: 850: 825: 816: 815:. For example, 715: 699:markup language 691:writing systems 685:from different 641: 360: 359: 358: 349:Without proper 334: 323: 322: 321: 316: 305: 299: 296: 253: 251: 241: 229: 214: 203: 197: 194: 186:help improve it 183: 171: 167: 156: 145: 139: 136: 93: 91: 85: 81:primary sources 69: 28: 24: 17: 12: 11: 5: 3613: 3603: 3602: 3597: 3580: 3579: 3576: 3575: 3573: 3572: 3561: 3549: 3546: 3545: 3543: 3542: 3537: 3532: 3531: 3530: 3520: 3515: 3510: 3504: 3502: 3498: 3497: 3495: 3494: 3489: 3483: 3481: 3477: 3476: 3474: 3473: 3468: 3463: 3458: 3453: 3448: 3443: 3438: 3433: 3428: 3423: 3418: 3413: 3408: 3403: 3398: 3393: 3388: 3383: 3378: 3373: 3368: 3363: 3358: 3353: 3348: 3343: 3338: 3333: 3328: 3323: 3318: 3313: 3308: 3303: 3298: 3293: 3288: 3283: 3278: 3273: 3268: 3263: 3258: 3252: 3247: 3242: 3237: 3232: 3227: 3222: 3217: 3212: 3207: 3202: 3197: 3192: 3187: 3182: 3177: 3172: 3167: 3162: 3157: 3152: 3147: 3142: 3137: 3132: 3127: 3122: 3117: 3112: 3107: 3102: 3096: 3094: 3088: 3087: 3085: 3084: 3079: 3074: 3069: 3064: 3059: 3054: 3049: 3044: 3039: 3034: 3029: 3024: 3019: 3014: 3009: 3004: 2999: 2994: 2989: 2984: 2982:Sorang Sompeng 2979: 2974: 2969: 2964: 2959: 2954: 2949: 2944: 2939: 2934: 2929: 2924: 2919: 2914: 2909: 2904: 2899: 2894: 2889: 2884: 2879: 2874: 2872:Miao (Pollard) 2869: 2864: 2859: 2854: 2849: 2844: 2839: 2834: 2829: 2824: 2819: 2814: 2809: 2804: 2799: 2794: 2789: 2784: 2779: 2774: 2769: 2764: 2759: 2754: 2749: 2744: 2739: 2734: 2729: 2724: 2719: 2714: 2709: 2704: 2699: 2694: 2689: 2684: 2679: 2674: 2669: 2664: 2659: 2654: 2649: 2644: 2639: 2634: 2629: 2624: 2618: 2616: 2615:Modern scripts 2612: 2611: 2609: 2608: 2603: 2598: 2593: 2588: 2582: 2580: 2572: 2571: 2558: 2557: 2555: 2554: 2549: 2544: 2539: 2534: 2529: 2523: 2521: 2520:Related topics 2517: 2516: 2514: 2513: 2508: 2503: 
2498: 2493: 2487: 2485: 2481: 2480: 2478: 2477: 2472: 2467: 2466: 2465: 2460: 2450: 2445: 2440: 2434: 2432: 2428: 2427: 2425: 2424: 2419: 2414: 2409: 2404: 2403: 2402: 2392: 2387: 2382: 2377: 2372: 2366: 2364: 2358: 2357: 2354: 2353: 2351: 2350: 2345: 2340: 2335: 2330: 2325: 2320: 2315: 2310: 2305: 2299: 2297: 2291: 2290: 2288: 2287: 2282: 2277: 2272: 2271: 2270: 2260: 2254: 2252: 2245: 2241: 2240: 2237: 2236: 2234: 2233: 2228: 2223: 2218: 2213: 2208: 2203: 2198: 2193: 2188: 2183: 2177: 2175: 2171: 2170: 2168: 2167: 2162: 2157: 2152: 2147: 2142: 2137: 2128: 2123: 2117: 2115: 2106: 2102: 2101: 2099: 2098: 2093: 2088: 2083: 2078: 2073: 2072: 2071: 2060: 2058: 2052: 2051: 2049: 2048: 2043: 2038: 2032: 2030: 2026: 2025: 2018: 2017: 2010: 2003: 1995: 1989: 1988: 1982: 1976: 1971: 1958: 1952: 1946: 1940: 1934: 1929: 1926:UnicodeMap.org 1923: 1909: 1900: 1899: 1854:external links 1849: 1847: 1840: 1834: 1833:External links 1831: 1828: 1827: 1811: 1787: 1763: 1737: 1736: 1734: 1731: 1730: 1729: 1724: 1719: 1714: 1707: 1704: 1679: 1676: 1641:Unicode blocks 1619: 1618: 1610: 1609: 1606: 1601: 1592: 1588: 1587: 1584: 1578: 1569: 1565: 1564: 1561: 1552: 1543: 1539: 1538: 1535: 1526: 1517: 1513: 1512: 1509: 1499: 1490: 1486: 1485: 1482: 1472: 1463: 1459: 1458: 1455: 1449: 1440: 1436: 1435: 1432: 1423: 1414: 1410: 1409: 1406: 1395: 1386: 1382: 1381: 1378: 1369: 1360: 1356: 1355: 1352: 1343: 1334: 1330: 1329: 1326: 1317: 1308: 1304: 1303: 1300: 1294: 1285: 1281: 1280: 1277: 1268: 1259: 1255: 1254: 1251: 1245: 1236: 1232: 1231: 1228: 1222: 1213: 1209: 1208: 1205: 1196: 1187: 1183: 1182: 1179: 1176: 1175:HTML char ref 1173: 1160: 1157: 1120: 1117: 1095: 1092: 1078: 1075: 1053: 1050: 1035:element, like 992: 989: 983: 980: 945: 936: 912:Main article: 909: 906: 877: 868: 852:Main article: 849: 846: 824: 821: 714: 711: 643: 642: 640: 639: 632: 625: 617: 614: 613: 612: 611: 606: 598: 597: 593: 592: 591: 590: 585: 580: 575: 570: 569: 568: 558: 557: 556: 551: 546: 536: 535: 534: 524: 519: 


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.