Knowledge

Unicode

Source 📝

5030:
sidebearing (depending on the direction of the script they are intended to be used with). A mark handled this way will appear over whatever character precedes it, but will not adjust its position relative to the width or height of the base glyph; it may be visually awkward and it may overlap some glyphs. Real stacking is impossible but can be approximated in limited cases (for example, Thai top-combining vowels and tone marks can just be at different heights to start with). Generally, this approach is only effective in monospaced fonts but may be used as a fallback rendering method when more complex methods fail.
6570: 4855:, so in principle Unicode could have enabled their composition as it did with Hangul. While this could have greatly reduced the number of required code points, as well as allowing the algorithmic synthesis of many arbitrary new characters, the complexities of character etymologies and the post-hoc nature of radical systems add immense complexity to the proposal. Indeed, attempts to design CJK encodings on the basis of composing radicals have been met with difficulties resulting from the reality that Chinese characters do not decompose as simply or as regularly as Hangul does. 6343:. This was an attempt to provide a Unicode solution to encoding paragraphs and lines semantically, potentially replacing all of the various platform solutions. In doing so, Unicode does provide a way around the historical platform-dependent solutions. Nonetheless, few if any Unicode solutions have adopted these Unicode line and paragraph separators as the sole canonical line ending characters. However, a common approach to solving this issue is through newline normalization. This is achieved with the 4922: 14505: 12461: 12450: 6477: 29: 4945: 629: 5738:, the UTF-8 standard, recommends that byte order marks be forbidden in protocols using UTF-8, but discusses the cases where this may not be possible. In addition, the large restriction on possible patterns in UTF-8 (for instance there cannot be any lone bytes with the high bit set) means that it should be possible to distinguish UTF-8 from other character encodings without relying on the BOM. 6285:) is based on those). These font formats map Unicode code points to glyphs, but OpenType and TrueType font files are restricted to 65,535 glyphs. Collection files provide a "gap mode" mechanism for overcoming this limit in a single font file. (Each font within the collection still has the 65,535 limit, however.) A TrueType Collection file would typically have a file extension of ".ttc". 6789:, which worked in the same way, and was the way in which Thai had always been written on keyboards. This ordering problem complicates the Unicode collation process slightly, requiring table lookups to reorder Thai characters for collation. Even if Unicode had adopted encoding according to spoken order, it would still be problematic to collate words in dictionary order. E.g., the word 6766:
going against the practice for other writing systems, though Unicode contains some Arabic and other ligatures for backward compatibility purposes only. Encoding of any new ligatures in Unicode will not happen, in part, because the set of ligatures is font-dependent, and Unicode is an encoding independent of font variations. The same kind of issue arose for the
549:, and thousands of rarely used or obsolete characters that had not been anticipated for inclusion in the standard. Among these characters are various rarely used CJK characters—many mainly being used in proper names, making them far more necessary for a universal encoding than the original Unicode architecture envisioned. 280:, with the continued development thereof conducted by the Consortium as a part of the standard. Moreover, the widespread adoption of Unicode was in large part responsible for the initial popularization of emoji outside of Japan. Unicode is ultimately capable of encoding more than 1.1 million characters. 6879:
many code point "names" from version 1. At the same moment, Unicode stated that, thenceforth, an assigned name to a code point would never change. This implies that when mistakes are published, these mistakes cannot be corrected, even if they are trivial (as happened in one instance with the spelling
6820:
Characters with diacritical marks can generally be represented either as a single precomposed character or as a decomposed sequence of a base letter plus one or more non-spacing marks. For example, ḗ (precomposed e with macron and acute above) and ḗ (e followed by the combining macron above and
6448:
The earliest version of Unicode had a repertoire of fewer than 21,000 Han characters, largely limited to those in relatively common modern usage. As of version 16.0, the standard now encodes more than 97,000 Han characters, and work is continuing to add thousands more—largely historical and dialectal
6399:
is to be considered a handwriting/font difference (and thus unified), versus a spelling difference (to be encoded separately). Unicode's character model for CJK characters was based on the unification criteria used by JIS X 0208, as well as those developed by the Association for a Common Chinese Code
3192:
property. Here, at the uppermost level code points are categorized as one of Letter, Mark, Number, Punctuation, Symbol, Separator, or Other. Under each category, each code point is then further subcategorized. In most cases, other properties must be used to adequately describe all the characteristics
775:
was sold as a print volume containing the complete core specification, standard annexes, and code charts. However, version 5.0, published in 2006, was the last version printed this way. Starting with version 5.2, only the core specification, published as a print-on-demand paperback, may be purchased.
6488:
If the appropriate glyphs for characters in the same script differ only in the italic, Unicode has generally unified them, as can be seen in the comparison among a set of seven characters' italic glyphs as typically appearing in Russian, traditional Bulgarian, Macedonian, and Serbian texts at right,
6049:
Online tools for finding the code point for a known character include Unicode Lookup by Jonathan Hedley and Shapecatcher by Benjamin Milde. In Unicode Lookup, one enters a search key (e.g. "fractions"), and a list of corresponding characters with their code points is returned. In Shapecatcher, based
10111:
Select this deployment format if your system supports variable fonts and you prefer to use only one language, but also want full character coverage or the ability to language-tag text to use glyphs that are appropriate for the other languages (this requires an app that supports language tagging and
6347:
in Mac OS X and also with W3C XML and HTML recommendations. In this approach, every possible newline character is converted internally to a common newline (which one does not really matter since it is an internal operation just for rendering). In other words, the text system can correctly treat the
5745:
code unit serves as a fairly direct representation of any character's code point (although the endianness, which varies across different platforms, affects how the code unit manifests as a byte sequence). In the other encodings, each code point may be represented by a variable number of code units.
4906:
of an ideograph. There is no canonical description of unencoded ideographs; there is no semantic assigned to described ideographs; there is no equivalence defined for described ideographs. Conceptually, ideographic descriptions are more akin to the English phrase "an 'e' with an acute accent on it"
306:
is more than just a repertoire within which characters are assigned. To aid developers and designers, the standard also provides charts and reference data, as well as annexes explaining concepts germane to various scripts, providing guidance for their implementation. Topics covered by these annexes
6878:
has imposed rules intended to guarantee stability. Depending on the strictness of a rule, a change can be prohibited or allowed. For example, a "name" given to a code point cannot and will not change. But a "script" property is more flexible, by Unicode's own rules. In version 2.0, Unicode changed
6784:
support has been criticized for its ordering of Thai characters. The vowels เ, แ, โ, ใ, ไ that are written to the left of the preceding consonant are in visual order instead of phonetic order, unlike the Unicode representations of other Indic scripts. This complication is due to Unicode inheriting
5055:
minority languages and the official languages of Iceland, Liechtenstein, Norway, and Switzerland. To allow the transliteration of names in other writing systems to the Latin script according to the relevant ISO standards, all necessary combinations of base letters and diacritic signs are provided.
783:
has regularly released annual expanded versions, occasionally with more than one version released in a calendar year and with rare cases where the scheduled release had to be postponed. For instance, in April 2020, a month after version 13.0 was published, the Unicode Consortium announced they had
6765:
standard. The correct rendering of Unicode Indic text requires transforming the stored logical order characters into visual order and the forming of ligatures (also known as conjuncts) out of components. Some local scholars argued in favor of assignments of Unicode code points to these ligatures,
5054:
specifies a subset of Unicode letters, special characters, and sequences of letters and diacritic signs to allow the correct representation of names and to simplify data exchange in Europe. This standard supports all of the official languages of all European Union countries, as well as the German
509:
Unicode gives higher priority to ensuring utility for the future than to preserving past antiquities. Unicode aims in the first instance at the characters published in the modern text (e.g. in the union of all newspapers and magazines printed in the world in 1988), whose number is undoubtedly far
5046:
with 657 characters, which is considered to support all contemporary European languages using the Latin, Greek, or Cyrillic script. Other standardized subsets of Unicode include the Multilingual European Subsets: MES-1 (Latin scripts only; 335 characters), MES-2 (Latin, Greek, and Cyrillic; 1062
358:
Unicode was originally designed with the intent of transcending limitations present in all text encodings designed up to that point: each encoding was relied upon for use in its own context, but with no particular expectation of compatibility with any other. Indeed, any two encodings chosen were
6071:
defines two different mechanisms for encoding non-ASCII characters in email, depending on whether the characters are in email headers (such as the "Subject:"), or in the text body of the message; in both cases, the original character set is identified as well as a transfer encoding. For email
5029:
Instructions are also embedded in fonts to tell the operating system how to properly output different character sequences. A simple solution to the placement of combining marks or diacritics is assigning the marks a width of zero and placing the glyph itself to the left or right of the left
4674:
representable under Unicode. Unicode encodes characters by associating an abstract character with a particular code point. However, not all abstract characters are encoded as a single Unicode character, and some abstract characters may be represented in Unicode by a sequence of two or more
6295:
typically focus on supporting only basic ASCII and particular scripts or sets of characters or symbols. Several reasons justify this approach: applications and documents rarely need to render characters from more than one or two writing systems; fonts tend to demand resources in computing
6641:
to and from any preexisting character encodings, so that text files in older character sets can be converted to Unicode and then back and get back the same file, without employing context-dependent interpretation. That has meant that inconsistent legacy architectures, such as
6660:
mappings must be provided between characters in existing legacy character sets and characters in Unicode to facilitate conversion to Unicode and allow interoperability with legacy software. Lack of consistency in various mappings between earlier Japanese encodings such as
414:
standard, with the intent of trivializing the conversion of text already written in Western European scripts. To preserve the distinctions made by different legacy encodings, therefore allowing for conversion between them and Unicode without any loss of information, many
6404:
variants, possibly complicating processing of ancient and uncommon Japanese names. Since it places particular emphasis on Chinese, Japanese and Korean sharing many characters in common, Han unification is also sometimes perceived as treating the three as the same thing.
6845:, will often be placed incorrectly. Unicode characters that map to precomposed glyphs can be used in many cases, thus avoiding the problem, but where no precomposed character has been encoded, the problem can often be solved by using a specialist Unicode font such as 5888:
Unicode has become the dominant scheme for the internal processing and storage of text. Although a great deal of text is still stored in legacy encodings, Unicode is used almost exclusively for building new information processing systems. Early adopters tended to use
3092: 2656:, extended IPA, Arabic script additions for use in languages across Africa and in Iran, Pakistan, Malaysia, Indonesia, Java, and Bosnia, additions for honorifics and Quranic use, additions to support languages in North America, the Philippines, India, and Mongolia, 3179:
of related characters. The size of a block is always a multiple of 16, and is often a multiple of 128, but is otherwise arbitrary. Characters required for a given script may be spread out over several different, potentially disjunct blocks within the codespace.
6605:, many of which look very similar or identical to ASCII letters. Substitution of these can make an identifier or URL that looks correct, but directs to a different location than expected. Additionally, homoglyphs can also be used for manipulating the output of 6376:(IRG) is tasked with advising the Consortium and ISO regarding Han unification, or Unihan, especially the further addition of CJK unified and compatibility ideographs to the repertoire. The IRG is composed of experts from each region that has historically used 667:), although there are still scripts that are not yet encoded, particularly those mainly used in historical, liturgical, and academic contexts. Further additions of characters to the already encoded scripts, as well as symbols, in particular for mathematics and 609:
The Consortium has the ambitious goal of eventually replacing existing character encoding schemes with Unicode and its standard Unicode Transformation Format (UTF) schemes, as many of the existing schemes are limited in size and scope and are incompatible with
544:
In 1996, a surrogate character mechanism was implemented in Unicode 2.0, so that Unicode was no longer restricted to 16 bits. This increased the Unicode codespace to over a million code points, which allowed for the encoding of many historic scripts, such as
4772:
versions of most letter/diacritic combinations in normal use. These make the conversion to and from legacy encodings simpler, and allow applications to use Unicode as an internal text format without having to implement combining characters. For example,
6408:
Less-frequently-used alternative encodings exist, often predating Unicode, with character models differing from this paradigm, aimed at preserving the various stylistic differences between regional and/or nonstandard character forms. One example is the
6654:. Since version 3.0, any precomposed characters that can be represented by a combined sequence of already existing characters can no longer be added to the standard to preserve interoperability between software using different versions of Unicode. 482:, Becker published a draft proposal for an "international/multilingual text character encoding system in August 1988, tentatively called Unicode". He explained that "the name 'Unicode' is intended to suggest a unique, unified, universal encoding". 407:. However, partially with the intent of encouraging rapid adoption, the simplicity of this original model has become somewhat more elaborate over time, and various pragmatic concessions have been made over the course of the standard's development. 5002:. The rules governing ligature formation can be quite complex, requiring special script-shaping technologies such as ACE (Arabic Calligraphic Engine by DecoType in the 1980s and used to generate all the Arabic examples in the printed editions of 566:
The Unicode Consortium is a nonprofit organization that coordinates Unicode's development. Full members include most of the main computer software and hardware companies (and few others) with any interest in text-processing standards, including
6800:"perform" starts with a consonant cluster "สด" (with an inherent vowel for the consonant "ส"), the vowel แ-, in spoken order would come after the ด, but in a dictionary, the word is collated as it is written, with the vowel following the ส. 1125:
Original set of Hangul syllables removed, new set of 11,172 Hangul syllables added at new location, Tibetan added back in a new location and with a different character repertoire, Surrogate character mechanism defined, Plane 15 and Plane 16
5725:
allows the BOM "can serve as a signature for UTF-8 encoded text where the character set is unmarked". Some software developers have adopted it for other encodings, including UTF-8, in an attempt to distinguish UTF-8 from local 8-bit
3158:(BMP), and contains the most commonly used characters. All code points in the BMP are accessed as a single code unit in UTF-16 encoding and can be encoded in one, two or three bytes in UTF-8. Code points in planes 1 through 16 (the 5612:. All UTF encodings map code points to a unique sequence of bytes. The numbers in the names of the encodings indicate the number of bits per code unit (for UTF encodings) or the number of bytes per code unit (for UCS encodings and 6609:
systems. Mitigation requires disallowing these characters, displaying them differently, or requiring that they resolve to the same identifier; all of this is complicated due to the huge and constantly changing set of characters.
510:
below 2 = 16,384. Beyond those modern-use characters, all others may be defined to be obsolete or rare; these are better candidates for private-use registration than for congesting the public list of generally useful Unicode.
5860:
characters to display content, UTF-8 was designed with 8-bit ASCII as a subset and almost no websites now declare their encoding to only be ASCII instead of UTF-8. Over a third of the languages tracked have 100% UTF-8 use.
4356:>, <control-0088>. The Name remains blank, which can prevent inadvertently replacing, in documentation, a Control Name with a true Control code. Unicode also uses <not a character> for <noncharacter>. 4850:
presently only have codes for uncomposable radicals and precomposed forms. Most Han characters have either been intentionally composed from, or reconstructed as compositions of, simpler orthographic elements called
678:, Rick McGowan, Ken Whistler, V.S. Umamaheswaran) maintain the list of scripts that are candidates or potential candidates for encoding and their tentative code block assignments on the Unicode Roadmap page of the 6554:, which unifies the uppercase forms). Although it allows for case-insensitive comparison without needing to know the language of the text, this approach also has issues, requiring security measures relating to 501:" that has been stretched to 16 bits to encompass the characters of all the world's living languages. In a properly engineered design, 16 bits per character are more than sufficient for this purpose. 6732:, which was mapped to 0x5C in JIS X 0201, and a lot of legacy code exists with this usage. (This encoding also replaces tilde '~' 0x7E with macron '¯', now 0xAF.) The separation of these characters exists in 779:
A practical reason for this publication method highlights the second significant difference between the UCS and Unicode—the frequency with which updated versions are released and new characters added.
4474:
such that any interchange of such code points requires an independent agreement between the sender and receiver as to their interpretation. There are three private-use areas in the Unicode codespace:
6177:
according to the document's encoding, if the encoding supports them, or users may write them as numeric character references based on the character's Unicode code point. For example, the references
7399:(ideographic character unification). Unicode retains the many features of XCCS whose utility has been proved over the years in an international line of communication multilingual system products. 736:
was founded in 2002 with the goal of funding proposals for scripts not yet encoded in the standard. The project has become a major source of proposed additions to the standard in recent years.
7393:(XCCS) by the present author, a multilingual encoding that has been maintained by Xerox as an internal corporate standard since 1982, through the efforts of Ed Smura, Ron Pellar, and others. 826:
have been published. Update versions, which do not include any changes to character repertoire, are signified by the third number (e.g., "version 4.0.1") and are omitted in the table below.
276:
Many common characters, including numerals, punctuation, and other symbols, are unified within the standard and are not treated as specific to any given writing system. Unicode encodes 3790
7395:
Unicode arose as the result of eight years of working experience with XCCS. Its fundamental differences from XCCS were proposed by Peter Fenwick and Dave Opstad (pure 16-bit codes) and by
6258:, from the more common bold, italic and base letterforms to complex decorative styles. A font is "Unicode compliant" if the glyphs in the font can be accessed using code points defined in 4446:). The set of noncharacters is stable, and no new noncharacters will ever be defined. Like surrogates, the rule that these cannot be used is often ignored, although the operation of the 6024:
Because keyboard layouts cannot have simple key combinations for all characters, several operating systems provide alternative input methods that allow access to the entire repertoire.
759:
While the UCS is a simple character map, Unicode specifies the rules, algorithms, and properties necessary to achieve interoperability between different platforms and languages. Thus,
399:
to each character. Many issues of visual representation—including size, shape, and style—are intended to be up to the discretion of the software actually rendering the text, such as a
6400:
in China. Due to the standard's principle of encoding semantic instead of stylistic variants, Unicode has received criticism for not assigning code points to certain rare and archaic
6291:
exist on the market, but fewer than a dozen fonts—sometimes described as "pan-Unicode" fonts—attempt to support the majority of Unicode's character repertoire. Instead, Unicode-based
4578:
may be used to change the default shaping behavior of adjacent characters (e.g. to inhibit ligatures or request ligature formation). There are 172 format characters in Unicode 16.0.
10860: 530:'s Rick McGowan had also joined the group. By the end of 1990, most of the work of remapping existing standards had been completed, and a final review draft of Unicode was ready. 4410:
A small set of code points are guaranteed never to be assigned to characters, although third-parties may make independent use of them at their discretion. There are 66 of these
6887:
in a character name). In 2006 a list of anomalies in character names was first published, and, as of June 2021, there were 104 characters with identified issues, for example:
2600:
in Pakistan, Bopomofo additions used for Cantonese, Creative Commons license symbols, graphic characters for compatibility with teletext and home computer systems, 55 emoji
14123: 6456:
Modern typefaces provide a means to address some of the practical issues in depicting unified Han characters with various regional graphical representations. The 'locl'
5673:) per code point and, being compact for Latin scripts and ASCII-compatible, provides the de facto standard encoding for the interchange of Unicode text. It is used by 6535:. As such, case-insensitive comparisons for those languages have to use different rules than case-insensitive comparisons for other languages using the Latin script. 6300:. Furthermore, designing a consistent set of rendering instructions for tens of thousands of glyphs constitutes a monumental task; such a venture passes the point of 10845: – contains lists of word processors with Unicode capability; fonts and characters are grouped by type; characters are presented in lists, not grids. 6296:
environments; and operating systems and applications show increasing intelligence in regard to obtaining glyph information from separate font files as needed, i.e.,
5762:
programming language (beginning with 2.2) may also be configured to use UTF-32 as the representation for Unicode strings, effectively disseminating such encoding in
5746:
UTF-32 is widely used as an internal representation of text in programs (as opposed to stored or transmitted text), since every Unix operating system that uses the
6098:, and some devices, such as mobile phones, still cannot correctly handle Unicode data. Support has been improving, however. Many major free mail providers such as 552:
Version 1.0 of Microsoft's TrueType specification, published in 1992, used the name "Apple Unicode" instead of "Unicode" for the Platform ID in the naming table.
287:, each used within different locales and on different computer architectures. Unicode is used to encode the vast majority of text on the Internet, including most 4768:
that may be added after the base character by the user. Multiple combining diacritics may be simultaneously applied to the same character. Unicode also contains
4665: 4554:
characters are characters that do not have a visible appearance but may have an effect on the appearance or behavior of neighboring characters. For example,
12968: 3224: 207: 9271:"DIN 91379:2022-08: Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM" 6833:(◌́), but in practice, their appearance may vary depending upon what rendering engine and fonts are being used to display the characters. Similarly, 6550:, which usually look the same in uppercase (Đ), are given the opposite treatment, and encoded separately in both letter-cases (in contrast to the earlier 6650:, both exist in Unicode, giving more than one method of representing some text. This is most pronounced in the three different encoding forms for Korean 603: 390:, there is considerable disagreement regarding which differences justify their own encodings, and which are only graphical variants of other characters. 363:
by the other. Most encodings had only been designed to facilitate interoperation between a handful of scripts—often primarily between a given script and
12805: 5856:. As of 2024, UTF-8 accounts for on average 97.8% of all web pages (and 987 of the top 1,000 highest-ranked web pages). Although many pages only use 514:
In early 1989, the Unicode working group expanded to include Ken Whistler and Mike Kernaghan of Metaphor, Karen Smith-Yoshimura and Joan Aliprand of
13914: 11353: 7032: 6088:
characters. The details of the two different mechanisms are specified in the MIME standards and generally are hidden from users of email software.
5571:
to indicate the position of the unrecognized character. Some systems have made attempts to provide more information about such characters. Apple's
11008: 5897:(the variable-length current standard), as this was the least disruptive way to add support for non-BMP characters. The best known such system is 4706:
All assigned characters have a unique and immutable name by which they are identified. This immutability has been guaranteed since version 2.0 of
11295: 6489:
meaning that the differences are displayed through smart font technology or manually changing fonts. The same OpenType 'locl' technique is used.
756:(UCS) use identical character names and code points. However, the Unicode versions do differ from their ISO equivalents in two significant ways. 12913: 6091:
The IETF has defined a framework for internationalized email using UTF-8, and has updated several protocols in accordance with that framework.
12988: 12502: 12418: 4407:. In principle, these code points cannot otherwise be used, though in practice this rule is often ignored, especially when not using UTF-16. 420: 8646: 7114:
is an abstract representation of an UCS character by an integer between 0 and 1,114,111 (1,114,112 = 2 + 2 or 17 × 2 = 0x110000 code points)
6262:. The standard does not specify a minimum number of characters that must be included in the font; some fonts have quite a small repertoire. 10908: 4710:
by its Name Stability policy. In cases where a name is seriously defective and misleading, or has a serious typographical error, a formal
346:, though several others exist. Of these, UTF-8 is the most widely used by a large margin, in part due to its backwards-compatibility with 6673:
mismatches, particularly the mapping of the character JIS X 0208 '~' (1-33, WAVE DASH), heavily used in legacy database data, to either
6433:
encoding (introduced in 1984, four years after CCCII) having become more common than CCCII outside of library systems. Although work at
698:, no proposal has yet been made, and they await agreement on character repertoire and other details from the user communities involved. 12403: 7048: 497:
Unicode is intended to address the need for a workable, reliable world text encoding. Unicode could be roughly described as "wide-body
4764:
Unicode includes a mechanism for modifying characters that greatly extends the supported repertoire of glyphs. This covers the use of
4649:
code points are those code points that are valid and available for use, but have not yet been assigned. As of Unicode 15.1, there are
10052: 7127:
style (Ꝺ) with the crossbar positioned on the stem, particularly if it needs to be distinguished from the uppercase retroflex D (see
6380:. However, despite the deliberation within the committee, Han unification has consistently been one of the most contested aspects of 10535: 4703:. Unicode maintains a list of uniquely named character sequences for abstract characters that are not directly encoded in Unicode. 3217: 200: 7328: 5644: 14215: 13969: 12423: 11213: 10877:– displays the Unicode 6.1 value of any character in a document, including in the Private Use Area, rather than the glyph itself. 10010: 6771: 6413:
favored by some users for handling historical Japanese text, though not widely adopted among the Japanese public. Another is the
5815: 5803: 121: 7389:. Many persons contributed ideas to the development of a new encoding design. Beginning in 1980, these efforts evolved into the 3530: 14205: 11198: 9904: 9850: 9796: 9742: 9688: 9634: 9509: 9455: 9232: 5811: 505:
This design decision was made based on the assumption that only scripts and characters in "modern" use would require encoding:
478:, started investigating the practicalities of creating a universal character set. With additional input from Peter Fenwick and 427:
contained both "fullwidth" (matching the width of CJK characters) and "halfwidth" (matching ordinary Latin script) characters.
116: 9212: 6622: 6464:
can also provide in-text annotations for a desired glyph selection; this requires registration of the specific variant in the
5967:
operating systems (though others are also used by some libraries) because it is a relatively easy replacement for traditional
13954: 12908: 11442: 11121: 10795: 10236: 9323: 8954: 8925: 8896: 8859: 8785: 8732: 8706: 8680: 8612: 8583: 8557: 8514: 8468: 8422: 8376: 8330: 8260: 8214: 7162: 6972: 6870: 6030:, which standardises methods for entering Unicode characters from their code points, specifies several methods. There is the 5047:
characters) and MES-3A & MES-3B (two larger subsets, not shown here). MES-2 includes every character in MES-1 and WGL-4.
2767: 2733: 2689: 2614: 2541: 2494: 2434: 2354: 2279: 2219: 2151: 2020: 1986: 1941: 1873: 1808: 1705: 6054:, one draws the character in a box and a list of characters approximating the drawing, with their code points, is returned. 3189: 690:, encoding proposals have been made and they are working their way through the approval process. For other scripts, such as 14088: 11437: 7012: 10874: 10380: 430:
The Unicode Bulldog Award is given to people deemed to be influential in Unicode's development, with recipients including
10352: 10084: 6774:
proposed encoding 956 precomposed Tibetan syllables, but these were rejected for encoding by the relevant ISO committee (
3210: 767:, and rendering. It also provides a comprehensive catalog of character properties, including those needed for supporting 230: 193: 6441:'s CJK Thesaurus, which was used to maintain the EACC variant of CCCII, was one of the direct predecessors of Unicode's 5852:
since 2008. It has near-universal adoption, and much of the non-UTF-8 content is found in other Unicode encodings, e.g.
3202: 602:
Over the years several countries or government agencies have been members of the Unicode Consortium. Presently only the
367:—not between a large number of scripts, and not with all of the scripts supported being treated in a consistent manner. 14482: 13993: 13796: 12540: 10971: 7366: 5567:
Rendering software that cannot process a Unicode character appropriately often displays it as an open rectangle, or as
5109: 2999: 791:
Unicode 16.0, the latest version, was released on 10 September 2024. It added 5,185 characters and seven new scripts:
733: 6625:
can be used to make large sections of code do something different from what they appear to do. The problem was named "
14038: 13654: 13649: 13152: 12983: 12530: 12495: 10766: 10753: 10740: 10727: 10710: 10697: 10684: 10666: 8168: 8119: 8056: 8010: 6136:
have supported Unicode, especially UTF-8, for many years. There used to be display problems resulting primarily from
4883: 1543: 1398: 1203: 1099: 1002: 880: 9961: 4255: 13072: 11270: 10901: 10776:
Haralambous, Yannis; Martin Dürst (2019). "Unicode from a Linguistic Point of View". In Haralambous, Yannis (ed.).
9245: 7602: 5529: 5495: 5347: 3808: 721: 14290: 14225: 13979: 13959: 11275: 11190: 11091: 10865: 7007: 5772:, another encoding form, enables the encoding of Unicode strings into the limited character set supported by the 5763: 416: 327: 11752: 11471: 8869: 8795: 6094:
The adoption of Unicode in email has been very slow. Some East Asian text is still encoded in encodings such as
14043: 12636: 7037: 6465: 5785: 4998:, have special orthographic rules that require certain combinations of letterforms to be combined into special 10404: 7580: 6316:
problem that occurs when trying to read a text file on different platforms. Unicode defines a large number of
4670:
The set of graphic and format characters defined by Unicode does not correspond directly to the repertoire of
4339: 4237: 4216: 68: 14157: 14128: 13778: 11572: 11386: 11370: 11333: 11180: 11116: 10936: 9132: 9040: 7390: 7063: 7043: 6998:. This, however, is not an anomaly, but the rule: hyphens are replaced by underscores in script designators. 6155:) documents, by definition, comprise characters from most of the Unicode code points, with the exception of: 5890: 5759: 5617: 5600: 5545: 4714:
may be defined that applications are encouraged to use in place of the official character name. For example,
1269: 1048: 846: 753: 459: 299: 48: 9057: 784:
changed the intended release date for version 14.0, pushing it back six months to September 2021 due to the
724:
focused on special Latin medieval characters. Part of these proposals has been already included in Unicode.
320: 14220: 14108: 14068: 12488: 12311: 12181: 11496: 11106: 7053: 6815: 6745: 6670: 6638: 4852: 1772: 2569: 1522: 14537: 14532: 14462: 14073: 14003: 13989: 13974: 13878: 13791: 13763: 13729: 12430: 11547: 11358: 11158: 10894: 6791: 6221: 6046:
specified, where the characters are listed in a table on a screen, such as with a character map program.
5950: 5330: 5094: 3843:). Does not include parentheses and brackets, which are in categories Ps and Pe. Also does not include 764: 316: 4295: 330:, which define how to translate the standard's abstracted codes for characters into sequences of bytes. 14436: 14381: 14302: 14083: 13739: 13734: 13087: 12110: 11597: 11432: 11427: 11076: 10981: 10026: 9091: 7128: 7027: 6706:
Some Japanese computer programmers objected to Unicode because it requires them to separate the use of
6373: 4765: 2837: 1446: 710: 308: 291:, and relevant Unicode support has become a common consideration in contemporary software development. 14078: 12115: 11491: 8647:"These Are The Two Emoji That Weren't Approved For Unicode 9 But Which Google Added To Android Anyway" 6460:
table allows a renderer to select a different glyph for each code point based on the text locale. The
5702:, has the important property of unambiguity on byte reorder, regardless of the Unicode encoding used; 5284:
00–15, 18–1D, 20–45, 48–4D, 50–57, 59, 5B, 5D, 5F–7D, 80–B4, B6–C4, C6–D3, D6–DB, DD–EF, F2–F4, F6–FE
4470:
code points are considered to be assigned, but they intentionally have no interpretation specified by
3859:, which despite frequent use as mathematical operators, are primarily considered to be "punctuation". 541:
was published that October. The second volume, now adding Han ideographs, was published in June 1992.
454:
The origins of Unicode can be traced back to the 1980s, to a group of individuals with connections to
14143: 14098: 13934: 13483: 13187: 13132: 13097: 12035: 11312: 11021: 7755: 7718: 6900: 6858: 6396: 5273: 5023: 4298:
Stability policy: Some gc groups will never change. gc=Nd corresponds with Numeric Type=De (decimal).
2480:, Latin letters for Egyptological and Ugaritic transliteration, hieroglyph format controls, 61 emoji 2047: 10883:, all 293 known writing systems with their Unicode status (128 not yet encoded as of June 2024) 10165: 9367: 9343: 4898:
warns against using its characters as an alternate representation for characters encoded elsewhere:
4641:
Together, graphic, format, control code, and private use characters are collectively referred to as
13658: 13167: 13147: 13142: 13082: 13077: 12586: 11126: 7827: 7637: 6629:". In response, code editors started highlighting marks to indicate forced text-direction changes. 6438: 5559: 5252: 5217: 4991: 4951: 4818:. Thus, users often have multiple equivalent ways of encoding the same character. The mechanism of 4454:
will never be the first code point in a text. The exclusion of surrogates and noncharacters leaves
4151: 3735:. Does not include the ASCII "neutral" quotation mark. May behave like Ps or Pe depending on usage 3155: 2250: 1176: 515: 14033: 11912: 10106: 8020: 7974: 7938: 7902: 7866: 7791: 1526: 14508: 14492: 14419: 14414: 14376: 14347: 14312: 13744: 13478: 13177: 13062: 12465: 12366: 12286: 11722: 11627: 10976: 10964: 6618: 6614: 6163: 5819: 5747: 5591:
Several mechanisms have been specified for storing a series of code points as a series of bytes.
5453: 4859: 4602: 2307: 1736: 226: 10261: 4269: 14103: 14093: 13949: 13939: 13473: 13182: 12627: 12614: 12550: 12055: 11802: 11657: 11592: 11307: 11175: 11081: 11040: 9229: 9073: 6606: 5436: 5398: 2641: 2465: 2389: 2195: 1752: 1030: 8089: 7278: 5758:, use UTF-32 as an internal representation for strings and characters. Recent versions of the 5688:(BOM) for use at the beginnings of text files, which may be used for byte-order detection (or 419:, in both appearance and intended function, were given distinct code points. For example, the 14280: 14118: 14053: 13929: 13468: 12622: 12251: 12005: 12000: 11877: 11290: 11055: 10571:"Unicode chart: "actually this has the form of a lowercase calligraphic p, despite its name"" 7396: 7238: 6647: 6292: 5960: 5662:, which uses one to five 8-bit units per code point, intended to maximize compatibility with 5580: 5515: 4999: 4819: 4769: 4088: 3355: 2947:
9.0 added Amendment 2, as well as Adlam, Newa, Japanese TV symbols, and 74 emoji and symbols.
2845: 2476:, hiragana and katakana small letters, Tamil historic fractions and symbols, Lao letters for 2417: 2381: 2183: 2099: 1921: 1748: 1506: 471: 266: 13498: 11957: 2986:
once a year. Version 17.0, the next major version, is projected to include 4301 new unified
2581: 14441: 14113: 13873: 13493: 12236: 12150: 12070: 11792: 11757: 11637: 10492: 10062: 9881: 9827: 9773: 9719: 9665: 9611: 9486: 9432: 9339: 9249: 8622: 8524: 8478: 8432: 8386: 8340: 8270: 8224: 8178: 8135: 8066: 7685: 7022: 6842: 6569: 6278: 5930: 5788:
in all scripts that are supported by Unicode. Earlier and now historical proposals include
2397: 2385: 2315: 2083: 1740: 1674: 1502: 546: 475: 463: 295: 8822: 7184: 5880:
in 1998, which specified that all IETF protocols "MUST be able to use the UTF-8 charset".
8: 14396: 14023: 13508: 13393: 13383: 13378: 12281: 12191: 12135: 11501: 11481: 11280: 11265: 11170: 11096: 11086: 10527: 10088: 9541: 6809: 6643: 6461: 6301: 6144:
did not render many code points unless explicitly told to use a font that contains them.
5945:
also use it for internal representation. Partial support for Unicode can be installed on
5678: 5316: 4688: 3856: 2649: 2577: 2393: 1582: 1327: 687: 312: 12276: 9920:"Setting up Windows Internet Explorer 5, 5.5 and 6 for Multilingual and Unicode Support" 9527: 9172: 8748: 7461: 7320: 6981: 6084:
transfer encoding are recommended, depending on whether much of the message consists of
4399:
code points. A high-surrogate code point followed by a low-surrogate code point forms a
3108:
valid code points within the codespace. (This number arises from the limitations of the
3044:; actual text is processed as binary data via one of several Unicode encodings, such as 1578: 705:) or which do not qualify for inclusion in Unicode due to lack of real-world use (e.g., 14477: 14325: 14138: 14133: 14058: 13057: 13031: 12555: 12511: 12382: 12301: 12271: 12241: 12221: 11857: 11837: 11587: 11153: 11030: 11026: 10931: 10719: 10658: 10242: 10214: 10173: 10151: 10034: 9270: 7552: 7362: 6826: 6775: 6377: 5777: 5575:
will display a substitute glyph indicating the Unicode range of the character, and the
5367: 4316: 4096: 3632: 3631:. Unlike other punctuation characters, these may be classified as "word" characters by 3351: 2965:
11.0 added 46 Mtavruli Georgian capital letters, 5 CJK unified ideographs, and 66 emoji
2123: 2115: 2111: 2051: 1454: 1229: 768: 745: 679: 633: 561: 534: 383: 252: 34: 10006: 9391: 6147:
Although syntax rules may affect the order in which characters are allowed to appear,
14467: 14406: 14386: 14048: 14028: 14008: 13636: 13112: 13092: 12604: 12351: 12261: 12246: 12085: 12050: 11862: 11702: 11527: 11338: 11050: 10991: 10791: 10762: 10749: 10736: 10723: 10706: 10693: 10680: 10662: 10628: 10246: 10232: 10143: 9919: 9898: 9844: 9790: 9736: 9682: 9628: 9503: 9449: 9319: 9192: 9112: 8950: 8921: 8892: 8855: 8781: 8728: 8702: 8676: 8608: 8579: 8553: 8510: 8464: 8418: 8372: 8326: 8256: 8210: 8164: 8115: 8052: 8006: 7158: 6687: 6344: 6297: 6288: 6141: 6063: 5576: 5019: 4308: 4158: 4100: 4092: 2919: 2905: 2833: 2203: 2103: 2079: 1852: 1776: 1650: 1626: 1494: 1450: 1426: 1127: 922: 902: 876: 785: 776:
The full text, on the other hand, is published as a free PDF on the Unicode website.
714: 691: 435: 11777: 10510: 7436: 7210: 4833:: Unicode provides a mechanism for composing Hangul syllables from their individual 3055:
always precedes a written code point, and the code points themselves are written as
771:, as well as visual charts and reference data sets to aid implementers. Previously, 283:
Unicode has largely supplanted the previous environment of a myriad of incompatible
14527: 14424: 13998: 13964: 13674: 13503: 12454: 12413: 12306: 12256: 12145: 12105: 12030: 12020: 12010: 11882: 11867: 11772: 11747: 11622: 11602: 11462: 11348: 11101: 11060: 10783: 10329: 10224: 10125: 9871: 9817: 9763: 9709: 9655: 9601: 9476: 9422: 9216: 7351: 7299: 7246: 7100: 6498: 6233: 6119: 6081: 5873: 5742: 5731: 5714:
in places other than the beginning of text conveys the zero-width non-break space.
5572: 5235: 5158: 5140: 5007: 3844: 3628: 3426: 2597: 2405: 2246: 2095: 2067: 2043: 1844: 1780: 1744: 1728: 1654: 1618: 1442: 1319: 1245: 1241: 966: 906: 668: 652: 623: 519: 490: 431: 374:
and grapheme-like units—rather than graphical distinctions considered mere variant
270: 10375: 7661: 7524: 7254: 4666:
Universal Character Set characters § Characters, grapheme clusters and glyphs
423:
block encompasses a full semantic duplicate of the Latin alphabet, because legacy
14472: 14391: 13122: 13117: 13107: 13052: 12737: 12727: 12722: 12717: 12712: 12707: 12702: 12408: 12361: 12346: 12206: 12171: 12166: 12100: 12090: 12040: 11907: 11897: 11892: 11842: 11812: 11682: 11672: 11632: 11532: 11447: 11422: 11111: 11016: 10986: 10855: 10670: 10228: 9986: 8944: 8915: 8886: 8849: 8775: 8722: 8696: 8670: 7616: 7152: 6834: 6410: 6367: 5685: 4871: 4447: 3960: 3881: 3397: 3175: 3145: 2808: 2657: 2653: 2645: 2254: 2187: 2119: 1909: 1905: 1788: 1784: 1756: 1646: 1642: 1566: 1518: 1381: 1331: 1323: 1261: 942: 934: 930: 914: 706: 675: 611: 443: 387: 364: 11617: 10831: 9884: 9865: 9830: 9811: 9776: 9757: 9722: 9703: 9668: 9649: 9614: 9595: 9489: 9470: 9435: 9416: 8602: 8573: 8547: 8504: 8458: 8412: 8366: 8320: 8250: 8204: 8158: 8109: 8046: 8000: 7964: 7928: 7892: 7856: 7817: 7781: 7745: 7708: 6217:
as the prefix) should display on all browsers as Δ, Й, ק ,م, ๗, あ, 叶, 葉, and 말.
5877: 5735: 1233: 701:
Some modern invented scripts which have not yet been included in Unicode (e.g.,
370:
The philosophy that underpins Unicode seeks to encode the underlying characters—
13924: 13919: 13909: 13904: 13899: 13894: 13858: 13853: 13846: 13841: 13836: 13831: 13826: 13821: 13816: 13811: 13806: 13801: 13669: 13626: 13621: 13616: 13611: 13606: 13601: 13596: 13591: 13586: 13581: 13576: 13571: 13566: 13561: 13556: 13463: 13458: 13453: 13448: 13443: 13438: 13433: 13428: 13423: 13418: 13413: 13408: 13192: 12777: 12697: 12692: 12687: 12682: 12677: 12672: 12667: 12662: 12657: 12525: 12316: 12296: 12216: 12196: 12186: 12095: 11942: 11872: 11847: 11827: 11782: 11767: 11742: 11692: 11667: 11612: 11567: 10587: 10570: 10469: 10446: 10423: 10384: 10206: 7411: 7382: 7124: 7058: 6850: 6767: 6502: 6434: 5976: 5968: 5934: 5849: 5751: 5620:
is an obsolete subset of UTF-16; UCS-4 and UTF-32 are functionally equivalent.
5482: 5287: 5188: 5039: 5015: 4847: 3732: 3556: 2987: 2792: 2672: 2589: 2585: 2473: 2311: 2191: 2127: 2059: 2055: 1917: 1913: 1792: 1574: 1490: 1438: 1373: 1257: 1082: 986: 950: 926: 792: 683: 645: 637: 584: 424: 411: 404: 284: 256: 13172: 10205:
Boucher, Nicholas; Shumailov, Ilia; Anderson, Ross; Papernot, Nicolas (2022).
10092: 7484: 5643:, which uses either one or two 16-bit units per code point, but cannot encode 5604:(UCS) encodings. An encoding maps (possibly a subset of) the range of Unicode 2719:, 20 emoji, 4,192 CJK ideographs, control characters for Egyptian hieroglyphs 763:
includes more information, covering in-depth topics such as bitwise encoding,
14521: 14244: 13664: 13551: 13546: 13541: 13536: 13531: 13526: 13403: 13398: 13388: 13373: 13368: 13363: 13358: 13353: 13348: 13343: 13338: 13333: 13328: 13323: 13318: 13313: 13308: 13303: 13298: 13293: 13288: 13283: 13278: 13273: 13268: 13263: 13258: 13253: 13248: 13243: 13238: 13233: 13228: 13223: 13218: 13213: 13208: 13127: 13102: 13067: 13026: 12772: 12336: 12321: 12201: 12140: 12025: 11967: 11962: 11927: 11902: 11852: 11737: 11727: 11577: 11522: 11365: 11163: 10959: 10850: 10787: 10286: 6917: 6796: 6781: 6626: 6426: 6051: 6027: 6019: 5910: 5807: 3128:, which are used as surrogate pairs to encode the 2 code points in the range 2942: 2816: 2637: 2469: 2409: 2262: 2202:, additional CJK Unified Ideographs, lowercase letters for Cherokee, 5 emoji 2063: 1901: 1840: 1768: 1764: 1678: 1662: 1658: 1638: 1634: 1630: 1510: 978: 898: 816: 248: 10733:
Unicode Demystified: A Practical Programmer's Guide to the Encoding Standard
7355: 6821:
combining acute above) should be rendered identically, both appearing as an
5893:(the fixed-length two-byte obsolete precursor to UTF-16) and later moved to 3040:. The codespace is a systematic, architecture-independent representation of 14264: 14259: 14254: 14249: 13984: 13724: 13719: 13714: 13709: 13704: 13699: 13694: 13689: 13684: 13679: 13162: 13157: 13137: 13021: 13013: 12646: 12331: 12065: 12060: 12015: 11917: 11832: 11822: 11732: 11707: 11647: 11562: 11542: 11537: 11517: 11396: 11343: 10777: 10493:"Proposal on Tibetan BrdaRten Characters Encoding for ISO/IEC 10646 in BMP" 7504: 6838: 6830: 6754: 6750: 6578: 6254:, seeing them as implementation choices. Any given character may have many 6245: 5902: 5382: 4684: 4635: 4606: 3852: 3060: 2841: 2796: 2716: 2573: 2377: 2258: 2242: 2107: 1925: 1848: 1836: 1732: 1430: 1369: 1237: 974: 962: 796: 156: 10821: 9392:"Usage Statistics and Market Share of US-ASCII for Websites, October 2021" 8970: 6613:
A security advisory was released in 2021 by two researchers, one from the
5583:
will display a box showing the hexadecimal scalar value of the character.
4932: 4921: 537:
was incorporated in California on 3 January 1991, and the first volume of
12579: 12562: 12387: 12226: 12211: 12125: 11995: 11972: 11937: 11807: 11787: 11762: 11582: 11045: 11035: 7381:
In 1978, the initial proposal for a set of "Universal Signs" was made by
6947: 6317: 6133: 6107: 6099: 5987: 5469: 4834: 4084: 3586: 3188:
Each code point is assigned a classification, listed as the code point's
3056: 2712: 2461: 2401: 2199: 2179: 2131: 2087: 1622: 1365: 982: 970: 732:
The Script Encoding Initiative, a project run by Deborah Anderson at the
479: 400: 11317: 10607: 10553:"Unicode Technical Note #27: Known Anomalies in Unicode Character Names" 7594: 7548: 6038:
is followed by the hexadecimal representation of the code point and the
4826:
ensures the practical interchangeability of these equivalent encodings.
4184:
66 (will not change unless the range of Unicode code points is expanded)
14429: 14337: 14190: 13868: 12963: 12933: 12928: 12923: 12918: 12883: 12767: 12762: 12752: 12747: 12545: 12535: 12266: 11717: 11607: 11243: 10951: 10880: 10705:, The Unicode Consortium, Addison-Wesley Professional, 27 August 2003. 9291: 7386: 7111: 6846: 6758: 6733: 6602: 6532: 6450: 6388: 5971:
character sets. UTF-8 is also the most common Unicode encoding used in
5946: 5926: 5922: 5906: 5898: 5689: 5681:
as a direct replacement for legacy encodings in general text handling.
5655: 5631: 4995: 4928: 4296:
Unicode Character Encoding Stability Policies: Property Value Stability
3624: 3018: 2959: 2928: 2323: 2091: 958: 918: 695: 572: 568: 467: 395: 126: 14370: 12480: 9230:
CWA 13873:2000 – Multilingual European Subsets in ISO/IEC 10646-1
8995: 8293: 6523:). However, the usual ASCII letters are used for the lowercase dotted 5839: 4844:
combinations of precomposed syllables made from the most common jamo.
830:
Unicode version history and notable changes to characters and scripts
359:
often totally unworkable when used together, with text encoded in one
273:
used in various ordinary, literary, academic, and technical contexts.
14317: 14295: 14200: 14013: 13042: 12973: 12953: 12948: 12873: 12868: 12130: 12045: 11977: 11712: 11486: 11406: 11401: 11302: 11285: 9876: 9822: 9768: 9714: 9660: 9606: 9481: 9427: 6662: 6657: 6555: 6506: 6418: 6395:) defined unification criteria, meaning rules for determining when a 6392: 6255: 5995: 5983: 5964: 5918: 5914: 5727: 5051: 5038:
Several subsets of Unicode are standardized: Microsoft Windows since
4680: 4541:
shape or representing a visible space. As of Unicode 16.0, there are
3590: 3422: 2800: 2514: 1856: 1570: 1498: 1273: 1161: 800: 744:
The Unicode Consortium together with the ISO have developed a shared
664: 632:
Many modern applications can render a substantial subset of the many
588: 523: 439: 393:
At the most abstract level, Unicode assigns a unique number called a
288: 10692:, The Unicode Consortium, Addison-Wesley Longman, Inc., April 2000. 10304: 10187: 9152: 7572: 4960: 3757:
Closing quotation mark. May behave like Ps or Pe depending on usage
3112:
character encoding, which can encode the 2 code points in the range
178: 14487: 14342: 14307: 14285: 14195: 14018: 12958: 12943: 12903: 12898: 12893: 12878: 12837: 12832: 12827: 12822: 12817: 12812: 12609: 12599: 12595: 12569: 12356: 12326: 12176: 12161: 12156: 11947: 11697: 11677: 11642: 11552: 11391: 11208: 10842: 10219: 9523: 8823:"The Unicode Standard, Version 13.0– Core Specification Appendix C" 6854: 6551: 6481: 6476: 6457: 6348:
character as a newline, regardless of the input's actual encoding.
6320:
that conforming applications should recognize as line terminators.
6274: 6270: 6137: 6095: 5799: 5769: 5011: 4631: 3848: 3582: 3059:
numbers. At least four hexadecimal digits are always written, with
2075: 2071: 1670: 1514: 1434: 1377: 954: 946: 910: 656: 379: 371: 360: 101: 28: 11797: 10735:, Richard Gillam, Addison-Wesley Professional; 1st edition, 2002. 10552: 9809: 6484:
characters shown with upright, oblique, and italic alternate forms
5929:), which uses UTF-16 as the sole internal character encoding. The 2319: 14357: 14063: 13944: 13518: 12888: 12863: 12853: 12591: 12080: 12075: 11952: 11887: 11817: 11557: 10826: 7123:
Rarely, the uppercase Icelandic eth may instead be written in an
6786: 6313: 5674: 4630:
are widely used in texts using Unicode. In a phenomenon known as
3964: 3956: 3684: 2887:
5.0 added Amendment 2 as well as four characters from Amendment 3
2804: 2457: 2413: 2327: 2135: 1666: 1277: 808: 804: 702: 660: 628: 592: 386:, or by some other means. In particularly complex cases, such as 10816: 10204: 9019: 6429:. These have their own drawbacks in general use, leading to the 5717:
The same character converted to UTF-8 becomes the byte sequence
4352:
may be used to identify a nameless code point. E.g. <control-
302:, each being code-for-code identical with one another. However, 14362: 14352: 14330: 14210: 14185: 14180: 13863: 13754: 13644: 13003: 12993: 12978: 12795: 12341: 12120: 11932: 11922: 11652: 11238: 11233: 11203: 10869: 10610:. The Unicode Consortium. 2021. 2.2 Relation to ISO 15924 Codes 8949:. South San Francisco, CA: The Unicode Consortium. 2024-09-10. 8920:. South San Francisco, CA: The Unicode Consortium. 2023-09-12. 7157:. South San Francisco, CA: The Unicode Consortium. 2024-09-10. 6971:
is spelled incorrectly. (Spelling errors are resolved by using
6666: 6651: 6590: 6442: 6422: 6077: 5894: 5853: 5827: 5663: 5649: 5640: 4944: 4830: 4676: 4638:
codepage, previously widely used in Western European contexts.
4120: 3658: 3163: 3109: 2812: 2593: 1760: 1265: 1078: 938: 812: 576: 343: 339: 111: 96: 9940: 5982:
Multilingual text-rendering engines which use Unicode include
4956: 2584:, 4,969 CJK ideographs, Arabic script additions used to write 14457: 14175: 14170: 14165: 13782: 13488: 12998: 12938: 12800: 12574: 12435: 12291: 12231: 11687: 11662: 11228: 11223: 11218: 9565: 6965:
PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
6762: 6414: 6401: 6282: 6152: 6103: 6085: 6073: 6007: 6003: 5999: 5991: 5956: 5938: 5857: 5845: 5823: 5793: 5789: 5773: 5755: 5635: 5627: 5613: 5043: 4742:
PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
4634:, the C1 code points are improperly decoded according to the 4573: 4561: 4538: 3840: 3836: 3832: 3816: 3167: 3045: 2832:
The total number of graphic and format characters, excluding
1253: 1249: 498: 455: 375: 347: 335: 277: 142: 137: 91: 10588:"Misspelling of BRACKET in character name is a known defect" 9580: 9368:"Usage Survey of Character Encodings broken down by Ranking" 7897:. Mountain View, CA: The Unicode Consortium. September 1999. 6593:
when a diacritic applies also depends on local conventions.
4422:
and the last two code points in each of the 17 planes (e.g.
3358:
containing an uppercase followed by a lowercase part (e.g.,
259:
that can be digitized. Version 16.0 of the standard defines
169: 13768: 12858: 10811: 8749:"Unicode Version 12.1 released in support of the Reiwa Era" 6471: 6430: 6266: 6229: 6213:(or the same numeric values expressed in hexadecimal, with 6174: 6068: 5972: 5872:, have required support for UTF-8 since the publication of 5865: 5781: 3091: 2477: 527: 10886: 7713:. Mountain View, CA: The Unicode Consortium. October 1991. 7095:"A Unicode Standard Annex (UAX) forms an integral part of 6979:
While Unicode defines the script designator (name) to be "
6547: 6543: 6539: 6513: 5616:). UTF-8 and UTF-16 are the most commonly used encodings. 4938:-ligature (द् + ध् + र् + य = द्ध्र्य) of JanaSanskritSans 4675:
characters. For example, a Latin small letter "i" with an
3828: 3824: 3555:
Numerals composed of letters or letterlike symbols (e.g.,
3371: 3367: 3363: 3359: 2982:
The Unicode Consortium normally releases a new version of
1791:, additional CJK Unified Ideographs, Jamo for Old Hangul, 388:
the treatment of orthographical variants in Han characters
255:
designed to support the use of text in all of the world's
14235: 10779:
Proceedings of Graphemics in the 21st Century, Brest 2018
10381:"An Unicode vendor-specific character table for japanese" 10057: 9250:"Multilingual European Character Set 2 (MES-2) Rationale" 8891:. Mountain View, CA: The Unicode Consortium. 2022-09-13. 8854:. Mountain View, CA: The Unicode Consortium. 2021-09-14. 8780:. Mountain View, CA: The Unicode Consortium. 2020-03-10. 8727:. Mountain View, CA: The Unicode Consortium. 2019-03-05. 8701:. Mountain View, CA: The Unicode Consortium. 2018-06-05. 8675:. Mountain View, CA: The Unicode Consortium. 2017-06-20. 8607:. Mountain View, CA: The Unicode Consortium. 2016-06-21. 8578:. Mountain View, CA: The Unicode Consortium. 2016-06-21. 8552:. Mountain View, CA: The Unicode Consortium. 2015-06-17. 8509:. Mountain View, CA: The Unicode Consortium. 2015-06-17. 8463:. Mountain View, CA: The Unicode Consortium. 2014-06-16. 8417:. Mountain View, CA: The Unicode Consortium. 2013-09-30. 8371:. Mountain View, CA: The Unicode Consortium. 2012-09-26. 8325:. Mountain View, CA: The Unicode Consortium. 2012-01-31. 8255:. Mountain View, CA: The Unicode Consortium. 2012-01-31. 8209:. Mountain View, CA: The Unicode Consortium. 2009-10-01. 8163:. Mountain View, CA: The Unicode Consortium. 2008-04-04. 8114:. Mountain View, CA: The Unicode Consortium. 2006-07-14. 8051:. Mountain View, CA: The Unicode Consortium. March 2004. 8005:. Mountain View, CA: The Unicode Consortium. April 2003. 6637:
Unicode was designed to provide code-point-by-code-point
6225: 6148: 6125: 5942: 5869: 4759: 4750:
PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRA
4745: 4403:
in UTF-16 in order to represent code points greater than
3173:
Within each plane, characters are allocated within named
2761: 2727: 2683: 2608: 2535: 2488: 2428: 2348: 2273: 2213: 2145: 2014: 1980: 1935: 1867: 1802: 1699: 1592: 1537: 1464: 1392: 1342: 1288: 1197: 1138: 1093: 1040: 996: 871: 671:(in the form of notes and rhythmic symbols), also occur. 596: 580: 9755: 7969:. Mountain View, CA: The Unicode Consortium. March 2002. 7933:. Mountain View, CA: The Unicode Consortium. March 2001. 6985:", in that script's character names, a hyphen is added: 5750:
compilers to generate software uses it as the standard "
5608:
to sequences of values in some fixed-size range, termed
655:
are included in the latest version of Unicode (covering
10775: 9941:"Extensible Markup Language (XML) 1.1 (Second Edition)" 7822:. Mountain View, CA: The Unicode Consortium. July 1996. 7786:. Mountain View, CA: The Unicode Consortium. July 1995. 7750:. Mountain View, CA: The Unicode Consortium. June 1992. 7040:(LMBCS), a parallel development with similar intentions 3820: 3812: 2592:, and other African languages, additions used to write 10087:. International Research Institute for Zen Buddhism / 10031:
The Unicode Standard Version 16.0 – Core Specification
9042:
The Unicode Standard Version 16.0 – Core Specification
7861:. Mountain View, CA: The Unicode Consortium. May 1998. 6761:
are each allocated only 128 code points, matching the
6445:
set, Unicode adopted the JIS-style unification model.
6387:
Existing character set standards such as the Japanese
4907:
than to the character sequence <U+0065, U+0301>.
4537:
to have particular semantics, either having a visible
10608:"Unicode Standard Annex #24: Unicode Script Property" 9813:
Internet Message Access Protocol Internationalization
6952:: This is not a Yi syllable, but a Yi iteration mark. 6822: 3051:
In this normative notation, the two-character prefix
9810:
C. Newman; A. Gulbrandsen; A. Melnikov (June 2008).
326:
Unicode text is processed and stored as binary data
10213:. San Francisco, CA, US: IEEE. pp. 1987–2004. 9863: 9107: 9105: 6632: 5684:The UCS-2 and UTF-16 encodings specify the Unicode 4829:An example of this arises with the Korean alphabet 604:
Ministry of Endowments and Religious Affairs (Oman)
11461: 10262:"Creative usernames and Spotify account hijacking" 9651:Overview and Framework for Internationalized Email 9597:Overview and Framework for Internationalized Email 9418:Internationalization of the File Transfer Protocol 9360: 7437:"History of Unicode Release and Publication Dates" 5491:A0–A1, AA–AC, B2, BA, BC, C4, CA–CB, CF, D8–D9, E6 682:website. For some scripts on the Roadmap, such as 223:This article contains uncommon Unicode characters. 10748:, Jukka K. Korpela, O'Reilly; 1st edition, 2006. 9960:Bigelow, Charles; Holmes, Kris (September 1993). 7211:"Unicode Standard Annex #45: U-source Ideographs" 6250:Unicode is not in principle concerned with fonts 6169:the permanently unassigned code points D800–DFFF, 5465:00, 02, 0C, 10, 14, 18, 1C, 24, 2C, 34, 3C, 50–6C 5263:02–03, 0A–0B, 1E–1F, 40–41, 56–57, 60–61, 6A–6B, 4805:), and equivalently as the precomposed character 3063:prepended as needed. For example, the code point 1529:First named character sequences were introduced. 14519: 14437:Unicode control, format and separator characters 10722:, Addison-Wesley Professional, 27 October 2006. 10716:The Unicode Standard, Version 5.0, Fifth Edition 10679:, James Felici, Adobe Press; 1st edition, 2002. 10211:2022 IEEE Symposium on Security and Privacy (SP) 9102: 7453: 7300:"The Unicode Standard: A Technical Introduction" 7033:List of XML and HTML character entity references 5754:" encoding. Some programming languages, such as 522:. In 1990, Michel Suignard and Asmus Freytag of 9647: 9593: 8870:"Announcing The Unicode Standard, Version 14.0" 8796:"Announcing The Unicode Standard, Version 13.0" 6531:, matching how they are handled in the earlier 6140:related issues; e.g. v6 and older of Microsoft 5963:) has become the main storage encoding on most 5802:is another encoding form for Unicode, from the 4169: 4035: 3929: 3782: 3597: 3503: 3432: 3276: 2993: 9756:A. Yang; S. Steele; N. Freed (February 2012). 7145: 6861:technologies for advanced rendering features. 4340:"Table 4-9: Construction of Code Point Labels" 636:, as demonstrated by this screenshot from the 378:thereof, that are instead best handled by the 12496: 10902: 10508: 9959: 9468: 9318:. The Unicode Consortium. 2013. p. 561. 9113:"Unicode Character Encoding Stability Policy" 7686:"Enumerated Versions of The Unicode Standard" 7630: 6269:based on Unicode are widely available, since 4319:. 2022-02-08. Unicode Technical Standard #18. 3218: 2962:characters, and 3 Zanabazar Square characters 1859:and emoji, additional CJK Unified Ideographs 1077:33 reclassified as control characters. 4,306 727: 201: 10782:. Brest: Fluxus Editions. pp. 167–183. 10021: 10019: 9701: 7346: 7344: 7342: 7340: 7338: 6561: 6323:In terms of the newline, Unicode introduced 6173:HTML characters manifest either directly as 5848:, has been the most common encoding for the 5822:of 2005 specified two parody UTF encodings, 5710:) does not equate to a legal character, and 5311:1E, 20–22, 26, 30, 32–33, 39–3A, 3C, 3E, 44, 4978: 4972: 4966: 4954: 4309:"Annex C: Compatibility Properties (§ word)" 4145:Character (but no interpretation specified) 3183: 3004: 2945:, nine CJK unified ideographs, and 41 emoji; 2927:7.0 added Amendments 1 and 2 as well as the 1292: 885: 10327: 10291:Initial Analysis of Underhanded Source Code 10259: 10207:"Bad Characters: Imperceptible NLP Attacks" 9472:IETF Policy on Character Sets and Languages 7185:"Unicode Technical Report #28: Unicode 3.2" 7099:, but is published as a separate document." 7016: 6128:recommendations have used Unicode as their 5652:, which uses one 32-bit unit per code point 4691:, is represented by the character sequence 2862:2.1 added two characters from Amendment 18. 1334:, 42,711 additional CJK Unified Ideographs 12503: 12489: 12404:Cultural, political, and religious symbols 10909: 10895: 10330:"Trojan Source: Invisible Vulnerabilities" 10305:"UTR #36: Unicode Security Considerations" 10188:"UTR #36: Unicode Security Considerations" 10126:"OpenType Feature: locl – Localized Forms" 10046: 10044: 9705:SMTP Extension for Internationalized Email 8943: 8914: 8885: 8848: 8774: 8721: 8695: 8669: 7479: 7477: 7151: 7049:Religious and political symbols in Unicode 6816:Unicode normalization § Normalization 4383:code points, and code points in the range 3627:characters such as "_", and other spacing 3225: 3211: 3120:except for the 2 code points in the range 2326:, 7,494 CJK Unified Ideographs, 56 emoji, 208: 194: 10602: 10600: 10218: 10076: 10016: 9875: 9821: 9767: 9713: 9659: 9605: 9480: 9426: 9414: 9332: 8601: 8572: 8546: 8503: 8457: 8411: 8365: 8319: 8249: 8203: 8157: 8108: 8045: 7999: 7963: 7927: 7891: 7855: 7816: 7780: 7744: 7707: 7335: 6922:: This is a small letter. The capital is 5810:of the People's Republic of China (PRC). 5630:, which uses one to four 8-bit units per 5594:Unicode defines two mapping methods: the 5249:92–C4, C7–C8, CB–CC, D0–EB, EE–F5, F8–F9 4837:subcomponents. However, it also provides 3150:The Unicode codespace is divided into 17 3139: 353: 10859:) is being considered for deletion. See 10547: 10545: 9012: 7848: 7846: 6568: 6475: 6472:Italic or cursive characters in Cyrillic 6356: 5784:, which is a system enabling the use of 5586: 4902:This process is different from a formal 2412:, 5 CJK Unified Ideographs, symbols for 713:, along with unofficial but widely used 627: 12510: 10937:ISO/IEC 10646 (Universal Character Set) 10284: 10170:Unicode Security Mechanisms for UTS #39 10082: 10050: 10041: 10009:, Steven J. Searle, originally written 9864:R. Gellens; C. Newman (February 2010). 8971:"Proposed New Characters: The Pipeline" 8937: 8905: 8879: 8839: 8765: 7719:"1.0.0/UnicodeData.txt (reconstructed)" 7474: 7208: 6803: 6772:Standardization Administration of China 6492: 6449:variant characters used throughout the 6417:encoding adopted by library systems in 6232:requests, non-ASCII characters must be 5804:Standardization Administration of China 5780:(DNS). The encoding is used as part of 5033: 3032:, notated according to the standard as 231:question marks, boxes, or other symbols 14520: 10597: 9563: 7807: 7735: 7698: 7617:"About The Script Encoding Initiative" 7350: 4760:Ready-made versus composite characters 4659: 4334: 4332: 4330: 4328: 4326: 4291: 4289: 4287: 4285: 4283: 4232: 4230: 3076:is padded with two leading zeros, but 2006:5 bidirectional formatting characters 12484: 11460: 10890: 10542: 10027:"Appendix E: Han Unification History" 9578: 9338: 9098:. Unicode Consortium. September 2024. 8644: 8638: 7990: 7954: 7918: 7882: 7843: 7249:. The Unicode Consortium. 2024-09-10. 7247:"Unicode 16.0 Versioned Charts Index" 7241:. The Unicode Consortium. 2024-09-10. 6871:Unicode alias names and abbreviations 6795: 5864:All internet protocols maintained by 5818:are Unicode compression schemes. The 5634:, and has maximal compatibility with 4346:. Unicode Consortium. September 2024. 4258:. The Unicode Consortium. 2024-04-30. 4244:. Unicode Consortium. September 2024. 4223:. Unicode Consortium. September 2024. 3162:) are accessed as surrogate pairs in 2977: 854: 845: 840: 837: 834: 748:following the initial publication of 606:is a full member with voting rights. 555: 417:characters nearly identical to others 410:The first 256 code points mirror the 11438:International Components for Unicode 11387:Common Locale Data Repository (CLDR) 10761:, Tony Graham, M&T books, 2000. 10085:"Chinese character codes: an update" 9522: 9244: 9092:"Appendix A: Notational Conventions" 8946:The Unicode Standard, Version 16.0.0 8917:The Unicode Standard, Version 15.1.0 8888:The Unicode Standard, Version 15.0.0 8851:The Unicode Standard, Version 14.0.0 8777:The Unicode Standard, Version 13.0.0 8724:The Unicode Standard, Version 12.0.0 8698:The Unicode Standard, Version 11.0.0 8672:The Unicode Standard, Version 10.0.0 7771: 7257:. The Unicode Consortium. 2024-09-10 7154:The Unicode Standard, Version 16.0.0 7013:International Components for Unicode 5883: 5669:UTF-8 uses one to four 8-bit units ( 2968:12.0 added 62 additional characters. 2924:6.3 added five additional characters 1851:symbols, transport and map symbols, 822:Thus far, the following versions of 644:Unicode currently covers most major 10509:Umamaheswaran, V. S. (2003-11-07). 10328:Boucher, Nicholas; Anderson, Ross. 10260:Engineering, Spotify (2013-06-18). 9648:J. Klensin; Y. Ko (February 2012). 9074:"Re: Origin of the U+nnnn notation" 8663: 8604:The Unicode Standard, Version 9.0.0 8575:The Unicode Standard, Version 9.0.0 8549:The Unicode Standard, Version 8.0.0 8506:The Unicode Standard, Version 8.0.0 8460:The Unicode Standard, Version 7.0.0 8414:The Unicode Standard, Version 6.3.0 8368:The Unicode Standard, Version 6.2.0 8322:The Unicode Standard, Version 6.1.0 8252:The Unicode Standard, Version 6.1.0 8206:The Unicode Standard, Version 5.2.0 8160:The Unicode Standard, Version 5.1.0 8111:The Unicode Standard, Version 5.0.0 8048:The Unicode Standard, Version 4.1.0 8002:The Unicode Standard, Version 4.0.0 7966:The Unicode Standard, Version 3.2.0 7930:The Unicode Standard, Version 3.1.0 7894:The Unicode Standard, Version 3.0.0 7858:The Unicode Standard, Version 2.1.2 7819:The Unicode Standard, Version 2.0.0 7783:The Unicode Standard, Version 1.1.5 7747:The Unicode Standard, Version 1.0.1 7710:The Unicode Standard, Version 1.0.0 4979: 4973: 4967: 4323: 4280: 4274:UTR #44: Unicode Character Database 4262: 4248: 4227: 4209: 4148:137,468 total (will never change) ( 4133: 4107: 4065: 4043: 4003: 3971: 3937: 3908: 3887: 3863: 3790: 3761: 3739: 3713: 3691: 3665: 3639: 3605: 3563: 3537: 3511: 3482: 3461: 3440: 3403: 3378: 3333: 3312: 3291: 3154:, numbered 0 to 16. Plane 0 is the 3098:There are a total of 2 + (2 − 2) = 13: 13847:Norwegian and Danish (alternative) 12419:Mathematical operators and symbols 10646: 10123: 10112:the OpenType 'locl' GSUB feature). 10007:A Brief History of Character Codes 9903:: CS1 maint: overridden setting ( 9849:: CS1 maint: overridden setting ( 9795:: CS1 maint: overridden setting ( 9741:: CS1 maint: overridden setting ( 9687:: CS1 maint: overridden setting ( 9633:: CS1 maint: overridden setting ( 9508:: CS1 maint: overridden setting ( 9454:: CS1 maint: overridden setting ( 9153:"Unicode Character Encoding Model" 9059:The Unicode Standard, Version 16.0 9039:"2.4 Code Points and Characters". 9032: 7459: 6384:since the genesis of the project. 6361: 4511:Supplementary Private Use Area-B: 4493:Supplementary Private Use Area-A: 3000:Universal Character Set characters 1330:, sets of symbols for Western and 734:University of California, Berkeley 617: 14: 14549: 10863:to help reach a consensus. › 10805: 10703:The Unicode Standard, Version 4.0 10690:The Unicode Standard, Version 3.0 10677:The Complete Manual of Typography 10655:The Unicode Standard, Version 6.0 10376:AFII contribution about WAVE DASH 10353:"Visual Studio Code October 2021" 10083:Wittern, Christian (1995-05-01). 9316:The Unicode Standard, Version 6.2 9292:"UTF-8, UTF-16, UTF-32 & BOM" 7638:"Unicode 6.1 Paperback Available" 6607:natural language processing (NLP) 6573:Localised forms of the letter í ( 4884:Ideographic Description Sequences 4777:can be represented in Unicode as 4276:. Unicode Consortium. 2024-08-27. 4238:"Table 2-3: Types of code points" 4095:), control characters to support 2941:Plus Amendment 1, as well as the 489:, Becker outlined a scheme using 361:interpreted as garbage characters 14504: 14503: 12460: 12459: 12449: 12448: 12431:Phonetic symbols (including IPA) 10621: 10580: 10563: 10538:from the original on 2024-01-01. 10520: 10511:"Resolutions of WG 2 meeting 44" 10502: 10485: 10462: 10439: 10416: 10398: 10369: 10345: 10321: 10297: 10278: 10253: 10198: 10180: 10158: 10136: 10117: 10099: 10000: 9989:. Unicode Consortium. 2017-06-28 9979: 9953: 9933: 9917: 9911: 9857: 9803: 9749: 9702:J. Yao; W. Mao (February 2012). 9695: 9641: 9587: 9572: 9557: 9534: 7605:from the original on 2023-03-25. 7583:from the original on 2023-12-08. 7485:"The Unicode Consortium Members" 7331:from the original on 2023-11-11. 7117: 6739: 6633:Mapping to legacy character sets 6621:, in which they assert that the 6312:Unicode partially addresses the 6013: 5692:detection). The BOM, encoded as 4943: 4920: 4533:characters are those defined by 3529:All these, and only these, have 3090: 3016:: a sequence of integers called 2859:2.0 added Amendments 5, 6, and 7 1596: 1468: 1142: 1044: 722:Medieval Unicode Font Initiative 462:(XCCS). In 1987, Xerox employee 334:itself defines three encodings: 27: 14291:Digital encoding of APL symbols 14226:Comparison of Unicode encodings 12744:Proposed but not approved 10470:"Alphabetic Presentation Forms" 10051:Topping, Suzanne (2013-06-25). 9759:Internationalized Email Headers 9594:J. Klensin; Y. Ko (July 2007). 9581:"Unicode Character Recognition" 9516: 9462: 9408: 9384: 9308: 9284: 9263: 9238: 9223: 9205: 9185: 9165: 9145: 9125: 9084: 9066: 9056:"3.4 Characters and Encoding". 9049: 8988: 8963: 8815: 8741: 8715: 8689: 8592: 8566: 8540: 8494: 8448: 8402: 8356: 8310: 8286: 8240: 8194: 8148: 8128: 8102: 8082: 8036: 7678: 7654: 7609: 7587: 7565: 7541: 7517: 7497: 7429: 7404: 7372:from the original on 2016-11-25 7239:"Unicode Character Count V16.0" 7209:Jenkins, John H. (2021-08-26). 7104: 7089: 7008:Comparison of Unicode encodings 5866:Internet Engineering Task Force 5177: 4862:block is assigned to the range 4816:LATIN SMALL LETTER E WITH ACUTE 4464:code points available for use. 4301: 4270:"5.7.1 General Category Values" 2950: 2935: 2910: 2898: 2876: 2867: 2851: 2826: 1521:disunified from Greek, ancient 674:The Unicode Roadmap Committee ( 10631:. The Unicode Consortium. 2023 10528:"Character Encoding Stability" 9962:"The design of a Unicode font" 9469:H. Alvestrand (January 1998). 7313: 7292: 7271: 7228: 7202: 7177: 7076: 7038:Lotus Multi-Byte Character Set 6601:Unicode has a large number of 6505:, Unicode includes a separate 6466:Ideographic Variation Database 5786:Internationalized Domain Names 5507:3A–3C, 40, 42, 60, 63, 65–66, 5162: 4187:No name, <noncharacter> 4087:, joining control characters ( 328:using one of several encodings 1: 11371:International Ideographs Core 11181:International Ideographs Core 11122:Alias names and abbreviations 10941: 10848: 10843:Alan Wood's Unicode Resources 10827:Unicode Character Code Charts 10447:"Arabic Presentation Forms-B" 10424:"Arabic Presentation Forms-A" 7391:Xerox Character Code Standard 7138: 7064:Universal Coded Character Set 7044:Open-source Unicode typefaces 6933:MATHEMATICAL SCRIPT CAPITAL P 6585:Whether the lowercase letter 6072:transmission of Unicode, the 6044:screen-selection entry method 5706:(the result of byte-swapping 5601:Universal Coded Character Set 5596:Unicode Transformation Format 5546:Alphabetic Presentation Forms 4217:"Table 4-4: General Category" 4165:No name, <private-use> 3166:and encoded in four bytes in 1270:Canadian Aboriginal syllabics 754:Universal Coded Character Set 421:Halfwidth and Fullwidth Forms 49:Universal Coded Character Set 11593:CJK Unified Ideographs (Han) 11443:People involved with Unicode 10832:Unicode Character Name Index 10383:. 2011-04-22. Archived from 10229:10.1109/SP46214.2022.9833641 10053:"The secret life of Unicode" 8645:Lobao, Martim (2016-06-07). 7642:announcements_at_unicode.org 7595:"script encoding initiative" 7054:Standards related to Unicode 6864: 6787:Thai Industrial Standard 620 6746:Tamil All Character Encoding 6736:, from long before Unicode. 6671:round-trip format conversion 6639:round-trip format conversion 5419:0F, 11–12, 15, 19–1A, 1E–1F, 4911: 4256:"DerivedGeneralCategory.txt" 3955:Includes the space, but not 2994:Architecture and terminology 2893:5.2 added Amendments 5 and 6 1689:LATIN CAPITAL LETTER SHARP S 1187:OBJECT REPLACEMENT CHARACTER 7: 14463:Character encodings in HTML 13797:National Replacement (NRCS) 13764:Japanese language in EBCDIC 10916: 10881:The World's Writing Systems 9542:"ISO/IEC JTC1/SC 18/WG 9 N" 9080:(Mailing list). 2005-11-08. 9020:"Glossary of Unicode Terms" 7001: 6596: 6462:Unicode variation sequences 6307: 5951:Microsoft Layer for Unicode 5833: 5658:, not specified as part of 5331:Superscripts and Subscripts 4955: 4933: 4766:combining diacritical marks 4581:65 code points, the ranges 4313:Unicode Regular Expressions 4129:No name, <surrogate> 3709:Closing bracket characters 2675:musical notation, 37 emoji 739: 651:As of 2024, a total of 168 485:In this document, entitled 251:standard maintained by the 16:Character encoding standard 10: 14554: 11433:Ideographic Research Group 11428:ConScript Unicode Registry 10285:Wheeler, David A. (2020). 10148:Unicode Character Database 10109:. Noto Fonts. 2023-02-18. 7129:African Reference Alphabet 7028:List of Unicode characters 6905:: Does not join graphemes. 6868: 6813: 6807: 6743: 6589:is expected to retain its 6527:and the uppercase dotless 6374:Ideographic Research Group 6365: 6243: 6117: 6061: 6017: 5959:(originally developed for 5837: 5014:(by Adobe and Microsoft), 4729:YI SYLLABLE ITERATION MARK 4663: 4395:code points) are known as 4201:No name, <reserved> 3719:Punctuation, initial quote 3143: 2997: 2753:Additional CJK ideographs 1661:, sets of symbols for the 728:Script Encoding Initiative 711:ConScript Unicode Registry 621: 587:(previously as Facebook), 559: 449: 14501: 14450: 14405: 14273: 14234: 14152: 13887: 13777: 13753: 13635: 13517: 13201: 13040: 13012: 12846: 12788: 12645: 12518: 12444: 12396: 12375: 11986: 11510: 11470: 11456: 11415: 11379: 11326: 11313:Regional indicator symbol 11256: 11189: 11146: 11139: 11069: 11022:Combining grapheme joiner 11007: 11000: 10950: 10924: 10875:Unicode BMP Fallback Font 10144:"Case Folding Properties" 9564:Hedley, Jonathan (2009). 9252:. University of Cambridge 9173:"Unicode Named Sequences" 9078:Unicode Mail List Archive 7082:Sometimes abbreviated as 6902:COMBINING GRAPHEME JOINER 6790: 6397:variant Chinese character 6351: 5741:In UTF-32 and UCS-4, one 5700:ZERO WIDTH NO-BREAK SPACE 5598:(UTF) encodings, and the 5460: 5354: 5294: 5274:Latin Extended Additional 5169: 5116: 5083: 4205: 4174: 4126:2,048 (will never change) 4061:No name, <control> 3593:digits, vigesimal digits 3198: 3193:of any given code point. 3184:General Category property 3005:Codespace and code points 2693: 2545: 2438: 2283: 2155: 1882: 1877: 1402: 1295: 1103: 1047: 897:Initial scripts covered: 862: 859: 851: 189: 163: 152: 84: 76: 61: 41: 26: 14493:Variable-length encoding 14274:Miscellaneous code pages 13032:Extended Unix Code / EUC 12723:-15 (New Western Europe) 12519:Early telecommunications 12466:Category: Unicode blocks 11271:Compatibility characters 10861:templates for discussion 10788:10.36824/2018-graf-hara1 10166:"confusablesSummary.txt" 9579:Milde, Benjamin (2011). 9235:Workshop Agreement 13873 8294:"Unicode 6.0 Emoji List" 7619:. The Unicode Consortium 7281:. The Unicode Consortium 7069: 6562:Diacritics on lowercase 6439:Research Libraries Group 6239: 6057: 5844:Unicode, in the form of 5230:84–8A, 8C, 8E–A1, A3–CE, 5218:Spacing Modifier Letters 4990:Many scripts, including 3745:Punctuation, final quote 3156:Basic Multilingual Plane 3087:EGYPTIAN HIEROGLYPH O004 1673:, additions to Burmese, 991: 866: 752:: Unicode and the ISO's 516:Research Libraries Group 14420:C0 and C1 control codes 11191:Comparison of encodings 11117:Halfwidth and fullwidth 10972:Universal Character Set 10661:, Mountain View, 2011, 10413:, Tomohiro Kubota, 2001 9415:B. Curtin (July 1999). 9344:"Moving to Unicode 5.1" 8090:"Named Sequences-4.1.0" 7321:"Unicode Bulldog Award" 7217:. §2.2 The Source Field 6619:University of Edinburgh 6617:and the other from the 6615:University of Cambridge 5990:for Microsoft Windows, 5937:bytecode environments, 5623:UTF encodings include: 5454:Miscellaneous Technical 4886:block covers the range 4860:CJK Radicals Supplement 4748:) has the formal alias 4687:, which is required in 4603:C0 and C1 control codes 4601:, corresponding to the 3467:Mark, spacing combining 3022:in the range from 0 to 2756: 2722: 2678: 2603: 2530: 2483: 2423: 2406:Old Sogdian and Sogdian 2343: 2268: 460:Character Code Standard 309:character normalization 12668:-3 (Maltese/Esperanto) 12619:World System Teletext 12116:Inscriptional Parthian 11803:Nyiakeng Puachue Hmong 11465:and symbols in Unicode 11082:CJK Unified Ideographs 10817:Unicode Technical Site 9867:POP3 Support for UTF-8 9193:"Unicode Name Aliases" 8996:"Unicode Version 16.0" 6648:precomposed characters 6582: 6485: 6130:document character set 6113: 6076:character set and the 5901:(and its descendants, 5437:Mathematical Operators 4909: 4803:COMBINING ACUTE ACCENT 4656:reserved code points. 4058:65 (will never change) 3611:Punctuation, connector 3140:Code planes and blocks 2873:3.2 added Amendment 1. 2834:private-use characters 2466:Nyiakeng Puachue Hmong 2208: 2140: 2009: 1975: 1930: 1862: 1797: 1753:Inscriptional Parthian 1694: 1587: 1532: 1459: 1387: 1337: 1283: 1192: 1133: 1088: 1035: 1031:CJK Unified Ideographs 641: 518:, and Glenn Wright of 512: 503: 354:Origin and development 14442:Whitespace characters 14119:Ventura International 12252:Old Persian cuneiform 12111:Inscriptional Pahlavi 12006:Ancient North Arabian 12001:Anatolian hieroglyphs 11291:Precomposed character 11127:Whitespace characters 11056:Zero-width non-joiner 10409:, Section 4.4.3.5 of 10357:code.visualstudio.com 9987:"Fonts and keyboards" 9969:Electronic Publishing 7828:"Unicode Data-2.0.14" 7573:"Roadmaps to Unicode" 7279:"Emoji Counts, v16.0" 6744:Further information: 6572: 6479: 6357:Character unification 6277:support Unicode (and 5840:UTF-8 § Adoption 5806:. It is the official 5587:Mapping and encodings 5581:Unicode fallback font 5516:Miscellaneous Symbols 5478:80, 84, 88, 8C, 90–93 4900: 4820:canonical equivalence 4727:has the formal alias 4664:Further information: 4563:ZERO WIDTH NON-JOINER 3517:Number, decimal digit 3286:(Lu, Ll, and Lt only) 3239:Category Major, minor 2890:5.1 added Amendment 4 2884:4.1 added Amendment 1 2846:surrogate code points 2525:SQUARE ERA NAME REIWA 2184:Anatolian hieroglyphs 1749:Inscriptional Pahlavi 1675:Scribal abbreviations 1298:ISO/IEC 10646-2:2001 1208:ISO/IEC 10646-1:2000 631: 507: 495: 382:, through the use of 313:character composition 298:is synchronized with 13837:Norwegian and Danish 12071:Egyptian hieroglyphs 11276:Duplicate characters 11092:Duplicate characters 10822:The Unicode Standard 10629:"Scripts-15.1.0.txt" 10516:. Resolution M44.20. 10411:Introduction to I18n 9348:Official Google Blog 9096:The Unicode Standard 8828:. Unicode Consortium 8623:"Unicode Data 9.0.0" 8525:"Unicode Data 8.0.0" 8479:"Unicode Data 7.0.0" 8433:"Unicode Data 6.3.0" 8387:"Unicode Data 6.2.0" 8341:"Unicode Data 6.1.0" 8271:"Unicode Data 6.0.0" 8225:"Unicode Data 5.2.0" 8179:"Unicode Data 5.1.0" 8136:"Unicode Data 5.0.0" 8067:"Unicode Data-4.1.0" 8021:"Unicode Data-4.0.0" 7975:"Unicode Data-3.2.0" 7939:"Unicode Data-3.1.0" 7903:"Unicode Data-3.0.0" 7867:"Unicode Data-2.1.2" 7756:"Unicode Data 1.0.1" 7599:Berkeley Linguistics 7549:"Roadmap to the BMP" 7097:The Unicode Standard 7023:List of binary codes 6876:The Unicode Standard 6804:Combining characters 6644:combining diacritics 6493:Localised case pairs 6382:The Unicode Standard 6304:for most typefaces. 6279:Web Open Font Format 6260:The Unicode Standard 5820:April Fools' Day RFC 5723:The Unicode Standard 5660:The Unicode Standard 5645:surrogate characters 5034:Standardized subsets 5006:), which became the 5004:The Unicode Standard 4896:The Unicode Standard 4824:The Unicode Standard 4788:LATIN SMALL LETTER E 4708:The Unicode Standard 4660:Abstract characters 4548:graphic characters. 4535:The Unicode Standard 4472:The Unicode Standard 4371:points in the range 4344:The Unicode Standard 4242:The Unicode Standard 4221:The Unicode Standard 4009:Separator, paragraph 3809:Mathematical symbols 3160:supplementary planes 3042:The Unicode Standard 3010:The Unicode Standard 2984:The Unicode Standard 1910:Meroitic hieroglyphs 1741:Egyptian hieroglyphs 1737:Gardiner's sign list 1296:ISO/IEC 10646-1:2000 824:The Unicode Standard 781:The Unicode Standard 773:The Unicode Standard 761:The Unicode Standard 750:The Unicode Standard 709:) are listed in the 547:Egyptian hieroglyphs 539:The Unicode Standard 332:The Unicode Standard 304:The Unicode Standard 296:character repertoire 244:The Unicode Standard 14397:Unified Hangul Code 14069:PostScript Standard 13792:Multinational (MCS) 12663:-2 (Central Europe) 12658:-1 (Western Europe) 12512:Character encodings 12136:Khitan small script 11573:Canadian Aboriginal 11308:Variation sequences 11266:Combining character 11176:Variation sequences 11087:Combining character 10266:Spotify Engineering 10089:Hanazono University 10013:, last updated 2004 7792:"Unicode Data 1995" 7525:"Supported Scripts" 7462:"Unicode Revisited" 7460:Searle, Stephen J. 7412:"Summary Narrative" 7255:"Supported Scripts" 6973:Unicode alias names 6837:, as needed in the 6810:Combining character 6669:and Unicode led to 6341:PARAGRAPH SEPARATOR 6302:diminishing returns 5679:Linux distributions 5317:General Punctuation 5069: 4672:abstract characters 4643:assigned characters 4175:Other, not assigned 4029:PARAGRAPH SEPARATOR 2958:Plus 56 emoji, 285 2578:Khitan small script 2546:ISO/IEC 10646:2020 2394:Indic Siyaq Numbers 2284:ISO/IEC 10646:2017 2204:skin tone modifiers 2156:ISO/IEC 10646:2014 1878:ISO/IEC 10646:2012 1813:ISO/IEC 10646:2010 1403:ISO/IEC 10646:2003 1029:The initial 20,902 831: 688:Khitan large script 315:and decomposition, 23: 14538:Digital typography 14533:Character encoding 14478:Hardware code page 14238:typesetting system 14074:PostScript Latin 1 13730:Cyrillic + Finnish 13637:Windows code pages 13519:IBM AIX code pages 12847:National standards 12778:Ukrainian Cyrillic 12376:Notational scripts 12327:Tagalog (Baybayin) 12036:Caucasian Albanian 11359:numeric references 11334:Domain names (IDN) 11154:Bidirectional text 11031:Right-to-left mark 11027:Left-to-right mark 10982:Character property 10932:Unicode Consortium 10720:Unicode Consortium 10659:Unicode Consortium 10174:Unicode Consortium 10152:Unicode Consortium 10035:Unicode Consortium 9213:"JanaSanskritSans" 7553:Unicode Consortium 7363:Unicode Consortium 7215:Unicode Consortium 7189:Unicode Consortium 7015:(ICU), now as ICU- 6996:PHAGS-PA LETTER KA 6776:ISO/IEC JTC 1/SC 2 6583: 6507:dotless lowercase 6486: 6378:Chinese characters 6289:Thousands of fonts 6042:. There is also a 6036:beginning sequence 5778:Domain Name System 5368:Letterlike Symbols 5176:Latin Extended-B ( 5110:Latin-1 Supplement 5058: 4597:, are reserved as 4478:Private Use Area: 4317:Unicode Consortium 4139:Other, private use 4119:Not (only used in 4097:bidirectional text 3767:Punctuation, other 3697:Punctuation, close 3633:regular expression 3245:Character assigned 3203:Character Property 2978:Projected versions 2838:control characters 2048:Caucasian Albanian 1853:alchemical symbols 829: 769:bidirectional text 717:code assignments. 680:Unicode Consortium 642: 634:scripts in Unicode 562:Unicode Consortium 556:Unicode Consortium 535:Unicode Consortium 253:Unicode Consortium 35:Unicode Consortium 21: 14515: 14514: 14468:Charset detection 14407:Control character 14089:Sharp calculators 13960:Casio calculators 13888:Platform specific 13740:Cyrillic + German 13735:Cyrillic + French 13153:Maltese/Esperanto 12789:Bibliographic use 12673:-4 (North Europe) 12605:T.51/ISO/IEC 6937 12563:Baudot and Murray 12478: 12477: 12474: 12473: 12455:Category: Unicode 11492:Punctuation marks 11474:inherited scripts 11380:Related standards 11354:entity references 11252: 11251: 11135: 11134: 11051:Zero-width joiner 10797:978-2-9570549-1-6 10759:Unicode: A Primer 10746:Unicode Explained 10406:ISO 646-* Problem 10287:"Countermeasures" 10238:978-1-66541-316-9 9325:978-1-936213-08-5 8956:978-1-936213-34-4 8927:978-1-936213-33-7 8898:978-1-936213-32-0 8861:978-1-936213-29-0 8787:978-1-936213-26-9 8734:978-1-936213-22-1 8708:978-1-936213-19-1 8682:978-1-936213-16-0 8614:978-1-936213-13-9 8585:978-1-936213-13-9 8559:978-1-936213-10-8 8516:978-1-936213-10-8 8470:978-1-936213-09-2 8424:978-1-936213-08-5 8378:978-1-936213-07-8 8332:978-1-936213-02-3 8262:978-1-936213-02-3 8216:978-1-936213-00-9 7352:Becker, Joseph D. 7164:978-1-936213-34-4 7019:a part of Unicode 6770:in 2003 when the 6703:(other vendors). 6688:Microsoft Windows 6540:Icelandic eth (ð) 6538:By contrast, the 6514:dotted uppercase 6345:Cocoa text system 6298:font substitution 6224:, for example as 6142:Internet Explorer 6064:Unicode and email 5975:documents on the 5884:Operating systems 5577:SIL International 5565: 5564: 5020:SIL International 4575:ZERO WIDTH JOINER 4365: 4364: 3671:Punctuation, open 3657:Includes several 3645:Punctuation, dash 3623:Includes spacing 3425:or a letter in a 3339:Letter, titlecase 3318:Letter, lowercase 3297:Letter, uppercase 3199:General Category 3095:) is not padded. 2920:Turkish lira sign 2906:Indian rupee sign 2822: 2821: 2768:978-1-936213-34-4 2734:978-1-936213-33-7 2690:978-1-936213-32-0 2615:978-1-936213-29-0 2542:978-1-936213-26-9 2495:978-1-936213-25-2 2435:978-1-936213-22-1 2384:capital letters, 2382:Georgian Mtavruli 2355:978-1-936213-19-1 2280:978-1-936213-16-0 2220:978-1-936213-13-9 2152:978-1-936213-10-8 2100:Old North Arabian 2021:978-1-936213-09-2 1987:978-1-936213-08-5 1971:TURKISH LIRA SIGN 1942:978-1-936213-07-8 1874:978-1-936213-02-3 1809:978-1-936213-01-6 1773:Old South Arabian 1706:978-1-936213-00-9 1427:Cypriot syllabary 1128:Private Use Areas 786:COVID-19 pandemic 715:Private Use Areas 436:Roozbeh Pournader 227:rendering support 218: 217: 179:Technical website 14545: 14507: 14506: 13999:DG International 13874:Special Graphics 13675:Extended Latin-8 13073:Central European 13063:Barents Cyrillic 12768:Barents Cyrillic 12738:-12 (Devanagari) 12734:Abandoned parts 12505: 12498: 12491: 12482: 12481: 12463: 12462: 12452: 12451: 12414:Control Pictures 12367:Zanabazar Square 12106:Imperial Aramaic 11989:historic scripts 11458: 11457: 11318:Emoji skin color 11144: 11143: 11061:Zero-width space 11005: 11004: 10992:Private Use Area 10977:Character charts 10911: 10904: 10897: 10888: 10887: 10801: 10653:Julie D. Allen. 10640: 10639: 10637: 10636: 10625: 10619: 10618: 10616: 10615: 10604: 10595: 10594: 10592: 10584: 10578: 10577: 10575: 10567: 10561: 10560: 10549: 10540: 10539: 10524: 10518: 10517: 10515: 10506: 10500: 10499: 10497: 10489: 10483: 10482: 10480: 10479: 10474: 10466: 10460: 10459: 10457: 10456: 10451: 10443: 10437: 10436: 10434: 10433: 10428: 10420: 10414: 10402: 10396: 10395: 10393: 10392: 10373: 10367: 10366: 10364: 10363: 10349: 10343: 10342: 10340: 10339: 10334: 10325: 10319: 10318: 10316: 10315: 10301: 10295: 10294: 10282: 10276: 10275: 10273: 10272: 10257: 10251: 10250: 10222: 10202: 10196: 10195: 10184: 10178: 10177: 10162: 10156: 10155: 10140: 10134: 10133: 10121: 10115: 10114: 10107:"Noto CJK fonts" 10103: 10097: 10096: 10091:. Archived from 10080: 10074: 10073: 10071: 10070: 10061:. Archived from 10048: 10039: 10038: 10023: 10014: 10004: 9998: 9997: 9995: 9994: 9983: 9977: 9976: 9966: 9957: 9951: 9950: 9948: 9947: 9937: 9931: 9930: 9928: 9927: 9915: 9909: 9908: 9902: 9894: 9892: 9891: 9879: 9877:10.17487/RFC5721 9861: 9855: 9854: 9848: 9840: 9838: 9837: 9825: 9823:10.17487/RFC5255 9807: 9801: 9800: 9794: 9786: 9784: 9783: 9771: 9769:10.17487/RFC6532 9753: 9747: 9746: 9740: 9732: 9730: 9729: 9717: 9715:10.17487/RFC6531 9699: 9693: 9692: 9686: 9678: 9676: 9675: 9663: 9661:10.17487/RFC6530 9645: 9639: 9638: 9632: 9624: 9622: 9621: 9609: 9607:10.17487/RFC4952 9591: 9585: 9584: 9576: 9570: 9569: 9566:"Unicode Lookup" 9561: 9555: 9554: 9552: 9551: 9546: 9538: 9532: 9531: 9520: 9514: 9513: 9507: 9499: 9497: 9496: 9484: 9482:10.17487/RFC2277 9466: 9460: 9459: 9453: 9445: 9443: 9442: 9430: 9428:10.17487/RFC2640 9412: 9406: 9405: 9403: 9402: 9388: 9382: 9381: 9379: 9378: 9364: 9358: 9357: 9355: 9354: 9336: 9330: 9329: 9312: 9306: 9305: 9303: 9302: 9288: 9282: 9281: 9279: 9278: 9267: 9261: 9260: 9258: 9257: 9242: 9236: 9227: 9221: 9220: 9215:. Archived from 9209: 9203: 9202: 9200: 9199: 9189: 9183: 9182: 9180: 9179: 9169: 9163: 9162: 9160: 9159: 9149: 9143: 9142: 9140: 9139: 9129: 9123: 9122: 9120: 9119: 9109: 9100: 9099: 9088: 9082: 9081: 9070: 9064: 9063: 9053: 9047: 9046: 9036: 9030: 9029: 9027: 9026: 9016: 9010: 9009: 9007: 9006: 8992: 8986: 8985: 8983: 8982: 8967: 8961: 8960: 8941: 8935: 8931: 8909: 8903: 8902: 8883: 8877: 8873: 8865: 8843: 8837: 8836: 8834: 8833: 8827: 8819: 8813: 8809: 8807: 8806: 8800:The Unicode Blog 8791: 8769: 8763: 8762: 8760: 8759: 8753:The Unicode Blog 8745: 8739: 8738: 8719: 8713: 8712: 8693: 8687: 8686: 8667: 8661: 8660: 8658: 8657: 8642: 8636: 8632: 8630: 8629: 8618: 8596: 8590: 8589: 8570: 8564: 8563: 8544: 8538: 8534: 8532: 8531: 8520: 8498: 8492: 8488: 8486: 8485: 8474: 8452: 8446: 8442: 8440: 8439: 8428: 8406: 8400: 8396: 8394: 8393: 8382: 8360: 8354: 8350: 8348: 8347: 8336: 8314: 8308: 8307: 8305: 8304: 8290: 8284: 8280: 8278: 8277: 8266: 8244: 8238: 8234: 8232: 8231: 8220: 8198: 8192: 8188: 8186: 8185: 8174: 8152: 8146: 8145: 8143: 8142: 8132: 8126: 8125: 8106: 8100: 8099: 8097: 8096: 8086: 8080: 8076: 8074: 8073: 8062: 8040: 8034: 8030: 8028: 8027: 8016: 7994: 7988: 7984: 7982: 7981: 7970: 7958: 7952: 7948: 7946: 7945: 7934: 7922: 7916: 7912: 7910: 7909: 7898: 7886: 7880: 7876: 7874: 7873: 7862: 7850: 7841: 7837: 7835: 7834: 7823: 7811: 7805: 7801: 7799: 7798: 7787: 7775: 7769: 7765: 7763: 7762: 7751: 7739: 7733: 7729: 7727: 7726: 7714: 7702: 7696: 7695: 7693: 7692: 7682: 7676: 7675: 7673: 7672: 7662:"Unicode 16.0.0" 7658: 7652: 7651: 7649: 7648: 7634: 7628: 7627: 7625: 7624: 7613: 7607: 7606: 7591: 7585: 7584: 7569: 7563: 7562: 7560: 7559: 7545: 7539: 7538: 7536: 7535: 7521: 7515: 7514: 7512: 7511: 7501: 7495: 7494: 7492: 7491: 7481: 7472: 7471: 7469: 7468: 7457: 7451: 7450: 7448: 7447: 7433: 7427: 7426: 7424: 7423: 7408: 7402: 7401: 7378: 7377: 7371: 7360: 7348: 7333: 7332: 7317: 7311: 7310: 7308: 7307: 7296: 7290: 7289: 7287: 7286: 7275: 7269: 7265: 7263: 7262: 7250: 7242: 7232: 7226: 7225: 7223: 7222: 7206: 7200: 7199: 7197: 7196: 7181: 7169: 7168: 7149: 7132: 7121: 7115: 7108: 7102: 7093: 7087: 7080: 7018: 6997: 6994: 6991: 6989: 6984: 6966: 6963: 6960: 6958: 6950: 6946: 6943: 6941: 6934: 6931: 6928: 6926: 6920: 6919:SCRIPT CAPITAL P 6916: 6913: 6911: 6903: 6899: 6896: 6894: 6886: 6882: 6799: 6797:[sadɛːŋ] 6794: 6731: 6728: 6725: 6723: 6719:(backslash) and 6718: 6715: 6712: 6710: 6702: 6699: 6696: 6694: 6685: 6682: 6679: 6677: 6588: 6576: 6565: 6530: 6526: 6522: 6517: 6510: 6499:Turkish alphabet 6342: 6339: 6337: 6332: 6329: 6327: 6265:Free and retail 6220:When specifying 6216: 6212: 6208: 6204: 6200: 6196: 6192: 6188: 6184: 6180: 6164:C0 control codes 6132:since HTML 4.0. 6120:Unicode and HTML 6082:Quoted-printable 5766:coded software. 5720: 5713: 5709: 5705: 5701: 5698: 5696: 5677:and most recent 5573:Last Resort font 5570: 5530:Private Use Area 5496:Geometric Shapes 5348:Currency Symbols 5179: 5164: 5159:Latin Extended-B 5141:Latin Extended-A 5070: 5068: 5057: 5008:proof of concept 4982: 4981: 4977:‎=‎ 4976: 4975: 4971:‎+‎ 4970: 4969: 4964: 4947: 4936: 4924: 4893: 4889: 4881: 4877: 4874:are assigned to 4869: 4865: 4843: 4842: 4817: 4814: 4811: 4809: 4804: 4801: 4800: 4796: 4794: 4789: 4786: 4783: 4781: 4776: 4755: 4743: 4740: 4737: 4735: 4730: 4726: 4723: 4720: 4718: 4702: 4698: 4694: 4655: 4654: 4629: 4626: 4622: 4619: 4615: 4612: 4596: 4592: 4588: 4584: 4576: 4572: 4570: 4564: 4560: 4558: 4547: 4546: 4525: 4524: 4518: 4514: 4507: 4506: 4500: 4496: 4489: 4485: 4481: 4463: 4462: 4459: 4453: 4445: 4441: 4437: 4433: 4429: 4425: 4421: 4417: 4406: 4394: 4390: 4386: 4378: 4374: 4370: 4357: 4350:Code Point Label 4347: 4336: 4321: 4320: 4305: 4299: 4293: 4278: 4277: 4266: 4260: 4259: 4252: 4246: 4245: 4234: 4225: 4224: 4213: 4172: 4161: 4154: 4136: 4113:Other, surrogate 4110: 4068: 4046: 4038: 4030: 4027: 4025: 4006: 3998: 3995: 3993: 3974: 3943:Separator, space 3940: 3932: 3911: 3893:Symbol, modifier 3890: 3882:Currency symbols 3869:Symbol, currency 3866: 3793: 3785: 3764: 3742: 3716: 3694: 3668: 3642: 3608: 3600: 3583:vulgar fractions 3566: 3540: 3514: 3506: 3485: 3464: 3446:Mark, nonspacing 3443: 3435: 3427:unicase alphabet 3406: 3384:Letter, modifier 3381: 3336: 3315: 3294: 3287: 3283: 3279: 3254: 3253: 3227: 3220: 3213: 3206: 3196: 3195: 3190:General Category 3135: 3131: 3127: 3123: 3119: 3115: 3107: 3106: 3103: 3094: 3088: 3085: 3082: 3080: 3075: 3072: 3069: 3067: 3054: 3039: 3035: 3031: 3030: 3027: 2971: 2954: 2948: 2939: 2933: 2914: 2908: 2902: 2896: 2880: 2874: 2871: 2865: 2855: 2849: 2830: 2789: 2788: 2782: 2781: 2763: 2750: 2749: 2743: 2742: 2729: 2709: 2708: 2702: 2701: 2685: 2669: 2666: 2663: 2661: 2634: 2633: 2627: 2626: 2610: 2566: 2565: 2559: 2558: 2537: 2526: 2523: 2520: 2518: 2511: 2510: 2504: 2503: 2490: 2454: 2453: 2447: 2446: 2430: 2374: 2373: 2367: 2366: 2350: 2339: 2336: 2333: 2331: 2308:Zanabazar Square 2304: 2303: 2297: 2296: 2275: 2239: 2238: 2232: 2231: 2215: 2176: 2175: 2169: 2168: 2147: 2040: 2039: 2033: 2032: 2016: 2003: 2002: 1996: 1995: 1982: 1972: 1969: 1966: 1964: 1958: 1957: 1951: 1950: 1937: 1906:Meroitic cursive 1898: 1897: 1891: 1890: 1869: 1833: 1832: 1826: 1825: 1804: 1745:Imperial Aramaic 1725: 1724: 1718: 1717: 1701: 1690: 1687: 1684: 1682: 1615: 1614: 1608: 1607: 1594: 1563: 1562: 1556: 1555: 1539: 1487: 1486: 1480: 1479: 1466: 1455:Hexagram symbols 1423: 1422: 1416: 1415: 1394: 1362: 1361: 1355: 1354: 1344: 1316: 1315: 1309: 1308: 1290: 1226: 1225: 1219: 1218: 1199: 1188: 1185: 1182: 1180: 1173: 1170: 1167: 1165: 1158: 1157: 1151: 1150: 1140: 1122: 1121: 1119: 1112: 1111: 1095: 1074: 1073: 1071: 1064: 1063: 1042: 1026: 1025: 1023: 1016: 1015: 998: 927:Greek and Coptic 894: 873: 832: 828: 720:There is also a 624:Script (Unicode) 520:Sun Microsystems 432:Tatsuo Kobayashi 365:Latin characters 265: 264: 210: 203: 196: 182: 181: 173: 172: 170:Official website 85:Encoding formats 80:Unicode Standard 31: 24: 20: 14553: 14552: 14548: 14547: 14546: 14544: 14543: 14542: 14518: 14517: 14516: 14511: 14497: 14473:Han unification 14446: 14401: 14269: 14230: 14148: 13970:Compucolor 8001 13883: 13879:Technical (TCS) 13802:French Canadian 13773: 13749: 13745:Polytonic Greek 13631: 13513: 13197: 13183:Turkic Cyrillic 13098:Font X (Kermit) 13093:Farsi (Persian) 13045: 13036: 13008: 12842: 12784: 12654:Approved parts 12641: 12514: 12509: 12479: 12470: 12440: 12424:List by subject 12397:Symbols, emojis 12392: 12371: 12287:Psalter Pahlavi 11988: 11982: 11843:Pracalit (Newa) 11658:Hanifi Rohingya 11506: 11482:Combining marks 11473: 11466: 11452: 11448:Han unification 11411: 11375: 11322: 11258: 11248: 11185: 11131: 11065: 11009:Special purpose 10996: 10946: 10920: 10915: 10864: 10808: 10798: 10772: 10649: 10647:Further reading 10644: 10643: 10634: 10632: 10627: 10626: 10622: 10613: 10611: 10606: 10605: 10598: 10590: 10586: 10585: 10581: 10573: 10569: 10568: 10564: 10551: 10550: 10543: 10526: 10525: 10521: 10513: 10507: 10503: 10495: 10491: 10490: 10486: 10477: 10475: 10472: 10468: 10467: 10463: 10454: 10452: 10449: 10445: 10444: 10440: 10431: 10429: 10426: 10422: 10421: 10417: 10403: 10399: 10390: 10388: 10379: 10374: 10370: 10361: 10359: 10351: 10350: 10346: 10337: 10335: 10332: 10326: 10322: 10313: 10311: 10303: 10302: 10298: 10283: 10279: 10270: 10268: 10258: 10254: 10239: 10203: 10199: 10186: 10185: 10181: 10164: 10163: 10159: 10142: 10141: 10137: 10122: 10118: 10105: 10104: 10100: 10081: 10077: 10068: 10066: 10049: 10042: 10025: 10024: 10017: 10005: 10001: 9992: 9990: 9985: 9984: 9980: 9964: 9958: 9954: 9945: 9943: 9939: 9938: 9934: 9925: 9923: 9916: 9912: 9896: 9895: 9889: 9887: 9862: 9858: 9842: 9841: 9835: 9833: 9808: 9804: 9788: 9787: 9781: 9779: 9754: 9750: 9734: 9733: 9727: 9725: 9700: 9696: 9680: 9679: 9673: 9671: 9646: 9642: 9626: 9625: 9619: 9617: 9592: 9588: 9577: 9573: 9562: 9558: 9549: 9547: 9544: 9540: 9539: 9535: 9528:"UTF-8 history" 9521: 9517: 9501: 9500: 9494: 9492: 9467: 9463: 9447: 9446: 9440: 9438: 9413: 9409: 9400: 9398: 9390: 9389: 9385: 9376: 9374: 9366: 9365: 9361: 9352: 9350: 9337: 9333: 9326: 9314: 9313: 9309: 9300: 9298: 9296:Unicode.org FAQ 9290: 9289: 9285: 9276: 9274: 9269: 9268: 9264: 9255: 9253: 9243: 9239: 9228: 9224: 9211: 9210: 9206: 9197: 9195: 9191: 9190: 9186: 9177: 9175: 9171: 9170: 9166: 9157: 9155: 9151: 9150: 9146: 9137: 9135: 9131: 9130: 9126: 9117: 9115: 9111: 9110: 9103: 9090: 9089: 9085: 9072: 9071: 9067: 9055: 9054: 9050: 9038: 9037: 9033: 9024: 9022: 9018: 9017: 9013: 9004: 9002: 8994: 8993: 8989: 8980: 8978: 8969: 8968: 8964: 8957: 8942: 8938: 8934: 8928: 8910: 8906: 8899: 8884: 8880: 8876: 8868: 8862: 8844: 8840: 8831: 8829: 8825: 8821: 8820: 8816: 8812: 8804: 8802: 8794: 8788: 8770: 8766: 8757: 8755: 8747: 8746: 8742: 8735: 8720: 8716: 8709: 8694: 8690: 8683: 8668: 8664: 8655: 8653: 8643: 8639: 8635: 8627: 8625: 8621: 8615: 8597: 8593: 8586: 8571: 8567: 8560: 8545: 8541: 8537: 8529: 8527: 8523: 8517: 8499: 8495: 8491: 8483: 8481: 8477: 8471: 8453: 8449: 8445: 8437: 8435: 8431: 8425: 8407: 8403: 8399: 8391: 8389: 8385: 8379: 8361: 8357: 8353: 8345: 8343: 8339: 8333: 8315: 8311: 8302: 8300: 8292: 8291: 8287: 8283: 8275: 8273: 8269: 8263: 8245: 8241: 8237: 8229: 8227: 8223: 8217: 8199: 8195: 8191: 8183: 8181: 8177: 8171: 8153: 8149: 8140: 8138: 8134: 8133: 8129: 8122: 8107: 8103: 8094: 8092: 8088: 8087: 8083: 8079: 8071: 8069: 8065: 8059: 8041: 8037: 8033: 8025: 8023: 8019: 8013: 7995: 7991: 7987: 7979: 7977: 7973: 7959: 7955: 7951: 7943: 7941: 7937: 7923: 7919: 7915: 7907: 7905: 7901: 7887: 7883: 7879: 7871: 7869: 7865: 7851: 7844: 7840: 7832: 7830: 7826: 7812: 7808: 7804: 7796: 7794: 7790: 7776: 7772: 7768: 7760: 7758: 7754: 7740: 7736: 7732: 7724: 7722: 7717: 7703: 7699: 7690: 7688: 7684: 7683: 7679: 7670: 7668: 7660: 7659: 7655: 7646: 7644: 7636: 7635: 7631: 7622: 7620: 7615: 7614: 7610: 7593: 7592: 7588: 7571: 7570: 7566: 7557: 7555: 7547: 7546: 7542: 7533: 7531: 7523: 7522: 7518: 7509: 7507: 7503: 7502: 7498: 7489: 7487: 7483: 7482: 7475: 7466: 7464: 7458: 7454: 7445: 7443: 7435: 7434: 7430: 7421: 7419: 7410: 7409: 7405: 7394: 7375: 7373: 7369: 7358: 7354:(1998-09-10) . 7349: 7336: 7319: 7318: 7314: 7305: 7303: 7298: 7297: 7293: 7284: 7282: 7277: 7276: 7272: 7268: 7260: 7258: 7253: 7245: 7237: 7233: 7229: 7220: 7218: 7207: 7203: 7194: 7192: 7183: 7182: 7178: 7173: 7172: 7165: 7150: 7146: 7141: 7136: 7135: 7122: 7118: 7109: 7105: 7094: 7090: 7081: 7077: 7072: 7004: 6995: 6992: 6987: 6986: 6980: 6964: 6961: 6956: 6955: 6948: 6944: 6939: 6938: 6932: 6929: 6924: 6923: 6918: 6914: 6909: 6908: 6901: 6897: 6892: 6891: 6884: 6880: 6873: 6867: 6843:Indic languages 6818: 6812: 6806: 6748: 6742: 6729: 6726: 6721: 6720: 6717:REVERSE SOLIDUS 6716: 6713: 6708: 6707: 6700: 6697: 6692: 6691: 6684:FULLWIDTH TILDE 6683: 6680: 6675: 6674: 6635: 6599: 6586: 6574: 6567: 6563: 6548:retroflex D (ɖ) 6528: 6524: 6520: 6515: 6508: 6497:For use in the 6495: 6474: 6370: 6368:Han unification 6364: 6362:Han unification 6359: 6354: 6340: 6335: 6334: 6330: 6325: 6324: 6310: 6248: 6242: 6234:percent-encoded 6214: 6210: 6206: 6202: 6198: 6194: 6190: 6186: 6182: 6178: 6122: 6116: 6066: 6060: 6040:ending sequence 6022: 6016: 5998:for macOS, and 5886: 5842: 5836: 5718: 5711: 5707: 5703: 5699: 5694: 5693: 5690:byte endianness 5686:byte order mark 5589: 5568: 5228:74–75, 7A, 7E, 5059: 5036: 4988: 4987: 4986: 4985: 4984: 4948: 4940: 4939: 4925: 4914: 4891: 4887: 4879: 4875: 4872:Kangxi radicals 4867: 4863: 4840: 4838: 4815: 4812: 4807: 4806: 4802: 4798: 4797: 4792: 4791: 4787: 4784: 4779: 4778: 4774: 4762: 4749: 4741: 4738: 4733: 4732: 4728: 4724: 4721: 4716: 4715: 4700: 4696: 4692: 4668: 4662: 4652: 4650: 4628:CARRIAGE RETURN 4627: 4624: 4620: 4617: 4614:LINE TABULATION 4613: 4610: 4594: 4590: 4586: 4582: 4574: 4568: 4567: 4562: 4556: 4555: 4544: 4542: 4522: 4520: 4516: 4512: 4504: 4502: 4498: 4494: 4487: 4483: 4479: 4460: 4457: 4455: 4451: 4448:byte order mark 4443: 4439: 4435: 4431: 4427: 4423: 4419: 4415: 4404: 4392: 4388: 4384: 4376: 4372: 4368: 4361: 4360: 4338: 4337: 4324: 4307: 4306: 4302: 4294: 4281: 4268: 4267: 4263: 4254: 4253: 4249: 4236: 4235: 4228: 4215: 4214: 4210: 4170: 4156: 4149: 4134: 4108: 4066: 4044: 4036: 4028: 4023: 4022: 4004: 3996: 3991: 3990: 3977:Separator, line 3972: 3967:, which are Cc 3938: 3930: 3909: 3888: 3864: 3791: 3783: 3762: 3740: 3714: 3692: 3666: 3640: 3606: 3598: 3564: 3538: 3512: 3504: 3488:Mark, enclosing 3483: 3462: 3441: 3433: 3404: 3398:modifier letter 3379: 3334: 3313: 3292: 3285: 3284:, Cased Letter 3281: 3277: 3251: 3250: 3249: 3231: 3200: 3186: 3148: 3146:Plane (Unicode) 3142: 3133: 3129: 3125: 3121: 3117: 3113: 3104: 3101: 3099: 3086: 3083: 3078: 3077: 3073: 3070: 3065: 3064: 3052: 3037: 3033: 3028: 3025: 3023: 3007: 3002: 2996: 2980: 2975: 2974: 2955: 2951: 2946: 2940: 2936: 2915: 2911: 2903: 2899: 2881: 2877: 2872: 2868: 2856: 2852: 2831: 2827: 2787: 2785: 2784: 2783: 2779: 2777: 2748: 2746: 2745: 2744: 2740: 2738: 2707: 2705: 2704: 2703: 2699: 2697: 2667: 2664: 2659: 2658: 2632: 2630: 2629: 2628: 2624: 2622: 2564: 2562: 2561: 2560: 2556: 2554: 2524: 2521: 2516: 2515: 2509: 2507: 2506: 2505: 2501: 2499: 2452: 2450: 2449: 2448: 2444: 2442: 2390:Hanifi Rohingya 2372: 2370: 2369: 2368: 2364: 2362: 2337: 2334: 2329: 2328: 2302: 2300: 2299: 2298: 2294: 2292: 2237: 2235: 2234: 2233: 2229: 2227: 2174: 2172: 2171: 2170: 2166: 2164: 2120:Psalter Pahlavi 2038: 2036: 2035: 2034: 2030: 2028: 2001: 1999: 1998: 1997: 1993: 1991: 1970: 1967: 1962: 1961: 1956: 1954: 1953: 1952: 1948: 1946: 1896: 1894: 1893: 1892: 1888: 1886: 1831: 1829: 1828: 1827: 1823: 1821: 1723: 1721: 1720: 1719: 1715: 1713: 1688: 1685: 1680: 1679: 1613: 1611: 1610: 1609: 1605: 1603: 1561: 1559: 1558: 1557: 1553: 1551: 1527:musical symbols 1485: 1483: 1482: 1481: 1477: 1475: 1421: 1419: 1418: 1417: 1413: 1411: 1360: 1358: 1357: 1356: 1352: 1350: 1332:Byzantine music 1314: 1312: 1311: 1310: 1306: 1304: 1297: 1224: 1222: 1221: 1220: 1216: 1214: 1186: 1183: 1178: 1177: 1171: 1168: 1163: 1162: 1156: 1154: 1153: 1152: 1148: 1146: 1120: 1117: 1115: 1114: 1113: 1109: 1107: 1072: 1069: 1067: 1066: 1065: 1061: 1059: 1024: 1021: 1019: 1018: 1017: 1013: 1011: 892: 842: 742: 730: 676:Michael Everson 646:writing systems 626: 620: 618:Scripts covered 564: 558: 452: 444:Michael Everson 434:, Thomas Milo, 356: 262: 260: 257:writing systems 236: 235: 234: 225:Without proper 214: 185: 177: 176: 168: 167: 147: 133: 131: 107: 106: 57: 37: 17: 12: 11: 5: 14551: 14541: 14540: 14535: 14530: 14513: 14512: 14509:Character sets 14502: 14499: 14498: 14496: 14495: 14490: 14485: 14480: 14475: 14470: 14465: 14460: 14454: 14452: 14451:Related topics 14448: 14447: 14445: 14444: 14439: 14434: 14433: 14432: 14427: 14417: 14415:Morse prosigns 14411: 14409: 14403: 14402: 14400: 14399: 14394: 14389: 14384: 14379: 14374: 14367: 14366: 14365: 14360: 14355: 14345: 14340: 14335: 14334: 14333: 14328: 14320: 14315: 14310: 14305: 14300: 14299: 14298: 14288: 14283: 14277: 14275: 14271: 14270: 14268: 14267: 14262: 14257: 14252: 14247: 14241: 14239: 14232: 14231: 14229: 14228: 14223: 14218: 14213: 14208: 14203: 14198: 14193: 14188: 14183: 14178: 14173: 14168: 14162: 14160: 14150: 14149: 14147: 14146: 14141: 14136: 14131: 14126: 14121: 14116: 14111: 14109:TI calculators 14106: 14101: 14096: 14091: 14086: 14081: 14076: 14071: 14066: 14061: 14056: 14051: 14046: 14041: 14036: 14031: 14026: 14021: 14016: 14011: 14006: 14001: 13996: 13987: 13982: 13977: 13972: 13967: 13962: 13957: 13952: 13947: 13942: 13937: 13932: 13927: 13922: 13917: 13912: 13907: 13902: 13897: 13891: 13889: 13885: 13884: 13882: 13881: 13876: 13871: 13866: 13861: 13856: 13851: 13850: 13849: 13844: 13839: 13834: 13829: 13824: 13819: 13817:United Kingdom 13814: 13809: 13804: 13794: 13788: 13786: 13775: 13774: 13772: 13771: 13766: 13760: 13758: 13751: 13750: 13748: 13747: 13742: 13737: 13732: 13727: 13722: 13717: 13712: 13707: 13702: 13697: 13692: 13687: 13682: 13677: 13672: 13667: 13662: 13652: 13647: 13641: 13639: 13633: 13632: 13630: 13629: 13624: 13619: 13614: 13609: 13604: 13599: 13594: 13589: 13584: 13579: 13574: 13569: 13564: 13559: 13554: 13549: 13544: 13539: 13534: 13529: 13523: 13521: 13515: 13514: 13512: 13511: 13506: 13501: 13496: 13491: 13486: 13481: 13476: 13471: 13466: 13461: 13456: 13451: 13446: 13441: 13436: 13431: 13426: 13421: 13416: 13411: 13406: 13401: 13396: 13391: 13386: 13381: 13376: 13371: 13366: 13361: 13356: 13351: 13346: 13341: 13336: 13331: 13326: 13321: 13316: 13311: 13306: 13301: 13296: 13291: 13286: 13281: 13276: 13271: 13266: 13261: 13256: 13251: 13246: 13241: 13236: 13231: 13226: 13221: 13216: 13211: 13205: 13203: 13202:DOS code pages 13199: 13198: 13196: 13195: 13190: 13185: 13180: 13175: 13170: 13165: 13160: 13155: 13150: 13148:Latin (Kermit) 13145: 13140: 13135: 13130: 13125: 13120: 13115: 13110: 13105: 13100: 13095: 13090: 13085: 13080: 13075: 13070: 13065: 13060: 13055: 13049: 13047: 13038: 13037: 13035: 13034: 13029: 13024: 13018: 13016: 13010: 13009: 13007: 13006: 13001: 12996: 12991: 12986: 12981: 12976: 12971: 12966: 12961: 12956: 12951: 12946: 12941: 12936: 12931: 12926: 12921: 12916: 12911: 12906: 12901: 12896: 12891: 12886: 12881: 12876: 12871: 12866: 12861: 12856: 12850: 12848: 12844: 12843: 12841: 12840: 12835: 12830: 12825: 12820: 12815: 12810: 12809: 12808: 12803: 12792: 12790: 12786: 12785: 12783: 12782: 12781: 12780: 12775: 12770: 12765: 12757: 12756: 12755: 12750: 12748:KOI-8 Cyrillic 12742: 12741: 12740: 12732: 12731: 12730: 12728:-16 (Romanian) 12725: 12720: 12715: 12710: 12705: 12700: 12695: 12690: 12685: 12680: 12675: 12670: 12665: 12660: 12651: 12649: 12643: 12642: 12640: 12639: 12634: 12633: 12632: 12631: 12630: 12625: 12617: 12612: 12607: 12589: 12584: 12583: 12582: 12572: 12567: 12566: 12565: 12560: 12559: 12558: 12553: 12548: 12543: 12533: 12526:Telegraph code 12522: 12520: 12516: 12515: 12508: 12507: 12500: 12493: 12485: 12476: 12475: 12472: 12471: 12469: 12468: 12457: 12445: 12442: 12441: 12439: 12438: 12433: 12428: 12427: 12426: 12416: 12411: 12406: 12400: 12398: 12394: 12393: 12391: 12390: 12385: 12379: 12377: 12373: 12372: 12370: 12369: 12364: 12359: 12354: 12349: 12344: 12339: 12334: 12329: 12324: 12319: 12314: 12309: 12304: 12299: 12294: 12289: 12284: 12279: 12274: 12269: 12264: 12259: 12254: 12249: 12244: 12239: 12234: 12229: 12224: 12219: 12214: 12209: 12204: 12199: 12194: 12189: 12184: 12179: 12174: 12169: 12164: 12159: 12154: 12148: 12143: 12138: 12133: 12128: 12123: 12118: 12113: 12108: 12103: 12098: 12093: 12088: 12083: 12078: 12073: 12068: 12063: 12058: 12053: 12048: 12043: 12038: 12033: 12028: 12023: 12018: 12013: 12008: 12003: 11998: 11992: 11990: 11984: 11983: 11981: 11980: 11975: 11970: 11965: 11960: 11955: 11950: 11945: 11940: 11935: 11930: 11925: 11920: 11915: 11910: 11905: 11900: 11895: 11890: 11885: 11880: 11878:Sorang Sompeng 11875: 11870: 11865: 11860: 11855: 11850: 11845: 11840: 11835: 11830: 11825: 11820: 11815: 11810: 11805: 11800: 11795: 11790: 11785: 11780: 11775: 11770: 11768:Miao (Pollard) 11765: 11760: 11755: 11750: 11745: 11740: 11735: 11730: 11725: 11720: 11715: 11710: 11705: 11700: 11695: 11690: 11685: 11680: 11675: 11670: 11665: 11660: 11655: 11650: 11645: 11640: 11635: 11630: 11625: 11620: 11615: 11610: 11605: 11600: 11595: 11590: 11585: 11580: 11575: 11570: 11565: 11560: 11555: 11550: 11545: 11540: 11535: 11530: 11525: 11520: 11514: 11512: 11511:Modern scripts 11508: 11507: 11505: 11504: 11499: 11494: 11489: 11484: 11478: 11476: 11468: 11467: 11454: 11453: 11451: 11450: 11445: 11440: 11435: 11430: 11425: 11419: 11417: 11416:Related topics 11413: 11412: 11410: 11409: 11404: 11399: 11394: 11389: 11383: 11381: 11377: 11376: 11374: 11373: 11368: 11363: 11362: 11361: 11356: 11346: 11341: 11336: 11330: 11328: 11324: 11323: 11321: 11320: 11315: 11310: 11305: 11300: 11299: 11298: 11288: 11283: 11278: 11273: 11268: 11262: 11260: 11254: 11253: 11250: 11249: 11247: 11246: 11241: 11236: 11231: 11226: 11221: 11216: 11211: 11206: 11201: 11195: 11193: 11187: 11186: 11184: 11183: 11178: 11173: 11168: 11167: 11166: 11156: 11150: 11148: 11141: 11137: 11136: 11133: 11132: 11130: 11129: 11124: 11119: 11114: 11109: 11104: 11099: 11094: 11089: 11084: 11079: 11073: 11071: 11067: 11066: 11064: 11063: 11058: 11053: 11048: 11043: 11038: 11033: 11024: 11019: 11013: 11011: 11002: 10998: 10997: 10995: 10994: 10989: 10984: 10979: 10974: 10969: 10968: 10967: 10956: 10954: 10948: 10947: 10945: 10944: 10939: 10934: 10928: 10926: 10922: 10921: 10914: 10913: 10906: 10899: 10891: 10885: 10884: 10878: 10872: 10846: 10840: 10839: 10838: 10837: 10836: 10835: 10834: 10829: 10807: 10806:External links 10804: 10803: 10802: 10796: 10771: 10770: 10756: 10743: 10730: 10713: 10700: 10687: 10674: 10650: 10648: 10645: 10642: 10641: 10620: 10596: 10579: 10562: 10541: 10519: 10501: 10484: 10461: 10438: 10415: 10397: 10368: 10344: 10320: 10296: 10277: 10252: 10237: 10197: 10179: 10157: 10135: 10130:preusstype.com 10124:Preuss, Ingo. 10116: 10098: 10095:on 2004-10-12. 10075: 10040: 10015: 9999: 9978: 9952: 9932: 9910: 9856: 9802: 9748: 9694: 9640: 9586: 9571: 9556: 9533: 9526:(2003-04-30). 9515: 9485:. BCP 18. 9461: 9407: 9383: 9359: 9342:(2008-05-05). 9331: 9324: 9307: 9283: 9273:. Beuth Verlag 9262: 9237: 9222: 9219:on 2011-07-16. 9204: 9184: 9164: 9144: 9124: 9101: 9083: 9065: 9048: 9031: 9011: 9000:emojipedia.org 8987: 8962: 8955: 8936: 8933: 8932: 8926: 8911: 8904: 8897: 8878: 8875: 8874: 8866: 8860: 8845: 8838: 8814: 8811: 8810: 8792: 8786: 8771: 8764: 8740: 8733: 8714: 8707: 8688: 8681: 8662: 8651:Android Police 8637: 8634: 8633: 8619: 8613: 8598: 8591: 8584: 8565: 8558: 8539: 8536: 8535: 8521: 8515: 8500: 8493: 8490: 8489: 8475: 8469: 8454: 8447: 8444: 8443: 8429: 8423: 8408: 8401: 8398: 8397: 8383: 8377: 8362: 8355: 8352: 8351: 8337: 8331: 8316: 8309: 8298:emojipedia.org 8285: 8282: 8281: 8267: 8261: 8246: 8239: 8236: 8235: 8221: 8215: 8200: 8193: 8190: 8189: 8175: 8169: 8154: 8147: 8127: 8120: 8101: 8081: 8078: 8077: 8063: 8057: 8042: 8035: 8032: 8031: 8017: 8011: 7996: 7989: 7986: 7985: 7971: 7960: 7953: 7950: 7949: 7935: 7924: 7917: 7914: 7913: 7899: 7888: 7881: 7878: 7877: 7863: 7852: 7842: 7839: 7838: 7824: 7813: 7806: 7803: 7802: 7788: 7777: 7770: 7767: 7766: 7752: 7741: 7734: 7731: 7730: 7715: 7704: 7697: 7677: 7653: 7629: 7608: 7586: 7564: 7540: 7516: 7496: 7473: 7452: 7428: 7403: 7383:Bob Belleville 7334: 7312: 7291: 7270: 7267: 7266: 7251: 7243: 7234: 7227: 7201: 7175: 7174: 7171: 7170: 7163: 7143: 7142: 7140: 7137: 7134: 7133: 7116: 7103: 7088: 7074: 7073: 7071: 7068: 7067: 7066: 7061: 7059:Unicode symbol 7056: 7051: 7046: 7041: 7035: 7030: 7025: 7020: 7010: 7003: 7000: 6977: 6976: 6953: 6949:YI SYLLABLE WU 6936: 6906: 6869:Main article: 6866: 6863: 6808:Main article: 6805: 6802: 6768:Tibetan script 6741: 6738: 6634: 6631: 6598: 6595: 6566: 6560: 6503:Azeri alphabet 6494: 6491: 6473: 6470: 6366:Main article: 6363: 6360: 6358: 6355: 6353: 6350: 6331:LINE SEPARATOR 6309: 6306: 6244:Main article: 6241: 6238: 6171: 6170: 6167: 6160: 6118:Main article: 6115: 6112: 6062:Main article: 6059: 6056: 6018:Main article: 6015: 6012: 5977:World Wide Web 5969:extended ASCII 5885: 5882: 5850:World Wide Web 5835: 5832: 5752:wide character 5667: 5666: 5653: 5647: 5638: 5588: 5585: 5563: 5562: 5557: 5554: 5550: 5549: 5543: 5538: 5534: 5533: 5527: 5524: 5520: 5519: 5513: 5504: 5500: 5499: 5493: 5487: 5486: 5483:Block Elements 5480: 5474: 5473: 5467: 5462: 5458: 5457: 5451: 5448:02, 0A, 20–21, 5445: 5441: 5440: 5434: 5433:82–83, 95, 97 5407: 5403: 5402: 5396: 5387: 5386: 5380: 5372: 5371: 5365: 5356: 5352: 5351: 5345: 5335: 5334: 5328: 5321: 5320: 5314: 5296: 5292: 5291: 5288:Greek Extended 5285: 5282: 5278: 5277: 5271: 5261: 5257: 5256: 5250: 5244: 5240: 5239: 5233: 5226: 5222: 5221: 5215: 5193: 5192: 5189:IPA Extensions 5186: 5182: 5181: 5174: 5171: 5167: 5166: 5156: 5145: 5144: 5138: 5118: 5114: 5113: 5107: 5099: 5098: 5092: 5085: 5081: 5080: 5077: 5074: 5040:Windows NT 4.0 5035: 5032: 5000:ligature forms 4949: 4942: 4941: 4926: 4919: 4918: 4917: 4916: 4915: 4913: 4910: 4848:CJK characters 4761: 4758: 4725:YI SYLLABLE WU 4661: 4658: 4605:as defined in 4528: 4527: 4509: 4491: 4401:surrogate pair 4381:high-surrogate 4363: 4362: 4359: 4358: 4322: 4315:. Version 23. 4300: 4279: 4261: 4247: 4226: 4207: 4206: 4203: 4202: 4199: 4196: 4193: 4189: 4188: 4185: 4182: 4179: 4176: 4173: 4167: 4166: 4163: 4146: 4143: 4140: 4137: 4131: 4130: 4127: 4124: 4117: 4114: 4111: 4105: 4104: 4081: 4078: 4075: 4072: 4069: 4063: 4062: 4059: 4056: 4053: 4050: 4049:Other, control 4047: 4041: 4040: 4033: 4032: 4019: 4016: 4013: 4010: 4007: 4001: 4000: 3997:LINE SEPARATOR 3987: 3984: 3981: 3978: 3975: 3969: 3968: 3953: 3950: 3947: 3944: 3941: 3935: 3934: 3927: 3926: 3924: 3921: 3918: 3915: 3912: 3906: 3905: 3903: 3900: 3897: 3894: 3891: 3885: 3884: 3879: 3876: 3873: 3870: 3867: 3861: 3860: 3806: 3803: 3800: 3797: 3794: 3788: 3787: 3780: 3779: 3777: 3774: 3771: 3768: 3765: 3759: 3758: 3755: 3752: 3749: 3746: 3743: 3737: 3736: 3733:quotation mark 3729: 3726: 3723: 3720: 3717: 3711: 3710: 3707: 3704: 3701: 3698: 3695: 3689: 3688: 3681: 3678: 3675: 3672: 3669: 3663: 3662: 3655: 3652: 3649: 3646: 3643: 3637: 3636: 3629:tie characters 3621: 3618: 3615: 3612: 3609: 3603: 3602: 3601:, Punctuation 3595: 3594: 3579: 3576: 3573: 3570: 3567: 3561: 3560: 3557:Roman numerals 3553: 3550: 3547: 3544: 3543:Number, letter 3541: 3535: 3534: 3527: 3524: 3521: 3518: 3515: 3509: 3508: 3501: 3500: 3498: 3495: 3492: 3489: 3486: 3480: 3479: 3477: 3474: 3471: 3468: 3465: 3459: 3458: 3456: 3453: 3450: 3447: 3444: 3438: 3437: 3430: 3429: 3419: 3416: 3413: 3410: 3407: 3401: 3400: 3394: 3391: 3388: 3385: 3382: 3376: 3375: 3349: 3346: 3343: 3340: 3337: 3331: 3330: 3328: 3325: 3322: 3319: 3316: 3310: 3309: 3307: 3304: 3301: 3298: 3295: 3289: 3288: 3274: 3273: 3271: 3269: 3267: 3265: 3263: 3259: 3258: 3255: 3246: 3243: 3240: 3237: 3233: 3232: 3230: 3229: 3222: 3215: 3207: 3185: 3182: 3144:Main article: 3141: 3138: 3006: 3003: 2995: 2992: 2988:CJK characters 2979: 2976: 2973: 2972: 2970: 2969: 2966: 2963: 2949: 2934: 2932: 2931: 2925: 2922: 2918:6.2 added the 2909: 2897: 2895: 2894: 2891: 2888: 2885: 2875: 2866: 2864: 2863: 2860: 2850: 2824: 2823: 2820: 2819: 2790: 2786: 2775: 2772: 2770: 2764: 2762:September 2024 2759: 2755: 2754: 2751: 2747: 2736: 2730: 2728:September 2023 2725: 2721: 2720: 2710: 2706: 2695: 2692: 2686: 2684:September 2022 2681: 2677: 2676: 2635: 2631: 2620: 2617: 2611: 2609:September 2021 2606: 2602: 2601: 2567: 2563: 2552: 2549: 2544: 2538: 2533: 2529: 2528: 2512: 2508: 2497: 2491: 2486: 2482: 2481: 2455: 2451: 2440: 2437: 2431: 2426: 2422: 2421: 2375: 2371: 2360: 2357: 2351: 2346: 2342: 2341: 2305: 2301: 2290: 2287: 2282: 2276: 2271: 2267: 2266: 2240: 2236: 2225: 2222: 2216: 2211: 2207: 2206: 2177: 2173: 2162: 2159: 2154: 2148: 2143: 2139: 2138: 2041: 2037: 2026: 2023: 2017: 2012: 2008: 2007: 2004: 2000: 1989: 1983: 1981:September 2013 1978: 1974: 1973: 1959: 1955: 1944: 1938: 1936:September 2012 1933: 1929: 1928: 1899: 1895: 1884: 1881: 1876: 1870: 1865: 1861: 1860: 1834: 1830: 1819: 1816: 1811: 1805: 1800: 1796: 1795: 1793:Vedic Sanskrit 1726: 1722: 1711: 1708: 1702: 1697: 1693: 1692: 1616: 1612: 1601: 1598: 1595: 1590: 1586: 1585: 1564: 1560: 1549: 1546: 1540: 1535: 1531: 1530: 1488: 1484: 1473: 1470: 1467: 1462: 1458: 1457: 1424: 1420: 1409: 1406: 1401: 1395: 1390: 1386: 1385: 1363: 1359: 1348: 1345: 1340: 1336: 1335: 1317: 1313: 1302: 1299: 1294: 1291: 1286: 1282: 1281: 1227: 1223: 1212: 1209: 1206: 1200: 1198:September 1999 1195: 1191: 1190: 1159: 1155: 1144: 1141: 1136: 1132: 1131: 1123: 1116: 1105: 1102: 1096: 1091: 1087: 1086: 1075: 1068: 1057: 1054: 1046: 1043: 1038: 1034: 1033: 1027: 1020: 1009: 1006: 999: 994: 990: 989: 895: 890: 887: 884: 874: 869: 865: 864: 861: 857: 856: 853: 850: 844: 839: 836: 741: 738: 729: 726: 648:in use today. 638:OpenOffice.org 622:Main article: 619: 616: 614:environments. 560:Main article: 557: 554: 451: 448: 412:ISO/IEC 8859-1 405:word processor 398: 355: 352: 321:directionality 285:character sets 229:, you may see 221: 220: 219: 216: 215: 213: 212: 205: 198: 190: 187: 186: 184: 183: 174: 164: 161: 160: 159:, among others 154: 150: 149: 146: 145: 140: 134: 130: 129: 124: 119: 114: 108: 105: 104: 99: 94: 88: 86: 82: 81: 78: 74: 73: 63: 59: 58: 56: 55: 52: 45: 43: 39: 38: 32: 15: 9: 6: 4: 3: 2: 14550: 14539: 14536: 14534: 14531: 14529: 14526: 14525: 14523: 14510: 14500: 14494: 14491: 14489: 14486: 14484: 14481: 14479: 14476: 14474: 14471: 14469: 14466: 14464: 14461: 14459: 14456: 14455: 14453: 14449: 14443: 14440: 14438: 14435: 14431: 14428: 14426: 14423: 14422: 14421: 14418: 14416: 14413: 14412: 14410: 14408: 14404: 14398: 14395: 14393: 14390: 14388: 14385: 14383: 14380: 14378: 14375: 14373: 14372: 14368: 14364: 14361: 14359: 14356: 14354: 14351: 14350: 14349: 14346: 14344: 14341: 14339: 14336: 14332: 14329: 14327: 14324: 14323: 14321: 14319: 14316: 14314: 14311: 14309: 14306: 14304: 14301: 14297: 14294: 14293: 14292: 14289: 14287: 14284: 14282: 14279: 14278: 14276: 14272: 14266: 14263: 14261: 14258: 14256: 14253: 14251: 14248: 14246: 14243: 14242: 14240: 14237: 14233: 14227: 14224: 14222: 14219: 14217: 14214: 14212: 14209: 14207: 14204: 14202: 14199: 14197: 14194: 14192: 14189: 14187: 14184: 14182: 14179: 14177: 14174: 14172: 14169: 14167: 14164: 14163: 14161: 14159: 14158:ISO/IEC 10646 14155: 14151: 14145: 14142: 14140: 14137: 14135: 14132: 14130: 14127: 14125: 14122: 14120: 14117: 14115: 14112: 14110: 14107: 14105: 14102: 14100: 14097: 14095: 14092: 14090: 14087: 14085: 14082: 14080: 14077: 14075: 14072: 14070: 14067: 14065: 14062: 14060: 14057: 14055: 14052: 14050: 14047: 14045: 14042: 14040: 14037: 14035: 14032: 14030: 14027: 14025: 14022: 14020: 14017: 14015: 14012: 14010: 14007: 14005: 14002: 14000: 13997: 13995: 13991: 13988: 13986: 13983: 13981: 13978: 13976: 13975:Compucolor II 13973: 13971: 13968: 13966: 13963: 13961: 13958: 13956: 13953: 13951: 13948: 13946: 13943: 13941: 13938: 13936: 13933: 13931: 13930:Acorn RISC OS 13928: 13926: 13923: 13921: 13918: 13916: 13913: 13911: 13908: 13906: 13903: 13901: 13898: 13896: 13893: 13892: 13890: 13886: 13880: 13877: 13875: 13872: 13870: 13867: 13865: 13862: 13860: 13859:8-bit Turkish 13857: 13855: 13852: 13848: 13845: 13843: 13840: 13838: 13835: 13833: 13830: 13828: 13825: 13823: 13820: 13818: 13815: 13813: 13810: 13808: 13805: 13803: 13800: 13799: 13798: 13795: 13793: 13790: 13789: 13787: 13784: 13780: 13776: 13770: 13767: 13765: 13762: 13761: 13759: 13756: 13752: 13746: 13743: 13741: 13738: 13736: 13733: 13731: 13728: 13726: 13723: 13721: 13718: 13716: 13713: 13711: 13708: 13706: 13703: 13701: 13698: 13696: 13693: 13691: 13688: 13686: 13683: 13681: 13678: 13676: 13673: 13671: 13668: 13666: 13663: 13660: 13656: 13653: 13651: 13648: 13646: 13643: 13642: 13640: 13638: 13634: 13628: 13625: 13623: 13620: 13618: 13615: 13613: 13610: 13608: 13605: 13603: 13600: 13598: 13595: 13593: 13590: 13588: 13585: 13583: 13580: 13578: 13575: 13573: 13570: 13568: 13565: 13563: 13560: 13558: 13555: 13553: 13550: 13548: 13545: 13543: 13540: 13538: 13535: 13533: 13530: 13528: 13525: 13524: 13522: 13520: 13516: 13510: 13507: 13505: 13502: 13500: 13497: 13495: 13492: 13490: 13487: 13485: 13482: 13480: 13477: 13475: 13472: 13470: 13467: 13465: 13462: 13460: 13457: 13455: 13452: 13450: 13447: 13445: 13442: 13440: 13437: 13435: 13432: 13430: 13427: 13425: 13422: 13420: 13417: 13415: 13412: 13410: 13407: 13405: 13402: 13400: 13397: 13395: 13392: 13390: 13387: 13385: 13382: 13380: 13377: 13375: 13372: 13370: 13367: 13365: 13362: 13360: 13357: 13355: 13352: 13350: 13347: 13345: 13342: 13340: 13337: 13335: 13332: 13330: 13327: 13325: 13322: 13320: 13317: 13315: 13312: 13310: 13307: 13305: 13302: 13300: 13297: 13295: 13292: 13290: 13287: 13285: 13282: 13280: 13277: 13275: 13272: 13270: 13267: 13265: 13262: 13260: 13257: 13255: 13252: 13250: 13247: 13245: 13242: 13240: 13237: 13235: 13232: 13230: 13227: 13225: 13222: 13220: 13217: 13215: 13212: 13210: 13207: 13206: 13204: 13200: 13194: 13191: 13189: 13186: 13184: 13181: 13179: 13176: 13174: 13171: 13169: 13166: 13164: 13161: 13159: 13156: 13154: 13151: 13149: 13146: 13144: 13141: 13139: 13136: 13134: 13131: 13129: 13126: 13124: 13121: 13119: 13116: 13114: 13111: 13109: 13106: 13104: 13101: 13099: 13096: 13094: 13091: 13089: 13086: 13084: 13081: 13079: 13076: 13074: 13071: 13069: 13066: 13064: 13061: 13059: 13056: 13054: 13051: 13050: 13048: 13044: 13039: 13033: 13030: 13028: 13027:ISO/IEC 10367 13025: 13023: 13020: 13019: 13017: 13015: 13011: 13005: 13002: 13000: 12997: 12995: 12992: 12990: 12987: 12985: 12982: 12980: 12977: 12975: 12972: 12970: 12967: 12965: 12962: 12960: 12957: 12955: 12952: 12950: 12947: 12945: 12942: 12940: 12937: 12935: 12932: 12930: 12927: 12925: 12922: 12920: 12917: 12915: 12912: 12910: 12907: 12905: 12902: 12900: 12897: 12895: 12892: 12890: 12887: 12885: 12882: 12880: 12877: 12875: 12872: 12870: 12867: 12865: 12862: 12860: 12857: 12855: 12852: 12851: 12849: 12845: 12839: 12836: 12834: 12831: 12829: 12826: 12824: 12821: 12819: 12816: 12814: 12811: 12807: 12804: 12802: 12799: 12798: 12797: 12794: 12793: 12791: 12787: 12779: 12776: 12774: 12771: 12769: 12766: 12764: 12761: 12760: 12758: 12754: 12751: 12749: 12746: 12745: 12743: 12739: 12736: 12735: 12733: 12729: 12726: 12724: 12721: 12719: 12716: 12714: 12711: 12709: 12706: 12704: 12701: 12699: 12696: 12694: 12691: 12689: 12686: 12684: 12681: 12679: 12678:-5 (Cyrillic) 12676: 12674: 12671: 12669: 12666: 12664: 12661: 12659: 12656: 12655: 12653: 12652: 12650: 12648: 12644: 12638: 12635: 12629: 12626: 12624: 12621: 12620: 12618: 12616: 12613: 12611: 12608: 12606: 12603: 12602: 12601: 12597: 12593: 12590: 12588: 12585: 12581: 12578: 12577: 12576: 12573: 12571: 12568: 12564: 12561: 12557: 12554: 12552: 12549: 12547: 12544: 12542: 12539: 12538: 12537: 12534: 12532: 12529: 12528: 12527: 12524: 12523: 12521: 12517: 12513: 12506: 12501: 12499: 12494: 12492: 12487: 12486: 12483: 12467: 12458: 12456: 12447: 12446: 12443: 12437: 12434: 12432: 12429: 12425: 12422: 12421: 12420: 12417: 12415: 12412: 12410: 12407: 12405: 12402: 12401: 12399: 12395: 12389: 12386: 12384: 12381: 12380: 12378: 12374: 12368: 12365: 12363: 12360: 12358: 12355: 12353: 12350: 12348: 12347:Tulu Tigalari 12345: 12343: 12340: 12338: 12335: 12333: 12330: 12328: 12325: 12323: 12322:Sylheti Nagri 12320: 12318: 12315: 12313: 12312:South Arabian 12310: 12308: 12305: 12303: 12300: 12298: 12295: 12293: 12290: 12288: 12285: 12283: 12280: 12278: 12275: 12273: 12270: 12268: 12265: 12263: 12260: 12258: 12255: 12253: 12250: 12248: 12245: 12243: 12240: 12238: 12237:Old Hungarian 12235: 12233: 12230: 12228: 12225: 12223: 12220: 12218: 12215: 12213: 12210: 12208: 12205: 12203: 12200: 12198: 12195: 12193: 12190: 12188: 12185: 12183: 12180: 12178: 12175: 12173: 12170: 12168: 12165: 12163: 12160: 12158: 12155: 12152: 12149: 12147: 12144: 12142: 12139: 12137: 12134: 12132: 12129: 12127: 12124: 12122: 12119: 12117: 12114: 12112: 12109: 12107: 12104: 12102: 12099: 12097: 12094: 12092: 12089: 12087: 12084: 12082: 12079: 12077: 12074: 12072: 12069: 12067: 12064: 12062: 12059: 12057: 12054: 12052: 12049: 12047: 12044: 12042: 12039: 12037: 12034: 12032: 12029: 12027: 12024: 12022: 12019: 12017: 12014: 12012: 12009: 12007: 12004: 12002: 11999: 11997: 11994: 11993: 11991: 11985: 11979: 11976: 11974: 11971: 11969: 11966: 11964: 11961: 11959: 11956: 11954: 11951: 11949: 11946: 11944: 11941: 11939: 11936: 11934: 11931: 11929: 11926: 11924: 11921: 11919: 11916: 11914: 11911: 11909: 11906: 11904: 11901: 11899: 11896: 11894: 11891: 11889: 11886: 11884: 11881: 11879: 11876: 11874: 11871: 11869: 11866: 11864: 11861: 11859: 11856: 11854: 11851: 11849: 11846: 11844: 11841: 11839: 11836: 11834: 11831: 11829: 11826: 11824: 11821: 11819: 11816: 11814: 11811: 11809: 11806: 11804: 11801: 11799: 11796: 11794: 11791: 11789: 11786: 11784: 11781: 11779: 11776: 11774: 11771: 11769: 11766: 11764: 11761: 11759: 11758:Mende Kikakui 11756: 11754: 11753:Masaram Gondi 11751: 11749: 11746: 11744: 11741: 11739: 11738:Lisu (Fraser) 11736: 11734: 11731: 11729: 11726: 11724: 11721: 11719: 11716: 11714: 11711: 11709: 11706: 11704: 11701: 11699: 11696: 11694: 11691: 11689: 11686: 11684: 11681: 11679: 11676: 11674: 11671: 11669: 11666: 11664: 11661: 11659: 11656: 11654: 11651: 11649: 11646: 11644: 11641: 11639: 11638:Gunjala Gondi 11636: 11634: 11631: 11629: 11626: 11624: 11621: 11619: 11616: 11614: 11611: 11609: 11606: 11604: 11601: 11599: 11596: 11594: 11591: 11589: 11586: 11584: 11581: 11579: 11576: 11574: 11571: 11569: 11566: 11564: 11561: 11559: 11556: 11554: 11551: 11549: 11546: 11544: 11541: 11539: 11536: 11534: 11531: 11529: 11526: 11524: 11521: 11519: 11516: 11515: 11513: 11509: 11503: 11500: 11498: 11495: 11493: 11490: 11488: 11485: 11483: 11480: 11479: 11477: 11475: 11469: 11464: 11459: 11455: 11449: 11446: 11444: 11441: 11439: 11436: 11434: 11431: 11429: 11426: 11424: 11421: 11420: 11418: 11414: 11408: 11405: 11403: 11400: 11398: 11395: 11393: 11390: 11388: 11385: 11384: 11382: 11378: 11372: 11369: 11367: 11364: 11360: 11357: 11355: 11352: 11351: 11350: 11347: 11345: 11342: 11340: 11337: 11335: 11332: 11331: 11329: 11325: 11319: 11316: 11314: 11311: 11309: 11306: 11304: 11301: 11297: 11294: 11293: 11292: 11289: 11287: 11284: 11282: 11279: 11277: 11274: 11272: 11269: 11267: 11264: 11263: 11261: 11255: 11245: 11242: 11240: 11237: 11235: 11232: 11230: 11227: 11225: 11222: 11220: 11217: 11215: 11212: 11210: 11207: 11205: 11202: 11200: 11197: 11196: 11194: 11192: 11188: 11182: 11179: 11177: 11174: 11172: 11169: 11165: 11164:ISO/IEC 14651 11162: 11161: 11160: 11157: 11155: 11152: 11151: 11149: 11145: 11142: 11138: 11128: 11125: 11123: 11120: 11118: 11115: 11113: 11110: 11108: 11105: 11103: 11100: 11098: 11095: 11093: 11090: 11088: 11085: 11083: 11080: 11078: 11075: 11074: 11072: 11068: 11062: 11059: 11057: 11054: 11052: 11049: 11047: 11044: 11042: 11039: 11037: 11034: 11032: 11028: 11025: 11023: 11020: 11018: 11015: 11014: 11012: 11010: 11006: 11003: 10999: 10993: 10990: 10988: 10985: 10983: 10980: 10978: 10975: 10973: 10970: 10966: 10963: 10962: 10961: 10958: 10957: 10955: 10953: 10949: 10943: 10940: 10938: 10935: 10933: 10930: 10929: 10927: 10923: 10919: 10912: 10907: 10905: 10900: 10898: 10893: 10892: 10889: 10882: 10879: 10876: 10873: 10871: 10867: 10862: 10858: 10857: 10852: 10847: 10844: 10841: 10833: 10830: 10828: 10825: 10824: 10823: 10820: 10819: 10818: 10815: 10814: 10813: 10812:Unicode, Inc. 10810: 10809: 10799: 10793: 10789: 10785: 10781: 10780: 10774: 10773: 10768: 10767:0-7645-4625-2 10764: 10760: 10757: 10755: 10754:0-596-10121-X 10751: 10747: 10744: 10742: 10741:0-201-70052-2 10738: 10734: 10731: 10729: 10728:0-321-48091-0 10725: 10721: 10717: 10714: 10712: 10711:0-321-18578-1 10708: 10704: 10701: 10699: 10698:0-201-61633-5 10695: 10691: 10688: 10686: 10685:0-321-12730-7 10682: 10678: 10675: 10672: 10671:Unicode 6.0.0 10668: 10667:9781936213016 10664: 10660: 10656: 10652: 10651: 10630: 10624: 10609: 10603: 10601: 10589: 10583: 10572: 10566: 10559:. 2021-06-14. 10558: 10554: 10548: 10546: 10537: 10533: 10529: 10523: 10512: 10505: 10498:. 2002-12-02. 10494: 10488: 10471: 10465: 10448: 10442: 10425: 10419: 10412: 10408: 10407: 10401: 10387:on 2011-04-22 10386: 10382: 10377: 10372: 10358: 10354: 10348: 10331: 10324: 10310: 10306: 10300: 10292: 10288: 10281: 10267: 10263: 10256: 10248: 10244: 10240: 10234: 10230: 10226: 10221: 10216: 10212: 10208: 10201: 10193: 10189: 10183: 10176:. 2023-08-11. 10175: 10171: 10167: 10161: 10154:. 2023-05-12. 10153: 10149: 10145: 10139: 10131: 10127: 10120: 10113: 10108: 10102: 10094: 10090: 10086: 10079: 10065:on 2013-06-25 10064: 10060: 10059: 10054: 10047: 10045: 10036: 10032: 10028: 10022: 10020: 10012: 10008: 10003: 9988: 9982: 9974: 9970: 9963: 9956: 9942: 9936: 9921: 9914: 9906: 9900: 9886: 9883: 9878: 9873: 9869: 9868: 9860: 9852: 9846: 9832: 9829: 9824: 9819: 9815: 9814: 9806: 9798: 9792: 9778: 9775: 9770: 9765: 9761: 9760: 9752: 9744: 9738: 9724: 9721: 9716: 9711: 9707: 9706: 9698: 9690: 9684: 9670: 9667: 9662: 9657: 9653: 9652: 9644: 9636: 9630: 9616: 9613: 9608: 9603: 9599: 9598: 9590: 9582: 9575: 9567: 9560: 9543: 9537: 9529: 9525: 9519: 9511: 9505: 9491: 9488: 9483: 9478: 9474: 9473: 9465: 9457: 9451: 9437: 9434: 9429: 9424: 9420: 9419: 9411: 9397: 9393: 9387: 9373: 9369: 9363: 9349: 9345: 9341: 9335: 9327: 9321: 9317: 9311: 9297: 9293: 9287: 9272: 9266: 9251: 9247: 9241: 9234: 9231: 9226: 9218: 9214: 9208: 9194: 9188: 9174: 9168: 9154: 9148: 9134: 9128: 9114: 9108: 9106: 9097: 9093: 9087: 9079: 9075: 9069: 9061: 9060: 9052: 9044: 9043: 9035: 9021: 9015: 9001: 8997: 8991: 8976: 8972: 8966: 8958: 8952: 8948: 8947: 8940: 8929: 8923: 8919: 8918: 8913: 8912: 8908: 8900: 8894: 8890: 8889: 8882: 8871: 8867: 8863: 8857: 8853: 8852: 8847: 8846: 8842: 8824: 8818: 8801: 8797: 8793: 8789: 8783: 8779: 8778: 8773: 8772: 8768: 8754: 8750: 8744: 8736: 8730: 8726: 8725: 8718: 8710: 8704: 8700: 8699: 8692: 8684: 8678: 8674: 8673: 8666: 8652: 8648: 8641: 8624: 8620: 8616: 8610: 8606: 8605: 8600: 8599: 8595: 8587: 8581: 8577: 8576: 8569: 8561: 8555: 8551: 8550: 8543: 8526: 8522: 8518: 8512: 8508: 8507: 8502: 8501: 8497: 8480: 8476: 8472: 8466: 8462: 8461: 8456: 8455: 8451: 8434: 8430: 8426: 8420: 8416: 8415: 8410: 8409: 8405: 8388: 8384: 8380: 8374: 8370: 8369: 8364: 8363: 8359: 8342: 8338: 8334: 8328: 8324: 8323: 8318: 8317: 8313: 8299: 8295: 8289: 8272: 8268: 8264: 8258: 8254: 8253: 8248: 8247: 8243: 8226: 8222: 8218: 8212: 8208: 8207: 8202: 8201: 8197: 8180: 8176: 8172: 8170:0-321-48091-0 8166: 8162: 8161: 8156: 8155: 8151: 8137: 8131: 8123: 8121:0-321-48091-0 8117: 8113: 8112: 8105: 8091: 8085: 8068: 8064: 8060: 8058:0-321-18578-1 8054: 8050: 8049: 8044: 8043: 8039: 8022: 8018: 8014: 8012:0-321-18578-1 8008: 8004: 8003: 7998: 7997: 7993: 7976: 7972: 7968: 7967: 7962: 7961: 7957: 7940: 7936: 7932: 7931: 7926: 7925: 7921: 7904: 7900: 7896: 7895: 7890: 7889: 7885: 7868: 7864: 7860: 7859: 7854: 7853: 7849: 7847: 7829: 7825: 7821: 7820: 7815: 7814: 7810: 7793: 7789: 7785: 7784: 7779: 7778: 7774: 7757: 7753: 7749: 7748: 7743: 7742: 7738: 7720: 7716: 7712: 7711: 7706: 7705: 7701: 7687: 7681: 7667: 7663: 7657: 7643: 7639: 7633: 7618: 7612: 7604: 7600: 7596: 7590: 7582: 7578: 7574: 7568: 7554: 7550: 7544: 7530: 7526: 7520: 7506: 7505:"Unicode FAQ" 7500: 7486: 7480: 7478: 7463: 7456: 7442: 7438: 7432: 7417: 7413: 7407: 7400: 7398: 7392: 7388: 7384: 7368: 7364: 7357: 7353: 7347: 7345: 7343: 7341: 7339: 7330: 7326: 7322: 7316: 7301: 7295: 7280: 7274: 7256: 7252: 7248: 7244: 7240: 7236: 7235: 7231: 7216: 7212: 7205: 7190: 7186: 7180: 7176: 7166: 7160: 7156: 7155: 7148: 7144: 7130: 7126: 7120: 7113: 7107: 7101: 7098: 7092: 7085: 7079: 7075: 7065: 7062: 7060: 7057: 7055: 7052: 7050: 7047: 7045: 7042: 7039: 7036: 7034: 7031: 7029: 7026: 7024: 7021: 7014: 7011: 7009: 7006: 7005: 6999: 6983: 6974: 6970: 6954: 6951: 6937: 6921: 6907: 6904: 6890: 6889: 6888: 6877: 6872: 6862: 6860: 6857:('gsub'), or 6856: 6852: 6848: 6844: 6840: 6836: 6832: 6828: 6824: 6817: 6811: 6801: 6798: 6793: 6788: 6783: 6782:Thai alphabet 6779: 6777: 6773: 6769: 6764: 6760: 6756: 6752: 6751:Indic scripts 6747: 6740:Indic scripts 6737: 6735: 6704: 6689: 6672: 6668: 6664: 6659: 6655: 6653: 6649: 6645: 6640: 6630: 6628: 6627:Trojan Source 6624: 6620: 6616: 6611: 6608: 6604: 6594: 6592: 6580: 6571: 6559: 6557: 6553: 6549: 6545: 6541: 6536: 6534: 6518: 6511: 6504: 6500: 6490: 6483: 6478: 6469: 6467: 6463: 6459: 6454: 6452: 6446: 6444: 6440: 6436: 6432: 6428: 6427:United States 6424: 6420: 6416: 6412: 6406: 6403: 6398: 6394: 6390: 6385: 6383: 6379: 6375: 6369: 6349: 6346: 6321: 6319: 6315: 6305: 6303: 6299: 6294: 6290: 6286: 6284: 6280: 6276: 6272: 6268: 6263: 6261: 6257: 6253: 6247: 6237: 6235: 6231: 6227: 6223: 6218: 6176: 6168: 6165: 6161: 6159:FFFE or FFFF. 6158: 6157: 6156: 6154: 6150: 6145: 6143: 6139: 6135: 6131: 6127: 6121: 6111: 6109: 6105: 6101: 6097: 6092: 6089: 6087: 6083: 6079: 6075: 6070: 6065: 6055: 6053: 6052:Shape context 6047: 6045: 6041: 6037: 6033: 6029: 6028:ISO/IEC 14755 6025: 6021: 6020:Unicode input 6014:Input methods 6011: 6009: 6005: 6001: 5997: 5993: 5989: 5985: 5980: 5978: 5974: 5970: 5966: 5962: 5958: 5954: 5952: 5948: 5944: 5940: 5936: 5932: 5928: 5924: 5920: 5916: 5912: 5908: 5904: 5900: 5896: 5892: 5881: 5879: 5875: 5871: 5867: 5862: 5859: 5855: 5851: 5847: 5841: 5831: 5829: 5825: 5821: 5817: 5813: 5809: 5808:character set 5805: 5801: 5797: 5795: 5791: 5787: 5783: 5779: 5775: 5771: 5767: 5765: 5761: 5757: 5753: 5749: 5744: 5739: 5737: 5733: 5729: 5724: 5715: 5691: 5687: 5682: 5680: 5676: 5672: 5665: 5661: 5657: 5654: 5651: 5648: 5646: 5642: 5639: 5637: 5633: 5629: 5626: 5625: 5624: 5621: 5619: 5615: 5611: 5607: 5603: 5602: 5597: 5592: 5584: 5582: 5578: 5574: 5561: 5558: 5555: 5552: 5551: 5547: 5544: 5542: 5539: 5536: 5535: 5531: 5528: 5525: 5522: 5521: 5517: 5514: 5512: 5510: 5505: 5502: 5501: 5497: 5494: 5492: 5489: 5488: 5484: 5481: 5479: 5476: 5475: 5471: 5468: 5466: 5463: 5459: 5455: 5452: 5449: 5446: 5443: 5442: 5438: 5435: 5432: 5431:60–61, 64–65, 5428: 5424: 5420: 5416: 5412: 5408: 5405: 5404: 5400: 5397: 5395: 5393: 5389: 5388: 5384: 5381: 5379: 5378: 5374: 5373: 5369: 5366: 5364: 5362: 5357: 5353: 5349: 5346: 5343: 5342: 5337: 5336: 5332: 5329: 5326: 5323: 5322: 5318: 5315: 5312: 5310: 5306: 5302: 5297: 5293: 5289: 5286: 5283: 5280: 5279: 5275: 5272: 5270: 5266: 5262: 5259: 5258: 5254: 5251: 5248: 5247:00–5F, 90–91, 5245: 5242: 5241: 5237: 5234: 5231: 5227: 5224: 5223: 5219: 5216: 5213: 5212: 5208: 5203: 5201: 5195: 5194: 5190: 5187: 5184: 5183: 5175: 5173:18–1B, 1E–1F 5172: 5168: 5160: 5157: 5155: 5151: 5147: 5146: 5142: 5139: 5137: 5135: 5131: 5127: 5123: 5119: 5115: 5111: 5108: 5106: 5105: 5101: 5100: 5096: 5093: 5091: 5090: 5086: 5082: 5078: 5075: 5072: 5071: 5066: 5062: 5056: 5053: 5050:The standard 5048: 5045: 5041: 5031: 5027: 5025: 5021: 5017: 5013: 5009: 5005: 5001: 4997: 4993: 4963: 4962: 4958: 4953: 4946: 4937: 4935: 4930: 4923: 4908: 4905: 4899: 4897: 4885: 4873: 4861: 4856: 4854: 4849: 4845: 4836: 4832: 4827: 4825: 4821: 4771: 4767: 4757: 4753: 4747: 4713: 4709: 4704: 4690: 4686: 4682: 4678: 4673: 4667: 4657: 4648: 4644: 4639: 4637: 4633: 4608: 4604: 4600: 4599:control codes 4579: 4577: 4565: 4553: 4549: 4540: 4536: 4532: 4510: 4492: 4477: 4476: 4475: 4473: 4469: 4465: 4450:assumes that 4449: 4413: 4412:noncharacters 4408: 4402: 4398: 4397:low-surrogate 4382: 4379:are known as 4355: 4351: 4345: 4341: 4335: 4333: 4331: 4329: 4327: 4318: 4314: 4310: 4304: 4297: 4292: 4290: 4288: 4286: 4284: 4275: 4271: 4265: 4257: 4251: 4243: 4239: 4233: 4231: 4222: 4218: 4212: 4208: 4204: 4200: 4197: 4194: 4191: 4190: 4186: 4183: 4180: 4177: 4168: 4164: 4160: 4153: 4147: 4144: 4141: 4138: 4132: 4128: 4125: 4122: 4118: 4115: 4112: 4106: 4102: 4098: 4094: 4090: 4086: 4083:Includes the 4082: 4079: 4076: 4073: 4071:Other, format 4070: 4064: 4060: 4057: 4054: 4051: 4048: 4042: 4034: 4020: 4017: 4014: 4011: 4008: 4002: 3988: 3985: 3982: 3979: 3976: 3970: 3966: 3962: 3958: 3954: 3951: 3948: 3945: 3942: 3936: 3928: 3925: 3922: 3919: 3916: 3914:Symbol, other 3913: 3907: 3904: 3901: 3898: 3895: 3892: 3886: 3883: 3880: 3877: 3874: 3871: 3868: 3862: 3858: 3854: 3850: 3846: 3842: 3838: 3834: 3830: 3826: 3822: 3818: 3814: 3810: 3807: 3804: 3801: 3798: 3795: 3789: 3781: 3778: 3775: 3772: 3769: 3766: 3760: 3756: 3753: 3750: 3747: 3744: 3738: 3734: 3730: 3727: 3724: 3721: 3718: 3712: 3708: 3705: 3702: 3699: 3696: 3690: 3686: 3682: 3679: 3676: 3673: 3670: 3664: 3660: 3656: 3653: 3650: 3647: 3644: 3638: 3634: 3630: 3626: 3622: 3619: 3616: 3613: 3610: 3604: 3596: 3592: 3588: 3584: 3580: 3577: 3574: 3571: 3569:Number, other 3568: 3562: 3558: 3554: 3551: 3548: 3545: 3542: 3536: 3532: 3528: 3525: 3522: 3519: 3516: 3510: 3502: 3499: 3496: 3493: 3490: 3487: 3481: 3478: 3475: 3472: 3469: 3466: 3460: 3457: 3454: 3451: 3448: 3445: 3439: 3431: 3428: 3424: 3420: 3417: 3414: 3411: 3409:Letter, other 3408: 3402: 3399: 3395: 3392: 3389: 3386: 3383: 3377: 3373: 3369: 3365: 3361: 3357: 3353: 3350: 3347: 3344: 3341: 3338: 3332: 3329: 3326: 3323: 3320: 3317: 3311: 3308: 3305: 3302: 3299: 3296: 3290: 3275: 3272: 3270: 3268: 3266: 3264: 3261: 3260: 3256: 3247: 3244: 3241: 3238: 3235: 3234: 3228: 3223: 3221: 3216: 3214: 3209: 3208: 3204: 3197: 3194: 3191: 3181: 3178: 3177: 3171: 3169: 3165: 3161: 3157: 3153: 3147: 3137: 3111: 3096: 3093: 3074:DIVISION SIGN 3062: 3061:leading zeros 3058: 3049: 3047: 3043: 3021: 3020: 3015: 3011: 3001: 2991: 2989: 2985: 2967: 2964: 2961: 2957: 2956: 2953: 2944: 2938: 2930: 2926: 2923: 2921: 2917: 2916: 2913: 2907: 2901: 2892: 2889: 2886: 2883: 2882: 2879: 2870: 2861: 2858: 2857: 2854: 2847: 2843: 2842:noncharacters 2839: 2835: 2829: 2825: 2818: 2817:Tulu-Tigalari 2814: 2810: 2806: 2802: 2798: 2794: 2791: 2776: 2773: 2771: 2769: 2765: 2760: 2757: 2752: 2737: 2735: 2731: 2726: 2723: 2718: 2714: 2711: 2696: 2691: 2687: 2682: 2679: 2674: 2670: 2655: 2651: 2647: 2643: 2639: 2636: 2621: 2618: 2616: 2612: 2607: 2604: 2599: 2595: 2591: 2587: 2583: 2579: 2575: 2571: 2568: 2553: 2550: 2548: 2543: 2539: 2534: 2531: 2527: 2513: 2498: 2496: 2492: 2487: 2484: 2479: 2475: 2471: 2467: 2463: 2459: 2456: 2441: 2436: 2432: 2427: 2424: 2419: 2415: 2411: 2410:Maya numerals 2407: 2403: 2399: 2395: 2391: 2387: 2386:Gunjala Gondi 2383: 2379: 2376: 2361: 2358: 2356: 2352: 2347: 2344: 2340: 2325: 2321: 2317: 2316:Masaram Gondi 2313: 2309: 2306: 2291: 2288: 2286: 2281: 2277: 2272: 2269: 2264: 2260: 2256: 2252: 2248: 2244: 2241: 2226: 2223: 2221: 2217: 2212: 2209: 2205: 2201: 2197: 2196:Old Hungarian 2193: 2189: 2185: 2181: 2178: 2163: 2160: 2158: 2153: 2149: 2144: 2141: 2137: 2133: 2129: 2125: 2121: 2117: 2113: 2109: 2105: 2101: 2097: 2093: 2089: 2085: 2084:Mende Kikakui 2081: 2077: 2073: 2069: 2065: 2061: 2057: 2053: 2049: 2045: 2042: 2027: 2024: 2022: 2018: 2013: 2010: 2005: 1990: 1988: 1984: 1979: 1976: 1960: 1945: 1943: 1939: 1934: 1931: 1927: 1923: 1919: 1915: 1911: 1907: 1903: 1900: 1885: 1880: 1875: 1871: 1866: 1863: 1858: 1854: 1850: 1846: 1842: 1838: 1835: 1820: 1817: 1815: 1812: 1810: 1806: 1801: 1798: 1794: 1790: 1786: 1782: 1778: 1774: 1770: 1766: 1762: 1758: 1754: 1750: 1746: 1742: 1738: 1734: 1730: 1727: 1712: 1709: 1707: 1703: 1698: 1695: 1691: 1676: 1672: 1668: 1664: 1663:Phaistos Disc 1660: 1656: 1652: 1648: 1644: 1640: 1636: 1632: 1628: 1624: 1620: 1617: 1602: 1599: 1591: 1588: 1584: 1580: 1576: 1572: 1568: 1565: 1550: 1547: 1545: 1544:0-321-48091-0 1541: 1536: 1533: 1528: 1524: 1523:Greek numbers 1520: 1516: 1512: 1511:Sylheti Nagri 1508: 1504: 1500: 1496: 1492: 1489: 1474: 1471: 1463: 1460: 1456: 1452: 1448: 1444: 1440: 1436: 1432: 1428: 1425: 1410: 1407: 1405: 1400: 1399:0-321-18578-1 1396: 1391: 1388: 1383: 1379: 1375: 1371: 1367: 1364: 1349: 1346: 1341: 1338: 1333: 1329: 1325: 1321: 1318: 1303: 1300: 1287: 1284: 1279: 1275: 1271: 1267: 1263: 1259: 1255: 1251: 1247: 1243: 1239: 1235: 1231: 1228: 1213: 1210: 1207: 1205: 1204:0-201-61633-5 1201: 1196: 1193: 1189: 1174: 1160: 1145: 1137: 1134: 1129: 1124: 1106: 1101: 1100:0-201-48345-9 1097: 1092: 1089: 1084: 1080: 1076: 1058: 1055: 1053: 1050: 1049:ISO/IEC 10646 1039: 1036: 1032: 1028: 1010: 1007: 1004: 1003:0-201-60845-6 1000: 995: 992: 988: 984: 980: 976: 972: 968: 964: 960: 956: 952: 948: 944: 940: 936: 932: 928: 924: 920: 916: 912: 908: 904: 900: 896: 891: 888: 882: 881:0-201-56788-1 878: 875: 870: 867: 858: 848: 843:(book, text) 833: 827: 825: 820: 818: 817:Tulu-Tigalari 814: 810: 806: 802: 798: 794: 789: 787: 782: 777: 774: 770: 766: 762: 757: 755: 751: 747: 737: 735: 725: 723: 718: 716: 712: 708: 704: 699: 697: 693: 689: 685: 681: 677: 672: 670: 666: 662: 658: 654: 649: 647: 639: 635: 630: 625: 615: 613: 607: 605: 600: 598: 594: 590: 586: 582: 578: 574: 570: 563: 553: 550: 548: 542: 540: 536: 531: 529: 525: 521: 517: 511: 506: 502: 500: 494: 492: 488: 483: 481: 477: 473: 469: 466:, along with 465: 461: 457: 447: 445: 441: 437: 433: 428: 426: 425:CJK encodings 422: 418: 413: 408: 406: 402: 397: 394: 391: 389: 385: 381: 377: 373: 368: 366: 362: 351: 349: 345: 341: 337: 333: 329: 324: 322: 318: 314: 310: 305: 301: 300:ISO/IEC 10646 297: 292: 290: 286: 281: 279: 274: 272: 268: 258: 254: 250: 249:text encoding 246: 245: 240: 232: 228: 224: 211: 206: 204: 199: 197: 192: 191: 188: 180: 175: 171: 166: 165: 162: 158: 155: 151: 144: 141: 139: 136: 135: 128: 125: 123: 120: 118: 115: 113: 110: 109: 103: 100: 98: 95: 93: 90: 89: 87: 83: 79: 75: 72: 70: 64: 60: 54:ISO/IEC 10646 53: 50: 47: 46: 44: 40: 36: 30: 25: 19: 14425:ISO/IEC 6429 14382:Stanford/ITS 14369: 14303:ARIB STD-B24 14153: 14084:Sega SC-3000 13985:DEC RADIX 50 13022:ISO/IEC 8859 13014:ISO/IEC 2022 12759:Adaptations 12718:-14 (Celtic) 12713:-13 (Baltic) 12703:-10 (Nordic) 12698:-9 (Turkish) 12647:ISO/IEC 8859 12202:Meetei Mayek 12153:(Chorasmian) 12056:Cypro-Minoan 11833:Pahawh Hmong 11648:Gurung Khema 11397:ISO/IEC 8859 11239:UTF-32/UCS-4 11234:UTF-16/UCS-2 11041:Variant form 10917: 10854: 10778: 10758: 10745: 10732: 10715: 10702: 10689: 10676: 10654: 10633:. Retrieved 10623: 10612:. Retrieved 10582: 10565: 10556: 10531: 10522: 10504: 10487: 10476:. Retrieved 10464: 10453:. Retrieved 10441: 10430:. Retrieved 10418: 10410: 10405: 10400: 10389:. Retrieved 10385:the original 10371: 10360:. Retrieved 10356: 10347: 10336:. Retrieved 10323: 10312:. Retrieved 10308: 10299: 10290: 10280: 10269:. Retrieved 10265: 10255: 10210: 10200: 10191: 10182: 10169: 10160: 10147: 10138: 10129: 10119: 10110: 10101: 10093:the original 10078: 10067:. Retrieved 10063:the original 10056: 10030: 10002: 9991:. Retrieved 9981: 9972: 9968: 9955: 9944:. Retrieved 9935: 9924:. Retrieved 9918:Wood, Alan. 9913: 9888:. Retrieved 9866: 9859: 9834:. Retrieved 9812: 9805: 9780:. Retrieved 9758: 9751: 9726:. Retrieved 9704: 9697: 9672:. Retrieved 9650: 9643: 9618:. Retrieved 9596: 9589: 9574: 9559: 9548:. Retrieved 9536: 9518: 9493:. Retrieved 9471: 9464: 9439:. Retrieved 9417: 9410: 9399:. Retrieved 9395: 9386: 9375:. Retrieved 9371: 9362: 9351:. Retrieved 9347: 9334: 9315: 9310: 9299:. Retrieved 9295: 9286: 9275:. Retrieved 9265: 9254:. Retrieved 9246:Kuhn, Markus 9240: 9225: 9217:the original 9207: 9196:. Retrieved 9187: 9176:. Retrieved 9167: 9156:. Retrieved 9147: 9136:. Retrieved 9133:"Properties" 9127: 9116:. Retrieved 9095: 9086: 9077: 9068: 9058: 9051: 9041: 9034: 9023:. Retrieved 9014: 9003:. Retrieved 8999: 8990: 8979:. Retrieved 8977:. 2024-09-10 8974: 8965: 8945: 8939: 8916: 8907: 8887: 8881: 8850: 8841: 8830:. Retrieved 8817: 8803:. Retrieved 8799: 8776: 8767: 8756:. Retrieved 8752: 8743: 8723: 8717: 8697: 8691: 8671: 8665: 8654:. Retrieved 8650: 8640: 8626:. Retrieved 8603: 8594: 8574: 8568: 8548: 8542: 8528:. Retrieved 8505: 8496: 8482:. Retrieved 8459: 8450: 8436:. Retrieved 8413: 8404: 8390:. Retrieved 8367: 8358: 8344:. Retrieved 8321: 8312: 8301:. Retrieved 8297: 8288: 8274:. Retrieved 8251: 8242: 8228:. Retrieved 8205: 8196: 8182:. Retrieved 8159: 8150: 8139:. Retrieved 8130: 8110: 8104: 8093:. Retrieved 8084: 8070:. Retrieved 8047: 8038: 8024:. Retrieved 8001: 7992: 7978:. Retrieved 7965: 7956: 7942:. Retrieved 7929: 7920: 7906:. Retrieved 7893: 7884: 7870:. Retrieved 7857: 7831:. Retrieved 7818: 7809: 7795:. Retrieved 7782: 7773: 7759:. Retrieved 7746: 7737: 7723:. Retrieved 7709: 7700: 7689:. Retrieved 7680: 7669:. Retrieved 7665: 7656: 7645:. Retrieved 7641: 7632: 7621:. Retrieved 7611: 7598: 7589: 7576: 7567: 7556:. Retrieved 7543: 7532:. Retrieved 7528: 7519: 7508:. Retrieved 7499: 7488:. Retrieved 7465:. Retrieved 7455: 7444:. Retrieved 7440: 7431: 7420:. Retrieved 7418:. 2006-08-31 7415: 7406: 7380: 7374:. Retrieved 7356:"Unicode 88" 7324: 7315: 7304:. Retrieved 7302:. 2019-08-22 7294: 7283:. Retrieved 7273: 7259:. Retrieved 7230: 7219:. Retrieved 7214: 7204: 7193:. Retrieved 7191:. 2002-03-27 7188: 7179: 7153: 7147: 7119: 7106: 7096: 7091: 7083: 7078: 6978: 6968: 6875: 6874: 6839:romanization 6831:acute accent 6819: 6780: 6749: 6705: 6656: 6636: 6612: 6600: 6584: 6579:acute accent 6544:barred D (đ) 6537: 6496: 6487: 6455: 6447: 6407: 6391:(encoded by 6386: 6381: 6371: 6322: 6311: 6287: 6264: 6259: 6251: 6249: 6246:Unicode font 6219: 6211:&#47568; 6207:&#33865; 6203:&#21494; 6199:&#12354; 6172: 6162:most of the 6146: 6134:Web browsers 6129: 6123: 6110:support it. 6093: 6090: 6067: 6048: 6043: 6039: 6035: 6032:Basic method 6031: 6026: 6023: 5981: 5955: 5949:through the 5887: 5863: 5843: 5798: 5768: 5740: 5722: 5716: 5683: 5670: 5668: 5659: 5622: 5609: 5605: 5599: 5595: 5593: 5590: 5566: 5540: 5532:(00–FF ...) 5508: 5506: 5490: 5477: 5464: 5447: 5430: 5426: 5422: 5418: 5414: 5410: 5391: 5390: 5383:Number Forms 5376: 5375: 5360: 5359:05, 13, 16, 5358: 5340: 5338: 5324: 5308: 5304: 5300: 5298: 5268: 5264: 5246: 5229: 5210: 5206: 5205: 5199: 5197: 5153: 5149: 5133: 5129: 5125: 5121: 5120: 5103: 5102: 5088: 5087: 5064: 5060: 5049: 5037: 5028: 5026:(by Apple). 5003: 4989: 4931: 4903: 4901: 4895: 4857: 4846: 4828: 4823: 4790:followed by 4763: 4751: 4711: 4707: 4705: 4685:acute accent 4671: 4669: 4646: 4642: 4640: 4636:Windows-1252 4607:ISO/IEC 6429 4598: 4580: 4551: 4550: 4534: 4530: 4529: 4526:characters). 4508:characters), 4490:characters), 4471: 4467: 4466: 4411: 4409: 4400: 4396: 4380: 4366: 4353: 4349: 4343: 4312: 4303: 4273: 4264: 4250: 4241: 4220: 4211: 4178:Noncharacter 4159:Planes 15–16 4101:language tag 3933:, Separator 3796:Symbol, math 3531:Numeric Type 3252:(as of 16.0) 3187: 3174: 3172: 3159: 3151: 3149: 3097: 3050: 3041: 3017: 3013: 3009: 3008: 2983: 2981: 2952: 2937: 2912: 2900: 2878: 2869: 2853: 2828: 2797:Gurung Khema 2642:Cypro-Minoan 2574:Dhives Akuru 2547: 2420:, 145 emoji 2418:star ratings 2338:BITCOIN SIGN 2285: 2157: 2108:Pahawh Hmong 1922:Sora Sompeng 1879: 1868:January 2012 1849:playing card 1814: 1803:October 2010 1769:Meetei Mayek 1700:October 2009 1671:Domino tiles 1404: 1274:Yi Syllables 1052: 872:October 1991 823: 821: 797:Gurung Khema 790: 780: 778: 772: 760: 758: 749: 743: 731: 719: 700: 673: 650: 643: 640:application. 612:multilingual 608: 601: 565: 551: 543: 538: 532: 513: 508: 504: 496: 493:characters: 486: 484: 453: 429: 409: 392: 369: 357: 331: 325: 303: 294:The Unicode 293: 282: 275: 243: 242: 238: 237: 222: 157:ISO/IEC 8859 66: 65:168 scripts 33:Logo of the 18: 14144:ZX Spectrum 14099:Sinclair QL 13935:Amstrad CPC 13854:8-bit Greek 13781:terminals ( 13494:Iran System 13046:("scripts") 12693:-8 (Hebrew) 12683:-6 (Arabic) 12580:ISO/IEC 646 12388:SignWriting 12257:Old Sogdian 12227:Nandinagari 12151:Khwarezmian 12061:Dives Akuru 11987:Ancient and 11973:Warang Citi 11838:Pau Cin Hau 11793:New Tai Lue 11788:Nag Mundari 11763:Medefaidrin 11472:Common and 11281:Equivalence 11259:code points 11257:On pairs of 11171:Equivalence 11046:Word joiner 11036:Soft hyphen 10952:Code points 10849:‹ The 9922:. Alan Wood 9340:Davis, Mark 7397:Lee Collins 6195:&#3671; 6191:&#1605; 6187:&#1511; 6183:&#1049; 6151:(including 6108:Outlook.com 6100:Yahoo! Mail 5988:DirectWrite 5606:code points 5470:Box Drawing 5339:A3–A4, A7, 5185:59, 7C, 92 5152:B7, DE-EF, 5095:Basic Latin 4835:Hangul Jamo 4770:precomposed 4468:Private-use 4142:Private-use 4103:characters 4085:soft hyphen 3687:characters 3661:characters 3635:libraries. 3587:superscript 3057:hexadecimal 3019:code points 2474:Miao script 2462:Nandinagari 2402:Medefaidrin 2265:, 72 emoji 2200:SignWriting 2132:Warang Citi 2116:Pau Cin Hau 1507:Old Persian 1503:New Tai Lue 1081:syllables, 863:Characters 841:Publication 665:syllabaries 480:Dave Opstad 472:Lee Collins 401:web browser 241:, formally 153:Preceded by 62:Language(s) 14522:Categories 14430:JIS X 0211 14338:ISO-IR-169 14191:UTF-EBCDIC 13757:code pages 13484:CSX+ Indic 13088:Devanagari 13043:Code pages 12964:LST 1590-4 12934:JIS X 0213 12929:JIS X 0212 12924:JIS X 0208 12919:JIS X 0201 12884:GOST 10859 12806:CCCII/EACC 12708:-11 (Thai) 12688:-7 (Greek) 12623:background 12546:Wabun/Kana 12282:Phoenician 12267:Old Uyghur 12262:Old Turkic 12247:Old Permic 12242:Old Italic 12192:Manichaean 12086:Glagolitic 11863:Saurashtra 11608:Devanagari 11487:Diacritics 11244:UTF-EBCDIC 11147:Algorithms 11140:Processing 11077:Characters 11001:Characters 10635:2023-09-12 10614:2022-04-29 10478:2010-03-20 10455:2010-03-20 10432:2010-03-20 10391:2019-05-20 10362:2021-11-11 10338:2021-11-02 10314:2022-06-27 10271:2023-04-15 10220:2106.09898 10069:2023-03-20 9993:2019-10-13 9946:2013-11-01 9926:2012-06-04 9890:2022-08-17 9836:2022-08-17 9782:2022-08-17 9728:2022-08-17 9674:2022-08-17 9620:2022-08-17 9550:2012-06-04 9495:2022-08-17 9441:2022-08-17 9401:2020-11-01 9377:2023-01-16 9353:2021-02-19 9301:2016-12-12 9277:2022-08-21 9256:2023-03-20 9198:2010-03-16 9178:2022-09-16 9158:2023-09-12 9138:2024-09-13 9118:2010-03-16 9025:2010-03-16 9005:2023-09-13 8981:2024-09-13 8832:2020-03-11 8805:2020-03-11 8758:2019-05-07 8656:2016-09-04 8628:2016-06-21 8530:2015-06-17 8484:2014-06-15 8438:2013-09-30 8392:2012-09-26 8346:2012-01-31 8303:2022-09-21 8276:2010-10-11 8230:2010-03-17 8184:2010-03-17 8141:2010-03-17 8095:2010-03-16 8072:2010-03-16 8026:2023-10-02 7980:2023-10-02 7944:2023-10-02 7908:2023-10-02 7872:2010-03-16 7833:2010-03-16 7797:2010-03-16 7761:2010-03-16 7725:2010-03-16 7691:2016-06-21 7671:2024-09-13 7647:2012-05-30 7623:2012-06-04 7558:2018-07-30 7534:2022-09-16 7510:2020-04-02 7490:2024-02-12 7467:2013-01-18 7446:2023-03-20 7422:2010-03-15 7387:Xerox PARC 7376:2016-10-25 7306:2024-09-11 7285:2024-09-10 7261:2024-09-11 7221:2022-06-23 7195:2022-06-23 7139:References 7112:code point 6849:that uses 6847:Charis SIL 6814:See also: 6759:Devanagari 6734:ISO 8859-1 6623:BiDi marks 6603:homoglyphs 6533:ISO 8859-9 6512:(ı) and a 6451:Sinosphere 6389:JIS X 0208 6318:characters 6281:(WOFF and 6256:allographs 6179:&#916; 6034:, where a 5947:Windows 9x 5899:Windows NT 5838:See also: 5764:high-level 5730:. However 5728:code pages 5656:UTF-EBCDIC 5632:code point 5610:code units 5232:D7, DA–E1 4996:Devanāgarī 4965:ligature ( 4929:Devanāgarī 4870:, and the 4689:Lithuanian 4155:, 131,068 4077:Character 4055:Character 4015:Character 3983:Character 3949:Character 3920:Character 3899:Character 3875:Character 3802:Character 3773:Character 3751:Character 3725:Character 3703:Character 3677:Character 3651:Character 3625:underscore 3617:Character 3575:Character 3549:Character 3523:Character 3494:Character 3473:Character 3452:Character 3415:Character 3390:Character 3345:Character 3324:Character 3303:Character 3280:, Letter; 3242:Basic type 3012:defines a 2998:See also: 2960:hentaigana 2929:ruble sign 2766:ISBN  2732:ISBN  2688:ISBN  2650:Old Uyghur 2613:ISBN  2570:Chorasmian 2540:ISBN  2536:March 2020 2493:ISBN  2433:ISBN  2429:March 2019 2353:ISBN  2324:hentaigana 2278:ISBN  2218:ISBN  2150:ISBN  2104:Old Permic 2080:Manichaean 2019:ISBN  1985:ISBN  1940:ISBN  1872:ISBN  1807:ISBN  1777:Old Turkic 1704:ISBN  1651:Saurashtra 1593:April 2008 1583:Phoenician 1542:ISBN  1495:Glagolitic 1465:March 2005 1397:ISBN  1393:April 2003 1366:Philippine 1343:March 2002 1328:Old Italic 1289:March 2001 1202:ISBN  1130:allocated 1098:ISBN  1001:ISBN  919:Devanagari 746:repertoire 696:Rongorongo 487:Unicode 88 476:Mark Davis 470:employees 464:Joe Becker 396:code point 267:characters 148:(obsolete) 132:(uncommon) 127:UTF-EBCDIC 14483:MICR code 14318:IEC-P27-1 14296:ISO-IR-68 14201:DIN 91379 14079:SAM Coupé 14014:GSM 03.38 14004:Galaksija 13499:Kamenický 13479:CSX Indic 13188:Ukrainian 12974:Shift JIS 12954:KS X 1002 12949:KS X 1001 12874:DIN 66003 12869:CNS 11643 12637:Transcode 12615:ITU T.101 12541:Non-Latin 12277:ʼPhags-pa 12272:Palmyrene 12222:Nabataean 12146:Khudawadi 12131:Kharosthi 12046:Cuneiform 12021:Bhaiksuki 12016:Bassa Vah 11883:Sundanese 11858:Samaritan 11773:Mongolian 11748:Malayalam 11713:Kirat Rai 11423:Anomalies 11407:ISO 15924 11402:DIN 91379 11303:Z-variant 11286:Homoglyph 11159:Collation 10247:235485405 9975:(3): 292. 9524:Pike, Rob 6930:𝒫 6865:Anomalies 6835:underdots 6829:(◌̄) and 6701:WAVE DASH 6663:Shift-JIS 6658:Injective 6558:attacks. 6556:homoglyph 6437:based on 6419:Hong Kong 6411:TRON Code 6393:Shift JIS 6010:desktop. 5996:Core Text 5984:Uniscribe 5965:Unix-like 5394:94–95, A8 5079:Range(s) 5067:and MES-2 5052:DIN 91379 5042:supports 4912:Ligatures 4683:, and an 4681:dot above 4621:LINE FEED 4150:6,400 in 4116:Surrogate 3786:, Symbol 3591:subscript 3507:, Number 3423:ideograph 3352:Ligatures 3201:(Unicode 3084:𓉔 3014:codespace 2943:Lari sign 2904:Plus the 2801:Kirat Rai 2349:June 2018 2274:June 2017 2247:Bhaiksuki 2214:June 2016 2146:June 2015 2112:Palmyrene 2096:Nabataean 2068:Khudawadi 2044:Bassa Vah 2015:June 2014 1857:emoticons 1781:Samaritan 1655:Sundanese 1579:ʼPhags-pa 1571:cuneiform 1538:July 2006 1499:Kharosthi 1368:scripts ( 1280:patterns 1242:Mongolian 1172:EURO SIGN 1094:July 1996 1041:June 1993 1005:(vol. 2) 997:June 1992 967:Malayalam 883:(vol. 1) 801:Kirat Rai 765:collation 657:alphabets 589:Microsoft 524:Microsoft 440:Ken Lunde 372:graphemes 317:collation 289:web pages 42:Alias(es) 14488:Mojibake 14343:ISO 2033 14308:Fieldata 14286:ASMO 449 14196:GB 18030 14156: / 14104:Teletext 14094:Sharp MZ 14024:HP FOCAL 14019:HP Roman 13950:Atari ST 13940:Apple II 13474:CS Indic 13168:Romanian 13143:Keyboard 13123:Gurmukhi 13118:Gujarati 13108:Georgian 13083:Cyrillic 13078:Croatian 13053:Armenian 12959:LST 1564 12944:KPS 9566 12904:GB 18030 12899:GB 12052 12894:GB 12345 12879:ELOT 927 12813:ISO 5426 12773:Estonian 12610:ITU T.61 12600:Teletext 12596:Videotex 12570:Fieldata 12556:Cyrillic 12409:Currency 12383:Duployan 12357:Vithkuqi 12352:Ugaritic 12207:Meroitic 12177:Mahajani 12162:Linear B 12157:Linear A 11948:Tifinagh 11913:Tai Viet 11908:Tai Tham 11898:Tagbanwa 11813:Ol Chiki 11703:Kayah Li 11698:Katakana 11683:Javanese 11678:Hiragana 11668:Hanunuoo 11643:Gurmukhi 11633:Gujarati 11623:Georgian 11598:Cyrillic 11588:Cherokee 11553:Bopomofo 11533:Balinese 11528:Armenian 11392:GB 18030 11209:Punycode 11097:Numerals 11029: / 10942:Versions 10851:template 10536:Archived 9899:citation 9845:citation 9791:citation 9737:citation 9683:citation 9629:citation 9504:citation 9450:citation 9248:(1998). 7603:Archived 7581:Archived 7367:Archived 7329:Archived 7002:See also 6993:ꡀ 6982:Phags_Pa 6962:︘ 6945:ꀕ 6915:℘ 6855:OpenType 6851:Graphite 6753:such as 6730:YEN SIGN 6698:〜 6681:~ 6597:Security 6552:ISO 6937 6546:and the 6482:Cyrillic 6480:Various 6458:OpenType 6425:and the 6308:Newlines 6275:OpenType 6271:TrueType 6096:ISO-2022 6006:and the 5834:Adoption 5770:Punycode 5719:EF BB BF 5560:Specials 5548:(00–4F) 5526:(01–02) 5518:(00–FF) 5498:(A0–FF) 5485:(80–9F) 5472:(00–7F) 5456:(00–FF) 5439:(00–FF) 5401:(90–FF) 5385:(50–8F) 5370:(00–4F) 5350:(A0–CF) 5333:(70–9F) 5319:(00–6F) 5290:(00–FF) 5276:(00–FF) 5255:(00–FF) 5253:Cyrillic 5238:(70–FF) 5220:(B0–FF) 5191:(50–AF) 5143:(00–7F) 5112:(80–FF) 5097:(00–7F) 5016:Graphite 5012:OpenType 4904:encoding 4853:radicals 4799:◌́ 4739:︘ 4722:ꀕ 4647:Reserved 4632:mojibake 4517:U+10FFFD 4513:U+100000 4444:U+10FFFF 4440:U+10FFFE 4192:Reserved 4039:, Other 3731:Opening 3683:Opening 3356:digraphs 3257:Remarks 3134:U+10FFFF 3132:through 3124:through 3116:through 3038:U+10FFFF 2673:Znamenny 2668:SOM SIGN 2665:⃀ 2646:Vithkuqi 2522:㋿ 2489:May 2019 2335:₿ 2136:dingbats 2076:Mahajani 2072:Linear A 2052:Duployan 1968:₺ 1789:Tai Viet 1785:Tai Tham 1757:Javanese 1686:ẞ 1643:Ol Chiki 1627:Kayah Li 1567:Balinese 1515:Tifinagh 1491:Buginese 1451:Ugaritic 1435:Linear B 1382:Tagbanwa 1230:Cherokee 1184: 1169:€ 1139:May 1998 1085:removed 1051:-1:1993 955:Katakana 947:Hiragana 935:Gurmukhi 931:Gujarati 923:Georgian 915:Cyrillic 911:Bopomofo 903:Armenian 860:Scripts 855:Details 849:edition 835:Version 740:Versions 692:Numidian 661:abugidas 380:typeface 307:include 269:and 168 77:Standard 14528:Unicode 14377:SEASCII 14371:Mojikyō 14358:KOI8-RU 14281:ABICOMP 14154:Unicode 14064:PETSCII 14054:NEC APC 13990:DEC MCS 13945:ATASCII 13842:Swedish 13827:Finnish 13812:Spanish 13504:Mazovia 13469:ABICOMP 13178:Turkish 13133:Iceland 13041:Mac OS 12984:TIS-620 12889:GB 2312 12864:BraSCII 12854:ArmSCII 12592:Teletex 12551:Chinese 12317:Soyombo 12307:Sogdian 12302:Siddham 12297:Sharada 12217:Multani 12197:Marchen 12187:Mandaic 12182:Makasar 12096:Grantha 12081:Elymaic 12076:Elbasan 12051:Cypriot 12011:Avestan 11953:Tirhuta 11943:Tibetan 11888:Sunuwar 11873:Sinhala 11868:Shavian 11848:Ranjana 11828:Osmanya 11818:Ol Onal 11743:Lontara 11693:Kannada 11603:Deseret 11568:Burmese 11558:Braille 11548:Bengali 11502:Numbers 11463:Scripts 11112:Symbols 11102:Scripts 10925:Unicode 10918:Unicode 10866:Unicode 10853:below ( 10557:Unicode 10532:Unicode 10309:Unicode 10192:Unicode 10037:. 2024. 9396:W3Techs 9372:W3Techs 9062:. 2024. 9045:. 2024. 8975:Unicode 7666:Unicode 7577:Unicode 7529:Unicode 7441:Unicode 7416:Unicode 7325:Unicode 7125:insular 6969:bracket 6925:U+1D4AB 6898:͏ 6885:BRACKET 6881:BRAKCET 6825:with a 6314:newline 6215:&#x 6080:or the 5868:, e.g. 5800:GB18030 5776:-based 5675:FreeBSD 5427:2B, 48, 5421:27–28, 5417:08–09, 5361:22, 26, 5307:1A–1B, 5299:13–14, 5214:DF, EE 5196:BB–BD, 5180:00–4F) 5161:(80–FF 5132:4E–4F, 5128:2C–2D, 5124:14–15, 4822:within 4531:Graphic 4499:U+FFFFD 4495:U+F0000 4438:, ..., 4436:U+1FFFF 4432:U+1FFFE 4198:819,467 4052:Control 4031:(PSEP) 3999:(LSEP) 3946:Graphic 3917:Graphic 3896:Graphic 3872:Graphic 3811:(e.g., 3799:Graphic 3770:Graphic 3748:Graphic 3722:Graphic 3700:Graphic 3685:bracket 3674:Graphic 3648:Graphic 3614:Graphic 3572:Graphic 3546:Graphic 3520:Graphic 3491:Graphic 3470:Graphic 3449:Graphic 3436:, Mark 3418:136,477 3412:Graphic 3387:Graphic 3342:Graphic 3321:Graphic 3300:Graphic 3262:  3130:U+10000 3079:U+13254 2809:Sunuwar 2805:Ol Onal 2717:Mundari 2598:Punjabi 2458:Elymaic 2414:xiangqi 2398:Makasar 2312:Soyombo 2251:Marchen 2192:Multani 2128:Tirhuta 2124:Siddham 2060:Grantha 2056:Elbasan 1918:Sharada 1845:Mandaic 1729:Avestan 1669:tiles, 1667:Mahjong 1443:Shavian 1439:Osmanya 1378:Tagalog 1374:Hanunoo 1320:Deseret 1278:Braille 1258:Sinhala 1246:Burmese 1083:Tibetan 987:Tibetan 951:Kannada 907:Bengali 809:Sunuwar 805:Ol Onal 707:Klingon 703:Tengwar 684:Jurchen 653:scripts 593:Netflix 450:History 271:scripts 247:, is a 239:Unicode 102:GB18030 22:Unicode 14387:Symbol 14363:KOI8-U 14353:KOI8-R 14221:TACE16 14211:CESU-8 14206:BOCU-1 14186:UTF-32 14181:UTF-16 14124:WISCII 14114:TRS-80 14034:SQUOZE 14029:HP RPL 13869:Hebrew 13864:SI 960 13832:French 13755:EBCDIC 13645:CER-GS 13128:Hebrew 13103:Gaelic 13068:Celtic 13058:Arabic 13004:YUSCII 12994:VISCII 12979:SI 960 12969:PASCII 12818:5426-2 12796:MARC-8 12531:Needle 12464:  12453:  12362:Yezidi 12342:Todhri 12337:Tangut 12172:Lydian 12167:Lycian 12141:Khojki 12121:Kaithi 12101:Hatran 12091:Gothic 12041:Coptic 12031:Carian 12026:Brāhmī 11968:Wancho 11933:Thaana 11928:Telugu 11923:Tangsa 11903:Tai Le 11893:Syriac 11853:Rejang 11728:Lepcha 11673:Hebrew 11653:Hangul 11578:Chakma 11523:Arabic 11497:Spaces 11204:CESU-8 11199:BOCU-1 11107:Spaces 10870:Curlie 10856:Curlie 10794:  10765:  10752:  10739:  10726:  10718:, The 10709:  10696:  10683:  10665:  10657:, The 10293:: 4–1. 10245:  10235:  9322:  8953:  8924:  8895:  8858:  8784:  8731:  8705:  8679:  8611:  8582:  8556:  8513:  8467:  8421:  8375:  8329:  8259:  8213:  8167:  8118:  8055:  8009:  7721:. 2004 7161:  6990: 6988:U+A840 6959: 6957:U+FE18 6942: 6940:U+A015 6927: 6912: 6910:U+2118 6895: 6893:U+034F 6827:macron 6727:¥ 6724: 6722:U+00A5 6714:\ 6711: 6709:U+005C 6695: 6693:U+301C 6678: 6676:U+FF5E 6667:EUC-JP 6652:Hangul 6591:tittle 6542:, the 6443:Unihan 6423:Taiwan 6352:Issues 6338: 6336:U+2029 6328: 6326:U+2028 6252:per se 6209:, and 6106:, and 6078:Base64 5961:Plan 9 5941:, and 5925:, and 5895:UTF-16 5876:  5854:UTF-16 5828:UTF-18 5812:BOCU-1 5760:Python 5743:32-bit 5734:  5712:U+FEFF 5708:U+FEFF 5704:U+FFFE 5697: 5695:U+FEFF 5664:EBCDIC 5650:UTF-32 5641:UTF-16 5569:U+FFFD 5450:29–2A 5399:Arrows 5392:90–93, 5309:1C–1D, 5305:18–19, 5265:80–85, 5207:D8–DB, 5134:50–7E, 5130:2E–4D, 5126:16–2B, 5122:00–13, 5022:), or 4992:Arabic 4952:Arabic 4934:ddhrya 4894:, but 4892:U+2FFB 4888:U+2FF0 4882:. The 4880:U+2FDF 4876:U+2F00 4868:U+2EFF 4864:U+2E80 4831:Hangul 4813:é 4810: 4808:U+00E9 4795: 4793:U+0301 4785:e 4782: 4780:U+0065 4736: 4734:U+FE18 4731:, and 4719: 4717:U+A015 4701:U+0301 4697:U+0307 4693:U+012F 4677:ogonek 4625:U+000D 4623:, and 4618:U+008A 4611:U+0089 4595:U+009F 4591:U+007F 4587:U+001F 4583:U+0000 4571: 4569:U+200D 4559: 4557:U+200C 4552:Format 4484:U+F8FF 4480:U+E000 4452:U+FFFE 4428:U+FFFF 4424:U+FFFE 4420:U+FDEF 4416:U+FDD0 4405:U+FFFF 4389:U+DFFF 4385:U+DC00 4377:U+DBFF 4373:U+D800 4121:UTF-16 4099:, and 4074:Format 4026: 4024:U+2029 4012:Format 3994: 3992:U+2028 3980:Format 3659:hyphen 3581:E.g., 3370:, and 3176:blocks 3164:UTF-16 3152:planes 3126:U+DFFF 3122:U+D800 3118:U+FFFF 3114:U+0000 3110:UTF-16 3081: 3071:÷ 3068: 3066:U+00F7 3034:U+0000 2844:, and 2813:Todhri 2662: 2660:U+20C0 2654:Tangsa 2594:Hindko 2582:Yezidi 2519: 2517:U+32FF 2470:Wancho 2332: 2330:U+20BF 2263:Tangut 2188:Hatran 2134:, and 2064:Khojki 1965: 1963:U+20BA 1924:, and 1902:Chakma 1841:Brahmi 1761:Kaithi 1683: 1681:U+1E9E 1657:, and 1647:Rejang 1639:Lydian 1635:Lycian 1631:Lepcha 1619:Carian 1519:Coptic 1513:, and 1449:, and 1447:Tai Le 1380:, and 1324:Gothic 1272:, and 1266:Thaana 1262:Syriac 1181: 1179:U+FFFC 1166: 1164:U+20AC 1079:Hangul 993:1.0.1 985:, and 979:Telugu 943:Hebrew 939:Hangul 899:Arabic 879:  868:1.0.0 852:Total 815:, and 813:Todhri 595:, and 577:Google 491:16-bit 442:, and 384:markup 376:glyphs 344:UTF-32 342:, and 340:UTF-16 319:, and 112:UTF-32 97:UTF-16 14458:CCSID 14331:8-bit 14326:7-bit 14322:INIS 14176:UTF-8 14171:UTF-7 14166:UTF-1 14044:LMBCS 13980:CP/M+ 13822:Dutch 13807:Swiss 13489:CWI-2 13193:VT100 13163:Roman 13158:Ogham 13138:Inuit 13113:Greek 12999:VSCII 12989:TSCII 12939:KOI-7 12914:ISCII 12909:HKSCS 12801:ANSEL 12763:Welsh 12587:BCDIC 12575:ASCII 12536:Morse 12436:Emoji 12332:Takri 12292:Runic 12232:Ogham 12066:Dogra 11918:Tamil 11823:Osage 11798:Nüshu 11733:Limbu 11723:Latin 11708:Khmer 11688:Kanji 11663:Hanja 11628:Greek 11618:Geʽez 11613:Garay 11563:Buhid 11543:Batak 11538:Bamum 11518:Adlam 11366:Input 11344:Fonts 11339:Email 11327:Usage 11229:UTF-8 11224:UTF-7 11219:UTF-1 11070:Lists 10987:Plane 10960:Block 10591:(PDF) 10574:(PDF) 10514:(PDF) 10496:(PDF) 10473:(PDF) 10450:(PDF) 10427:(PDF) 10333:(PDF) 10243:S2CID 10215:arXiv 9965:(PDF) 9545:(PDF) 8826:(PDF) 7370:(PDF) 7359:(PDF) 7070:Notes 6763:ISCII 6755:Tamil 6690:) or 6577:with 6435:Apple 6415:CCCII 6402:kanji 6293:fonts 6283:WOFF2 6267:fonts 6240:Fonts 6175:bytes 6153:XHTML 6104:Gmail 6086:ASCII 6074:UTF-8 6058:Email 6008:GNOME 6000:Pango 5992:ATSUI 5957:UTF-8 5939:macOS 5911:Vista 5891:UCS-2 5858:ASCII 5846:UTF-8 5824:UTF-9 5794:UTF-6 5790:UTF-5 5774:ASCII 5756:Seed7 5671:bytes 5636:ASCII 5628:UTF-8 5618:UCS-2 5614:UTF-1 5541:01–02 5377:5B–5E 5327:, 82 5269:F2–F3 5236:Greek 5154:FA–FF 5104:A0–FF 5089:20–7E 5076:Cells 5065:MES-1 5061:WGL-4 5044:WGL-4 4712:alias 4539:glyph 4021:Only 3989:Only 3963:, or 3923:7,376 3855:, or 3533:= De 3455:2,020 3327:2,258 3306:1,858 3248:Count 3236:Value 3168:UTF-8 3046:UTF-8 2793:Garay 2758:16.0 2724:15.1 2680:15.0 2605:14.0 2590:Wolof 2586:Hausa 2532:13.0 2485:12.1 2425:12.0 2378:Dogra 2345:11.0 2320:Nüshu 2270:10.0 2259:Osage 2243:Adlam 1926:Takri 1837:Batak 1733:Bamum 1431:Limbu 1370:Buhid 1254:runes 1250:Ogham 1238:Khmer 1234:Geʽez 1118:−6656 975:Tamil 963:Latin 838:Date 793:Garay 669:music 573:Apple 569:Adobe 499:ASCII 468:Apple 456:Xerox 348:ASCII 336:UTF-8 278:emoji 143:UTF-1 138:UTF-7 92:UTF-8 51:(UCS) 14392:TRON 14245:Cork 14216:SCSU 14139:ZX81 14134:ZX80 14129:XCCS 14059:NeXT 14039:LICS 13994:NRCS 13955:BICS 13925:1058 13920:1057 13915:1056 13910:1055 13905:1054 13900:1053 13895:1052 13769:DKOI 13725:1270 13720:1258 13715:1257 13710:1256 13705:1255 13700:1254 13695:1253 13690:1252 13685:1251 13680:1250 13670:1169 13627:1133 13622:1124 13617:1046 13612:1019 13607:1018 13602:1017 13597:1016 13592:1015 13587:1014 13582:1013 13577:1012 13572:1010 13567:1009 13562:1008 13557:1006 13464:3846 13459:1127 13454:1118 13449:1117 13444:1116 13439:1115 13434:1098 13429:1044 13424:1043 13419:1042 13414:1040 13409:1034 13173:Sámi 12859:Big5 12838:6862 12833:6438 12828:5428 12823:5427 12753:Sámi 12628:sets 12594:and 12212:Modi 12126:Kawi 11996:Ahom 11958:Toto 11938:Thai 11808:Odia 11783:N'Ko 11583:Cham 11349:HTML 11296:list 11214:SCSU 10965:List 10792:ISBN 10763:ISBN 10750:ISBN 10737:ISBN 10724:ISBN 10707:ISBN 10694:ISBN 10681:ISBN 10663:ISBN 10233:ISBN 10011:1999 9905:link 9885:5721 9851:link 9831:5255 9797:link 9777:6532 9743:link 9723:6531 9689:link 9669:6530 9635:link 9615:4952 9510:link 9490:2277 9456:link 9436:2640 9320:ISBN 8951:ISBN 8922:ISBN 8893:ISBN 8856:ISBN 8782:ISBN 8729:ISBN 8703:ISBN 8677:ISBN 8609:ISBN 8580:ISBN 8554:ISBN 8511:ISBN 8465:ISBN 8419:ISBN 8373:ISBN 8327:ISBN 8257:ISBN 8211:ISBN 8165:ISBN 8116:ISBN 8053:ISBN 8007:ISBN 7159:ISBN 6883:for 6792:แสดง 6785:the 6757:and 6686:(in 6646:and 6501:and 6431:Big5 6372:The 6333:and 6273:and 6230:HTTP 6226:URLs 6222:URIs 6138:font 6124:All 6069:MIME 6004:GTK+ 6002:for 5994:and 5986:and 5973:HTML 5935:.NET 5933:and 5931:Java 5903:2000 5878:2277 5826:and 5816:SCSU 5814:and 5792:and 5782:IDNA 5736:3629 5429:59, 5425:2A, 5413:03, 5409:00, 5303:17, 5267:9B, 5209:DC, 5204:D6, 5198:C6, 5148:8F, 5018:(by 5010:for 4994:and 4961:alif 4950:The 4927:The 4858:The 4679:, a 4589:and 4566:and 4488:6400 4393:1024 4369:1024 4367:The 4354:hhhh 4195:Not 4181:Not 4091:and 4089:ZWNJ 3589:and 2774:168 2715:and 2713:Kawi 2694:161 2638:Toto 2619:159 2596:and 2551:154 2478:Pali 2439:150 2416:and 2359:146 2289:139 2255:Newa 2224:135 2210:9.0 2180:Ahom 2161:129 2142:8.0 2088:Modi 2025:123 2011:7.0 1977:6.3 1932:6.2 1914:Miao 1883:100 1864:6.1 1799:6.0 1787:and 1765:Lisu 1696:5.2 1623:Cham 1589:5.1 1575:N'Ko 1534:5.0 1525:and 1461:4.1 1389:4.0 1339:3.2 1326:and 1285:3.1 1194:3.0 1135:2.1 1090:2.0 1037:1.1 983:Thai 971:Odia 893:7129 877:ISBN 694:and 686:and 663:and 585:Meta 533:The 528:NeXT 526:and 474:and 122:SCSU 117:BOCU 69:list 14348:KOI 14265:OT1 14260:OMS 14255:OML 14250:LY1 14236:TeX 14049:MSX 14009:GEM 13965:CDC 13783:VTx 13779:DEC 13665:950 13659:GBK 13655:936 13650:932 13552:922 13547:921 13542:915 13537:912 13532:896 13527:895 13509:MIK 13404:951 13399:950 13394:949 13389:942 13384:936 13379:932 13374:904 13369:903 13364:899 13359:897 13354:869 13349:868 13344:867 13339:866 13334:865 13329:864 13324:863 13319:862 13314:861 13309:860 13304:859 13299:858 13294:857 13289:856 13284:855 13279:853 13274:852 13269:851 13264:850 13259:778 13254:777 13249:776 13244:775 13239:773 13234:770 13229:737 13224:720 13219:708 13214:668 13209:437 11963:Vai 11778:Mru 11718:Lao 11017:BOM 10868:at 10784:doi 10669:, ( 10225:doi 10058:IBM 9882:RFC 9872:doi 9828:RFC 9818:doi 9774:RFC 9764:doi 9720:RFC 9710:doi 9666:RFC 9656:doi 9612:RFC 9602:doi 9487:RFC 9477:doi 9433:RFC 9423:doi 9233:CEN 7385:at 7084:TUS 6859:AAT 6841:of 6778:). 6665:or 6228:in 6149:XML 6126:W3C 6114:Web 6050:on 5943:KDE 5874:RFC 5870:FTP 5748:gcc 5732:RFC 5579:'s 5556:FD 5553:FF 5537:FB 5523:F0 5509:6A, 5503:26 5461:25 5444:23 5423:29, 5415:06, 5411:02, 5406:22 5355:21 5344:AF 5341:AC, 5313:4A 5301:15, 5295:20 5281:1F 5260:1E 5243:04 5225:03 5211:DD, 5202:C9, 5200:C7, 5178:... 5170:02 5163:... 5150:92, 5117:01 5084:00 5073:Row 5024:AAT 4957:lām 4841:172 4746:sic 4653:467 4651:819 4545:826 4543:154 4523:534 4505:534 4461:998 4458:111 4157:in 4152:BMP 4093:ZWJ 4080:170 3957:TAB 3902:125 3805:950 3776:640 3578:915 3552:236 3526:760 3476:468 3421:An 3393:404 3354:or 3136:.) 3105:064 3102:112 3029:111 3026:114 2780:998 2778:154 2741:813 2739:149 2700:186 2698:149 2625:697 2623:144 2557:859 2555:143 2502:929 2500:137 2445:928 2443:137 2365:374 2363:137 2295:690 2293:136 2230:172 2228:128 2167:672 2165:120 2092:Mro 2031:956 2029:112 1994:122 1992:110 1949:117 1947:110 1889:116 1887:110 1824:384 1822:109 1818:93 1739:of 1716:296 1714:107 1710:90 1659:Vai 1606:648 1604:100 1600:75 1554:024 1548:64 1478:655 1472:59 1414:382 1408:52 1353:156 1347:45 1307:140 1301:41 1217:194 1211:38 1149:887 1110:885 1104:25 1062:168 1056:24 1014:327 1008:25 959:Lao 889:24 847:UCS 597:SAP 581:IBM 458:'s 403:or 263:998 261:154 14524:: 14313:HZ 11978:Yi 10790:. 10673:). 10599:^ 10555:. 10544:^ 10534:. 10530:. 10378:, 10355:. 10307:. 10289:. 10264:. 10241:. 10231:. 10223:. 10209:. 10190:. 10172:. 10168:. 10150:. 10146:. 10128:. 10055:. 10043:^ 10033:. 10029:. 10018:^ 9971:. 9967:. 9901:}} 9897:{{ 9880:. 9870:. 9847:}} 9843:{{ 9826:. 9816:. 9793:}} 9789:{{ 9772:. 9762:. 9739:}} 9735:{{ 9718:. 9708:. 9685:}} 9681:{{ 9664:. 9654:. 9631:}} 9627:{{ 9610:. 9600:. 9506:}} 9502:{{ 9475:. 9452:}} 9448:{{ 9431:. 9421:. 9394:. 9370:. 9346:. 9294:. 9104:^ 9094:. 9076:. 8998:. 8973:. 8798:. 8751:. 8649:. 8296:. 7845:^ 7664:. 7640:. 7601:. 7597:. 7579:. 7575:. 7551:. 7527:. 7476:^ 7439:. 7414:. 7379:. 7365:. 7361:. 7337:^ 7327:. 7323:. 7213:. 7187:. 7131:). 7110:a 7017:TC 6975:.) 6967:: 6853:, 6468:. 6453:. 6421:, 6236:. 6205:, 6201:, 6197:, 6193:, 6189:, 6185:, 6181:, 6102:, 5979:. 5953:. 5927:11 5923:10 5921:, 5917:, 5913:, 5909:, 5907:XP 5905:, 5830:. 5796:. 5721:. 5511:6B 5363:2E 5325:7F 5165:) 5136:7F 5063:, 4980:لا 4839:11 4756:. 4754:ET 4752:CK 4699:; 4695:; 4645:. 4616:, 4609:. 4521:65 4503:65 4442:, 4434:, 4430:, 4426:, 4414:: 4348:A 4342:. 4325:^ 4311:. 4282:^ 4272:. 4240:. 4229:^ 4219:. 4171:Cn 4135:Co 4123:) 4109:Cs 4067:Cf 4045:Cc 4005:Zp 3973:Zl 3965:LF 3961:CR 3959:, 3952:17 3939:Zs 3910:So 3889:Sk 3878:63 3865:Sc 3851:, 3847:, 3839:, 3835:, 3831:, 3827:, 3823:, 3819:, 3815:, 3792:Sm 3763:Po 3754:10 3741:Pf 3728:12 3715:Pi 3706:77 3693:Pe 3680:79 3667:Ps 3654:27 3641:Pd 3620:10 3607:Pc 3585:, 3565:No 3559:) 3539:Nl 3513:Nd 3497:13 3484:Me 3463:Mc 3442:Mn 3405:Lo 3396:A 3380:Lm 3374:) 3366:, 3362:, 3348:31 3335:Lt 3314:Ll 3293:Lu 3282:LC 3170:. 3053:U+ 3048:. 2990:. 2848:). 2840:, 2836:, 2815:, 2811:, 2807:, 2803:, 2799:, 2795:, 2671:, 2652:, 2648:, 2644:, 2640:, 2588:, 2580:, 2576:, 2572:, 2472:, 2468:, 2464:, 2460:, 2408:, 2404:, 2400:, 2396:, 2392:, 2388:, 2380:, 2322:, 2318:, 2314:, 2310:, 2261:, 2257:, 2253:, 2249:, 2245:, 2198:, 2194:, 2190:, 2186:, 2182:, 2130:, 2126:, 2122:, 2118:, 2114:, 2110:, 2106:, 2102:, 2098:, 2094:, 2090:, 2086:, 2082:, 2078:, 2074:, 2070:, 2066:, 2062:, 2058:, 2054:, 2050:, 2046:, 1920:, 1916:, 1912:, 1908:, 1904:, 1855:, 1847:, 1843:, 1839:, 1783:, 1779:, 1775:, 1771:, 1767:, 1763:, 1759:, 1755:, 1751:, 1747:, 1743:, 1735:, 1731:, 1677:, 1665:, 1653:, 1649:, 1645:, 1641:, 1637:, 1633:, 1629:, 1625:, 1621:, 1597:— 1581:, 1577:, 1573:, 1569:, 1552:99 1517:, 1509:, 1505:, 1501:, 1497:, 1493:, 1476:97 1469:— 1453:, 1445:, 1441:, 1437:, 1433:, 1429:, 1412:96 1384:) 1376:, 1372:, 1351:95 1322:, 1305:94 1293:— 1276:, 1268:, 1264:, 1260:, 1256:, 1252:, 1248:, 1244:, 1240:, 1236:, 1232:, 1215:49 1175:, 1147:38 1143:— 1108:38 1070:−9 1060:34 1045:— 1022:−6 1012:28 981:, 977:, 973:, 969:, 965:, 961:, 957:, 953:, 949:, 945:, 941:, 937:, 933:, 929:, 925:, 921:, 917:, 913:, 909:, 905:, 901:, 886:— 819:. 811:, 807:, 803:, 799:, 795:, 788:. 659:, 599:. 591:, 583:, 579:, 575:, 571:, 446:. 438:, 350:. 338:, 323:. 311:, 13992:/ 13785:) 13661:) 13657:( 12598:/ 12504:e 12497:t 12490:v 10910:e 10903:t 10896:v 10800:. 10786:: 10769:. 10638:. 10617:. 10593:. 10576:. 10481:. 10458:. 10435:. 10394:. 10365:. 10341:. 10317:. 10274:. 10249:. 10227:: 10217:: 10194:. 10132:. 10072:. 9996:. 9973:6 9949:. 9929:. 9907:) 9893:. 9874:: 9853:) 9839:. 9820:: 9799:) 9785:. 9766:: 9745:) 9731:. 9712:: 9691:) 9677:. 9658:: 9637:) 9623:. 9604:: 9583:. 9568:. 9553:. 9530:. 9512:) 9498:. 9479:: 9458:) 9444:. 9425:: 9404:. 9380:. 9356:. 9328:. 9304:. 9280:. 9259:. 9201:. 9181:. 9161:. 9141:. 9121:. 9028:. 9008:. 8984:. 8959:. 8930:. 8901:. 8872:. 8864:. 8835:. 8808:. 8790:. 8761:. 8737:. 8711:. 8685:. 8659:. 8631:. 8617:. 8588:. 8562:. 8533:. 8519:. 8487:. 8473:. 8441:. 8427:. 8395:. 8381:. 8349:. 8335:. 8306:. 8279:. 8265:. 8233:. 8219:. 8187:. 8173:. 8144:. 8124:. 8098:. 8075:. 8061:. 8029:. 8015:. 7983:. 7947:. 7911:. 7875:. 7836:. 7800:. 7764:. 7728:. 7694:. 7674:. 7650:. 7626:. 7561:. 7537:. 7513:. 7493:. 7470:. 7449:. 7425:. 7309:. 7288:. 7264:. 7224:. 7198:. 7167:. 7086:. 6935:. 6823:e 6587:I 6581:) 6575:I 6564:I 6529:I 6525:I 6521:İ 6519:( 6516:I 6509:I 6166:, 5919:8 5915:7 4983:) 4974:ا 4968:ل 4959:- 4890:– 4878:– 4866:– 4775:é 4744:( 4593:– 4585:– 4519:( 4515:– 4501:( 4497:– 4486:( 4482:– 4456:1 4418:– 4391:( 4387:– 4375:– 4162:) 4037:C 4018:1 3986:1 3931:Z 3857:/ 3853:- 3849:* 3845:! 3841:≠ 3837:∊ 3833:√ 3829:÷ 3825:× 3821:= 3817:− 3813:+ 3784:S 3599:P 3505:N 3434:M 3372:Dz 3368:Nj 3364:Lj 3360:Dž 3278:L 3226:e 3219:t 3212:v 3205:) 3100:1 3089:( 3036:– 3024:1 233:. 209:e 202:t 195:v 71:) 67:(

Index


Unicode Consortium
Universal Coded Character Set
list
UTF-8
UTF-16
GB18030
UTF-32
BOCU
SCSU
UTF-EBCDIC
UTF-7
UTF-1
ISO/IEC 8859
Official website
Technical website
v
t
e
rendering support
question marks, boxes, or other symbols
text encoding
Unicode Consortium
writing systems
characters
scripts
emoji
character sets
web pages
character repertoire

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.