
Mojibake


" is parsed as "鋜". Compared to the above mojibake, this is harder to read, since letters unrelated to the problematic å, ä or ö are missing, and it is especially problematic for short words starting with å, ä or ö (e.g. "än" becomes "鋘"). Since two letters are combined, the mojibake also seems more random (over 50 variants compared to the normal three, not counting the rarer capitals). In some rare cases, an entire text string which happens to include a pattern of particular word lengths, such as the sentence "
"computer", "kompajlirati" for "compile," etc.), and if they are unaccustomed to the translated terms, they may not understand what an option in a menu is supposed to do based on the translated phrase. Therefore, people who understand English, as well as those who are accustomed to English terminology (the majority, because English terminology is also widely taught in schools because of these problems), regularly choose the original English versions of non-specialist software.
were reserved for Myanmar's ethnic languages. Not only does the re-mapping prevent future ethnic language support, it also results in a typing system that can be confusing and inefficient, even for experienced users. ... Huawei and Samsung, the two most popular smartphone brands in Myanmar, are motivated only by capturing the largest market share, which means they support Zawgyi out of the box.
, and others, even if the character set employed is properly recognized by the application. This is because, in many Indic scripts, the rules by which individual letter symbols combine to create symbols for syllables may not be properly understood by a computer missing the appropriate software, even if the glyphs for the individual letter forms are available.
, Russian-language school) becomes "ûËÏÌÁ ÒÕÓÓËÏÇÏ ÑÚÙËÁ"). Using Code Page 1251 to view text in KOI8, or vice versa, results in garbled text that consists mostly of capital letters (KOI8 and Code Page 1251 share the same ASCII region, but KOI8 has uppercase letters in the region where Code Page 1251 has lowercase, and vice versa).
€ and the French œ, but otherwise any confusion of these three character sets does not create mojibake in these languages. Furthermore, it is always safe to interpret ISO 8859-1 as Windows-1252, and fairly safe to interpret it as ISO 8859-15, in particular with respect to the Euro sign, which replaces the rarely used
from country to country. As such, these systems will potentially display mojibake when loading text generated on a system from a different country. Likewise, many early operating systems do not support multiple encoding formats and thus will end up displaying mojibake if made to display non-standard text—early versions of
The idea of Plain Text requires the operating system to provide a font to display Unicode codes. This font is different from OS to OS for Singhala and it makes orthographically incorrect glyphs for some letters (syllables) across all operating systems. For instance, the 'reph', the short form for 'r'
computers, due to UTF-8's incompatibility with Latin-1 and Windows-1252. However, UTF-8 can be recognised directly by a simple algorithm, so well-written software should be able to avoid mixing UTF-8 up with other encodings; the problem was therefore most common when much software did not yet support
for example, are localized on a per-country basis and will only support encoding standards relevant to the country the localized version will be sold in, and will display mojibake if a file containing a text in a different encoding format from the version that the OS is designed to support is opened.
which are not in Latin-1. These two characters can be correctly encoded in Latin-2, Windows-1250, and Unicode. However, before Unicode became common in e-mail clients, e-mails containing Hungarian text often had the letters ő and ű corrupted, sometimes to the point of unrecognizability. It is common
To correctly reproduce the original text that was encoded, the correspondence between the encoded data and the notion of its encoding must be preserved (i.e. the source and target encoding standards must be the same). As mojibake is the instance of non-compliance between these, it can be achieved by
representation is considered invalid. A replacement can also involve multiple consecutive symbols, as viewed in one encoding, when the same binary code constitutes one symbol in the other encoding. This is either because of differing constant length encoding (as in Asian 16-bit encodings vs European
Newspapers have dealt with missing characters in various ways, including using image editing software to synthesize them by combining other radicals and characters; using a picture of the personalities (in the case of people's names), or simply substituting homophones in the hope that readers would
With the release of Windows XP service pack 2, complex scripts were supported, which made it possible for Windows to render a Unicode-compliant Burmese font such as Myanmar1 (released in 2005). ... Myazedi, BIT, and later Zawgyi, circumscribed the rendering problem by adding extra code points that
The problem gets more complicated when it occurs in an application that normally does not support a wide range of character encoding, such as in a non-Unicode computer game. In this case, the user must change the operating system's encoding settings to match that of the game. However, changing the
Much older hardware is typically designed to support only one character set and the character set typically cannot be altered. The character table contained within the display firmware will be localized to have characters for the country the device is to be sold in, and typically the table differs
encoding is important because the English versions of the Windows operating system are most widespread, not localized ones. The reasons for this include a relatively small and fragmented market, increasing the price of high quality localization, a high degree of software piracy (in turn caused by
During the early years of the Russian sector of the World Wide Web, both KOI8 and Code Page 1251 were common. Nearly all websites now use Unicode, but as of November 2023, an estimated 0.35% of all web pages worldwide – all languages included – are still encoded in Code Page 1251, while less
It makes communication on digital platforms difficult, as content written in Unicode appears garbled to Zawgyi users and vice versa. ... In order to better reach their audiences, content producers in Myanmar often post in both Zawgyi and Unicode in a single post, not to mention English or other
whose specific appearance varied depending on the exact combination of text and font encoding. For example, attempting to view non-Unicode Cyrillic text using a font that is limited to the Latin alphabet, or using the default ("Western") encoding, typically results in text that consists almost
from the other three creates many problems. There are many different localizations, using different standards and of different quality. There are no common translations for the vast amount of computer terminology originating in English. In the end, people use English loanwords ("kompjuter" for
When confined to basic ASCII (most user names, for example), common replacements are: š→s, đ→dj, č→c, ć→c, ž→z (capital forms analogously, with Đ→Dj or Đ→DJ depending on word case). All of these replacements introduce ambiguities, so reconstructing the original from such a form is usually done
encodings, communications between users of Zawgyi and Unicode would render as garbled text. To get around this issue, content producers would make posts in both Zawgyi and Unicode. The Myanmar government designated 1 October 2019 as "U-Day" to officially switch to Unicode. The full transition was
but is displayed using the wrong encoding. When this occurs, it is often possible to fix the issue by switching the character encoding without loss of data. The situation is complicated because of the existence of several Chinese character encoding systems in use, the most common ones being:
". Unlike the former USSR, South Slavs never used something like KOI8, and Code Page 1251 was the dominant Cyrillic encoding before Unicode; therefore, these languages experienced fewer encoding incompatibility troubles than Russian. In the 1980s, Bulgarian computers used their own
Some users transliterate their writing when using a computer, either by omitting the problematic diacritics, or by using digraph replacements (å → aa, ä/æ → ae, ö/ø → oe, ü → ue etc.). Thus, an author might write "ueber" instead of "über", which is standard practice in German when
The default Western European Windows encoding is used instead of the Central-European one. Only ő-Ő (õ-Õ) and ű-Ű (û-Û) are wrong, and the text is completely readable. This is the most common error nowadays; due to ignorance, it occurs often on webpages or even in printed media.
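A minimal Python sketch of this case, assuming the Central European encoding is Windows-1250 (`cp1250`) and the Western one is Windows-1252 (`cp1252`). The function names are illustrative only. For this sample phrase every byte is defined in both code pages, so the mix-up can be undone by reversing the two steps.

```python
def misread_as_western(text: str) -> str:
    """Encode with Windows-1250, then wrongly decode as Windows-1252."""
    return text.encode("cp1250").decode("cp1252")


def repair_from_western(garbled: str) -> str:
    """Undo the mistake by applying the two steps in reverse."""
    return garbled.encode("cp1252").decode("cp1250")


original = "árvíztűrő tükörfúrógép"
garbled = misread_as_western(original)
repaired = repair_from_western(garbled)

print(garbled)   # only ű and ő are altered
print(repaired)  # identical to the original phrase
```

The repair only works while the garbled text has not been edited or re-saved in yet another encoding.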
Although mojibake can occur with any of these characters, the letters that are not included in Windows-1252 are much more prone to errors. Thus, even nowadays, "šđčćž ŠĐČĆŽ" is often displayed as "šðèæž ŠÐÈÆŽ", although ð, È, and Æ are never used in Slavic languages.
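The following sketch checks which of the ten letters are damaged. It assumes the text was stored as Windows-1250 (`cp1250`) and displayed as Windows-1252 (`cp1252`); the helper name is hypothetical.

```python
LETTERS = "šđčćžŠĐČĆŽ"


def as_seen_in_windows_1252(text: str) -> str:
    """Show Windows-1250 bytes the way a Windows-1252 viewer would."""
    return text.encode("cp1250").decode("cp1252")


# Letters whose byte value means something else in Windows-1252.
damaged = [ch for ch in LETTERS if as_seen_in_windows_1252(ch) != ch]

print("".join(damaged))
print(as_seen_in_windows_1252("šđčćž ŠĐČĆŽ"))
```

The letters š, ž, Š and Ž survive because both code pages place them at the same byte values.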
which encodes Cyrillic letters only with high-bit set octets corresponding to 7-bit codes from KOI7. It is for this reason that KOI8 text, even Russian, remains partially readable after stripping the eighth bit, which was considered as a major advantage in the age of
succeeded as the "Internet standard" with limited support of the dominant vendors' software (today largely replaced by Unicode). With the numerous problems caused by the variety of encodings, even today some users tend to refer to Polish diacritical characters as
add to the basic Latin alphabet the letters š, đ, č, ć, ž, and their capital counterparts Š, Đ, Č, Ć, Ž (only č/Č, š/Š and ž/Ž in Slovenian; officially, although others are used when needed, mostly in foreign names, as well). All of these letters are defined in
. Lowercase letters are mainly correct, except for ű and ő. Ü/ü and Ö/ö are correct because CP 437 and CP 850 were made compatible with German. Although this is rare nowadays, it can still be seen in places such as on printed prescriptions and cheques.
than 0.003% of sites are still encoded in KOI8-R. Though the HTML standard includes the ability to specify the encoding for any given web page in its source, this is sometimes neglected, forcing the user to switch encodings in the browser manually.
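The Russian mix-ups described in this article can be reproduced with a short sketch. It assumes KOI8-R (`koi8_r`) as the KOI8 variant and Windows-1252 as the "Western" default; `misdecode` is an illustrative helper, not an established API.

```python
def misdecode(text: str, actual: str, assumed: str) -> str:
    """Encode with the real encoding, then decode with the wrongly assumed one."""
    return text.encode(actual).decode(assumed)


phrase = "Школа русского языка"

# KOI8-R bytes shown with the Western default: accented Latin capitals.
as_western = misdecode(phrase, "koi8_r", "cp1252")

# KOI8-R bytes shown as Code Page 1251: still Cyrillic, but with case swapped.
as_cp1251 = misdecode(phrase, "koi8_r", "cp1251")

print(as_western)
print(as_cp1251)
```

Switching the browser encoding manually amounts to redoing the decode step with a different `assumed` value.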
UTF-8. Most of these languages were supported by MS-DOS default CP437 and other machine default encodings, except ASCII, so problems when buying an operating system version were less common. Windows and MS-DOS are not compatible, however.
. This is the encoding that the author's editor actually saved it in. Unless an accidental encoding conversion has happened (by opening it in one encoding and saving it in another), this will be correct. It is, however, only available in
and appending atop it; as a result, these encodings are partially compatible with each other. Examples of this include Windows-1252 and ISO 8859-1. People thus may mistake the expanded encoding set they are using with plain ASCII.
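The partial compatibility can be seen in a small sketch: text limited to ASCII produces identical bytes under every ASCII-based encoding, so a wrong label goes unnoticed until a non-ASCII character appears. The list of encodings below is an arbitrary selection for illustration.

```python
ENCODINGS = ["ascii", "latin-1", "cp1252", "cp1250", "koi8_r", "utf-8"]


def encoded_forms(text: str) -> dict:
    """Return the bytes produced for `text` by each encoding in ENCODINGS."""
    return {name: text.encode(name) for name in ENCODINGS}


# Pure ASCII: every encoding yields exactly the same bytes.
plain = encoded_forms("plain ASCII text")
print(len(set(plain.values())))

# One accented letter is enough to make the encodings diverge.
print("é".encode("latin-1"), "é".encode("cp1252"), "é".encode("utf-8"))
```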
is a diacritic that normally goes on top of a plain letter. However, it is wrong to go on top of some letters like 'ya' or 'la' in specific contexts. For Sanskritic words or names inherited by modern languages, such as कार्य, IAST:
Mainly caused by web services or webmail clients that are configured incorrectly or not tested for international usage (as the problem remains concealed for English texts). In this case the actual (often generated) content is in
Due to Western sanctions and the late arrival of Burmese language support in computers, much of the early Burmese localization was homegrown without international cooperation. The prevailing means of Burmese support is via the
way to store the encoding together with the data – prepend it. This is by intention invisible to humans using compliant software, but will by design be perceived as "garbage characters" to incompliant software (including many
The examples in this article do not use UTF-8 as the browser setting, because UTF-8 is easily recognisable: a browser that supports UTF-8 should recognise it automatically and should not try to interpret something else as UTF-8.
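One common way to recognise UTF-8 is to attempt a strict decode and see whether it succeeds. The sketch below takes that approach; it is not necessarily the algorithm any particular browser uses, and `looks_like_utf8` is a hypothetical name.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if `data` is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_utf8("kärlek".encode("utf-8")))   # valid multi-byte sequence
print(looks_like_utf8("kärlek".encode("cp1252")))  # lone 0xE4 is not valid UTF-8
```

Pure ASCII input also passes the check, which is harmless because ASCII is a subset of UTF-8. Short legacy-encoded inputs can occasionally pass by coincidence, so the test is a strong hint and not a proof.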
1329:. For example, in Norwegian, digraphs are associated with archaic Danish, and may be used jokingly. However, digraphs are useful in communication with other parts of the world. As an example, the Norwegian football player 3108:
or ARMSCII, a set of obsolete character encodings for the Armenian alphabet which have been superseded by Unicode standards. ArmSCII is not widely used because of a lack of support in the computer industry. For example,
4181:– The conventions for representing the line break differ between Windows and Unix systems. Though most software supports both conventions (which is trivial), software that must preserve or display the difference (e.g. 3799:. But it happens in most operating systems. This appears to be a fault of internal programming of the fonts. In Mac OS and iOS, the muurdhaja l (dark l) and 'u' combination and its long form both yield wrong shapes. 1265:
in German, which becomes "für". This way, even though the reader has to guess what the original letter is, almost all texts remain legible. Finnish, on the other hand, frequently uses repeating vowels in words like
to be changed (older versions require special English versions with this support), but this setting can be and often was incorrectly set. For example, Windows 98 and Windows Me can be set to most non-right-to-left
Mojibake is often seen with text data that have been tagged with a wrong encoding; it may not even be tagged at all, but moved between computers with different default encodings. A major source of trouble are
) radical, while kanji are other characters. Many of the substitute characters are extremely uncommon in modern Chinese. This is somewhat easy to identify due to the presence of multiple consecutive 亻 characters.
Failed rendering of glyphs due to either missing fonts or missing glyphs in a font is a different issue that is not to be confused with mojibake. Symptoms of this failed rendering include blocks with the
The difficulty of resolving an instance of mojibake varies depending on the application within which it occurs and the causes of it. Two of the most common applications in which mojibake may occur are
in the HTTP header. This information can be based on server configuration (for instance, when serving a file off disk) or controlled by the application running on the server (for dynamic websites).
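As a sketch of how a client might read the character set from the HTTP header, the example below parses a `Content-Type` value with Python's standard `email.message.Message` class. The function name and the sample header values are assumptions made for illustration.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # Returns the charset in lower case, or None when no charset is given.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(charset_from_content_type("text/html"))
```

When the header carries no charset, the client has to fall back on information inside the document or on its own defaults, which is where mismatches tend to arise.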
1643:, meaning "letter garbage". Hungarian has been particularly susceptible as it contains the accented letters á, é, í, ó, ú, ö, ü (all present in the Latin-1 character set), plus the two characters 3787:, it is apt to put it on top of these letters. By contrast, for similar sounds in modern languages which result from their specific rules, it is not put on top, such as the word करणाऱ्या, IAST: 746:
When there are layers of protocols, each trying to specify the encoding based on different information, the least certain information may be misleading to the recipient. For example, consider a
2716:), when encoded in KOI8 and passed through the high bit stripping process, end up being rendered as "[KOLA RUSSKOGO qZYKA". Eventually, KOI8 gained different flavors for Russian and Bulgarian ( 2186:
Also common in the days of DOS, this could be seen when Apple computers tried to display Hungarian text sent using DOS or Windows machines, as they would often default to Apple's own encoding.
, an application that allows the changing of per-application locale settings. Even so, changing the operating system encoding settings is not possible on earlier operating systems such as
An additional problem in Chinese occurs when rare or antiquated characters, many of which are still used in personal or place names, do not exist in some encodings. Examples of this are:
languages can also be affected. Because most computers were not connected to any network during the mid- to late-1980s, there were different character encodings for every language with
Oct. 1 is "U-Day", when Myanmar officially will adopt the new system.... Microsoft and Apple helped other countries standardize years ago, but Western sanctions meant Myanmar lost out.
appears as "hÃ⁠¤Ã⁠¤yÃ⁠¶"). Icelandic has ten possibly confounding characters, and Faroese has eight, making many words almost completely unintelligible when corrupted (e.g. Icelandic
3359:, mojibake is especially problematic as there are many different Japanese text encodings. Alongside Unicode encodings (UTF-8 and UTF-16), there are other standard encodings, such as 1253:
In Swedish, Norwegian, Danish and German, vowels are rarely repeated, and it is usually obvious when one character gets corrupted, e.g. the second letter in the Swedish word
630:, and many other conditions. Therefore, the assumed encoding is systematically wrong for files that come from a computer with a different setting, or even from a differently 339:
appears as "譁�蟄怜喧縺�" if interpreted as Shift-JIS, as "文字化け" if interpreted as Western, or (for example) as "鏂囧瓧鍖栥亼" if interpreted as being in a
Standard Myanmar Unicode fonts were never mainstreamed unlike the private and partially Unicode compliant Zawgyi font. ... Unicode will improve natural language processing
If the encoding is not specified, it is up to the software to decide it by other means. Depending on the type of software, the typical solution is either configuration or
(UNIX systems). Even to this day, mojibake is often encountered by both Japanese and non-Japanese people when attempting to run software written for the Japanese market.
. UTF-8 also has the ability to be directly recognised by a simple algorithm, so that well-written software should be able to avoid mixing UTF-8 up with other encodings.
vowel, easily recognizable as mojibake generated by a computer not configured to display Indic text. The logo as redesigned as of May 2010 has fixed these errors.
4622: 4552: 3049: 4736: 721:), that were not displayed properly in software complying with the ISO standard; this especially affected software running under other operating systems such as 3767:, which attempts to show the character analogous to "wi" (the first syllable of "Knowledge") on each of many puzzle pieces. The puzzle piece meant to bear the 642:
and other machine readable text, many parsers do not tolerate this. Another is storing the encoding as metadata in the file system. File systems that support
Another type of mojibake occurs when text encoded in a single-byte encoding is erroneously parsed in a multi-byte encoding, such as one of the encodings for
high price of software compared to income), which discourages localization efforts, and people preferring English versions of Windows and other software.
. With the advent of mobile phones, mobile vendors such as Samsung and Huawei simply replaced the Unicode-compliant system fonts with Zawgyi versions.
encoding was designed so that Hungarian text remains fairly well-readable even if the device on the receiving end uses one of the default encodings (
Both encodings are Central European, but the text is encoded with the Windows encoding and decoded with the DOS encoding. The use of ű is correct.
Both encodings are Central European, but the text is encoded with the DOS encoding and decoded with the Windows encoding. The use of ű is correct.
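The two Central European cases can be checked with a short sketch, assuming the DOS encoding is CP 852 (`cp852`) and the Windows encoding is Windows-1250 (`cp1250`). The helper `misdecode` is illustrative; `errors="replace"` is used because a few byte values are undefined in Windows-1250.

```python
def misdecode(text: str, actual: str, assumed: str) -> str:
    """Encode with the real encoding, then decode with the wrongly assumed one."""
    return text.encode(actual).decode(assumed, errors="replace")


sample = "tűrő"

# Text written under DOS, read under Windows.
dos_read_as_windows = misdecode(sample, "cp852", "cp1250")

# Text written under Windows, read under DOS.
windows_read_as_dos = misdecode(sample, "cp1250", "cp852")

print(dos_read_as_windows)
print(windows_read_as_dos)
```

In both directions ű survives because the two code pages happen to store it at the same byte value, while ő does not.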
Before Unicode, it was necessary to match text encoding with a font using the same encoding system; failure to do this produced unreadable
encoding setting on the fly, while word processors allow the user to select the appropriate encoding when opening a file. It may take some
Mojibake also occurs when the encoding is incorrectly specified. This often happens between encodings that are similar. For example, the
834:. Modern browsers and word processors often support a wide array of character encodings. Browsers often allow a user to change their 819:
as a default encoding may achieve a greater degree of interoperability because of its widespread use and backward compatibility with
6530: 6284: 4445: 1984:("Central European"), but the software on the receiving end often did not support CP 852 and instead tried to display text using 1157: 6520: 3518:(with several backward compatible versions), and the possibility of Chinese characters being encoded using Japanese encoding. 3125:. With this kind of mojibake more than one (typically two) characters are corrupted at once. For example, if the Swedish word 6269: 5223: 3990: 1026:. The additional characters are typically the ones that become corrupted, making texts only mildly unreadable with mojibake: 681:
that are used to substitute for missing HTTP headers if the server cannot be configured to send the proper HTTP headers; see
307:, several encodings have historically been employed, causing users to see mojibake relatively often. As an example, the word 4560: 204:, "character transformation") is the garbled or gibberish text that is the result of text being decoded using an unintended 6403: 1930:). This encoding was used very heavily between the early 1980s and early 1990s, but nowadays it is completely deprecated. 3164: 2612:
with the needed glyphs for Polish—arbitrarily located without reference to where other computer sellers had placed them.
which in most cases make no sense. Probably the easiest to identify because of spaces between every several characters.
; to resolve this issue on earlier operating systems, a user would have to use third-party font rendering applications.
serving a static HTML file over HTTP. The character set may be communicated to the client in any of three ways:
650:. This also requires support in software that wants to take advantage of it, but does not disturb other software. 6605: 6540: 6294: 6274: 3350: 4775: 1656:(literally "Flood-resistant mirror-drilling machine") which contains all accented characters used in Hungarian. 68: 6358: 4951: 2592:
computers created their own mutually-incompatible ways to encode Polish characters and simply reprogrammed the
are not available. The latter practice seems to be better tolerated in the German language sphere than in the
. The result is a systematic replacement of symbols with completely unrelated ones, often from a different
Garbled characters with almost no hint of original meaning. The red character is not a valid codepoint in
In earlier eras, some computers used vendor-specific encodings, which caused mismatches even for English text.
While a few encodings are easy to detect, such as UTF-8, there are many that are hard to distinguish (see
75: 6842: 6777: 6388: 6318: 6304: 6289: 6193: 6106: 6078: 6044: 4285: 4207: 936: 682: 319:), or "ハクサ郾ス、ア" if interpreted as Shift-JIS, or as "ʸ»ú²½¤±" in software that assumes text to be in the 970:. PETSCII printers worked fine on other computers of the era, but inverted the case of all letters. IBM 6751: 6696: 6617: 6398: 6054: 6049: 5402: 4218:), applying it too many times results in garbling of these characters. For example, the quotation mark 3502: 623: 6393: 6458: 6413: 6249: 5798: 5502: 5447: 5412: 4789: 4210:– An encoding of special characters in HTML, mostly optional, but required for certain characters to 3860: 3093: 2764: 2601: 57: 42: 6847: 5973: 5482: 5462: 5457: 5397: 5392: 4901: 4331: 3712: 2775: 987: 643: 340: 6348: 1049:(š and ž are present in some Finnish loanwords, é marginally in Swedish, mainly also in loanwords) 296:
generally uses UTF-16, and sometimes uses 8-bit code pages for text files in different languages.
6823: 6807: 6734: 6729: 6691: 6662: 6627: 6059: 5793: 5492: 5377: 4201: 2729: 2668: 1234: 955: 172: 166: 130: 35: 6418: 6408: 6264: 6254: 5788: 5497: 4942: 4929: 4865: 4528: 4182: 2605: 2597: 694: 262: 4589: 3183:
was commonly seen on computers that ran pre-Vista versions of Windows or cheap mobile phones.
1330: 6595: 6433: 6368: 6244: 5783: 4937: 4705: 4214:
interpretation as markup. While failure to apply this transformation is a vulnerability (see
4168: 2683:, which translates to "Code for Information Exchange"). This began with Cyrillic-only 7-bit 631: 272:
The differing default settings between computers are in part due to differing deployments of
5813: 1241:, mojibake has become more common in certain scenarios, e.g. exchange of text files between 6756: 6428: 6188: 5808: 4706:"Integrating autoconversion: Facebook's path from Zawgyi to Unicode - Facebook Engineering" 4215: 4173: 3741: 3122: 3053: 966:
encoding, particularly notable for inverting the upper and lower case compared to standard
character for "wi" instead used to display the "wa" character followed by an unpaired "i"
but with Latin and some other characters replaced with Cyrillic letters. Then came 8-bit
8: 6711: 6338: 5823: 5708: 5698: 5693: 3745: 3493: 3215: 3156: 3061: 2744: 2725: 1167: 1015: 851: 6792: 6640: 6453: 6448: 6373: 5372: 5346: 4870: 4826: 4237: 3900: 3863:, unencoded text is unreadable. Texts that may produce mojibake include those from the 3484: 3138: 2948: 2748: 2740: 2664: 2609: 1636: 1322: 1135: 1066: 971: 846:
system-wide encoding settings can also cause mojibake in pre-existing applications. In
declaration. This is the encoding that the author meant to save the particular file in.
594: 285: 205: 4455: 4431: 4310: 1354:" being rendered as "Ring meg nÃ¥", was seen in 2014 in an SMS scam targeting Norway. 1261:("love") when it is encoded in UTF-8 but decoded in Western, producing "kÃ⁠¤rlek", or 82: 6852: 6782: 6721: 6701: 6363: 6343: 6323: 5951: 5427: 5407: 4919: 4623:"Unified under one font system as Myanmar prepares to migrate from Zawgyi to Unicode" 4378: 4257:
King, Ritchie (2012). "Will unicode soon be the universal code? [The Data]".
tools) can get substantially more difficult to use if not adhering to one convention.
3924: 3908: 3110: 3101: 3097: 3076: 2997: 2756: 2569: 1334: 1246: 1190: 995: 799: 698: 654: 612: 335:. This is further exacerbated if other locales are involved: the same text stored as 293: 200: 188: 4413: 3131:
is encoded in Windows-1252 but decoded using GBK, it will appear as "k鋜lek", where "
6739: 6313: 6279: 5989: 5818: 4266: 3920: 3892: 3796: 3757: 3753: 3749: 3716: 3632: 3472: 3065: 3013: 3005: 3001: 2828: 2652: 2647: 2615:
The situation began to improve when, after pressure from academic and user groups,
2581: 1790: 1694: 1326: 1161: 1147: 1121: 1053: 1046: 1042: 1019: 1011: 999: 991: 888: 710: 627: 277: 3501:, meaning 'chaotic code'), and can occur when computerised text is encoded in one 2782:
for virtually all characters in all languages, including all Cyrillic characters.
6787: 6706: 5437: 5432: 5422: 5367: 5052: 5042: 5037: 5032: 5027: 5022: 5017: 4553:"Unicode in, Zawgyi out: Modernity finally catches up in Myanmar's digital world" 4211: 4192: 4186: 3948: 3915:, but these are not generally supported. Various other writing systems native to 3896: 3888: 3868: 3733: 1380: 1199: 1184: 1141: 1095: 1070: 1007: 1003: 900: 839: 781: 714: 635: 304: 3955:), in which text becomes completely unreadable when the encodings do not match. 244:
or using the generic replacement character. Importantly, these replacements are
6239: 6234: 6224: 6219: 6214: 6209: 6173: 6168: 6161: 6156: 6151: 6146: 6141: 6136: 6131: 6126: 6121: 6116: 5984: 5941: 5936: 5931: 5926: 5921: 5916: 5911: 5906: 5901: 5896: 5891: 5886: 5881: 5876: 5871: 5778: 5773: 5768: 5763: 5758: 5753: 5748: 5743: 5738: 5733: 5728: 5723: 5507: 5092: 5012: 5007: 5002: 4997: 4992: 4987: 4982: 4977: 4972: 4840: 4270: 3884: 3864: 3764: 3018: 3009: 2696: 1612: 1608: 1506: 1178: 1081: 1023: 835: 831: 300: 281: 209: 5487: 4650: 6836: 6559: 5979: 5866: 5861: 5856: 5851: 5846: 5841: 5718: 5713: 5703: 5688: 5683: 5678: 5673: 5668: 5663: 5658: 5653: 5648: 5643: 5638: 5633: 5628: 5623: 5618: 5613: 5608: 5603: 5598: 5593: 5588: 5583: 5578: 5573: 5568: 5563: 5558: 5553: 5548: 5543: 5538: 5533: 5528: 5523: 5442: 5417: 5382: 5341: 5087: 3932: 3811: 3220: 2915: 2848: 2736: 1981: 1552: 1393: 1385: 6579: 6574: 6569: 6564: 6299: 6039: 6034: 6029: 6024: 6019: 6014: 6009: 6004: 5999: 5994: 5477: 5472: 5452: 5336: 5328: 4961: 4692:"UTF-8" technically does not apply to ad hoc font encodings such as Zawgyi. 4675: 4480:"Some Errors Defy Fixes: A Typo in Knowledge's Logo Fractures the Sanskrit" 4352: 4142: 4132: 4052: 4042: 4032: 4022: 3831:
but was in fact only partially Unicode compliant. In the Zawgyi font, some
but interpreted by the recipient as one of the Western European encodings (
758: 706: 670: 320: 137: 3104:, may produce mojibake. This problem is particularly acute in the case of 4894: 4877: 3916: 3824: 3650: 3175:). It can occur when a computer tries to decode text encoded in UTF-8 as 1226: 883:(“,”,‘,’), but rarely in character text, since most encodings agree with 880: 827: 658: 639: 316: 280:
families, and partly the legacy encodings' specializations for different
241: 220: 4770: 2851:, which is superficially similar to (although incompatible with) CP866. 1333:
had his last name spelled "SOLSKJAER" on his uniform when he played for
1274:("wedding night") which can make corrupted text very hard to read (e.g. 6744: 6652: 6505: 6183: 5278: 5248: 5243: 5238: 5233: 5198: 5082: 5077: 5067: 5062: 4860: 4850: 4163: 4122: 4102: 4072: 4062: 3807: 3803: 3768: 3737: 3696: 3052:
Croatian from Serbian, Bosnian from Croatian and Serbian, and now even
2779: 2616: 2565: 1787: 1539: 1398: 1345: 1210: 1195: 913: 892: 855: 847: 747: 702: 634:
software within the same system. For Unicode, one solution is to use a
324: 237: 6685: 4795: 1778:
Mainly caused by incorrectly configured mail servers but may occur in
that rely on settings on each computer rather than sending or storing
6632: 6610: 6515: 6328: 5357: 5288: 5268: 5263: 5188: 5183: 4450: 4112: 3832: 3772: 3672: 3360: 2786: 2476: 2106: 1616: 1565: 1442: 1230: 666: 619: 2655:, which was and remains complicated by several systems for encoding 24: 6657: 6622: 6600: 6510: 6333: 5273: 5258: 5218: 5213: 5208: 5193: 5152: 5147: 5142: 5137: 5132: 5127: 4924: 4914: 4910: 4884: 4504: 4465: 3872: 2701: 2656: 1301:("letter salad") is a common term for this phenomenon, in Spanish, 939:(’), when encoded in UTF-8 and decoded using Windows-1252, becomes 871:
Mojibake in English texts generally occurs in punctuation, such as
820: 678: 266: 224:
8-bit encodings), or the use of variable length encodings (notably
entirely of capitalized vowels with diacritical marks (e.g. KOI8 "
Of the encodings still in common use, many originated from taking
143: 6672: 6468: 6378: 6259: 5833: 5203: 5178: 5168: 4906: 4196: 4178: 3936: 3880: 3876: 3836: 3688: 3507: 3105: 2573: 1578: 963: 876: 872: 803: 785: 574: 273: 121: 3839:, but others were not. The Unicode Consortium refers to this as 3025:, while only some (š, Š, ž, Ž, Đ) exist in the usual OS-default 6677: 6667: 6645: 6525: 6500: 6495: 6178: 6069: 5959: 5318: 5308: 5293: 5110: 4784: 4092: 4082: 4012: 3928: 3904: 3628: 3480: 3364: 2802:, library) becomes "âÉÂÌÉÏÔÅËÁ", while "Школа русского языка" ( 2752: 2721: 2717: 2273: 1997: 1989: 1985: 1935: 1927: 1923: 1877: 975: 909: 662: 315:
might be incorrectly displayed as "ハクサ�ス、ア", "ハクサ嵂ス、ア" (
312: 229: 156: 4027:الإعلان العالمى Ů„Ř­­Ů‚وق الإنسان 1477:
The same problem also occurs in Romanian; see these examples:
and are the result of correct error handling by the software.
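The replacement character produced by such error handling can be reproduced with a minimal Python sketch. It assumes, purely for illustration, that the word "café" was saved as ISO 8859-1 and is then decoded as UTF-8; the variable names are hypothetical.

```python
# "café" stored as ISO 8859-1: the é is the single byte 0xE9.
raw = "café".encode("latin-1")

# In UTF-8, 0xE9 announces a multi-byte sequence, but no continuation
# bytes follow, so a strict decoder rejects the input.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace" the decoder substitutes U+FFFD for the bad byte
# instead of guessing at a character.
shown = raw.decode("utf-8", errors="replace")
print(shown)
```

The substituted character signals that information was lost at that position, which is different from a wrong but valid character being shown.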
6772: 6490: 6485: 6480: 6097: 5803: 5313: 5253: 5115: 4889: 4505:"Marathi Typing | English to Marathi | Online Marathi Typing" 4057:ط§ظ„ط¥ط¹ظ„ط§ظ† ط§ظ„ط¹ط§ظ„ظ…ظ‰ ظ„ط­ظ‚ظˆظ‚ ط§ظ„ط¥ظ†ط³ط§ظ† 4047:الإعلان العالمى لحقوق الإنسان 4037:Ш§Щ„ШҐШ№Щ„Ш§Щ† Ш§Щ„Ш№Ш§Щ„Щ…Щ‰ Щ„Ш­Щ‚Щ€Щ‚ Ш§Щ„ШҐЩ†ШіШ§Щ† 4007: 3521:
It is relatively easy to identify the original encoding when
in 1987, users of various computing platforms used their own
as well as optional acute accents on é etc for disambiguation
However, various sites have made free-to-download fonts.
to respond to a corrupted e-mail with the nonsense phrase
Windows-1252 contains extra printable characters in the
Kana is displayed as characters with the 亻 (Chinese:
, "outstanding hospitality", appears as "þjóðlöð").
Garbled text as a result of incorrect character encodings
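As a rough illustration of this definition, the following Python sketch encodes text with one codec and decodes the resulting bytes with another. The helper name `misread` is hypothetical, and the codec pairings are assumptions chosen to match cases the article mentions (UTF-8 read as Windows-1252, KOI8 read as ISO 8859-1).

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one codec, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as)

# Right single quotation mark, written as UTF-8 and read as Windows-1252.
curly = misread("’", "utf-8", "cp1252")             # ’

# Pound sign, misread once and then misread again after being re-saved.
pound_once = misread("£", "utf-8", "cp1252")
pound_twice = misread(pound_once, "utf-8", "cp1252")

# Russian "Библиотека" written as KOI8-R and read as ISO 8859-1.
library = misread("Библиотека", "koi8_r", "latin-1")  # âÉÂÌÉÏÔÅËÁ

print(curly, pound_once, pound_twice, library)
```

Each repeated misreading adds more characters, which is why text that has passed through several misconfigured systems grows longer as well as less legible.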
4117:ÿߟÑÿ•ÿπŸÑÿßŸÜ ÿߟÑÿπÿߟџ֟⠟Ñÿ≠ŸÇŸàŸÇ ÿߟÑÿ•ŸÜÿ≥ÿßŸÜ 4107:ظ'عÑظ٪ظ٩عÑظ'عÜ ظ'عÑظ٩ظ'عÑعÖعâ عÑظ-عÇعàعÇ ظ'عÑظ٪عÜظ٣ظ'عÜ 4097:╪з┘Д╪е╪╣┘Д╪з┘Ж ╪з┘Д╪╣╪з┘Д┘Е┘Й ┘Д╪н┘В┘И┘В ╪з┘Д╪е┘Ж╪│╪з┘Ж 4087:ěž┘äěąě╣┘äěž┘ć ěž┘äě╣ěž┘ä┘ů┘ë ┘äěş┘é┘ł┘é ěž┘äěą┘ćě│ěž┘ć 4077:ظ�ع„ظ�ظ�ع„ظ�ع† ظ�ع„ظ�ظ�ع„ع…ع‰ ع„ظ­ع‚عˆع‚ ظ�ع„ظ�ع†ظ�ظ�ع† 4017:ь╖ы└ь╔ь╧ы└ь╖ы├ ь╖ы└ь╧ь╖ы└ы┘ы┴ ы└ь╜ы┌ы┬ы┌ ь╖ы└ь╔ы├ьЁь╖ы├ 861: 626:
setting, which depends on the user's language, brand of
4067:иЇй�иЅиЙй�иЇй� иЇй�иЙиЇй�й�й� й�ий�й�й� иЇй�иЅй�иГиЇй� 3444:このメールは皆様へのメッセージです。 3436:ك“ك�كƒ�كƒ�كƒ�ك�هš†ن�˜ك�ك�كƒ�كƒƒك‚؛كƒ�ك‚�ك�ك™ك€‚ 3428:„Åì„ÅÆ„É°„ɺ„É´„ÅØÁöÜÊßò„Å∏„ÅÆ„É°„ÉɄǪ„ɺ„Ç∏„Åß„Åô„ÄÇ 792: 1309:(literally "deformation") is used, and in Portuguese, 257:
manipulating the data itself, or just relabelling it.
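The two routes named here, changing the data or changing only its label, can be contrasted in a short Python sketch. The sample word "čaj" and the choice of Windows-1250, Windows-1252 and UTF-8 are assumptions for illustration, not cases taken from the text.

```python
original = "čaj"

# Bytes as stored, correctly labelled as Windows-1250.
stored = original.encode("cp1250")

# Relabelling: the bytes are untouched, but the reader is told they are
# Windows-1252, so 0xE8 is shown as è instead of č.
relabelled = stored.decode("cp1252")

# Manipulating the data: the bytes are converted to UTF-8, but the label
# still says Windows-1250, so the two UTF-8 bytes of č are shown as two
# separate characters.
transcoded = original.encode("utf-8").decode("cp1250")

print(relabelled, transcoded)
```

In both cases the bytes and the declared encoding no longer agree; only decoding `stored` as Windows-1250 gives back the original word.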
code pages including 1250, but only at install time.
... and their uppercase counterparts, if applicable.
for Burmese script were implemented as specified in
3802:Some Indic and Indic-derived scripts, most notably 981: 49:. Unsourced material may be challenged and removed. 4375:Control + Alt + Delete: A Dictionary of Cyberslang 3791:, a stem form of the common word करणारा/री, IAST: 2576:on Amiga, Atari Club on Atari ST and Masovia, IBM 1140:à, â, ç, è, é, ë, ê, ï, î, ô, ù, û, ü, ÿ, æ, œ in 4704:LaGrow, Nick; Pruzan, Miri (September 26, 2019). 3029:, and are there because of some other languages. 669:if the encoding is not assigned explicitly using 6834: 6752:Unicode control, format and separator characters 4332:"Unicode mailinglist on the Eudora email client" 2704:-unaware email systems. For example, the words " 1311: 4396:"Usage statistics of Windows-1251 for websites" 2822: 2816: 1602: 1349: 1284: 661:may not be able to distinguish a page coded in 916:). If iterated using CP1252, this can lead to 615:heuristics. Both are prone to mis-prediction. 4811: 4547: 4545: 3496: 3487: 3132: 3126: 2874: 2838: 2832: 2803: 2797: 2791: 2711: 2705: 2678: 2672: 2641: 2635: 1303: 1276: 1268: 1255: 713:range (the most frequently seen being curved 4734: 4703: 4283: 3998: 3075:Newer versions of English Windows allow the 2621: 1295: 978:encoding which does not match ASCII at all. 930:£ 850:or later, a user also has the option to use 673:sent along with the documents, or using the 153:Russian Knowledge article on Church Slavonic 4668: 2630:Russian and other Cyrillic-based alphabets 2588:on IBM PCs. Polish companies selling early 1627:), often also varying by operating system. 4818: 4804: 4583: 4581: 4542: 4286:"curl -v linux.ars (Internationalization)" 4728: 4587: 4414:"Usage statistics of KOI8-R for websites" 441:EUC-JP bytes interpreted as Windows-1252 109:Learn how and when to remove this message 842:for users to find the correct encoding. 560:UTF-8 bytes interpreted as Windows-1252 142: 120: 4825: 4578: 4471: 4468:or GBK in browser to view it correctly. 
4432:"Declaring character encodings in HTML" 4284:WINDISCHMANN, Stephan (31 March 2004). 3724:be able to make the correct inference. 2815:In Bulgarian, mojibake is often called 177:question marks, boxes, or other symbols 6835: 4697: 4616: 4614: 4377:, Jonathon Keats, Globe Pequot, 2007, 3919:present similar problems, such as the 3087: 1980:, as the text was often encoded using 1782:messages on some cell phones as well. 1166:à, á, â, ã, ç, é, ê, í, ó, ô, õ, ú in 398:EUC-JP bytes interpreted as Shift-JIS 4799: 4588:Hotchkiss, Griffin (March 23, 2016). 4477: 4464:and Unicode. Need manually selecting 3991:Universal Declaration of Human Rights 3460:‚±‚̃[ƒ‹‚ÍŠF—l‚ւ̃ƒbƒZ[ƒW‚Å‚·B 3452:¤³¤Î¥á¡¼¥ë¤Ï³§ÍͤؤΥá¥Ã¥»¡¼¥¸¤Ç¤¹¡£ 3412:�����<�若������罕��吾���<���祉�若�吾�с���� 2992: 2646: 862:Problems in different writing systems 701:was known to send emails labelled as 606: 511:UTF-8 bytes interpreted as Shift-JIS 215:This display may include the generic 199: 4311:"Guidelines for extended attributes" 4256: 3854: 1976:This was very common in the days of 1317:(literally "deformatting") is used. 793:Lack of hardware or software support 741: 688: 47:adding citations to reliable sources 18: 4620: 4611: 4559:. 27 September 2019. Archived from 4250: 3999: 3806:, were not officially supported by 1639:, the phenomenon is referred to as 904:if it was encoded by the sender as 13: 6162:Norwegian and Danish (alternative) 3420:縺薙�繝。繝シ繝ォ縺ッ逧�ァ倥∈縺ョ繝。繝�そ繝シ繧ク縺ァ縺吶€� 3116: 3060:When Cyrillic script is used (for 1500:(Characters in red are incorrect.) 1374:(Characters in red are incorrect.) 1209:These are languages for which the 788:encodings such as UTF-8 or UTF-16. 14: 6864: 4763: 4735:Saw Yi Nanda (21 November 2019). 3151:Vietnamese language and computers 3096:region, including the scripts of 2769:other Slavic variants of Cyrillic 1237:(¤). 
However, with the advent of 728: 6819: 6818: 4783: 4769: 3983: 3913:Democratic Republic of the Congo 3475:, the same phenomenon is called 2634:Mojibake is colloquially called 2384: 1052:à, ç, è, é, ï, í, ò, ó, ú, ü in 982:Other Western European languages 424:EUC-JP bytes interpreted as GBK 23: 6606:Digital encoding of APL symbols 6541:Comparison of Unicode encodings 5059:Proposed but not approved 4676:"Myanmar Scripts and Languages" 4643: 4621:Sin, Thant (7 September 2019). 4521: 4497: 4438: 3887:, and other languages, and the 3827:, a font that was created as a 3763:One example of this is the old 3351:Japanese language and computers 3092:The writing systems of certain 537:UTF-8 bytes interpreted as GBK 155:displayed as if interpreted as 136:displayed as if interpreted as 131:Japanese Knowledge article for 34:needs additional citations for 4424: 4406: 4388: 4367: 4345: 4324: 4303: 4277: 4147:ÇáÅÚáÇä ÇáÚÇáãì áÍÞæÞ ÇáÅäÓÇä 4137:«·≈⁄·«‰ «·⁄«·„Ï ·ÕfiÊfi «·≈‰”«‰ 4127:«‰≈Ÿ‰«Ê†«‰Ÿ«‰ÂȆ‰Õ‚Ë‚†«‰≈Ê”«Ê 3903:is used to write languages of 3732:A similar effect can occur in 3705: 3681: 3676: 3663: 3658: 3654: 3641: 3636: 3587: 3525:occurs in Guobiao encodings: 3497: 3488: 3167:: 𡨸魔, "ghost characters") or 2673: 2596:of the video cards (typically 810: 1: 4774:The dictionary definition of 4478:Cohen, Noam (June 25, 2007). 4355:(in Norwegian). June 18, 2014 4243: 4000:الإعلان العالمى لحقوق الإنسان 3958: 3952: 3947:Another affected language is 3851:estimated to take two years. 3742:Indo-Aryan or Indic languages 3727: 3144: 3068:), the problem is similar to 1213:character set (also known as 369:Raw bytes of EUC-JP encoding 3623:encoding's lack of the "煊" ( 3179:, TCVN3 or VNI. In Vietnam, 3070:other Cyrillic-based scripts 1663: 1630: 1603:Central and Eastern European 470:Raw bytes of UTF-8 encoding 327:encodings, usually labelled 284:of human languages. 
Whereas 7: 6778:Character encodings in HTML 6112:National Replacement (NRCS) 6079:Japanese language in EBCDIC 4655:Google Code: Zawgyi Project 4157: 3657:; traditional Chinese: 3344: 3206:Trăm năm trong cõi người ta 3159:, the phenomenon is called 3008:(the seceding varieties of 2947:(The second character is a 2833: 2827:), meaning "monkey's ". In 2817: 2804: 2798: 2712: 2636: 1675: 1659: 1225:, and the slightly altered 937:right single quotation mark 683:character encodings in HTML 10: 6869: 4680:Frequently Asked Questions 4271:10.1109/MSPEC.2012.6221090 4006: 3997: 3982: 3817: 3503:Chinese character encoding 3466: 3388: 3348: 3148: 3141:", may be misinterpreted. 2873: 2626:(, lit. "little shrubs"). 1683: 1022:are all extensions of the 866: 311:itself ("文字化け") stored as 219:("�") in places where the 6816: 6765: 6720: 6588: 6549: 6467: 6202: 6092: 6068: 5950: 5832: 5516: 5355: 5327: 5161: 5103: 4960: 4833: 4529:"Content Moved (Windows)" 4131: 4111: 3942: 3861:writing systems of Africa 3700: 3653:(simplified Chinese: 3582: 3553: 3440: 3404:���̃��(�q���Y�_�C�G�b�g) 3393: 3227: 3204: 3133: 3094:languages of the Caucasus 2944:п я─п╟п╨п╬п╥я▐п╠я─я▀ 2932: 2906: 2893: 2875: 2839: 2823: 2792: 2747:, as well as Russian and 2706: 2679: 2642: 2564:Prior to the creation of 2559: 2465: 2385: 2353: 2190: 2185: 2105: 1975: 1934: 1876: 1777: 1698: 1518: 1505: 1441: 1415: 1397: 1379: 1202:(æ and œ are rarely used) 599: 591: 588: 585: 582: 579: 571: 568: 565: 562: 554: 551: 548: 545: 542: 539: 531: 528: 525: 522: 519: 516: 513: 505: 502: 499: 496: 493: 490: 487: 484: 481: 478: 475: 472: 464: 461: 458: 455: 452: 449: 446: 443: 435: 432: 429: 426: 418: 415: 412: 409: 406: 403: 400: 392: 389: 386: 383: 380: 377: 374: 371: 363: 360: 357: 354: 343:(Mainland China) locale. 
251: 192: 6808:Variable-length encoding 6589:Miscellaneous code pages 5347:Extended Unix Code / EUC 5038:-15 (New Western Europe) 4834:Early telecommunications 3734:Brahmic or Indic scripts 3671:) in the name of singer 2735:Meanwhile, in the West, 1654:"Árvíztűrő tükörfúrógép" 988:North Germanic languages 644:extended file attributes 269:together with the data. 6735:C0 and C1 control codes 4651:"Why Unicode is Needed" 4509:marathi.indiatyping.com 4460:Conversion map between 4183:version control systems 3363:(Windows machines) and 3337:§æì¢Ü••™û°éù†äõ‰∫õ 3309:����������������������� 3281:�¤¾��¢��¥¥ª��¡�����¾ä�� 3253:đ¤¾“đ¢†¥đ¥ªžđ¡Žđ Š›äº› 2674:Kod Obmena Informatsiey 2648:[krɐkɐˈzʲæbrɪ̈] 1686:árvíztűrő tükörfúrógép 887:on the encoding of the 263:communication protocols 4983:-3 (Maltese/Esperanto) 4934:World System Teletext 3127: 3037:manually if required. 2805:shkola russkogo yazyka 2713:shkola russkogo yazyka 2680:Код Обмена Информацией 2622: 1684:ÁRVÍZTŰRŐ TÜKÖRFÚRÓGÉP 1350: 1312: 1304: 1296: 1285: 1277: 1269: 1256: 165:This article contains 159: 140: 6757:Whitespace characters 6434:Ventura International 4590:"Battle of the fonts" 4169:Replacement character 3841:ad hoc font encodings 3810:until the release of 3607:simplified characters 3113:does not support it. 2986:–Ъ—А–∞–Ї–Њ–Ј—П–±—А—Л 2976:–ö—Ä–∞–∫–æ–∑—è–±—Ä—ã 2968:Кракозябры 2960:лџЛђл░л║лЙлиЛЈл▒ЛђЛІ 2936:Кракозябры 986:The alphabets of the 705:that were in reality 217:replacement character 146: 124: 6152:Norwegian and Danish 4792:at Wikimedia Commons 4710:Facebook Engineering 4682:. Unicode Consortium 4563:on 30 September 2019 4531:. 
Msdn.microsoft.com 4216:cross-site scripting 4174:Substitute character 3911:was created for the 3891:, which employs the 3717:copyright symbol "©" 3695:) in ex-PRC Premier 3691:'s lack of the "镕" ( 3123:East Asian languages 2728:(KOI8-RU), and even 2707:Школа русского языка 2695:encoding that is an 1146:à, è, é, ì, ò, ù in 926:£ 201:[mod͡ʑibake] 43:improve this article 6712:Unified Hangul Code 6384:PostScript Standard 6107:Multinational (MCS) 4978:-2 (Central Europe) 4973:-1 (Western Europe) 4827:Character encodings 4747:on 24 December 2019 3494:Traditional Chinese 3389:このメールは皆様へのメッセージです。 3371: 3187: 3088:Caucasian languages 2856: 2774:Most recently, the 2610:hardware code pages 2570:character encodings 1484: 1358: 1331:Ole Gunnar Solskjær 891:. For example, the 852:Microsoft AppLocale 815:Applications using 757:in the file, as an 348: 288:mostly switched to 286:Linux distributions 6843:Character encoding 6793:Hardware code page 6553:typesetting system 6389:PostScript Latin 1 6045:Cyrillic + Finnish 5952:Windows code pages 5834:IBM AIX code pages 5162:National standards 5093:Ukrainian Cyrillic 4484:The New York Times 4238:Bush hid the facts 4232:" 3901:Mwangwego alphabet 3783:, or आर्या, IAST: 3485:Simplified Chinese 3370: 3186: 3171:(from Chinese 乱码, 3139:Bush hid the facts 2993:Yugoslav languages 2949:non-breaking space 2854: 2778:encoding includes 2763:added support for 2665:Russian Federation 1591:IBM/CP037 (EBCDIC) 1483: 1432:IBM/CP037 (EBCDIC) 1357: 1344:misinterpreted as 949:’ 836:rendering engine's 780:in the file, as a 646:can store this as 607:Underspecification 346: 206:character encoding 167:special characters 160: 141: 6830: 6829: 6783:Charset detection 6722:Control character 6404:Sharp calculators 6275:Casio calculators 6203:Platform specific 6055:Cyrillic + German 6050:Cyrillic + French 5468:Maltese/Esperanto 5104:Bibliographic use 4988:-4 (North Europe) 4920:T.51/ISO/IEC 6937 4878:Baudot and Murray 4788:Media related to 4741:The Myanmar Times 4151: 
4150: 3969:Browser rendering 3925:Manding languages 3909:Mandombe alphabet 3855:African languages 3649:) in the name of 3627:) in the name of 3613: 3612: 3464: 3463: 3342: 3341: 3111:Microsoft Windows 2990: 2989: 2757:Microsoft Windows 2557: 2556: 1667:Hungarian example 1600: 1599: 1475: 1474: 1335:Manchester United 1189:ă, â, î, ș, ț in 1183:à, è, ì, ò, ù in 1177:á, é, í, ó, ú in 800:Microsoft Windows 742:Overspecification 697:email client for 689:Mis-specification 655:charset detection 613:charset detection 604: 603: 347:Mojibake example 294:Microsoft Windows 173:rendering support 119: 118: 111: 93: 6860: 6822: 6821: 6314:DG International 6189:Special Graphics 5990:Extended Latin-8 5388:Central European 5378:Barents Cyrillic 5083:Barents Cyrillic 5053:-12 (Devanagari) 5049:Abandoned parts 4820: 4813: 4806: 4797: 4796: 4787: 4773: 4757: 4756: 4754: 4752: 4743:. Archived from 4732: 4726: 4725: 4719: 4717: 4701: 4695: 4694: 4689: 4687: 4672: 4666: 4665: 4663: 4661: 4647: 4641: 4640: 4635: 4633: 4618: 4609: 4608: 4602: 4600: 4594:Frontier Myanmar 4585: 4576: 4575: 4570: 4568: 4549: 4540: 4539: 4537: 4536: 4525: 4519: 4518: 4516: 4515: 4501: 4495: 4494: 4492: 4490: 4475: 4469: 4459: 4454:. 
Archived from 4442: 4436: 4435: 4428: 4422: 4421: 4410: 4404: 4403: 4392: 4386: 4371: 4365: 4364: 4362: 4360: 4349: 4343: 4342: 4340: 4339: 4328: 4322: 4321: 4319: 4318: 4307: 4301: 4300: 4298: 4296: 4281: 4275: 4274: 4254: 4233: 4229: 4225: 4221: 4002: 4001: 3987: 3986: 3963: 3962: 3893:Osmanya alphabet 3797:Marathi language 3707: 3702: 3683: 3678: 3667:), and the "喆" ( 3665: 3660: 3656: 3643: 3638: 3633:Wang Chien-shien 3589: 3584: 3566: 3559: 3528: 3527: 3500: 3499: 3491: 3490: 3372: 3369: 3338: 3333: 3329: 3325: 3321: 3310: 3305: 3301: 3297: 3293: 3282: 3277: 3273: 3269: 3265: 3254: 3249: 3245: 3241: 3237: 3188: 3185: 3136: 3135: 3130: 2920:Çá ÆÖóÞ¢áñ 2878: 2877: 2857: 2853: 2842: 2841: 2836: 2826: 2825: 2820: 2807: 2801: 2795: 2794: 2715: 2709: 2708: 2682: 2681: 2676: 2675: 2650: 2645: 2644: 2639: 2625: 2552: 2548: 2544: 2540: 2536: 2532: 2528: 2524: 2520: 2515: 2511: 2507: 2503: 2499: 2495: 2491: 2487: 2483: 2462: 2458: 2454: 2450: 2446: 2442: 2438: 2434: 2430: 2425: 2421: 2417: 2413: 2409: 2405: 2401: 2397: 2393: 2376: 2372: 2366: 2362: 2345: 2341: 2337: 2333: 2329: 2325: 2321: 2317: 2312: 2308: 2304: 2300: 2296: 2292: 2288: 2284: 2280: 2267: 2263: 2259: 2255: 2251: 2247: 2243: 2239: 2235: 2230: 2226: 2222: 2218: 2214: 2210: 2206: 2202: 2198: 2182: 2178: 2174: 2170: 2166: 2162: 2158: 2154: 2150: 2145: 2141: 2137: 2133: 2129: 2125: 2121: 2117: 2113: 2097: 2093: 2089: 2085: 2081: 2077: 2073: 2069: 2064: 2060: 2056: 2052: 2048: 2044: 2040: 2036: 2032: 2019: 2015: 2009: 2005: 1972: 1968: 1962: 1958: 1954: 1950: 1946: 1942: 1914: 1910: 1904: 1900: 1896: 1892: 1888: 1884: 1866: 1862: 1858: 1854: 1850: 1846: 1842: 1838: 1834: 1829: 1825: 1821: 1817: 1813: 1809: 1805: 1801: 1797: 1791:Quoted-printable 1774: 1770: 1766: 1762: 1758: 1754: 1750: 1746: 1742: 1738: 1734: 1730: 1726: 1722: 1718: 1714: 1710: 1706: 1695:Quoted-printable 1664: 1619:characters (see 1613:Eastern European 1596: 1586: 1573: 1560: 1547: 1534: 1501: 1488:Romanian example 1485: 1482: 1470: 1466: 
1454: 1450: 1437: 1426: 1422: 1410: 1406: 1375: 1359: 1356: 1353: 1327:Nordic countries 1315: 1307: 1299: 1288: 1280: 1272: 1259: 1227:ISO 8859-15 950: 946: 942: 931: 927: 923: 919: 903: 897: 889:English alphabet 773:attribute of an 772: 768: 764: 649: 628:operating system 618:The encoding of 597: 577: 349: 345: 333:Western European 278:operating system 203: 198: 194: 114: 107: 103: 100: 94: 92: 51: 27: 19: 6868: 6867: 6863: 6862: 6861: 6859: 6858: 6857: 6848:Computer errors 6833: 6832: 6831: 6826: 6812: 6788:Han unification 6761: 6716: 6584: 6545: 6463: 6285:Compucolor 8001 6198: 6194:Technical (TCS) 6117:French Canadian 6088: 6064: 6060:Polytonic Greek 5946: 5828: 5512: 5498:Turkic Cyrillic 5413:Font X (Kermit) 5408:Farsi (Persian) 5360: 5351: 5323: 5157: 5099: 4969:Approved parts 4956: 4829: 4824: 4766: 4761: 4760: 4750: 4748: 4733: 4729: 4715: 4713: 4702: 4698: 4685: 4683: 4674: 4673: 4669: 4659: 4657: 4649: 4648: 4644: 4631: 4629: 4619: 4612: 4598: 4596: 4586: 4579: 4566: 4564: 4557:The Japan Times 4551: 4550: 4543: 4534: 4532: 4527: 4526: 4522: 4513: 4511: 4503: 4502: 4498: 4488: 4486: 4476: 4472: 4446:"PRC GBK (XGB)" 4444: 4443: 4439: 4430: 4429: 4425: 4412: 4411: 4407: 4394: 4393: 4389: 4372: 4368: 4358: 4356: 4351: 4350: 4346: 4337: 4335: 4330: 4329: 4325: 4316: 4314: 4309: 4308: 4304: 4294: 4292: 4282: 4278: 4255: 4251: 4246: 4231: 4227: 4223: 4219: 4193:Byte order mark 4187:data comparison 4160: 3988: 3984: 3975:Target encoding 3972:Source encoding 3961: 3945: 3897:Southern Africa 3889:Somali language 3857: 3820: 3740:, used in such 3730: 3715:'s lack of the 3564: 3557: 3537:Target encoding 3534:Source encoding 3469: 3381:Target encoding 3378:Source encoding 3353: 3347: 3336: 3335: 3331: 3327: 3323: 3319: 3308: 3307: 3303: 3299: 3295: 3291: 3280: 3279: 3275: 3271: 3267: 3263: 3252: 3251: 3247: 3243: 3239: 3235: 3212: 3208: 3197:Target encoding 3194:Source encoding 3153: 3147: 3119: 3117:Asian encodings 3090: 2995: 2945: 2866:Target encoding 2863:Source 
encoding 2831:, it is called 2697:ASCII extension 2632: 2562: 2550: 2546: 2542: 2538: 2534: 2530: 2526: 2522: 2518: 2517: 2513: 2509: 2505: 2501: 2497: 2493: 2489: 2485: 2481: 2460: 2456: 2452: 2448: 2444: 2440: 2436: 2432: 2428: 2427: 2423: 2419: 2415: 2411: 2407: 2403: 2399: 2395: 2391: 2374: 2370: 2368: 2364: 2360: 2343: 2339: 2335: 2331: 2327: 2323: 2319: 2315: 2314: 2310: 2306: 2302: 2298: 2294: 2290: 2286: 2282: 2278: 2265: 2261: 2257: 2253: 2249: 2245: 2241: 2237: 2233: 2232: 2228: 2224: 2220: 2216: 2212: 2208: 2204: 2200: 2196: 2180: 2176: 2172: 2168: 2164: 2160: 2156: 2152: 2148: 2147: 2143: 2139: 2135: 2131: 2127: 2123: 2119: 2115: 2111: 2095: 2091: 2087: 2083: 2079: 2075: 2071: 2067: 2066: 2062: 2058: 2054: 2050: 2046: 2042: 2038: 2034: 2030: 2017: 2013: 2011: 2007: 2003: 1970: 1966: 1964: 1960: 1956: 1952: 1948: 1944: 1940: 1912: 1908: 1906: 1902: 1898: 1894: 1890: 1886: 1882: 1864: 1860: 1856: 1852: 1848: 1844: 1840: 1836: 1832: 1831: 1827: 1823: 1819: 1815: 1811: 1807: 1803: 1799: 1795: 1772: 1768: 1764: 1760: 1756: 1752: 1748: 1744: 1740: 1736: 1732: 1728: 1724: 1720: 1716: 1712: 1708: 1704: 1685: 1673:Target encoding 1670:Source encoding 1662: 1633: 1605: 1594: 1584: 1571: 1558: 1545: 1532: 1509: 1499: 1498: 1494:Target encoding 1491:Source encoding 1480: 1468: 1464: 1460:ISO 8859-1 1452: 1448: 1435: 1424: 1420: 1408: 1404: 1399:ISO 8859-1 1383: 1373: 1372: 1368:Target encoding 1365:Source encoding 1362:Swedish example 1346:ISO 8859-1 1340:An artifact of 1297:Buchstabensalat 1229:. 
Both add the 1211:ISO 8859-1 1200:British English 1185:Scottish Gaelic 1174:no longer used) 984: 962:computers used 948: 944: 940: 935:Similarly, the 929: 925: 921: 917: 914:ISO 8859-1 899: 898:will appear as 895: 879:(–), and 869: 864: 840:trial and error 832:word processors 813: 795: 782:byte order mark 770: 766: 762: 744: 731: 715:quotation marks 703:ISO 8859-1 691: 665:and another in 647: 636:byte order mark 622:is affected by 609: 595: 575: 325:ISO 8859-1 301:writing systems 282:writing systems 254: 196: 182: 181: 180: 171:Without proper 115: 104: 98: 95: 52: 50: 40: 28: 17: 12: 11: 5: 6866: 6856: 6855: 6850: 6845: 6828: 6827: 6824:Character sets 6817: 6814: 6813: 6811: 6810: 6805: 6800: 6795: 6790: 6785: 6780: 6775: 6769: 6767: 6766:Related topics 6763: 6762: 6760: 6759: 6754: 6749: 6748: 6747: 6742: 6732: 6730:Morse prosigns 6726: 6724: 6718: 6717: 6715: 6714: 6709: 6704: 6699: 6694: 6689: 6682: 6681: 6680: 6675: 6670: 6660: 6655: 6650: 6649: 6648: 6643: 6635: 6630: 6625: 6620: 6615: 6614: 6613: 6603: 6598: 6592: 6590: 6586: 6585: 6583: 6582: 6577: 6572: 6567: 6562: 6556: 6554: 6547: 6546: 6544: 6543: 6538: 6533: 6528: 6523: 6518: 6513: 6508: 6503: 6498: 6493: 6488: 6483: 6477: 6475: 6465: 6464: 6462: 6461: 6456: 6451: 6446: 6441: 6436: 6431: 6426: 6424:TI calculators 6421: 6416: 6411: 6406: 6401: 6396: 6391: 6386: 6381: 6376: 6371: 6366: 6361: 6356: 6351: 6346: 6341: 6336: 6331: 6326: 6321: 6316: 6311: 6302: 6297: 6292: 6287: 6282: 6277: 6272: 6267: 6262: 6257: 6252: 6247: 6242: 6237: 6232: 6227: 6222: 6217: 6212: 6206: 6204: 6200: 6199: 6197: 6196: 6191: 6186: 6181: 6176: 6171: 6166: 6165: 6164: 6159: 6154: 6149: 6144: 6139: 6134: 6132:United Kingdom 6129: 6124: 6119: 6109: 6103: 6101: 6090: 6089: 6087: 6086: 6081: 6075: 6073: 6066: 6065: 6063: 6062: 6057: 6052: 6047: 6042: 6037: 6032: 6027: 6022: 6017: 6012: 6007: 6002: 5997: 5992: 5987: 5982: 5977: 5967: 5962: 5956: 5954: 5948: 5947: 5945: 5944: 5939: 5934: 5929: 5924: 5919: 5914: 5909: 5904: 5899: 5894: 
4437: 4423: 4405: 4387: 4366: 4344: 4323: 4302: 4276: 4248: 4247: 4245: 4242: 4241: 4240: 4235: 4228:" 4205: 4190: 4176: 4171: 4166: 4159: 4156: 4149: 4148: 4145: 4139: 4138: 4135: 4129: 4128: 4125: 4119: 4118: 4115: 4109: 4108: 4105: 4099: 4098: 4095: 4089: 4088: 4085: 4079: 4078: 4075: 4069: 4068: 4065: 4059: 4058: 4055: 4049: 4048: 4045: 4039: 4038: 4035: 4029: 4028: 4025: 4019: 4018: 4015: 4010: 4004: 4003: 3995: 3994: 3980: 3979: 3976: 3973: 3970: 3967: 3966:Arabic example 3960: 3957: 3944: 3941: 3865:Horn of Africa 3856: 3853: 3819: 3816: 3765:Knowledge logo 3748:(Hindi-Urdu), 3729: 3726: 3721: 3720: 3710: 3703:; pinyin: 3699:(Chinese: 3686: 3679:; pinyin: 3675:(Chinese: 3661:; pinyin: 3639:; pinyin: 3635:(Chinese: 3611: 3610: 3603: 3600: 3597: 3593: 3592: 3585:; pinyin: 3579: 3576: 3573: 3569: 3568: 3561: 3555: 3552: 3549: 3545: 3544: 3541: 3538: 3535: 3532: 3468: 3465: 3462: 3461: 3458: 3454: 3453: 3450: 3446: 3445: 3442: 3438: 3437: 3434: 3430: 3429: 3426: 3422: 3421: 3418: 3414: 3413: 3410: 3406: 3405: 3402: 3396: 3395: 3391: 3390: 3386: 3385: 3382: 3379: 3376: 3349:Main article: 3346: 3343: 3340: 3339: 3316: 3312: 3311: 3288: 3287:VNI (Windows) 3284: 3283: 3260: 3256: 3255: 3232: 3229: 3225: 3224: 3202: 3201: 3198: 3195: 3192: 3149:Main article: 3146: 3143: 3118: 3115: 3089: 3086: 3064:and partially 3012:language) and 3010:Serbo-Croatian 2994: 2991: 2988: 2987: 2984: 2978: 2977: 2974: 2970: 2969: 2966: 2962: 2961: 2958: 2954: 2953: 2942: 2938: 2937: 2934: 2930: 2929: 2926: 2922: 2921: 2918: 2912: 2911: 2908: 2904: 2903: 2900: 2895: 2891: 2890: 2887: 2884: 2880: 2879: 2871: 2870: 2867: 2864: 2861: 2761:Code Page 1251 2720:), Ukrainian ( 2631: 2628: 2586:Windows CP1250 2561: 2558: 2555: 2554: 2479: 2473: 2472: 2464: 2389: 2383: 2382: 2378: 2357: 2351: 2350: 2347: 2276: 2270: 2269: 2194: 2188: 2187: 2184: 2109: 2103: 2102: 2099: 2028: 2022: 2021: 2000: 1994: 1993: 1974: 1938: 1932: 1931: 1916: 1880: 1875: 1869: 1868: 1793: 1784: 1783: 1776: 1702: 1697: 
1688: 1687: 1681: 1680: 1677: 1674: 1671: 1668: 1661: 1658: 1632: 1629: 1604: 1601: 1598: 1597: 1592: 1588: 1587: 1581: 1575: 1574: 1568: 1562: 1561: 1555: 1549: 1548: 1542: 1536: 1535: 1529: 1523: 1522: 1516: 1515: 1503: 1502: 1495: 1492: 1489: 1473: 1472: 1461: 1457: 1456: 1445: 1439: 1438: 1433: 1429: 1428: 1417: 1413: 1412: 1401: 1396: 1390: 1389: 1377: 1376: 1369: 1366: 1363: 1204: 1203: 1193: 1187: 1181: 1175: 1164: 1150: 1144: 1138: 1130:, í, ó, ú, ý, 1124: 1098: 1084: 1074: 1056: 1050: 1024:Latin alphabet 983: 980: 868: 865: 863: 860: 812: 809: 794: 791: 790: 789: 778: 755: 743: 740: 730: 729:User oversight 727: 690: 687: 608: 605: 602: 601: 598: 593: 590: 587: 584: 581: 578: 573: 570: 567: 564: 561: 557: 556: 553: 550: 547: 544: 541: 538: 534: 533: 530: 527: 524: 521: 518: 515: 512: 508: 507: 504: 501: 498: 495: 492: 489: 486: 483: 480: 477: 474: 471: 467: 466: 463: 460: 457: 454: 451: 448: 445: 442: 438: 437: 434: 431: 428: 425: 421: 420: 417: 414: 411: 408: 405: 402: 399: 395: 394: 391: 388: 385: 382: 379: 376: 373: 370: 366: 365: 362: 359: 356: 353: 352:Original text 253: 250: 210:writing system 175:, you may see 163: 162: 161: 117: 116: 31: 29: 22: 15: 9: 6: 4: 3: 2: 6865: 6854: 6851: 6849: 6846: 6844: 6841: 6840: 6838: 6825: 6815: 6809: 6806: 6804: 6801: 6799: 6796: 6794: 6791: 6789: 6786: 6784: 6781: 6779: 6776: 6774: 6771: 6770: 6768: 6764: 6758: 6755: 6753: 6750: 6746: 6743: 6741: 6738: 6737: 6736: 6733: 6731: 6728: 6727: 6725: 6723: 6719: 6713: 6710: 6708: 6705: 6703: 6700: 6698: 6695: 6693: 6690: 6688: 6687: 6683: 6679: 6676: 6674: 6671: 6669: 6666: 6665: 6664: 6661: 6659: 6656: 6654: 6651: 6647: 6644: 6642: 6639: 6638: 6636: 6634: 6631: 6629: 6626: 6624: 6621: 6619: 6616: 6612: 6609: 6608: 6607: 6604: 6602: 6599: 6597: 6594: 6593: 6591: 6587: 6581: 6578: 6576: 6573: 6571: 6568: 6566: 6563: 6561: 6558: 6557: 6555: 6552: 6548: 6542: 6539: 6537: 6534: 6532: 6529: 6527: 6524: 6522: 6519: 6517: 6514: 6512: 6509: 6507: 6504: 6502: 6499: 6497: 6494: 
4816: 4814: 4809: 4807: 4802: 4801: 4798: 4791: 4786: 4782: 4780:at Wiktionary 4779: 4778: 4772: 4768: 4767: 4746: 4742: 4738: 4731: 4724: 4711: 4707: 4700: 4693: 4681: 4677: 4671: 4656: 4652: 4646: 4639: 4628: 4627:Rising Voices 4624: 4617: 4615: 4607: 4595: 4591: 4584: 4582: 4574: 4562: 4558: 4554: 4548: 4546: 4530: 4524: 4510: 4506: 4500: 4485: 4481: 4474: 4467: 4463: 4462:Code page 936 4457: 4453: 4452: 4447: 4441: 4433: 4427: 4419: 4415: 4409: 4401: 4397: 4391: 4384: 4383:1-59921-039-8 4380: 4376: 4370: 4354: 4348: 4333: 4327: 4312: 4306: 4291: 4287: 4280: 4272: 4268: 4264: 4260: 4259:IEEE Spectrum 4253: 4249: 4239: 4236: 4217: 4213: 4209: 4208:HTML entities 4206: 4203: 4198: 4194: 4191: 4188: 4184: 4180: 4177: 4175: 4172: 4170: 4167: 4165: 4162: 4161: 4155: 4146: 4144: 4141: 4140: 4136: 4134: 4130: 4126: 4124: 4121: 4120: 4116: 4114: 4110: 4106: 4104: 4101: 4100: 4096: 4094: 4091: 4090: 4086: 4084: 4081: 4080: 4076: 4074: 4071: 4070: 4066: 4064: 4061: 4060: 4056: 4054: 4051: 4050: 4046: 4044: 4041: 4040: 4036: 4034: 4031: 4030: 4026: 4024: 4021: 4020: 4016: 4014: 4011: 4009: 4005: 3996: 3992: 3981: 3977: 3974: 3971: 3968: 3965: 3964: 3956: 3954: 3950: 3940: 3938: 3934: 3933:Vai syllabary 3930: 3926: 3922: 3921:N'Ko alphabet 3918: 3914: 3910: 3906: 3902: 3898: 3894: 3890: 3886: 3882: 3878: 3874: 3870: 3866: 3862: 3852: 3849: 3846:Due to these 3844: 3842: 3838: 3834: 3830: 3826: 3815: 3813: 3809: 3805: 3800: 3798: 3794: 3790: 3786: 3782: 3776: 3774: 3770: 3766: 3761: 3759: 3755: 3751: 3747: 3743: 3739: 3735: 3725: 3718: 3714: 3711: 3708: 3698: 3694: 3690: 3687: 3684: 3674: 3670: 3666: 3652: 3648: 3644: 3642:Wáng Jiànxuān 3634: 3630: 3626: 3622: 3618: 3617: 3616: 3608: 3604: 3601: 3598: 3595: 3594: 3590: 3580: 3577: 3574: 3571: 3570: 3562: 3556: 3550: 3547: 3546: 3542: 3539: 3536: 3533: 3531:Original text 3530: 3529: 3526: 3524: 3519: 3517: 3513: 3509: 3504: 3495: 3486: 3482: 3478: 3474: 3459: 3456: 3455: 3451: 3448: 3447: 3443: 3439: 3435: 3432: 3431: 3427: 
3424: 3423: 3419: 3416: 3415: 3411: 3408: 3407: 3403: 3401: 3398: 3397: 3392: 3387: 3383: 3380: 3377: 3375:Original text 3374: 3373: 3368: 3366: 3362: 3358: 3352: 3317: 3314: 3313: 3289: 3286: 3285: 3261: 3258: 3257: 3233: 3231:Windows-1258 3230: 3226: 3222: 3218: 3217: 3211: 3207: 3203: 3199: 3196: 3193: 3190: 3189: 3184: 3182: 3178: 3174: 3170: 3166: 3162: 3158: 3152: 3142: 3140: 3129: 3124: 3114: 3112: 3107: 3103: 3099: 3095: 3085: 3083: 3078: 3073: 3071: 3067: 3063: 3058: 3055: 3051: 3050:differentiate 3048:The drive to 3046: 3043: 3038: 3034: 3030: 3028: 3024: 3020: 3015: 3011: 3007: 3003: 2999: 2985: 2983: 2980: 2979: 2975: 2972: 2971: 2967: 2964: 2963: 2959: 2956: 2955: 2952: 2950: 2943: 2940: 2939: 2935: 2931: 2927: 2924: 2923: 2919: 2917: 2914: 2913: 2909: 2905: 2901: 2899: 2896: 2892: 2888: 2885: 2882: 2881: 2872: 2868: 2865: 2862: 2860:Original text 2859: 2858: 2852: 2850: 2846: 2835: 2830: 2819: 2813: 2809: 2806: 2800: 2788: 2783: 2781: 2777: 2772: 2770: 2766: 2762: 2758: 2754: 2750: 2746: 2742: 2738: 2737:Code page 866 2733: 2731: 2727: 2723: 2719: 2714: 2703: 2698: 2694: 2690: 2686: 2670: 2669:KOI encodings 2666: 2662: 2658: 2654: 2649: 2638: 2627: 2624: 2618: 2613: 2611: 2608:) to provide 2607: 2603: 2599: 2595: 2591: 2587: 2583: 2579: 2575: 2571: 2567: 2480: 2478: 2475: 2474: 2470: 2390: 2388: 2379: 2377:tükörfúrógép 2358: 2356: 2352: 2348: 2277: 2275: 2272: 2271: 2195: 2193: 2189: 2110: 2108: 2104: 2100: 2029: 2027: 2024: 2023: 2020:tükörfúrógép 2001: 1999: 1996: 1995: 1991: 1987: 1983: 1982:code page 852 1979: 1973:tükörfúrógép 1939: 1937: 1933: 1929: 1925: 1921: 1917: 1915:tükörfúrógép 1881: 1879: 1874: 1871: 1870: 1794: 1792: 1789: 1786: 1785: 1781: 1703: 1701: 1696: 1693: 1690: 1689: 1682: 1678: 1672: 1669: 1666: 1665: 1657: 1655: 1650: 1646: 1642: 1638: 1628: 1626: 1622: 1618: 1614: 1610: 1593: 1590: 1589: 1582: 1580: 1577: 1576: 1569: 1567: 1564: 1563: 1556: 1554: 1551: 1550: 1543: 1541: 1538: 1537: 1530: 1528: 1525: 1524: 1521: 1517: 1513: 
1508: 1504: 1496: 1493: 1490: 1487: 1486: 1481: 1478: 1462: 1459: 1458: 1446: 1444: 1440: 1434: 1431: 1430: 1418: 1414: 1402: 1400: 1395: 1392: 1391: 1387: 1386:open sandwich 1382: 1378: 1370: 1367: 1364: 1361: 1360: 1355: 1352: 1347: 1343: 1338: 1336: 1332: 1328: 1324: 1318: 1316: 1314: 1313:desformatação 1308: 1306: 1300: 1298: 1291: 1289: 1287: 1281: 1279: 1273: 1271: 1264: 1260: 1258: 1251: 1248: 1244: 1240: 1236: 1235:currency sign 1232: 1228: 1224: 1220: 1216: 1212: 1207: 1201: 1197: 1194: 1192: 1188: 1186: 1182: 1180: 1176: 1173: 1169: 1165: 1163: 1159: 1155: 1151: 1149: 1145: 1143: 1139: 1137: 1133: 1129: 1125: 1123: 1119: 1115: 1111: 1107: 1103: 1099: 1097: 1093: 1089: 1085: 1083: 1080:, è, ë, ï in 1079: 1075: 1072: 1068: 1064: 1060: 1057: 1055: 1051: 1048: 1044: 1040: 1036: 1032: 1029: 1028: 1027: 1025: 1021: 1017: 1013: 1009: 1005: 1001: 997: 993: 989: 979: 977: 973: 969: 965: 961: 957: 952: 951:, and so on. 938: 933: 932:, and so on. 915: 911: 907: 902: 894: 890: 886: 882: 878: 874: 859: 857: 853: 849: 843: 841: 837: 833: 829: 824: 822: 818: 808: 805: 801: 787: 783: 779: 776: 760: 759:HTML meta tag 756: 753: 752: 751: 749: 739: 736: 726: 724: 720: 716: 712: 708: 704: 700: 696: 686: 684: 680: 676: 672: 668: 664: 660: 656: 651: 645: 641: 637: 633: 629: 625: 621: 616: 614: 559: 558: 536: 535: 510: 509: 469: 468: 440: 439: 423: 422: 397: 396: 368: 367: 351: 350: 344: 342: 338: 334: 330: 326: 322: 318: 314: 310: 306: 302: 297: 295: 291: 287: 283: 279: 275: 270: 268: 264: 258: 249: 247: 243: 240:displayed in 239: 233: 231: 227: 222: 218: 213: 211: 207: 202: 190: 186: 178: 174: 170: 168: 158: 154: 150: 145: 139: 135: 134: 128: 123: 113: 110: 102: 91: 88: 84: 81: 77: 74: 70: 67: 63: 60: –  59: 55: 54:Find sources: 48: 44: 38: 37: 32:This article 30: 26: 21: 20: 6802: 6740:ISO/IEC 6429 6697:Stanford/ITS 6684: 6618:ARIB STD-B24 6399:Sega SC-3000 6300:DEC RADIX 50 5337:ISO/IEC 8859 5329:ISO/IEC 2022 5074:Adaptations 5033:-14 (Celtic) 5028:-13 (Baltic) 5018:-10 
(Nordic) 5013:-9 (Turkish) 4962:ISO/IEC 8859 4776: 4749:. Retrieved 4745:the original 4740: 4730: 4721: 4714:. Retrieved 4709: 4699: 4691: 4684:. Retrieved 4679: 4670: 4658:. Retrieved 4654: 4645: 4637: 4630:. Retrieved 4626: 4604: 4597:. Retrieved 4593: 4572: 4565:. Retrieved 4561:the original 4556: 4533:. Retrieved 4523: 4512:. Retrieved 4508: 4499: 4487:. Retrieved 4483: 4473: 4456:the original 4449: 4440: 4426: 4417: 4408: 4399: 4390: 4374: 4369: 4357:. Retrieved 4347: 4336:. Retrieved 4334:. 2001-05-13 4326: 4315:. Retrieved 4313:. 2013-05-17 4305: 4293:. Retrieved 4290:Ars Technica 4289: 4279: 4262: 4258: 4252: 4202:interpreters 4152: 4143:Windows-1252 4133:Windows-1256 4053:Windows-1256 4043:Windows-1252 4033:Windows-1251 4023:Windows-1250 3946: 3869:Ge'ez script 3867:such as the 3858: 3847: 3845: 3840: 3829:Unicode font 3821: 3801: 3792: 3788: 3784: 3780: 3777: 3762: 3731: 3722: 3704: 3692: 3680: 3668: 3662: 3646: 3645:), the "堃" ( 3640: 3624: 3614: 3586: 3522: 3520: 3476: 3470: 3441:Windows-1252 3354: 3214: 3209: 3205: 3180: 3177:Windows-1258 3172: 3168: 3160: 3154: 3120: 3091: 3074: 3059: 3047: 3042:Windows-1252 3039: 3035: 3031: 3027:Windows-1252 3023:Windows-1250 2996: 2982:Mac Cyrillic 2965:Windows-1251 2946: 2925:Windows-1251 2907:Windows-1252 2898:Windows-1251 2883:Windows-1251 2849:MIK encoding 2843:), meaning " 2814: 2810: 2784: 2773: 2734: 2661:Soviet Union 2633: 2614: 2563: 2367:TÜKÖRFÚRÓGÉP 2355:Windows-1252 2192:Windows-1250 2026:Windows-1250 2010:TÜKÖRFÚRÓGÉP 1640: 1634: 1621:ISO/IEC 8859 1606: 1479: 1476: 1339: 1319: 1310: 1302: 1294: 1292: 1283: 1275: 1267: 1262: 1254: 1252: 1223:Windows-1252 1218: 1214: 1208: 1205: 985: 953: 934: 881:curly quotes 870: 844: 828:web browsers 825: 814: 796: 745: 732: 707:Windows-1252 692: 671:HTTP headers 652: 648:user.charset 617: 610: 332: 328: 321:Windows-1252 308: 298: 271: 259: 255: 245: 234: 214: 184: 183: 164: 138:Windows-1252 132: 105: 96: 86: 79: 72: 65: 53: 41:Please help 36:verification 33: 6459:ZX 
Spectrum 6414:Sinclair QL 6250:Amstrad CPC 6169:8-bit Greek 6096:terminals ( 5809:Iran System 5361:("scripts") 5008:-8 (Hebrew) 4998:-6 (Arabic) 4895:ISO/IEC 646 4751:24 December 4716:25 December 4686:24 December 4632:24 December 4599:24 December 4567:24 December 4418:w3techs.com 4400:w3techs.com 4195:– The most 3923:, used for 3917:West Africa 3879:, used for 3859:In certain 3825:Zawgyi font 3793:karaṇārā/rī 3651:Yu Shyi-kun 3631:politician 3602:叼力捞钙胶 抛农聪墨 3596:디제이맥스 테크니카 3216:Truyện Kiều 3210:𤾓𢆥𥪞𡎝𠊛些 3082:single-byte 3054:Montenegrin 2928:Êðàêîçÿáðû 2910:ëÒÁËÏÚÑÂÒÙ 2902:лТБЛПЪСВТЩ 2889:йПЮЙНГЪАПШ 2780:code points 2687:, based on 2643:кракозя́бры 2637:krakozyabry 1700:7-bit ASCII 1679:Occurrence 1617:diacritical 1595:äÁ>ÍHrDc 1351:Ring meg nå 1305:deformación 1293:In German, 1156:, ó, ú, ü, 875:(—), 811:Resolutions 677:document's 659:web browser 640:source code 242:hexadecimal 6837:Categories 6745:JIS X 0211 6653:ISO-IR-169 6506:UTF-EBCDIC 6072:code pages 5799:CSX+ Indic 5403:Devanagari 5358:Code pages 5279:LST 1590-4 5249:JIS X 0213 5244:JIS X 0212 5239:JIS X 0208 5234:JIS X 0201 5199:GOST 10859 5121:CCCII/EACC 5023:-11 (Thai) 5003:-7 (Greek) 4938:background 4861:Wabun/Kana 4723:languages. 4712:. Facebook 4660:31 October 4535:2014-02-05 4514:2022-08-02 4353:"sms-scam" 4338:2014-11-01 4317:2015-02-15 4244:References 4234:and so on. 4224:" 4164:Code point 4123:Mac Arabic 4103:Mac Arabic 4073:ISO 8859-6 4063:ISO 8859-5 3935:, used in 3931:, and the 3833:codepoints 3808:Windows XP 3769:Devanagari 3746:Hindustani 3738:South Asia 3728:Indic text 3706:Zhū Róngjī 3697:Zhu Rongji 3588:dānrénpáng 3575:Shift-JIS 3433:ISO 8859-6 3315:Mac Roman 3157:Vietnamese 3145:Vietnamese 3062:Macedonian 2957:MS-DOS 855 2916:MS-DOS 855 2876:Кракозябры 2799:biblioteka 2793:Библиотека 2745:Belarusian 2739:supported 2732:(KOI8-T). 
2726:Belarusian 2667:developed 2663:and early 2617:ISO 8859-2 2566:ISO 8859-2 1788:ISO 8859-2 1641:betűszemét 1540:ISO 8859-2 1394:MS-DOS 437 1168:Portuguese 1134:, æ, ö in 1120:, æ, ø in 1016:Portuguese 972:mainframes 893:pound sign 856:Windows 98 848:Windows XP 763:http-equiv 748:web server 717:and extra 638:, but for 620:text files 303:, such as 238:code point 99:March 2023 69:newspapers 58:"Mojibake" 6798:MICR code 6633:IEC-P27-1 6611:ISO-IR-68 6516:DIN 91379 6394:SAM Coupé 6329:GSM 03.38 6319:Galaksija 5814:Kamenický 5794:CSX Indic 5503:Ukrainian 5289:Shift JIS 5269:KS X 1002 5264:KS X 1001 5189:DIN 66003 5184:CNS 11643 4952:Transcode 4930:ITU T.101 4856:Non-Latin 4451:Microsoft 4295:5 October 4265:(7): 60. 4113:Mac Roman 3795:, in the 3789:karaṇāryā 3673:David Tao 3664:Yóu Xíkūn 3629:Taiwanese 3457:Shift-JIS 3425:Mac Roman 3417:Shift-JIS 3361:Shift-JIS 3326:m trong c 3298:m trong c 3270:m trong c 3242:m trong c 3221:Nguyễn Du 3077:code page 3014:Slovenian 2973:Mac Roman 2824:маймуница 2818:majmunica 2787:gibberish 2749:Bulgarian 2741:Ukrainian 2477:Mac Roman 2107:Mac Roman 1637:Hungarian 1631:Hungarian 1607:Users of 1566:Shift-JIS 1443:Mac Roman 1436:ë_C¶ÊÅCvË 1231:Euro sign 1152:á, é, í, 1136:Icelandic 1076:á, é, ó, 1067:Norwegian 956:Commodore 922:£ 877:en dashes 873:em dashes 769:) or the 679:meta tags 667:Shift-JIS 632:localized 532:� 299:For some 292:in 2004, 151:-encoded 129:-encoded 6853:Nonsense 6803:Mojibake 6658:ISO 2033 6623:Fieldata 6601:ASMO 449 6511:GB 18030 6471: / 6419:Teletext 6409:Sharp MZ 6339:HP FOCAL 6334:HP Roman 6265:Atari ST 6255:Apple II 5789:CS Indic 5483:Romanian 5458:Keyboard 5438:Gurmukhi 5433:Gujarati 5423:Georgian 5398:Cyrillic 5393:Croatian 5368:Armenian 5274:LST 1564 5259:KPS 9566 5219:GB 18030 5214:GB 12052 5209:GB 12345 5194:ELOT 927 5128:ISO 5426 5088:Estonian 4925:ITU T.61 4915:Teletext 4911:Videotex 4885:Fieldata 4871:Cyrillic 4790:Mojibake 4777:mojibake 4489:July 17, 4466:GB 18030 4373:p. 
141, 4359:June 19, 4222:becomes 4158:See also 3959:Examples 3907:and the 3873:Ethiopia 3773:modifier 3578:暥帤壔偗僥僗僩 3572:文字化けテスト 3345:Japanese 3102:Armenian 3098:Georgian 2998:Croatian 2855:Example 2702:8BITMIME 2657:Cyrillic 2623:krzaczki 2606:Hercules 2572:such as 1660:Examples 1191:Romanian 996:Romanian 974:use the 945:’ 821:US-ASCII 771:encoding 517:� 309:mojibake 305:Japanese 267:metadata 189:Japanese 185:Mojibake 133:Mojibake 6692:SEASCII 6686:Mojikyō 6673:KOI8-RU 6596:ABICOMP 6469:Unicode 6379:PETSCII 6369:NEC APC 6305:DEC MCS 6260:ATASCII 6157:Swedish 6142:Finnish 6127:Spanish 5819:Mazovia 5784:ABICOMP 5493:Turkish 5448:Iceland 5356:Mac OS 5299:TIS-620 5204:GB 2312 5179:BraSCII 5169:ArmSCII 4907:Teletex 4866:Chinese 4197:in-band 4179:Newline 3978:Result 3937:Liberia 3881:Amharic 3877:Eritrea 3837:Unicode 3818:Burmese 3758:Marathi 3754:Punjabi 3750:Bengali 3689:GB 2312 3682:Táo Zhé 3605:Random 3599:EUC-KR 3565:GB 2312 3560:T瓣в变巨肚 3548:三國志曹操傳 3516:Guobiao 3508:Unicode 3477:Luàn mǎ 3473:Chinese 3467:Chinese 3384:Result 3200:Result 3191:Example 3169:loạn mã 3165:Hán–Nôm 3106:ArmSCII 3066:Serbian 3019:Latin-2 3006:Serbian 3002:Bosnian 2869:Result 2829:Serbian 2776:Unicode 2765:Serbian 2755:. For 2659:. 
The 2653:Russian 2582:Mazovia 2574:AmigaPL 1847: t 1810: T 1609:Central 1579:TIS-620 1553:OEM 737 1381:Smörgås 1323:umlauts 1286:þjóðlöð 1247:Windows 1219:Western 1215:Latin 1 1162:Spanish 1148:Italian 1122:Faroese 1065:, å in 1054:Catalan 1047:Swedish 1043:Finnish 1020:Spanish 1012:Italian 1000:Finnish 992:Catalan 964:PETSCII 867:English 804:Palm OS 786:Unicode 767:charset 699:Windows 329:Western 274:Unicode 83:scholar 6702:Symbol 6678:KOI8-U 6668:KOI8-R 6536:TACE16 6526:CESU-8 6521:BOCU-1 6501:UTF-32 6496:UTF-16 6439:WISCII 6429:TRS-80 6349:SQUOZE 6344:HP RPL 6184:Hebrew 6179:SI 960 6147:French 6070:EBCDIC 5960:CER-GS 5443:Hebrew 5418:Gaelic 5383:Celtic 5373:Arabic 5319:YUSCII 5309:VISCII 5294:SI 960 5284:PASCII 5133:5426-2 5111:MARC-8 4846:Needle 4381:  4212:escape 4093:CP 866 4083:CP 852 4013:KOI8-R 3949:Arabic 3943:Arabic 3929:Guinea 3905:Malawi 3899:, the 3848:ad hoc 3709:), and 3540:Result 3523:luànmǎ 3514:, and 3481:Pinyin 3449:EUC-JP 3409:EUC-JP 3394:UTF-8 3365:EUC-JP 3259:TCVN3 3228:UTF-8 3181:chữ ma 3173:luànmǎ 3161:chữ ma 3128:kärlek 2941:KOI8-R 2894:KOI8-R 2886:KOI8-R 2753:MS-DOS 2722:KOI8-U 2718:KOI8-R 2594:EPROMs 2560:Polish 2369:árvízt 2359:ÁRVÍZT 2274:CP 852 2068:  2012:árvízt 2002:ÁRVÍZT 1998:CP 850 1990:CP 850 1986:CP 437 1965:árvízt 1955:TÜKÖRF 1936:CP 852 1928:CP 850 1924:CP 437 1907:árvízt 1897:TÜKÖRF 1878:CP 437 1773:=C3=A9 1769:=C3=B3 1765:=C3=BA 1761:=C3=B6 1757:=C3=BC 1753:=C5=91 1749:=C5=B1 1745:=C3=AD 1741:=C3=A1 1737:=C3=89 1733:=C3=93 1729:=C3=9A 1725:=C3=96 1721:=C3=9C 1717:=C5=90 1713:=C5=B0 1709:=C3=8D 1705:=C3=81 1676:Result 1507:Cenușă 1497:Result 1371:Result 1257:kärlek 1142:French 1126:á, ð, 1096:German 1090:, and 1086:ä, ö, 1071:Danish 1008:German 1004:French 976:EBCDIC 958:brand 910:CP1252 719:dashes 695:Eudora 663:EUC-JP 624:locale 317:MS-932 313:EUC-JP 276:among 252:Causes 230:UTF-16 221:binary 157:KOI8-R 85:  78:  71:  64:  56:  6773:CCSID 6646:8-bit 6641:7-bit 6637:INIS 6491:UTF-8 6486:UTF-7 6481:UTF-1 6359:LMBCS 
6295:CP/M+ 6137:Dutch 6122:Swiss 5804:CWI-2 5508:VT100 5478:Roman 5473:Ogham 5453:Inuit 5428:Greek 5314:VSCII 5304:TSCII 5254:KOI-7 5229:ISCII 5224:HKSCS 5116:ANSEL 5078:Welsh 4902:BCDIC 4890:ASCII 4851:Morse 4008:UTF-8 3953:below 3951:(see 3895:. In 3885:Tigre 3812:Vista 3781:kārya 3551:Big5 3543:Note 3400:UTF-7 3357:Japan 3332:∆∞·ªù 3248:ườ 2933:UTF-8 2845:trash 2840:ђубре 2834:đubre 2730:Tajik 2689:ASCII 2651:) in 2604:, or 2578:CP852 2469:UTF-8 2387:UTF-8 1920:CWI-2 1873:CWI-2 1692:UTF-8 1625:KOI-8 1527:ASCII 1520:UTF-8 1416:UTF-8 1342:UTF-8 1278:hääyö 1270:hääyö 1239:UTF-8 1179:Irish 1082:Dutch 968:ASCII 960:8-bit 906:UTF-8 885:ASCII 817:UTF-8 735:ASCII 657:). A 337:UTF-8 290:UTF-8 246:valid 226:UTF-8 149:UTF-8 127:UTF-8 90:JSTOR 76:books 6707:TRON 6560:Cork 6531:SCSU 6454:ZX81 6449:ZX80 6444:XCCS 6374:NeXT 6354:LICS 6309:NRCS 6270:BICS 6240:1058 6235:1057 6230:1056 6225:1055 6220:1054 6215:1053 6210:1052 6084:DKOI 6040:1270 6035:1258 6030:1257 6025:1256 6020:1255 6015:1254 6010:1253 6005:1252 6000:1251 5995:1250 5985:1169 5942:1133 5937:1124 5932:1046 5927:1019 5922:1018 5917:1017 5912:1016 5907:1015 5902:1014 5897:1013 5892:1012 5887:1010 5882:1009 5877:1008 5872:1006 5779:3846 5774:1127 5769:1118 5764:1117 5759:1116 5754:1115 5749:1098 5744:1044 5739:1043 5734:1042 5729:1040 5724:1034 5488:Sámi 5174:Big5 5153:6862 5148:6438 5143:5428 5138:5427 5068:Sámi 4943:sets 4909:and 4753:2019 4718:2019 4688:2019 4662:2013 4634:2019 4601:2019 4569:2019 4491:2009 4379:ISBN 4361:2014 4297:2018 4185:and 3875:and 3785:āryā 3693:róng 3625:xuān 3621:Big5 3619:The 3512:Big5 3334:i ta 3330:i ng 3306:i ta 3302:i ng 3278:i ta 3274:i ng 3250:i ta 3246:i ng 3100:and 3040:The 3021:and 2767:and 2743:and 2693:KOI8 2685:KOI7 2584:and 2322:ztűr 2074:ztűr 1918:The 1647:and 1623:and 1611:and 1585:ศ™ฤƒ 1583:Cenu 1570:Cenu 1559:╚β─Δ 1557:Cenu 1546:șă 1544:Cenu 1533:șă 1531:Cenu 1245:and 1243:UNIX 1158:¡, ¿ 1069:and 1045:and 1018:and 918:£ 830:and 802:and 723:Unix 675:HTML 228:and 197:IPA: 
193:文字化け 147:The 125:The 62:news 6663:KOI 6580:OT1 6575:OMS 6570:OML 6565:LY1 6551:TeX 6364:MSX 6324:GEM 6280:CDC 6098:VTx 6094:DEC 5980:950 5974:GBK 5970:936 5965:932 5867:922 5862:921 5857:915 5852:912 5847:896 5842:895 5824:MIK 5719:951 5714:950 5709:949 5704:942 5699:936 5694:932 5689:904 5684:903 5679:899 5674:897 5669:869 5664:868 5659:867 5654:866 5649:865 5644:864 5639:863 5634:862 5629:861 5624:860 5619:859 5614:858 5609:857 5604:856 5599:855 5594:853 5589:852 5584:851 5579:850 5574:778 5569:777 5564:776 5559:775 5554:773 5549:770 5544:737 5539:720 5534:708 5529:668 5524:437 4267:doi 3927:in 3871:in 3804:Lao 3744:as 3736:of 3713:GBK 3701:朱镕基 3669:zhé 3659:游錫堃 3655:游锡堃 3647:kūn 3637:王建煊 3583:單人旁 3554:GB 3471:In 3355:In 3322:m n 3304:öôø 3294:m n 3266:m n 3238:m n 3155:In 2796:" ( 2751:in 2724:), 2710:" ( 2602:EGA 2598:CGA 2590:DOS 1988:or 1978:DOS 1963:GÉP 1926:or 1905:GÉP 1865:=E9 1861:=F3 1857:=FA 1853:=F6 1849:=FC 1845:=F5 1841:=FB 1837:=ED 1833:=E1 1828:=C9 1824:=D3 1820:=DA 1816:=D6 1812:=DC 1808:=D5 1804:=DB 1800:=CD 1796:=C1 1780:SMS 1635:In 1572:ネ卞 1512:ash 1348:, " 1263:für 1217:or 1198:in 1160:in 1100:á, 1094:in 1041:in 941:’ 912:or 775:XML 765:or 596:HOP 576:SHY 506:91 393:B1 341:GBK 331:or 323:or 232:). 45:by 6839:: 6628:HZ 4739:. 4720:. 4708:. 4690:. 4678:. 4653:. 4636:. 4625:. 4613:^ 4603:. 4592:. 4580:^ 4571:. 4555:. 4544:^ 4507:. 4482:. 4448:. 4416:. 4398:. 4288:. 4263:49 4261:. 4230:, 4226:, 4204:). 3993:) 3939:. 3883:, 3756:, 3752:, 3685:), 3677:陶喆 3567:. 3510:, 3498:亂碼 3492:, 3489:乱码 3483:, 3328:√µ 3324:ƒÉ 3320:ƒÉ 3318:Tr 3300:oõ 3296:aê 3292:aê 3290:Tr 3262:Tr 3244:õ 3240:ă 3236:ă 3234:Tr 3223:) 3219:, 3134:är 3072:. 3004:, 3000:, 2771:. 
2759:, 2677:, 2600:, 2580:, 2553:p 2551:√© 2547:√≥ 2543:√∫ 2541:rf 2539:√∂ 2535:√º 2531:≈ë 2527:≈± 2525:zt 2523:√≠ 2521:rv 2519:√° 2514:√â 2510:√ì 2506:√ö 2504:RF 2502:√ñ 2498:√ú 2494:≈ê 2490:≈∞ 2488:ZT 2486:√ç 2484:RV 2482:√Å 2463:p 2461:é 2457:ó 2453:ú 2451:rf 2449:ö 2445:ü 2441:Å‘ 2437:ű 2435:zt 2431:rv 2429:á 2424:É 2420:Ã" 2416:Ú 2414:RF 2412:Ö 2408:Ãœ 2404:Ő 2400:Å° 2398:ZT 2396:Í 2394:RV 2392:Á 2346:p 2334:rf 2318:rv 2301:RF 2285:ZT 2281:RV 2268:p 2256:rf 2240:zt 2236:rv 2219:RF 2203:ZT 2199:RV 2183:p 2171:rf 2155:zt 2151:rv 2134:RF 2118:ZT 2114:RV 2098:p 2086:rf 2070:rv 2053:RF 2037:ZT 2033:RV 1947:ZT 1943:RV 1889:ZT 1885:RV 1867:p 1855:rf 1839:zt 1835:rv 1818:RF 1802:ZT 1798:RV 1775:p 1763:rf 1747:zt 1743:rv 1739:P 1727:RF 1711:ZT 1707:RV 1514:) 1471:s 1467:rg 1463:Sm 1455:s 1453:√• 1451:rg 1449:√∂ 1447:Sm 1427:s 1425:Ã¥ 1423:rg 1421:ö 1419:Sm 1411:s 1407:rg 1403:Sm 1388:) 1337:. 1116:, 1112:, 1108:, 1104:, 1061:, 1037:, 1033:, 1014:, 1010:, 1006:, 1002:, 998:, 994:, 990:, 947:, 943:, 928:, 924:, 920:, 901:£ 725:. 711:C1 685:. 600:‘ 555:亼 503:81 500:E3 497:96 494:8C 491:E5 488:97 485:AD 482:E5 479:87 476:96 473:E6 465:± 436:け 419:ア 390:A4 387:BD 384:B2 381:FA 378:BB 375:B8 372:CA 364:け 361:化 358:字 355:文 212:. 195:; 191:: 6307:/ 6100:) 5976:) 5972:( 4913:/ 4819:e 4812:t 4805:v 4755:. 4664:. 4538:. 4517:. 4493:. 4434:. 4420:. 4402:. 4385:. 4363:. 4341:. 4320:. 4299:. 4273:. 4269:: 4220:" 3989:( 3719:. 
3558:� 3479:( 3276:ê 3272:â 3268:¨ 3264:¨ 3213:( 3163:( 2951:) 2837:( 2821:( 2671:( 2640:( 2549:g 2545:r 2537:k 2533:t 2529:r 2516:P 2512:G 2508:R 2500:K 2496:T 2492:R 2459:g 2455:r 2447:k 2443:t 2439:r 2433:à 2426:P 2422:G 2418:R 2410:K 2406:T 2402:R 2375:õ 2373:r 2371:û 2365:Õ 2363:R 2361:Û 2344:Ú 2342:g 2340:ˇ 2338:r 2336:˙ 2332:÷ 2330:k 2328:Ř 2326:t 2324:§ 2320:Ý 2316:ß 2313:P 2311:╔ 2309:G 2307:Ë 2305:R 2303:┌ 2299:Í 2297:K 2295:▄ 2293:T 2291:Ň 2289:R 2287:█ 2283:═ 2279:┴ 2266:È 2264:g 2262:Û 2260:r 2258:˙ 2254:ˆ 2252:k 2250:¸ 2248:t 2246:ı 2244:r 2242:˚ 2238:Ì 2234:· 2231:P 2229:… 2227:G 2225:” 2223:R 2221:⁄ 2217:÷ 2215:K 2213:‹ 2211:T 2209:’ 2207:R 2205:€ 2201:Õ 2197:¡ 2181:Ç 2179:g 2177:¢ 2175:r 2173:£ 2169:î 2167:k 2165:Å 2163:t 2161:ã 2159:r 2157:˚ 2153:° 2149:† 2146:P 2144:ê 2142:G 2140:‡ 2138:R 2136:È 2132:ô 2130:K 2128:ö 2126:T 2124:ä 2122:R 2120:Î 2116:÷ 2112:µ 2096:‚ 2094:g 2092:˘ 2090:r 2088:Ł 2084:" 2082:k 2080: 2078:t 2076:‹ 2072:ˇ 2065:P 2063: 2061:G 2059:ŕ 2057:R 2055:é 2051:™ 2049:K 2047:š 2045:T 2043:Š 2041:R 2039:ë 2035:Ö 2031:µ 2018:ï 2016:r 2014:¹ 2008:è 2006:R 2004:Ù 1971:ï 1969:r 1967:√ 1961:α 1959:R 1957:Θ 1953:è 1951:R 1949:δ 1945:╓ 1941:╡ 1913:ô 1911:r 1909:û 1903:ò 1901:R 1899:ù 1895:º 1893:R 1891:ÿ 1887:ì 1883:Å 1863:g 1859:r 1851:k 1843:r 1830:P 1826:G 1822:R 1814:K 1806:R 1771:g 1767:r 1759:k 1755:t 1751:r 1735:G 1731:R 1723:K 1719:T 1715:R 1649:ű 1645:ő 1510:( 1469: 1465:ˆ 1409:† 1405:” 1384:( 1196:£ 1172:ü 1170:( 1154:ñ 1132:þ 1128:é 1118:ý 1114:ú 1110:ó 1106:í 1102:ð 1092:ß 1088:ü 1078:ij 1063:ø 1059:æ 1039:ö 1035:ä 1031:å 896:£ 761:( 592:ã 589:– 586:Œ 583:å 580:— 572:å 569:‡ 566:– 563:æ 552:栥 549:鍖 546:瓧 543:囧 540:鏂 529:縺 526:喧 523:怜 520:蟄 514:譁 462:¤ 459:½ 456:² 453:ú 450:» 447:¸ 444:Ê 433:步 430:机 427:矢 416:、 413:ス 410:郾 407:サ 404:ク 401:ハ 187:( 179:. 169:. 112:) 106:( 101:) 97:( 87:· 80:· 73:· 66:· 39:.


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.