Knowledge

String (computer science)

Source 📝

31: 2284: 832:
bit to delimit strings at the left, where the operation would start at the right. This bit had to be clear in all other parts of the string. This meant that, while the IBM 1401 had a seven-bit word, almost no-one ever thought to use this as a feature, and override the assignment of the seventh bit to
159:
A primary purpose of strings is to store human-readable text, like words and sentences. Strings are used to communicate information from a computer program to the user of the program. A program may also accept string input from its user. Further, strings may store data expressed as characters yet not
2262:
for an alternative string ordering that preserves well-foundedness. For the example alphabet, the shortlex order is ε < 0 < 1 < 00 < 01 < 10 < 11 < 000 < 001 < 010 < 011 < 100 < 101 < 0110 < 111 < 0000 < 0001 < 0010 < 0011 < ... < 1111 <
1149:
is the one that manages the string (sequence of characters) that represents the current state of the file being edited. While that state could be stored in a single long consecutive array of characters, a typical text editor instead uses an alternative representation as its sequence data structure—a
1277:
retrieved from a communications medium. This data may or may not be represented by a string-specific datatype, depending on the needs of the application, the desire of the programmer, and the capabilities of the programming language being used. If the programming language's string implementation is
1213:
Sometimes, strings need to be embedded inside a text file that is both human-readable and intended for consumption by a machine. This is needed in, for example, source code of programming languages, or in configuration files. In this case, the NUL character does not work well as a terminator since
2254:
for any nontrivial alphabet, even if the alphabetical order is. For example, if Σ = {0, 1} and 0 < 1, then the lexicographical order on Σ includes the relationships ε < 0 < 00 < 000 < ... < 0001 < ... < 001 < ... < 01 < 010 < ... < 011 < 0110 < ... <
369:
String datatypes have historically allocated one byte per character, and, although the exact character set varied by region, character encodings were similar enough that programmers could often get away with ignoring this, since characters a program treated specially (such as period and space and
266:
A mathematical system is any set of strings of recognisable marks in which some of the strings are taken initially and the remainder derived from these by operations performed according to rules which are independent of any meaning assigned to the marks. That a system should consist of 'marks'
562:
usually indicates a general-purpose string of bytes, rather than strings of only (readable) characters, strings of bits, or such. Byte strings often imply that bytes can take any value and any data can be stored as-is, meaning that there should be no value interpreted as a termination value.
421:
do not make such guarantees, making matching on byte codes unsafe. These encodings also were not "self-synchronizing", so that locating character boundaries required backing up to the start of a string, and pasting two strings together could result in corruption of the second string.
1126:
It is possible to create data structures and functions that manipulate them that do not have the problems associated with character termination and can in principle overcome length code bounds. It is also possible to optimize the string represented using techniques from
579:
code points) can take anywhere from one to four bytes, and single characters can take an arbitrary number of codes. In these cases, the logical length of the string (number of characters) differs from the physical length of the array (number of bytes in use).
1285:
C programmers draw a sharp distinction between a "string", aka a "string of characters", which by definition is always null terminated, vs. a "byte string" or "pseudo string" which may be stored in the same array but is often not null terminated. Using
550:
Representations of strings depend heavily on the choice of character repertoire and the method of character encoding. Older string implementations were designed to work with repertoire and encoding defined by ASCII, or more recent extensions like the
357:. The string length can be stored as a separate integer (which may put another artificial limit on the length) or implicitly through a termination character, usually a character value with all bits zero such as in C programming language. See also " 404:) need far more than 256 characters (the limit of a one 8-bit byte per-character encoding) for reasonable representation. The normal solutions involved keeping single-byte representations for ASCII and using two-byte representations for CJK 2255:
01111 < ... < 1 < 10 < 100 < ... < 101 < ... < 111 < ... < 1111 < ... < 11111 ... With respect to this ordering, e.g. the infinite set { 1, 01, 001, 0001, 00001, 000001, ... } has no minimal element.
1717: 1514:
function – the function that returns the length of a string (not counting any terminator characters or any of the string's internal structural information) and does not modify the string. This function is often named
574:
of corresponding characters. The principal difference is that, with certain encodings, a single logical character may take up more than one entry in the array. This happens for example with UTF-8, where single codes
1174:
The differing memory layout and storage requirements of strings can affect the security of the program accessing the string data. String representations requiring a terminating character are commonly susceptible to
785:", is 5 characters, but it occupies 6 bytes. Characters after the terminator do not form part of the representation; they may be either part of other data or just garbage. (Strings of this form are sometimes called 1214:
it is normally invisible (non-printable) and is difficult to input via a keyboard. Storing the string length would also be inconvenient as manual computation and tracking of the length is tedious and error-prone.
436:
require the programmer to know that the fixed-size code units are different from the "characters", the main difficulty currently is incorrectly designed APIs that attempt to hide this difference (UTF-32 does make
1991: 1183:
deliberately altering the data. String representations adopting a separate length field are also susceptible if the length can be manipulated. In such cases, program code accessing the string data requires
1739:
over Σ. For example, if Σ = {0, 1}, the set of strings with an even number of zeros, {ε, 1, 00, 11, 001, 010, 100, 111, 0000, 0011, 0101, 0110, 1001, 1010, 1100, 1111, ...}, is a formal language over Σ.
412:
family guarantee that a byte value in the ASCII range will represent only that ASCII character, making the encoding safe for systems that use those characters as field separators. Other encodings such as
1903: 922:
In the latter case, the length-prefix field itself does not have fixed length, therefore the actual string data needs to be moved when the string grows such that the length field needs to be increased.
333:
Although formal strings can have an arbitrary finite length, the length of strings in real languages is often constrained to an artificial maximum. In general, there are two types of string datatypes:
1191:
String data is frequently obtained from user input to a program. As such, it is the responsibility of the program to validate the string to ensure that it represents the expected format. Performing
1382:
Character strings are such a useful datatype that several languages have been designed in order to make string processing applications easy to write. Examples include the following languages:
408:. Use of these with existing code led to problems with matching and cutting of strings, the severity of which depended on how the character encoding was designed. Some encodings such as the 1503:
are used to create strings or change the contents of a mutable string. They also are used to query information about a string. The set of functions and their names varies depending on the
801:
Using a special byte other than null for terminating strings has historically appeared in both hardware and software, though sometimes with a value that was also a printing character.
836:
Early microcomputer software relied upon the fact that ASCII codes do not use the high-order bit, and set it to indicate the end of a string. It must be reset to 0 prior to output.
891:
is the number of characters in a word (8 for 8-bit ASCII on a 64-bit machine, 1 for 32-bit UTF-32/UCS-4 on a 32-bit machine, etc.). If the length is not bounded, encoding a length
523:
of bytes, characters, or code units, in order to allow fast access to individual units or substrings—including characters when they have a fixed length. A few languages such as
1441:
utilities perform simple string manipulations and can be used to easily program some powerful string processing algorithms. Files and finite streams may be viewed as strings.
2344:
The natural topology on the set of fixed-length strings or variable-length strings is the discrete topology, but the natural topology on the set of infinite strings is the
512:. There are both advantages and disadvantages to immutability: although immutable strings may require inefficiently creating many copies, they are simpler and completely 317:
of most high-level programming languages allows for a string, usually quoted in some way, to represent an instance of a string datatype; such a meta-string is called a
1517: 856:. Storing the string length as byte limits the maximum string length to 255. To avoid such limitations, improved implementations of P-strings use 16-, 32-, or 64-bit 1237:), used by most programming languages. To be able to include special characters such as the quotation mark itself, newline characters, or non-printable characters, 4024: 2406: 1495: 292: 1116:
Both character termination and length codes limit strings: For example, C character arrays that contain null (NUL) characters cannot be handled directly by
1265:
While character strings are very common uses of strings, a string in computer science may refer generically to any sequence of homogeneously typed data. A
844:
The length of a string can also be stored explicitly, for example by prefixing the string with the length as a byte value. This convention is used in many
555:
series. Modern implementations often use the extensive repertoire defined by Unicode along with a variety of complex encodings such as UTF-8 and UTF-16.
17: 2945: 2736: 1652: 301:
is a datatype modeled on the idea of a formal string. Strings are such an important and useful datatype that they are implemented in nearly every
428:
has simplified the picture somewhat. Most programming languages now have a datatype for Unicode strings. Unicode's preferred byte stream format
2906: 1475:
to facilitate text operations. Perl is particularly noted for its regular expression use, and many other languages and applications implement
542:, avoid implementing a dedicated string datatype at all, instead adopting the convention of representing strings as lists of character codes. 2460: 4344: 3345: 345:, whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see 4017: 236:
Use of the word "string" to mean any items arranged in a line, series or succession dates back centuries. In 19th-Century typesetting,
3590: 3240: 2924: 2370: 1908: 240:
used the term "string" to denote a length of type printed on paper; the string would be measured to determine the compositor's pay.
3279: 109:
declared to be a string may either cause storage in memory to be statically allocated for a predetermined maximum length or employ
2764: 3961: 3595: 3205: 1855: 4231: 4010: 3585: 3580: 3210: 243:
Use of the word "string" to mean "a sequence of symbols or linguistic elements in a definite order" emerged from mathematics,
2497: 2753: 495: 4156: 3568: 3469: 3215: 2801: 501: 370:
comma) were in the same place in all the encodings a program would encounter. These character sets were typically based on
2783: 4246: 3014: 2627: 1476: 4386: 4171: 3426: 3200: 2703: 1722:
For example, if Σ = {0, 1}, then Σ = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, ...}. Although the set Σ itself is
353:
are variable-length strings. Of course, even variable-length strings are limited in length – by the size of available
220:. Often these are intended to be somewhat human-readable, though their primary purpose is to communicate to computers. 3746: 3294: 3235: 3107: 3072: 2968: 2887: 2682: 1445: 2271:
A number of additional operations on strings commonly occur in the formal theory. These are given in the article on
489:
strings. Some of these languages with immutable strings also provide another type that is mutable, such as Java and
224:
The term string may also designate a sequence of data or computer records other than characters — like a "string of
4339: 4324: 3142: 1166:—which makes certain string operations, such as insertions, deletions, and undoing previous edits, more efficient. 4406: 4200: 3719: 875:
If the length is bounded, then it can be encoded in constant space, typically a machine word, thus leading to an
2857: 4421: 3836: 3641: 3573: 3535: 3322: 1464: 524: 59: 2941: 4368:
Any language in each category is generated by a grammar and by an automaton in the category in the same line.
2717: 1639:
is denoted Σ. For example, if Σ = {0, 1}, then Σ = {00, 01, 10, 11}. We have Σ = {ε} for every alphabet Σ.
1468: 1338: 845: 829: 539: 478: 598:
The length of a string can be stored implicitly by using a special terminating character; often this is the
4217: 4142: 3736: 3666: 3514: 3327: 3182: 2610: 1541: 1449: 136: 83: 4416: 4391: 4314: 3991: 3626: 3406: 3132: 2834: 2732: 1570: 1411: 1391: 466: 458: 148: 63: 4002: 485:, the value is fixed and a new string must be created if any alteration is to be made; these are termed 4210: 3614: 3411: 3289: 3256: 3192: 2591: 1431: 1332: 1180: 474: 2688: 625:
In terminated strings, the terminating code is not an allowable character in any string. Strings with
432:
is designed not to have the problems described above for older multibyte encodings. UTF-8, UTF-16 and
4401: 4135: 3924: 3876: 3788: 3766: 3761: 3555: 3509: 3421: 3317: 3261: 3162: 1192: 1053: 869: 641: 482: 99: 1188:
to ensure that it does not inadvertently access or change data outside of the string memory limits.
1120:
library functions: Strings using a length code are limited to the maximum value of the length code.
461:, normally allow the contents of a string to be changed after it has been created; these are termed 4411: 4287: 4282: 3798: 3462: 3416: 3220: 3152: 1142:
makes certain string operations, such as insertions, deletions, and concatenations more efficient.
810: 603: 237: 70:
and the length changed, or it may be fixed (after creation). A string is generally considered as a
3951: 3866: 3365: 2456: 1533:, where a new string is created by appending two strings, often this is the + addition operator. 1104:
is a pointer to a dynamically allocated memory area, which might be expanded as needed. See also
926:
Here is a Pascal string stored in a 10-byte buffer, along with its ASCII / UTF-8 representation:
904: 876: 619: 576: 2894:
The term stringology is a popular nickname for string algorithms as well as for text algorithms.
2146:
The reverse of a string is a string with the same symbols but in reverse order. For example, if
4298: 4236: 4161: 3694: 3550: 3504: 2567: 1314: 593: 378:. If text in one encoding was displayed on a system using a different encoding, text was often 106: 2198:= 01. As another example, the string abc has three different rotations, viz. abc itself (with 4241: 4189: 3684: 3659: 3370: 3312: 3172: 3100: 2858:"Former Dean Zvi Galil Named a Top 10 Most Influential Computer Scientist in the Past Decade" 2478: 2429: 2423: 2400: 2243: 1163: 1139: 350: 55: 35: 4334: 4309: 4166: 4127: 3486: 3177: 3002: 1611: 1504: 1483: 1371: 1366:
Advanced string algorithms often employ complex mechanisms and data structures, among them
1226: 718: 382:, though often somewhat readable and some computer users learned to read the mangled text. 302: 75: 43: 2920: 2369:
between string representations of topologies can be found by normalizing according to the
2159: 8: 4396: 3956: 3934: 3861: 3714: 3706: 3455: 3375: 1849: 1128: 275:, "the first realistic string handling and pattern matching language" for computers was 4319: 4261: 4205: 3939: 3919: 3871: 3846: 3631: 3600: 3304: 3271: 2900: 2761: 2650: 2571: 2239: 1723: 1486:, which permits arbitrary expressions to be evaluated and included in string literals. 1472: 1348: 1097: 653: 409: 194:
service. Instead of a string literal, the software would likely store this string in a
132: 110: 87: 4054: 3826: 3756: 3731: 3545: 3540: 3436: 3068: 3061: 3010: 2964: 2883: 2711: 2678: 2516: 2411: 2394: 2385:— a property of string manipulating functions treating their input as raw data stream 2272: 1343: 1287: 1179:
problems if the terminating character is not present, caused by a coding error or an
1117: 900: 790: 393: 346: 248: 144: 4303: 4256: 4223: 4069: 3971: 3856: 3654: 3431: 3401: 3355: 3157: 3093: 3041: 2654: 2642: 1821: 1754: 1238: 857: 389: 341:
and which use the same amount of memory whether this maximum is needed or not, and
67: 1313:
for processing strings, each with various trade-offs. Competing algorithms can be
4266: 4181: 4148: 4064: 4037: 4033: 3976: 3841: 3793: 3726: 3284: 3225: 3137: 2805: 2787: 2768: 2757: 2329: 2031: 1735: 1500: 1360: 1185: 1176: 602:(NUL), which has all bits zero, a convention used and perpetuated by the popular 567: 520: 397: 354: 140: 128: 95: 2798: 2750: 2283: 2190:. For example, if Σ = {0, 1} the string 0011001 is a rotation of 0100110, where 4277: 4059: 4041: 3929: 3751: 3741: 3649: 3337: 3230: 3039: 2929:
Perl's most famous strength is in string manipulation with regular expressions.
2780: 2345: 2337: 1569:
of distinct, unambiguous symbols (alternatively called characters), called the
1537: 1222: 1208: 1196: 1132: 1105: 688: 599: 571: 490: 401: 310: 306: 272: 244: 180: 121: 2866:
He invented the terms 'stringology,' which is a subfield of string algorithms,
38:, and are often used to store human-readable data, such as words or sentences. 4380: 4362: 3851: 3147: 3124: 2668: 2549: 2435: 2349: 2039: 2035: 1749: 1712:{\displaystyle \Sigma ^{*}=\bigcup _{n\in \mathbb {N} \cup \{0\}}\Sigma ^{n}} 1530: 1511: 1325:
for the theory of algorithms and data structures used for string processing.
1303: 865: 513: 252: 2095: 2055: 228:" — but when used without qualification it refers to strings of characters. 3808: 3783: 3350: 3167: 2417: 2251: 2231: 2150:= abc (where a, b, and c are symbols of the alphabet), then the reverse of 2135: 1620: 1453: 338: 217: 191: 2646: 1544:
contain direct support for string operations, such as block copy (e.g. In
1290:
functions on such a "byte string" often seems to work, but later leads to
4329: 4251: 4176: 4032: 3986: 3981: 3831: 3778: 3605: 3360: 2382: 2366: 2247: 2235: 1845: 1817: 1813: 1643: 1367: 1279: 1274: 1270: 1159: 1155: 1146: 630: 528: 506: 259: 176: 117: 2781:"strlcpy and strlcat - consistent, safe, string copy and concatenation." 2677:(2003 ed.), Upper Saddle River, NJ: Pearson Education, p. 40, 1131:(replacing repeated characters by the character value and a length) and 3891: 3886: 3803: 3771: 3676: 3619: 2606: 2360: 2319: 2318:
Variable-length strings (of finite length) can be viewed as nodes on a
1566: 1317:
with respect to run time, storage requirements, and so forth. The name
1266: 1151: 470: 405: 86:) that stores a sequence of elements, typically characters, using some 2291:
Strings admit the following interpretation as nodes on a graph, where
3966: 3944: 3901: 3896: 3563: 3519: 3478: 2388: 2352:
of the sets of finite strings. This is the construction used for the
2332:(otherwise not considered here) can be viewed as infinite paths on a 2308: 2002: 1545: 1322: 1310: 1242: 1100:, the string must be accessed and modified through member functions. 1052:
Many languages, including object-oriented ones, implement strings as
441:
fixed-sized, but these are not "characters" due to composing codes).
418: 385: 71: 2162:, which also includes the empty string and all strings of length 1. 1377: 105:
Depending on the programming language and precise data type used, a
3881: 3058: 2414:— passed to a driver to initiate a connection (e.g., to a database) 2259: 1765:
in Σ, their concatenation is defined as the sequence of symbols in
1586: 1253: 1138:
While these representations are common, others are possible. Using
825: 552: 414: 379: 195: 172: 51: 255:
behavior of symbolic systems, setting aside the symbols' meaning.
3396: 1249: 1123:
Both of these limitations can be overcome by clever programming.
425: 30: 2816:
Keith Thompson. "No, strncpy() is not a "safer" strcpy()". 2012.
566:
Most string implementations are very similar to variable-length
2353: 2134:. Both the relations "is a prefix of" and "is a suffix of" are 1986:{\displaystyle L(st)=L(s)+L(t)\quad \forall s,t\in \Sigma ^{*}} 1841: 1840:. Therefore, the set Σ and the concatenation operation form a 1730: 1598: 1457: 1421: 1354: 629:
field do not have this limitation and can also store arbitrary
581: 535: 433: 375: 314: 280: 2826: 4292: 3447: 2465:
String literals (or constants) are called 'anonymous strings'
1560: 1396: 848:
dialects; as a consequence, some people call such a string a
649: 645: 450: 429: 371: 276: 4361:
Each category of languages, except those marked by a , is a
2432:— a data structure for efficiently manipulating long strings 2420:— its properties and representation in programming languages 1848:
generated by Σ. In addition, the length function defines a
3499: 3380: 3063:
Introduction to Automata Theory, Languages, and Computation
2672: 1898:{\displaystyle L:\Sigma ^{*}\mapsto \mathbb {N} \cup \{0\}} 1438: 1406: 1401: 1302:"Stringology" redirects here. For the physical theory, see 821:
since this was the string delimiter in its BASIC language.
814: 454: 79: 3085: 3021:
Any finite sequence of symbols of a language is called an
606:. Hence, this representation is commonly referred to as a 3494: 2496:
Francis, David M.; Merk, Heather L. (November 14, 2019).
1852:
from Σ to the non-negative integers (that is, a function
1589:
of symbols from Σ. For example, if Σ = {0, 1}, then
1426: 1416: 1386: 225: 206: 2959:
Fletcher, Peter; Hoyle, Hughes; Patty, C. Wayne (1991).
2615:. Berkeley: University of California Press. p. 355. 1624:
is the unique string over Σ of length 0, and is denoted
337:, which have a fixed maximum length to be determined at 1460:
use strings to hold commands that will be interpreted.
2407:
Comparison of programming languages (string functions)
2154:
is cba. A string that is the reverse of itself (e.g.,
1496:
Comparison of programming languages (string functions)
1195:
of user input can cause a program to be vulnerable to
824:
Somewhat similar, "data processing" machines like the
813:
systems (this character had a value of zero), and the
293:
Comparison of programming languages (string functions)
2704:"An Assembly Listing of the ROM of the Sinclair ZX80" 2426:— a string that cannot be compressed by any algorithm 1911: 1858: 1655: 1273:, for example, may be used to represent non-textual 2538:. Vol. X. Oxford at the Clarendon Press. 1933. 1642:The set of all strings over Σ of any length is the 1510:The most basic example of a string function is the 113:to allow it to hold a variable number of elements. 3060: 2958: 1985: 1897: 1726:, each element of Σ is a string of finite length. 1711: 1378:Character string-oriented languages and utilities 618:+ 1 space (1 for the terminator), and is thus an 4378: 2706:. Archived from the original on August 15, 2015. 2511: 2509: 2507: 781:The length of the string in the above example, " 2483:University of Utah, Kahlert School of Computing 1743: 1297: 1241:are often available, usually prefixed with the 2578:. New York: The Century Company. p. 5994. 4018: 3463: 3101: 2521:A Supplement to the Oxford English Dictionary 2504: 2498:"DNA as a Biochemical Entity and Data String" 2348:, viewing the set of infinite strings as the 2303:can be viewed as the integer locations in an 2234:on a set of strings. If the alphabet Σ has a 1489: 1482:Some languages such as Perl and Ruby support 1096:However, since the implementation is usually 3059:John E. Hopcroft, Jeffrey D. Ullman (1979). 2674:Computer Systems: A Programmer's Perspective 1892: 1886: 1820:operation. The empty string ε serves as the 1694: 1688: 1610:(the length of the sequence) and can be any 1529:would return 11. Another common function is 4340:Counter-free (with aperiodic finite monoid) 2799:"A rant about strcpy, strncpy and strlcpy." 2667: 2628:"Programming Languages: History and Future" 2495: 2242:) one can define a total order on Σ called 2225: 796: 267:instead of sounds or odours is immaterial. 4025: 4011: 3470: 3456: 3108: 3094: 3009:(Reprint ed.). CRC Press. p. 2. 3001: 2905:: CS1 maint: location missing publisher ( 2877: 2515: 2476: 2130:. Suffixes and prefixes are substrings of 1769:followed by the sequence of characters in 305:. In some languages they are available as 66:. The latter may allow its elements to be 3035: 3033: 2371:lexicographically minimal string rotation 2287:(Hyper)cube of binary strings of length 3 2250:if the alphabetical order is, but is not 1879: 1681: 1321:was coined in 1984 by computer scientist 205:" representing nucleic acid sequences of 3280:Comparison of regular-expression engines 2566: 2282: 2045: 2014:if there exist (possibly empty) strings 1635:The set of all strings over Σ of length 1335:for finding a given substring or pattern 1111: 179:, this message would likely appear as a 29: 2438:— notions of similarity between strings 1777:. For example, if Σ = {a, b, ..., z}, 1646:of Σ and is denoted Σ. In terms of Σ, 1328:Some categories of algorithms include: 212:Computer settings or parameters, like " 14: 4379: 4232:Linear context-free rewriting language 3030: 2855: 2625: 1291: 584:avoids the first part of the problem. 465:strings. In other languages, such as 4157:Linear context-free rewriting systems 4006: 3451: 3241:Zhu–Takaoka string matching algorithm 3089: 2701: 2605: 1047: 860:to store the string length. When the 519:Strings are typically implemented as 364: 171:" is a string that software shows to 2751:"Data Structures for Text Sequences" 2266: 1169: 903:), so length-prefixed strings are a 805:was used by many assembler systems, 358: 163:Example strings and their purposes: 3206:Boyer–Moore string-search algorithm 3046:Mathematical Methods in Linguistics 2961:Foundations of Discrete Mathematics 2730: 2695: 1477:Perl compatible regular expressions 1260: 286: 139:, a string is a finite sequence of 116:When a string appears literally in 24: 18:Character string (computer science) 4365:of the category directly above it. 2457:"Introduction To Java – MFC 158 G" 1974: 1958: 1866: 1729:A set of strings over Σ (i.e. any 1700: 1657: 1202: 868:, strings are limited only by the 839: 833:(for example) handle ASCII codes. 587: 545: 444: 25: 4433: 3295:Nondeterministic finite automaton 3236:Two-way string-matching algorithm 2979:is a finite sequence with domain 2927:from the original on 2012-04-21. 2463:from the original on 2016-03-03. 2403:— overview of C++ string handling 1252:sequence, for example in Windows 1056:with an internal structure like: 793:directive used to declare them.) 34:Strings are typically made up of 27:Sequence of characters, data type 2948:from the original on 2015-03-27. 2837:from the original on 1 June 2015 2739:from the original on 2017-04-10. 2523:. Oxford at the Clarendon Press. 2363:, and yields the same topology. 2230:It is often useful to define an 1554: 1217:Two common representations are: 328: 3052: 2995: 2952: 2934: 2913: 2871: 2856:Evarts, Holly (18 March 2021). 2849: 2819: 2810: 2792: 2774: 2743: 2724: 2691:from the original on 2007-08-06 2661: 2619: 2397:— overview of C string handling 2299:Fixed-length strings of length 2295:is the number of symbols in Σ: 2246:. The lexicographical order is 1957: 1806: 1798: 1790: 1782: 1465:scripting programming languages 74:and is often implemented as an 3477: 3211:Boyer–Moore–Horspool algorithm 3201:Apostolico–Giancarlo algorithm 2599: 2595:. January 11, 1898. p. 3. 2582: 2560: 2542: 2527: 2489: 2470: 2449: 2359:and some constructions of the 2042:of which is the empty string. 2034:"is a substring of" defines a 1954: 1948: 1939: 1933: 1924: 1915: 1875: 1339:String manipulation algorithms 907:, encoding a string of length 279:in the 1950s, followed by the 13: 1: 3536:Arbitrary-precision or bignum 2827:"The Prague Stringology Club" 2733:"Design Notes for Tiny BASIC" 2626:Sammet, Jean E. (July 1972). 2536:The Oxford English Dictionary 2442: 1542:instruction set architectures 1505:computer programming language 1282:, data corruption may ensue. 1145:The core data structure in a 570:with the entries storing the 283:language of the early 1960s. 102:) data types and structures. 94:may also denote more general 3216:Knuth–Morris–Pratt algorithm 3143:Damerau–Levenshtein distance 2671:; David, O'Hallaron (2003), 2178:is said to be a rotation of 2165: 1744:Concatenation and substrings 1606:is the number of symbols in 1450:Multimedia Control Interface 1298:String processing algorithms 610:. This representation of an 160:intended for human reading. 137:theoretical computer science 7: 3407:Compressed pattern matching 3133:Approximate string matching 3115: 2878:Crochemore, Maxime (2002). 2554:Online Etymology Dictionary 2391:— a string of binary digits 2376: 2278: 2141: 1812:String concatenation is an 1757:on Σ. For any two strings 1333:String searching algorithms 1233:or ASCII 0x27 single quote 10: 4438: 4247:Deterministic context-free 4172:Deterministic context-free 3412:Longest common subsequence 3323:Needleman–Wunsch algorithm 3193:String-searching algorithm 2612:A survey of symbolic logic 2477:de St. Germain, H. James. 2090:. Symmetrically, a string 1614:; it is often denoted as | 1558: 1493: 1490:Character string functions 1301: 1206: 652:) representation as 8-bit 591: 349:). Most strings in modern 290: 231: 190:" as a status update on a 154: 4387:String (computer science) 4358: 4320:Nondeterministic pushdown 4048: 3910: 3877:Strongly typed identifier 3819: 3705: 3675: 3640: 3528: 3485: 3422:Sequential pattern mining 3389: 3336: 3303: 3270: 3262:Commentz-Walter algorithm 3250:Multiple string searching 3249: 3191: 3183:Wagner–Fischer algorithm 3123: 3044:; Robert E. Wall (1990). 2963:. PWS-Kent. p. 114. 2942:"x86 string instructions" 2716:: CS1 maint: unfit URL ( 2635:Communications of the ACM 2102:if there exists a string 2062:if there exists a string 201:Alphabetical data, like " 186:User-entered text, like " 3432:String rewriting systems 3417:Longest common substring 3328:Smith–Waterman algorithm 3153:Gestalt pattern matching 2975:Let Σ be an alphabet. A 2882:. Singapore. p. v. 2226:Lexicographical ordering 1193:limited or no validation 1058: 797:Byte- and bit-terminated 614:-character string takes 534:Some languages, such as 449:Some languages, such as 124:or an anonymous string. 3952:Parametric polymorphism 3366:Generalized suffix tree 3290:Thompson's construction 2568:Whitney, William Dwight 1585:) over Σ is any finite 1471:, Ruby, and Tcl employ 1245:character (ASCII 0x5C). 905:succinct data structure 877:implicit data structure 620:implicit data structure 499:, the thread-safe Java 400:(known collectively as 343:variable-length strings 143:that are chosen from a 4407:Combinatorics on words 4325:Deterministic pushdown 4201:Recursively enumerable 3318:Hirschberg's algorithm 2589:"Old Union's Demise". 2576:The Century Dictionary 2288: 2263:00000 < 00001 ... 1987: 1899: 1713: 638:null-terminated string 604:C programming language 594:Null-terminated string 269: 258:For example, logician 98:or other sequence (or 39: 4422:Algorithms on strings 3173:Levenshtein automaton 3163:Jaro–Winkler distance 3003:Shoenfield, Joseph R. 2880:Jewels of stringology 2647:10.1145/361454.361485 2430:Rope (data structure) 2424:Incompressible string 2311:with sides of length 2286: 2244:lexicographical order 2158:= madam) is called a 2046:Prefixes and suffixes 1988: 1900: 1714: 1527:length("hello world") 1372:finite-state machines 1112:Other representations 789:, after the original 351:programming languages 264: 188:I got a new job today 33: 4310:Tree stack automaton 3221:Rabin–Karp algorithm 3178:Levenshtein distance 2990:∈ ℕ) and codomain Σ. 2977:nonempty word over Σ 2862:Columbia Engineering 1909: 1856: 1653: 1612:non-negative integer 1593:is a string over Σ. 1484:string interpolation 640:stored in a 10-byte 335:fixed-length strings 303:programming language 169:file upload complete 131:, which are used in 76:array data structure 44:computer programming 4218:range concatenation 4143:range concatenation 3957:Primitive data type 3862:Recursive data type 3715:Algebraic data type 3591:Quadruple precision 3376:Ternary search tree 3079:Here: sect.1.1, p.1 3040:Barbara H. Partee; 2401:C++ string handling 2214:=a), and cab (with 1850:monoid homomorphism 1473:regular expressions 1129:run length encoding 654:hexadecimal numbers 251:to speak about the 175:. In the program's 120:, it is known as a 62:or as some kind of 50:is traditionally a 4417:Syntactic entities 4392:Character encoding 3920:Abstract data type 3601:Extended precision 3560:Reduced precision 3305:Sequence alignment 3272:Regular expression 3067:. Addison-Wesley. 3007:Mathematical Logic 2804:2016-02-29 at the 2786:2016-03-13 at the 2767:2016-04-04 at the 2756:2016-03-04 at the 2702:Wearmouth, Geoff. 2592:Milwaukee Sentinel 2572:Smith, Benjamin E. 2519:(1986). "string". 2289: 2240:alphabetical order 1983: 1895: 1733:of Σ) is called a 1724:countably infinite 1709: 1698: 1467:, including Perl, 1349:Regular expression 1344:Sorting algorithms 1048:Strings as records 527:implement them as 388:languages such as 365:Character encoding 133:mathematical logic 111:dynamic allocation 88:character encoding 40: 4374: 4373: 4353: 4352: 4315:Embedded pushdown 4211:Context-sensitive 4136:Context-sensitive 4070:Abstract machines 4055:Chomsky hierarchy 4000: 3999: 3732:Associative array 3596:Octuple precision 3445: 3444: 3437:String operations 3025:of that language. 2749:Charles Crowley. 2731:Allison, Dennis. 2669:Bryant, Randal E. 2412:Connection string 2395:C string handling 2273:string operations 2267:String operations 1824:; for any string 1773:, and is denoted 1669: 1292:security problems 1288:C string handling 1170:Security concerns 1045: 1044: 934: 901:fixed-length code 864:field covers the 791:assembly language 779: 778: 691: 644:, along with its 347:Memory management 309:and in others as 249:linguistic theory 16:(Redirected from 4429: 4402:Formal languages 4369: 4366: 4330:Visibly pushdown 4304:Thread automaton 4252:Visibly pushdown 4220: 4177:Visibly pushdown 4145: 4132:(no common name) 4051: 4050: 4038:formal languages 4027: 4020: 4013: 4004: 4003: 3972:Type constructor 3857:Opaque data type 3789:Record or Struct 3586:Double precision 3581:Single precision 3472: 3465: 3458: 3449: 3448: 3402:Pattern matching 3356:Suffix automaton 3158:Hamming distance 3110: 3103: 3096: 3087: 3086: 3080: 3078: 3066: 3056: 3050: 3049: 3042:Alice ter Meulen 3037: 3028: 3027: 2999: 2993: 2992: 2956: 2950: 2949: 2938: 2932: 2931: 2921:"Essential Perl" 2917: 2911: 2910: 2904: 2896: 2875: 2869: 2868: 2853: 2847: 2846: 2844: 2842: 2823: 2817: 2814: 2808: 2796: 2790: 2778: 2772: 2747: 2741: 2740: 2728: 2722: 2721: 2715: 2707: 2699: 2693: 2692: 2665: 2659: 2658: 2632: 2623: 2617: 2616: 2603: 2597: 2596: 2586: 2580: 2579: 2564: 2558: 2557: 2546: 2540: 2539: 2531: 2525: 2524: 2517:Burchfield, R.W. 2513: 2502: 2501: 2493: 2487: 2486: 2474: 2468: 2467: 2453: 2330:Infinite strings 2122:is said to be a 2094:is said to be a 2082:is said to be a 2054:is said to be a 2000:is said to be a 1992: 1990: 1989: 1984: 1982: 1981: 1904: 1902: 1901: 1896: 1882: 1874: 1873: 1822:identity element 1808: 1800: 1792: 1784: 1755:binary operation 1753:is an important 1718: 1716: 1715: 1710: 1708: 1707: 1697: 1684: 1665: 1664: 1550: 1528: 1524: 1520: 1501:String functions 1261:Non-text strings 1248:Terminated by a 1239:escape sequences 1236: 1232: 1133:Hamming encoding 1103: 1092: 1089: 1086: 1083: 1080: 1077: 1074: 1071: 1068: 1065: 1062: 979: 974: 969: 964: 959: 954: 949: 944: 939: 932: 929: 928: 870:available memory 820: 808: 804: 784: 711: 706: 701: 696: 687: 684: 679: 674: 669: 664: 659: 658: 648:(or more modern 636:An example of a 511: 504: 498: 287:String datatypes 215: 204: 189: 170: 167:A message like " 129:formal languages 60:literal constant 21: 4437: 4436: 4432: 4431: 4430: 4428: 4427: 4426: 4412:Primitive types 4377: 4376: 4375: 4370: 4367: 4360: 4354: 4349: 4271: 4215: 4194: 4140: 4121: 4044: 4042:formal grammars 4034:Automata theory 4031: 4001: 3996: 3977:Type conversion 3912: 3906: 3842:Enumerated type 3815: 3701: 3695:null-terminated 3671: 3636: 3524: 3481: 3476: 3446: 3441: 3385: 3332: 3299: 3285:Regular grammar 3266: 3245: 3226:Raita algorithm 3187: 3138:Bitap algorithm 3119: 3114: 3084: 3083: 3075: 3057: 3053: 3038: 3031: 3017: 3016:978-156881135-2 3000: 2996: 2984: 2971: 2957: 2953: 2940: 2939: 2935: 2919: 2918: 2914: 2898: 2897: 2890: 2876: 2872: 2854: 2850: 2840: 2838: 2831:stringology.org 2825: 2824: 2820: 2815: 2811: 2806:Wayback Machine 2797: 2793: 2788:Wayback Machine 2779: 2775: 2769:Wayback Machine 2758:Wayback Machine 2748: 2744: 2729: 2725: 2709: 2708: 2700: 2696: 2685: 2666: 2662: 2630: 2624: 2620: 2604: 2600: 2588: 2587: 2583: 2565: 2561: 2548: 2547: 2543: 2533: 2532: 2528: 2514: 2505: 2494: 2490: 2475: 2471: 2455: 2454: 2450: 2445: 2379: 2281: 2269: 2228: 2206:=ε), bca (with 2168: 2144: 2048: 1977: 1973: 1910: 1907: 1906: 1878: 1869: 1865: 1857: 1854: 1853: 1746: 1736:formal language 1703: 1699: 1680: 1673: 1660: 1656: 1654: 1651: 1650: 1563: 1557: 1548: 1526: 1525:. For example, 1522: 1516: 1498: 1492: 1380: 1361:Sequence mining 1309:There are many 1307: 1300: 1263: 1234: 1230: 1223:quotation marks 1211: 1205: 1203:Literal strings 1186:bounds checking 1177:buffer overflow 1172: 1114: 1101: 1094: 1093: 1090: 1087: 1084: 1081: 1078: 1075: 1072: 1069: 1066: 1063: 1060: 1050: 1041: 1035: 1029: 1023: 1017: 1011: 1005: 999: 993: 987: 977: 972: 967: 962: 957: 952: 947: 942: 937: 842: 840:Length-prefixed 828:used a special 818: 806: 802: 799: 782: 775: 769: 763: 757: 751: 745: 739: 733: 727: 721: 709: 704: 699: 694: 682: 677: 672: 667: 662: 596: 590: 588:Null-terminated 572:character codes 548: 546:Representations 510:NSMutableString 509: 500: 494: 447: 445:Implementations 367: 359:Null-terminated 355:computer memory 331: 311:composite types 307:primitive types 299:string datatype 295: 289: 262:wrote in 1918: 234: 213: 202: 187: 168: 157: 28: 23: 22: 15: 12: 11: 5: 4435: 4425: 4424: 4419: 4414: 4409: 4404: 4399: 4394: 4389: 4372: 4371: 4359: 4356: 4355: 4351: 4350: 4348: 4347: 4345:Acyclic finite 4342: 4337: 4332: 4327: 4322: 4317: 4312: 4306: 4301: 4296: 4295:Turing Machine 4290: 4288:Linear-bounded 4285: 4280: 4278:Turing machine 4274: 4272: 4270: 4269: 4264: 4259: 4254: 4249: 4244: 4239: 4237:Tree-adjoining 4234: 4229: 4226: 4221: 4213: 4208: 4203: 4197: 4195: 4193: 4192: 4187: 4184: 4179: 4174: 4169: 4164: 4162:Tree-adjoining 4159: 4154: 4151: 4146: 4138: 4133: 4130: 4124: 4122: 4120: 4119: 4116: 4113: 4110: 4107: 4104: 4101: 4098: 4095: 4092: 4089: 4086: 4083: 4080: 4076: 4073: 4072: 4067: 4062: 4057: 4049: 4046: 4045: 4030: 4029: 4022: 4015: 4007: 3998: 3997: 3995: 3994: 3989: 3984: 3979: 3974: 3969: 3964: 3959: 3954: 3949: 3948: 3947: 3937: 3932: 3930:Data structure 3927: 3922: 3916: 3914: 3908: 3907: 3905: 3904: 3899: 3894: 3889: 3884: 3879: 3874: 3869: 3864: 3859: 3854: 3849: 3844: 3839: 3834: 3829: 3823: 3821: 3817: 3816: 3814: 3813: 3812: 3811: 3801: 3796: 3791: 3786: 3781: 3776: 3775: 3774: 3764: 3759: 3754: 3749: 3744: 3739: 3734: 3729: 3724: 3723: 3722: 3711: 3709: 3703: 3702: 3700: 3699: 3698: 3697: 3687: 3681: 3679: 3673: 3672: 3670: 3669: 3664: 3663: 3662: 3657: 3646: 3644: 3638: 3637: 3635: 3634: 3629: 3624: 3623: 3622: 3612: 3611: 3610: 3609: 3608: 3598: 3593: 3588: 3583: 3578: 3577: 3576: 3571: 3569:Half precision 3566: 3556:Floating point 3553: 3548: 3543: 3538: 3532: 3530: 3526: 3525: 3523: 3522: 3517: 3512: 3507: 3502: 3497: 3491: 3489: 3483: 3482: 3475: 3474: 3467: 3460: 3452: 3443: 3442: 3440: 3439: 3434: 3429: 3424: 3419: 3414: 3409: 3404: 3399: 3393: 3391: 3387: 3386: 3384: 3383: 3378: 3373: 3368: 3363: 3358: 3353: 3348: 3342: 3340: 3338:Data structure 3334: 3333: 3331: 3330: 3325: 3320: 3315: 3309: 3307: 3301: 3300: 3298: 3297: 3292: 3287: 3282: 3276: 3274: 3268: 3267: 3265: 3264: 3259: 3253: 3251: 3247: 3246: 3244: 3243: 3238: 3233: 3231:Trigram search 3228: 3223: 3218: 3213: 3208: 3203: 3197: 3195: 3189: 3188: 3186: 3185: 3180: 3175: 3170: 3165: 3160: 3155: 3150: 3145: 3140: 3135: 3129: 3127: 3121: 3120: 3113: 3112: 3105: 3098: 3090: 3082: 3081: 3073: 3051: 3029: 3015: 2994: 2982: 2969: 2951: 2933: 2912: 2888: 2870: 2848: 2818: 2809: 2791: 2773: 2762:"Introduction" 2742: 2723: 2694: 2683: 2660: 2618: 2598: 2581: 2559: 2541: 2526: 2503: 2488: 2469: 2447: 2446: 2444: 2441: 2440: 2439: 2433: 2427: 2421: 2415: 2409: 2404: 2398: 2392: 2386: 2378: 2375: 2346:limit topology 2342: 2341: 2338:complete graph 2327: 2316: 2280: 2277: 2268: 2265: 2227: 2224: 2167: 2164: 2143: 2140: 2047: 2044: 1980: 1976: 1972: 1969: 1966: 1963: 1960: 1956: 1953: 1950: 1947: 1944: 1941: 1938: 1935: 1932: 1929: 1926: 1923: 1920: 1917: 1914: 1894: 1891: 1888: 1885: 1881: 1877: 1872: 1868: 1864: 1861: 1745: 1742: 1720: 1719: 1706: 1702: 1696: 1693: 1690: 1687: 1683: 1679: 1676: 1672: 1668: 1663: 1659: 1644:Kleene closure 1556: 1553: 1538:microprocessor 1491: 1488: 1435: 1434: 1429: 1424: 1419: 1414: 1409: 1404: 1399: 1394: 1389: 1379: 1376: 1364: 1363: 1358: 1352: 1346: 1341: 1336: 1299: 1296: 1262: 1259: 1258: 1257: 1246: 1221:Surrounded by 1209:String literal 1207:Main article: 1204: 1201: 1197:code injection 1171: 1168: 1113: 1110: 1059: 1049: 1046: 1043: 1042: 1039: 1036: 1033: 1030: 1027: 1024: 1021: 1018: 1015: 1012: 1009: 1006: 1003: 1000: 997: 994: 991: 988: 985: 981: 980: 975: 970: 965: 960: 955: 950: 945: 940: 935: 841: 838: 798: 795: 777: 776: 773: 770: 767: 764: 761: 758: 755: 752: 749: 746: 743: 740: 737: 734: 731: 728: 725: 722: 717: 713: 712: 707: 702: 697: 692: 685: 680: 675: 670: 665: 600:null character 592:Main article: 589: 586: 547: 544: 446: 443: 366: 363: 330: 327: 323:string literal 288: 285: 273:Jean E. Sammet 245:symbolic logic 233: 230: 222: 221: 210: 199: 184: 181:string literal 156: 153: 122:string literal 58:, either as a 26: 9: 6: 4: 3: 2: 4434: 4423: 4420: 4418: 4415: 4413: 4410: 4408: 4405: 4403: 4400: 4398: 4395: 4393: 4390: 4388: 4385: 4384: 4382: 4364: 4363:proper subset 4357: 4346: 4343: 4341: 4338: 4336: 4333: 4331: 4328: 4326: 4323: 4321: 4318: 4316: 4313: 4311: 4307: 4305: 4302: 4300: 4297: 4294: 4291: 4289: 4286: 4284: 4281: 4279: 4276: 4275: 4273: 4268: 4265: 4263: 4260: 4258: 4255: 4253: 4250: 4248: 4245: 4243: 4240: 4238: 4235: 4233: 4230: 4227: 4225: 4222: 4219: 4214: 4212: 4209: 4207: 4204: 4202: 4199: 4198: 4196: 4191: 4190:Non-recursive 4188: 4185: 4183: 4180: 4178: 4175: 4173: 4170: 4168: 4165: 4163: 4160: 4158: 4155: 4152: 4150: 4147: 4144: 4139: 4137: 4134: 4131: 4129: 4126: 4125: 4123: 4117: 4114: 4111: 4108: 4105: 4102: 4099: 4096: 4093: 4090: 4087: 4084: 4081: 4078: 4077: 4075: 4074: 4071: 4068: 4066: 4063: 4061: 4058: 4056: 4053: 4052: 4047: 4043: 4039: 4035: 4028: 4023: 4021: 4016: 4014: 4009: 4008: 4005: 3993: 3990: 3988: 3985: 3983: 3980: 3978: 3975: 3973: 3970: 3968: 3965: 3963: 3960: 3958: 3955: 3953: 3950: 3946: 3943: 3942: 3941: 3938: 3936: 3933: 3931: 3928: 3926: 3923: 3921: 3918: 3917: 3915: 3909: 3903: 3900: 3898: 3895: 3893: 3890: 3888: 3885: 3883: 3880: 3878: 3875: 3873: 3870: 3868: 3865: 3863: 3860: 3858: 3855: 3853: 3852:Function type 3850: 3848: 3845: 3843: 3840: 3838: 3835: 3833: 3830: 3828: 3825: 3824: 3822: 3818: 3810: 3807: 3806: 3805: 3802: 3800: 3797: 3795: 3792: 3790: 3787: 3785: 3782: 3780: 3777: 3773: 3770: 3769: 3768: 3765: 3763: 3760: 3758: 3755: 3753: 3750: 3748: 3745: 3743: 3740: 3738: 3735: 3733: 3730: 3728: 3725: 3721: 3718: 3717: 3716: 3713: 3712: 3710: 3708: 3704: 3696: 3693: 3692: 3691: 3688: 3686: 3683: 3682: 3680: 3678: 3674: 3668: 3665: 3661: 3658: 3656: 3653: 3652: 3651: 3648: 3647: 3645: 3643: 3639: 3633: 3630: 3628: 3625: 3621: 3618: 3617: 3616: 3613: 3607: 3604: 3603: 3602: 3599: 3597: 3594: 3592: 3589: 3587: 3584: 3582: 3579: 3575: 3572: 3570: 3567: 3565: 3562: 3561: 3559: 3558: 3557: 3554: 3552: 3549: 3547: 3544: 3542: 3539: 3537: 3534: 3533: 3531: 3527: 3521: 3518: 3516: 3513: 3511: 3508: 3506: 3503: 3501: 3498: 3496: 3493: 3492: 3490: 3488: 3487:Uninterpreted 3484: 3480: 3473: 3468: 3466: 3461: 3459: 3454: 3453: 3450: 3438: 3435: 3433: 3430: 3428: 3425: 3423: 3420: 3418: 3415: 3413: 3410: 3408: 3405: 3403: 3400: 3398: 3395: 3394: 3392: 3388: 3382: 3379: 3377: 3374: 3372: 3369: 3367: 3364: 3362: 3359: 3357: 3354: 3352: 3349: 3347: 3344: 3343: 3341: 3339: 3335: 3329: 3326: 3324: 3321: 3319: 3316: 3314: 3311: 3310: 3308: 3306: 3302: 3296: 3293: 3291: 3288: 3286: 3283: 3281: 3278: 3277: 3275: 3273: 3269: 3263: 3260: 3258: 3255: 3254: 3252: 3248: 3242: 3239: 3237: 3234: 3232: 3229: 3227: 3224: 3222: 3219: 3217: 3214: 3212: 3209: 3207: 3204: 3202: 3199: 3198: 3196: 3194: 3190: 3184: 3181: 3179: 3176: 3174: 3171: 3169: 3166: 3164: 3161: 3159: 3156: 3154: 3151: 3149: 3148:Edit distance 3146: 3144: 3141: 3139: 3136: 3134: 3131: 3130: 3128: 3126: 3125:String metric 3122: 3118: 3111: 3106: 3104: 3099: 3097: 3092: 3091: 3088: 3076: 3074:0-201-02988-X 3070: 3065: 3064: 3055: 3047: 3043: 3036: 3034: 3026: 3024: 3018: 3012: 3008: 3004: 2998: 2991: 2989: 2985: 2978: 2972: 2970:0-53492-373-9 2966: 2962: 2955: 2947: 2943: 2937: 2930: 2926: 2922: 2916: 2908: 2902: 2895: 2891: 2889:981-02-4782-6 2885: 2881: 2874: 2867: 2863: 2859: 2852: 2836: 2832: 2828: 2822: 2813: 2807: 2803: 2800: 2795: 2789: 2785: 2782: 2777: 2770: 2766: 2763: 2759: 2755: 2752: 2746: 2738: 2734: 2727: 2719: 2713: 2705: 2698: 2690: 2686: 2684:0-13-034074-X 2680: 2676: 2675: 2670: 2664: 2656: 2652: 2648: 2644: 2640: 2636: 2629: 2622: 2614: 2613: 2608: 2602: 2594: 2593: 2585: 2577: 2573: 2569: 2563: 2555: 2551: 2550:"string (n.)" 2545: 2537: 2530: 2522: 2518: 2512: 2510: 2508: 2499: 2492: 2484: 2480: 2473: 2466: 2462: 2458: 2452: 2448: 2437: 2436:String metric 2434: 2431: 2428: 2425: 2422: 2419: 2416: 2413: 2410: 2408: 2405: 2402: 2399: 2396: 2393: 2390: 2387: 2384: 2381: 2380: 2374: 2372: 2368: 2364: 2362: 2358: 2357:-adic numbers 2356: 2351: 2350:inverse limit 2347: 2339: 2335: 2331: 2328: 2325: 2323: 2317: 2314: 2310: 2307:-dimensional 2306: 2302: 2298: 2297: 2296: 2294: 2285: 2276: 2274: 2264: 2261: 2256: 2253: 2249: 2245: 2241: 2237: 2233: 2223: 2221: 2217: 2213: 2209: 2205: 2201: 2197: 2193: 2189: 2185: 2181: 2177: 2173: 2163: 2161: 2157: 2153: 2149: 2139: 2137: 2136:prefix orders 2133: 2129: 2125: 2121: 2118:is nonempty, 2117: 2113: 2109: 2105: 2101: 2097: 2093: 2089: 2085: 2081: 2078:is nonempty, 2077: 2073: 2069: 2065: 2061: 2057: 2053: 2043: 2041: 2040:least element 2037: 2036:partial order 2033: 2029: 2025: 2021: 2017: 2013: 2009: 2005: 2004: 1999: 1994: 1978: 1970: 1967: 1964: 1961: 1951: 1945: 1942: 1936: 1930: 1927: 1921: 1918: 1912: 1889: 1883: 1870: 1862: 1859: 1851: 1847: 1843: 1839: 1835: 1831: 1827: 1823: 1819: 1815: 1810: 1805: =  1804: 1797: =  1796: 1789: =  1788: 1781: =  1780: 1776: 1772: 1768: 1764: 1760: 1756: 1752: 1751: 1750:Concatenation 1741: 1738: 1737: 1732: 1727: 1725: 1704: 1691: 1685: 1677: 1674: 1670: 1666: 1661: 1649: 1648: 1647: 1645: 1640: 1638: 1633: 1631: 1627: 1623: 1622: 1617: 1613: 1609: 1605: 1601: 1600: 1594: 1592: 1588: 1584: 1580: 1576: 1572: 1568: 1562: 1555:Formal theory 1552: 1547: 1543: 1539: 1534: 1532: 1531:concatenation 1519: 1513: 1512:string length 1508: 1506: 1502: 1497: 1487: 1485: 1480: 1478: 1474: 1470: 1466: 1461: 1459: 1455: 1451: 1447: 1442: 1440: 1433: 1430: 1428: 1425: 1423: 1420: 1418: 1415: 1413: 1410: 1408: 1405: 1403: 1400: 1398: 1395: 1393: 1390: 1388: 1385: 1384: 1383: 1375: 1373: 1369: 1362: 1359: 1356: 1353: 1350: 1347: 1345: 1342: 1340: 1337: 1334: 1331: 1330: 1329: 1326: 1324: 1320: 1316: 1312: 1305: 1304:String theory 1295: 1293: 1289: 1283: 1281: 1276: 1272: 1268: 1255: 1251: 1247: 1244: 1240: 1229:double quote 1228: 1224: 1220: 1219: 1218: 1215: 1210: 1200: 1198: 1194: 1189: 1187: 1182: 1178: 1167: 1165: 1161: 1157: 1153: 1148: 1143: 1141: 1136: 1134: 1130: 1124: 1121: 1119: 1109: 1107: 1099: 1057: 1055: 1037: 1031: 1025: 1019: 1013: 1007: 1001: 995: 989: 983: 982: 976: 971: 966: 961: 956: 951: 946: 941: 936: 931: 930: 927: 924: 920: 918: 914: 910: 906: 902: 899:) space (see 898: 894: 890: 887:space, where 886: 882: 878: 873: 871: 867: 866:address space 863: 859: 855: 851: 850:Pascal string 847: 837: 834: 831: 827: 822: 816: 812: 794: 792: 788: 787:ASCIZ strings 771: 765: 759: 753: 747: 741: 735: 729: 723: 720: 715: 714: 708: 703: 698: 693: 690: 686: 681: 676: 671: 666: 661: 660: 657: 655: 651: 647: 643: 639: 634: 632: 628: 623: 621: 617: 613: 609: 605: 601: 595: 585: 583: 578: 573: 569: 564: 561: 556: 554: 543: 541: 537: 532: 530: 526: 522: 517: 515: 508: 503: 497: 496:StringBuilder 492: 488: 484: 480: 476: 472: 468: 464: 460: 456: 452: 442: 440: 435: 431: 427: 423: 420: 416: 411: 407: 403: 399: 395: 391: 387: 383: 381: 377: 373: 362: 360: 356: 352: 348: 344: 340: 336: 329:String length 326: 324: 320: 316: 312: 308: 304: 300: 294: 284: 282: 278: 274: 271:According to 268: 263: 261: 256: 254: 250: 246: 241: 239: 229: 227: 219: 211: 208: 200: 197: 193: 185: 182: 178: 174: 166: 165: 164: 161: 152: 150: 146: 142: 138: 134: 130: 125: 123: 119: 114: 112: 108: 103: 101: 97: 93: 89: 85: 81: 77: 73: 69: 65: 61: 57: 53: 49: 45: 37: 32: 19: 4299:Nested stack 4242:Context-free 4167:Context-free 4128:Unrestricted 3757:Intersection 3689: 3351:Suffix array 3257:Aho–Corasick 3168:Lee distance 3116: 3062: 3054: 3045: 3022: 3020: 3006: 2997: 2987: 2980: 2976: 2974: 2960: 2954: 2936: 2928: 2915: 2893: 2879: 2873: 2865: 2861: 2851: 2839:. Retrieved 2830: 2821: 2812: 2794: 2776: 2745: 2726: 2697: 2673: 2663: 2638: 2634: 2621: 2611: 2601: 2590: 2584: 2575: 2562: 2553: 2544: 2535: 2529: 2520: 2491: 2482: 2472: 2464: 2451: 2418:Empty string 2367:Isomorphisms 2365: 2354: 2343: 2333: 2321: 2312: 2304: 2300: 2292: 2290: 2270: 2257: 2252:well-founded 2229: 2219: 2215: 2211: 2207: 2203: 2199: 2195: 2194:= 00110 and 2191: 2187: 2183: 2179: 2175: 2171: 2169: 2155: 2151: 2147: 2145: 2131: 2127: 2123: 2119: 2115: 2111: 2107: 2103: 2099: 2091: 2087: 2083: 2079: 2075: 2071: 2067: 2063: 2059: 2051: 2049: 2027: 2023: 2019: 2015: 2011: 2007: 2001: 1997: 1995: 1905:, such that 1837: 1833: 1829: 1825: 1811: 1802: 1794: 1786: 1778: 1774: 1770: 1766: 1762: 1758: 1748: 1747: 1734: 1728: 1721: 1641: 1636: 1634: 1629: 1625: 1621:empty string 1619: 1615: 1607: 1603: 1602:of a string 1597: 1595: 1590: 1582: 1578: 1574: 1564: 1535: 1509: 1499: 1481: 1462: 1454:embedded SQL 1443: 1436: 1381: 1368:suffix trees 1365: 1327: 1318: 1308: 1284: 1264: 1216: 1212: 1190: 1173: 1158:of lines, a 1144: 1137: 1125: 1122: 1115: 1106:string (C++) 1095: 1051: 925: 921: 916: 912: 908: 896: 892: 888: 884: 880: 874: 861: 853: 849: 843: 835: 823: 800: 786: 780: 637: 635: 626: 624: 615: 611: 607: 597: 565: 559: 557: 549: 533: 529:linked lists 518: 502:StringBuffer 486: 462: 448: 438: 424: 384: 368: 342: 339:compile time 334: 332: 322: 318: 298: 296: 270: 265: 257: 242: 235: 223: 218:query string 214:?action=edit 192:social media 162: 158: 126: 115: 104: 91: 47: 41: 4308:restricted 3987:Type theory 3982:Type system 3832:Bottom type 3779:Option type 3720:generalized 3606:Long double 3551:Fixed point 3361:Suffix tree 2607:Lewis, C.I. 2383:Binary-safe 2236:total order 1846:free monoid 1818:commutative 1814:associative 1565:Let Σ be a 1549:REPNZ MOVSB 1319:stringology 1280:8-bit clean 1275:binary data 1271:byte string 1160:piece table 1156:linked list 1147:text editor 631:binary data 560:byte string 514:thread-safe 439:code points 386:Logographic 260:C. I. Lewis 238:compositors 216:" as a URL 177:source code 118:source code 4397:Data types 4381:Categories 3892:Empty type 3887:Type class 3837:Collection 3794:Refinement 3772:metaobject 3620:signedness 3479:Data types 3023:expression 2986:(for some 2760:. Section 2574:"string". 2534:"string". 2443:References 2361:Cantor set 2160:palindrome 2126:suffix of 2106:such that 2086:prefix of 2066:such that 2038:on Σ, the 2022:such that 1816:, but non- 1583:expression 1567:finite set 1559:See also: 1546:intel x86m 1494:See also: 1351:algorithms 1311:algorithms 1267:bit string 1152:gap buffer 895:takes log( 505:, and the 471:JavaScript 406:ideographs 291:See also: 147:called an 56:characters 36:characters 4262:Star-free 4216:Positive 4206:Decidable 4141:Positive 4065:Languages 3967:Subtyping 3962:Interface 3945:metaclass 3897:Unit type 3867:Semaphore 3847:Exception 3752:Inductive 3742:Dependent 3707:Composite 3685:Character 3667:Reference 3564:Minifloat 3520:Bit array 3048:. Kluwer. 3005:(2010) . 2901:cite book 2479:"Strings" 2389:Bit array 2324:-ary tree 2309:hypercube 2170:A string 2166:Rotations 2050:A string 2003:substring 1996:A string 1979:∗ 1975:Σ 1971:∈ 1959:∀ 1884:∪ 1876:↦ 1871:∗ 1867:Σ 1701:Σ 1686:∪ 1678:∈ 1671:⋃ 1662:∗ 1658:Σ 1323:Zvi Galil 1254:INI files 1243:backslash 1199:attacks. 879:, taking 830:word mark 558:The term 531:instead. 487:immutable 419:Shift-JIS 361:" below. 203:AGATGCCGT 173:end users 72:data type 4060:Grammars 3992:Variable 3882:Top type 3747:Equality 3655:physical 3632:Rational 3627:Interval 3574:bfloat16 2946:Archived 2925:Archived 2835:Archived 2802:Archived 2784:Archived 2765:Archived 2754:Archived 2737:Archived 2712:cite web 2689:archived 2609:(1918). 2461:Archived 2377:See also 2320:perfect 2279:Topology 2260:Shortlex 2232:ordering 2142:Reversal 2032:relation 1618:|. The 1587:sequence 1571:alphabet 1357:a string 1315:analyzed 1181:attacker 1118:C string 854:P-string 826:IBM 1401 809:used by 608:C string 553:ISO 8859 415:ISO-2022 394:Japanese 196:database 149:alphabet 107:variable 64:variable 52:sequence 4283:Decider 4257:Regular 4224:Indexed 4182:Regular 4149:Indexed 3935:Generic 3911:Related 3827:Boolean 3784:Product 3660:virtual 3650:Address 3642:Pointer 3615:Integer 3546:Decimal 3541:Complex 3529:Numeric 3427:Sorting 3397:Parsing 3117:Strings 2655:2003242 2030:. The 1807:hugbear 1799:bearhug 1793:, then 1355:Parsing 1250:newline 1225:(ASCII 1162:, or a 1054:records 919:space. 911:in log( 525:Haskell 463:mutable 426:Unicode 390:Chinese 380:mangled 319:literal 232:History 155:Purpose 141:symbols 68:mutated 4335:Finite 4267:Finite 4112:Type-3 4103:Type-2 4085:Type-1 4079:Type-0 3925:Boxing 3913:topics 3872:Stream 3809:tagged 3767:Object 3690:String 3071:  3013:  2967:  2886:  2841:23 May 2681:  2653:  2336:-node 2222:=ab). 2202:=abc, 2124:proper 2096:suffix 2084:proper 2056:prefix 2008:factor 1844:, the 1842:monoid 1785:, and 1731:subset 1599:length 1575:string 1518:length 1469:Python 1458:printf 1422:SNOBOL 1098:hidden 1073:length 1070:size_t 1064:string 933:length 862:length 846:Pascal 642:buffer 627:length 582:UTF-32 568:arrays 540:Erlang 536:Prolog 521:arrays 481:, and 479:Python 434:UTF-32 398:Korean 396:, and 376:EBCDIC 315:syntax 313:. The 281:SNOBOL 253:formal 247:, and 96:arrays 92:String 48:string 4293:PTIME 3820:Other 3804:Union 3737:Class 3727:Array 3510:Tryte 3390:Other 3346:DAFSA 3313:BLAST 2651:S2CID 2641:(7). 2631:(PDF) 2248:total 2238:(cf. 2210:=bc, 2114:. If 2074:. If 1591:01011 1561:Tuple 1536:Some 1463:Many 1448:like 1444:Some 1437:Many 1397:MUMPS 1235:'str' 1231:"str" 1140:ropes 1061:class 858:words 817:used 783:FRANK 650:UTF-8 646:ASCII 507:Cocoa 430:UTF-8 372:ASCII 277:COMIT 84:words 80:bytes 4040:and 3940:Kind 3902:Void 3762:List 3677:Text 3515:Word 3505:Trit 3500:Byte 3381:Trie 3371:Rope 3069:ISBN 3011:ISBN 2965:ISBN 2907:link 2884:ISBN 2843:2015 2718:link 2679:ISBN 2258:See 2218:=c, 2018:and 1836:ε = 1801:and 1783:bear 1761:and 1596:The 1579:word 1577:(or 1573:. A 1446:APIs 1439:Unix 1412:Ruby 1407:Rexx 1402:Perl 1392:Icon 1370:and 1278:not 1227:0x22 1164:rope 1154:, a 1102:text 1085:text 1079:char 915:) + 815:ZX80 656:is: 538:and 491:.NET 467:Java 459:Ruby 457:and 455:Perl 417:and 226:bits 135:and 100:list 82:(or 46:, a 3799:Set 3495:Bit 2643:doi 2315:-1. 2182:if 2098:of 2058:of 2028:usv 2010:of 2006:or 1993:). 1828:, ε 1791:hug 1628:or 1581:or 1551:). 1540:'s 1523:len 1521:or 1456:or 1432:TTM 1427:Tcl 1417:sed 1387:AWK 1269:or 852:or 811:CDC 689:NUL 577:UCS 493:'s 475:Lua 451:C++ 410:EUC 402:CJK 374:or 321:or 207:DNA 145:set 127:In 78:of 54:of 42:In 4383:: 4036:: 3032:^ 3019:. 2973:. 2944:. 2923:. 2903:}} 2899:{{ 2892:. 2864:. 2860:. 2833:. 2829:. 2735:. 2714:}} 2710:{{ 2687:, 2649:. 2639:15 2637:. 2633:. 2570:; 2552:. 2506:^ 2481:. 2459:. 2373:. 2275:. 2188:vu 2186:= 2176:uv 2174:= 2138:. 2112:us 2110:= 2072:su 2070:= 2026:= 1832:= 1809:. 1803:ts 1795:st 1775:st 1632:. 1507:. 1479:. 1452:, 1374:. 1294:. 1135:. 1108:. 1091:}; 1040:16 1038:77 1034:16 1032:66 1028:16 1026:65 1022:16 1020:6B 1016:16 1014:4B 1010:16 1008:4E 1004:16 1002:41 998:16 996:52 992:16 990:46 986:16 984:05 883:+ 872:. 803:$ 774:16 772:77 768:16 766:66 762:16 760:65 756:16 754:6B 750:16 748:00 744:16 742:4B 738:16 736:4E 732:16 730:41 726:16 724:52 719:16 716:46 633:. 622:. 516:. 483:Go 477:, 473:, 469:, 453:, 392:, 325:. 297:A 151:. 90:. 4228:— 4186:— 4153:— 4118:— 4115:— 4109:— 4106:— 4100:— 4097:— 4094:— 4091:— 4088:— 4082:— 4026:e 4019:t 4012:v 3471:e 3464:t 3457:v 3109:e 3102:t 3095:v 3077:. 2988:n 2983:n 2981:I 2909:) 2845:. 2771:. 2720:) 2657:. 2645:: 2556:. 2500:. 2485:. 2355:p 2340:. 2334:k 2326:. 2322:k 2313:k 2305:n 2301:n 2293:k 2220:v 2216:u 2212:v 2208:u 2204:v 2200:u 2196:v 2192:u 2184:t 2180:t 2172:s 2156:s 2152:s 2148:s 2132:t 2128:t 2120:s 2116:u 2108:t 2104:u 2100:t 2092:s 2088:t 2080:s 2076:u 2068:t 2064:u 2060:t 2052:s 2024:t 2020:v 2016:u 2012:t 1998:s 1968:t 1965:, 1962:s 1955:) 1952:t 1949:( 1946:L 1943:+ 1940:) 1937:s 1934:( 1931:L 1928:= 1925:) 1922:t 1919:s 1916:( 1913:L 1893:} 1890:0 1887:{ 1880:N 1863:: 1860:L 1838:s 1834:s 1830:s 1826:s 1787:t 1779:s 1771:t 1767:s 1763:t 1759:s 1705:n 1695:} 1692:0 1689:{ 1682:N 1675:n 1667:= 1637:n 1630:λ 1626:ε 1616:s 1608:s 1604:s 1306:. 1256:. 1088:; 1082:* 1076:; 1067:{ 978:w 973:f 968:e 963:k 958:K 953:N 948:A 943:R 938:F 917:n 913:n 909:n 897:n 893:n 889:k 885:k 881:n 819:" 807:: 710:w 705:f 700:e 695:k 683:K 678:N 673:A 668:R 663:F 616:n 612:n 575:( 209:. 198:. 183:. 20:)

Index

Character string (computer science)
Diagram of String data in computing. Shows the word "example" with each letter in a separate box. The word "String" is above, referring to the entire sentence. The label "Character" is below and points to an individual box.
characters
computer programming
sequence
characters
literal constant
variable
mutated
data type
array data structure
bytes
words
character encoding
arrays
list
variable
dynamic allocation
source code
string literal
formal languages
mathematical logic
theoretical computer science
symbols
set
alphabet
end users
source code
string literal
social media

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.