Knowledge

String (computer science)

Source 📝

20: 2273: 821:
bit to delimit strings at the left, where the operation would start at the right. This bit had to be clear in all other parts of the string. This meant that, while the IBM 1401 had a seven-bit word, almost no-one ever thought to use this as a feature, and override the assignment of the seventh bit to
148:
A primary purpose of strings is to store human-readable text, like words and sentences. Strings are used to communicate information from a computer program to the user of the program. A program may also accept string input from its user. Further, strings may store data expressed as characters yet not
2251:
for an alternative string ordering that preserves well-foundedness. For the example alphabet, the shortlex order is ε < 0 < 1 < 00 < 01 < 10 < 11 < 000 < 001 < 010 < 011 < 100 < 101 < 0110 < 111 < 0000 < 0001 < 0010 < 0011 < ... < 1111 <
1138:
is the one that manages the string (sequence of characters) that represents the current state of the file being edited. While that state could be stored in a single long consecutive array of characters, a typical text editor instead uses an alternative representation as its sequence data structure—a
1266:
retrieved from a communications medium. This data may or may not be represented by a string-specific datatype, depending on the needs of the application, the desire of the programmer, and the capabilities of the programming language being used. If the programming language's string implementation is
1202:
Sometimes, strings need to be embedded inside a text file that is both human-readable and intended for consumption by a machine. This is needed in, for example, source code of programming languages, or in configuration files. In this case, the NUL character does not work well as a terminator since
2243:
for any nontrivial alphabet, even if the alphabetical order is. For example, if Σ = {0, 1} and 0 < 1, then the lexicographical order on Σ includes the relationships ε < 0 < 00 < 000 < ... < 0001 < ... < 001 < ... < 01 < 010 < ... < 011 < 0110 < ... <
358:
String datatypes have historically allocated one byte per character, and, although the exact character set varied by region, character encodings were similar enough that programmers could often get away with ignoring this, since characters a program treated specially (such as period and space and
255:
A mathematical system is any set of strings of recognisable marks in which some of the strings are taken initially and the remainder derived from these by operations performed according to rules which are independent of any meaning assigned to the marks. That a system should consist of 'marks'
551:
usually indicates a general-purpose string of bytes, rather than strings of only (readable) characters, strings of bits, or such. Byte strings often imply that bytes can take any value and any data can be stored as-is, meaning that there should be no value interpreted as a termination value.
410:
do not make such guarantees, making matching on byte codes unsafe. These encodings also were not "self-synchronizing", so that locating character boundaries required backing up to the start of a string, and pasting two strings together could result in corruption of the second string.
1115:
It is possible to create data structures and functions that manipulate them that do not have the problems associated with character termination and can in principle overcome length code bounds. It is also possible to optimize the string represented using techniques from
568:
code points) can take anywhere from one to four bytes, and single characters can take an arbitrary number of codes. In these cases, the logical length of the string (number of characters) differs from the physical length of the array (number of bytes in use).
1274:
C programmers draw a sharp distinction between a "string", aka a "string of characters", which by definition is always null terminated, vs. a "byte string" or "pseudo string" which may be stored in the same array but is often not null terminated. Using
539:
Representations of strings depend heavily on the choice of character repertoire and the method of character encoding. Older string implementations were designed to work with repertoire and encoding defined by ASCII, or more recent extensions like the
346:. The string length can be stored as a separate integer (which may put another artificial limit on the length) or implicitly through a termination character, usually a character value with all bits zero such as in C programming language. See also " 393:) need far more than 256 characters (the limit of a one 8-bit byte per-character encoding) for reasonable representation. The normal solutions involved keeping single-byte representations for ASCII and using two-byte representations for CJK 2244:
01111 < ... < 1 < 10 < 100 < ... < 101 < ... < 111 < ... < 1111 < ... < 11111 ... With respect to this ordering, e.g. the infinite set { 1, 01, 001, 0001, 00001, 000001, ... } has no minimal element.
1706: 1503:
function – the function that returns the length of a string (not counting any terminator characters or any of the string's internal structural information) and does not modify the string. This function is often named
563:
of corresponding characters. The principal difference is that, with certain encodings, a single logical character may take up more than one entry in the array. This happens for example with UTF-8, where single codes
1163:
The differing memory layout and storage requirements of strings can affect the security of the program accessing the string data. String representations requiring a terminating character are commonly susceptible to
774:", is 5 characters, but it occupies 6 bytes. Characters after the terminator do not form part of the representation; they may be either part of other data or just garbage. (Strings of this form are sometimes called 1203:
it is normally invisible (non-printable) and is difficult to input via a keyboard. Storing the string length would also be inconvenient as manual computation and tracking of the length is tedious and error-prone.
425:
require the programmer to know that the fixed-size code units are different from the "characters", the main difficulty currently is incorrectly designed APIs that attempt to hide this difference (UTF-32 does make
1980: 1172:
deliberately altering the data. String representations adopting a separate length field are also susceptible if the length can be manipulated. In such cases, program code accessing the string data requires
1728:
over Σ. For example, if Σ = {0, 1}, the set of strings with an even number of zeros, {ε, 1, 00, 11, 001, 010, 100, 111, 0000, 0011, 0101, 0110, 1001, 1010, 1100, 1111, ...}, is a formal language over Σ.
401:
family guarantee that a byte value in the ASCII range will represent only that ASCII character, making the encoding safe for systems that use those characters as field separators. Other encodings such as
1892: 911:
In the latter case, the length-prefix field itself does not have fixed length, therefore the actual string data needs to be moved when the string grows such that the length field needs to be increased.
322:
Although formal strings can have an arbitrary finite length, the length of strings in real languages is often constrained to an artificial maximum. In general, there are two types of string datatypes:
1180:
String data is frequently obtained from user input to a program. As such, it is the responsibility of the program to validate the string to ensure that it represents the expected format. Performing
1371:
Character strings are such a useful datatype that several languages have been designed in order to make string processing applications easy to write. Examples include the following languages:
397:. Use of these with existing code led to problems with matching and cutting of strings, the severity of which depended on how the character encoding was designed. Some encodings such as the 1492:
are used to create strings or change the contents of a mutable string. They also are used to query information about a string. The set of functions and their names varies depending on the
790:
Using a special byte other than null for terminating strings has historically appeared in both hardware and software, though sometimes with a value that was also a printing character.
825:
Early microcomputer software relied upon the fact that ASCII codes do not use the high-order bit, and set it to indicate the end of a string. It must be reset to 0 prior to output.
880:
is the number of characters in a word (8 for 8-bit ASCII on a 64-bit machine, 1 for 32-bit UTF-32/UCS-4 on a 32-bit machine, etc.). If the length is not bounded, encoding a length
512:
of bytes, characters, or code units, in order to allow fast access to individual units or substrings—including characters when they have a fixed length. A few languages such as
1430:
utilities perform simple string manipulations and can be used to easily program some powerful string processing algorithms. Files and finite streams may be viewed as strings.
2333:
The natural topology on the set of fixed-length strings or variable-length strings is the discrete topology, but the natural topology on the set of infinite strings is the
501:. There are both advantages and disadvantages to immutability: although immutable strings may require inefficiently creating many copies, they are simpler and completely 306:
of most high-level programming languages allows for a string, usually quoted in some way, to represent an instance of a string datatype; such a meta-string is called a
1506: 845:. Storing the string length as byte limits the maximum string length to 255. To avoid such limitations, improved implementations of P-strings use 16-, 32-, or 64-bit 1226:), used by most programming languages. To be able to include special characters such as the quotation mark itself, newline characters, or non-printable characters, 4013: 2395: 1484: 281: 1105:
Both character termination and length codes limit strings: For example, C character arrays that contain null (NUL) characters cannot be handled directly by
1254:
While character strings are very common uses of strings, a string in computer science may refer generically to any sequence of homogeneously typed data. A
833:
The length of a string can also be stored explicitly, for example by prefixing the string with the length as a byte value. This convention is used in many
544:
series. Modern implementations often use the extensive repertoire defined by Unicode along with a variety of complex encodings such as UTF-8 and UTF-16.
2934: 2725: 1641: 290:
is a datatype modeled on the idea of a formal string. Strings are such an important and useful datatype that they are implemented in nearly every
417:
has simplified the picture somewhat. Most programming languages now have a datatype for Unicode strings. Unicode's preferred byte stream format
2895: 1464:
to facilitate text operations. Perl is particularly noted for its regular expression use, and many other languages and applications implement
531:, avoid implementing a dedicated string datatype at all, instead adopting the convention of representing strings as lists of character codes. 2449: 4333: 3334: 334:, whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see 4006: 225:
Use of the word "string" to mean any items arranged in a line, series or succession dates back centuries. In 19th-Century typesetting,
3579: 3229: 2913: 2359: 1897: 229:
used the term "string" to denote a length of type printed on paper; the string would be measured to determine the compositor's pay.
3268: 98:
declared to be a string may either cause storage in memory to be statically allocated for a predetermined maximum length or employ
2753: 3950: 3584: 3194: 1844: 4220: 3999: 3574: 3569: 3199: 232:
Use of the word "string" to mean "a sequence of symbols or linguistic elements in a definite order" emerged from mathematics,
2486: 2742: 484: 4145: 3557: 3458: 3204: 2790: 490: 359:
comma) were in the same place in all the encodings a program would encounter. These character sets were typically based on
2772: 4235: 3003: 2616: 1465: 4375: 4160: 3415: 3189: 2692: 1711:
For example, if Σ = {0, 1}, then Σ = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, ...}. Although the set Σ itself is
342:
are variable-length strings. Of course, even variable-length strings are limited in length – by the size of available
209:. Often these are intended to be somewhat human-readable, though their primary purpose is to communicate to computers. 3735: 3283: 3224: 3096: 3061: 2957: 2876: 2671: 1434: 2260:
A number of additional operations on strings commonly occur in the formal theory. These are given in the article on
478:
strings. Some of these languages with immutable strings also provide another type that is mutable, such as Java and
213:
The term string may also designate a sequence of data or computer records other than characters — like a "string of
4328: 4313: 3131: 1155:—which makes certain string operations, such as insertions, deletions, and undoing previous edits, more efficient. 4395: 4189: 3708: 864:
If the length is bounded, then it can be encoded in constant space, typically a machine word, thus leading to an
2846: 4410: 3825: 3630: 3562: 3524: 3311: 1453: 513: 48: 2930: 4357:
Any language in each category is generated by a grammar and by an automaton in the category in the same line.
2706: 1628:
is denoted Σ. For example, if Σ = {0, 1}, then Σ = {00, 01, 10, 11}. We have Σ = {ε} for every alphabet Σ.
1457: 1327: 834: 818: 528: 467: 587:
The length of a string can be stored implicitly by using a special terminating character; often this is the
4206: 4131: 3725: 3655: 3503: 3316: 3171: 2599: 1530: 1438: 125: 72: 4405: 4380: 4303: 3980: 3615: 3395: 3121: 2823: 2721: 1559: 1400: 1380: 455: 447: 137: 52: 3991: 474:, the value is fixed and a new string must be created if any alteration is to be made; these are termed 4199: 3603: 3400: 3278: 3245: 3181: 2580: 1420: 1321: 1169: 463: 2677: 614:
In terminated strings, the terminating code is not an allowable character in any string. Strings with
421:
is designed not to have the problems described above for older multibyte encodings. UTF-8, UTF-16 and
4390: 4124: 3913: 3865: 3777: 3755: 3750: 3544: 3498: 3410: 3306: 3250: 3151: 1181: 1042: 858: 630: 471: 88: 1177:
to ensure that it does not inadvertently access or change data outside of the string memory limits.
1109:
library functions: Strings using a length code are limited to the maximum value of the length code.
450:, normally allow the contents of a string to be changed after it has been created; these are termed 4400: 4276: 4271: 3787: 3451: 3405: 3209: 3141: 1131:
makes certain string operations, such as insertions, deletions, and concatenations more efficient.
799: 592: 226: 59:
and the length changed, or it may be fixed (after creation). A string is generally considered as a
3940: 3855: 3354: 2445: 1522:, where a new string is created by appending two strings, often this is the + addition operator. 1093:
is a pointer to a dynamically allocated memory area, which might be expanded as needed. See also
915:
Here is a Pascal string stored in a 10-byte buffer, along with its ASCII / UTF-8 representation:
893: 865: 608: 565: 2883:
The term stringology is a popular nickname for string algorithms as well as for text algorithms.
2135:
The reverse of a string is a string with the same symbols but in reverse order. For example, if
4287: 4225: 4150: 3683: 3539: 3493: 2556: 1303: 582: 367:. If text in one encoding was displayed on a system using a different encoding, text was often 95: 2187:= 01. As another example, the string abc has three different rotations, viz. abc itself (with 4230: 4178: 3673: 3648: 3359: 3301: 3161: 3089: 2847:"Former Dean Zvi Galil Named a Top 10 Most Influential Computer Scientist in the Past Decade" 2467: 2418: 2412: 2389: 2232: 1152: 1128: 339: 44: 24: 4323: 4298: 4155: 4116: 3475: 3166: 2991: 1600: 1493: 1472: 1360: 1355:
Advanced string algorithms often employ complex mechanisms and data structures, among them
1215: 707: 371:, though often somewhat readable and some computer users learned to read the mangled text. 291: 64: 32: 2909: 2358:
between string representations of topologies can be found by normalizing according to the
2148: 8: 4385: 3945: 3923: 3850: 3703: 3695: 3444: 3364: 1838: 1117: 264:, "the first realistic string handling and pattern matching language" for computers was 4308: 4250: 4194: 3928: 3908: 3860: 3835: 3620: 3589: 3293: 3260: 2889: 2750: 2639: 2560: 2228: 1712: 1475:, which permits arbitrary expressions to be evaluated and included in string literals. 1461: 1337: 1086: 642: 398: 183:
service. Instead of a string literal, the software would likely store this string in a
121: 99: 76: 4043: 3815: 3745: 3720: 3534: 3529: 3425: 3057: 3050: 2999: 2953: 2872: 2700: 2667: 2505: 2400: 2383: 2374:— a property of string manipulating functions treating their input as raw data stream 2261: 1332: 1276: 1168:
problems if the terminating character is not present, caused by a coding error or an
1106: 889: 779: 382: 335: 237: 133: 4292: 4245: 4212: 4058: 3960: 3845: 3643: 3420: 3390: 3344: 3146: 3082: 3030: 2643: 2631: 1810: 1743: 1227: 846: 378: 330:
and which use the same amount of memory whether this maximum is needed or not, and
56: 1302:
for processing strings, each with various trade-offs. Competing algorithms can be
4255: 4170: 4137: 4053: 4026: 4022: 3965: 3830: 3782: 3715: 3273: 3214: 3126: 2794: 2776: 2757: 2746: 2318: 2020: 1724: 1489: 1349: 1174: 1165: 591:(NUL), which has all bits zero, a convention used and perpetuated by the popular 556: 509: 386: 343: 129: 117: 84: 2787: 2739: 2272: 2179:. For example, if Σ = {0, 1} the string 0011001 is a rotation of 0100110, where 4266: 4048: 4030: 3918: 3740: 3730: 3638: 3326: 3219: 3028: 2918:
Perl's most famous strength is in string manipulation with regular expressions.
2769: 2334: 2326: 1558:
of distinct, unambiguous symbols (alternatively called characters), called the
1526: 1211: 1197: 1185: 1121: 1094: 677: 588: 560: 479: 390: 299: 295: 261: 233: 169: 110: 2855:
He invented the terms 'stringology,' which is a subfield of string algorithms,
27:, and are often used to store human-readable data, such as words or sentences. 4369: 4351: 3840: 3136: 3113: 2657: 2538: 2424: 2338: 2028: 2024: 1738: 1701:{\displaystyle \Sigma ^{*}=\bigcup _{n\in \mathbb {N} \cup \{0\}}\Sigma ^{n}} 1519: 1500: 1314:
for the theory of algorithms and data structures used for string processing.
1292: 854: 502: 241: 2084: 2044: 217:" — but when used without qualification it refers to strings of characters. 3797: 3772: 3339: 3156: 2406: 2240: 2220: 2139:= abc (where a, b, and c are symbols of the alphabet), then the reverse of 2124: 1609: 1442: 327: 206: 180: 2635: 1533:
contain direct support for string operations, such as block copy (e.g. In
1279:
functions on such a "byte string" often seems to work, but later leads to
4318: 4240: 4165: 4021: 3975: 3970: 3820: 3767: 3594: 3349: 2371: 2355: 2236: 2224: 1834: 1806: 1802: 1632: 1356: 1268: 1263: 1259: 1148: 1144: 1135: 619: 517: 495: 248: 165: 106: 2770:"strlcpy and strlcat - consistent, safe, string copy and concatenation." 2666:(2003 ed.), Upper Saddle River, NJ: Pearson Education, p. 40, 1120:(replacing repeated characters by the character value and a length) and 3880: 3875: 3792: 3760: 3665: 3608: 2595: 2349: 2308: 2307:
Variable-length strings (of finite length) can be viewed as nodes on a
1555: 1306:
with respect to run time, storage requirements, and so forth. The name
1255: 1140: 459: 394: 75:) that stores a sequence of elements, typically characters, using some 2280:
Strings admit the following interpretation as nodes on a graph, where
3955: 3933: 3890: 3885: 3552: 3508: 3467: 2377: 2341:
of the sets of finite strings. This is the construction used for the
2321:(otherwise not considered here) can be viewed as infinite paths on a 2297: 1991: 1534: 1311: 1299: 1231: 1089:, the string must be accessed and modified through member functions. 1041:
Many languages, including object-oriented ones, implement strings as
430:
fixed-sized, but these are not "characters" due to composing codes).
407: 374: 60: 2151:, which also includes the empty string and all strings of length 1. 1366: 94:
Depending on the programming language and precise data type used, a
3870: 3047: 2403:— passed to a driver to initiate a connection (e.g., to a database) 2248: 1754:
in Σ, their concatenation is defined as the sequence of symbols in
1575: 1242: 1127:
While these representations are common, others are possible. Using
814: 541: 403: 368: 184: 161: 40: 244:
behavior of symbolic systems, setting aside the symbols' meaning.
3385: 1238: 1112:
Both of these limitations can be overcome by clever programming.
414: 19: 2805:
Keith Thompson. "No, strncpy() is not a "safer" strcpy()". 2012.
555:
Most string implementations are very similar to variable-length
2342: 2123:. Both the relations "is a prefix of" and "is a suffix of" are 1975:{\displaystyle L(st)=L(s)+L(t)\quad \forall s,t\in \Sigma ^{*}} 1830: 1829:. Therefore, the set Σ and the concatenation operation form a 1719: 1587: 1446: 1410: 1343: 618:
field do not have this limitation and can also store arbitrary
570: 524: 422: 364: 303: 269: 2815: 4281: 3436: 2454:
String literals (or constants) are called 'anonymous strings'
1549: 1385: 837:
dialects; as a consequence, some people call such a string a
638: 634: 439: 418: 360: 265: 4350:
Each category of languages, except those marked by a , is a
2421:— a data structure for efficiently manipulating long strings 2409:— its properties and representation in programming languages 1837:
generated by Σ. In addition, the length function defines a
3488: 3369: 3052:
Introduction to Automata Theory, Languages, and Computation
2661: 1887:{\displaystyle L:\Sigma ^{*}\mapsto \mathbb {N} \cup \{0\}} 1427: 1395: 1390: 1291:"Stringology" redirects here. For the physical theory, see 810:
since this was the string delimiter in its BASIC language.
803: 443: 68: 3074: 3010:
Any finite sequence of symbols of a language is called an
595:. Hence, this representation is commonly referred to as a 3483: 2485:
Francis, David M.; Merk, Heather L. (November 14, 2019).
1841:
from Σ to the non-negative integers (that is, a function
1578:
of symbols from Σ. For example, if Σ = {0, 1}, then
1415: 1405: 1375: 214: 195: 2948:
Fletcher, Peter; Hoyle, Hughes; Patty, C. Wayne (1991).
2604:. Berkeley: University of California Press. p. 355. 1613:
is the unique string over Σ of length 0, and is denoted
326:, which have a fixed maximum length to be determined at 1449:
use strings to hold commands that will be interpreted.
2396:
Comparison of programming languages (string functions)
2143:
is cba. A string that is the reverse of itself (e.g.,
1485:
Comparison of programming languages (string functions)
1184:
of user input can cause a program to be vulnerable to
813:
Somewhat similar, "data processing" machines like the
802:
systems (this character had a value of zero), and the
282:
Comparison of programming languages (string functions)
2693:"An Assembly Listing of the ROM of the Sinclair ZX80" 2415:— a string that cannot be compressed by any algorithm 1900: 1847: 1644: 1262:, for example, may be used to represent non-textual 2527:. Vol. X. Oxford at the Clarendon Press. 1933. 1631:The set of all strings over Σ of any length is the 1499:The most basic example of a string function is the 102:to allow it to hold a variable number of elements. 3049: 2947: 1974: 1886: 1715:, each element of Σ is a string of finite length. 1700: 1367:Character string-oriented languages and utilities 607:+ 1 space (1 for the terminator), and is thus an 4367: 2695:. Archived from the original on August 15, 2015. 2500: 2498: 2496: 770:The length of the string in the above example, " 2472:University of Utah, Kahlert School of Computing 1732: 1286: 1230:are often available, usually prefixed with the 2567:. New York: The Century Company. p. 5994. 4007: 3452: 3090: 2510:A Supplement to the Oxford English Dictionary 2493: 2487:"DNA as a Biochemical Entity and Data String" 2337:, viewing the set of infinite strings as the 2292:can be viewed as the integer locations in an 2223:on a set of strings. If the alphabet Σ has a 1478: 1471:Some languages such as Perl and Ruby support 1085:However, since the implementation is usually 3048:John E. Hopcroft, Jeffrey D. Ullman (1979). 2663:Computer Systems: A Programmer's Perspective 1881: 1875: 1809:operation. The empty string ε serves as the 1683: 1677: 1599:(the length of the sequence) and can be any 1518:would return 11. Another common function is 4329:Counter-free (with aperiodic finite monoid) 2788:"A rant about strcpy, strncpy and strlcpy." 2656: 2617:"Programming Languages: History and Future" 2484: 2231:) one can define a total order on Σ called 2214: 785: 256:instead of sounds or odours is immaterial. 4014: 4000: 3459: 3445: 3097: 3083: 2998:(Reprint ed.). CRC Press. p. 2. 2990: 2894:: CS1 maint: location missing publisher ( 2866: 2504: 2465: 2119:. Suffixes and prefixes are substrings of 1758:followed by the sequence of characters in 294:. In some languages they are available as 55:. The latter may allow its elements to be 3024: 3022: 2360:lexicographically minimal string rotation 2276:(Hyper)cube of binary strings of length 3 2239:if the alphabetical order is, but is not 1868: 1670: 1310:was coined in 1984 by computer scientist 194:" representing nucleic acid sequences of 3269:Comparison of regular-expression engines 2555: 2271: 2034: 2003:if there exist (possibly empty) strings 1624:The set of all strings over Σ of length 1324:for finding a given substring or pattern 1100: 168:, this message would likely appear as a 18: 2427:— notions of similarity between strings 1766:. For example, if Σ = {a, b, ..., z}, 1635:of Σ and is denoted Σ. In terms of Σ, 1317:Some categories of algorithms include: 201:Computer settings or parameters, like " 4368: 4221:Linear context-free rewriting language 3019: 2844: 2614: 1280: 573:avoids the first part of the problem. 454:strings. In other languages, such as 4146:Linear context-free rewriting systems 3995: 3440: 3230:Zhu–Takaoka string matching algorithm 3078: 2690: 2594: 1036: 849:to store the string length. When the 508:Strings are typically implemented as 353: 160:" is a string that software shows to 2740:"Data Structures for Text Sequences" 2255: 1158: 892:), so length-prefixed strings are a 794:was used by many assembler systems, 347: 152:Example strings and their purposes: 3195:Boyer–Moore string-search algorithm 3035:Mathematical Methods in Linguistics 2950:Foundations of Discrete Mathematics 2719: 2684: 1466:Perl compatible regular expressions 1249: 275: 128:, a string is a finite sequence of 105:When a string appears literally in 13: 4354:of the category directly above it. 2446:"Introduction To Java – MFC 158 G" 1963: 1947: 1855: 1718:A set of strings over Σ (i.e. any 1689: 1646: 1191: 857:, strings are limited only by the 828: 822:(for example) handle ASCII codes. 576: 534: 433: 14: 4422: 3284:Nondeterministic finite automaton 3225:Two-way string-matching algorithm 2968:is a finite sequence with domain 2916:from the original on 2012-04-21. 2452:from the original on 2016-03-03. 2392:— overview of C++ string handling 1241:sequence, for example in Windows 1045:with an internal structure like: 782:directive used to declare them.) 23:Strings are typically made up of 16:Sequence of characters, data type 2937:from the original on 2015-03-27. 2826:from the original on 1 June 2015 2728:from the original on 2017-04-10. 2512:. Oxford at the Clarendon Press. 2352:, and yields the same topology. 2219:It is often useful to define an 1543: 1206:Two common representations are: 317: 3041: 2984: 2941: 2923: 2902: 2860: 2845:Evarts, Holly (18 March 2021). 2838: 2808: 2799: 2781: 2763: 2732: 2713: 2680:from the original on 2007-08-06 2650: 2608: 2386:— overview of C string handling 2288:Fixed-length strings of length 2284:is the number of symbols in Σ: 2235:. The lexicographical order is 1946: 1795: 1787: 1779: 1771: 1454:scripting programming languages 63:and is often implemented as an 3466: 3200:Boyer–Moore–Horspool algorithm 3190:Apostolico–Giancarlo algorithm 2588: 2584:. January 11, 1898. p. 3. 2571: 2549: 2531: 2516: 2478: 2459: 2438: 2348:and some constructions of the 2031:of which is the empty string. 2023:"is a substring of" defines a 1943: 1937: 1928: 1922: 1913: 1904: 1864: 1328:String manipulation algorithms 896:, encoding a string of length 268:in the 1950s, followed by the 1: 3525:Arbitrary-precision or bignum 2816:"The Prague Stringology Club" 2722:"Design Notes for Tiny BASIC" 2615:Sammet, Jean E. (July 1972). 2525:The Oxford English Dictionary 2431: 1531:instruction set architectures 1494:computer programming language 1271:, data corruption may ensue. 1134:The core data structure in a 559:with the entries storing the 272:language of the early 1960s. 91:) data types and structures. 83:may also denote more general 3205:Knuth–Morris–Pratt algorithm 3132:Damerau–Levenshtein distance 2660:; David, O'Hallaron (2003), 2167:is said to be a rotation of 2154: 1733:Concatenation and substrings 1595:is the number of symbols in 1439:Multimedia Control Interface 1287:String processing algorithms 599:. This representation of an 149:intended for human reading. 126:theoretical computer science 7: 3396:Compressed pattern matching 3122:Approximate string matching 3104: 2867:Crochemore, Maxime (2002). 2543:Online Etymology Dictionary 2380:— a string of binary digits 2365: 2267: 2130: 1801:String concatenation is an 1746:on Σ. For any two strings 1322:String searching algorithms 1222:or ASCII 0x27 single quote 10: 4427: 4236:Deterministic context-free 4161:Deterministic context-free 3401:Longest common subsequence 3312:Needleman–Wunsch algorithm 3182:String-searching algorithm 2601:A survey of symbolic logic 2466:de St. Germain, H. James. 2079:. Symmetrically, a string 1603:; it is often denoted as | 1547: 1482: 1479:Character string functions 1290: 1195: 641:) representation as 8-bit 580: 338:). Most strings in modern 279: 220: 179:" as a status update on a 143: 4376:String (computer science) 4347: 4309:Nondeterministic pushdown 4037: 3899: 3866:Strongly typed identifier 3808: 3694: 3664: 3629: 3517: 3474: 3411:Sequential pattern mining 3378: 3325: 3292: 3259: 3251:Commentz-Walter algorithm 3239:Multiple string searching 3238: 3180: 3172:Wagner–Fischer algorithm 3112: 3033:; Robert E. Wall (1990). 2952:. PWS-Kent. p. 114. 2931:"x86 string instructions" 2705:: CS1 maint: unfit URL ( 2624:Communications of the ACM 2091:if there exists a string 2051:if there exists a string 190:Alphabetical data, like " 175:User-entered text, like " 3421:String rewriting systems 3406:Longest common substring 3317:Smith–Waterman algorithm 3142:Gestalt pattern matching 2964:Let Σ be an alphabet. A 2871:. Singapore. p. v. 2215:Lexicographical ordering 1182:limited or no validation 1047: 786:Byte- and bit-terminated 603:-character string takes 523:Some languages, such as 438:Some languages, such as 113:or an anonymous string. 3941:Parametric polymorphism 3355:Generalized suffix tree 3279:Thompson's construction 2557:Whitney, William Dwight 1574:) over Σ is any finite 1460:, Ruby, and Tcl employ 1234:character (ASCII 0x5C). 894:succinct data structure 866:implicit data structure 609:implicit data structure 488:, the thread-safe Java 389:(known collectively as 332:variable-length strings 132:that are chosen from a 4396:Combinatorics on words 4314:Deterministic pushdown 4190:Recursively enumerable 3307:Hirschberg's algorithm 2578:"Old Union's Demise". 2565:The Century Dictionary 2277: 2252:00000 < 00001 ... 1976: 1888: 1702: 627:null-terminated string 593:C programming language 583:Null-terminated string 258: 247:For example, logician 87:or other sequence (or 28: 4411:Algorithms on strings 3162:Levenshtein automaton 3152:Jaro–Winkler distance 2992:Shoenfield, Joseph R. 2869:Jewels of stringology 2636:10.1145/361454.361485 2419:Rope (data structure) 2413:Incompressible string 2300:with sides of length 2275: 2233:lexicographical order 2147:= madam) is called a 2035:Prefixes and suffixes 1977: 1889: 1703: 1516:length("hello world") 1361:finite-state machines 1101:Other representations 778:, after the original 340:programming languages 253: 177:I got a new job today 22: 4299:Tree stack automaton 3210:Rabin–Karp algorithm 3167:Levenshtein distance 2979:∈ ℕ) and codomain Σ. 2966:nonempty word over Σ 2851:Columbia Engineering 1898: 1845: 1642: 1601:non-negative integer 1582:is a string over Σ. 1473:string interpolation 629:stored in a 10-byte 324:fixed-length strings 292:programming language 158:file upload complete 120:, which are used in 65:array data structure 33:computer programming 4207:range concatenation 4132:range concatenation 3946:Primitive data type 3851:Recursive data type 3704:Algebraic data type 3580:Quadruple precision 3365:Ternary search tree 3068:Here: sect.1.1, p.1 3029:Barbara H. Partee; 2390:C++ string handling 2203:=a), and cab (with 1839:monoid homomorphism 1462:regular expressions 1118:run length encoding 643:hexadecimal numbers 240:to speak about the 164:. In the program's 109:, it is known as a 51:or as some kind of 39:is traditionally a 4406:Syntactic entities 4381:Character encoding 3909:Abstract data type 3590:Extended precision 3549:Reduced precision 3294:Sequence alignment 3261:Regular expression 3056:. Addison-Wesley. 2996:Mathematical Logic 2793:2016-02-29 at the 2775:2016-03-13 at the 2756:2016-04-04 at the 2745:2016-03-04 at the 2691:Wearmouth, Geoff. 2581:Milwaukee Sentinel 2561:Smith, Benjamin E. 2508:(1986). "string". 2278: 2229:alphabetical order 1972: 1884: 1722:of Σ) is called a 1713:countably infinite 1698: 1687: 1456:, including Perl, 1338:Regular expression 1333:Sorting algorithms 1037:Strings as records 516:implement them as 377:languages such as 354:Character encoding 122:mathematical logic 100:dynamic allocation 77:character encoding 29: 4363: 4362: 4342: 4341: 4304:Embedded pushdown 4200:Context-sensitive 4125:Context-sensitive 4059:Abstract machines 4044:Chomsky hierarchy 3989: 3988: 3721:Associative array 3585:Octuple precision 3434: 3433: 3426:String operations 3014:of that language. 2738:Charles Crowley. 2720:Allison, Dennis. 2658:Bryant, Randal E. 2401:Connection string 2384:C string handling 2262:string operations 2256:String operations 1813:; for any string 1762:, and is denoted 1658: 1281:security problems 1277:C string handling 1159:Security concerns 1034: 1033: 923: 890:fixed-length code 853:field covers the 780:assembly language 768: 767: 680: 633:, along with its 336:Memory management 298:and in others as 238:linguistic theory 4418: 4391:Formal languages 4358: 4355: 4319:Visibly pushdown 4293:Thread automaton 4241:Visibly pushdown 4209: 4166:Visibly pushdown 4134: 4121:(no common name) 4040: 4039: 4027:formal languages 4016: 4009: 4002: 3993: 3992: 3961:Type constructor 3846:Opaque data type 3778:Record or Struct 3575:Double precision 3570:Single precision 3461: 3454: 3447: 3438: 3437: 3391:Pattern matching 3345:Suffix automaton 3147:Hamming distance 3099: 3092: 3085: 3076: 3075: 3069: 3067: 3055: 3045: 3039: 3038: 3031:Alice ter Meulen 3026: 3017: 3016: 2988: 2982: 2981: 2945: 2939: 2938: 2927: 2921: 2920: 2910:"Essential Perl" 2906: 2900: 2899: 2893: 2885: 2864: 2858: 2857: 2842: 2836: 2835: 2833: 2831: 2812: 2806: 2803: 2797: 2785: 2779: 2767: 2761: 2736: 2730: 2729: 2717: 2711: 2710: 2704: 2696: 2688: 2682: 2681: 2654: 2648: 2647: 2621: 2612: 2606: 2605: 2592: 2586: 2585: 2575: 2569: 2568: 2553: 2547: 2546: 2535: 2529: 2528: 2520: 2514: 2513: 2506:Burchfield, R.W. 2502: 2491: 2490: 2482: 2476: 2475: 2463: 2457: 2456: 2442: 2319:Infinite strings 2111:is said to be a 2083:is said to be a 2071:is said to be a 2043:is said to be a 1989:is said to be a 1981: 1979: 1978: 1973: 1971: 1970: 1893: 1891: 1890: 1885: 1871: 1863: 1862: 1811:identity element 1797: 1789: 1781: 1773: 1744:binary operation 1742:is an important 1707: 1705: 1704: 1699: 1697: 1696: 1686: 1673: 1654: 1653: 1539: 1517: 1513: 1509: 1490:String functions 1250:Non-text strings 1237:Terminated by a 1228:escape sequences 1225: 1221: 1122:Hamming encoding 1092: 1081: 1078: 1075: 1072: 1069: 1066: 1063: 1060: 1057: 1054: 1051: 968: 963: 958: 953: 948: 943: 938: 933: 928: 921: 918: 917: 859:available memory 809: 797: 793: 773: 700: 695: 690: 685: 676: 673: 668: 663: 658: 653: 648: 647: 637:(or more modern 625:An example of a 500: 493: 487: 276:String datatypes 204: 193: 178: 159: 156:A message like " 118:formal languages 49:literal constant 4426: 4425: 4421: 4420: 4419: 4417: 4416: 4415: 4401:Primitive types 4366: 4365: 4364: 4359: 4356: 4349: 4343: 4338: 4260: 4204: 4183: 4129: 4110: 4033: 4031:formal grammars 4023:Automata theory 4020: 3990: 3985: 3966:Type conversion 3901: 3895: 3831:Enumerated type 3804: 3690: 3684:null-terminated 3660: 3625: 3513: 3470: 3465: 3435: 3430: 3374: 3321: 3288: 3274:Regular grammar 3255: 3234: 3215:Raita algorithm 3176: 3127:Bitap algorithm 3108: 3103: 3073: 3072: 3064: 3046: 3042: 3027: 3020: 3006: 3005:978-156881135-2 2989: 2985: 2973: 2960: 2946: 2942: 2929: 2928: 2924: 2908: 2907: 2903: 2887: 2886: 2879: 2865: 2861: 2843: 2839: 2829: 2827: 2820:stringology.org 2814: 2813: 2809: 2804: 2800: 2795:Wayback Machine 2786: 2782: 2777:Wayback Machine 2768: 2764: 2758:Wayback Machine 2747:Wayback Machine 2737: 2733: 2718: 2714: 2698: 2697: 2689: 2685: 2674: 2655: 2651: 2619: 2613: 2609: 2593: 2589: 2577: 2576: 2572: 2554: 2550: 2537: 2536: 2532: 2522: 2521: 2517: 2503: 2494: 2483: 2479: 2464: 2460: 2444: 2443: 2439: 2434: 2368: 2270: 2258: 2217: 2195:=ε), bca (with 2157: 2133: 2037: 1966: 1962: 1899: 1896: 1895: 1867: 1858: 1854: 1846: 1843: 1842: 1735: 1725:formal language 1692: 1688: 1669: 1662: 1649: 1645: 1643: 1640: 1639: 1552: 1546: 1537: 1515: 1514:. For example, 1511: 1505: 1487: 1481: 1369: 1350:Sequence mining 1298:There are many 1296: 1289: 1252: 1223: 1219: 1212:quotation marks 1200: 1194: 1192:Literal strings 1175:bounds checking 1166:buffer overflow 1161: 1103: 1090: 1083: 1082: 1079: 1076: 1073: 1070: 1067: 1064: 1061: 1058: 1055: 1052: 1049: 1039: 1030: 1024: 1018: 1012: 1006: 1000: 994: 988: 982: 976: 966: 961: 956: 951: 946: 941: 936: 931: 926: 831: 829:Length-prefixed 817:used a special 807: 795: 791: 788: 771: 764: 758: 752: 746: 740: 734: 728: 722: 716: 710: 698: 693: 688: 683: 671: 666: 661: 656: 651: 585: 579: 577:Null-terminated 561:character codes 537: 535:Representations 499:NSMutableString 498: 489: 483: 436: 434:Implementations 356: 348:Null-terminated 344:computer memory 320: 300:composite types 296:primitive types 288:string datatype 284: 278: 251:wrote in 1918: 223: 202: 191: 176: 157: 146: 17: 12: 11: 5: 4424: 4414: 4413: 4408: 4403: 4398: 4393: 4388: 4383: 4378: 4361: 4360: 4348: 4345: 4344: 4340: 4339: 4337: 4336: 4334:Acyclic finite 4331: 4326: 4321: 4316: 4311: 4306: 4301: 4295: 4290: 4285: 4284:Turing Machine 4279: 4277:Linear-bounded 4274: 4269: 4267:Turing machine 4263: 4261: 4259: 4258: 4253: 4248: 4243: 4238: 4233: 4228: 4226:Tree-adjoining 4223: 4218: 4215: 4210: 4202: 4197: 4192: 4186: 4184: 4182: 4181: 4176: 4173: 4168: 4163: 4158: 4153: 4151:Tree-adjoining 4148: 4143: 4140: 4135: 4127: 4122: 4119: 4113: 4111: 4109: 4108: 4105: 4102: 4099: 4096: 4093: 4090: 4087: 4084: 4081: 4078: 4075: 4072: 4069: 4065: 4062: 4061: 4056: 4051: 4046: 4038: 4035: 4034: 4019: 4018: 4011: 4004: 3996: 3987: 3986: 3984: 3983: 3978: 3973: 3968: 3963: 3958: 3953: 3948: 3943: 3938: 3937: 3936: 3926: 3921: 3919:Data structure 3916: 3911: 3905: 3903: 3897: 3896: 3894: 3893: 3888: 3883: 3878: 3873: 3868: 3863: 3858: 3853: 3848: 3843: 3838: 3833: 3828: 3823: 3818: 3812: 3810: 3806: 3805: 3803: 3802: 3801: 3800: 3790: 3785: 3780: 3775: 3770: 3765: 3764: 3763: 3753: 3748: 3743: 3738: 3733: 3728: 3723: 3718: 3713: 3712: 3711: 3700: 3698: 3692: 3691: 3689: 3688: 3687: 3686: 3676: 3670: 3668: 3662: 3661: 3659: 3658: 3653: 3652: 3651: 3646: 3635: 3633: 3627: 3626: 3624: 3623: 3618: 3613: 3612: 3611: 3601: 3600: 3599: 3598: 3597: 3587: 3582: 3577: 3572: 3567: 3566: 3565: 3560: 3558:Half precision 3555: 3545:Floating point 3542: 3537: 3532: 3527: 3521: 3519: 3515: 3514: 3512: 3511: 3506: 3501: 3496: 3491: 3486: 3480: 3478: 3472: 3471: 3464: 3463: 3456: 3449: 3441: 3432: 3431: 3429: 3428: 3423: 3418: 3413: 3408: 3403: 3398: 3393: 3388: 3382: 3380: 3376: 3375: 3373: 3372: 3367: 3362: 3357: 3352: 3347: 3342: 3337: 3331: 3329: 3327:Data structure 3323: 3322: 3320: 3319: 3314: 3309: 3304: 3298: 3296: 3290: 3289: 3287: 3286: 3281: 3276: 3271: 3265: 3263: 3257: 3256: 3254: 3253: 3248: 3242: 3240: 3236: 3235: 3233: 3232: 3227: 3222: 3220:Trigram search 3217: 3212: 3207: 3202: 3197: 3192: 3186: 3184: 3178: 3177: 3175: 3174: 3169: 3164: 3159: 3154: 3149: 3144: 3139: 3134: 3129: 3124: 3118: 3116: 3110: 3109: 3102: 3101: 3094: 3087: 3079: 3071: 3070: 3062: 3040: 3018: 3004: 2983: 2971: 2958: 2940: 2922: 2901: 2877: 2859: 2837: 2807: 2798: 2780: 2762: 2751:"Introduction" 2731: 2712: 2683: 2672: 2649: 2607: 2587: 2570: 2548: 2530: 2515: 2492: 2477: 2458: 2436: 2435: 2433: 2430: 2429: 2428: 2422: 2416: 2410: 2404: 2398: 2393: 2387: 2381: 2375: 2367: 2364: 2335:limit topology 2331: 2330: 2327:complete graph 2316: 2305: 2269: 2266: 2257: 2254: 2216: 2213: 2156: 2153: 2132: 2129: 2036: 2033: 1969: 1965: 1961: 1958: 1955: 1952: 1949: 1945: 1942: 1939: 1936: 1933: 1930: 1927: 1924: 1921: 1918: 1915: 1912: 1909: 1906: 1903: 1883: 1880: 1877: 1874: 1870: 1866: 1861: 1857: 1853: 1850: 1734: 1731: 1709: 1708: 1695: 1691: 1685: 1682: 1679: 1676: 1672: 1668: 1665: 1661: 1657: 1652: 1648: 1633:Kleene closure 1545: 1542: 1527:microprocessor 1480: 1477: 1424: 1423: 1418: 1413: 1408: 1403: 1398: 1393: 1388: 1383: 1378: 1368: 1365: 1353: 1352: 1347: 1341: 1335: 1330: 1325: 1288: 1285: 1251: 1248: 1247: 1246: 1235: 1210:Surrounded by 1198:String literal 1196:Main article: 1193: 1190: 1186:code injection 1160: 1157: 1102: 1099: 1048: 1038: 1035: 1032: 1031: 1028: 1025: 1022: 1019: 1016: 1013: 1010: 1007: 1004: 1001: 998: 995: 992: 989: 986: 983: 980: 977: 974: 970: 969: 964: 959: 954: 949: 944: 939: 934: 929: 924: 830: 827: 787: 784: 766: 765: 762: 759: 756: 753: 750: 747: 744: 741: 738: 735: 732: 729: 726: 723: 720: 717: 714: 711: 706: 702: 701: 696: 691: 686: 681: 674: 669: 664: 659: 654: 589:null character 581:Main article: 578: 575: 536: 533: 435: 432: 355: 352: 319: 316: 312:string literal 277: 274: 262:Jean E. Sammet 234:symbolic logic 222: 219: 211: 210: 199: 188: 173: 170:string literal 145: 142: 111:string literal 47:, either as a 15: 9: 6: 4: 3: 2: 4423: 4412: 4409: 4407: 4404: 4402: 4399: 4397: 4394: 4392: 4389: 4387: 4384: 4382: 4379: 4377: 4374: 4373: 4371: 4353: 4352:proper subset 4346: 4335: 4332: 4330: 4327: 4325: 4322: 4320: 4317: 4315: 4312: 4310: 4307: 4305: 4302: 4300: 4296: 4294: 4291: 4289: 4286: 4283: 4280: 4278: 4275: 4273: 4270: 4268: 4265: 4264: 4262: 4257: 4254: 4252: 4249: 4247: 4244: 4242: 4239: 4237: 4234: 4232: 4229: 4227: 4224: 4222: 4219: 4216: 4214: 4211: 4208: 4203: 4201: 4198: 4196: 4193: 4191: 4188: 4187: 4185: 4180: 4179:Non-recursive 4177: 4174: 4172: 4169: 4167: 4164: 4162: 4159: 4157: 4154: 4152: 4149: 4147: 4144: 4141: 4139: 4136: 4133: 4128: 4126: 4123: 4120: 4118: 4115: 4114: 4112: 4106: 4103: 4100: 4097: 4094: 4091: 4088: 4085: 4082: 4079: 4076: 4073: 4070: 4067: 4066: 4064: 4063: 4060: 4057: 4055: 4052: 4050: 4047: 4045: 4042: 4041: 4036: 4032: 4028: 4024: 4017: 4012: 4010: 4005: 4003: 3998: 3997: 3994: 3982: 3979: 3977: 3974: 3972: 3969: 3967: 3964: 3962: 3959: 3957: 3954: 3952: 3949: 3947: 3944: 3942: 3939: 3935: 3932: 3931: 3930: 3927: 3925: 3922: 3920: 3917: 3915: 3912: 3910: 3907: 3906: 3904: 3898: 3892: 3889: 3887: 3884: 3882: 3879: 3877: 3874: 3872: 3869: 3867: 3864: 3862: 3859: 3857: 3854: 3852: 3849: 3847: 3844: 3842: 3841:Function type 3839: 3837: 3834: 3832: 3829: 3827: 3824: 3822: 3819: 3817: 3814: 3813: 3811: 3807: 3799: 3796: 3795: 3794: 3791: 3789: 3786: 3784: 3781: 3779: 3776: 3774: 3771: 3769: 3766: 3762: 3759: 3758: 3757: 3754: 3752: 3749: 3747: 3744: 3742: 3739: 3737: 3734: 3732: 3729: 3727: 3724: 3722: 3719: 3717: 3714: 3710: 3707: 3706: 3705: 3702: 3701: 3699: 3697: 3693: 3685: 3682: 3681: 3680: 3677: 3675: 3672: 3671: 3669: 3667: 3663: 3657: 3654: 3650: 3647: 3645: 3642: 3641: 3640: 3637: 3636: 3634: 3632: 3628: 3622: 3619: 3617: 3614: 3610: 3607: 3606: 3605: 3602: 3596: 3593: 3592: 3591: 3588: 3586: 3583: 3581: 3578: 3576: 3573: 3571: 3568: 3564: 3561: 3559: 3556: 3554: 3551: 3550: 3548: 3547: 3546: 3543: 3541: 3538: 3536: 3533: 3531: 3528: 3526: 3523: 3522: 3520: 3516: 3510: 3507: 3505: 3502: 3500: 3497: 3495: 3492: 3490: 3487: 3485: 3482: 3481: 3479: 3477: 3476:Uninterpreted 3473: 3469: 3462: 3457: 3455: 3450: 3448: 3443: 3442: 3439: 3427: 3424: 3422: 3419: 3417: 3414: 3412: 3409: 3407: 3404: 3402: 3399: 3397: 3394: 3392: 3389: 3387: 3384: 3383: 3381: 3377: 3371: 3368: 3366: 3363: 3361: 3358: 3356: 3353: 3351: 3348: 3346: 3343: 3341: 3338: 3336: 3333: 3332: 3330: 3328: 3324: 3318: 3315: 3313: 3310: 3308: 3305: 3303: 3300: 3299: 3297: 3295: 3291: 3285: 3282: 3280: 3277: 3275: 3272: 3270: 3267: 3266: 3264: 3262: 3258: 3252: 3249: 3247: 3244: 3243: 3241: 3237: 3231: 3228: 3226: 3223: 3221: 3218: 3216: 3213: 3211: 3208: 3206: 3203: 3201: 3198: 3196: 3193: 3191: 3188: 3187: 3185: 3183: 3179: 3173: 3170: 3168: 3165: 3163: 3160: 3158: 3155: 3153: 3150: 3148: 3145: 3143: 3140: 3138: 3137:Edit distance 3135: 3133: 3130: 3128: 3125: 3123: 3120: 3119: 3117: 3115: 3114:String metric 3111: 3107: 3100: 3095: 3093: 3088: 3086: 3081: 3080: 3077: 3065: 3063:0-201-02988-X 3059: 3054: 3053: 3044: 3036: 3032: 3025: 3023: 3015: 3013: 3007: 3001: 2997: 2993: 2987: 2980: 2978: 2974: 2967: 2961: 2959:0-53492-373-9 2955: 2951: 2944: 2936: 2932: 2926: 2919: 2915: 2911: 2905: 2897: 2891: 2884: 2880: 2878:981-02-4782-6 2874: 2870: 2863: 2856: 2852: 2848: 2841: 2825: 2821: 2817: 2811: 2802: 2796: 2792: 2789: 2784: 2778: 2774: 2771: 2766: 2759: 2755: 2752: 2748: 2744: 2741: 2735: 2727: 2723: 2716: 2708: 2702: 2694: 2687: 2679: 2675: 2673:0-13-034074-X 2669: 2665: 2664: 2659: 2653: 2645: 2641: 2637: 2633: 2629: 2625: 2618: 2611: 2603: 2602: 2597: 2591: 2583: 2582: 2574: 2566: 2562: 2558: 2552: 2544: 2540: 2539:"string (n.)" 2534: 2526: 2519: 2511: 2507: 2501: 2499: 2497: 2488: 2481: 2473: 2469: 2462: 2455: 2451: 2447: 2441: 2437: 2426: 2425:String metric 2423: 2420: 2417: 2414: 2411: 2408: 2405: 2402: 2399: 2397: 2394: 2391: 2388: 2385: 2382: 2379: 2376: 2373: 2370: 2369: 2363: 2361: 2357: 2353: 2351: 2347: 2346:-adic numbers 2345: 2340: 2339:inverse limit 2336: 2328: 2324: 2320: 2317: 2314: 2312: 2306: 2303: 2299: 2296:-dimensional 2295: 2291: 2287: 2286: 2285: 2283: 2274: 2265: 2263: 2253: 2250: 2245: 2242: 2238: 2234: 2230: 2226: 2222: 2212: 2210: 2206: 2202: 2198: 2194: 2190: 2186: 2182: 2178: 2174: 2170: 2166: 2162: 2152: 2150: 2146: 2142: 2138: 2128: 2126: 2125:prefix orders 2122: 2118: 2114: 2110: 2107:is nonempty, 2106: 2102: 2098: 2094: 2090: 2086: 2082: 2078: 2074: 2070: 2067:is nonempty, 2066: 2062: 2058: 2054: 2050: 2046: 2042: 2032: 2030: 2029:least element 2026: 2025:partial order 2022: 2018: 2014: 2010: 2006: 2002: 1998: 1994: 1993: 1988: 1983: 1967: 1959: 1956: 1953: 1950: 1940: 1934: 1931: 1925: 1919: 1916: 1910: 1907: 1901: 1878: 1872: 1859: 1851: 1848: 1840: 1836: 1832: 1828: 1824: 1820: 1816: 1812: 1808: 1804: 1799: 1794: =  1793: 1786: =  1785: 1778: =  1777: 1770: =  1769: 1765: 1761: 1757: 1753: 1749: 1745: 1741: 1740: 1739:Concatenation 1730: 1727: 1726: 1721: 1716: 1714: 1693: 1680: 1674: 1666: 1663: 1659: 1655: 1650: 1638: 1637: 1636: 1634: 1629: 1627: 1622: 1620: 1616: 1612: 1611: 1606: 1602: 1598: 1594: 1590: 1589: 1583: 1581: 1577: 1573: 1569: 1565: 1561: 1557: 1551: 1544:Formal theory 1541: 1536: 1532: 1528: 1523: 1521: 1520:concatenation 1508: 1502: 1501:string length 1497: 1495: 1491: 1486: 1476: 1474: 1469: 1467: 1463: 1459: 1455: 1450: 1448: 1444: 1440: 1436: 1431: 1429: 1422: 1419: 1417: 1414: 1412: 1409: 1407: 1404: 1402: 1399: 1397: 1394: 1392: 1389: 1387: 1384: 1382: 1379: 1377: 1374: 1373: 1372: 1364: 1362: 1358: 1351: 1348: 1345: 1342: 1339: 1336: 1334: 1331: 1329: 1326: 1323: 1320: 1319: 1318: 1315: 1313: 1309: 1305: 1301: 1294: 1293:String theory 1284: 1282: 1278: 1272: 1270: 1265: 1261: 1257: 1244: 1240: 1236: 1233: 1229: 1218:double quote 1217: 1213: 1209: 1208: 1207: 1204: 1199: 1189: 1187: 1183: 1178: 1176: 1171: 1167: 1156: 1154: 1150: 1146: 1142: 1137: 1132: 1130: 1125: 1123: 1119: 1113: 1110: 1108: 1098: 1096: 1088: 1046: 1044: 1026: 1020: 1014: 1008: 1002: 996: 990: 984: 978: 972: 971: 965: 960: 955: 950: 945: 940: 935: 930: 925: 920: 919: 916: 913: 909: 907: 903: 899: 895: 891: 888:) space (see 887: 883: 879: 876:space, where 875: 871: 867: 862: 860: 856: 855:address space 852: 848: 844: 840: 839:Pascal string 836: 826: 823: 820: 816: 811: 805: 801: 783: 781: 777: 776:ASCIZ strings 760: 754: 748: 742: 736: 730: 724: 718: 712: 709: 704: 703: 697: 692: 687: 682: 679: 675: 670: 665: 660: 655: 650: 649: 646: 644: 640: 636: 632: 628: 623: 621: 617: 612: 610: 606: 602: 598: 594: 590: 584: 574: 572: 567: 562: 558: 553: 550: 545: 543: 532: 530: 526: 521: 519: 515: 511: 506: 504: 497: 492: 486: 485:StringBuilder 481: 477: 473: 469: 465: 461: 457: 453: 449: 445: 441: 431: 429: 424: 420: 416: 412: 409: 405: 400: 396: 392: 388: 384: 380: 376: 372: 370: 366: 362: 351: 349: 345: 341: 337: 333: 329: 325: 318:String length 315: 313: 309: 305: 301: 297: 293: 289: 283: 273: 271: 267: 263: 260:According to 257: 252: 250: 245: 243: 239: 235: 230: 228: 218: 216: 208: 200: 197: 189: 186: 182: 174: 171: 167: 163: 155: 154: 153: 150: 141: 139: 135: 131: 127: 123: 119: 114: 112: 108: 103: 101: 97: 92: 90: 86: 82: 78: 74: 70: 66: 62: 58: 54: 50: 46: 42: 38: 34: 26: 21: 4288:Nested stack 4231:Context-free 4156:Context-free 4117:Unrestricted 3746:Intersection 3678: 3340:Suffix array 3246:Aho–Corasick 3157:Lee distance 3105: 3051: 3043: 3034: 3011: 3009: 2995: 2986: 2976: 2969: 2965: 2963: 2949: 2943: 2925: 2917: 2904: 2882: 2868: 2862: 2854: 2850: 2840: 2828:. Retrieved 2819: 2810: 2801: 2783: 2765: 2734: 2715: 2686: 2662: 2652: 2627: 2623: 2610: 2600: 2590: 2579: 2573: 2564: 2551: 2542: 2533: 2524: 2518: 2509: 2480: 2471: 2461: 2453: 2440: 2407:Empty string 2356:Isomorphisms 2354: 2343: 2332: 2322: 2310: 2301: 2293: 2289: 2281: 2279: 2259: 2246: 2241:well-founded 2218: 2208: 2204: 2200: 2196: 2192: 2188: 2184: 2183:= 00110 and 2180: 2176: 2172: 2168: 2164: 2160: 2158: 2144: 2140: 2136: 2134: 2120: 2116: 2112: 2108: 2104: 2100: 2096: 2092: 2088: 2080: 2076: 2072: 2068: 2064: 2060: 2056: 2052: 2048: 2040: 2038: 2016: 2012: 2008: 2004: 2000: 1996: 1990: 1986: 1984: 1894:, such that 1826: 1822: 1818: 1814: 1800: 1791: 1783: 1775: 1767: 1763: 1759: 1755: 1751: 1747: 1737: 1736: 1723: 1717: 1710: 1630: 1625: 1623: 1618: 1614: 1610:empty string 1608: 1604: 1596: 1592: 1591:of a string 1586: 1584: 1579: 1571: 1567: 1563: 1553: 1524: 1498: 1488: 1470: 1451: 1443:embedded SQL 1432: 1425: 1370: 1357:suffix trees 1354: 1316: 1307: 1297: 1273: 1253: 1205: 1201: 1179: 1162: 1147:of lines, a 1133: 1126: 1114: 1111: 1104: 1095:string (C++) 1084: 1040: 914: 910: 905: 901: 897: 885: 881: 877: 873: 869: 863: 850: 842: 838: 832: 824: 812: 789: 775: 769: 626: 624: 615: 613: 604: 600: 596: 586: 554: 548: 546: 538: 522: 518:linked lists 507: 491:StringBuffer 475: 451: 437: 427: 413: 373: 357: 331: 328:compile time 323: 321: 311: 307: 287: 285: 259: 254: 246: 231: 224: 212: 207:query string 203:?action=edit 181:social media 151: 147: 115: 104: 93: 80: 36: 30: 4297:restricted 3976:Type theory 3971:Type system 3821:Bottom type 3768:Option type 3709:generalized 3595:Long double 3540:Fixed point 3350:Suffix tree 2596:Lewis, C.I. 2372:Binary-safe 2225:total order 1835:free monoid 1807:commutative 1803:associative 1554:Let Σ be a 1538:REPNZ MOVSB 1308:stringology 1269:8-bit clean 1264:binary data 1260:byte string 1149:piece table 1145:linked list 1136:text editor 620:binary data 549:byte string 503:thread-safe 428:code points 375:Logographic 249:C. I. Lewis 227:compositors 205:" as a URL 166:source code 107:source code 4386:Data types 4370:Categories 3881:Empty type 3876:Type class 3826:Collection 3783:Refinement 3761:metaobject 3609:signedness 3468:Data types 3012:expression 2975:(for some 2749:. Section 2563:"string". 2523:"string". 2432:References 2350:Cantor set 2149:palindrome 2115:suffix of 2095:such that 2075:prefix of 2055:such that 2027:on Σ, the 2011:such that 1805:, but non- 1572:expression 1556:finite set 1548:See also: 1535:intel x86m 1483:See also: 1340:algorithms 1300:algorithms 1256:bit string 1141:gap buffer 884:takes log( 494:, and the 460:JavaScript 395:ideographs 280:See also: 136:called an 45:characters 25:characters 4251:Star-free 4205:Positive 4195:Decidable 4130:Positive 4054:Languages 3956:Subtyping 3951:Interface 3934:metaclass 3886:Unit type 3856:Semaphore 3836:Exception 3741:Inductive 3731:Dependent 3696:Composite 3674:Character 3656:Reference 3553:Minifloat 3509:Bit array 3037:. Kluwer. 2994:(2010) . 2890:cite book 2468:"Strings" 2378:Bit array 2313:-ary tree 2298:hypercube 2159:A string 2155:Rotations 2039:A string 1992:substring 1985:A string 1968:∗ 1964:Σ 1960:∈ 1948:∀ 1873:∪ 1865:↦ 1860:∗ 1856:Σ 1690:Σ 1675:∪ 1667:∈ 1660:⋃ 1651:∗ 1647:Σ 1312:Zvi Galil 1243:INI files 1232:backslash 1188:attacks. 868:, taking 819:word mark 547:The term 520:instead. 476:immutable 408:Shift-JIS 350:" below. 192:AGATGCCGT 162:end users 61:data type 4049:Grammars 3981:Variable 3871:Top type 3736:Equality 3644:physical 3621:Rational 3616:Interval 3563:bfloat16 2935:Archived 2914:Archived 2824:Archived 2791:Archived 2773:Archived 2754:Archived 2743:Archived 2726:Archived 2701:cite web 2678:archived 2598:(1918). 2450:Archived 2366:See also 2309:perfect 2268:Topology 2249:Shortlex 2221:ordering 2131:Reversal 2021:relation 1607:|. The 1576:sequence 1560:alphabet 1346:a string 1304:analyzed 1170:attacker 1107:C string 843:P-string 815:IBM 1401 798:used by 597:C string 542:ISO 8859 404:ISO-2022 383:Japanese 185:database 138:alphabet 96:variable 53:variable 41:sequence 4272:Decider 4246:Regular 4213:Indexed 4171:Regular 4138:Indexed 3924:Generic 3900:Related 3816:Boolean 3773:Product 3649:virtual 3639:Address 3631:Pointer 3604:Integer 3535:Decimal 3530:Complex 3518:Numeric 3416:Sorting 3386:Parsing 3106:Strings 2644:2003242 2019:. The 1796:hugbear 1788:bearhug 1782:, then 1344:Parsing 1239:newline 1214:(ASCII 1151:, or a 1043:records 908:space. 900:in log( 514:Haskell 452:mutable 415:Unicode 379:Chinese 369:mangled 308:literal 221:History 144:Purpose 130:symbols 57:mutated 4324:Finite 4256:Finite 4101:Type-3 4092:Type-2 4074:Type-1 4068:Type-0 3914:Boxing 3902:topics 3861:Stream 3798:tagged 3756:Object 3679:String 3060:  3002:  2956:  2875:  2830:23 May 2670:  2642:  2325:-node 2211:=ab). 2191:=abc, 2113:proper 2085:suffix 2073:proper 2045:prefix 1997:factor 1833:, the 1831:monoid 1774:, and 1720:subset 1588:length 1564:string 1507:length 1458:Python 1447:printf 1411:SNOBOL 1087:hidden 1062:length 1059:size_t 1053:string 922:length 851:length 835:Pascal 631:buffer 616:length 571:UTF-32 557:arrays 529:Erlang 525:Prolog 510:arrays 470:, and 468:Python 423:UTF-32 387:Korean 385:, and 365:EBCDIC 304:syntax 302:. The 270:SNOBOL 242:formal 236:, and 85:arrays 81:String 37:string 4282:PTIME 3809:Other 3793:Union 3726:Class 3716:Array 3499:Tryte 3379:Other 3335:DAFSA 3302:BLAST 2640:S2CID 2630:(7). 2620:(PDF) 2237:total 2227:(cf. 2199:=bc, 2103:. If 2063:. If 1580:01011 1550:Tuple 1525:Some 1452:Many 1437:like 1433:Some 1426:Many 1386:MUMPS 1224:'str' 1220:"str" 1129:ropes 1050:class 847:words 806:used 772:FRANK 639:UTF-8 635:ASCII 496:Cocoa 419:UTF-8 361:ASCII 266:COMIT 73:words 69:bytes 4029:and 3929:Kind 3891:Void 3751:List 3666:Text 3504:Word 3494:Trit 3489:Byte 3370:Trie 3360:Rope 3058:ISBN 3000:ISBN 2954:ISBN 2896:link 2873:ISBN 2832:2015 2707:link 2668:ISBN 2247:See 2207:=c, 2007:and 1825:ε = 1790:and 1772:bear 1750:and 1585:The 1568:word 1566:(or 1562:. A 1435:APIs 1428:Unix 1401:Ruby 1396:Rexx 1391:Perl 1381:Icon 1359:and 1267:not 1216:0x22 1153:rope 1143:, a 1091:text 1074:text 1068:char 904:) + 804:ZX80 645:is: 527:and 480:.NET 456:Java 448:Ruby 446:and 444:Perl 406:and 215:bits 124:and 89:list 71:(or 35:, a 3788:Set 3484:Bit 2632:doi 2304:-1. 2171:if 2087:of 2047:of 2017:usv 1999:of 1995:or 1982:). 1817:, ε 1780:hug 1617:or 1570:or 1540:). 1529:'s 1512:len 1510:or 1445:or 1421:TTM 1416:Tcl 1406:sed 1376:AWK 1258:or 841:or 800:CDC 678:NUL 566:UCS 482:'s 464:Lua 440:C++ 399:EUC 391:CJK 363:or 310:or 196:DNA 134:set 116:In 67:of 43:of 31:In 4372:: 4025:: 3021:^ 3008:. 2962:. 2933:. 2912:. 2892:}} 2888:{{ 2881:. 2853:. 2849:. 2822:. 2818:. 2724:. 2703:}} 2699:{{ 2676:, 2638:. 2628:15 2626:. 2622:. 2559:; 2541:. 2495:^ 2470:. 2448:. 2362:. 2264:. 2177:vu 2175:= 2165:uv 2163:= 2127:. 2101:us 2099:= 2061:su 2059:= 2015:= 1821:= 1798:. 1792:ts 1784:st 1764:st 1621:. 1496:. 1468:. 1441:, 1363:. 1283:. 1124:. 1097:. 1080:}; 1029:16 1027:77 1023:16 1021:66 1017:16 1015:65 1011:16 1009:6B 1005:16 1003:4B 999:16 997:4E 993:16 991:41 987:16 985:52 981:16 979:46 975:16 973:05 872:+ 861:. 792:$ 763:16 761:77 757:16 755:66 751:16 749:65 745:16 743:6B 739:16 737:00 733:16 731:4B 727:16 725:4E 721:16 719:41 715:16 713:52 708:16 705:46 622:. 611:. 505:. 472:Go 466:, 462:, 458:, 442:, 381:, 314:. 286:A 140:. 79:. 4217:— 4175:— 4142:— 4107:— 4104:— 4098:— 4095:— 4089:— 4086:— 4083:— 4080:— 4077:— 4071:— 4015:e 4008:t 4001:v 3460:e 3453:t 3446:v 3098:e 3091:t 3084:v 3066:. 2977:n 2972:n 2970:I 2898:) 2834:. 2760:. 2709:) 2646:. 2634:: 2545:. 2489:. 2474:. 2344:p 2329:. 2323:k 2315:. 2311:k 2302:k 2294:n 2290:n 2282:k 2209:v 2205:u 2201:v 2197:u 2193:v 2189:u 2185:v 2181:u 2173:t 2169:t 2161:s 2145:s 2141:s 2137:s 2121:t 2117:t 2109:s 2105:u 2097:t 2093:u 2089:t 2081:s 2077:t 2069:s 2065:u 2057:t 2053:u 2049:t 2041:s 2013:t 2009:v 2005:u 2001:t 1987:s 1957:t 1954:, 1951:s 1944:) 1941:t 1938:( 1935:L 1932:+ 1929:) 1926:s 1923:( 1920:L 1917:= 1914:) 1911:t 1908:s 1905:( 1902:L 1882:} 1879:0 1876:{ 1869:N 1852:: 1849:L 1827:s 1823:s 1819:s 1815:s 1776:t 1768:s 1760:t 1756:s 1752:t 1748:s 1694:n 1684:} 1681:0 1678:{ 1671:N 1664:n 1656:= 1626:n 1619:λ 1615:ε 1605:s 1597:s 1593:s 1295:. 1245:. 1077:; 1071:* 1065:; 1056:{ 967:w 962:f 957:e 952:k 947:K 942:N 937:A 932:R 927:F 906:n 902:n 898:n 886:n 882:n 878:k 874:k 870:n 808:" 796:: 699:w 694:f 689:e 684:k 672:K 667:N 662:A 657:R 652:F 605:n 601:n 564:( 198:. 187:. 172:.

Index

Diagram of String data in computing. Shows the word "example" with each letter in a separate box. The word "String" is above, referring to the entire sentence. The label "Character" is below and points to an individual box.
characters
computer programming
sequence
characters
literal constant
variable
mutated
data type
array data structure
bytes
words
character encoding
arrays
list
variable
dynamic allocation
source code
string literal
formal languages
mathematical logic
theoretical computer science
symbols
set
alphabet
end users
source code
string literal
social media
database

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.