Knowledge

Re-Pair

Source đź“ť

2961: 2951: 946: 961: 956: 224: 25: 113: 66: 980:, described below, sequentially on all the axiom's symbols. Intuitively, rules are encoded as they are visited in a depth-first traversal of the grammar. The first time a rule is visited, its right hand side is encoded recursively and a new code is assigned to the rule. From that point, whenever the rule is reached, the assigned value is written. 940:
Since the hash table and the priority queue refer to the same elements (pairs), they can be implemented by a common data structure called PAIR with pointers for the hash table (h_next) and the priority queue (p_next and p_prev). Furthermore, each PAIR points to the beginning of the first (f_pos) and
234:
In their paper the algorithm is presented together with a detailed description of the data structures required to implement it with linear time and space complexity. The experiments showed that Re-Pair achieves high compression ratios and offers good performance for decompression. However, the major
1672:
This algorithm reduces the size of the grammar generated by Re-Pair by replacing maximal repeats first. When a pair is identified as the most frequent pair, i.e. the one to be replaced in the current step of the algorithm, MR-repair extends the pair to find the longest string that occurs the same
1560:
This is one of the most popular implementations of Re-Pair. It uses the data structures described here (the ones proposed when it was originally published) and encodes the resulting grammar using the implicit encoding method. Most later versions of Re-Pair are implemented starting from this one.
1544:
Instead of manipulating the input string as a sequence of characters, this tool first groups the characters into phrases (for instance, words). The compression algorithm works as Re-Pair but considering the identified phrases as the terminals of the grammar. The tool accepts different options to
214:
The grammar is built by recursively replacing the most frequent pair of characters occurring in the text. Once there is no pair of characters occurring twice, the resulting string is used as the axiom of the grammar. Therefore, the output grammar is such that all rules but the axiom have two
929:. Each element of the queue is a pair of symbols (terminals or previously defined pairs) that occur consecutively in the sequence. The priority of a pair is given by the number of occurrences of the pair in the remaining sequence. Each time a new pair is created, the priority queue is updated. 235:
drawback of the algorithm is its memory consumption, which is approximately 5 times the size of the input. Such memory usage is required in order to perform the compression in linear time but makes the algorithm impractical for compressing large files.
1656:
This version modifies the way the next pair to be replaced is chosen. Instead of simply considering the most frequent pair, it employs a heuristic which penalizes pairs that are not coherent with a Lempel–Ziv factorisation of the input string.
1513:
There exists a number of different implementations of Re-Pair. Each of these versions aims at improving one specific aspect of the algorithm, such as reducing the runtime, reducing the space consumption or increasing the compression ratio.
1673:
number of times as the pair to be replaced. The provided implementation encodes the grammar by simply listing the rules in text, hence this tool is purely for research purposes and cannot be used for compression as it is.
971:
Once the grammar has been built for a given input string, in order to achieve effective compression, this grammar has to be encoded efficiently. One of the simplest methods for encoding the grammar is the
1730:
Satoshi Yoshida and Takuya Kida, Effective Variable-Length-to-Fixed-Length Coding via a Re-Pair Algorithm, In Proc. of Data Compression Conference 2013 (DCC 2013), p. 532, Snowbird, Utah, USA, March 2013.
951:
The following two pictures show an example of how these data structures look after the initialization and after applying one step of the pairing process (pointers to NULL are not displayed):
1505:
was first introduced. However, most implementations of Re-Pair use the implicit encoding method due to its simplicity and good performance. Furthermore, it allows on-the-fly decompression.
753: 445: 135: 1635: 567: 211:
generating a single string: the input text. In order to perform the compression in linear time, it consumes the amount of memory that is approximately five times the size of its input.
960: 298: 1757:
Furuya, I., Takagi, T., Nakashima, Y., Inenaga, S., Bannai, H., & Kida, T. (2018). MR-RePair: Grammar Compression based on Maximal Repeats. arXiv preprint arXiv:1811.04596.
945: 955: 1479: 1341: 617: 475: 671: 644: 502: 371: 1427: 1545:
decide which sort of phrases are considered and encodes the resulting grammar into separated files: one containing the axiom and one with the rest of the rules.
587: 324: 1499: 1447: 1401: 1381: 1361: 920: 891: 862: 833: 813: 789: 344: 1748:
Gańczorz, M., & Jeż, A. (2017, April). Improvements on Re-Pair grammar compressor. In 2017 Data Compression Conference (DCC) (pp. 181–190). IEEE.
941:
the last (b_pos) occurrences of the string represented by the PAIR in the sequence. The following picture shows an overview of this data structure.
795:-th symbol of the input string plus two references to other positions in the sequence. These references point to the next/previous positions, say 1785: 2453: 2264: 83: 2153: 1641:
are considered in the second. The main difference between the two phases is the implementation of the corresponding priority queues.
3000: 2659: 2482: 2276: 1580:
to Fixed Length method, in which each rule (represented with a string of Variable Length) is encoded using a code of Fixed Length.
1967: 1739:
Bille, P., Gørtz, I. L., & Prezza, N. (2017, April). Space-efficient re-pair compression. In 2017 (DCC) (pp. 171-180). IEEE.
2664: 2241: 75: 922:
and all three occurrences are captured by the same reference (i.e. there is a variable in the grammar generating the string).
2394: 1712:
Larsson, N. J., & Moffat, A. (2000). Off-line dictionary-based compression. Proceedings of the IEEE, 88(11), 1722–1732.
2771: 2509: 2448: 2259: 2209: 2032: 1892: 1877: 1778: 1721:
R. Wan. "Browsing and Searching Compressed Documents". PhD thesis, University of Melbourne, Australia, December 2003.
179: 161: 52: 676: 2884: 376: 2995: 2894: 2732: 2583: 2502: 2296: 1602: 507: 2867: 2487: 2281: 2069: 2964: 2000: 2629: 2954: 2857: 2399: 1957: 1771: 241: 1947: 1942: 936:
to keep track of already defined pairs. This table is updated each time a new pair is created or removed.
227:
Construction of a Straight Line Program that generates the string w="xabcabcy123123zabc" by using Re-Pair
2889: 2816: 2654: 2634: 2578: 2236: 2027: 1830: 2990: 2899: 2840: 2766: 2614: 2204: 2199: 2054: 1897: 2904: 2477: 2271: 1972: 143: 139: 123: 38: 2845: 2216: 2103: 2059: 1872: 1855: 1845: 2470: 2221: 2005: 1850: 1452: 2742: 1314:
Another possibility is to separate the rules of the grammar into generations such that a rule
2874: 1652: 1317: 204: 592: 450: 2558: 2020: 1982: 1803: 1577: 763:
In order to achieve linear time complexity, Re-Pair requires the following data structures
649: 622: 480: 349: 208: 8: 2789: 2680: 2639: 2624: 2593: 2588: 2497: 2404: 2337: 2306: 2291: 2074: 1406: 572: 306: 2862: 2832: 2811: 2717: 2649: 2543: 2231: 2047: 2037: 1932: 1912: 1907: 1690: 1685: 1484: 1432: 1386: 1366: 1346: 896: 867: 838: 818: 798: 774: 755:
contains no repeated pair and therefore it is used as the axiom of the output grammar.
329: 200: 2443: 2806: 2794: 2776: 2644: 2528: 2465: 2311: 2226: 2182: 2143: 1825: 2781: 2737: 2710: 2705: 2563: 2548: 2458: 2367: 2362: 2191: 1924: 1902: 1794: 2700: 2514: 2438: 2419: 2389: 2357: 2323: 1882: 1820: 131: 87: 2492: 2286: 2015: 2010: 1867: 1840: 1812: 1595:
The algorithm performs in two phases. During the first phase, it considers the
2984: 2799: 2747: 2414: 2409: 2384: 2316: 1937: 1835: 1556: 2920: 1887: 1862: 1763: 1481:. Then these generations are encoded subsequently starting from generation 994:// By default, the extended ASCII charset are the terminals of the grammar. 238:
The image on the right shows how the algorithm works compresses the string
44: 1591: 2879: 2757: 2553: 2429: 2379: 1572: 2936: 2727: 2722: 2609: 2568: 2374: 1540: 1668: 1576:
Instead of the implicit encoding method, this implementation uses a
504:. Thus, at the end of the second iteration, the remaining string is 231:
Re-Pair was first introduced by NJ. Larsson and A. Moffat in 1999.
2850: 2695: 2352: 142:
external links, and converting useful links where appropriate into
223: 2619: 2093: 373:. On the second iteration, the most frequent pair in the string 2133: 2968: 2573: 2166: 2113: 2123: 1977: 1962: 1952: 16:
Lossless, but memory-consuming, data compression algorithm
2098: 2064: 1033:// Initially 8, to describe any extended ASCII character 1605: 1487: 1455: 1435: 1409: 1389: 1369: 1349: 1320: 899: 870: 841: 821: 801: 777: 679: 652: 625: 595: 575: 510: 483: 453: 379: 352: 332: 309: 244: 1629: 1493: 1473: 1441: 1421: 1395: 1375: 1355: 1335: 914: 885: 856: 827: 807: 783: 747: 665: 638: 611: 581: 561: 496: 469: 439: 365: 338: 318: 292: 126:may not follow Knowledge's policies or guidelines 2982: 1501:. This was the method proposed originally when 748:{\displaystyle w=xR_{2}R_{2}yR_{4}R_{4}zR_{2}} 203:algorithm that, given an input text, builds a 1779: 440:{\displaystyle w=xR_{1}cR_{1}cy123123zR_{1}c} 1793: 1624: 1606: 1630:{\displaystyle \lceil {\sqrt {n}}/3\rceil } 53:Learn how and when to remove these messages 1786: 1772: 562:{\displaystyle w=xR_{2}R_{2}y123123zR_{2}} 1708: 1706: 835:, such that the same substring begins at 180:Learn how and when to remove this message 162:Learn how and when to remove this message 771:representing the input string. Position 569:. In the next two iterations, the pairs 222: 966: 2983: 1703: 976:, which consists on invoking function 1767: 303:During the first iteration, the pair 1429:and the other belongs to generation 293:{\displaystyle w=xabcabcy123123zabc} 106: 88:move details into the article's body 59: 18: 13: 758: 673:respectively. Finally, the string 14: 3012: 1599:, i.e. those occurring more than 34:This article has multiple issues. 2960: 2959: 2950: 2949: 959: 954: 944: 215:symbols on the right-hand side. 111: 64: 23: 3001:Lossless compression algorithms 218: 42:or discuss these issues on the 1751: 1742: 1733: 1724: 1715: 1324: 909: 903: 880: 874: 851: 845: 477:, is replaced by a new symbol 346:, is replaced by a new symbol 326:, which occurs three times in 1: 1696: 791:of the sequence contains the 7: 1679: 1508: 10: 3017: 2841:Compressed data structures 2163:RLE + BWT + MTF + Huffman 1831:Asymmetric numeral systems 2945: 2929: 2913: 2831: 2756: 2688: 2679: 2602: 2536: 2527: 2428: 2345: 2336: 2252: 2200:Discrete cosine transform 2190: 2181: 2130:LZ77 + Huffman + context 2083: 1993: 1923: 1811: 1802: 1474:{\displaystyle j\leq i-1} 201:grammar-based compression 2905:Smallest grammar problem 982: 619:are replaced by symbols 2846:Compressed suffix array 2395:Nyquist–Shannon theorem 1336:{\displaystyle X\to YZ} 2996:Compression algorithms 1631: 1495: 1475: 1443: 1423: 1403:belongs to generation 1397: 1377: 1357: 1343:belongs to generation 1337: 916: 887: 858: 829: 809: 785: 749: 667: 640: 613: 612:{\displaystyle R_{3}3} 583: 563: 498: 471: 470:{\displaystyle R_{1}c} 441: 367: 340: 320: 294: 228: 2875:Kolmogorov complexity 2743:Video characteristics 2120:LZ77 + Huffman + ANS 1632: 1496: 1476: 1444: 1424: 1398: 1378: 1358: 1338: 917: 888: 859: 830: 810: 786: 750: 668: 666:{\displaystyle R_{4}} 641: 639:{\displaystyle R_{3}} 614: 584: 564: 499: 497:{\displaystyle R_{2}} 472: 442: 368: 366:{\displaystyle R_{1}} 341: 321: 295: 226: 205:straight-line program 2965:Compression software 2559:Compression artifact 2515:Psychoacoustic model 1603: 1597:high frequency pairs 1485: 1453: 1433: 1407: 1387: 1367: 1363:iff at least one of 1347: 1318: 967:Encoding the grammar 897: 868: 839: 819: 799: 775: 677: 650: 623: 593: 573: 508: 481: 451: 377: 350: 330: 307: 242: 209:context-free grammar 132:improve this article 2955:Compression formats 2594:Texture compression 2589:Standard test image 2405:Silence compression 1639:low frequency pairs 1422:{\displaystyle i-1} 144:footnote references 2863:Information theory 2718:Display resolution 2544:Chroma subsampling 1933:Byte pair encoding 1878:Shannon–Fano–Elias 1691:Sequitur algorithm 1686:Byte pair encoding 1627: 1491: 1471: 1439: 1419: 1393: 1373: 1353: 1333: 912: 883: 854: 825: 805: 781: 745: 663: 636: 609: 582:{\displaystyle 12} 579: 559: 494: 467: 437: 363: 336: 319:{\displaystyle ab} 316: 290: 229: 2978: 2977: 2827: 2826: 2777:Deblocking filter 2675: 2674: 2523: 2522: 2332: 2331: 2177: 2176: 1677: 1676: 1637:times, while the 1614: 1494:{\displaystyle 0} 1442:{\displaystyle j} 1396:{\displaystyle Z} 1376:{\displaystyle Y} 1356:{\displaystyle i} 1210:num_rules_encoded 1027:num_rules_encoded 985:num_rules_encoded 974:implicit encoding 915:{\displaystyle w} 886:{\displaystyle w} 857:{\displaystyle w} 828:{\displaystyle m} 808:{\displaystyle k} 784:{\displaystyle i} 339:{\displaystyle w} 197:recursive pairing 190: 189: 182: 172: 171: 164: 105: 104: 84:length guidelines 57: 3008: 2991:Data compression 2963: 2962: 2953: 2952: 2782:Lapped transform 2686: 2685: 2564:Image resolution 2549:Coding tree unit 2534: 2533: 2343: 2342: 2188: 2187: 1809: 1808: 1795:Data compression 1788: 1781: 1774: 1765: 1764: 1758: 1755: 1749: 1746: 1740: 1737: 1731: 1728: 1722: 1719: 1713: 1710: 1636: 1634: 1633: 1628: 1620: 1615: 1610: 1517: 1516: 1500: 1498: 1497: 1492: 1480: 1478: 1477: 1472: 1448: 1446: 1445: 1440: 1428: 1426: 1425: 1420: 1402: 1400: 1399: 1394: 1382: 1380: 1379: 1374: 1362: 1360: 1359: 1354: 1342: 1340: 1339: 1334: 1310: 1307: 1304: 1301: 1298: 1295: 1292: 1289: 1286: 1283: 1280: 1277: 1274: 1271: 1268: 1265: 1262: 1259: 1256: 1253: 1250: 1247: 1244: 1241: 1238: 1235: 1232: 1229: 1226: 1223: 1220: 1217: 1214: 1211: 1208: 1205: 1202: 1199: 1196: 1193: 1190: 1187: 1184: 1181: 1178: 1175: 1172: 1169: 1166: 1163: 1160: 1157: 1154: 1151: 1148: 1145: 1142: 1139: 1136: 1133: 1130: 1127: 1124: 1121: 1118: 1115: 1112: 1109: 1106: 1103: 1100: 1097: 1094: 1091: 1088: 1085: 1082: 1079: 1076: 1073: 1070: 1067: 1064: 1061: 1058: 1055: 1052: 1049: 1046: 1043: 1040: 1037: 1034: 1031: 1028: 1025: 1022: 1019: 1016: 1013: 1010: 1007: 1004: 1001: 998: 995: 992: 989: 986: 979: 963: 958: 948: 921: 919: 918: 913: 892: 890: 889: 884: 863: 861: 860: 855: 834: 832: 831: 826: 814: 812: 811: 806: 790: 788: 787: 782: 754: 752: 751: 746: 744: 743: 731: 730: 721: 720: 708: 707: 698: 697: 672: 670: 669: 664: 662: 661: 645: 643: 642: 637: 635: 634: 618: 616: 615: 610: 605: 604: 588: 586: 585: 580: 568: 566: 565: 560: 558: 557: 539: 538: 529: 528: 503: 501: 500: 495: 493: 492: 476: 474: 473: 468: 463: 462: 446: 444: 443: 438: 433: 432: 411: 410: 398: 397: 372: 370: 369: 364: 362: 361: 345: 343: 342: 337: 325: 323: 322: 317: 299: 297: 296: 291: 185: 178: 167: 160: 156: 153: 147: 115: 114: 107: 100: 97: 91: 82:Please read the 68: 67: 60: 49: 27: 26: 19: 3016: 3015: 3011: 3010: 3009: 3007: 3006: 3005: 2981: 2980: 2979: 2974: 2941: 2925: 2909: 2890:Rate–distortion 2823: 2752: 2671: 2598: 2519: 2424: 2420:Sub-band coding 2328: 2253:Predictive type 2248: 2173: 2140:LZSS + Huffman 2090:LZ77 + Huffman 2079: 1989: 1925:Dictionary type 1919: 1821:Adaptive coding 1798: 1792: 1762: 1761: 1756: 1752: 1747: 1743: 1738: 1734: 1729: 1725: 1720: 1716: 1711: 1704: 1699: 1682: 1616: 1609: 1604: 1601: 1600: 1578:Variable Length 1534:Phrase Browsing 1511: 1486: 1483: 1482: 1454: 1451: 1450: 1434: 1431: 1430: 1408: 1405: 1404: 1388: 1385: 1384: 1368: 1365: 1364: 1348: 1345: 1344: 1319: 1316: 1315: 1312: 1311: 1308: 1305: 1302: 1299: 1296: 1293: 1290: 1287: 1284: 1281: 1278: 1275: 1272: 1269: 1266: 1263: 1260: 1257: 1254: 1251: 1248: 1245: 1242: 1239: 1236: 1233: 1230: 1227: 1224: 1221: 1218: 1215: 1212: 1209: 1206: 1203: 1200: 1197: 1194: 1191: 1188: 1185: 1182: 1179: 1176: 1173: 1170: 1167: 1164: 1161: 1158: 1155: 1152: 1149: 1146: 1143: 1140: 1137: 1134: 1131: 1128: 1125: 1122: 1119: 1116: 1113: 1110: 1107: 1104: 1101: 1098: 1095: 1092: 1089: 1086: 1083: 1080: 1077: 1074: 1071: 1068: 1065: 1062: 1059: 1056: 1053: 1050: 1047: 1044: 1041: 1038: 1035: 1032: 1029: 1026: 1023: 1020: 1017: 1014: 1011: 1008: 1005: 1002: 999: 996: 993: 990: 987: 984: 977: 969: 898: 895: 894: 869: 866: 865: 840: 837: 836: 820: 817: 816: 800: 797: 796: 776: 773: 772: 761: 759:Data structures 739: 735: 726: 722: 716: 712: 703: 699: 693: 689: 678: 675: 674: 657: 653: 651: 648: 647: 630: 626: 624: 621: 620: 600: 596: 594: 591: 590: 574: 571: 570: 553: 549: 534: 530: 524: 520: 509: 506: 505: 488: 484: 482: 479: 478: 458: 454: 452: 449: 448: 428: 424: 406: 402: 393: 389: 378: 375: 374: 357: 353: 351: 348: 347: 331: 328: 327: 308: 305: 304: 243: 240: 239: 221: 186: 175: 174: 173: 168: 157: 151: 148: 129: 120:This article's 116: 112: 101: 95: 92: 81: 78:may be too long 73:This article's 69: 65: 28: 24: 17: 12: 11: 5: 3014: 3004: 3003: 2998: 2993: 2976: 2975: 2973: 2972: 2957: 2946: 2943: 2942: 2940: 2939: 2933: 2931: 2927: 2926: 2924: 2923: 2917: 2915: 2911: 2910: 2908: 2907: 2902: 2897: 2892: 2887: 2882: 2877: 2872: 2871: 2870: 2860: 2855: 2854: 2853: 2848: 2837: 2835: 2829: 2828: 2825: 2824: 2822: 2821: 2820: 2819: 2814: 2804: 2803: 2802: 2797: 2792: 2784: 2779: 2774: 2769: 2763: 2761: 2754: 2753: 2751: 2750: 2745: 2740: 2735: 2730: 2725: 2720: 2715: 2714: 2713: 2708: 2703: 2692: 2690: 2683: 2677: 2676: 2673: 2672: 2670: 2669: 2668: 2667: 2662: 2657: 2652: 2642: 2637: 2632: 2627: 2622: 2617: 2612: 2606: 2604: 2600: 2599: 2597: 2596: 2591: 2586: 2581: 2576: 2571: 2566: 2561: 2556: 2551: 2546: 2540: 2538: 2531: 2525: 2524: 2521: 2520: 2518: 2517: 2512: 2507: 2506: 2505: 2500: 2495: 2490: 2485: 2475: 2474: 2473: 2463: 2462: 2461: 2456: 2446: 2441: 2435: 2433: 2426: 2425: 2423: 2422: 2417: 2412: 2407: 2402: 2397: 2392: 2387: 2382: 2377: 2372: 2371: 2370: 2365: 2360: 2349: 2347: 2340: 2334: 2333: 2330: 2329: 2327: 2326: 2324:Psychoacoustic 2321: 2320: 2319: 2314: 2309: 2301: 2300: 2299: 2294: 2289: 2284: 2279: 2269: 2268: 2267: 2256: 2254: 2250: 2249: 2247: 2246: 2245: 2244: 2239: 2234: 2224: 2219: 2214: 2213: 2212: 2207: 2196: 2194: 2192:Transform type 2185: 2179: 2178: 2175: 2174: 2172: 2171: 2170: 2169: 2161: 2160: 2159: 2156: 2148: 2147: 2146: 2138: 2137: 2136: 2128: 2127: 2126: 2118: 2117: 2116: 2108: 2107: 2106: 2101: 2096: 2087: 2085: 2081: 2080: 2078: 2077: 2072: 2067: 2062: 2057: 2052: 2051: 2050: 2045: 2035: 2030: 2025: 2024: 2023: 2013: 2008: 2003: 1997: 1995: 1991: 1990: 1988: 1987: 1986: 1985: 1980: 1975: 1970: 1965: 1960: 1955: 1950: 1945: 1935: 1929: 1927: 1921: 1920: 1918: 1917: 1916: 1915: 1910: 1905: 1900: 1890: 1885: 1880: 1875: 1870: 1865: 1860: 1859: 1858: 1853: 1848: 1838: 1833: 1828: 1823: 1817: 1815: 1806: 1800: 1799: 1791: 1790: 1783: 1776: 1768: 1760: 1759: 1750: 1741: 1732: 1723: 1714: 1701: 1700: 1698: 1695: 1694: 1693: 1688: 1681: 1678: 1675: 1674: 1670: 1666: 1663: 1659: 1658: 1654: 1650: 1647: 1643: 1642: 1626: 1623: 1619: 1613: 1608: 1593: 1589: 1586: 1582: 1581: 1574: 1570: 1567: 1563: 1562: 1558: 1554: 1551: 1547: 1546: 1542: 1538: 1535: 1531: 1530: 1527: 1526:Implementation 1524: 1521: 1510: 1507: 1490: 1470: 1467: 1464: 1461: 1458: 1438: 1418: 1415: 1412: 1392: 1372: 1352: 1332: 1329: 1326: 1323: 983: 968: 965: 938: 937: 930: 927:priority queue 923: 911: 908: 905: 902: 882: 879: 876: 873: 853: 850: 847: 844: 824: 804: 780: 760: 757: 742: 738: 734: 729: 725: 719: 715: 711: 706: 702: 696: 692: 688: 685: 682: 660: 656: 633: 629: 608: 603: 599: 578: 556: 552: 548: 545: 542: 537: 533: 527: 523: 519: 516: 513: 491: 487: 466: 461: 457: 436: 431: 427: 423: 420: 417: 414: 409: 405: 401: 396: 392: 388: 385: 382: 360: 356: 335: 315: 312: 289: 286: 283: 280: 277: 274: 271: 268: 265: 262: 259: 256: 253: 250: 247: 220: 217: 188: 187: 170: 169: 152:September 2019 124:external links 119: 117: 110: 103: 102: 96:September 2019 72: 70: 63: 58: 32: 31: 29: 22: 15: 9: 6: 4: 3: 2: 3013: 3002: 2999: 2997: 2994: 2992: 2989: 2988: 2986: 2970: 2966: 2958: 2956: 2948: 2947: 2944: 2938: 2935: 2934: 2932: 2928: 2922: 2919: 2918: 2916: 2912: 2906: 2903: 2901: 2898: 2896: 2893: 2891: 2888: 2886: 2883: 2881: 2878: 2876: 2873: 2869: 2866: 2865: 2864: 2861: 2859: 2856: 2852: 2849: 2847: 2844: 2843: 2842: 2839: 2838: 2836: 2834: 2830: 2818: 2815: 2813: 2810: 2809: 2808: 2805: 2801: 2798: 2796: 2793: 2791: 2788: 2787: 2785: 2783: 2780: 2778: 2775: 2773: 2770: 2768: 2765: 2764: 2762: 2759: 2755: 2749: 2748:Video quality 2746: 2744: 2741: 2739: 2736: 2734: 2731: 2729: 2726: 2724: 2721: 2719: 2716: 2712: 2709: 2707: 2704: 2702: 2699: 2698: 2697: 2694: 2693: 2691: 2687: 2684: 2682: 2678: 2666: 2663: 2661: 2658: 2656: 2653: 2651: 2648: 2647: 2646: 2643: 2641: 2638: 2636: 2633: 2631: 2628: 2626: 2623: 2621: 2618: 2616: 2613: 2611: 2608: 2607: 2605: 2601: 2595: 2592: 2590: 2587: 2585: 2582: 2580: 2577: 2575: 2572: 2570: 2567: 2565: 2562: 2560: 2557: 2555: 2552: 2550: 2547: 2545: 2542: 2541: 2539: 2535: 2532: 2530: 2526: 2516: 2513: 2511: 2508: 2504: 2501: 2499: 2496: 2494: 2491: 2489: 2486: 2484: 2481: 2480: 2479: 2476: 2472: 2469: 2468: 2467: 2464: 2460: 2457: 2455: 2452: 2451: 2450: 2447: 2445: 2442: 2440: 2437: 2436: 2434: 2431: 2427: 2421: 2418: 2416: 2415:Speech coding 2413: 2411: 2410:Sound quality 2408: 2406: 2403: 2401: 2398: 2396: 2393: 2391: 2388: 2386: 2385:Dynamic range 2383: 2381: 2378: 2376: 2373: 2369: 2366: 2364: 2361: 2359: 2356: 2355: 2354: 2351: 2350: 2348: 2344: 2341: 2339: 2335: 2325: 2322: 2318: 2315: 2313: 2310: 2308: 2305: 2304: 2302: 2298: 2295: 2293: 2290: 2288: 2285: 2283: 2280: 2278: 2275: 2274: 2273: 2270: 2266: 2263: 2262: 2261: 2258: 2257: 2255: 2251: 2243: 2240: 2238: 2235: 2233: 2230: 2229: 2228: 2225: 2223: 2220: 2218: 2215: 2211: 2208: 2206: 2203: 2202: 2201: 2198: 2197: 2195: 2193: 2189: 2186: 2184: 2180: 2168: 2165: 2164: 2162: 2157: 2155: 2152: 2151: 2150:LZ77 + Range 2149: 2145: 2142: 2141: 2139: 2135: 2132: 2131: 2129: 2125: 2122: 2121: 2119: 2115: 2112: 2111: 2109: 2105: 2102: 2100: 2097: 2095: 2092: 2091: 2089: 2088: 2086: 2082: 2076: 2073: 2071: 2068: 2066: 2063: 2061: 2058: 2056: 2053: 2049: 2046: 2044: 2041: 2040: 2039: 2036: 2034: 2031: 2029: 2026: 2022: 2019: 2018: 2017: 2014: 2012: 2009: 2007: 2004: 2002: 1999: 1998: 1996: 1992: 1984: 1981: 1979: 1976: 1974: 1971: 1969: 1966: 1964: 1961: 1959: 1956: 1954: 1951: 1949: 1946: 1944: 1941: 1940: 1939: 1936: 1934: 1931: 1930: 1928: 1926: 1922: 1914: 1911: 1909: 1906: 1904: 1901: 1899: 1896: 1895: 1894: 1891: 1889: 1886: 1884: 1881: 1879: 1876: 1874: 1871: 1869: 1866: 1864: 1861: 1857: 1854: 1852: 1849: 1847: 1844: 1843: 1842: 1839: 1837: 1834: 1832: 1829: 1827: 1824: 1822: 1819: 1818: 1816: 1814: 1810: 1807: 1805: 1801: 1796: 1789: 1784: 1782: 1777: 1775: 1770: 1769: 1766: 1754: 1745: 1736: 1727: 1718: 1709: 1707: 1702: 1692: 1689: 1687: 1684: 1683: 1671: 1669: 1667: 1664: 1661: 1660: 1655: 1653: 1651: 1648: 1645: 1644: 1640: 1621: 1617: 1611: 1598: 1594: 1592: 1590: 1587: 1584: 1583: 1579: 1575: 1573: 1571: 1568: 1565: 1564: 1559: 1557: 1555: 1552: 1549: 1548: 1543: 1541: 1539: 1536: 1533: 1532: 1528: 1525: 1522: 1519: 1518: 1515: 1506: 1504: 1488: 1468: 1465: 1462: 1459: 1456: 1436: 1416: 1413: 1410: 1390: 1370: 1350: 1330: 1327: 1321: 1285:encodeCFG_rec 1180:encodeCFG_rec 1168:encodeCFG_rec 1063:encodeCFG_rec 981: 975: 964: 962: 957: 952: 949: 947: 942: 935: 931: 928: 924: 906: 900: 877: 871: 848: 842: 822: 802: 794: 778: 770: 766: 765: 764: 756: 740: 736: 732: 727: 723: 717: 713: 709: 704: 700: 694: 690: 686: 683: 680: 658: 654: 631: 627: 606: 601: 597: 576: 554: 550: 546: 543: 540: 535: 531: 525: 521: 517: 514: 511: 489: 485: 464: 459: 455: 434: 429: 425: 421: 418: 415: 412: 407: 403: 399: 394: 390: 386: 383: 380: 358: 354: 333: 313: 310: 301: 287: 284: 281: 278: 275: 272: 269: 266: 263: 260: 257: 254: 251: 248: 245: 236: 232: 225: 216: 212: 210: 206: 202: 198: 194: 184: 181: 166: 163: 155: 145: 141: 140:inappropriate 137: 133: 127: 125: 118: 109: 108: 99: 89: 85: 79: 77: 71: 62: 61: 56: 54: 47: 46: 41: 40: 35: 30: 21: 20: 2921:Hutter Prize 2885:Quantization 2790:Compensation 2584:Quantization 2307:Compensation 2042: 1873:Shannon–Fano 1813:Entropy type 1753: 1744: 1735: 1726: 1717: 1638: 1596: 1585:Memory usage 1529:Description 1512: 1502: 1313: 978:encodeCFG(X) 973: 970: 953: 950: 943: 939: 933: 926: 792: 768: 762: 302: 237: 233: 230: 219:How it works 213: 196: 192: 191: 176: 158: 149: 134:by removing 121: 93: 76:lead section 74: 50: 43: 37: 36:Please help 33: 2880:Prefix code 2733:Frame types 2554:Color space 2380:Convolution 2110:LZ77 + ANS 2021:Incremental 1994:Other types 1913:Levenshtein 1662:Compression 1646:Compression 1520:Improvement 1237:writeSymbol 997:writeSymbol 447:, which is 195:(short for 2985:Categories 2937:Mark Adler 2895:Redundancy 2812:Daubechies 2795:Estimation 2728:Frame rate 2650:Daubechies 2610:Chain code 2569:Macroblock 2375:Companding 2312:Estimation 2232:Daubechies 1938:Lempel–Ziv 1898:Exp-Golomb 1826:Arithmetic 1697:References 934:hash table 39:improve it 2914:Community 2738:Interlace 2124:Zstandard 1903:Fibonacci 1893:Universal 1851:Canonical 1625:⌉ 1607:⌈ 1466:− 1460:≤ 1414:− 1325:→ 1267:encodeCFG 207:, i.e. a 136:excessive 86:and help 45:talk page 2900:Symmetry 2868:Timeline 2851:FM-index 2696:Bit rate 2689:Concepts 2537:Concepts 2400:Sampling 2353:Bit rate 2346:Concepts 2048:Sequitur 1883:Tunstall 1856:Modified 1846:Adaptive 1804:Lossless 1680:See also 1566:Encoding 1550:Original 1509:Versions 1252:assigned 1243:terminal 1099:terminal 769:sequence 2858:Entropy 2807:Wavelet 2786:Motion 2645:Wavelet 2625:Fractal 2620:Deflate 2603:Methods 2390:Latency 2303:Motion 2227:Wavelet 2144:LHA/LZH 2094:Deflate 2043:Re-Pair 2038:Grammar 1868:Shannon 1841:Huffman 1797:methods 1503:Re-Pair 1126:appears 1051:bitslen 1015:bitslen 199:) is a 193:Re-Pair 130:Please 122:use of 2969:codecs 2930:People 2833:Theory 2800:Vector 2317:Vector 2134:Brotli 2084:Hybrid 1983:Snappy 1836:Golomb 1273:symbol 1198:symbol 1192:assign 1120:symbol 1069:symbol 1045:binary 1003:symbol 544:123123 419:123123 276:123123 2760:parts 2758:Codec 2723:Frame 2681:Video 2665:SPIHT 2574:Pixel 2529:Image 2483:ACELP 2454:ADPCM 2444:ÎĽ-law 2439:A-law 2432:parts 2430:Codec 2338:Audio 2277:ACELP 2265:ADPCM 2242:SPIHT 2183:Lossy 2167:bzip2 2158:LZHAM 2114:LZFSE 2016:Delta 1908:Gamma 1888:Unary 1863:Range 1449:with 1297:write 1249:value 1225:write 1204:value 1156:write 1114:first 1048:using 1036:write 2772:DPCM 2579:PSNR 2510:MDCT 2503:WLPC 2488:CELP 2449:DPCM 2297:WLPC 2282:CELP 2260:DPCM 2210:MDCT 2154:LZMA 2055:LDCT 2033:DPCM 1978:LZWL 1968:LZSS 1963:LZRW 1953:LZJB 1665:2018 1649:2017 1588:2017 1569:2013 1553:2011 1537:2003 1523:Year 1264:void 1219:else 1138:rule 1135:take 1117:time 1105:this 1060:void 1054:bits 893:and 815:and 646:and 589:and 2817:DWT 2767:DCT 2711:VBR 2706:CBR 2701:ABR 2660:EZW 2655:DWT 2640:RLE 2630:KLT 2615:DCT 2498:LSP 2493:LAR 2478:LPC 2471:FFT 2368:VBR 2363:CBR 2358:ABR 2292:LSP 2287:LAR 2272:LPC 2237:DWT 2222:FFT 2217:DST 2205:DCT 2104:LZS 2099:LZX 2075:RLE 2070:PPM 2065:PAQ 2060:MTF 2028:DMC 2006:CTW 2001:BWT 1973:LZW 1958:LZO 1948:LZ4 1943:842 1383:or 1300:bit 1228:bit 1159:bit 1111:the 1102:and 1093:non 1021:log 991:256 138:or 2987:: 2635:LP 2466:FT 2459:DM 2011:CM 1705:^ 1294:); 1207:++ 1195:to 1189:); 1177:); 1108:is 1090:is 1081:if 1042:in 1030:); 932:A 925:A 864:, 767:A 577:12 300:. 48:. 2971:) 2967:( 1787:e 1780:t 1773:v 1622:3 1618:/ 1612:n 1489:0 1469:1 1463:i 1457:j 1437:j 1417:1 1411:i 1391:Z 1371:Y 1351:i 1331:Z 1328:Y 1322:X 1309:} 1306:; 1303:1 1291:s 1288:( 1282:{ 1279:) 1276:s 1270:( 1261:} 1258:} 1255:) 1246:/ 1240:( 1234:; 1231:0 1222:{ 1216:} 1213:; 1201:s 1186:Y 1183:( 1174:X 1171:( 1165:; 1162:1 1153:; 1150:Y 1147:X 1144:→ 1141:s 1132:{ 1129:) 1123:s 1096:- 1087:s 1084:( 1078:{ 1075:) 1072:s 1066:( 1057:} 1039:s 1024:( 1018:= 1012:{ 1009:) 1006:s 1000:( 988:= 910:] 907:m 904:[ 901:w 881:] 878:k 875:[ 872:w 852:] 849:i 846:[ 843:w 823:m 803:k 793:i 779:i 741:2 737:R 733:z 728:4 724:R 718:4 714:R 710:y 705:2 701:R 695:2 691:R 687:x 684:= 681:w 659:4 655:R 632:3 628:R 607:3 602:3 598:R 555:2 551:R 547:z 541:y 536:2 532:R 526:2 522:R 518:x 515:= 512:w 490:2 486:R 465:c 460:1 456:R 435:c 430:1 426:R 422:z 416:y 413:c 408:1 404:R 400:c 395:1 391:R 387:x 384:= 381:w 359:1 355:R 334:w 314:b 311:a 288:c 285:b 282:a 279:z 273:y 270:c 267:b 264:a 261:c 258:b 255:a 252:x 249:= 246:w 183:) 177:( 165:) 159:( 154:) 150:( 146:. 128:. 98:) 94:( 90:. 80:. 55:) 51:(

Index

improve it
talk page
Learn how and when to remove these messages
lead section
length guidelines
move details into the article's body
external links
improve this article
excessive
inappropriate
footnote references
Learn how and when to remove this message
Learn how and when to remove this message
grammar-based compression
straight-line program
context-free grammar

Data structure to implement the Recursive Pairing algorithm with linear runtime and space usage.
State of the data structures used by the Recursive Pairing algorithm after going through the input text.
State of the data structures used by the Recursive Pairing algorithm after performing the first pair replacement.



Variable Length



Byte pair encoding
Sequitur algorithm

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.

↑