
Linear predictive coding

Linear predictive coding (LPC) is a method used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital signal of speech in compressed form, using the information of a linear predictive model.

LPC is the most widely used method in speech coding and speech synthesis. It is a powerful speech analysis technique, and a useful method for encoding good quality speech at a low bit rate.
Overview

LPC starts with the assumption that a speech signal is produced by a buzzer at the end of a tube (for voiced sounds), with occasional added hissing and popping sounds (for voiceless sounds such as sibilants and plosives). Although apparently crude, this source–filter model is actually a close approximation of the reality of speech production. The glottis (the space between the vocal folds) produces the buzz, which is characterized by its intensity (loudness) and frequency (pitch). The vocal tract (the throat and mouth) forms the tube, which is characterized by its resonances; these resonances give rise to formants, or enhanced frequency bands in the sound produced. Hisses and pops are generated by the action of the tongue, lips and throat during sibilants and plosives.

LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and estimating the intensity and frequency of the remaining buzz. The process of removing the formants is called inverse filtering, and the remaining signal after the subtraction of the filtered modeled signal is called the residue.

The numbers which describe the intensity and frequency of the buzz, the formants, and the residue signal can be stored or transmitted somewhere else. LPC synthesizes the speech signal by reversing the process: use the buzz parameters and the residue to create a source signal, use the formants to create a filter (which represents the tube), and run the source through the filter, resulting in speech.
Because speech signals vary with time, this process is done on short chunks of the speech signal, which are called frames; generally, 30 to 50 frames per second give intelligible speech with good compression.
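The per-frame analysis described above is commonly implemented with the autocorrelation method followed by the Levinson–Durbin recursion. The sketch below is illustrative only; the frame length, predictor order and synthetic test signal are arbitrary choices for the example, not values taken from any particular codec.

```python
import math

def autocorr(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns (a, e): a = [1, a1, ..., a_order] such that the predictor is
    x[n] ~ -(a[1]*x[n-1] + ... + a[order]*x[n-order]), and e is the
    remaining prediction-error (residue) energy.
    """
    a = [0.0] * (order + 1)
    a[0] = 1.0
    e = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / e                 # reflection coefficient for this step
        a_new = a[:]
        for j in range(1, i):        # symmetric coefficient update
            a_new[j] = a[j] + k * a[i - j]
        a_new[i] = k
        a = a_new
        e *= (1.0 - k * k)           # error energy shrinks at each order
    return a, e

# Example frame: a decaying 200 Hz sinusoid sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * t / fs) * (0.99 ** t) for t in range(240)]
r = autocorr(frame, 10)
a, err = levinson_durbin(r, 10)      # 10th-order LPC fit of one frame
```

Because the frame is close to a pure resonance, the residue energy `err` comes out far smaller than the frame energy `r[0]`, which is exactly the compression effect the article describes.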
LPC coefficient representations

LPC is frequently used for transmitting spectral envelope information, and as such it has to be tolerant of transmission errors. Transmission of the filter coefficients directly (see linear prediction for a definition of coefficients) is undesirable, since they are very sensitive to errors. In other words, a very small error can distort the whole spectrum, or worse, a small error might make the prediction filter unstable.

There are more advanced representations such as log area ratios (LAR), line spectral pairs (LSP) decomposition and reflection coefficients. Of these, especially LSP decomposition has gained popularity since it ensures the stability of the predictor, and spectral errors are local for small coefficient deviations.
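The stability concern can be made concrete: the all-pole synthesis filter built from the coefficients is stable exactly when every corresponding reflection coefficient has magnitude below one, which is one reason reflection coefficients are an attractive transmission format. A minimal sketch of the standard step-down (backward Levinson) recursion; the function names are illustrative, not from any library:

```python
def to_reflection(a):
    """Convert direct-form LPC coefficients [1, a1, ..., ap] to
    reflection coefficients [k1, ..., kp] via the step-down recursion."""
    a = list(a)
    ks = []
    for p in range(len(a) - 1, 0, -1):
        k = a[p]
        ks.append(k)
        if abs(k) >= 1.0:
            break                      # already unstable; stop here
        denom = 1.0 - k * k
        a = [(a[j] - k * a[p - j]) / denom for j in range(p)]
    ks.reverse()
    return ks

def is_stable(a):
    """True iff the synthesis filter 1/A(z) is stable (all |k| < 1)."""
    return all(abs(k) < 1.0 for k in to_reflection(a))

print(is_stable([1.0, -0.9]))   # → True: single pole at z = 0.9
print(is_stable([1.0, -1.1]))   # → False: a small error pushed the pole to z = 1.1
```

The second case shows the failure mode described above: a perturbation of a single direct-form coefficient silently moves a pole outside the unit circle, whereas a quantized reflection coefficient can simply be clamped to magnitude below one.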
Applications

LPC is the most widely used method in speech coding and speech synthesis. It is generally used for speech analysis and resynthesis. It is used as a form of voice compression by phone companies, such as in the GSM standard, for example. It is also used for secure wireless, where voice must be digitized, encrypted and sent over a narrow voice channel; an early example of this is the US government's Navajo I.

LPC synthesis can be used to construct vocoders where musical instruments are used as an excitation signal to the time-varying filter estimated from a singer's speech. This is somewhat popular in electronic music. Paul Lansky made the well-known computer music piece notjustmoreidlechatter using linear predictive coding. A 10th-order LPC was used in the popular 1980s Speak & Spell educational toy.

LPC predictors are used in Shorten, MPEG-4 ALS, FLAC, the SILK audio codec, and other lossless audio codecs.

LPC has received some attention as a tool for use in the tonal analysis of violins and other stringed musical instruments.
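Resynthesis, as used by the vocoder application above, amounts to driving the all-pole filter 1/A(z) with an excitation signal. A minimal sketch, with an impulse-train "buzz" as the excitation and a hand-picked stable second-order resonator standing in for coefficients estimated from real speech:

```python
def lpc_synthesize(a, excitation):
    """Run an excitation through the all-pole synthesis filter 1/A(z):
    y[n] = e[n] - a[1]*y[n-1] - ... - a[p]*y[n-p], with a[0] = 1."""
    p = len(a) - 1
    y = []
    for n, e in enumerate(excitation):
        s = e - sum(a[j] * y[n - j] for j in range(1, p + 1) if n - j >= 0)
        y.append(s)
    return y

# Crude "buzz" source: a 100 Hz impulse train, assuming 8 kHz sampling.
excitation = [1.0 if n % 80 == 0 else 0.0 for n in range(400)]

# Toy 2nd-order resonator (pole radius 0.9) in place of analyzed coefficients.
voiced = lpc_synthesize([1.0, -1.6, 0.81], excitation)
```

Replacing the impulse train with noise yields an unvoiced sound, and replacing it with an instrument signal gives the cross-synthesis ("talking instrument") effect mentioned above; in a real codec the coefficients would change from frame to frame.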
Early history

Linear prediction (signal estimation) goes back to at least the 1940s, when Norbert Wiener developed a mathematical theory for calculating the best filters and predictors for detecting signals hidden in noise. Soon after Claude Shannon established a general theory of coding, work on predictive coding was done by C. Chapin Cutler, Bernard M. Oliver and Henry C. Harrison. Peter Elias in 1955 published two papers on predictive coding of signals.

Linear predictors were applied to speech analysis independently by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone in 1966 and in 1967 by Bishnu S. Atal, Manfred R. Schroeder and John Burg. Itakura and Saito described a statistical approach based on maximum likelihood estimation; Atal and Schroeder described an adaptive linear predictor approach; Burg outlined an approach based on the principle of maximum entropy.

In 1969, Itakura and Saito introduced a method based on partial correlation (PARCOR), Glen Culler proposed real-time speech encoding, and Bishnu S. Atal presented an LPC speech coder at the Annual Meeting of the Acoustical Society of America. In 1971, realtime LPC using 16-bit LPC hardware was demonstrated by Philco-Ford; four units were sold. LPC technology was advanced by Bishnu Atal and Manfred Schroeder during the 1970s–1980s. In 1978, Atal and Vishwanath et al. of BBN developed the first variable-rate LPC algorithm. The same year, Atal and Manfred R. Schroeder at Bell Labs proposed an LPC speech codec called adaptive predictive coding, which used a psychoacoustic coding algorithm exploiting the masking properties of the human ear. This later became the basis for the perceptual coding technique used by the MP3 audio compression format, introduced in 1993. Code-excited linear prediction (CELP) was developed by Schroeder and Atal in 1985.

LPC is the basis for voice-over-IP (VoIP) technology. In 1972, Bob Kahn of ARPA, with Jim Forgie of Lincoln Laboratory (LL) and Dave Walden of BBN Technologies, started the first developments in packetized speech, which would eventually lead to voice-over-IP technology. In 1973, according to Lincoln Laboratory informal history, the first real-time 2400 bit/s LPC was implemented by Ed Hofstetter. In 1974, the first real-time two-way LPC packet speech communication was accomplished over the ARPANET at 3500 bit/s between Culler-Harrison and Lincoln Laboratory. In 1976, the first LPC conference took place over the ARPANET using the Network Voice Protocol, between Culler-Harrison, ISI, SRI, and LL at 3500 bit/s.

See also

- Akaike information criterion
- Audio compression
- Code-excited linear prediction (CELP)
- FS-1015
- FS-1016
- Generalized filtering
- Linear prediction
- Linear predictive analysis
- Pitch estimation
- Warped linear predictive coding

References

- Gray, Robert M. (2010). "A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol". Foundations and Trends in Signal Processing. 3 (4): 203–303.
- Deng, Li; O'Shaughnessy, Douglas (2003). Speech Processing: A Dynamic and Optimization-Oriented Approach. Marcel Dekker. pp. 41–48. ISBN 978-0-8247-4040-5.
- Atal, B. S. (2006). "The history of linear prediction". IEEE Signal Processing Magazine. 23 (2): 154–161.
- US 2605361, C. C. Cutler, "Differential quantization of communication signals", published 1952-07-29.
- Oliver, B. M. (1952). "Efficient coding". The Bell System Technical Journal. 31 (4): 724–750.
- Harrison, H. C. (1952). "Experiments with linear prediction in television". The Bell System Technical Journal. 31 (4): 764–783.
- Elias, P. (1955). "Predictive coding I". IRE Transactions on Information Theory. IT-1 (1): 16–24.
- Elias, P. (1955). "Predictive coding II". IRE Transactions on Information Theory. IT-1 (1): 24–33.
- Saito, S.; Itakura, F. (Jan 1967). "Theoretical consideration of the statistical optimum recognition of the spectral density of speech". J. Acoust. Soc. Japan.
- Atal, B. S.; Schroeder, M. R. (1967). "Predictive coding of speech". Conf. Communications and Proc.
- Burg, J. P. (1967). "Maximum Entropy Spectral Analysis". Proceedings of 37th Meeting, Society of Exploration Geophysics, Oklahoma City.
- Atal, B.; Schroeder, M. (1978). "Predictive coding of speech signals and subjective error criteria". ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 3. pp. 573–576.
- Schroeder, Manfred R.; Atal, Bishnu S. (1985). "Code-excited linear prediction (CELP): High-quality speech at very low bit rates". ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 10. pp. 937–940.
- Schroeder, Manfred R. (2014). Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder. Springer. p. 388. ISBN 9783319056609.
- Gupta, Shipra (May 2016). "Application of MFCC in Text Independent Speaker Recognition". International Journal of Advanced Research in Computer Science and Software Engineering. 6 (5): 805–810.
- Beigi, Homayoon (2011). Fundamentals of Speaker Recognition. Berlin: Springer-Verlag. ISBN 978-0-387-77591-3.
- Sasahira, Y.; Hashimoto, S. (1995). "Voice pitch changing by Linear Predictive Coding Method to keep the Singer's Personal Timbre". Michigan Publishing.
- Lansky, Paul. "More Than Idle Chatter".
- Tai, Hwan-Ching; Chung, Dai-Ting (June 14, 2012). "Stradivari Violins Exhibit Formant Frequencies Resembling Vowels Produced by Females". Savart Journal. 1 (2).

Further reading

- O'Shaughnessy, D. (1988). "Linear predictive coding". IEEE Potentials. 7 (1): 29–32.
- El-Jaroudi, Amro (2003). "Linear Predictive Coding". Wiley Encyclopedia of Telecommunications. ISBN 978-0471219286.
- Bundy, Alan; Wallen, Lincoln (1984). "Linear Predictive Coding". Catalogue of Artificial Intelligence Tools. Symbolic Computation. Springer. p. 61. ISBN 978-3-540-13938-6.

External links

- real-time LPC analysis/synthesis learning software
- 30 years later Dr Richard Wiggins Talks Speak & Spell development
- Robert M. Gray, IEEE Signal Processing Society, Distinguished Lecturer Program


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
