Grammar induction

Grammar induction (or grammatical inference) is the process in machine learning of learning a formal grammar (usually as a collection of re-write rules or productions or, alternatively, as a finite state machine or automaton of some kind) from a set of observations, thus constructing a model which accounts for the characteristics of the observed objects. More generally, grammatical inference is the branch of machine learning where the instance space consists of discrete combinatorial objects such as strings, trees and graphs.
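As a concrete illustration of the "re-write rules" view (a minimal sketch of my own, not drawn from the cited sources; the grammar and helper function are hypothetical), the following Python fragment represents a toy context-free grammar as a dictionary of productions and enumerates the strings it generates:

    # A toy context-free grammar written as re-write (production) rules.
    # Nonterminals are the dictionary keys; everything else is a terminal.
    import itertools

    GRAMMAR = {
        "S": [["NP", "VP"]],
        "NP": [["the", "N"]],
        "VP": [["V", "NP"]],
        "N": [["dog"], ["cat"]],
        "V": [["sees"], ["chases"]],
    }

    def generate(symbol):
        """Yield all terminal strings derivable from `symbol`."""
        if symbol not in GRAMMAR:          # terminal symbol
            yield [symbol]
            return
        for production in GRAMMAR[symbol]:
            # Expand every symbol of the production and combine the results.
            for parts in itertools.product(*(list(generate(s)) for s in production)):
                yield [token for part in parts for token in part]

    if __name__ == "__main__":
        for sentence in generate("S"):
            print(" ".join(sentence))      # e.g. "the dog sees the cat"

Grammar induction asks the reverse question: given only such sentences (and perhaps counter-examples), recover a rule set of this kind.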
Grammar classes

Grammatical inference has often been very focused on the problem of learning finite state machines of various types (see the article Induction of regular languages for details on these approaches), since there have been efficient algorithms for this problem since the 1980s.

Since the beginning of the century, these approaches have been extended to the problem of inference of context-free grammars and richer formalisms, such as multiple context-free grammars and parallel multiple context-free grammars. Other classes of grammars for which grammatical inference has been studied are combinatory categorial grammars, stochastic context-free grammars, contextual grammars and pattern languages.
Learning models

The simplest form of learning is where the learning algorithm merely receives a set of examples drawn from the language in question: the aim is to learn the language from examples of it (and, rarely, from counter-examples, that is, examples that do not belong to the language). However, other learning models have been studied. One frequently studied alternative is the case where the learner can ask membership queries, as in the exact query learning model or the minimally adequate teacher model introduced by Angluin.
Methodologies

There is a wide variety of methods for grammatical inference. Two of the classic sources are Fu (1977) and Fu (1982). Duda, Hart & Stork (2001) also devote a brief section to the problem and cite a number of references. The basic trial-and-error method they present is discussed below. For approaches to infer subclasses of regular languages in particular, see Induction of regular languages. A more recent textbook is de la Higuera (2010), which covers the theory of grammatical inference of regular languages and finite state automata. D'Ulizia, Ferri and Grifoni provide a survey that explores grammatical inference methods for natural languages.
Grammatical inference by trial-and-error

The method proposed in Section 8.7 of Duda, Hart & Stork (2001) suggests successively guessing grammar rules (productions) and testing them against positive and negative observations. The rule set is expanded so as to be able to generate each positive example, but if a given rule set also generates a negative example, it must be discarded. This particular approach can be characterized as "hypothesis testing" and bears some similarity to Mitchell's version space algorithm. The Duda, Hart & Stork (2001) text provides a simple example which nicely illustrates the process, but the feasibility of such an unguided trial-and-error approach for more substantial problems is dubious.
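The following Python sketch shows the shape of such a generate-and-test loop; it is an illustration under strong simplifying assumptions, not the textbook's code. A hand-written list of candidate rule sets stands in for a systematic enumeration of hypotheses, and membership is tested by brute-force derivation, which is only feasible for toy grammars (a real system would use a parser such as CYK).

    # Trial-and-error ("hypothesis testing") grammar inference -- a minimal sketch.
    # A candidate grammar maps a nonterminal (uppercase) to a list of
    # right-hand sides (tuples of symbols); lowercase symbols are terminals.
    def language(grammar, start="S", max_len=6):
        """All terminal strings of length <= max_len derivable from `start`."""
        strings, seen, frontier = set(), set(), {(start,)}
        while frontier:
            new_frontier = set()
            for form in frontier:
                if form in seen or len(form) > max_len:
                    continue                      # prune repeats and long forms
                seen.add(form)
                if all(sym.islower() for sym in form):
                    strings.add("".join(form))    # fully terminal sentential form
                    continue
                i = next(k for k, sym in enumerate(form) if sym.isupper())
                for rhs in grammar.get(form[i], []):
                    new_frontier.add(form[:i] + tuple(rhs) + form[i + 1:])
            frontier = new_frontier
        return strings

    def infer(candidates, positives, negatives):
        """Return the first candidate rule set consistent with the sample."""
        for grammar in candidates:
            lang = language(grammar)
            if all(p in lang for p in positives) and not any(n in lang for n in negatives):
                return grammar
        return None

    # Hypothetical hypothesis sequence for the target language { a^n b^n : n >= 1 }.
    candidates = [
        {"S": [("a",), ("b",)]},               # rejected: cannot generate "ab"
        {"S": [("a", "S"), ("b",)]},           # rejected: generates the negative "aab"
        {"S": [("a", "S", "b"), ("a", "b")]},  # accepted
    ]
    print(infer(candidates, positives=["ab", "aabb"], negatives=["aab", "ba"]))

The weakness the text points out is visible here: without guidance, the candidate list (the hypothesis space) must be enumerated blindly, which does not scale beyond small examples.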
Grammatical inference by genetic algorithms

Grammatical induction using evolutionary algorithms is the process of evolving a representation of the grammar of a target language through some evolutionary process. Formal grammars can easily be represented as tree structures of production rules that can be subjected to evolutionary operators. Algorithms of this sort stem from the genetic programming paradigm pioneered by John Koza. Other early work on simple formal languages used the binary string representation of genetic algorithms, but the inherently hierarchical structure of grammars couched in the EBNF language made trees a more flexible approach.

Koza represented Lisp programs as trees. He was able to find analogues to the genetic operators within the standard set of tree operators. For example, swapping sub-trees is equivalent to the corresponding process of genetic crossover, where sub-strings of a genetic code are transplanted into an individual of the next generation. Fitness is measured by scoring the output from the functions of the Lisp code. Similar analogues between the tree-structured Lisp representation and the representation of grammars as trees made the application of genetic programming techniques possible for grammar induction.

In the case of grammar induction, the transplantation of sub-trees corresponds to the swapping of production rules that enable the parsing of phrases from some language. The fitness operator for the grammar is based upon some measure of how well it performed in parsing some group of sentences from the target language. In a tree representation of a grammar, a terminal symbol of a production rule corresponds to a leaf node of the tree, and its parent node corresponds to a non-terminal symbol (e.g. a noun phrase or a verb phrase) in the rule set. Ultimately, the root node might correspond to a sentence non-terminal.
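A minimal sketch of this idea, under assumptions of my own rather than any published system: an individual is a set of production rules, the crossover operator grafts a rule from one parent into a copy of the other (a crude analogue of Koza's subtree exchange), and fitness is the fraction of corpus sentences the grammar can derive. The brute-force derives() check stands in for a real parser.

    import random

    def derives(rules, target, form=("S",), depth=8):
        """True if `rules` (nonterminal -> list of right-hand sides) derives `target`."""
        if depth == 0 or len(form) > len(target) + 2:
            return False                          # crude pruning; assumes no epsilon rules
        if all(s.islower() for s in form):
            return "".join(form) == target
        i = next(k for k, s in enumerate(form) if s.isupper())
        return any(
            derives(rules, target, form[:i] + tuple(rhs) + form[i + 1:], depth - 1)
            for rhs in rules.get(form[i], [])
        )

    def fitness(rules, corpus):
        """Fraction of the sample sentences that the grammar parses."""
        return sum(derives(rules, s) for s in corpus) / len(corpus)

    def crossover(parent_a, parent_b):
        """Copy parent_a and graft in one randomly chosen production of parent_b."""
        child = {nt: list(rhss) for nt, rhss in parent_a.items()}
        nt = random.choice(list(parent_b))
        child.setdefault(nt, []).append(random.choice(parent_b[nt]))
        return child

    corpus = ["ab", "aabb", "aaabbb"]
    parent_a = {"S": [("a", "b")]}                    # parses only "ab"
    parent_b = {"S": [("a", "S", "b"), ("a", "b")]}   # parses all of a^n b^n
    child = crossover(parent_a, parent_b)
    print(fitness(parent_a, corpus), fitness(parent_b, corpus), fitness(child, corpus))

In a full evolutionary run, many such individuals would be selected, recombined and mutated over generations, with the parsing-based fitness score driving the population toward grammars that cover the sample sentences.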
Grammatical inference by greedy algorithms

Like all greedy algorithms, greedy grammar inference algorithms make, in an iterative manner, decisions that seem to be the best at that stage. The decisions made usually deal with things like the creation of new rules, the removal of existing rules, the choice of a rule to be applied, or the merging of some existing rules. Because there are several ways to define 'the stage' and 'the best', there are also several greedy grammar inference algorithms.

These context-free grammar generating algorithms make the decision after every read symbol:

- The Lempel-Ziv-Welch algorithm creates a context-free grammar in a deterministic way such that it is necessary to store only the start rule of the generated grammar.
- Sequitur and its modifications.

These context-free grammar generating algorithms first read the whole given symbol-sequence and then start to make decisions:

- Byte pair encoding and its optimizations (a sketch of this style of construction follows this list).
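The sketch below illustrates the general shape of such a greedy construction; it is my own simplified illustration, not a faithful reimplementation of byte pair encoding or Sequitur. The most frequent adjacent symbol pair is repeatedly replaced by a fresh nonterminal, yielding a straight-line grammar for the input string.

    # Greedy, byte-pair-style grammar construction (illustrative sketch only).
    from collections import Counter

    def bpe_grammar(text, max_rules=10):
        sequence = list(text)
        rules = {}                       # nonterminal -> (left_symbol, right_symbol)
        for i in range(max_rules):
            pairs = Counter(zip(sequence, sequence[1:]))
            if not pairs:
                break
            pair, count = pairs.most_common(1)[0]
            if count < 2:                # no repeated pair left, stop
                break
            nonterminal = f"R{i}"
            rules[nonterminal] = pair
            # Replace non-overlapping occurrences of the pair, left to right.
            new_seq, j = [], 0
            while j < len(sequence):
                if j + 1 < len(sequence) and (sequence[j], sequence[j + 1]) == pair:
                    new_seq.append(nonterminal)
                    j += 2
                else:
                    new_seq.append(sequence[j])
                    j += 1
            sequence = new_seq
        return sequence, rules           # start rule plus the extracted productions

    start, rules = bpe_grammar("abababcabab")
    print(start)   # ['R1', 'R0', 'c', 'R1']
    print(rules)   # {'R0': ('a', 'b'), 'R1': ('R0', 'R0')}

Each greedy step is locally optimal (the most frequent pair), which is exactly the design choice the paragraph above attributes to this family of algorithms; different definitions of "best" and of when decisions are taken give the different algorithms listed.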
757: 694: 604: 582: 425: 415: 1831:"Learning One-Variable Pattern Languages Very Efficiently on Average, in Parallel, and by Asking Queries" 1389: 1304: 1118: 908: 820: 805: 266: 88: 1561:
The language of a pattern with at least two occurrences of the same variable is not regular due to the
994:
Since the beginning of the century, these approaches have been extended to the problem of inference of
868: 795: 545: 440: 228: 161: 121: 1279:
of a data set using real world data rather than artificial stimuli, which was commonplace at the time.
2015: 1383: 1316: 922: 528: 296: 166: 2101: 2086: 1695: 1473: 550: 470: 393: 311: 141: 103: 98: 58: 53: 1288:
Create the basic classes of stochastic models applied by listing the deformations of the patterns.
2110: 2082: 1681: 1264: 1260: 497: 346: 246: 73: 1642: 1690: 1271:
In addition to the new algebraic vocabulary, its statistical approach was novel in its aim to:
1087: 677: 653: 555: 316: 291: 251: 63: 1023:
There is a wide variety of methods for grammatical inference. Two of the classic sources are
2064: 1753:
Proceedings of the 25th annual ACM symposium on User interface software and technology. 2012.
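To make the definitions concrete, here is a small brute-force sketch of my own (Angluin's automaton-based algorithm is far more efficient and is not reproduced here): matches() decides membership in a one-variable pattern language, and the final lines pick, among a few hand-written candidate patterns, the longest one consistent with the sample as a rough stand-in for the descriptive (minimal-language) pattern.

    def matches(pattern, word):
        """True if `word` is a nonempty ground instance of a one-variable pattern.
        The pattern is a string over lowercase constants and the variable 'x'."""
        k = pattern.count("x")
        constants = len(pattern) - k
        if k == 0:
            return pattern == word
        extra = len(word) - constants
        if extra < k or extra % k != 0:
            return False                    # every occurrence of x gets the same
        xlen = extra // k                   # nonempty substitution, hence same length
        pos, value = 0, None
        for symbol in pattern:
            if symbol == "x":
                piece = word[pos:pos + xlen]
                if value is None:
                    value = piece
                elif piece != value:        # consistent replacement required
                    return False
                pos += xlen
            else:
                if pos >= len(word) or word[pos] != symbol:
                    return False
                pos += 1
        return pos == len(word)

    samples = ["abab", "cdcd", "abcabc"]
    candidates = ["x", "xx", "abx", "xax"]   # hypothetical pattern hypotheses
    consistent = [p for p in candidates if all(matches(p, w) for w in samples)]
    # Crude proxy for descriptiveness: prefer the most specific (longest) pattern.
    print(max(consistent, key=len))          # -> 'xx'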
1537: 1527: 1522: 1352: 1196: 1095: 1070:
approach can be characterized as "hypothesis testing" and bears some similarity to Mitchel's
995: 631: 453: 405: 261: 176: 48: 2034: 1911: 1379: 1159: 1123: 975: 560: 510: 1660:
Unsupervised induction of stochastic context-free grammars using distributional clustering
1143:) in the rule set. Ultimately, the root node might correspond to a sentence non-terminal. 8: 1901:." Proceedings of the MT Summit VIII Workshop on Example-Based Machine Translation. 2001. 1203:
and have been proven to be correct and efficient for large subclasses of these grammars.
1103: 663: 599: 570: 475: 301: 234: 220: 206: 181: 131: 83: 43: 2147: 1993: 1708: 1481: 1455: 1435: 1371: 1360: 1341: 1324: 1183: 1172: 641: 565: 351: 146: 1645:." Proceedings of the conference on empirical methods in natural language processing. 1810: 1793: 1704: 1499: 1328: 1036: 734: 577: 490: 286: 256: 201: 151: 93: 1871: 1712: 1615: 1263:
to describe knowledge of the world as patterns. It differs from other approaches to
2142: 1997: 1985: 1959: 1805: 1700: 1375: 1308: 1152: 957: 762: 515: 465: 375: 359: 329: 191: 186: 136: 126: 24: 1826: 1276: 1132: 790: 594: 460: 400: 2068: 1252: 1091: 961: 810: 341: 78: 2121: 2021: 1847: 1824: 1256: 1071: 729: 658: 540: 271: 156: 1989: 1659: 1885: 1140: 1136: 535: 29: 1835:
Proc. 8th International Workshop on Algorithmic Learning Theory — ALT'97
1303:
The principle of grammar induction has been applied to other aspects of
1734:
A Survey of Grammatical Inference Methods for Natural Language Learning
684: 380: 306: 1963: 1291:
Synthesize (sample) from the models, not just analyze signals with it.
Applications

The principle of grammar induction has been applied to other aspects of natural language processing, and has been applied (among many other problems) to semantic parsing, natural language understanding, example-based translation, language acquisition, grammar-based compression, and anomaly detection.

Compression algorithms

This section is an excerpt from Grammar-based code.

Grammar-based codes or grammar-based compression are compression algorithms based on the idea of constructing a context-free grammar (CFG) for the string to be compressed. Examples include universal lossless data compression algorithms. To compress a data sequence x = x_1 ... x_n, a grammar-based code transforms x into a context-free grammar G. The problem of finding a smallest grammar for an input sequence (the smallest grammar problem) is known to be NP-hard, so many grammar-transform algorithms have been proposed from theoretical and practical viewpoints. Generally, the produced grammar G is further compressed by statistical encoders like arithmetic coding.

[Figure: Straight-line grammar (with start symbol ß) for the second sentence of the United States Declaration of Independence; each blue character denotes a nonterminal symbol.]
See also

- Artificial grammar learning § Artificial intelligence
- Example-based machine translation
- Inductive programming
- Kolmogorov complexity
- Language identification in the limit
- Straight-line grammar
- Syntactic pattern recognition
Notes

- The language of a pattern with at least two occurrences of the same variable is not regular, due to the pumping lemma.
- In a one-variable pattern, the variable x may occur several times, but no other variable may occur.
References

- Angluin, Dana (1980). "Finding Patterns Common to a Set of Strings". Journal of Computer and System Sciences 21: 46–62.
- Angluin, Dana (1987). "Learning Regular Sets from Queries and Counter-Examples". Information and Control 75 (2): 87–106.
- Arimura, Hiroki; Shinohara, Takeshi; Otsuki, Setsuko (1994). "Finding Minimal Generalizations for Unions of Pattern Languages and Its Application to Inductive Inference from Positive Data". Proc. STACS 11. LNCS 775. Springer. pp. 649–660.
- Brown, Ralf D. (2001). "Transfer-rule induction for example-based translation". Proceedings of the MT Summit VIII Workshop on Example-Based Machine Translation.
- Charikar, M.; Lehman, E.; Liu, D.; Panigrahy, R.; Prabhakaran, M.; Sahai, A.; Shelat, A. (2005). "The Smallest Grammar Problem". IEEE Transactions on Information Theory 51 (7): 2554–2576.
- Chater, Nick; Manning, Christopher D. (2006). "Probabilistic models of language processing and acquisition". Trends in Cognitive Sciences 10 (7): 335–344.
- Cherniavsky, Neva; Ladner, Richard (2004). "Grammar-based compression of DNA sequences". DIMACS Working Group on The Burrows–Wheeler Transform.
- Clark, Alexander (2001). "Unsupervised induction of stochastic context-free grammars using distributional clustering". Proceedings of the 2001 Workshop on Computational Natural Language Learning. Association for Computational Linguistics.
- de la Higuera, Colin (2010). Grammatical Inference: Learning Automata and Grammars. Cambridge: Cambridge University Press.
- Duda, Richard O.; Hart, Peter E.; Stork, David G. (2001). Pattern Classification (2nd ed.). New York: John Wiley & Sons.
- D'Ulizia, A.; Ferri, F.; Grifoni, P. (2011). "A Survey of Grammatical Inference Methods for Natural Language Learning". Artificial Intelligence Review 36 (1): 1–27.
- Erlebach, T.; Rossmanith, P.; Stadtherr, H.; Steger, A.; Zeugmann, T. (1997). "Learning One-Variable Pattern Languages Very Efficiently on Average, in Parallel, and by Asking Queries". In M. Li; A. Maruoka (eds.). Proc. 8th International Workshop on Algorithmic Learning Theory (ALT '97). LNAI 1316. Springer. pp. 260–276.
- Fu, King Sun (1977). Syntactic Pattern Recognition, Applications. Berlin: Springer-Verlag.
- Fu, King Sun (1982). Syntactic Pattern Recognition and Applications. Englewood Cliffs, NJ: Prentice-Hall.
- Gold, E. Mark (1967). "Language Identification in the Limit". Information and Control 10: 447–474.
- Grenander, Ulf; Miller, Michael I. (2007). Pattern Theory: From Representation to Inference. Vol. 1. Oxford: Oxford University Press.
- Horning, James Jay (1969). A Study of Grammatical Inference (Ph.D. thesis). Stanford University Computer Science Department.
- Kieffer, J. C.; Yang, E.-H. (2000). "Grammar-based codes: A new class of universal lossless source codes". IEEE Transactions on Information Theory 46 (3): 737–754.
- Kim, Yoon; Dyer, Chris; Rush, Alexander M. (2019). "Compound probabilistic context-free grammars for grammar induction". arXiv:1906.10225.
- Kwiatkowski, Tom; et al. (2011). "Lexical generalization in CCG grammar induction for semantic parsing". Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics.
- Miller, Scott; et al. (1994). "Hidden understanding models of natural language". Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics.
- Senin, Pavel; et al. (2015). "Time series anomaly discovery with grammar-based compression". EDBT 2015.
- Talton, Jerry; et al. (2012). "Learning design patterns with Bayesian grammar induction". Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology.