Feature engineering

Extracting features from raw data for machine learning

Feature engineering is a preprocessing step in supervised machine learning and statistical modeling which transforms raw data into a more effective set of inputs. Each input comprises several attributes, known as features. By providing models with relevant information, feature engineering significantly enhances their predictive accuracy and decision-making capability.

Beyond machine learning, the principles of feature engineering are applied in various scientific fields, including physics. For example, physicists construct dimensionless numbers such as the Reynolds number in fluid dynamics, the Nusselt number in heat transfer, and the Archimedes number in sedimentation. They also develop first approximations of solutions, such as analytical solutions for the strength of materials in mechanics.
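As a concrete illustration of such a transformation, here is a minimal pandas sketch that derives more informative inputs from raw transaction records (the column names and data are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one row per transaction.
raw = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-05 09:12", "2024-01-06 17:40", "2024-01-07 11:03"]),
    "price": [120.0, 80.0, 200.0],
    "quantity": [2, 5, 1],
})

# Engineered features: attributes a model can exploit more directly.
features = pd.DataFrame({
    "total_value": raw["price"] * raw["quantity"],     # interaction feature
    "log_price": np.log(raw["price"]),                 # variance-stabilizing transform
    "hour_of_day": raw["timestamp"].dt.hour,           # temporal feature
    "is_weekend": raw["timestamp"].dt.dayofweek >= 5,  # boolean feature
})
print(features)
```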
Clustering

One of the applications of feature engineering has been clustering of feature-objects or sample-objects in a dataset. In particular, feature engineering based on matrix and tensor decompositions has been extensively used for data clustering under non-negativity constraints on the feature coefficients. These include Non-Negative Matrix Factorization (NMF), Non-Negative Matrix-Tri Factorization (NMTF), and Non-Negative Tensor Decomposition/Factorization (NTF/NTD). The non-negativity constraints on the coefficients of the feature vectors mined by these algorithms yield a part-based representation, and the different factor matrices exhibit natural clustering properties. Several extensions of these feature engineering methods have been reported in the literature, including orthogonality-constrained factorization for hard clustering and manifold learning to overcome inherent issues with these algorithms.

Another class of feature engineering algorithms leverages a common hidden structure across multiple inter-related datasets to obtain a consensus (common) clustering scheme. An example is the Multi-view Classification based on Consensus Matrix Decomposition (MCMD) algorithm, which mines a common clustering scheme across multiple datasets. The algorithm is designed to output two types of class labels (scale-variant and scale-invariant clustering), is computationally robust to missing information, can detect shape- and scale-based outliers, and can handle high-dimensional data effectively. Coupled matrix and tensor decompositions are popularly used in multi-view feature engineering.
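For instance, a minimal scikit-learn sketch of NMF-based clustering (the toy data is made up; the cluster labels are read off the factor matrix W):

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy non-negative data: rows are samples, columns are features.
X = np.random.RandomState(0).rand(6, 4)

# Factorize X ~ W @ H under non-negativity constraints.
model = NMF(n_components=2, init="nndsvd", random_state=0)
W = model.fit_transform(X)  # per-sample coefficients (6 x 2)
H = model.components_       # part-based basis vectors (2 x 4)

# The factor matrix W exhibits natural clustering properties:
# assign each sample to the component with the largest coefficient.
labels = W.argmax(axis=1)
print(labels)
```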
Feature stores

The feature store is where the features are stored and organized for the explicit purpose of being used either to train models (by data scientists) or to make predictions (by applications that have a trained model). It is a central location where you can either create or update groups of features created from multiple different data sources, or create and update new datasets from those feature groups for training models or for use in applications that do not want to compute the features but simply retrieve them when needed to make predictions.

A feature store includes the ability to store the code used to generate features, apply that code to raw data, and serve those features to models upon request. Useful capabilities include feature versioning and policies governing the circumstances under which features can be used.

Feature stores can be standalone software tools or built into machine learning platforms.
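A minimal sketch of the idea — a hypothetical in-memory store, not the API of any particular product — in which feature-generating code is registered with a version and materialized features are served on request:

```python
from dataclasses import dataclass, field
from typing import Callable
import pandas as pd

@dataclass
class FeatureStore:
    """Hypothetical in-memory feature store: keeps the code used to
    generate features and serves materialized features on request."""
    transforms: dict = field(default_factory=dict)  # name -> (version, function)
    cache: dict = field(default_factory=dict)       # name -> materialized values

    def register(self, name: str, fn: Callable[[pd.DataFrame], pd.Series], version: int = 1):
        self.transforms[name] = (version, fn)       # feature versioning

    def materialize(self, name: str, raw: pd.DataFrame):
        version, fn = self.transforms[name]
        self.cache[name] = fn(raw)                  # apply the code to raw data

    def serve(self, name: str) -> pd.Series:
        return self.cache[name]                     # retrieval at prediction time

store = FeatureStore()
store.register("total_value", lambda df: df["price"] * df["quantity"])
```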
Predictive modelling

Feature engineering in machine learning and statistical modeling involves selecting, creating, transforming, and extracting data features. Key components include feature creation from existing data, transforming and imputing missing or invalid features, reducing data dimensionality through methods like Principal Components Analysis (PCA), Independent Component Analysis (ICA), and Linear Discriminant Analysis (LDA), and selecting the most relevant features for model training based on importance scores and correlation matrices.

Features vary in significance. Even relatively insignificant features may contribute to a model. Feature selection can reduce the number of features to prevent a model from becoming too specific to the training data set (overfitting).
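A minimal scikit-learn sketch chaining these components — imputation of missing values, importance-based selection, and dimensionality reduction — in front of a model (the toy data is made up):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

X = np.random.RandomState(0).randn(100, 10)
X[::7, 3] = np.nan                      # some missing values to impute
y = (X[:, 0] > 0).astype(int)

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill missing/invalid features
    ("select", SelectKBest(f_classif, k=5)),       # keep the most relevant features
    ("reduce", PCA(n_components=3)),               # reduce dimensionality
    ("model", LogisticRegression()),
])
pipe.fit(X, y)
```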
Feature explosion occurs when the number of identified features is too large for effective model estimation or optimization. Common causes include:

Feature templates - implementing feature templates instead of coding new features

Feature combinations - combinations that cannot be represented by a linear system

Feature explosion can be limited via techniques such as regularization, kernel methods, and feature selection.
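For instance, a minimal scikit-learn sketch in which a polynomial expansion stands in for a feature-template mechanism and L1 regularization (lasso) prunes the exploded feature set:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
X = rng.randn(200, 10)
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.randn(200) * 0.1

# Feature templates/combinations blow up 10 inputs into 65 features.
expanded = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
print(expanded.shape)  # (200, 65)

# L1 regularization drives most coefficients to exactly zero,
# implicitly selecting a small subset of the exploded features.
model = Lasso(alpha=0.1).fit(expanded, y)
print((model.coef_ != 0).sum(), "features survive")
```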
Automation

Automation of feature engineering is a research topic that dates back to the 1990s. Machine learning software that incorporates automated feature engineering has been commercially available since 2016. The related academic literature can be roughly separated into two types: multi-relational decision tree learning (MRDTL), which uses a supervised algorithm similar to a decision tree, and deep feature synthesis, which uses simpler methods.

Multi-relational decision tree learning (MRDTL)

Multi-relational Decision Tree Learning (MRDTL) extends traditional decision tree methods to relational databases, handling complex data relationships across tables. It innovatively uses selection graphs as decision nodes, refined systematically until a specific termination criterion is reached. Most MRDTL studies base their implementations on relational databases, which results in many redundant operations. These redundancies can be reduced by using techniques such as tuple ID propagation.

Deep feature synthesis

The deep feature synthesis (DFS) algorithm beat 615 of 906 human teams in a competition.
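DFS is implemented in the open-source featuretools library listed below. A minimal sketch, assuming the featuretools 1.x API (the transactions data is made up):

```python
import pandas as pd
import featuretools as ft

transactions = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4],
    "customer_id": [1, 1, 2, 2],
    "amount": [100.0, 20.0, 50.0, 80.0],
    "time": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]),
})

es = ft.EntitySet(id="retail")
es = es.add_dataframe(dataframe_name="transactions", dataframe=transactions,
                      index="transaction_id", time_index="time")
# Derive a "customers" dataframe related to transactions by customer_id.
es = es.normalize_dataframe(base_dataframe_name="transactions",
                            new_dataframe_name="customers", index="customer_id")

# Deep feature synthesis: stack aggregation primitives across the relationship.
feature_matrix, feature_defs = ft.dfs(entityset=es,
                                      target_dataframe_name="customers",
                                      agg_primitives=["mean", "sum", "count"])
print(feature_matrix.columns.tolist())
```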
Open-source implementations

There are a number of open-source libraries and tools that automate feature engineering on relational data and time series:

featuretools is a Python library for transforming time series and relational data into feature matrices for machine learning.

MCMD is an open-source feature engineering algorithm for joint clustering of multiple datasets.

OneBM or One-Button Machine combines feature transformations and feature selection on relational data with feature selection techniques. It helps data scientists reduce data exploration time, allowing them to try out many ideas in a short time. On the other hand, it enables non-experts, who are not familiar with data science, to quickly extract value from their data with little effort, time, and cost.

getML community is an open source tool for automated feature engineering on time series and relational data. It is implemented in C/C++ with a Python interface. It has been shown to be at least 60 times faster than tsflex, tsfresh, tsfel, featuretools or kats.

tsfresh is a Python library for feature extraction on time series data. It evaluates the quality of the features using hypothesis testing.

tsflex is an open source Python library for extracting features from time series data. Despite being 100% written in Python, it has been shown to be faster and more memory efficient than tsfresh, seglearn or tsfel.

seglearn is an extension for multivariate, sequential time series data to the scikit-learn Python library.

tsfel is a Python package for feature extraction on time series data.

kats is a Python toolkit for analyzing time series data.
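As an example of these tools in use, a minimal sketch with tsfresh's documented entry points — extract_features plus hypothesis-test-based selection via select_features (the toy series is made up, so few or no features may pass the tests):

```python
import numpy as np
import pandas as pd
from tsfresh import extract_features, select_features
from tsfresh.utilities.dataframe_functions import impute

# Long-format time series: an id per series, a sort column, and values.
df = pd.DataFrame({
    "id": np.repeat([1, 2, 3, 4], 50),
    "time": np.tile(np.arange(50), 4),
    "value": np.random.RandomState(0).randn(200),
})
y = pd.Series([0, 1, 0, 1], index=[1, 2, 3, 4])  # one label per series id

X = extract_features(df, column_id="id", column_sort="time")
impute(X)                            # replace NaN/inf some calculators produce
X_selected = select_features(X, y)   # keep features passing hypothesis tests
```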
857: 638: 1516:"Learning the parts of objects by non-negative matrix factorization" 2759:
Boehmke B, Greenwell B (2019). "Feature & Target Engineering".
2337: 2133: 2112: 2708: 2321:"tsflex: Flexible time series processing & feature extraction" 1612: 1539: 1152: 2148: 633: 1365:
Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome H. (2009).
2430: 2297: 2228: 2201: 2174: 2013: 1987: 1247:
is a Python package for feature extraction on time series data.
384: 1894:
Proceedings. 20th International Conference on Data Engineering
2057:
Sharma, Shubham; Nayak, Richi; Bhaskar, Ashish (2024-05-01).
1627:
Sharma, Shubham; Nayak, Richi; Bhaskar, Ashish (2024-05-01).
1392:
Sharma, Shubham; Nayak, Richi; Bhaskar, Ashish (2024-05-01).
1216: 628: 623: 350: 2797:
Zumel N, Mount (2020). "Data Engineering and Data Shaping".
1833: 1569:
Wang, Hua; Nie, Feiping; Huang, Heng; Ding, Chris (2011).
2126: 2105: 1443:
Understanding Machine Learning: From Theory to Algorithms
1113:
Feature explosion can be limited via techniques such as:
1364: 916:
List of datasets in computer vision and image processing
1440: 1575:
2011 IEEE 11th International Conference on Data Mining
27:
Extracting features from raw data for machine learning
2063:
Transportation Research Part C: Emerging Technologies
1633:
Transportation Research Part C: Emerging Technologies
1398:
Transportation Research Part C: Emerging Technologies
Alternatives

Feature engineering can be a time-consuming and error-prone process, as it requires domain expertise and often involves trial and error. Deep learning algorithms may be used to process a large raw dataset without having to resort to feature engineering. However, deep learning algorithms still require careful preprocessing and cleaning of the input data. In addition, choosing the right architecture, hyperparameters, and optimization algorithm for a deep neural network can be a challenging and iterative process.
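For contrast with the engineered-feature pipelines above, a minimal scikit-learn sketch in which a neural network consumes standardized raw inputs directly; the hidden-layer sizes and iteration budget are exactly the kind of hyperparameters the text warns must be tuned iteratively:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

rng = np.random.RandomState(0)
X_raw = rng.randn(300, 20)                        # raw inputs, no hand-made features
y = (X_raw[:, 0] * X_raw[:, 1] > 0).astype(int)   # nonlinear target

# Preprocessing (scaling) is still required even without feature engineering.
net = Pipeline([
    ("scale", StandardScaler()),
    ("mlp", MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)),
])
net.fit(X_raw, y)
```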
See also

Covariate
Data transformation
Feature extraction
Feature learning
Hashing trick
Instrumental variables estimation
Kernel method
List of datasets for machine learning research
Scale co-occurrence matrix
Space mapping

Feature extraction
Machine learning
data mining
Supervised learning
Unsupervised learning
Semi-supervised learning
Self-supervised learning
Reinforcement learning
Meta-learning
Online learning
Batch learning
Curriculum learning
Rule-based learning
Neuro-symbolic AI
Neuromorphic engineering
Quantum machine learning
Classification
Generative modeling
Regression
Clustering
Dimensionality reduction
Density estimation
Anomaly detection
Data cleaning
AutoML
Association rules
Semantic analysis
Structured prediction
Feature engineering
Feature learning

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.

↑