Bootstrap aggregating

Bootstrap aggregating, also called bagging (from bootstrap aggregating), is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. It also reduces variance and helps to avoid overfitting. Although it is usually applied to decision tree methods, it can be used with any type of method; bagging is a special case of the model averaging approach.

As an integral component of random forests, bootstrap aggregating is very important to classification algorithms, and it provides a critical element of variability that allows for increased accuracy when analyzing new data, as discussed below.

In a random forest, each tree only knows about the data pertaining to a small, constant number of features and a variable number of samples that is less than or equal to the size of the original dataset. Consequently, the trees are more likely to return a wider array of answers, derived from more diverse knowledge. The resulting forest possesses numerous benefits over a single decision tree generated without randomness. Each tree "votes" on whether or not to classify a sample as positive based on its features, and the sample is then classified by majority vote. An example is given in the diagram below, where the four trees in a random forest vote on whether a patient with mutations A, B, F, and G has cancer. Since three of the four trees vote yes, the patient is classified as cancer positive.

There are several important factors to consider when designing a random forest. If the trees in the forest are too deep, overfitting can still occur due to over-specificity. If the forest is too large, the algorithm may become less efficient due to increased runtime. Random forests also do not generally perform well when given sparse data with little variability.

Random forests are more complex to implement than lone decision trees or other algorithms, because they take extra steps for bagging and require recursion to produce an entire forest. Because of this, they demand much more computational power and resources.

A single decision tree is much easier to interpret than a random forest: it can be walked by hand by a human, giving the analyst a somewhat "explainable" understanding of what the tree is actually doing. As the number of trees, and the schemes for ensembling those trees into predictions, grow, this kind of review becomes much more difficult, if not impossible.

Random forests work well with non-linear data. Most tree-based algorithms use linear splits, so an ensemble of trees handles data with nonlinear properties (i.e., most real-world distributions) better than a single tree does. This is a significant advantage, because other data mining techniques, such as single decision trees, do not handle nonlinear data as well.

This process is repeated recursively for successive levels of the tree until the desired depth is reached. At the very bottom of the tree, samples that test positive for the final feature are generally classified as positive, while those that lack the feature are classified as negative. These trees are then used as predictors to classify new data.

Each bootstrap sample is composed of a random subset of the original data and maintains a semblance of the master set's distribution and variability. For each bootstrap sample, a LOESS smoother was fit, and predictions from these 100 smoothers were then made across the range of the data.

The next part of the algorithm introduces yet another element of variability amongst the bootstrapped trees: in addition to each tree examining only a bootstrapped set of samples, only a small but consistent number of unique features is considered when ranking them as classifiers.

Creating the bootstrap and out-of-bag datasets is crucial, since they are used to test the accuracy of the random forest algorithm. For example, a model that produces 50 trees using the bootstrap/out-of-bag datasets will have better accuracy than one that produces only 10 trees.

To recreate specific results, you need to keep track of the exact random seed used to generate the bootstrap sets. This may be important when collecting data for research or within a data mining class. Using random seeds is essential to random forests, but it can make it hard to support your statements based on the forests if there is a failure to record the seeds.

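As a minimal sketch of this point (assuming Python's standard library random module), fixing the seed before drawing makes the bootstrap samples, and therefore the forest built from them, reproducible:

    import random

    random.seed(42)  # record this value; the same seed reproduces the same draws

    data = list(range(10))  # placeholder dataset
    bootstrap = [random.choice(data) for _ in range(len(data))]
    print(bootstrap)        # identical output on every run with the same seed
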
Breiman developed the concept of bagging in 1994 to improve classification by combining the classifications of randomly generated training sets. He argued, "If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy".

Random forests require much more time to train than single decision trees. A large forest can quickly slow a program down, because much more data has to be traversed, even though each tree uses a smaller set of samples and features.

The algorithm may change significantly if there is a slight change to the data being bootstrapped and used within the forests. In other words, random forests are heavily dependent on their data sets, and changing the data can drastically change the structures of the individual trees.

The diagram below shows a decision tree of depth two being used to classify data. For example, a data point that exhibits Feature 1 but not Feature 2 is given a "No", while a point that does not exhibit Feature 1 but does exhibit Feature 3 is given a "Yes".

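The same depth-two rule can be written as a pair of nested conditionals. This is only a sketch of the tree just described; the feature names are placeholders, and the outcomes of the two branches not mentioned above are assumed:

    def classify(has_feature_1: bool, has_feature_2: bool, has_feature_3: bool) -> str:
        # Depth-two tree: split on Feature 1 first, then on Feature 2 or Feature 3.
        if has_feature_1:
            return "Yes" if has_feature_2 else "No"   # "Yes" branch here is assumed
        return "Yes" if has_feature_3 else "No"

    print(classify(True, False, False))   # "No"  (Feature 1 without Feature 2)
    print(classify(False, False, True))   # "Yes" (Feature 3 without Feature 1)
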
While the techniques described above utilize random forests and bagging (otherwise known as bootstrapping), there are certain techniques that can be used to improve their execution and voting time, their prediction accuracy, and their overall performance. The following are key steps in creating an efficient random forest:

Decide on accuracy or speed: depending on the desired results, increasing or decreasing the number of trees within the forest can help. Increasing the number of trees generally provides more accurate results, while decreasing the number of trees provides quicker results.

The next step of the algorithm involves generating decision trees from the bootstrapped dataset. To achieve this, the process examines each gene/feature and determines for how many samples the feature's presence or absence yields a positive or negative result. This information is then used to compute a confusion matrix for the feature.

By taking the average of the 100 smoothers, each corresponding to a subset of the original data set, we arrive at one bagged predictor (the red line in the figure). The red line's flow is stable and does not overly conform to any particular data point(s).

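A minimal sketch of this bagging procedure, using NumPy and a simple polynomial fit as a stand-in for the LOESS smoother described above (the data here are synthetic placeholders, not the ozone data):

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic nonlinear data standing in for the temperature/ozone example.
    x = np.linspace(0, 10, 80)
    y = np.sin(x) + rng.normal(scale=0.4, size=x.size)

    grid = np.linspace(0, 10, 200)      # points at which every smoother predicts
    predictions = []
    for _ in range(100):                # 100 bootstrap samples, as in the example
        idx = rng.integers(0, x.size, x.size)         # draw indices with replacement
        coeffs = np.polyfit(x[idx], y[idx], deg=5)    # fit one "smoother" per sample
        predictions.append(np.polyval(coeffs, grid))  # predict across the data range

    bagged = np.mean(predictions, axis=0)  # averaging gives the bagged predictor
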
Specify the maximum depth of trees: instead of allowing the random forest to continue until all nodes are pure, cut it off at a certain depth to further decrease the chance of overfitting.

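The two knobs just discussed (the number of trees and the maximum depth) map directly onto constructor parameters in common implementations. A rough sketch with scikit-learn's RandomForestClassifier on placeholder data, assuming scikit-learn is available:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Placeholder data; any feature matrix X and label vector y would do.
    X, y = make_classification(n_samples=500, n_features=8, random_state=0)

    # More trees: usually more accurate but slower. max_depth caps tree depth.
    forest = RandomForestClassifier(n_estimators=100, max_depth=4, random_state=0)
    forest.fit(X, y)
    print(forest.score(X, y))
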
Random forests are particularly valuable where it is important to be able to predict future results based on past data. One of their applications is as a tool for predicting cancer based on genetic factors, as seen in the example above.

Since the algorithm generates multiple trees, and therefore multiple bootstrap datasets, the chance that an object is left out of every bootstrap dataset is low. The next few sections describe how the random forest algorithm works in more detail.

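Because a given tree never sees its out-of-bag samples, those samples can serve as a built-in test set, which is how the accuracy check described above is usually automated. A hedged sketch with scikit-learn, whose oob_score_ attribute reports an out-of-bag accuracy estimate (the data are placeholders):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=1000, n_features=10, random_state=1)

    # bootstrap=True is the default; oob_score=True requests the out-of-bag estimate.
    forest = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=1)
    forest.fit(X, y)
    print("Out-of-bag accuracy estimate:", forest.oob_score_)
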
Prune the dataset: an extremely large dataset may produce results that are less indicative of the data provided than a smaller set that more accurately represents what is being focused on.

Bagging does not predict beyond the range of the training data. This is a drawback because, while bagging is often effective, not all of the data is being considered, and therefore it cannot predict an entire dataset.

Random forests are primarily useful for classification as opposed to regression, which attempts to draw observed connections between statistical variables in a dataset. This makes random forests particularly useful in such fields as banking, healthcare, the stock market, and e-commerce.

Easy data preparation: data is prepared by creating a bootstrap set and a certain number of decision trees to build a random forest that also utilizes feature selection, as mentioned in the Random Forests section.

The confusion matrix lists the true positives, false positives, true negatives, and false negatives of the feature when used as a classifier. The features are then ranked according to various classification metrics based on their confusion matrices. Some common metrics include the estimate of positive correctness (calculated by subtracting false positives from true positives), the measure of "goodness", and information gain. These features are then used to partition the samples into two sets: those that possess the top feature and those that do not.

Keep in mind that since both datasets are sets, when taking the difference the duplicate names are ignored in the bootstrap dataset. The illustration below shows how the math is done:

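As a small illustration of the first metric (the counts below are made up), the estimate of positive correctness described above is simply true positives minus false positives:

    # Hypothetical confusion-matrix counts for one candidate feature.
    true_positives, false_positives = 40, 12
    true_negatives, false_negatives = 35, 13

    # "Estimate of positive correctness": TP - FP, as described above.
    positive_correctness = true_positives - false_positives
    print(positive_correctness)  # 28
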
For classification, the algorithm takes a training set D, an inducer I, and the number of bootstrap samples m as input, and produces a classifier C* as output:

    for i = 1 to m {
        D' = bootstrap sample from D   (sample with replacement)
        Ci = I(D')
    }
    C*(x) = argmax_{y in Y} #{i : Ci(x) = y}   (the label y predicted most often)

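A minimal Python sketch of this procedure, using scikit-learn decision trees as the inducer I (an assumption; any classifier with fit and predict methods would work):

    import numpy as np
    from collections import Counter
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=300, n_features=6, random_state=0)  # placeholder D

    m = 25                                    # number of bootstrap samples / classifiers
    classifiers = []
    for _ in range(m):
        idx = rng.integers(0, len(X), len(X))         # D' = bootstrap sample from D
        tree = DecisionTreeClassifier(random_state=0)
        classifiers.append(tree.fit(X[idx], y[idx]))  # Ci = I(D')

    def predict(x):
        # C*(x): the label predicted most often by the m sub-classifiers.
        votes = [clf.predict(x.reshape(1, -1))[0] for clf in classifiers]
        return Counter(votes).most_common(1)[0][0]

    print(predict(X[0]), y[0])
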
The black lines in the figure represent these initial predictions. The lines lack agreement in their predictions and tend to overfit their data points, as is evident from their wobbly flow.

Random forests still have numerous advantages over similar data classification algorithms such as neural networks: they are much easier to interpret and generally require less data for training.

Unlike the original dataset, however, the bootstrap dataset can contain duplicate objects. Here is a simple example, along with the illustration below, to demonstrate how it works:

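A quick sketch of drawing such a bootstrap sample in Python; the names match the example that follows, but since the draw is random, the duplicates will differ from run to run:

    import random

    original = ["Emily", "Jessie", "George", "Constantine", "Lexi", "Theodore",
                "John", "James", "Rachel", "Anthony", "Ellie", "Jamal"]

    # Same size as the original dataset, drawn with replacement, so duplicates can appear.
    bootstrap = random.choices(original, k=len(original))
    print(bootstrap)
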
The relationship between temperature and ozone appears to be nonlinear in this data set, based on the scatter plot. To mathematically describe this relationship, LOESS smoothers (with bandwidth 0.5) are used.

Given a standard training set D of size n, bagging generates m new training sets D_i, each of size n′, by sampling from D uniformly and with replacement. By sampling with replacement, some observations may be repeated in each D_i. If n′ = n, then for large n the set D_i is expected to have the fraction (1 − 1/e) (≈63.2%) of the unique examples of D, the rest being duplicates; this kind of sample is known as a bootstrap sample. (More generally, when drawing with replacement n′ values out of a set of n different and equally likely values, the expected number of unique draws is n(1 − e^(−n′/n)).) Sampling with replacement ensures each bootstrap sample is independent from its peers, as it does not depend on previously chosen samples. Then, m models are fitted using the above m bootstrap samples and combined by averaging the output (for regression) or voting (for classification).

The concept of bootstrap aggregating is derived from the concept of bootstrapping, which was developed by Bradley Efron. Bootstrap aggregating was proposed by Leo Breiman, who also coined the abbreviated term "bagging" (bootstrap aggregating).

The random forest classifier operates with high accuracy and speed. Random forests are much faster than decision trees because each tree uses a smaller data set.

The out-of-bag dataset can be calculated by taking the difference between the original and the bootstrap datasets. In this case, the remaining samples that were not selected are Emily, Jessie, George, Rachel, and Jamal.

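A sketch of that set difference in Python, using the names from this example:

    original = {"Emily", "Jessie", "George", "Constantine", "Lexi", "Theodore",
                "John", "James", "Rachel", "Anthony", "Ellie", "Jamal"}
    bootstrap = {"James", "Ellie", "Constantine", "Lexi", "John", "Theodore", "Anthony"}

    # Treating both as sets means the duplicates in the bootstrap sample are ignored.
    out_of_bag = original - bootstrap
    print(sorted(out_of_bag))  # ['Emily', 'George', 'Jamal', 'Jessie', 'Rachel']
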
Bagging was shown to improve preimage learning. On the other hand, it can mildly degrade the performance of stable methods such as k-nearest neighbors.

There is a lower risk of overfitting, and the method runs efficiently even on large data sets. This is the result of the random forest's use of bagging in conjunction with random feature selection.

Each section below explains how each dataset is made, except for the original dataset, which is simply whatever information is given.

Because of their properties, random forests are considered one of the most accurate data mining algorithms, are less likely to overfit their data, and run quickly and efficiently even on large datasets.

There are overall fewer requirements for normalization and scaling, which makes using random forests more convenient.

Many weak learners aggregated together typically outperform a single learner over the entire set, and the aggregate overfits less.

To illustrate the basic principles of bagging, below is an analysis of the relationship between ozone and temperature (data from Rousseeuw and Leroy (1986), analysis done in R). Rather than building a single smoother for the complete data set, 100 bootstrap samples were drawn.

Suppose the original dataset is a group of 12 people: Emily, Jessie, George, Constantine, Lexi, Theodore, John, James, Rachel, Anthony, Ellie, and Jamal. By randomly picking a group of names, say the bootstrap dataset contains James, Ellie, Constantine, Lexi, John, Constantine, Theodore, Constantine, Anthony, Lexi, Constantine, and Theodore. In this case, the bootstrap sample contained four duplicates of Constantine and two duplicates each of Lexi and Theodore.

There are three types of datasets in bootstrap aggregating: the original, bootstrap, and out-of-bag datasets.

The bootstrap dataset is made by randomly picking objects from the original dataset, and it must be the same size as the original dataset.

The out-of-bag dataset represents the remaining people who were not in the bootstrap dataset.

Continue pruning the data at each node split, rather than just in the original bagging process.

Bagging leads to "improvements for unstable procedures", which include, for example, artificial neural networks, classification and regression trees, and subset selection in linear regression.

Random forests deal well with missing data and with datasets that have many outliers. They handle this by using binning, or by grouping values together to avoid values that are terribly far apart.

Bagging can be performed in parallel, as each separate bootstrap sample can be processed on its own before aggregation.

For a weak learner with high bias, bagging will also carry high bias into its aggregate, and it leads to some loss of interpretability of the model.

[Figure: Flow chart of the bagging algorithm when used for classification]

[Figure: An illustration for the concept of bootstrap aggregating]

Bagging can be computationally expensive, depending on the data set.

