Apache Nutch

Open source web crawler

Original author(s): Doug Cutting, Mike Cafarella
Developer(s): Apache Software Foundation
Stable release (1.x): 1.20 / 24 April 2024
Stable release (2.x): 2.4 / 11 October 2019
Repository: Nutch GitHub repository
Written in: Java
Operating system: Cross-platform
Type: Web crawler
License: Apache License 2.0
Website: nutch.apache.org

Apache Nutch is a highly extensible and scalable open source web crawler software project.

Features

Nutch is coded entirely in the Java programming language, but data is written in language-independent formats. It has a highly modular architecture, allowing developers to create plug-ins for media-type parsing, data retrieval, querying and clustering.

The fetcher ("robot" or "web crawler") has been written from scratch specifically for this project.

History

Nutch originated with Doug Cutting, creator of both Lucene and Hadoop, and Mike Cafarella.

In June 2003, a successful 100-million-page demonstration system was developed. To meet the multi-machine processing needs of the crawl and index tasks, the Nutch project also implemented a MapReduce facility and a distributed file system. The two facilities have been spun out into their own subproject, called Hadoop.

In January 2005, Nutch joined the Apache Incubator, from which it graduated to become a subproject of Lucene in June of that same year. Since April 2010, Nutch has been considered an independent, top-level project of the Apache Software Foundation.

In February 2014 the Common Crawl project adopted Nutch for its open, large-scale web crawl.

While it was once a goal for the Nutch project to release a global large-scale web search engine, that is no longer the case.

Release history

| 1.x Branch | 2.x Branch | Release date | Description |
| 1.1 | | 2010-06-06 | This release includes several major upgrades of existing libraries (Hadoop, Solr, Tika, etc.) on which Nutch depends. Various bug fixes and speedups (e.g., to Fetcher2) have also been included. |
| 1.2 | | 2010-10-24 | This release includes several improvements (addition of parse-html as a selectable parser again, configurable per-field indexing), new features (including adding timing information to all Tool classes, and implementation of parser timeouts), and bug fixes (fixing an NPE in distributed search, fixing of XML formatting issues per Document fields). |
| 1.3 | | 2011-06-07 | This release includes several improvements (improved RSS parsing support, tighter integration with Apache Tika, external parsing support, improved language identification and an order of magnitude smaller source release tarball, only about 2 MB). |
| 1.4 | | 2011-11-26 | This release includes several improvements, including allowing Parsers to declare support for multiple MIME types, configurable Fetcher Queue depth, Fetcher speed improvements, tighter Tika integration, and support for HTTP auth in Solr indexing. |
| 1.5 | | 2012-06-07 | This release includes several improvements, among them upgrades of several major components (Tika 1.1 and Hadoop 1.0.0), improvements to LinkRank and WebGraph elements, and a number of new plugins covering blacklisting, filtering and parsing. |
| | 2.0 | 2012-07-07 | This release offers users an edition focused on large scale crawling which builds on storage abstraction (via Apache Gora) for big data stores such as Apache Accumulo, Apache Avro, Apache Cassandra, Apache HBase, HDFS, an in-memory data store and various high-profile SQL stores. |
| 1.5.1 | | 2012-07-10 | This release is a maintenance release of the popular 1.5.X mainstream version of Nutch which has been widely adopted within the community. |
| | 2.1 | 2012-10-05 | This release continues to provide Nutch users with a simplified Nutch distribution building on the 2.x development drive, which is growing in popularity amongst the community. As well as addressing ~20 bugs, this release also offers improved properties for better Solr configuration, upgrades to various Gora dependencies and the introduction of the option to build indexes in Elasticsearch. |
| 1.6 | | 2012-12-06 | This release includes over 20 bug fixes, as many improvements, and new functionality including a new HostNormalizer, the ability to dynamically set fetchInterval by MIME-type and functional enhancements to the Indexer API, including the normalization of URLs and the deletion of robots noIndex documents. Other notable improvements include the upgrade of key dependencies to Tika 1.2 and Automaton 1.11-8. |
| | 2.2 | 2013-06-08 | This release includes over 30 bug fixes and over 25 improvements, representing the third release of the increasingly popular 2.x Nutch series. This release features inclusion of Crawler-Commons, which Nutch now utilizes for improved robots.txt parsing, and library upgrades to Apache Hadoop 1.1.1, Apache Gora 0.3, Apache Tika 1.2 and Automaton 1.11-8. |
| 1.7 | | 2013-06-24 | This release includes over 20 bug fixes and as many improvements, most noticeably featuring a new pluggable indexing architecture which currently supports Apache Solr and Elasticsearch. Shadowing the recent Nutch 2.2 release, parsing of robots.txt is now delegated to Crawler-Commons. Key library upgrades have been made to Apache Hadoop 1.2.0 and Apache Tika 1.3. |
| | 2.2.1 | 2013-07-02 | This release includes library upgrades to Apache Hadoop 1.2.0 and Apache Tika 1.3; it is predominantly a bug fix for NUTCH-1591 (Incorrect conversion of ByteBuffer to String). |
| 1.8 | | 2014-03-17 | Although this release includes library upgrades to Crawler Commons 0.3 and Apache Tika 1.5, it also provides over 30 bug fixes as well as 18 improvements. |
| | 2.3 | 2015-01-22 | The Nutch 2.3 release now comes packaged with a self-contained Apache Wicket-based Web Application. The SQL backend for Gora has been deprecated. |
| 1.10 | | 2015-05-06 | This release includes library upgrades to Tika 1.6 and also provides over 46 bug fixes as well as 37 improvements and 12 new features. |
| 1.11 | | 2015-12-07 | This release includes library upgrades to Hadoop 2.X and Tika 1.11, and also provides over 32 bug fixes as well as 35 improvements and 14 new features. |
| | 2.3.1 | 2016-01-21 | This bug fix release addresses around 40 issues. |
| 1.12 | | 2016-06-18 | |
| 1.13 | | 2017-04-02 | |
| 1.14 | | 2017-12-23 | |
| 1.15 | | 2018-08-09 | |
| 1.16 | | 2019-10-11 | |
| | 2.4 | 2019-10-11 | Expected to be the last release on the 2.X series, as "no committer is actively working on it". |
| 1.17 | | 2020-07-02 | |
| 1.18 | | 2021-01-24 | |
| 1.19 | | 2022-08-22 | |
| 1.20 | | 2024-04-09 | |

Scalability

IBM Research studied the performance of Nutch/Lucene as part of its Commercial Scale Out (CSO) project. Their findings were that a scale-out system, such as Nutch/Lucene, could achieve a performance level on a cluster of blades that was not achievable on any scale-up computer such as the POWER5.

The ClueWeb09 dataset (used in e.g. TREC) was gathered using Nutch, with an average speed of 755.31 documents per second.

Related projects

Hadoop – Java framework that supports distributed applications running on large clusters.

Search engines built with Nutch

Common Crawl – publicly available internet-wide crawls, started using Nutch in 2014.
Creative Commons Search – an implementation of Nutch, used in the period of 2004–2006.
DiscoverEd – Open educational resources search prototype developed by Creative Commons.
Krugle – uses Nutch to crawl web pages for code, archives and technically interesting content.
mozDex (inactive)
Wikia Search – launched 2008, closed down 2009.

Bibliography

Shoberg, J (October 26, 2006). Building Search Applications with Lucene and Nutch (1st ed.). Apress. p. 350. ISBN 978-1-59059-687-6. Archived from the original on December 2, 2009. Retrieved August 15, 2009.

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.