Apache Hadoop

The HDFS file system includes a so-called secondary namenode, a misleading term that some might incorrectly interpret as a backup namenode when the primary namenode goes offline. In fact, the secondary namenode regularly connects with the primary namenode and builds snapshots of the primary namenode's directory information, which the system then saves to local or remote directories. These checkpointed images can be used to restart a failed primary namenode without having to replay the entire journal of file-system actions; only the edit log recorded since the checkpoint needs to be applied to create an up-to-date directory structure.
Because the namenode is the single point for storage and management of metadata, it can become a bottleneck for supporting a huge number of files, especially a large number of small files. HDFS Federation, a new addition, aims to tackle this problem to a certain extent by allowing multiple namespaces served by separate namenodes. HDFS also has known issues with small files, scalability, a single point of failure (SPoF), and bottlenecks under heavy metadata request loads.
One advantage of using HDFS is data awareness between the job tracker and task tracker. The job tracker schedules map or reduce jobs to task trackers with an awareness of the data location. For example, if node A contains data (a, b, c) and node X contains data (x, y, z), the job tracker schedules node A to perform map or reduce tasks on (a, b, c) and node X to perform map or reduce tasks on (x, y, z). This reduces the amount of traffic that goes over the network and prevents unnecessary data transfer. When Hadoop is used with other file systems, this advantage is not always available. This can have a significant impact on job-completion times, as demonstrated with data-intensive jobs.
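The node A / node X example can be sketched in a few lines of Python. This is only an illustration of locality-aware placement, not the JobTracker's actual code: the function names `pick_tracker` and `schedule`, the dictionary layout, and the fallback to any node with a free slot are all assumptions made for the sketch.

```python
from typing import Dict, List, Optional, Set


def pick_tracker(block: str,
                 block_locations: Dict[str, Set[str]],
                 free_slots: Dict[str, int]) -> Optional[str]:
    """Return a node with a free slot, preferring nodes that hold the block."""
    holders = sorted(block_locations.get(block, set()))
    local = [n for n in holders if free_slots.get(n, 0) > 0]
    if local:
        return local[0]          # data-local: no network transfer needed
    remote = [n for n in sorted(free_slots) if free_slots[n] > 0]
    return remote[0] if remote else None  # fall back to any free node


def schedule(blocks: List[str],
             block_locations: Dict[str, Set[str]],
             free_slots: Dict[str, int]) -> Dict[str, Optional[str]]:
    """Assign each block to a node, consuming one slot per assignment."""
    slots = dict(free_slots)     # work on a copy; do not mutate the caller's dict
    plan: Dict[str, Optional[str]] = {}
    for block in blocks:
        node = pick_tracker(block, block_locations, slots)
        plan[block] = node
        if node is not None:
            slots[node] -= 1
    return plan


# Hypothetical cluster matching the example in the text.
locations = {"a": {"A"}, "b": {"A"}, "c": {"A"},
             "x": {"X"}, "y": {"X"}, "z": {"X"}}
plan = schedule(["a", "b", "c", "x", "y", "z"], locations, {"A": 3, "X": 3})
```

With three free slots on each node, every block is processed where it is stored. If node A had only two slots, block `c` would fall back to node X and its data would have to cross the network.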
HDFS is designed for portability across various hardware platforms and for compatibility with a variety of underlying operating systems. The HDFS design introduces portability limitations that result in some performance bottlenecks, since the Java implementation cannot use features that are exclusive to the platform on which HDFS is running. Due to its widespread integration into enterprise-level infrastructure, monitoring HDFS performance at scale has become an increasingly important issue. Monitoring end-to-end performance requires tracking metrics from datanodes, namenodes, and the underlying operating system. There are currently several monitoring platforms to track HDFS performance, including
In March 2006, Owen O'Malley was the first committer to add to the Hadoop project; Hadoop 0.1.0 was released in April 2006. It continues to evolve through contributions that are being made to the project. The first design document for the Hadoop Distributed File System was written by Dhruba Borthakur
nodes in the cluster, striving to keep the work as close to the data as possible. With a rack-aware file system, the JobTracker knows which node contains the data, and which other machines are nearby. If the work cannot be hosted on the actual node where the data resides, priority is given to nodes
storage on hosts (but to increase input-output (I/O) performance some RAID configurations are still useful). With the default replication value, 3, data is stored on three nodes: two on the same rack, and one on a different rack. Data nodes can talk to each other to rebalance data, to move copies
The top three are master services (daemons or nodes) and the bottom two are slave services. Master services can communicate with each other, and slave services can likewise communicate with each other. The Name Node is a master node and the Data Node is its corresponding slave node; the two can talk with each other.
and produced data that was used in every Yahoo! web search query. There are multiple Hadoop clusters at Yahoo! and no HDFS file systems or MapReduce jobs are split across multiple data centers. Every Hadoop cluster node bootstraps the Linux image, including the Hadoop distribution. Work that the
For effective scheduling of work, every Hadoop-compatible file system should provide location awareness, which is the name of the rack, specifically the network switch where a worker node is. Hadoop applications can use this information to execute code on the node where the data is, and, failing
The biggest difference between Hadoop 1 and Hadoop 2 is the addition of YARN (Yet Another Resource Negotiator), which replaced the MapReduce engine in the first version of Hadoop. YARN strives to allocate resources to various applications effectively. It runs two daemons, which take care of two
The secondary name node exists only to take care of checkpoints of the file system metadata held in the Name Node, so it is also known as the checkpoint node. It is the helper node for the Name Node. The secondary name node instructs the name node to create and send the fsimage and editlog files, upon which the
A Data Node stores data as blocks. It is also known as the slave node: it stores the actual data in HDFS and serves client read and write requests. Data nodes run as slave daemons. Every Data Node sends a heartbeat message to the Name Node every 3 seconds to convey that it is
In a larger cluster, HDFS nodes are managed through a dedicated NameNode server to host the file system index, and a secondary NameNode that can generate snapshots of the namenode's memory structures, thereby preventing file-system corruption and loss of data. Similarly, a standalone JobTracker
that, on the same rack/switch to reduce backbone traffic. HDFS uses this method when replicating data for data redundancy across multiple racks. This approach reduces the impact of a rack power outage or switch failure; if any of these hardware failures occurs, the data will remain available.
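Rack-aware replication can be illustrated with a small Python sketch. It follows the default described in this article (three replicas, two on one rack and one on another). The function name `place_replicas`, the `rack_of` mapping, and the specific choice of which rack receives two copies are assumptions for illustration; the actual HDFS placement policy differs in its details.

```python
from typing import Dict, List


def place_replicas(writer: str, rack_of: Dict[str, str]) -> List[str]:
    """Pick three nodes: two on the writer's rack and one on a different rack."""
    local_rack = rack_of[writer]
    same_rack = sorted(n for n, r in rack_of.items()
                       if r == local_rack and n != writer)
    other_rack = sorted(n for n, r in rack_of.items() if r != local_rack)
    if not same_rack or not other_rack:
        raise ValueError("need two nodes on the local rack and one on another rack")
    return [writer, same_rack[0], other_rack[0]]


def survives_rack_failure(replicas: List[str], rack_of: Dict[str, str],
                          failed_rack: str) -> bool:
    """True if at least one replica lives outside the failed rack."""
    return any(rack_of[n] != failed_rack for n in replicas)


# Hypothetical four-node, two-rack topology.
rack_of = {"n1": "rack1", "n2": "rack1", "n3": "rack2", "n4": "rack2"}
replicas = place_replicas("n1", rack_of)
racks = {rack_of[n] for n in replicas}
```

Because the replicas span two racks, losing either rack to a power outage or switch failure still leaves a readable copy.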
around, and to keep the replication of data high. HDFS is not fully POSIX-compliant, because the requirements for a POSIX file-system differ from the target goals of a Hadoop application. The trade-off of not having a fully POSIX-compliant file-system is increased performance for data
Chintapalli, Sanket; Dagit, Derek; Evans, Bobby; Farivar, Reza; Graves, Thomas; Holderbaugh, Mark; Liu, Zhuo; Nusbaum, Kyle; Patil, Kishorkumar; Peng, Boyang Jerry; Poulosky, Paul (May 2016). "Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming".
of all of the stored data within it. In particular, the name node contains the details of the number of blocks, locations of the data node that the data is stored in, where the replications are stored, and other details. The name node has direct contact with the client.
The core of Apache Hadoop consists of a storage part, known as Hadoop Distributed File System (HDFS), and a processing part which is a MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster. It then transfers
If one TaskTracker is very slow, it can delay the entire MapReduce job – especially towards the end, when everything can end up waiting for the slowest task. With speculative execution enabled, however, a single task can be executed on multiple slave
, which is still the common use. It has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.
(JVM) process to prevent the TaskTracker itself from failing if the running job crashes its JVM. A heartbeat is sent from the TaskTracker to the JobTracker every few minutes to check its status. The Job Tracker and TaskTracker status and information are exposed by
A number of third-party file system bridges have also been written, none of which are currently in Hadoop distributions. However, some commercial distributions of Hadoop ship with an alternative file system as the default – specifically IBM and
The Job Tracker receives requests for MapReduce execution from the client. It talks to the Name Node to learn the location of the data that will be used in processing, and the Name Node responds with the metadata of the required data.
The Task Tracker is the slave node for the Job Tracker and takes tasks from it. It also receives code from the Job Tracker and applies that code to the file. The process of applying the code to the file is known as the Mapper.
, the genesis of Hadoop was the Google File System paper that was published in October 2003. This paper spawned another one from Google – "MapReduce: Simplified Data Processing on Large Clusters". Development started on the
server can manage job scheduling across nodes. When Hadoop MapReduce is used with an alternate file system, the NameNode, secondary NameNode, and DataNode architecture of HDFS are replaced by the file-system-specific equivalents.
Though MapReduce Java code is common, any programming language can be used with Hadoop Streaming to implement the map and reduce parts of the user's program. Other projects in the Hadoop ecosystem expose richer user interfaces.
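As an illustration of the Hadoop Streaming idea, here is a minimal word-count sketch in Python. Streaming passes records to the mapper and reducer as lines of text, with key and value separated by a tab, and delivers the reducer's input sorted by key. The function names `map_lines` and `reduce_lines` are hypothetical, and the final lines simulate the sort-and-shuffle step locally rather than running on a cluster.

```python
from itertools import groupby
from typing import Iterable, Iterator


def map_lines(lines: Iterable[str]) -> Iterator[str]:
    """Mapper: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_lines: Iterable[str]) -> Iterator[str]:
    """Reducer: sum the counts of consecutive records that share a key.

    Assumes the input is already sorted by key, as the framework guarantees.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Local simulation: sorted() stands in for the framework's shuffle and sort.
mapped = sorted(map_lines(["the quick fox", "the lazy dog"]))
counts = list(reduce_lines(mapped))
```

In a real job, each function would live in its own script that reads `sys.stdin` and prints to standard output, and the scripts would be handed to the streaming job through its `-mapper` and `-reducer` options.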
scheduling, and optionally 5 scheduling priorities to schedule jobs from a work queue. In version 0.19 the job scheduler was refactored out of the JobTracker, while adding the ability to use an alternate scheduler (such as the
On 19 February 2008, Yahoo! Inc. launched what they claimed was the world's largest Hadoop production application. The Yahoo! Search Webmap is a Hadoop application that runs on a Linux cluster with more than 10,000
URL; however, this comes at a price – the loss of locality. To reduce network traffic, Hadoop needs to know which servers are closest to the data, information that Hadoop-specific file system bridges can provide.
that are similar to other file systems. A Hadoop instance is divided into HDFS and MapReduce. HDFS is used for storing the data and MapReduce is used for processing data. HDFS has five services as follows:
(such as "4 slots"). Every active map or reduce task takes up one slot. The Job Tracker allocates work to the tracker nearest to the data with an available slot. There is no consideration of the current
Windows Azure Storage Blobs (WASB) file system: This is an extension of HDFS that allows distributions of Hadoop to access data in Azure blob stores without moving the data permanently into the cluster.
In May 2012, high-availability capabilities were added to HDFS, letting the main metadata server called the NameNode manually fail-over onto a backup. The project has also started developing automatic
alive. If the Name Node does not receive a heartbeat from a Data Node for 2 minutes, it treats that Data Node as dead and starts re-replicating its blocks on other Data Nodes.
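The heartbeat rule can be sketched as a small liveness tracker. The 3-second interval and 2-minute threshold come from the text above; the class name `HeartbeatMonitor`, its methods, and the use of plain integer timestamps are assumptions made for the example, not Hadoop's implementation.

```python
from typing import Dict, List

HEARTBEAT_INTERVAL_S = 3   # each Data Node reports every 3 seconds
DEAD_AFTER_S = 120         # silence longer than 2 minutes marks a node dead


class HeartbeatMonitor:
    """Tracks the last heartbeat time of each Data Node (hypothetical helper)."""

    def __init__(self, dead_after_s: int = DEAD_AFTER_S) -> None:
        self.dead_after_s = dead_after_s
        self.last_seen: Dict[str, int] = {}

    def heartbeat(self, node: str, now: int) -> None:
        """Record that `node` reported in at time `now` (seconds)."""
        self.last_seen[node] = now

    def dead_nodes(self, now: int) -> List[str]:
        """Nodes whose last heartbeat is older than the threshold."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.dead_after_s)


# Simulated timeline: dn1 reports throughout, dn2 falls silent after t=60.
monitor = HeartbeatMonitor()
for t in range(0, 301, HEARTBEAT_INTERVAL_S):
    monitor.heartbeat("dn1", t)
    if t <= 60:
        monitor.heartbeat("dn2", t)
dead = monitor.dead_nodes(now=300)
```

A Name Node following this rule would then schedule new copies of every block that had a replica on the nodes listed in `dead`.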
in the same rack. This reduces network traffic on the main backbone network. If a TaskTracker fails or times out, that part of the job is rescheduled. The TaskTracker on each node spawns a separate
clusters perform is known to include the index calculations for the Yahoo! search engine. In June 2009, Yahoo! made the source code of its Hadoop version available to the open-source community.
at the time, named it after his son's toy elephant. The initial code that was factored out of Nutch consisted of about 5,000 lines of code for HDFS and about 6,000 lines of code for MapReduce.
The authors highlight the need for storage systems to accept all data formats and to provide APIs for data access that evolve based on the storage system's understanding of the data.
By default, jobs that are uncategorized go into a default pool. Pools have to specify the minimum number of map slots, reduce slots, as well as a limit on the number of running jobs.
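The pool settings named above can be modelled with a small data structure. This is a sketch only: the `Pool` class, its field names, and the `pool_for` helper are hypothetical and do not mirror the fair scheduler's real configuration format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Pool:
    """One scheduler pool with the three settings the text mentions."""
    name: str
    min_map_slots: int
    min_reduce_slots: int
    max_running_jobs: int
    running: List[str] = field(default_factory=list)

    def try_start(self, job: str) -> bool:
        """Start the job unless the pool's running-job limit is reached."""
        if len(self.running) >= self.max_running_jobs:
            return False
        self.running.append(job)
        return True


def pool_for(pool_name: Optional[str], pools: Dict[str, Pool]) -> Pool:
    """Uncategorized jobs (or unknown pool names) go to the default pool."""
    if pool_name is None:
        return pools["default"]
    return pools.get(pool_name, pools["default"])


# Hypothetical configuration with a small default pool and a production pool.
pools = {
    "default": Pool("default", min_map_slots=1, min_reduce_slots=1, max_running_jobs=2),
    "prod": Pool("prod", min_map_slots=10, min_reduce_slots=5, max_running_jobs=4),
}
started = [pool_for(None, pools).try_start(j) for j in ["j1", "j2", "j3"]]
```

The third uncategorized job is held back because the default pool allows only two running jobs at a time.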
options are available for the namenode due to its criticality. Each datanode serves up blocks of data over the network using a block protocol specific to HDFS. The file system uses
A small Hadoop cluster includes a single master and multiple worker nodes. The master node consists of a Job Tracker, Task Tracker, NameNode, and DataNode. A slave or worker node acts as both a DataNode and TaskTracker, though it is possible to have data-only and compute-only worker nodes. These are normally used only in nonstandard applications.
, Google. This paper inspired Doug Cutting to develop an open-source implementation of the Map-Reduce framework. He named it Hadoop, after his son's toy elephant.
of storage. In June 2012, they announced the data had grown to 100 PB and later that year they announced that the data was growing by roughly half a PB per day.
Hadoop 3 also permits the use of GPU hardware within the cluster, which is a substantial benefit for executing deep learning algorithms on a Hadoop cluster.
The naming of products and derivative works from other vendors and the term "compatible" are somewhat controversial within the Hadoop developer community.
HDFS: Hadoop's own rack-aware file system. This is designed to scale to tens of petabytes of storage and runs on top of the file systems of the underlying
Wang, Yandong; Goldstone, Robin; Yu, Weikuan; Wang, Teng (October 2014). "Characterization and Optimization of Memory-Resident MapReduce on HPC Systems".
HDFS is not restricted to MapReduce jobs. It can be used for other applications, many of which are under development at Apache. The list includes the
software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a
The capacity scheduler was developed by Yahoo. The capacity scheduler supports several features that are similar to those of the fair scheduler.
3119: 335:– (introduced in 2012) is a platform responsible for managing computing resources in clusters and using them for scheduling users' applications; 3402: 1665:. Theoretically, Hadoop could be used for any workload that is batch-oriented rather than real-time, is very data-intensive, and benefits from 2452: 6018: 3300: 3048: 1112:
package, which provides file system and operating system level abstractions, a MapReduce engine (either MapReduce/MR1 or YARN/MR2) and the
HDFS consists of only one Name Node that is called the Master Node. The master node can track files, manage the file system and has the
Hadoop works directly with any distributed file system that can be mounted by the underlying operating system by simply using a
HDFS stores large files (typically in the range of gigabytes to terabytes) across multiple machines. It achieves reliability by
– a distributed file-system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster;
The Apache Software Foundation has stated that only software officially released by the Apache Hadoop Project can be called
Pessach, Yaniv (2013). "Distributed Storage" (Distributed Storage: Concepts, Algorithms, and Implementations ed.).
HDFS was designed for mostly immutable files and may not be suitable for systems requiring concurrent write operations.
5959: 5130: 4939: 2230:"Continuuity Raises $ 10 Million Series A Round to Ignite Big Data Application Development Within the Hadoop Ecosystem" 5337: 6028: 5581: 2291: 2229: 5495: 3704: 1759:. The cloud allows organizations to deploy Hadoop without the need to acquire hardware or specific setup expertise. 5247: 4969: 4929: 1521: 1429: 175: 3446: 1972: 3070: 2418: 3863: 2757: 1329:
API (generates a client in a number of languages e.g. C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#,
in Hadoop 2, Hadoop 3 enables having multiple name nodes, which solves the single point of failure problem.
, or collection of additional software packages that can be installed on top of or alongside Hadoop, such as
As of 2013, Hadoop adoption had become widespread: more than half of the Fortune 50 companies used Hadoop.
In April 2010, Appistry released a Hadoop file system driver for use with its own CloudIQ Storage product.
5885: 5733: 5665: 4959: 4766: 3088:"HADOOP-6330: Integrating IBM General Parallel File System implementation of Hadoop Filesystem interface" 2677: 1354: 422: 156: 84: 2777:
HDFS is not a file system in the traditional sense and isn't usually directly mounted for a user to view
2025: 1863:, a database that uses JSON for documents, JavaScript for MapReduce queries, and regular HTTP for an API 6023: 5770: 5760: 5750: 5142: 4732: 4705: 306: 55: 457:
project, but was moved to the new Hadoop subproject in January 2006. Doug Cutting, who was working at
server-on-demand infrastructure. There is no rack-awareness in this file system, as it is all remote.
3001: 2921: 5835: 5688: 5591: 5536: 5411: 5267: 5036: 4455: 2337: 1874: 1499:
The allocation of work to TaskTrackers is very simple. Every TaskTracker has a number of available
In April 2010, Parascale published the source code to run Hadoop against the Parascale file system.
1252: 1143: 426: 3218: 1791:
Some papers influenced the birth and growth of Hadoop and big data processing. Some of these are:
Apache Hadoop Ozone: HDFS-compatible object store optimized for billions of small files.
5934: 5890: 5872: 5571: 5561: 5016: 3929: 3275: 2820: 1554: 1307: 1172: 180: 3308: 1803: 1183:
compliance, but it does provide shell commands and Java application programming interface (API)
5780: 5745: 5683: 5162: 4982: 4846: 4776: 4470: 1901: 1590: 1338: 1176: 430: 302: 3581: 3141: 2281: 1479:, to which client applications submit MapReduce jobs. The JobTracker pushes work to available 5916: 5825: 5775: 5718: 5464: 5434: 5385: 5237: 5210: 5087: 4977: 4890: 4781: 4698: 4394: 2908: 1263: 262: 3647: 2283:
Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data
5990: 5951: 5765: 5484: 5459: 5100: 4917: 4907: 4863: 4828: 3685: 3372: 1682: 1485: 310: 254: 1618:
There are important features provided by Hadoop 3. For example, while there is one single
8: 5995: 5941: 5880: 5469: 5137: 5078: 4994: 4334: 3469:"Under the Hood: Hadoop Distributed File system reliability with Namenode and Avatarnode" 2958:"Improving MapReduce performance through data placement in heterogeneous Hadoop Clusters" 1670: 1311: 3333:
2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
In 2010, Facebook claimed that they had the largest Hadoop cluster in the world with 21
Within a queue, a job with a high level of priority has access to the queue's resources.
5985: 5847: 5810: 5723: 5449: 5439: 5424: 5365: 5195: 4858: 4851: 4838: 4791: 3924: 3354: 2154: 1666: 1546: 1489: 1460:, which replaced the HDFS file system with a full random-access read/write file system. 415: 341:– an implementation of the MapReduce programming model for large-scale data processing. 281: 258: 192: 3826: 1545:. The goal of the fair scheduler is to provide fast response times for small jobs and 212: 5800: 5740: 5576: 5262: 5232: 5224: 5093: 5068: 4989: 4964: 4786: 4349: 4239: 4124: 3989: 3974: 3954: 3797: 3776: 3768:
Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools
3751: 3747: 3714: 3689: 3559: 3534: 3344: 3171: 2936:"Version 2.0 provides for manual failover and they are working on automatic failover" 2378: 2287: 2144: 1626: 1326: 1303: 1251:
A Hadoop cluster has nominally a single namenode plus a cluster of datanodes, although
1117: 301:, where nodes manipulate the data they have access to. This allows the dataset to be 294: 273: 5073: 3191: 3101: 3087: 2158: 5820: 5728: 5526: 5176: 5120: 4809: 4558: 4432: 4389: 4379: 4079: 4039: 4024: 3979: 3603: 3358: 3336: 2402:
The Lucene PMC has voted to split part of Nutch into a new sub-project named Hadoop
2136: 1854: 1655: 1504: 1386: 380: 277: 227: 187: 163: 145: 2393: 1818:
H-store: a high-performance, distributed main memory transaction processing system
6048: 5929: 5842: 5350: 4744: 4593: 4588: 4568: 4424: 4404: 4364: 4359: 4354: 4339: 4294: 4069: 3959: 3889: 3884: 3879: 3832: 3791: 3679: 2957: 2896: 2765: 1921: 1845: 1756: 1442: 1381:
In May 2011, the list of supported file systems bundled with Apache Hadoop was:
1342: 1259: 2542: 5830: 5815: 5755: 4933: 4804: 4659: 4633: 4628: 4583: 4543: 4486: 4460: 4442: 4259: 4254: 4234: 4229: 4224: 4184: 4109: 4004: 3999: 3984: 3964: 3894: 3766: 2978: 2650: 2632: 2614: 2596: 2578: 2560: 2470: 2363: 1860: 1662: 450: 372: 168: 43: 1767:
A number of companies offer commercial implementations or support for Hadoop.
937: 701: 685: 669: 652: 635: 618: 601: 584: 567: 550: 533: 516: 500: 484: 6012: 5860: 5805: 5419: 5278: 4618: 4573: 4548: 4419: 4409: 4384: 4369: 4344: 4289: 4249: 4189: 4164: 4159: 4139: 4119: 4114: 4074: 4009: 3994: 3904: 3899: 3724: 1895: 1817: 1652: 971: 954: 920: 903: 886: 869: 852: 835: 819: 803: 786: 769: 752: 735: 718: 434: 384: 298: 297:
into nodes to process the data in parallel. This approach takes advantage of
3340: 2900: 2075: 1632:
One of the biggest changes is that Hadoop 3 decreases storage overhead with
1146:(JRE) 1.6 or higher. The standard startup and shutdown scripts require that 5924: 5632: 5220: 5187: 5125: 5105: 4613: 4598: 4553: 4502: 4465: 4414: 4329: 4324: 4314: 4309: 4304: 4299: 4279: 4274: 4219: 4214: 4204: 4169: 4154: 4144: 4129: 4099: 4094: 4059: 4054: 4044: 4034: 4029: 4019: 3969: 3944: 3919: 3914: 3741: 2448: 2414: 1717: 1678: 1674: 1633: 1446: 1175:
written in Java for the Hadoop framework. Some consider it to instead be a
1147: 1099: 454: 446: 400: 396: 388: 376: 368: 39: 3002:"The Hadoop Distributed Filesystem: Balancing Portability and Performance" 2140: 2133:
2014 IEEE 28th International Parallel and Distributed Processing Symposium
From Databases to Dataspaces: A New Abstraction for Information Management
5622: 4721: 4638: 4578: 4533: 4374: 4319: 4284: 4194: 4174: 4149: 4134: 4104: 4084: 4049: 3949: 3939: 3934: 2798: 2237: 1703: 1693: 1659: 1398:
file system: This stores all its data on remotely accessible FTP servers.
the data across multiple hosts, and hence theoretically does not require
364: 3491:"Under the Hood: Scheduling MapReduce jobs more efficiently with Corona" 1755:
Hadoop can be deployed in a traditional onsite datacenter as well as in
1549:(QoS) for production jobs. The fair scheduler has three basic concepts. 1475:
Atop the file systems comes the MapReduce Engine, which consists of one
5900: 5698: 4643: 4603: 4563: 4512: 4269: 4264: 4244: 4064: 4014: 3909: 1889: 1883: 1733: 1606:, which does job tracking and resource allocation to applications, the 1279: 360: 316:
The base Apache Hadoop framework is composed of the following modules:
Archival work for compliance, including of relational and tabular data
where computation and data are distributed via high-speed networking.
5855: 5596: 5474: 5205: 3424: 2495: 1470: 1456:
announced the availability of an alternative file system for Hadoop,
1286: 1103: 411: 270: 305:
faster and more efficiently than it would be in a more conventional
5673: 5617: 5586: 5375: 5200: 5056: 4949: 4902: 4796: 4199: 4179: 3071:"Cloud analytics: Do we really need to reinvent the storage stack?" 1869: 1849: 1741: 1582:
Free resources are allocated to queues beyond their total capacity.
1542: 1464: 1358: 1216: 266: 1596: 1026: 1007: 988: 323:– contains libraries and utilities needed by other Hadoop modules; 26: 5627: 5601: 5428: 5061: 5031: 4922: 4878: 1457: 1362: 355:
is often used for both base modules and sub-modules and also the
Apache HCatalog, a table and storage management layer for Hadoop
It can also be used to complement a real-time system, such as
Queues are allocated a fraction of the total resource capacity.
1256: 458: 407: 406:
Apache Hadoop's MapReduce and HDFS components were inspired by
compacted fsimage file is created by the secondary name node.
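The checkpoint step can be pictured as replaying the edit log on top of the last image. The sketch below is a toy model under stated assumptions: the real fsimage and editlog are on-disk formats, whereas here the image is a plain dictionary from path to replication factor, and the operation names `create`, `delete`, and `rename` are hypothetical.

```python
from typing import Dict, List, Tuple


def apply_edits(fsimage: Dict[str, int],
                editlog: List[Tuple]) -> Dict[str, int]:
    """Return a new, compacted image: the old image with every logged edit applied."""
    image = dict(fsimage)                 # leave the original image untouched
    for op, path, *args in editlog:
        if op == "create":
            image[path] = args[0]         # args[0] is the replication factor
        elif op == "delete":
            image.pop(path, None)
        elif op == "rename":
            image[args[0]] = image.pop(path)   # args[0] is the new path
        else:
            raise ValueError(f"unknown edit operation: {op}")
    return image


# Hypothetical namespace and journal of changes since the last checkpoint.
fsimage = {"/logs/a": 3, "/logs/b": 3}
editlog = [
    ("create", "/logs/c", 3),
    ("delete", "/logs/a"),
    ("rename", "/logs/b", "/archive/b"),
]
checkpoint = apply_edits(fsimage, editlog)
```

Once the compacted image exists, the edits it already contains no longer need to be replayed at restart, which is why checkpointing shortens name node recovery.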
5895: 5693: 5678: 5655: 5650: 5645: 5556: 5551: 5489: 5370: 5323: 5318: 5311: 5306: 5301: 5296: 5252: 5242: 5147: 5115: 5009: 5004: 4999: 4895: 4823: 4771: 4623: 4538: 4517: 4209: 2522:"The Hadoop Distributed File System: Architecture and Design" 2336:
Cutting, Mike; Cafarella, Ben; Lorica, Doug (31 March 2016).
2259: 1857:, a column-oriented database that supports access from Hadoop 1648: 1334: 1325:
File access can be achieved through the native Java API, the
1315: 1180: 392: 242: 3820: 3515:(Press release). Eatontown, NJ: Altior Inc. 18 December 2012 3329: 2050: 1507:
of the allocated machine, and hence its actual availability.
203: 5566: 5531: 5521: 5516: 5454: 5390: 5360: 5355: 5291: 5286: 5257: 5166: 5157: 5110: 5051: 5021: 4912: 4883: 4873: 4868: 4833: 4815: 4608: 4399: 2722:"Running Hadoop on Ubuntu Linux System(Multi-Node Cluster)" 2419:"Hadoop, a Free Software Program, Finds Uses Beyond Search" 2362:
Ghemawat, Sanjay; Gobioff, Howard; Leung, Shun-Tak (2003).
1879: 1613: 1453: 1417: 1404:
object storage: This is targeted at clusters hosted on the
1346: 1319: 1274: 1127: 233: 1811:
Bigtable: A Distributed Storage System for Structured Data
In Hadoop 3, there are containers that work on the principle of
1073: 5172: 5152: 5041: 4943: 3226: 2379:"MapReduce: Simplified Data Processing on Large Clusters" 2083: 1711: 1425: 1395: 1061: 1056: 3447:"HDFS: Facebook has the world's largest Hadoop cluster!" 2286:. John Wiley & Sons. 19 December 2014. p. 300. 6039:
Free software programmed in Java (programming language)
MapReduce: Simplified Data Processing on Large Clusters
, which reduces time spent on application development.
and support for non-POSIX operations such as Append.
The Hadoop framework itself is mostly written in the
2361: 2315:"[nlpatumd] Adventures with Hadoop and Perl" 2076:"What is the Hadoop Distributed File System (HDFS)?" 1827: 245: 239: 3102:"HADOOP-6704: add support for Parascale filesystem" 2130: 1113: 1095: 236: 230: 3862: 3582:"Why the Pace of Hadoop Innovation Has to Pick Up" 3373:""How 30+ enterprises are using Hadoop", in DBMS2" 2728: 2714: 1802:Michael Franklin, Alon Halevy, David Maier (2005) 1886:Risk Solutions High Performance Computing Cluster 1560:Each pool is assigned a guaranteed minimum share. 1162: 347:– (introduced in 2020) An object store for Hadoop 6010: 3176:: CS1 maint: bot: original URL status unknown ( 2204:"Apache Hadoop YARN – Concepts and Applications" 1898:– Open source distributed storage and processing 1465:JobTracker and TaskTracker: the MapReduce engine 1432:. The source code was published in October 2009. 1171:(HDFS) is a distributed, scalable, and portable 2890: 2888: 1597:Difference between Hadoop 1 and Hadoop 2 (YARN) 1116:(HDFS). The Hadoop Common package contains the 3192:"Refactor the scheduler out of the JobTracker" 1750: 1150:(SSH) be set up between nodes in the cluster. 4706: 3848: 1349:, or via 3rd-party network client libraries. 1066: 3425:"Hadoop and Distributed Computing at Yahoo!" 3118:. Appistry, Inc. 6 July 2010. Archived from 2885: 2376: 1610:, which monitors progress of the execution. 3028:"How to Collect Hadoop Performance Metrics" 3000:Shafer, Jeffrey; Rixner, Scott; Cox, Alan. 2999: 1688:Commercial applications of Hadoop include: 1275:redundant array of independent disks (RAID) 4713: 4699: 3855: 3841: 3825: 3626:"Defining Hadoop Compatibility: revisited" 3025: 1945: 1943: 1292:The HDFS file system includes a so-called 1120:files and scripts needed to start Hadoop. 211: 25: 3789: 2758:"Big data storage: Hadoop storage basics" 2667: 2519: 2173:"Resource (Apache Hadoop Main 2.5.1 API)" 2106:"Data Locality: HPC vs. Hadoop vs. 
Apache Hadoop
Original author(s): Doug Cutting, Mike Cafarella
Developer(s): Apache Software Foundation
Initial release: April 1, 2006
Stable release: 2.10.2 (May 31, 2022) and 3.4.0 (March 17, 2024)
Repository: Hadoop Repository
Written in: Java
Operating system: Cross-platform
Type: Distributed file system
License: Apache License 2.0
4918:HAMMER 4908:Fossil 4574:Giraph 4549:iBATIS 4461:Daemon 4420:Xerces 4410:Wicket 4385:Tomcat 4370:Thrift 4290:Roller 4250:PDFBox 4190:Mynewt 4165:Mahout 4160:Lucene 4140:JMeter 4120:Impala 4115:Ignite 4090:Hadoop 4075:Groovy 4010:cTAKES 3995:Cocoon 3905:Ambari 3900:Allura 3800:  3779:  3773:Apress 3754:  3731:3 July 3717:  3711:Apress 3692:  3453:23 May 3357:  3347:  3202:9 June 2899:  2741:6 June 2290:  2157:  2147:  1787:Papers 1681:, and 1627:Docker 1511:nodes. 1361:, and 1327:Thrift 1257:TCP/IP 1102:, and 1038:3.4.0 1019:3.3.6 1000:3.2.4 981:3.1.4 964:3.0.3 930:2.9.2 913:2.8.5 896:2.7.7 879:2.6.5 862:2.5.2 845:2.4.1 829:2.3.0 813:2.2.0 762:1.2.1 745:1.1.2 728:1.0.4 459:Yahoo! 408:Google 399:, and 353:Hadoop 204:hadoop 94:2.10.x 5952:Lists 5896:Modes 5741:Flash 5712:Types 5694:SSHFS 5679:EncFS 5656:WinFS 5651:tmpfs 5646:sysfs 5623:devfs 5557:FTPFS 5552:EROFS 5490:SSHFS 5371:OCFS2 5324:UBIFS 5319:YAFFS 5307:NILFS 5302:LogFS 5297:JFFS2 5253:EROFS 5243:exFAT 5148:Xiafs 5131:WAPBL 5116:UBIFS 5027:OneFS 5005:NILFS 5000:Next3 4990:MINIX 4896:exFAT 4824:Btrfs 4792:AthFS 4772:AdvFS 4624:Sqoop 4619:Slide 4614:Shale 4609:River 4599:MXNet 4554:Click 4539:AxKit 4527:Attic 4518:Log4j 4503:Batik 4466:Jelly 4429:Yetus 4415:Xalan 4330:Storm 4325:Spark 4315:Sling 4310:SINGA 4305:Shiro 4300:Samza 4280:Pivot 4275:Pinot 4220:Oozie 4215:OFBiz 4210:NuttX 4205:Nutch 4170:Maven 4155:Kylin 4145:Kafka 4130:James 4100:Helix 4095:HBase 4060:Flume 4055:Flink 4045:Felix 4035:Druid 4030:Drill 4020:Derby 3970:Camel 3945:Axis2 3920:Arrow 3915:Aries 3399:Yahoo 3355:S2CID 3251:(PDF) 3074:(PDF) 3005:(PDF) 2961:(PDF) 2525:(PDF) 2155:S2CID 1734:cores 1679:Flink 1649:HBase 1555:pools 1501:slots 1490:Jetty 1345:over 1335:OCaml 1331:Cocoa 1316:Linux 1181:POSIX 941:2.10 705:0.23 689:0.22 673:0.21 656:0.20 639:0.19 622:0.18 605:0.17 588:0.16 571:0.15 554:0.14 537:0.13 520:0.12 504:0.11 488:0.10 112:3.4.x 5856:Hard 5848:Fork 5729:Grid 5582:MVFS 5577:NOVA 5572:LTFS 5567:Lnfs 5562:FUSE 5532:CDfs 
5522:AXFS 5517:Aufs 5455:GPFS 5440:Coda 5391:Xsan 5381:PVFS 5361:GFS2 5356:CXFS 5351:Ceph 5292:JFFS 5287:CHFS 5268:NVFS 5258:F2FS 5248:TFAT 5233:APFS 5223:and 5167:z/OS 5158:Xsan 5143:WAFL 5138:VxFS 5111:Tux3 5101:SNFS 5083:SFS 5052:ReFS 5022:NTFS 4974:MFS 4960:HTFS 4955:HPFS 4950:HFS+ 4913:GPFS 4884:ext4 4874:ext3 4869:ext2 4843:EFS 4834:CXFS 4829:CVFS 4816:z/VM 4801:BFS 4787:APFS 4767:ADFS 4639:Wave 4579:Hama 4569:Etch 4534:Apex 4451:BCEL 4400:UIMA 4375:Tika 4320:Solr 4285:Qpid 4195:NiFi 4175:MINA 4150:Kudu 4135:Jena 4105:Hive 4085:Gump 4050:Flex 3950:Beam 3940:Axis 3935:Avro 3798:ISBN 3777:ISBN 3752:ISBN 3733:2009 3715:ISBN 3690:ISBN 3660:2014 3634:2013 3612:2013 3590:2013 3568:2014 3546:2017 3521:2013 3499:2012 3477:2012 3455:2012 3433:2013 3411:2015 3381:2013 3345:ISBN 3317:2018 3287:2015 3262:2017 3234:2013 3204:2012 3178:link 3164:2017 3128:2013 3057:2013 3035:2016 3013:2016 2987:2016 2944:2013 2922:help 2879:2013 2858:2020 2832:2021 2806:2016 2773:2016 2743:2013 2708:2014 2686:2013 2507:2017 2482:2017 2435:2010 2349:2017 2323:2013 2301:2015 2288:ISBN 2268:2013 2246:2014 2216:2014 2189:2014 2145:ISBN 2118:2014 2091:2021 2062:2016 2037:2018 2011:2018 1984:2018 1958:2022 1934:2019 1880:HPCC 1522:FIFO 1418:MapR 1347:HTTP 1320:Unix 1167:The 975:3.1 958:3.0 924:2.9 907:2.8 890:2.7 873:2.6 856:2.5 839:2.4 823:2.3 807:2.2 790:2.1 773:2.0 756:1.2 739:1.1 722:1.0 449:and 429:and 414:and 261:for 208:.org 176:Type 157:Java 5699:ZFS 5684:EFS 5470:NFS 5465:NCP 5445:DFS 5435:AFP 5425:AFS 5412:NAS 5386:QFS 5263:JFS 5238:FAT 5225:SSD 5211:UDF 5196:HSF 5177:Sun 5173:ZFS 5163:zFS 5153:XFS 5079:RFS 5042:QFS 5037:PFS 5017:NSS 4970:LFS 4965:JFS 4944:MVS 4940:HFS 4930:HFS 4891:FAT 4864:ext 4839:DFS 4644:XML 4604:ODE 4513:Ivy 4508:FOP 4456:BSF 4270:Pig 4265:POI 4245:ORC 4015:CXF 3930:APR 3910:Ant 3337:doi 3227:IBM 2137:doi 2084:IBM 1779:or 1712:XML 1426:IBM 1396:FTP 1314:on 1031:3.4 1012:3.3 993:3.2 6015:: 5420:9P 5088:VM 3650:. 3537:. 3397:. 3353:. 3343:. 3303:. 3278:. 3253:. 
3225:. 3221:. 3194:. 3174:}} 3170:{{ 2913:: 2911:}} 2907:{{ 2897:OL 2887:^ 2823:. 2797:. 2793:. 2775:. 2764:. 2760:. 2672:. 2653:. 2635:. 2617:. 2599:. 2581:. 2563:. 2545:. 2527:. 2498:. 2473:. 2455:. 2425:. 2421:. 2400:. 2396:. 2340:. 2236:. 2232:. 2206:. 2175:. 2153:. 2143:. 2108:. 2082:. 2078:. 2053:. 2028:. 2001:. 1975:. 1942:^ 1924:. 1882:– 1742:PB 1685:. 1677:, 1673:, 1636:. 1443:HP 1420:. 1365:. 1357:, 1289:. 1098:, 418:. 403:. 395:, 391:, 387:, 383:, 379:, 375:, 371:, 367:, 363:, 243:uː 42:, 5431:) 5427:( 5179:) 5175:( 5169:) 5165:( 5090:) 4946:) 4942:( 4936:) 4932:( 4818:) 4714:e 4707:t 4700:v 3856:e 3849:t 3842:v 3806:. 3785:. 3760:. 3735:. 3698:. 3662:. 3636:. 3614:. 3592:. 3570:. 3548:. 3523:. 3501:. 3479:. 3457:. 3435:. 3413:. 3383:. 3361:. 3339:: 3319:. 3289:. 3264:. 3236:. 3206:. 3180:) 3166:. 3130:. 3059:. 3037:. 3015:. 2989:. 2946:. 2924:) 2920:( 2903:. 2881:. 2860:. 2834:. 2808:. 2745:. 2724:. 2710:. 2688:. 2657:. 2639:. 2621:. 2603:. 2585:. 2567:. 2549:. 2531:. 2509:. 2484:. 2437:. 2351:. 2325:. 2303:. 2270:. 2248:. 2218:. 2191:. 2161:. 2139:: 2120:. 2093:. 2064:. 2039:. 2013:. 1986:. 1960:. 1936:. 1557:. 1389:. 427:C 249:/ 246:p 240:d 237:ˈ 234:ə 231:h 228:/ 224:( 124:) 106:) 74:)

Index


Original author(s): Doug Cutting, Mike Cafarella
Developer(s): Apache Software Foundation
Stable release: 2.10.x (2022-05-31), 3.4.x (2024-03-17)
Repository: Hadoop Repository
Written in: Java
Operating system: Cross-platform
Type: Distributed file system
License: Apache License 2.0
Website: hadoop.apache.org
/həˈduːp/

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.