NOTE This post is one in a series on Hadoop for .NET Developers.
Hadoop is implemented as a set of interrelated project components. The core components are MapReduce, which handles job execution, and a storage layer, typically implemented as the Hadoop Distributed File System (HDFS). For the purpose of this post, we will assume HDFS is in use.
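To make the MapReduce idea concrete, here is a minimal conceptual sketch in Python of the classic word-count job: a map phase emits key/value pairs from the input, and a reduce phase aggregates the values for each key. This is an illustration of the programming model only, not Hadoop's actual Java implementation; all function names here are made up for the example.

```python
from collections import defaultdict

def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["Hadoop stores data", "Hadoop processes data"]
result = reduce_phase(map_phase(docs))
print(result)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

In a real Hadoop job, the map and reduce steps run in parallel across the data nodes, with the framework handling the shuffle of intermediate pairs between them.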
Hadoop components are implemented across a series of servers referred to as data (or compute) nodes. These nodes are where data are stored and processed.
A name node server keeps track of the data nodes in the environment and which data are stored on which node, and presents the data nodes as a single entity. This singular representation is referred to as a cluster. If you are familiar with the term cluster from RDBMS implementations, note that the nodes do not necessarily share storage or any other resources; a Hadoop cluster is purely logical.
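The name node's role can be sketched as a lookup table: it records which block of which file lives on which data node, so clients can address one logical file system without knowing where the bytes physically sit. The toy class below is an assumption-laden illustration of that bookkeeping, not the real HDFS NameNode; the class and node names are invented for the example.

```python
class ToyNameNode:
    """Toy model of name node bookkeeping: maps file blocks to data nodes."""

    def __init__(self):
        # (filename, block_index) -> name of the data node holding that block
        self.block_locations = {}

    def register_block(self, filename, block_index, data_node):
        """A data node reports that it holds one block of a file."""
        self.block_locations[(filename, block_index)] = data_node

    def locate(self, filename):
        """Return the data nodes holding each block of a file, in block order."""
        blocks = sorted(k for k in self.block_locations if k[0] == filename)
        return [self.block_locations[b] for b in blocks]

# A file split into two blocks, stored on different data nodes,
# is still addressed by a single path.
nn = ToyNameNode()
nn.register_block("/logs/day1.txt", 0, "datanode-1")
nn.register_block("/logs/day1.txt", 1, "datanode-3")
print(nn.locate("/logs/day1.txt"))  # ['datanode-1', 'datanode-3']
```

The real name node also handles replication, heartbeats, and failure detection, but the core idea is the same: one server holds the map from logical files to physical locations, which is what makes the collection of nodes behave as a single cluster.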