HADOOP DISTRIBUTED FILE SYSTEM

Introduction

Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data.
Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.
Hadoop is an Apache project being built and used by a global community of contributors, using the Java programming language.
Yahoo! has been the largest contributor to the project, and uses Hadoop extensively across its businesses.
Hadoop was created by Doug Cutting, who named it after his son's toy elephant.

Benefits of Hadoop

Hadoop is designed to run on cheap commodity hardware
It automatically handles data replication and node failure
It does the hard work, so you can focus on processing data
It offers cost savings along with efficient and reliable data processing
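
To make the replication and node-failure point more concrete, here is a minimal toy sketch in Python. It is not Hadoop code and does not reflect Hadoop's real block placement policy; the class name ToyCluster, the node names, the block ids, and the replication factor of 3 are all assumptions chosen for illustration. It only shows the general idea: each block is kept on several nodes, and when a node fails, the lost copies are recreated on the remaining nodes.

```python
import random

# Assumed replication factor, chosen for illustration only.
REPLICATION_FACTOR = 3


class ToyCluster:
    """Toy model of block replication; NOT Hadoop's actual placement logic."""

    def __init__(self, node_names, seed=0):
        self.live_nodes = set(node_names)
        # Maps a block id to the set of node names holding a replica.
        self.block_locations = {}
        self.rng = random.Random(seed)

    def store_block(self, block_id):
        # Place replicas on distinct live nodes, picked at random.
        count = min(REPLICATION_FACTOR, len(self.live_nodes))
        targets = self.rng.sample(sorted(self.live_nodes), count)
        self.block_locations[block_id] = set(targets)

    def fail_node(self, node):
        # Drop the failed node, then top up every under-replicated block
        # by copying it to live nodes that do not already hold it.
        self.live_nodes.discard(node)
        for holders in self.block_locations.values():
            holders.discard(node)
            candidates = sorted(self.live_nodes - holders)
            while len(holders) < REPLICATION_FACTOR and candidates:
                new_holder = self.rng.choice(candidates)
                candidates.remove(new_holder)
                holders.add(new_holder)


# Hypothetical five-node cluster storing three blocks.
cluster = ToyCluster(["node1", "node2", "node3", "node4", "node5"])
for block_id in ("blk_1", "blk_2", "blk_3"):
    cluster.store_block(block_id)

cluster.fail_node("node2")
for block_id, holders in sorted(cluster.block_locations.items()):
    print(block_id, sorted(holders))
```

After node2 fails, every block is still held by three live nodes, which is the property that lets an application carry on without handling the failure itself.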

Overall Architecture

A small Hadoop cluster will include a single master and multiple worker nodes.
The master node consists of:
JobTracker
TaskTracker
NameNode
DataNode
A slave or worker node consists of a DataNode and a TaskTracker, though it is possible to have data-only worker nodes and compute-only worker nodes.
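
The layout described above can be written down as a small data model. This is only an illustrative Python sketch: the Node class, the worker helper, and the node names are hypothetical, and nothing here starts or configures real Hadoop daemons. It simply records which daemons run on which node in a small cluster, including a data-only and a compute-only worker.

```python
from dataclasses import dataclass

# Daemons on the master node of a small cluster, as listed in the post.
MASTER_DAEMONS = frozenset({"JobTracker", "TaskTracker", "NameNode", "DataNode"})


@dataclass(frozen=True)
class Node:
    """Hypothetical record of one machine and the daemons it runs."""
    name: str
    daemons: frozenset


def worker(name, data=True, compute=True):
    # A regular worker runs both daemons; switching one off gives a
    # compute-only or data-only worker.
    daemons = set()
    if data:
        daemons.add("DataNode")
    if compute:
        daemons.add("TaskTracker")
    if not daemons:
        raise ValueError("a worker must run at least one daemon")
    return Node(name, frozenset(daemons))


def nodes_running(cluster, daemon):
    # Names of all nodes that run the given daemon, in cluster order.
    return [node.name for node in cluster if daemon in node.daemons]


# Hypothetical small cluster: one master and three workers.
cluster = [
    Node("master", MASTER_DAEMONS),
    worker("worker1"),                 # data and compute
    worker("worker2", compute=False),  # data-only
    worker("worker3", data=False),     # compute-only
]

print(nodes_running(cluster, "DataNode"))
print(nodes_running(cluster, "TaskTracker"))
print(nodes_running(cluster, "NameNode"))
```

In this sketch storage (DataNode) and computation (TaskTracker) are separate roles, which is why a worker can carry either one alone.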