Hadoop File System Commands
I took a Hadoop developer training course during the week of September 10. Hadoop is not totally new to me, as I’ve tried the HelloWorld sample and the Serengeti project. Still, I found it nice to get away from the daily job and go through a series of lectures and hands-on labs in a training setting. Believe it or not, I felt more tired after the training than after a typical working day. There is not much new in this post; it is mainly a reference for me when I need the commands later.
The Hadoop Distributed File System (HDFS) is a fundamental building block of the Hadoop ecosystem. It’s a file system designed to store big data, including both input data and result data. To do that, HDFS distributes big files across networked data nodes. Although logically contiguous, a big file can be split into many chunks, each of which can be saved on a different physical machine.
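To get a feel for the splitting, here is a small arithmetic sketch. The helper name hdfs_block_count is made up for this post, and the 64 MB chunk size is an assumption: the real value is a cluster setting, so check yours before trusting the numbers.

```shell
# Hypothetical helper: how many chunks a file of a given size would occupy.
# Assumes a 64 MB chunk size unless a second argument overrides it.
hdfs_block_count() {
  size_bytes=$1
  block_bytes=${2:-$((64 * 1024 * 1024))}
  # Ceiling division: a partial last chunk still counts as one chunk.
  echo $(( (size_bytes + block_bytes - 1) / block_bytes ))
}

hdfs_block_count 1073741824   # a 1 GB file with 64 MB chunks: prints 16
```

Each of those 16 chunks could end up on a different data node.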
You can access the files with APIs, but more often with the command line (which is, BTW, an application built on top of the HDFS APIs). There are about 30 commands for managing a Hadoop file system remotely, for example from a Linux shell. Don’t confuse the Hadoop file system with your local file system. In a way, you can think of the Hadoop file system as a file system on another machine.
Syntax Overview
The basic syntax of HDFS commands is as follows:
$ hadoop fs -command [extra arguments]
For example:
$ hadoop fs -ls
The first part, “hadoop fs”, is always the same for file system related commands. What follows is very much like typical Unix/Linux commands in syntax. Besides managing HDFS itself, there are commands to import data files from the local file system into HDFS, and to export data files from HDFS to the local file system. These commands are unique to HDFS and therefore deserve the most attention.
[-put ... ]
[-copyFromLocal ... ]
[-moveFromLocal ... ]
[-get [-ignoreCrc] [-crc] ]
[-getmerge [addnl]]
[-copyToLocal [-ignoreCrc] [-crc] ]
[-moveToLocal [-crc] ]
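Of the export commands, -getmerge is the one that differs most from a plain copy: it concatenates the files under an HDFS directory into a single local file, which is handy when a job writes its result as several files. A minimal sketch follows; the function name fetch_merged and the file name result.txt are made up for illustration.

```shell
# Hypothetical helper: merge every file under an HDFS directory into one local file.
fetch_merged() {
  hdfs_dir=$1     # HDFS directory, e.g. test/output
  local_file=$2   # local destination, e.g. result.txt
  hadoop fs -getmerge "$hdfs_dir" "$local_file"
}

# Usage, on a machine with a configured Hadoop client:
#   fetch_merged test/output result.txt
```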
A Typical Use Case
When using Hadoop, you need to move your data into HDFS before processing it, and optionally move the result back to your local file system. Here is a typical flow:
$ hadoop fs -mkdir test
$ hadoop fs -put input.txt test/input.txt
$ hadoop fs -ls test
$ hadoop fs -cat test/input.txt
$ hadoop jar mr.jar WordCount test/input.txt test/output
$ hadoop fs -ls test/output
$ hadoop fs -lsr test
$ hadoop fs -get test/output .
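If you run this flow often, it can be wrapped in a function that stops at the first failing step. This is only a sketch under a few assumptions: run_wordcount is a name I made up, the HDFS working directory does not exist yet, and mr.jar with its WordCount class sits in the current directory as in the flow above. The -test -e check is there because a MapReduce job refuses to write into an output directory that already exists.

```shell
# Hypothetical wrapper around the flow above; stops at the first failing step.
run_wordcount() {
  input=$1      # local input file, e.g. input.txt
  workdir=$2    # HDFS working directory, e.g. test (assumed not to exist yet)
  name=$(basename "$input")

  # Bail out early if the job's output directory is already there.
  if hadoop fs -test -e "$workdir/output"; then
    echo "output directory $workdir/output already exists" >&2
    return 1
  fi

  hadoop fs -mkdir "$workdir" &&
  hadoop fs -put "$input" "$workdir/$name" &&
  hadoop jar mr.jar WordCount "$workdir/$name" "$workdir/output" &&
  hadoop fs -get "$workdir/output" .
}

# Usage, on a machine with a configured Hadoop client:
#   run_wordcount input.txt test
```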
Other Useful Commands
There are other commands you will find useful, for example the commands listed below:
$ hadoop fs -chmod 777 test/input.txt
$ hadoop fs -cp test/input.txt test/input1.txt
$ hadoop fs -rmr test
Space used in bytes for individual files or directories:
$ hadoop fs -du
Space used in bytes as a summary, so only one entry is given. The -count command also reports the number of directories and files under the path:
$ hadoop fs -dus
$ hadoop fs -count /test
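Raw byte counts get hard to read once the data is big. Here is a small conversion sketch; bytes_to_mb is a helper name I made up, and it uses 1 MB = 1024 * 1024 bytes.

```shell
# Hypothetical helper: turn a byte count into megabytes with one decimal place.
bytes_to_mb() {
  awk -v b="$1" 'BEGIN { printf "%.1f MB\n", b / (1024 * 1024) }'
}

bytes_to_mb 1073741824   # prints 1024.0 MB
```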
Getting Help
Last but not least is the help command. When in doubt, you can always use help:
$ hadoop fs -help
Don’t forget the “-” before help, or you will see something similar but different. You can also name the specific command you want help on, for example:
$ hadoop fs -help copyFromLocal