Saturday, July 23, 2016

HDFS Commands

1. To create directory (mkdir)

Usage: hdfs dfs -mkdir [-p] <paths>

Takes Path/URI as argument to create directories.
-p: much like Unix "mkdir -p", creating parent directories along the path.

hdfs dfs -mkdir /user/root/dir1 /user/root/dir2
hdfs dfs -mkdir hdfs://localhost/user/hadoop/dir1 hdfs://localhost/user/hadoop/dir2
hdfs dfs -mkdir hdfs://localhost/user/root/hadoop

Exit Code:
Returns 0 on success and -1 on error
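In a script it is safer to test the exit status for zero versus nonzero than to compare it with -1, because a shell only ever reports statuses in the range 0 to 255. A minimal sketch, where `make_dir` is a hypothetical helper name and not part of HDFS:

```shell
# Hypothetical helper: create a directory (and any missing parents) and
# treat any nonzero exit status from hdfs as a failure.
make_dir() {
    if hdfs dfs -mkdir -p "$1"; then
        echo "created: $1"
    else
        status=$?   # exit status of the failed hdfs command
        echo "mkdir failed for $1 (exit status $status)" >&2
        return 1
    fi
}
```

It would be called as `make_dir /user/root/dir1`.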

2. To list files (ls)

Usage: hdfs dfs -ls [-d] [-h] [-R] <args>

 -d: Directories are listed as plain files.
 -h: Format file sizes in a human-readable fashion (e.g., 64.0m instead of 67108864).
 -R: Recursively list subdirectories encountered.

For a file ls returns stat on the file with the following format:

permissions number_of_replicas userid groupid filesize modification_date modification_time filename

For a directory it returns list of its direct children as in Unix. A directory is listed as:

permissions userid groupid modification_date modification_time dirname

Files within a directory are ordered by filename by default.
hdfs dfs -ls /user/root
hdfs dfs -ls hdfs://localhost/user/root/hadoopfile1

Exit Code:
Returns 0 on success and -1 on error.
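Because the file listing has a fixed column order, it can be post-processed with standard text tools. The sketch below assumes the eight-column file format described above and file names without whitespace; `ls_file_sizes` is a hypothetical helper name.

```shell
# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print
# "size<TAB>name" for regular files only. Lines that do not have eight
# fields, or whose permissions do not start with "-", are skipped.
ls_file_sizes() {
    awk 'NF == 8 && substr($1, 1, 1) == "-" { printf "%s\t%s\n", $5, $8 }'
}
```

Usage: `hdfs dfs -ls /user/root | ls_file_sizes`.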

3. To print files to standard output (cat)

Usage: hdfs dfs -cat URI [URI ...]

Copies source paths to stdout.
hdfs dfs -cat hdfs://localhost/user/root/hadoopfile1
hdfs dfs -cat file:///root/localfile2 /user/root/localfile3
hdfs dfs -cat hdfs://localhost/localfile hdfs://localhost/localfile1

Exit Code:
Returns 0 on success and -1 on error.

4. copyToLocal

Usage: hdfs dfs -copyToLocal URI <localdst>

Similar to get command, except that the destination is restricted to a local file reference.
 hdfs dfs -copyToLocal /user/root/hadoopfile
 hdfs dfs -copyToLocal hdfs://localhost/user/root/hadoopfile

5. copyFromLocal

Usage: hdfs dfs -copyFromLocal [-f] <localsrc> URI

Similar to put command, except that the source is restricted to a local file reference.
 -f: overwrite the destination if it already exists

 hdfs dfs -copyFromLocal ./localfile3
 hdfs dfs -copyFromLocal file:///root/localfile3 hdfs://localhost/user/root/hadoopfile/

6. cp

Usage: hdfs dfs -cp [-f] URI [URI ...] <dest>

Copy files from source to destination. This command allows multiple sources as well in which case the destination must be a directory.
hdfs dfs -cp /user/root/localfile1 /user/root/localfile4
hdfs dfs -cp /user/root/localfile1 /user/root/localfile2 /user/root/hadoopfile
hdfs dfs -cp hdfs://localhost/user/root/hadoopfile hdfs://localhost/user/root/hadoopfile4

Exit Code:
Returns 0 on success and -1 on error.

7. df

Usage: hdfs dfs -df [-h] URI [URI ...]

Displays free space.
 -h: format file sizes in a “human-readable” fashion (e.g., 64.0m instead of 67108864).

hdfs dfs -df -h /user/root

8. du

Usage: hdfs dfs -du [-s] [-h] URI [URI ...]

Displays the sizes of the files and directories contained in the given directory, or the length of the file if the path is a file.
 -s: display an aggregate summary of file lengths rather than the individual files.
 -h: format file sizes in a “human-readable” fashion (e.g., 64.0m instead of 67108864).

hdfs dfs -du /user/root
hdfs dfs -du hdfs://localhost/user/root/hadoopfile/

Exit Code:
Returns 0 on success and -1 on error.

9. get

Usage: hdfs dfs -get <src> <localdst>

Copy files to the local file system.
 hdfs dfs -get /user/root/hadoopfile
 hdfs dfs -get hdfs://localhost/user/root/hadoopfile localfiles

Exit Code:
Returns 0 on success and -1 on error.

10. help

Usage: hdfs dfs -help

Return usage output.

11. moveFromLocal

Usage: hdfs dfs -moveFromLocal <localsrc> <dst>

Similar to put command, except that the source localsrc is deleted after it’s copied.
 hdfs dfs -moveFromLocal localfile /user/root/

12. moveToLocal

Usage: hdfs dfs -moveToLocal [-crc] <src> <dst>

Displays a “Not implemented yet” message. 

13. mv

Usage: hdfs dfs -mv URI [URI ...] <dest>

Moves files from source to destination. This command allows multiple sources as well in which case the destination needs to be a directory. Moving files across file systems is not permitted.
 hdfs dfs -mv /user/root/localfile1 /user/root/localfile5

Exit Code:
Returns 0 on success and -1 on error.

14. put

Usage: hdfs dfs -put <localsrc> ... <dst>

Copy single src, or multiple srcs from local file system to the destination file system. Also reads input from stdin and writes to destination file system.

 hdfs dfs -put localfile5 /user/root/localfile6
 hdfs dfs -put localfile1 localfile2 /user/root/hadoopfile
 hdfs dfs -put localfile hdfs://localhost/hadoop/hadoopfile

Exit Code:
Returns 0 on success and -1 on error.
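Reading from stdin is requested by giving "-" as the source. A minimal sketch, where `put_string` is a hypothetical helper name and the destination path is only an illustration:

```shell
# Hypothetical helper: write a string (plus a trailing newline) to an HDFS
# file by piping it to `hdfs dfs -put`; "-" as the source means stdin.
put_string() {
    printf '%s\n' "$1" | hdfs dfs -put - "$2"
}
```

It would be called as `put_string "hello" /user/root/greeting.txt`.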

15. rm

Usage: hdfs dfs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]

Delete files specified as args.
 -f: do not display a diagnostic message or modify the exit status to reflect an error if the file does not exist.
 -R: deletes the directory and any content under it recursively.
 -r: equivalent to -R.
 -skipTrash: bypass trash, if enabled, and delete the specified file(s) immediately. This can be useful when it is necessary to delete files from an over-quota directory.

 hdfs dfs -rm hdfs://localhost/file /user/localhost/emptydir

Exit Code:
Returns 0 on success and -1 on error.

16. tail

Usage: hdfs dfs -tail [-f] URI

Displays last kilobyte of the file to stdout.
-f: output appended data as the file grows, as in Unix.

hdfs dfs -tail pathname

Exit Code:
Returns 0 on success and -1 on error.

17. test

Usage: hdfs dfs -test -[defsz] URI

 -d: if the path is a directory, return 0.
 -e: if the path exists, return 0.
 -f: if the path is a file, return 0.
 -s: if the path is not empty, return 0.
 -z: if the file is zero length, return 0.

 hdfs dfs -test -e filename
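Since -test reports its result only through the exit status, it fits directly into a shell conditional. The sketch below uploads a file only when the destination does not exist yet; `put_if_absent` is a hypothetical helper name, and the check and the upload are two separate steps, so another writer could still create the file in between.

```shell
# Hypothetical helper: upload a local file unless the HDFS destination
# already exists. `hdfs dfs -test -e` returns 0 when the path exists.
put_if_absent() {
    if hdfs dfs -test -e "$2"; then
        echo "skipped: $2 already exists"
    else
        hdfs dfs -put "$1" "$2" && echo "uploaded: $2"
    fi
}
```

It would be called as `put_if_absent localfile /user/root/localfile6`.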

18. text

Usage: hdfs dfs -text <src>

Takes a source file and outputs the file in text format.

19. touchz

Usage: hdfs dfs -touchz URI [URI ...]

Create a file of zero length.
hdfs dfs -touchz pathname

Exit Code:
Returns 0 on success and -1 on error.

20. appendToFile

Usage: hdfs dfs -appendToFile <localsrc> ... <dst>

Append single src, or multiple srcs from local file system to the destination file system. Also reads input from stdin and appends to destination file system.
 hdfs dfs -appendToFile localfile /user/hadoop/localfile2
 hdfs dfs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile
 hdfs dfs -appendToFile localfile hdfs://localhost/hadoop/localfile2 

21. checksum

Usage: hdfs dfs -checksum URI

Returns the checksum information of a file.
hdfs dfs -checksum hdfs://localhost/file1

22. setrep

Usage: hdfs dfs -setrep [-R] [-w] <numReplicas> <path> [URI ...]

Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.
 -w: requests that the command wait for the replication to complete. This can potentially take a very long time.
 -R: accepted for backwards compatibility. It has no effect.

 hdfs dfs -setrep -w 3 /user/root/dir1

Exit Code:
Returns 0 on success and -1 on error.

23. stat

Usage: hdfs dfs -stat [format] <path> ...

Print statistics about the file/directory at <path> in the specified format. Format accepts file size in blocks (%b), type (%F), group name of owner (%g), name (%n), block size (%o), replication (%r), user name of owner (%u), and modification date (%y, %Y). %y shows the UTC date as “yyyy-MM-dd HH:mm:ss” and %Y shows milliseconds since January 1, 1970 UTC. If the format is not specified, %y is used by default.
 hdfs dfs -stat "%F %u:%g %b %y %n" /file

Exit Code:
Returns 0 on success and -1 on error.
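The %Y format is convenient in scripts because it is a plain number. As a rough sketch, the age of a path in seconds can be derived from it; `path_age_seconds` is a hypothetical helper name, and `date +%s` is assumed to be available for the current time.

```shell
# Hypothetical helper: print the age of a path in whole seconds, using %Y
# (milliseconds since the epoch). The optional second argument is the
# current time in seconds; it defaults to `date +%s`.
path_age_seconds() {
    mtime_ms=$(hdfs dfs -stat "%Y" "$1") || return 1
    now_s=${2:-$(date +%s)}
    echo $(( now_s - mtime_ms / 1000 ))
}
```

It would be called as `path_age_seconds /file`.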

24. count

Usage: hdfs dfs -count [-q] <paths>

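Count the number of directories, files and bytes under the paths that match the specified file pattern. The output columns with -count are: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME. With -count -q, the quota columns QUOTA, REMAINING_QUOTA, SPACE_QUOTA and REMAINING_SPACE_QUOTA are printed before them.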

 hdfs dfs -count hdfs://localhost/file1 hdfs://localhost/file2
 hdfs dfs -count -q hdfs://localhost/file1

Exit Code:
Returns 0 on success and -1 on error.
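The four columns of plain -count output (without -q) can be labeled with a small filter. This is only a sketch; `label_counts` is a hypothetical helper name, and path names are assumed not to contain whitespace.

```shell
# Hypothetical helper: read plain `hdfs dfs -count` output (no -q) on stdin
# and print one labeled line per path.
# Columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
label_counts() {
    awk 'NF >= 4 { printf "%s: dirs=%s files=%s bytes=%s\n", $4, $1, $2, $3 }'
}
```

Usage: `hdfs dfs -count /user/root | label_counts`.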

25. getmerge

Usage: hdfs dfs -getmerge <src> <localdst> [addnl]

Takes a source directory and a destination file as input and concatenates files in src into the destination local file. Optionally addnl can be set to enable adding a newline character at the end of each file.
