
Hadoop HDFS Common Commands Summary

2019-01-23 14:44 | Original | Hadoop
Author: dave

1 Hadoop Administration Commands

The Hadoop administration commands live in the HADOOP_HOME/sbin and HADOOP_HOME/bin directories:

[cndba@hadoopmaster sbin]$ pwd
/home/cndba/hadoop/sbin
[cndba@hadoopmaster sbin]$ ll
total 108
-rwxr-xr-x. 1 cndba cndba 2756 Aug  2 12:30 distribute-exclude.sh
drwxr-xr-x. 4 cndba cndba   36 Aug  2 12:50 FederationStateStore
-rwxr-xr-x. 1 cndba cndba 1983 Aug  2 12:27 hadoop-daemon.sh
-rwxr-xr-x. 1 cndba cndba 2522 Aug  2 12:27 hadoop-daemons.sh
-rwxr-xr-x. 1 cndba cndba 1542 Aug  2 12:31 httpfs.sh
-rwxr-xr-x. 1 cndba cndba 1500 Aug  2 12:28 kms.sh
-rwxr-xr-x. 1 cndba cndba 1841 Aug  2 12:52 mr-jobhistory-daemon.sh
-rwxr-xr-x. 1 cndba cndba 2086 Aug  2 12:30 refresh-namenodes.sh
-rwxr-xr-x. 1 cndba cndba 1779 Aug  2 12:27 start-all.cmd
-rwxr-xr-x. 1 cndba cndba 2221 Aug  2 12:27 start-all.sh
-rwxr-xr-x. 1 cndba cndba 1880 Aug  2 12:30 start-balancer.sh
-rwxr-xr-x. 1 cndba cndba 1401 Aug  2 12:30 start-dfs.cmd
-rwxr-xr-x. 1 cndba cndba 5170 Aug  2 12:30 start-dfs.sh
-rwxr-xr-x. 1 cndba cndba 1793 Aug  2 12:30 start-secure-dns.sh
-rwxr-xr-x. 1 cndba cndba 1571 Aug  2 12:50 start-yarn.cmd
-rwxr-xr-x. 1 cndba cndba 3342 Aug  2 12:50 start-yarn.sh
-rwxr-xr-x. 1 cndba cndba 1770 Aug  2 12:27 stop-all.cmd
-rwxr-xr-x. 1 cndba cndba 2166 Aug  2 12:27 stop-all.sh
-rwxr-xr-x. 1 cndba cndba 1783 Aug  2 12:30 stop-balancer.sh
-rwxr-xr-x. 1 cndba cndba 1455 Aug  2 12:30 stop-dfs.cmd
-rwxr-xr-x. 1 cndba cndba 3898 Aug  2 12:30 stop-dfs.sh
-rwxr-xr-x. 1 cndba cndba 1756 Aug  2 12:30 stop-secure-dns.sh
-rwxr-xr-x. 1 cndba cndba 1642 Aug  2 12:50 stop-yarn.cmd
-rwxr-xr-x. 1 cndba cndba 3083 Aug  2 12:50 stop-yarn.sh
-rwxr-xr-x. 1 cndba cndba 1982 Aug  2 12:27 workers.sh
-rwxr-xr-x. 1 cndba cndba 1814 Aug  2 12:50 yarn-daemon.sh
-rwxr-xr-x. 1 cndba cndba 2328 Aug  2 12:50 yarn-daemons.sh
[cndba@hadoopmaster sbin]$ ll ../bin
total 944
-rwxr-xr-x. 1 cndba cndba 421664 Aug  2 12:50 container-executor
-rwxr-xr-x. 1 cndba cndba   8580 Aug  2 12:27 hadoop
-rwxr-xr-x. 1 cndba cndba  11078 Aug  2 12:27 hadoop.cmd
-rwxr-xr-x. 1 cndba cndba  11026 Aug  2 12:30 hdfs
-rwxr-xr-x. 1 cndba cndba   8081 Aug  2 12:30 hdfs.cmd
-rwxr-xr-x. 1 cndba cndba   6237 Aug  2 12:52 mapred
-rwxr-xr-x. 1 cndba cndba   6311 Aug  2 12:52 mapred.cmd
-rwxr-xr-x. 1 cndba cndba 451360 Aug  2 12:50 test-container-executor
-rwxr-xr-x. 1 cndba cndba  11888 Aug  2 12:50 yarn
-rwxr-xr-x. 1 cndba cndba  12840 Aug  2 12:50 yarn.cmd
[cndba@hadoopmaster sbin]$

1.1 Starting and Stopping Hadoop

start-all.sh starts all daemons; it is equivalent to running start-dfs.sh followed by start-yarn.sh.
However, start-all.sh is generally not recommended, because launching everything through this all-in-one wrapper script is known to cause problems.

stop-all.sh stops all daemons.
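
In practice, the HDFS and YARN layers are started separately with their own scripts; a minimal sketch, assuming HADOOP_HOME/sbin is on the PATH:

```shell
# Start the HDFS daemons (NameNode, DataNodes, SecondaryNameNode)
start-dfs.sh

# Start the YARN daemons (ResourceManager, NodeManagers)
start-yarn.sh

# List the running Java daemons to confirm startup
jps
```

stop-yarn.sh and stop-dfs.sh shut the two layers down again.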

1.2 Starting Individual Daemons

sbin/start-dfs.sh
---------------
    sbin/hadoop-daemons.sh --config .. --hostname .. start namenode ...
    sbin/hadoop-daemons.sh --config .. --hostname .. start datanode ...
    sbin/hadoop-daemons.sh --config .. --hostname .. start secondarynamenode ...
    sbin/hadoop-daemons.sh --config .. --hostname .. start zkfc ...

sbin/start-yarn.sh
--------------
    libexec/yarn-config.sh
    sbin/yarn-daemon.sh --config $YARN_CONF_DIR start resourcemanager
    sbin/yarn-daemons.sh --config $YARN_CONF_DIR start nodemanager
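
In Hadoop 3.x the hadoop-daemon.sh/yarn-daemon.sh wrappers are deprecated in favor of the --daemon option (it appears in the hdfs -help output later in this article); a sketch for operating on a single daemon:

```shell
# Hadoop 3.x style: start, query, and stop one daemon at a time
hdfs --daemon start datanode      # start the local DataNode
hdfs --daemon status datanode     # report whether it is running
hdfs --daemon stop datanode       # stop it again

# YARN daemons follow the same pattern
yarn --daemon start nodemanager
```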

1.3 Rebalancing Data

HDFS data can become very unevenly distributed across DataNodes, especially after a DataNode fails or new DataNodes are added. The NameNode's placement policy when writing new blocks can also leave the distribution skewed.
The following command rebalances the distribution of blocks across DataNodes:

[cndba@hadoopmaster ~]$ start-balancer.sh
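
start-balancer.sh also accepts a -threshold argument: the maximum deviation, in percentage points, of a DataNode's utilization from the cluster average before it is considered out of balance (the default is 10). A sketch:

```shell
# Rebalance until every DataNode's utilization is within
# 5 percentage points of the cluster average (default: 10)
start-balancer.sh -threshold 5

# The balancer can be stopped early if it runs too long
stop-balancer.sh
```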

2 Common HDFS Commands

2.1 All hdfs Commands

All of the hdfs dfs commands can be viewed with -help:

[cndba@hadoopmaster ~]$ hdfs dfs -help
Usage: hadoop fs [generic options]
    [-appendToFile <localsrc> ... <dst>]
    [-cat [-ignoreCrc] <src> ...]
    [-checksum <src> ...]
    [-chgrp [-R] GROUP PATH...]
    [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
    [-chown [-R] [OWNER][:[GROUP]] PATH...]
    [-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
    [-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
    [-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
    [-createSnapshot <snapshotDir> [<snapshotName>]]
    [-deleteSnapshot <snapshotDir> <snapshotName>]
    [-df [-h] [<path> ...]]
    [-du [-s] [-h] [-v] [-x] <path> ...]
    [-expunge]
    [-find <path> ... <expression> ...]
    [-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-getfacl [-R] <path>]
    [-getfattr [-R] {-n name | -d} [-e en] <path>]
    [-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
    [-head <file>]
    [-help [cmd ...]]
    [-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
    [-mkdir [-p] <path> ...]
    [-moveFromLocal <localsrc> ... <dst>]
    [-moveToLocal <src> <localdst>]
    [-mv <src> ... <dst>]
    [-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
    [-renameSnapshot <snapshotDir> <oldName> <newName>]
    [-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
    [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
    [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
    [-setfattr {-n name [-v value] | -x name} <path>]
    [-setrep [-R] [-w] <rep> <path> ...]
    [-stat [format] <path> ...]
    [-tail [-f] <file>]
    [-test -[defsz] <path>]
    [-text [-ignoreCrc] <src> ...]
    [-touchz <path> ...]
    [-truncate [-w] <length> <path> ...]
    [-usage [cmd ...]]

Let's look at a few of the most commonly used commands.


2.2 Listing the Contents of a Directory

hdfs dfs -ls [directory]
hdfs dfs -ls -R /    # recursively list the directory tree

2.3 Creating a New Directory in HDFS

[cndba@hadoopmaster ~]$ hdfs dfs -mkdir /dave
[cndba@hadoopmaster ~]$ hdfs dfs -mkdir -p /oracle/mysql

[cndba@hadoopmaster ~]$ hdfs dfs -ls -R /
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /dave
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle/mysql

Note that the directories created here are materialized in the DFS storage directories on the DataNode nodes; they can also be inspected through the web UI.

2.4 Creating an Empty File in an HDFS Directory

Use the touchz command:

[cndba@hadoopslave1 ~]$ hdfs dfs -touchz /dave/cndba.txt
[cndba@hadoopmaster current]$ hdfs dfs -ls -R /
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:51 /dave
-rw-r--r--   2 cndba supergroup          0 2019-01-23 21:51 /dave/cndba.txt
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle/mysql

To tell whether an HDFS entry is a file or a directory, look at the first character of the permission string: d marks a directory.
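
The same file-or-directory check can be scripted with -test, whose flags appear in the -help listing above; a minimal sketch:

```shell
# -test -d: exit status 0 if the path is a directory
hdfs dfs -test -d /dave && echo "/dave is a directory"

# -test -e: exit status 0 if the path exists at all
hdfs dfs -test -e /dave/cndba.txt && echo "cndba.txt exists"
```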

2.5 Uploading a Local File to HDFS

Syntax:
hdfs dfs -put [local path] [HDFS directory]

[cndba@hadoopmaster ~]$ ps -ef > ps.txt
[cndba@hadoopmaster ~]$ pwd
/home/cndba
[cndba@hadoopmaster ~]$ hdfs dfs -put /home/cndba/ps.txt /dave
[cndba@hadoopmaster ~]$ hdfs dfs -ls -R /
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:07 /dave
-rw-r--r--   2 cndba supergroup          0 2019-01-23 21:51 /dave/cndba.txt
-rw-r--r--   2 cndba supergroup      20659 2019-01-23 22:07 /dave/ps.txt
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle/mysql
[cndba@hadoopmaster ~]$

2.6 Uploading a Local Directory to HDFS

    hdfs dfs -put [local directory] [HDFS directory]
    hdfs dfs -put /home/t/dir_name /user/t

[cndba@hadoopmaster ~]$ hdfs dfs -put /tmp /tmp
put: /tmp/.ICE-unix/16020 (No such device or address)
put: /tmp/.ICE-unix/5847 (No such device or address)
put: /tmp/.ICE-unix/6817 (No such device or address)
put: /tmp/.X11-unix/X0 (No such device or address)
put: /tmp/.esd-0: Permission denied
put: /tmp/.esd-1000: Permission denied
put: /tmp/.esd-989: Permission denied
……
[cndba@hadoopmaster ~]$ hdfs dfs -ls -R /
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:07 /dave
-rw-r--r--   2 cndba supergroup          0 2019-01-23 21:51 /dave/cndba.txt
-rw-r--r--   2 cndba supergroup      20659 2019-01-23 22:07 /dave/ps.txt
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle/mysql
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:10 /tmp
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:10 /tmp/.ICE-unix
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:10 /tmp/.Test-unix
-rw-r--r--   2 cndba supergroup         11 2019-01-23 22:10 /tmp/.X0-lock
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:10 /tmp/.X11-unix
……

2.7 Viewing an Existing File

Syntax:
hdfs dfs -cat [file_path]

Example:

[cndba@hadoopmaster ~]$ hdfs dfs -cat /dave/ps.txt
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Jan21 ?        00:00:28 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root         2     0  0 Jan21 ?        00:00:00 [kthreadd]
root         3     2  0 Jan21 ?        00:00:00 [ksoftirqd/0]
root         5     2  0 Jan21 ?        00:00:00 [kworker/0:0H]
root         7     2  0 Jan21 ?        00:00:00 [migration/0]
root         8     2  0 Jan21 ?        00:00:00 [rcu_bh]
root         9     2  0 Jan21 ?        00:02:19 [rcu_sched]
root        10     2  0 Jan21 ?        00:00:00 [lru-add-drain]
root        11     2  0 Jan21 ?        00:00:00 [watchdog/0]
root        12     2  0 Jan21 ?        00:00:00 [watchdog/1]
……

2.8 Downloading an HDFS File to a Local Directory

Syntax:
hdfs dfs -get [HDFS path] [local directory]

Example:

[cndba@hadoopmaster ~]$ hdfs dfs -get /dave/ps.txt /tmp
[cndba@hadoopmaster ~]$ ll /tmp/ps.txt
-rw-r--r--. 1 cndba cndba 20659 Jan 23 22:15 /tmp/ps.txt
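
A related command is -getmerge, which concatenates every file under an HDFS directory into a single local file (see the -help listing above); a sketch using the /dave directory from the earlier examples:

```shell
# Merge all files under /dave into one local file,
# inserting a newline between inputs (-nl)
hdfs dfs -getmerge -nl /dave /tmp/dave-merged.txt
```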

2.9 Deleting a File from HDFS

Syntax:
hdfs dfs -rm [file path]

Example:

[cndba@hadoopmaster ~]$ hdfs dfs -rm /dave/ps.txt
Deleted /dave/ps.txt
[cndba@hadoopmaster ~]$

2.10 Deleting an HDFS Directory (Including Subdirectories)

Syntax:
hdfs dfs -rm -r [directory path]


Example:

[cndba@hadoopmaster ~]$ hdfs dfs -rmr /tmp
rmr: DEPRECATED: Please use '-rm -r' instead.
Deleted /tmp
[cndba@hadoopmaster ~]$ hdfs dfs -ls -R /
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:16 /dave
-rw-r--r--   2 cndba supergroup          0 2019-01-23 21:51 /dave/cndba.txt
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle/mysql
[cndba@hadoopmaster ~]$
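
If HDFS trash is enabled (fs.trash.interval > 0), -rm moves paths into the user's .Trash directory rather than freeing space immediately; the -skipTrash flag from the -help listing above bypasses this. A sketch:

```shell
# Delete immediately, bypassing the trash directory
hdfs dfs -rm -r -skipTrash /tmp

# Without -skipTrash, deleted paths linger under
# /user/<name>/.Trash until the trash interval expires
```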

2.11 Renaming a File in HDFS

[cndba@hadoopmaster ~]$ hdfs dfs -ls -R /
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:16 /dave
-rw-r--r--   2 cndba supergroup          0 2019-01-23 21:51 /dave/cndba.txt
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle/mysql

[cndba@hadoopmaster ~]$ hdfs dfs -mv /dave/cndba.txt /dave/www.cndba.cn.txt

[cndba@hadoopmaster ~]$ hdfs dfs -ls -R /
drwxr-xr-x   - cndba supergroup          0 2019-01-23 22:18 /dave
-rw-r--r--   2 cndba supergroup          0 2019-01-23 21:51 /dave/www.cndba.cn.txt
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle
drwxr-xr-x   - cndba supergroup          0 2019-01-23 21:33 /oracle/mysql

3 Safe Mode

HDFS safe mode is managed through the administrative client, hdfs dfsadmin.


3.1 Entering Safe Mode

The NameNode enters safe mode automatically at startup. Safe mode is a NameNode state in which no modifications to the file system are allowed.
If the system reports "Name node in safe mode", it is currently in this state; usually you only need to wait ten-odd seconds for it to exit on its own.
When necessary, HDFS can be put into safe mode manually with the following command:


[cndba@hadoopmaster ~]$ hdfs dfsadmin -safemode enter
Safe mode is ON
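
-safemode also accepts get and wait, which are handy in scripts that must not run while the NameNode is read-only; a sketch:

```shell
# Report the current safe mode state
hdfs dfsadmin -safemode get

# Block until the NameNode leaves safe mode, then continue
hdfs dfsadmin -safemode wait && echo "safe mode is off"
```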

3.2 Leaving Safe Mode

As noted in the previous section, HDFS enters a temporary safe mode at startup and normally exits it automatically. If it has not exited safe mode, you can leave it manually with the following command:

[cndba@hadoopmaster ~]$ hdfs dfsadmin -safemode leave
Safe mode is OFF
[cndba@hadoopmaster ~]$ hdfs dfsadmin -safemode get
Safe mode is OFF
[cndba@hadoopmaster ~]$

4 Complete hdfs Command Help

The hdfs commands fall into several categories, which can be viewed with -help:

[cndba@hadoopmaster ~]$ hdfs -help
Usage: hdfs [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]

  OPTIONS is none or any of:

--buildpaths                       attempt to add class files from build tree
--config dir                       Hadoop config directory
--daemon (start|status|stop)       operate on a daemon
--debug                            turn on shell script debug mode
--help                             usage information
--hostnames list[,of,host,names]   hosts to use in worker mode
--hosts filename                   list of hosts to use in worker mode
--loglevel level                   set the log4j level for this command
--workers                          turn on worker mode

  SUBCOMMAND is one of:


    Admin Commands:

cacheadmin           configure the HDFS cache
crypto               configure HDFS encryption zones
debug                run a Debug Admin to execute HDFS debug commands
dfsadmin             run a DFS admin client
dfsrouteradmin       manage Router-based federation
ec                   run a HDFS ErasureCoding CLI
fsck                 run a DFS filesystem checking utility
haadmin              run a DFS HA admin client
jmxget               get JMX exported values from NameNode or DataNode.
oev                  apply the offline edits viewer to an edits file
oiv                  apply the offline fsimage viewer to an fsimage
oiv_legacy           apply the offline fsimage viewer to a legacy fsimage
storagepolicies      list/get/set block storage policies

    Client Commands:

classpath            prints the class path needed to get the hadoop jar and the required libraries
dfs                  run a filesystem command on the file system
envvars              display computed Hadoop environment variables
fetchdt              fetch a delegation token from the NameNode
getconf              get config values from configuration
groups               get the groups which users belong to
lsSnapshottableDir   list all snapshottable dirs owned by the current user
snapshotDiff         diff two snapshots of a directory or diff the current directory contents with a snapshot
version              print the version

    Daemon Commands:

balancer             run a cluster balancing utility
datanode             run a DFS datanode
dfsrouter            run the DFS router
diskbalancer         Distributes data evenly among disks on a given node
journalnode          run the DFS journalnode
mover                run a utility to move block replicas across storage types
namenode             run the DFS namenode
nfs3                 run an NFS version 3 gateway
portmap              run a portmap service
secondarynamenode    run the DFS secondary namenode
zkfc                 run the ZK Failover Controller daemon

SUBCOMMAND may print help when invoked w/o parameters or with -h.
[cndba@hadoopmaster ~]$

Copyright notice: This is an original article by the author; do not reproduce it without the author's permission.
