Hadoop:hadoop fs、hadoop dfs与hdfs dfs命令的区别

2017-04-10 10:59

http://

'Hadoop DFS'和'Hadoop FS'的区别
While exploring HDFS, I came across these two syntaxes for querying HDFS:
> hadoop dfs
> hadoop fs

why we have two different syntaxes for a common purpose

为什么会对同一个功能提供两种命令标记?

命令的定义

it seems like there's no difference between the two syntaxes. If we look at the definitions of the two commands (hadoop fs and hadoop dfs) in $HADOOP_HOME/bin/hadoop
从两个命令的定义中(在$HADOOP_HOME/bin/hadoop)可以看到这两者之间似乎没有什么区别。
...  
elif [ "$COMMAND" = "datanode" ] ; then  
  CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'  
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_DATANODE_OPTS"  
elif [ "$COMMAND" = "fs" ] ; then  
  CLASS=org.apache.hadoop.fs.FsShell  
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"  
elif [ "$COMMAND" = "dfs" ] ; then  
  CLASS=org.apache.hadoop.fs.FsShell  
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"  
elif [ "$COMMAND" = "dfsadmin" ] ; then  
  CLASS=org.apache.hadoop.hdfs.tools.DFSAdmin  
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"  
...  

貌似更有道理的解释

Unconvinced, and these excerpts made more sense to me:

FS relates to a generic file system which can point to any file systems like local, HDFS etc. But dfs is very specific to HDFS. So when we use FS it?can perform operation with from/to local or hadoop distributed file system to destination. But specifying DFS operation relates to?HDFS. 这个理由并不完全信服,下面这个解释貌似更有道理:
FS涉及到一个通用的文件系统,可以指向任何的文件系统如local,HDFS等。
但是DFS仅是针对HDFS的。
那么什么时候用FS呢?可以在本地与hadoop分布式文件系统的交互操作中使用。特定的DFS指令与HDFS有关。

Hadoop文档对这两个不同的shell的描述

Below are two excerpts from the Hadoop documentation that describe these two as different shells.

下面是两个摘录Hadoop文档,描述这两个不同的shell。

  1. FS Shell  
  2. The FileSystem (FS) shell is invoked by bin/hadoop fs. All the FS shell commands take path URIs as arguments. The URI format is scheme://autority/path. For HDFS the scheme is hdfs, and for the local filesystem the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (given that your configuration is set to point to hdfs://namenodehost). Most of the commands in FS shell behave like corresponding Unix commands.? 
  1. DFShell  
  2. The HDFS shell is invoked by bin/hadoop dfs. All the HDFS shell commands take path URIs as arguments. The URI format is scheme://autority/path. For HDFS the scheme is hdfs, and for the local filesystem the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenode:namenodeport/parent/child or simply as /parent/child (given that your configuration is set to point to namenode:namenodeport). Most of the commands in HDFS shell behave like corresponding Unix commands.? 

So, based on the above, we can conclude that it all depends on the scheme configuration. When using these two commands with absolute URI (i.e. scheme://a/b) the?behavior?shall be identical. Only it's the default configured scheme value for file and hdfs for fs and dfs respectively, which is the cause for difference in?behavior. 
由上述内容可以看出来,这两个命令依赖于模式的配置。当使用绝对URI(如scheme://a/b)这两个命令式相同的。只有默认的模式配置参数对dfs和fs起作用。
[ http://java.dzone.com/articles/difference-between-hadoop-dfs]
皮皮blog

stackoverflow的解释

Hadoop fs:使用面最广,可以操作任何文件系统。

hadoop dfs与hdfs dfs:只能操作HDFS文件系统相关(包括与Local FS间的操作),前者已经Deprecated,一般使用后者。

Following are the three commands which appears same but have minute differences

  1. hadoop fs {args}
  2. hadoop dfs {args}
  3. hdfs dfs {args}

    hadoop fs <args>
    

FS relates to a generic file system which can point to any file systems like local, HDFS etc. So this can be used when you are dealing with different file systems such as Local FS, HFTP FS, S3 FS, and others

  hadoop dfs <args>

dfs is very specific to HDFS. would work for operation relates to HDFS. This has been deprecated and we should use hdfs dfs instead.

  hdfs   dfs <args>

same as 2nd i.e would work for all the operations related to HDFS and is the recommended command instead of hadoop dfs

below is the list categorized as HDFS commands.

  **#hdfs commands**
  namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups

So even if you use Hadoop dfs , it will look locate hdfs and delegate that command to hdfs dfs

[何时使用hadoop fs、hadoop dfs与hdfs dfs命令]

from: ref: