Copying Data Using the webhdfs:// Protocol
Describes how to copy data from an HDFS cluster to an HPE Ezmeral Data Fabric cluster using the webhdfs:// protocol.
Before you can copy data from an HDFS cluster to an HPE Ezmeral Data Fabric cluster using the webhdfs:// protocol, you must configure the HPE Ezmeral Data Fabric cluster to access the HDFS cluster. To do this, complete the steps listed in Configuring a HPE Ezmeral Data Fabric Cluster to Access an HDFS Cluster for the security scenario that best describes your HDFS and HPE Ezmeral Data Fabric clusters, and then complete the steps listed under Verifying Access to an HDFS Cluster.
The HDFS cluster must have WebHDFS enabled. Verify that the following parameter exists in the hdfs-site.xml file and that its value is set to true:
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
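As a rough way to check this setting from a shell, the sketch below scans a copy of hdfs-site.xml for the property shown above. The function name webhdfs_enabled_in is hypothetical and the file location varies by distribution. The text match assumes that <value> directly follows <name> and it does not account for XML comments, so treat it as a convenience check rather than a substitute for reading the file.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given hdfs-site.xml
# contains dfs.webhdfs.enabled with the value true.
# $1: path to hdfs-site.xml (location is an assumption; it varies by install)
webhdfs_enabled_in() {
  # Flatten the file to one line so the name/value pair can be matched
  # even when the two elements sit on separate lines.
  tr -d '\n' < "$1" | grep -Eq \
    '<name>[[:space:]]*dfs\.webhdfs\.enabled[[:space:]]*</name>[[:space:]]*<value>[[:space:]]*true[[:space:]]*</value>'
}

# Example call (the path is an assumption):
#   webhdfs_enabled_in /etc/hadoop/conf/hdfs-site.xml && echo "WebHDFS enabled"
```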
The copy command uses the following values:
<NameNode> - the IP address or hostname of the NameNode in the HDFS cluster
<NameNode HTTP Port> - the HTTP port on the NameNode in the HDFS cluster
<HDFS path> - the path to the HDFS directory from which you plan to copy data
<Data Fabric filesystem path> - the path in the HPE Ezmeral Data Fabric cluster to which you plan to copy HDFS data
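Before starting a copy, it can help to confirm that the NameNode host, HTTP port, and HDFS path are correct. The sketch below builds a WebHDFS REST URL that lists the source directory. The function name webhdfs_list_url is hypothetical, and the commented curl call assumes an unsecured cluster that is reachable from the current host; secure clusters need the authentication that matches their security scenario.

```shell
# Hypothetical helper: print the WebHDFS REST URL that lists a directory.
# $1: NameNode host, $2: NameNode HTTP port, $3: HDFS path
webhdfs_list_url() {
  # Strip one leading slash from the path so the URL has no double slash.
  printf 'http://%s:%s/webhdfs/v1/%s?op=LISTSTATUS\n' "$1" "$2" "${3#/}"
}

# Example probe (requires network access to the NameNode):
#   curl -i "$(webhdfs_list_url nn2 50070 /user/sara)"
```

A JSON directory listing in the response suggests that WebHDFS is reachable with these values; an error response or a refused connection points to a wrong host, port, or path, or to a cluster that requires authentication.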
To copy data from HDFS to the HPE Ezmeral Data Fabric file system using the webhdfs:// protocol, run the following command:
hadoop distcp webhdfs://<NameNode>:<NameNode HTTP Port>/<HDFS path> maprfs:///<Data Fabric filesystem path>
For example:
hadoop distcp webhdfs://nn2:50070/user/sara maprfs:///user/sara
Note that the three slashes in maprfs:///... are required.
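When the copy is scripted, the source and target URIs can be assembled from the values described earlier. The following is a minimal sketch, not part of the product tooling: the helper names webhdfs_uri and maprfs_uri are hypothetical, and the only behavior they add is stripping one leading slash from each path so that the target always begins with exactly three slashes.

```shell
# Hypothetical helper: build the webhdfs:// source URI.
# $1: NameNode host, $2: NameNode HTTP port, $3: HDFS path
webhdfs_uri() {
  printf 'webhdfs://%s:%s/%s\n' "$1" "$2" "${3#/}"
}

# Hypothetical helper: build the maprfs:/// target URI.
# $1: Data Fabric file system path
maprfs_uri() {
  # The format string supplies the three required slashes.
  printf 'maprfs:///%s\n' "${1#/}"
}

# Example invocation (requires a configured Hadoop client):
#   hadoop distcp "$(webhdfs_uri nn2 50070 /user/sara)" "$(maprfs_uri /user/sara)"
```

With the values from the example above, the helpers produce the same two URIs that appear in the sample command.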