Copy Data Using the hdfs:// Protocol
Describes the procedure to copy data from an HDFS cluster to an HPE Data Fabric cluster using the hdfs:// protocol.
Before you can copy data from an HDFS cluster to an HPE Data Fabric cluster using the hdfs:// protocol, you must configure the HPE Data Fabric cluster to access the HDFS cluster. To do this, complete the steps listed in Configuring a HPE Data Fabric Cluster to Access an HDFS Cluster for the security scenario that best describes your HDFS and HPE Data Fabric clusters, and then complete the steps listed under Verifying Access to an HDFS Cluster.
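As a quick check that this configuration is in place, you can list an HDFS directory from a node in the HPE Data Fabric cluster. This is a minimal sketch rather than part of the referenced verification procedure; the NameNode address nn1:8020 is a placeholder value:

hadoop fs -ls hdfs://nn1:8020/

If the listing succeeds, the cluster can reach the HDFS NameNode over the hdfs:// protocol.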
In the commands below:

- <NameNode> - the IP address or hostname of the NameNode in the HDFS cluster
- <NameNode Port> - the port for connecting to the NameNode in the HDFS cluster
- <HDFS path> - the path to the HDFS directory from which you plan to copy data
- <Data Fabric File system path> - the path in the HPE Data Fabric cluster to which you plan to copy HDFS data
- <file> - a file in the HDFS path
To copy data from HDFS to the HPE Data Fabric file system using the hdfs:// protocol, complete the following steps:
- Run the following Hadoop command to determine if the HPE Data Fabric cluster can read the contents of a file in a specified directory on the HDFS cluster:

  hadoop fs -cat hdfs://<NameNode>:<NameNode Port>/<HDFS path>/<file>

  Example:

  hadoop fs -cat hdfs://nn1:8020/user/sara/contents.xml

- If the HPE Data Fabric cluster can read the contents of the file, run the distcp command to copy the data from the HDFS cluster to the HPE Data Fabric cluster:

  hadoop distcp hdfs://<NameNode>:<NameNode Port>/<HDFS path> maprfs://<Data Fabric File system path>

  Example:

  hadoop distcp hdfs://nn1:8020/user/sara maprfs:///user/sara
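The two steps above can also be combined into a small shell script that verifies read access before starting the copy. The following is a minimal sketch rather than part of the documented procedure; the NameNode address (nn1:8020), the source path (/user/sara), the destination path (/user/sara), and the test file (contents.xml) are placeholder values that you would replace with your own:

#!/bin/bash
# Minimal sketch: verify read access to the HDFS source directory, then copy it
# to the HPE Data Fabric file system. All hosts and paths below are placeholders.

SRC="hdfs://nn1:8020/user/sara"
DST="maprfs:///user/sara"
TEST_FILE="contents.xml"

# Step 1: confirm that the Data Fabric cluster can read a file in the HDFS directory.
if hadoop fs -cat "${SRC}/${TEST_FILE}" > /dev/null; then
    # Step 2: copy the HDFS directory to the Data Fabric cluster.
    hadoop distcp "${SRC}" "${DST}"
else
    echo "Cannot read ${SRC}/${TEST_FILE}; check the HDFS access configuration." >&2
    exit 1
fi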