Establishing Connections to the File System
The APIs for establishing connections to the file system and returning file-system handles are:
- hadoop-<version>:hdfsConnect()
- hadoop-<version>:hdfsConnectAsUser()
  NOTE: This API ignores the impersonation request and is therefore equivalent to hdfsConnect().
- hadoop-<version>:hdfsConnectNewInstance()
The hdfsConnectAsNewUserInstance() API is not supported for connections to file-system fileservers.
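For illustration, here is a minimal sketch of the first two calls. The port number (7222, the same value used in the later examples) and the user name ("mapruser") are placeholder values, not part of the original documentation:
// Minimal sketch: connect to the default cluster (placeholder port 7222).
hdfsFS fs = hdfsConnect("default", 7222);
// hdfsConnectAsUser() takes a user name ("mapruser" is a placeholder), but the
// impersonation request is ignored, so this call is equivalent to hdfsConnect().
hdfsFS fsUser = hdfsConnectAsUser("default", 7222, "mapruser");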
These APIs behave in the same way:
- If default is specified for the host parameter, the APIs connect to the first cluster listed in the file MAPR_HOME/conf/mapr-clusters.conf. (MAPR_HOME defaults to /opt/mapr.)
- If a hostname or IP address is specified for the host parameter, the APIs:
  - Look in MAPR_HOME/conf/mapr-clusters.conf on the client node to match the specified hostname or IP address to a CLDB host and port.
  - If they find a match, they try to connect to the cluster, and all standard features for connections to Data Fabric clusters are available. These features include high availability across CLDBs and secure connections.
  - If they do not find a match, or if they cannot locate a mapr-clusters.conf file, they try to connect to the CLDB host specified in the call to create the connection. However, the standard features for connections to Data Fabric clusters are not available. For example, if the cluster is secured, the connection will fail.
It is possible to have more than one open connection at a time. For each connection, simply return the file-system handle to a different instance of hdfsFS, as in this example:
//Connect to Cluster 1 (picked up from /opt/mapr/conf/mapr-clusters.conf)
hdfsFS fs1 = hdfsConnectNewInstance("default", 7222);
//Connect to Cluster 2
hdfsFS fs2 = hdfsConnectNewInstance("n1c", 7222);
//Connect to Cluster 3
hdfsFS fs3 = hdfsConnectNewInstance("n1d", 7222);
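The original example does not check the returned handles. Assuming the usual libhdfs convention that the connect calls return NULL on failure, you may want to verify each connection before using it; a minimal sketch (the error message text is illustrative):
// Optional sanity check: the connect calls return NULL on failure.
if (!fs1 || !fs2 || !fs3) {
    fprintf(stderr, "Failed to connect to one or more clusters!\n");
    exit(-1);
}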
You can then obtain file handles for files in each connected cluster, as in the example below. For each cluster, the example code calls hdfsOpenFile(), passing in the handle to the file system, the absolute path to a file (the file is created before being opened, if it does not already exist), and a file-access flag that specifies opening the file in write-only mode. This mode truncates existing files to offset 0, deleting their content. Ignore the last three parameters for this example. hdfsOpenFile() returns a handle to the file, or NULL if the open operation fails.
//Create files for write operations on all clusters
const char* writePath = "/tmp/write-file1.txt";
hdfsFile writeFile1 = hdfsOpenFile(fs1, writePath, O_WRONLY, 0, 0, 0);
if (!writeFile1) {
fprintf(stderr, "Failed to open %s for writing on Cluster 1!\n", writePath);
exit(-2);
}
hdfsFile writeFile2 = hdfsOpenFile(fs2, writePath, O_WRONLY, 0, 0, 0);
if (!writeFile2) {
fprintf(stderr, "Failed to open %s for writing on Cluster 2!\n", writePath);
exit(-2);
}
hdfsFile writeFile3 = hdfsOpenFile(fs3, writePath, O_WRONLY, 0, 0, 0);
if (!writeFile3) {
fprintf(stderr, "Failed to open %s for writing on Cluster 3!\n", writePath);
exit(-2);
}
fprintf(stderr, "Opened %s for writing successfully on all 3 clusters...\n", writePath);
After working with the files, close them and disconnect from the file system, as in this example:
// Close all files
if (writeFile1)
hdfsCloseFile(fs1, writeFile1);
if (writeFile2)
hdfsCloseFile(fs2, writeFile2);
if (writeFile3)
hdfsCloseFile(fs3, writeFile3);
// Disconnect from all clusters
hdfsDisconnect(fs1);
hdfsDisconnect(fs2);
hdfsDisconnect(fs3);
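Both hdfsCloseFile() and hdfsDisconnect() return 0 on success and -1 on error, so the calls above can also be checked for errors. A minimal sketch, shown for Cluster 1 only (the error messages are illustrative):
// Example return-code checks for the close and disconnect calls.
if (hdfsCloseFile(fs1, writeFile1) == -1) {
    fprintf(stderr, "Failed to close %s on Cluster 1!\n", writePath);
}
if (hdfsDisconnect(fs1) == -1) {
    fprintf(stderr, "Failed to disconnect from Cluster 1!\n");
}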