Accessing DataTaps in Kubernetes Pods
Describes the generic process for configuring Kubernetes pods to access DataTaps, including considerations and steps for Hadoop 2.x and Hadoop 3.x applications.
About this task
The hpecp-agent observes pod creation. If the pod includes the hpecp.hpe.com/dtap label, the following occurs:
- The hpecp-agent adds a sidecar container that implements the DataTaps. The hpecp-agent also creates an emptyDir volume named dtap-shared-vol. This volume is mounted to the /opt/bdfs directory of both the sidecar container and the application container.
- On startup, the sidecar container prepares the bluedata-dtap.jar file that matches the Hadoop version in the /opt/bdfs directory.
- Because the /opt/bdfs directory in the sidecar DataTap container and in the application container is mounted from the same dtap-shared-vol volume, the application container can also directly access bluedata-dtap.jar in the /opt/bdfs directory.
The following procedure is a generic example only.
- KubeDirector applications included with HPE Ezmeral Runtime Enterprise are preconfigured to be able to access DataTaps, and you need only set the pod label. See Accessing DataTaps in KubeDirector Applications.
- Spark Operator applications must be configured for DataTap access as described in Tutorial: Spark Configuration and Execution on Kubernetes.
- If a pod has the label hpecp.hpe.com/dtap: hadoop2 or hpecp.hpe.com/dtap: hadoop3, the DataTap sidecar container runs until the pod is deleted. In some scenarios, such as when a user submits a Spark Operator application, the application container exits automatically after the application is completed. If the DataTap sidecar container is still running after the application container exits, the pod cannot enter the completed state, so it continues to hold resources that would otherwise be released for use by other pods.
  To ensure that the DataTap sidecar container also exits automatically after the application container exits, use one of the following labels:
  - If the application is Hadoop 2.x, add the label:
    hpecp.hpe.com/dtap: hadoop2-job
  - If the application is Hadoop 3.x, add the label:
    hpecp.hpe.com/dtap: hadoop3-job
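As a rough illustration of the job-style labels, the following fragment sketches a pod that is expected to run to completion. The pod name, container name, image, and command are hypothetical placeholders; only the hpecp.hpe.com/dtap label key and the hadoop3-job value come from the text above. The restartPolicy: Never setting is an assumption about a typical run-to-completion pod, because Kubernetes restarts exited containers under the default Always policy.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dtap-job-example                      # hypothetical pod name
  labels:
    # Job-style label: the DataTap sidecar is expected to exit
    # once the application container exits.
    hpecp.hpe.com/dtap: hadoop3-job
spec:
  restartPolicy: Never                        # assumption: run once, do not restart
  containers:
    - name: app                               # hypothetical container name
      image: example.com/hadoop3-app:latest   # hypothetical image
      # Hypothetical command: list the shared directory, then exit.
      command: ["/bin/sh", "-c", "ls -l /opt/bdfs"]
```

For a Hadoop 2.x application, the same sketch would use the hadoop2-job value instead.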
Procedure
- Add one of the following labels to the YAML file of the pod, depending on the Hadoop version and on whether the DataTap sidecar container should exit together with the application container:
  - If the application is Hadoop 2.x, add one of the following labels:
    hpecp.hpe.com/dtap: hadoop2
    hpecp.hpe.com/dtap: hadoop2-job
  - If the application is Hadoop 3.x, add one of the following labels:
    hpecp.hpe.com/dtap: hadoop3
    hpecp.hpe.com/dtap: hadoop3-job
- In the application container, add bluedata-dtap.jar to the classpath, and then modify the Hadoop core-site.xml file.
  The following example adds the fs.dtap.impl, fs.AbstractFileSystem.dtap.impl, and fs.dtap.impl.disable.cache properties to the core-site.xml file:

    <property>
      <name>fs.dtap.impl</name>
      <value>com.bluedata.hadoop.bdfs.Bdfs</value>
    </property>
    <property>
      <name>fs.AbstractFileSystem.dtap.impl</name>
      <value>com.bluedata.hadoop.bdfs.BdAbstractFS</value>
    </property>
    <property>
      <name>fs.dtap.impl.disable.cache</name>
      <value>false</value>
    </property>
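The labeling and classpath steps might be combined in a pod manifest along the following lines. This is a minimal sketch, not a definitive configuration: the pod name, container name, and image are hypothetical, and using the HADOOP_CLASSPATH environment variable is an assumption that holds only if the application image launches Hadoop through the standard scripts that honor that variable. No volume or volume mount is declared, because the text above states that hpecp-agent creates dtap-shared-vol and mounts it at /opt/bdfs.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dtap-app-example                      # hypothetical pod name
  labels:
    # Long-running variant: the sidecar runs until the pod is deleted.
    hpecp.hpe.com/dtap: hadoop2
spec:
  containers:
    - name: app                               # hypothetical container name
      image: example.com/hadoop2-app:latest   # hypothetical image
      env:
        # Assumption: the image reads HADOOP_CLASSPATH for extra classpath entries.
        - name: HADOOP_CLASSPATH
          value: /opt/bdfs/bluedata-dtap.jar
```

The core-site.xml properties still need to be present in the application image or supplied separately, for example through a mounted configuration file.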