Accessing DataTaps in Kubernetes Pods
Describes the generic process for configuring Kubernetes pods to access DataTaps, including considerations and steps for Hadoop 2.x and Hadoop 3.x applications.
About this task
The hpecp-agent observes pod creation. If the pod includes the hpecp.hpe.com/dtap label, the following occurs:
- The hpecp-agent adds a sidecar container that implements the DataTaps. The hpecp-agent also creates an emptyDir volume named dtap-shared-vol. This volume is mounted to the /opt/bdfs directory of both the sidecar container and the application container.
- On startup, the sidecar container prepares the bluedata-dtap.jar file that matches the Hadoop version and places it in the /opt/bdfs directory.
- Because the /opt/bdfs directory in the sidecar DataTap container and in the application container is mounted from the same volume, dtap-shared-vol, the application container can also directly access bluedata-dtap.jar in the /opt/bdfs directory.
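As a rough illustration of the behavior described above, the pod that results after the hpecp-agent injects the sidecar might look something like the following sketch. Only the label key, the volume name dtap-shared-vol, the emptyDir volume type, and the /opt/bdfs mount path come from this page. The pod name, container names, and images are hypothetical placeholders, and the actual injected specification is determined by the hpecp-agent.

```yaml
# Illustrative sketch only: approximate shape of a pod after sidecar injection.
# Names and images marked "hypothetical" are placeholders, not real values.
apiVersion: v1
kind: Pod
metadata:
  name: example-app                  # hypothetical
  labels:
    hpecp.hpe.com/dtap: hadoop2      # label that triggers the injection
spec:
  containers:
    - name: app                      # hypothetical application container
      image: example/app:latest      # hypothetical
      volumeMounts:
        - name: dtap-shared-vol
          mountPath: /opt/bdfs       # application sees bluedata-dtap.jar here
    - name: dtap-sidecar             # hypothetical name for the injected sidecar
      image: example/dtap:latest     # hypothetical
      volumeMounts:
        - name: dtap-shared-vol
          mountPath: /opt/bdfs       # sidecar writes bluedata-dtap.jar here
  volumes:
    - name: dtap-shared-vol          # created by hpecp-agent
      emptyDir: {}
```

Because both containers mount the same emptyDir volume, a file written by one container under /opt/bdfs is visible to the other for the lifetime of the pod.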
The following procedure is a generic example only.
- KubeDirector applications included with HPE Ezmeral Runtime Enterprise are preconfigured to be able to access DataTaps, and you need only set the pod label. See Accessing DataTaps in KubeDirector Applications.
- Spark Operator applications must be configured for DataTap access as described in Tutorial: Spark Configuration and Execution on Kubernetes.
- If a pod has the label hpecp.hpe.com/dtap: hadoop2 or hpecp.hpe.com/dtap: hadoop3, the DataTap sidecar container runs until the pod is deleted. In some scenarios, such as when a user submits a Spark Operator application, the application container exits automatically after the application completes. If the DataTap sidecar container is still running after the application container exits, the pod cannot enter the completed state, and it continues to hold resources that would otherwise be released for use by other pods.
  To ensure that the DataTap sidecar container also exits automatically after the application container exits, use one of the following labels:
  - If the application is Hadoop 2.x, add the label:
    hpecp.hpe.com/dtap: hadoop2-job
  - If the application is Hadoop 3.x, add the label:
    hpecp.hpe.com/dtap: hadoop3-job
Procedure
- Add one of the following labels to the YAML file of the pod. A pod can carry only one value for the hpecp.hpe.com/dtap label key, so choose the -job variant if the sidecar container must exit when the application container exits:
  - If the application is Hadoop 2.x, add one of the following labels:
    hpecp.hpe.com/dtap: hadoop2
    hpecp.hpe.com/dtap: hadoop2-job
  - If the application is Hadoop 3.x, add one of the following labels:
    hpecp.hpe.com/dtap: hadoop3
    hpecp.hpe.com/dtap: hadoop3-job
- In the application container, add bluedata-dtap.jar to the classpath, and then modify the Hadoop core-site.xml file.
  The following example adds the fs.dtap.impl, fs.AbstractFileSystem.dtap.impl, and fs.dtap.impl.disable.cache properties to the core-site.xml file:
    <property>
      <name>fs.dtap.impl</name>
      <value>com.bluedata.hadoop.bdfs.Bdfs</value>
    </property>
    <property>
      <name>fs.AbstractFileSystem.dtap.impl</name>
      <value>com.bluedata.hadoop.bdfs.BdAbstractFS</value>
    </property>
    <property>
      <name>fs.dtap.impl.disable.cache</name>
      <value>false</value>
    </property>
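The pod-side part of the procedure could be sketched as a manifest like the one below. This is a minimal example under stated assumptions, not a definitive configuration: the pod name, container name, and image are hypothetical, and the use of the HADOOP_CLASSPATH environment variable assumes that the application launches through the standard Hadoop scripts that honor it. Applications that build their classpath differently need the JAR added in their own way, and the core-site.xml changes must still be made inside the container image or through a mounted configuration file.

```yaml
# Minimal sketch: a pod that requests DataTap access for a Hadoop 3.x job.
# The hpecp-agent is expected to add the sidecar and the dtap-shared-vol
# volume itself, so neither is declared here.
apiVersion: v1
kind: Pod
metadata:
  name: example-hadoop3-job          # hypothetical
  labels:
    hpecp.hpe.com/dtap: hadoop3-job  # sidecar exits when the application exits
spec:
  restartPolicy: Never
  containers:
    - name: app                      # hypothetical application container
      image: example/hadoop3-app:latest   # hypothetical
      env:
        # Assumption: the application honors HADOOP_CLASSPATH.
        - name: HADOOP_CLASSPATH
          value: /opt/bdfs/bluedata-dtap.jar
```

For a long-running Hadoop 3.x service rather than a job, the label value would be hadoop3 instead of hadoop3-job.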