Integrate Spark with HBase
Integrate Spark with HBase or HPE Data Fabric Database when you want to run Spark jobs on HBase or HPE Data Fabric Database tables.
Procedure
1. Configure the HBase version in the /opt/mapr/spark/spark-<version>/mapr-util/compatibility.version file:

      hbase_versions=<version>

   The HBase version depends on the current EEP and MapR version that you are running.
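   For example, on a cluster whose matching HBase version is 1.1.8 (a hypothetical value; check the release notes for your EEP to find the correct version), the entry would read:

      hbase_versions=1.1.8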
2. If you want to create HBase tables with Spark, add the following property to hbase-site.xml:

      <property>
        <name>hbase.table.sanity.checks</name>
        <value>false</value>
      </property>
3. On each Spark node, copy the hbase-site.xml file to the {SPARK_HOME}/conf/ directory.

   TIP: Starting in the EEP 7.0.0 release, you do not have to complete step 3. Running configure.sh copies the hbase-site.xml file to the Spark directory automatically.
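   As a sketch of such a rerun, assuming the default installation path and the -R option (which refreshes a node from its existing configuration; the documented invocation for this task may differ):

      /opt/mapr/server/configure.sh -R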
4. Specify the hbase-site.xml file in the SPARK_HOME/conf/spark-defaults.conf file:

      spark.yarn.dist.files SPARK_HOME/conf/hbase-site.xml
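   For example, with Spark 2.4.4 installed under /opt/mapr/spark (a hypothetical version; substitute your own SPARK_HOME), the resolved line would be:

      spark.yarn.dist.files /opt/mapr/spark/spark-2.4.4/conf/hbase-site.xml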
5. To verify the integration, read a test table back from Spark; a minimal sketch of one such check follows.
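   The following is a sketch of one possible check, not the documented verification steps. It assumes a hypothetical test table named /tmp/testtable with a column family cf. First, create and populate the table in the HBase shell:

      hbase shell
      create '/tmp/testtable', 'cf'
      put '/tmp/testtable', 'row1', 'cf:a', 'value1'
      exit

   Then, from spark-shell (where the SparkContext sc is predefined and the hbase-site.xml file in {SPARK_HOME}/conf supplies the connection settings), read the table through TableInputFormat and count its rows:

      import org.apache.hadoop.hbase.HBaseConfiguration
      import org.apache.hadoop.hbase.client.Result
      import org.apache.hadoop.hbase.io.ImmutableBytesWritable
      import org.apache.hadoop.hbase.mapreduce.TableInputFormat

      // Point the input format at the hypothetical test table created above.
      val conf = HBaseConfiguration.create()
      conf.set(TableInputFormat.INPUT_TABLE, "/tmp/testtable")

      // Load the table as an RDD of (row key, result) pairs and count the rows.
      val rdd = sc.newAPIHadoopRDD(
        conf,
        classOf[TableInputFormat],
        classOf[ImmutableBytesWritable],
        classOf[Result])
      println("Row count: " + rdd.count())   // expect 1 for the single put above

   If the count prints without class-loading or connection errors, Spark can read the HBase table.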