Integrate Spark with HBase
Integrate Spark with HBase or HPE Ezmeral Data Fabric Database when you want to run Spark jobs on HBase or HPE Ezmeral Data Fabric Database tables.
Procedure
1. Configure the HBase version in the /opt/mapr/spark/spark-<version>/mapr-util/compatibility.version file:

   hbase_versions=<version>

   The HBase version depends on the current EEP and MapR version that you are running.
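For example, on a cluster whose EEP ships HBase 1.4.14 (the version number here is illustrative; use the HBase version that matches your EEP), the entry would read:

```
hbase_versions=1.4.14
```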
2. If you want to create HBase tables with Spark, add the following property to hbase-site.xml:

   <property>
     <name>hbase.table.sanity.checks</name>
     <value>false</value>
   </property>
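In context, the property sits inside the file's top-level configuration element, using the standard Hadoop-style layout (any other properties already in your file are unaffected):

```xml
<configuration>
  <property>
    <name>hbase.table.sanity.checks</name>
    <value>false</value>
  </property>
</configuration>
```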
3. On each Spark node, copy the hbase-site.xml file to the {SPARK_HOME}/conf/ directory.

   TIP: Starting in the EEP 7.0.0 release, you do not have to complete step 3. Running configure.sh copies the hbase-site.xml file to the Spark directory automatically.
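The manual copy in this step is a single `cp`. Because the real install paths under /opt/mapr vary by release, the runnable sketch below simulates the step in a throwaway scratch directory; all paths in it are illustrative stand-ins for your actual HBase and Spark conf directories.

```shell
# On a real node (pre-EEP 7.0.0) the command takes the form:
#   cp /opt/mapr/hbase/hbase-<version>/conf/hbase-site.xml ${SPARK_HOME}/conf/
# The lines below simulate that copy in a scratch directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/hbase/conf" "$scratch/spark/conf"
printf '<configuration/>\n' > "$scratch/hbase/conf/hbase-site.xml"

cp "$scratch/hbase/conf/hbase-site.xml" "$scratch/spark/conf/"
ls "$scratch/spark/conf"   # prints: hbase-site.xml
```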
4. Specify the hbase-site.xml file in the SPARK_HOME/conf/spark-defaults.conf file:

   spark.yarn.dist.files SPARK_HOME/conf/hbase-site.xml
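With SPARK_HOME expanded to a concrete installation path (the path below is illustrative; substitute your Spark version), the resulting line in spark-defaults.conf looks like:

```
spark.yarn.dist.files /opt/mapr/spark/spark-2.4.4/conf/hbase-site.xml
```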
5. To verify the integration, complete the following steps: