Configure High Availability for Spark Master
You configure high availability for the Spark Primary instance so that the instance does not become a single point of failure.
By using ZooKeeper to provide leader election and some state storage, you can launch multiple primary nodes in your cluster that are connected to the same ZooKeeper instance. ZooKeeper elects one primary node to be the "leader," and the others remain in standby mode. If the leader goes down, ZooKeeper elects another primary node, which recovers the old primary node's state and then resumes scheduling.
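Because applications and Secondary instances must be able to reach whichever primary node is currently the leader, they register against the full list of primary nodes rather than a single one. The following is a minimal sketch of submitting an example application against two primary nodes; the host names node-a and node-b, the port 7077, and the examples JAR path are placeholders for your cluster.

    # Hypothetical primary hosts node-a and node-b; 7077 is the default
    # standalone cluster port, adjust if your cluster uses a different one.
    /opt/mapr/spark/spark-<version>/bin/spark-submit \
        --master spark://node-a:7077,node-b:7077 \
        --class org.apache.spark.examples.SparkPi \
        /opt/mapr/spark/spark-<version>/examples/jars/spark-examples_*.jar 100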
- Set SPARK_DAEMON_JAVA_OPTS in spark-env.sh with the appropriate ZooKeeper information for the cluster (an optional recovery-directory setting is sketched after these steps):

    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=<zookeeper1:5181,zookeeper2:5181,...> -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dzookeeper.sasl.client=false"
- Restart the Spark Primary instance and Spark History Server services:
  - For Spark 2.0.1 and later:

        maprcli node services -nodes <node-ip> -name spark-master -action restart

  - For Spark 1.6.1:

        maprcli node services -nodes <node-ip> -name spark-master -action restart
        maprcli node services -nodes <node-ip> -name spark-historyserver -action restart
- On the primary node, restart the Spark Secondary instances as the mapr user.

  For Spark 2.x:

        /opt/mapr/spark/spark-<version>/sbin/stop-slaves.sh
        /opt/mapr/spark/spark-<version>/sbin/start-slaves.sh

  For Spark 3.x:

        /opt/mapr/spark/spark-<version>/sbin/stop-workers.sh
        /opt/mapr/spark/spark-<version>/sbin/start-workers.sh
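As noted in the first step, you can also control where the recovery state is kept in ZooKeeper. The snippet below is a minimal sketch of the same spark-env.sh entry with the optional spark.deploy.zookeeper.dir property added; /spark is Spark's default recovery directory, and the ZooKeeper host names remain placeholders.

    # Same entry as in the first step, with the optional recovery directory
    # made explicit; spark.deploy.zookeeper.dir defaults to /spark if unset.
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
      -Dspark.deploy.zookeeper.url=<zookeeper1:5181,zookeeper2:5181,...> \
      -Dspark.deploy.zookeeper.dir=/spark \
      -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf \
      -Dzookeeper.sasl.client=false"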
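After the services come back up, you can confirm that exactly one primary node is the leader. The check below is a sketch, assuming the primary web UI is reachable on its default port 8080 and that node-a and node-b are your primary hosts; the UI page reports the instance status as ALIVE (leader) or STANDBY.

    # Ask each primary node's web UI for its status; one host should report
    # ALIVE and the rest STANDBY. Replace node-a/node-b and the port as needed.
    for host in node-a node-b; do
        printf '%s: ' "$host"
        curl -s "http://$host:8080" | grep -oE 'ALIVE|STANDBY' | head -n 1
    done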