Step 1: Restart and Check Cluster Services
After upgrading core using either a manual offline or rolling upgrade method (not upgrading with the Installer) and upgrading your ecosystem components, configure and restart the cluster and services.
About this task
This procedure configures and restarts the cluster and services, including ecosystem components, remounts the NFS share, and checks that all packages have been upgraded on all nodes.
After finishing this procedure, run non-trivial health checks, such as performance benchmarks relevant to the cluster’s typical workload or a suite of common jobs. It is a good idea to run these types of checks when the cluster is idle. In this procedure, you configure each node in the cluster without changing the list of services that will run on the node. If you want to change the list of services, do so after completing the upgrade. After you have upgraded packages on all nodes, perform this procedure on all nodes to restart the cluster. Upon completion of this procedure, core services are running on all nodes.
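For example, one such check might run the TeraGen/TeraSort benchmark from the bundled Hadoop examples JAR while the cluster is idle. This is only a sketch; the JAR path, Hadoop version, and output directories are placeholders that you must adapt to your installation:
# yarn jar /opt/mapr/hadoop/hadoop-<version>/share/hadoop/mapreduce/hadoop-mapreduce-examples-<version>.jar teragen 10000000 /benchmarks/teragen
# yarn jar /opt/mapr/hadoop/hadoop-<version>/share/hadoop/mapreduce/hadoop-mapreduce-examples-<version>.jar terasort /benchmarks/teragen /benchmarks/terasort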
Procedure
-
Merge any custom edits that you made to your cluster environment variables into the new /opt/mapr/conf/env_override.sh file before restarting the cluster. This is because the upgrade process replaces your original /opt/mapr/conf/env.sh file with a new copy of env.sh that is appropriate for the Data Fabric release to which you are upgrading. The new env.sh does not include any custom edits you might have made to the original env.sh. However, a backup of your original env.sh file is saved as /opt/mapr/conf/env.sh<timestamp>. Before restarting the cluster, you must add any custom entries from /opt/mapr/conf/env.sh<timestamp> into /opt/mapr/conf/env_override.sh, and copy the updated env_override.sh to all other nodes in the cluster. See About env_override.sh.
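For example, a minimal sketch of the merge (the timestamp suffix and node name are placeholders): compare the backed-up file with the new one, add your custom entries to env_override.sh, and then distribute the updated file:
# diff /opt/mapr/conf/env.sh<timestamp> /opt/mapr/conf/env.sh
# vi /opt/mapr/conf/env_override.sh
# scp /opt/mapr/conf/env_override.sh <node>:/opt/mapr/conf/env_override.sh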
-
On each node in the cluster, remove the mapruserticket file. For manual upgrades, the file must be removed to ensure that impersonation works properly. The mapruserticket file is re-created automatically when you restart Warden. For more information, see Upgrade Notes (Release 7.9).
# rm /opt/mapr/conf/mapruserticket
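If you use a parallel shell such as clush (an assumption; any equivalent tool or a simple loop over your node list also works), you can remove the file on all nodes at once:
# clush -a rm -f /opt/mapr/conf/mapruserticket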
-
If you are upgrading from core 6.1.x to core 7.x, create the ssl_truststore.pem and ssl_keystore.pem files. These files are used by the Data Access Gateway, Grafana, and Hue components. This step is necessary only for manual upgrades because upgrades performed with the Installer distribute the files automatically. Use these commands:
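The exact syntax depends on your release. As a hedged sketch, the PEM files are typically generated from the existing stores with the manageSSLKeys.sh convert subcommand; verify the subcommand, cluster name, and paths for your environment before running it:
# /opt/mapr/server/manageSSLKeys.sh convert -N <cluster_name> /opt/mapr/conf/ssl_truststore /opt/mapr/conf/ssl_truststore.pem
# /opt/mapr/server/manageSSLKeys.sh convert -N <cluster_name> /opt/mapr/conf/ssl_keystore /opt/mapr/conf/ssl_keystore.pem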
-
Depending on the release from which you are upgrading, use one of the following commands to create the new user keystores and user truststores. You must run the command in order to enable log monitoring and the MCS and Object Store user interfaces. You run this command once on any node, and then copy the resulting files to all other nodes in the cluster (see the example after this step):
- To upgrade from core 6.2.0 to 7.0.0 or later:
manageSSLKeys.sh createusercert -a moss -u *.$(hostname -d) -ug <cluster_admin_id>:<cluster_admin_group>
- To upgrade from core 6.1.x to 7.0.0 or later:
manageSSLKeys.sh createusercerts -ug <cluster_admin_id>:<cluster_admin_group> -N <cluster_name>
For more information about the user certs, see the documentation for the release to which you are upgrading.
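For example, a hedged sketch of distributing the generated files (the ssl_user* file-name pattern and the node name are assumptions; copy whichever user keystore and truststore files the command reports creating, and preserve their ownership and permissions):
# scp -p /opt/mapr/conf/ssl_user* <node>:/opt/mapr/conf/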
-
On each node in the cluster, run configure.sh with the -R option:
# /opt/mapr/server/configure.sh -R -HS <hostname>
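If configure.sh reports a problem, check its log on the affected node; the location shown here is the usual default and may vary:
# tail -n 50 /opt/mapr/logs/configure.log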
-
If ZooKeeper is installed on the node, start it:
# service mapr-zookeeper start
-
Start Warden.
# service mapr-warden start
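To confirm that the services came up, check their status before continuing, for example:
# service mapr-zookeeper status
# service mapr-warden status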
- Run a simple health-check targeting the file system and MapReduce services only. Address any issues or alerts that might have come up at this point.
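For example, a minimal sketch of such a check (the examples JAR path and version are placeholders): list the file system root, review outstanding alarms, and run a small MapReduce job:
# hadoop fs -ls /
# maprcli alarm list
# yarn jar /opt/mapr/hadoop/hadoop-<version>/share/hadoop/mapreduce/hadoop-mapreduce-examples-<version>.jar pi 4 1000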
-
Set the new cluster version in the /opt/mapr/MapRBuildVersion file by running the following command on any node in the cluster:
# maprcli config save -values {mapr.targetversion:"`cat /opt/mapr/MapRBuildVersion`"}
-
Verify the new cluster version:
For example:
# maprcli config load -keys mapr.targetversion
mapr.targetversion
7.2.0.0.20230118195227.GA
-
Remount the Data Fabric NFS share:
The following example assumes that the cluster is mounted at /mapr:
# mount -o hard,nolock <hostname>:/mapr /mapr
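To confirm that the share is mounted, list it (the cluster name is a placeholder):
# df -h /mapr
# ls /mapr/<cluster_name>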
-
Run commands, as shown in the following examples, to check that the packages have been upgraded successfully:
Check the following:
- All expected nodes show up in a cluster node list, and the expected services are configured on each node.
- A master CLDB is active, and all nodes return the same result.
- Only one ZooKeeper service claims to be the ZooKeeper leader, and all other ZooKeepers are followers.
mapr@m2-mapreng-vm167213:~$ maprcli node list -columns hostname,csvc
hostname                                      ip              configuredservice
m2-mapreng-vm167213.mip.storage.hpecorp.net   10.163.167.213  keycloak,s3server,cldb,ezotelcol,nfs4,collectd,hoststats,data-access-gateway,fileserver,mastgateway,opentsdb,gateway,apiserver
m2-mapreng-vm167214.mip.storage.hpecorp.net   10.163.167.214  keycloak,s3server,cldb,ezotelcol,nfs4,collectd,hoststats,data-access-gateway,fileserver,mastgateway,opentsdb,gateway,apiserver
m2-mapreng-vm167215.mip.storage.hpecorp.net   10.163.167.215  keycloak,s3server,cldb,ezotelcol,nfs4,collectd,hoststats,fileserver,mastgateway,opentsdb
mapr@m2-mapreng-vm167213:~$ maprcli node cldbprimary
cldbprimary
ServerID: 5525564767900681920 HostName: m2-mapreng-vm167215.mip.storage.hpecorp.net
mapr@m2-mapreng-vm167213:~$ service mapr-zookeeper status
● mapr-zookeeper.service - MapR Technologies, Inc. zookeeper service
     Loaded: loaded (/etc/systemd/system/mapr-zookeeper.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2024-07-04 01:22:37 PDT; 1 weeks 5 days ago
   Main PID: 172099 (java)
      Tasks: 0 (limit: 38470)
     Memory: 3.6M
     CGroup: /system.slice/mapr-zookeeper.service
             ‣ 172099 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Dzookeeper.log.dir=/opt/mapr/zookeeper/zookeeper-3.5.6/logs -Dzookeeper.log.file=zookeepe>

Jul 04 01:22:37 m2-mapreng-vm167213 zookeeper[172098]: Starting zookeeper ...
Jul 04 01:22:37 m2-mapreng-vm167213 zookeeper[172141]: STARTED

# maprcli node list -columns hostname,csvc
hostname  configuredservice                                                ip
centos55  nodemanager,cldb,fileserver,hoststats                            10.10.82.55
centos56  nodemanager,cldb,fileserver,hoststats                            10.10.82.56
centos57  fileserver,nodemanager,hoststats,resourcemanager                 10.10.82.57
centos58  fileserver,nodemanager,webserver,nfs,hoststats,resourcemanager   10.10.82.58
...more nodes...

# maprcli node cldbmaster
cldbmaster
ServerID: 8851109109619685455 HostName: centos56

# service mapr-zookeeper status
Redirecting to /bin/systemctl status mapr-zookeeper.service
● mapr-zookeeper.service - MapR Technologies, Inc. zookeeper service
   Loaded: loaded (/etc/systemd/system/mapr-zookeeper.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2021-05-26 09:18:54 PDT; 1 months 9 days ago
  Process: 2215 ExecStart=/opt/mapr/initscripts/zookeeper start (code=exited, status=0/SUCCESS)
 Main PID: 2510 (java)
    Tasks: 0 (limit: 410335)
   Memory: 4.5M
   CGroup: /system.slice/mapr-zookeeper.service
           ‣ 2510 /usr/lib/jvm/java-11-openjdk-11.0.9.11-3.el8_3.x86_64/bin/java -Dzookeeper.log.dir=/opt/mapr/zookeeper/zookeeper-3.8.3/logs -Dzookeeper.lo>

May 26 09:18:53 <node> systemd[1]: Starting MapR Technologies, Inc. zookeeper service...
May 26 09:18:53 <node> su[2459]: (to mapr) root on none
May 26 09:18:53 <node> su[2459]: pam_unix(su:session): session opened for user mapr by (uid=0)
May 26 09:18:53 <node> zookeeper[2215]: JMX disabled by user request
May 26 09:18:53 <node> zookeeper[2215]: Using config: /opt/mapr/zookeeper/zookeeper-3.8.3/conf/zoo.cfg
May 26 09:18:54 <node> zookeeper[2215]: Starting zookeeper ... STARTED
May 26 09:18:54 <node> su[2459]: pam_unix(su:session): session closed for user mapr
May 26 09:18:54 <node> systemd[1]: Started MapR Technologies, Inc. zookeeper service.