Enabling YARN Log Aggregation

To enable YARN log aggregation, add or edit the following properties in yarn-site.xml:

  • Set the yarn.log-aggregation-enable property to true.
  • Set the yarn.log.server.url property to the URL of the YARN HistoryServer. On a secure cluster, the URL looks like the following:
    https://<historyserver-host>:19890/jobhistory/logs
  • Optional: Set the yarn.nodemanager.remote-app-log-dir value to a location in the HPE Ezmeral Data Fabric file system. By default, the location is maprfs:///tmp/logs.
  • Optional: Set the yarn.nodemanager.remote-app-log-dir-suffix value to the name of the folder that should contain the logs for each user. By default, the folder name is logs.
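
Taken together, the settings above might look like the following yarn-site.xml fragment. This is a minimal sketch for a secure cluster: historyserver.example.com is a hypothetical host name standing in for <historyserver-host>, and the two optional properties are shown with the default values stated above, so they can be left out if the defaults are acceptable.

```xml
<!-- Sketch only: place these properties inside the existing
     <configuration> element of yarn-site.xml. -->

<!-- Turn on YARN log aggregation. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

<!-- URL of the HistoryServer on a secure cluster.
     historyserver.example.com is a hypothetical host name. -->
<property>
  <name>yarn.log.server.url</name>
  <value>https://historyserver.example.com:19890/jobhistory/logs</value>
</property>

<!-- Optional: location for aggregated logs (default value shown). -->
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>maprfs:///tmp/logs</value>
</property>

<!-- Optional: folder name for each user's logs (default value shown). -->
<property>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>logs</value>
</property>
```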

Aggregated logs are owned by the user who runs the job. For example, if user admin runs a job, the logs are stored in maprfs:///tmp/logs/admin. If user analyst runs a job, the logs are stored in maprfs:///tmp/logs/analyst. If these two users do not share the same UNIX group, they cannot see each other's logs.

NOTE If both centralized logging and YARN log aggregation are enabled, centralized logging manages the logs for MapReduce version 2 applications, while YARN log aggregation manages the logs for non-MapReduce applications.