Queue Management with Hive-on-Tez

HiveServer2 provides built-in functionality to set up and manage a pool of Tez sessions in default queues. Tez initiates a session and keeps it alive to run sequential queries. Queries can be submitted through the Hive CLI or through HiveServer2 clients, such as Beeline. You can manage queues through properties in hive-site.xml.

Queue management is closely tied to the type of YARN scheduler in use. By default, an HPE Ezmeral Data Fabric cluster uses the Fair Scheduler, and Hive-on-Tez runs each query in a queue named after a user. If a query is submitted from the Hive CLI, the queue is named after the real user. If a query is submitted from a HiveServer2 client, such as Beeline, the queue name depends on the HiveServer2 impersonation property, hive.server2.enable.doAs: the queue is named after either the real user or the user that runs the HiveServer2 process.

With the Capacity Scheduler, the queues for Hive queries submitted from the Hive CLI and Beeline are configured in the capacity-scheduler.xml file. Default queue names are taken from the scheduler settings, but you can also set the tez.queue.name=<queue_name> property to run queries in a specific queue.
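For illustration, a capacity-scheduler.xml fragment that defines two queues might look like the following sketch. The queue name hive_queue and the 70/30 capacity split are assumptions made for this example only; the property names are standard YARN Capacity Scheduler settings.

```xml
<!-- Hypothetical layout: the "default" queue plus an example queue named hive_queue. -->
<property>
   <name>yarn.scheduler.capacity.root.queues</name>
   <value>default,hive_queue</value>
</property>
<!-- Example capacities; sibling queue capacities must add up to 100. -->
<property>
   <name>yarn.scheduler.capacity.root.default.capacity</name>
   <value>70</value>
</property>
<property>
   <name>yarn.scheduler.capacity.root.hive_queue.capacity</name>
   <value>30</value>
</property>
```

With a layout like this, setting tez.queue.name=hive_queue would direct queries to the example queue.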

Application Masters (AM) are strongly bound to YARN. You cannot change the queue for an AM that is already started. If impersonation is enabled for HiveServer2, a new AM starts next to an existing AM for a default queue. Do not use or close a default queue at the end of a lifetime.
NOTE
HiveServer2 works with or without impersonation. Impersonation is set through the hive.server2.enable.doAs property.
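As an illustration of the impersonation setting, the following hive-site.xml fragment is a minimal sketch that enables impersonation. The value true is an example choice, not a recommendation. In stock Apache Hive, true runs queries as the connecting user and false runs them as the user that owns the HiveServer2 process.

```xml
<!-- Example only: enable HiveServer2 impersonation. -->
<property>
   <name>hive.server2.enable.doAs</name>
   <value>true</value>
</property>
```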

Run Queries in a Specific Queue

To run all queries in a specific queue, set the tez.queue.name property. Tez then uses the configured queue for all jobs submitted from the client. You can set this property before each query through the Hive SET command, as shown:
set tez.queue.name=<queue_name>;
Or, you can set the property in the hive-site.xml file, as shown:
<property>
   <name>tez.queue.name</name>
   <value>my_queue</value>
</property> 
IMPORTANT
If you set tez.queue.name in hive-site.xml, and you want the queue name to persist across all queries in the session, you must also set the hive.server2.tez.unset.tez.queue.name property in hive-site.xml to false, as shown:
<property>
   <name>tez.queue.name</name>
   <value>my_queue</value>
</property>
<property>
   <name>hive.server2.tez.unset.tez.queue.name</name>
   <value>false</value>
</property>

If hive.server2.tez.unset.tez.queue.name is set to true, Hive does not persist tez.queue.name across queries and instead uses the default cluster queue names.

Configuration Properties

HiveServer2 has several settings related to queue management. Specify the following properties in the hive-site.xml file:
| Property | Description | Default Value |
|---|---|---|
| tez.queue.name | The queue name for all jobs submitted from a given client. Set through the Hive CLI via the SET command before running a query or through hive-site.xml. If you set the property through hive-site.xml, and you want the setting to persist across all queries that run, set hive.server2.tez.unset.tez.queue.name to false. | No default. Must be explicitly set. |
| hive.server2.tez.initialize.default.sessions | When set to true, enables you to use HiveServer2 without turning on Tez for HiveServer2. Useful when you want to run queries over Tez without the pool of sessions. | false |
| hive.server2.tez.default.queues | A list of comma-separated values that correspond to YARN queues of the same name. When HiveServer2 is launched in Tez mode, this configuration must be set to enable multiple Tez sessions to run in parallel on the cluster. | empty string |
| hive.server2.tez.sessions.per.default.queue | A positive integer that determines the number of Tez sessions that should launch in each of the queues specified by hive.server2.tez.default.queues. Determines the parallelism on each queue. For example, if you specify two default queues and two sessions per default queue, four application masters start. | 1 |
| hive.server2.tez.session.lifetime | Defines the lifetime of the Tez sessions launched by HiveServer2 when default sessions are enabled. Set to 0 to disable session expiration. | 162h |
| hive.server2.tez.unset.tez.queue.name | Controls whether the tez.queue.name persists across all queries in a session. Must be set to false for the tez.queue.name to persist. When set to true, the tez.queue.name only applies to the first query that runs; thereafter, the default cluster queue names are used. NOTE: This functionality was introduced in EEP 7.01 and EEP 6.3.2. A patch for previous EEP versions is available. See Applying a Patch. | true |
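
The pool-related properties above can be combined as in the following hive-site.xml sketch, which mirrors the example of two default queues with two sessions each (four application masters in total). The queue names queue_a and queue_b are hypothetical and would need to match queues that exist in YARN. Other properties in the table may also affect whether and when the pool is started.

```xml
<!-- Hypothetical YARN queues that host the default Tez sessions. -->
<property>
   <name>hive.server2.tez.default.queues</name>
   <value>queue_a,queue_b</value>
</property>
<!-- Two sessions per queue: 2 queues x 2 sessions = 4 application masters. -->
<property>
   <name>hive.server2.tez.sessions.per.default.queue</name>
   <value>2</value>
</property>
```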