Troubleshooting Hive and Tez
This section includes Hive and Tez troubleshooting tips.
HDFS Literal Deprecated
Starting in Hive-2.3, the hdfs
literal is deprecated. Specifying a
table location using the hdfs
URI scheme will cause queries to fail
because the Hive parser recognizes the hdfs
literal in the LOCATION key
word and triggers HDFS encryption, which is not supported.
hdfs
literal with the LOCATION keyword in Hive
queries:CREATE TABLE IF NOT EXISTS i (id INT) LOCATION 'hdfs:///i';
LOG.warn("hdfs:// is deprecated filesystem and will be removed in future releases. Use maprfs://
instead");
hdfs
literal issues, update all instances of
hdfs
with maprfs
in tables, partitions, and
databases. Also update the hive-site.xml
file to remove
hdfs
from the URI scheme list.- Update hive-site.xml
- Remove
hdfs
from thehive.exim.uri.scheme.whitelist
Hive configuration property inhive-site.xml
, as shown:<property> <name>hive.exim.uri.scheme.whitelist</name> <value>maprfs,…,…,s3</value> </property>
- Update Tables and Partitions
- To replace the table and partition location with
maprfs
, run:MariaDB [hive]> update SDS set LOCATION = REPLACE(LOCATION, 'hdfs', 'maprfs') where LOCATION like '%hdfs%';
- Update Databases
- To replace the database location with
maprfs
, run:MariaDB [hive]> update DBS set DB_LOCATION_URI = REPLACE(DB_LOCATION_URI, 'hdfs', 'maprfs') where DB_LOCATION_URI like '%hdfs%';
Prohibited usage of datanucleus.schema.autoCreateAll
property
The usage of the datanucleus.schema.autoCreateAll
property is
prohibited in all cases. Instead of using this property, you must run the
schematool
command. Refer to HIVE-21302 for more information.
WebHCat
Secure WebHCat operations depend on the Hive metastore having Kerberos enabled. If Kerberos is not enabled for the Hive metastore, null pointer exceptions similar to the following will appear:
2013-10-06 20:38:55,198 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(134)) - MetaException(message:java.lang.NullPointerException)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_delegation_token(HiveMetaStore.java:3972)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
at com.sun.proxy.$Proxy5.get_delegation_token(Unknown Source)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_delegation_token.getResult(ThriftHiveMetastore.java:8063)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_delegation_token.getResult(ThriftHiveMetastore.java:8047)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
mapr-hive-2.3.6-mapr-1912
(EEP-6.3.0) package, you should manually replace the
old webhcat-default.xml
configuration file with the new one and restart
the WebHCat
service:cp /opt/mapr/hive/hive-2.3/hcatalog/etc/webhcat.new/webhcat-default.xml /opt/mapr/hive/hive-2.3/hcatalog/etc/webhcat/
Hive in an Azure Cluster
When Hive services are installed on an Azure cluster, it is possible that the services will not start because Azure assigns too long (over 64 symbols) host names. Perform following steps to fix this issue:
20180320175756.GA-1.x86_64
.- Edit the
/etc/hosts
file:
This is an example of anano /etc/hosts
/etc/hosts
file for an Azure cluster:127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 172.24.8.4 anaikregtestc73522602-cluster-com-mapr-vm0.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm0 172.24.8.5 anaikregtestc73522602-cluster-com-mapr-vm1.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm1 172.24.8.6 anaikregtestc73522602-cluster-com-mapr-vm2.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm2 172.24.8.7 anaikregtestc73522602-cluster-com-mapr-vm3.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm3 172.24.8.8 anaikregtestc73522602-cluster-com-mapr-vm4.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm4
- Add a short alias for each node: NOTEYou can use any short alias. In this example,
vm0
,vm1
,vm2
,vm3
, andvm4
are used:127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 172.24.8.4 anaikregtestc73522602-cluster-com-mapr-vm0.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm0 vm0 172.24.8.5 anaikregtestc73522602-cluster-com-mapr-vm1.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm1 vm1 172.24.8.6 anaikregtestc73522602-cluster-com-mapr-vm2.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm2 vm2 172.24.8.7 anaikregtestc73522602-cluster-com-mapr-vm3.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm3 vm3 172.24.8.8 anaikregtestc73522602-cluster-com-mapr-vm4.izqafobxqxbuzkv4led1p3snic.dx.internal.cloudapp.net anaikregtestc73522602-cluster-com-mapr-vm4 vm4
- Perform step 1 and 2 for each node in the cluster.
Tez Upgrade Issues
- Preserving configuration on Ubuntu is not supported from EEP 4.1.1 and EEP 5.0.0 (1803) to EEP 6.0.0 (1808) or EEP 5.0.1 (1808).
- Preserving Tomcat configuration is not supported from any previous EEP to EEP 6.0.0 (1808).
- You must manually stop the Tomcat service and delete the Tomcat folder as a
precondition if you are updating or upgrading Tez from the following EEPs:
- EEP 4.0.0
- EEP 4.1.0
User Names, Group Names, and LDAP
domain users
.
The following structure is possible in the MapR FileSystem:
drwxr-xr-x - afischer domain users 0 2018-10-03 16:10 /user/abc
drwxr-xr-x - mapr mapr 0 2018-10-05 16:51 /user/def
drwxr-xr-x - dschexnayder domain users 8 2018-10-10 13:30 /user/xyz
drwxr-xr-x - mapr mapr 1 2018-10-09 14:23 /user/hive
drwxr-xr-x - mapr mapr 11 2018-10-10 01:56 /user/mapr
drwxr-xr-x - mlitovsky domain users 0 2018-10-06 11:08 /user/hjbs
drwxr-xr-x - pcurtis domain users 5 2018-10-04 19:33 /user/jknd
drwxr-xr-x - mapr mapr 3 2018-10-08 16:29 /user/ewkd
drwxr-xr-x - talvarez domain users 0 2018-10-04 17:02 /user/lkd
-chgrp: 'domain users' does not match expected pattern for group
The workaround is to not use spaces in group names or user names.
HiveServer 2 takes time to start because of
get_all_databases
Materialized view registry and cache is introduced in HIVE-14496 for Hive 2.3.0.
The goal of the cache is to avoid parsing and creating logical plans for the materialized views at query runtime. When a query arrives, you need to consult this cache and extract the logical plans for the views (which are already parsed) from it. Materialized view registry class scans all databases and tables in each database during initialization and that may cause long time to start HiveServer2.
Property hive.materializedview.enable.views.registry
is added to
control the usage of materialized view registry:
Property: hive.materializedview.enable.views.registry
Default value: true
Description: In case of a large amount of databases and tables in Hive, usage of materialized view registry and cache force HiveServer2 to scan all of them in order to cache the query plan for a view. This leads to an extremely long time for HiveServer2 to start.
hive-site.xml
and restart Hive services.
<property>
<name>hive.materializedview.enable.views.registry</name>
<value>false</value>
</property>
Database and Table Names Containing a Dot (.)
HIVE-16907 rejects queries with database and table names that contain a dot
(.
), and this behavior is backported to Hive 2.3.
.
) in the name are not
supported now. For
example:{code}
insert into `tdb.t1` select * from t2;
{code}
Throws error:
{code}
FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:12 Table or database name may not contain dot(.) character 'tdb.t1'
{code}
Avoid using unsupported characters in database and table names.