Hive Features in HPE Data Fabric
Describes HPE Data Fabric-specific features in Hive.
Removing Temporary Hive Files
Set the hive.scratchdir.lock property to true in the hive-site.xml file:
<property>
  <name>hive.scratchdir.lock</name>
  <value>true</value>
</property>
For the previous EEP versions, manually remove the temporary Hive files that are not used by active Hive sessions.
- If HiveServer2 is configured on a node, set the hive.scratchdir.lock property in the hive-site.xml file to automatically remove the temporary Hive files.
- If HiveServer2 is not configured on a node, set the hive.scratchdir.lock property and run the following command to remove the temporary Hive files:
  hive --service cleardanglingscratchdir
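Before running the cleanup command, it can help to confirm that the property is actually set. The following is a minimal sketch, not an official tool: the function name scratchdir_lock_enabled is hypothetical, and the location of hive-site.xml depends on your installation.

```shell
# Hypothetical helper: succeeds only if hive.scratchdir.lock is set to true
# in the hive-site.xml file passed as the first argument.
scratchdir_lock_enabled() {
  # Flatten the file so the <name>/<value> pair matches across line breaks.
  tr -d '\n' < "$1" |
    grep -E '<name>hive\.scratchdir\.lock</name>[[:space:]]*<value>[[:space:]]*true[[:space:]]*</value>' > /dev/null
}

# Example usage (the path is an assumption; adjust it to your installation):
#   scratchdir_lock_enabled /path/to/hive-site.xml && hive --service cleardanglingscratchdir
```

The check is a plain text match rather than a real XML parse, so it assumes the name and value elements are adjacent, as in the property block above.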
Symbolic Link Support in Hive
Starting from EEP 7.1.0, all hadoop fs commands support operations on symlinks (symbolic links). Hive supports symlinks from EEP 8.0.0 onwards. You can create symlinks through the command line interface or the file system API (MapRFileSystem.java).
- NFS installed
- NFS mounted (mount hadoop fs to the local file system)
Creating Symlinks
The following examples demonstrate how to create symbolic links through the CLI and the MapRFileSystem API:
- Create a relative symlink through the CLI:
  ln -rs /mountPoint/path/to/file /mountPoint/path/to/symlink
- Create an absolute symlink through the CLI:
  ln -s /mountPoint/path/to/file /mountPoint/path/to/symlink
- Create a symlink through the MapRFileSystem API:
  MapRFileSystem maprFS = MapRFileSystem.get(new Configuration());
  maprFS.createSymlink(pathToTarget, pathToLink, createParentFlag);
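To see how the relative and absolute forms differ, the sketch below creates both kinds of link in a scratch directory and prints the target that each link stores. The scratch directory only stands in for the NFS mount point, and the file names are made up for illustration. The -r option assumes GNU ln.

```shell
# Scratch directory standing in for the NFS mount point (assumption for illustration).
mount_point=$(mktemp -d)
mkdir -p "$mount_point/data" "$mount_point/links"
echo "1,alpha" > "$mount_point/data/file.txt"

# Relative symlink: the stored target is a path relative to the link's directory.
ln -rs "$mount_point/data/file.txt" "$mount_point/links/rel_link.txt"

# Absolute symlink: the stored target is the full path exactly as given.
ln -s "$mount_point/data/file.txt" "$mount_point/links/abs_link.txt"

rel_target=$(readlink "$mount_point/links/rel_link.txt")
abs_target=$(readlink "$mount_point/links/abs_link.txt")
echo "relative link stores: $rel_target"
echo "absolute link stores: $abs_target"

# Remove the scratch directory when done:
#   rm -rf "$mount_point"
```

A relative link keeps working if the whole tree is mounted under a different path, while an absolute link is tied to the path used when it was created.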
Using Symlinks for Hive Operations
Once a symlink is created, you can use it in Hive operations, for example as a table location or a data file, as demonstrated in the following steps:
1. Create a table directory:
   mkdir /mapr/my.cluster.com/user/hive/warehouse/ext_tbl_symlink
2. Create a symlink from a data source to the table location:
   ln -s /mapr/my.cluster.com/user/mapr/source_files/data.txt /mapr/my.cluster.com/user/hive/warehouse/ext_tbl_symlink/data_link.txt
3. Create an external Hive table in the ext_tbl_symlink directory (created in step 1):
   CREATE EXTERNAL TABLE file_link_table (...)
   ROW FORMAT DELIMITED FIELDS TERMINATED BY ","
   STORED AS TEXTFILE
   LOCATION '/user/hive/warehouse/ext_tbl_symlink';
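A symlink whose target is missing is easy to overlook once it sits in a table directory. As a rough sketch, the hypothetical helper below (find_broken_links is not part of Hive or HPE Data Fabric) lists dangling symlinks under a directory so that you can check a table location over the NFS mount before querying it.

```shell
# Hypothetical helper: print every symlink under the given directory
# whose target does not exist.
find_broken_links() {
  # -type l selects symlinks; "test -e" follows the link and fails if the target is missing.
  find "$1" -type l ! -exec test -e {} \; -print
}

# Example usage against the table directory from the steps above:
#   find_broken_links /mapr/my.cluster.com/user/hive/warehouse/ext_tbl_symlink
```

Empty output means every symlink in the directory resolves to an existing file or folder.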
Configuring Symlinks Support
When symlink support is enabled, Hive checks whether each file and folder is a symlink, so Hive operations are slower when there are many small files to process.
To enable or disable symlink support, configure the hive.sym.link.support.enabled property in the hive-site.xml file:
<property>
  <name>hive.sym.link.support.enabled</name>
  <value>false</value>
  <description>Enables or disables symlink support in Hive. Enabling this functionality leads to verification of each files and folders to be a symlink which results in slower performance when there are many small files to process.</description>
</property>
The value of this property is false by default. To enable symlink support, set the value to true and restart the Hive services.