Using Spark SQL API

Describes how to use the Spark SQL API in HPE Ezmeral Unified Analytics Software.

In HPE Ezmeral Unified Analytics Software, you can use the Spark SQL API in two ways: with an external metastore or through temporary views.

External Metastore

NOTE
Integration with an external metastore has some limitations.

To integrate Spark with an external metastore, follow these steps:

  1. Set the metastore URI with the spark.hive.metastore.uris configuration option. This URI must be public and accessible from your Spark applications.
  2. Set the spark.sql.warehouse.dir property to the same value that the external metastore uses. For example, to query a managed table, the path to that table must be the same in the metastore and in the Spark runtime.
  3. Verify that the metastore host accepts external connections so that Spark can connect to it. Because the metastore provides no authentication or authorization, configure gateway rules to secure it.
  4. Verify that the data your Spark applications query is in locations accessible from within the Spark runtime.
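
The steps above can be sketched in PySpark as follows. This is a minimal illustration, not the product's prescribed method: the metastore URI, the warehouse path, and the helper names (metastore_conf, metastore_reachable, build_session) are hypothetical, and the sketch assumes that PySpark with Hive support is available in the Spark runtime. The same two properties can also be supplied through whatever configuration mechanism your application is submitted with.

```python
import socket
from urllib.parse import urlparse

# Hypothetical values: replace with your metastore endpoint and warehouse path.
METASTORE_URI = "thrift://metastore.example.com:9083"
WAREHOUSE_DIR = "s3a://example-bucket/warehouse"


def metastore_conf(uri, warehouse_dir):
    """Return the Spark properties named in steps 1 and 2."""
    return {
        "spark.hive.metastore.uris": uri,
        # Must match the warehouse location that the metastore itself uses.
        "spark.sql.warehouse.dir": warehouse_dir,
    }


def metastore_reachable(uri, timeout=5.0):
    """Step 3: check that a TCP connection to the metastore can be opened."""
    parsed = urlparse(uri)
    try:
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except OSError:
        return False


def build_session(conf):
    """Create a Hive-enabled SparkSession with the given properties."""
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark

    builder = SparkSession.builder.appName("external-metastore-sketch")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.enableHiveSupport().getOrCreate()
```

A typical call sequence would be to check metastore_reachable(METASTORE_URI) first, then create the session with build_session(metastore_conf(METASTORE_URI, WAREHOUSE_DIR)) and run a query such as spark.sql("SHOW DATABASES").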

Temporary Views

Temporary views are a feature of the Spark DataFrame API: you read data into a DataFrame and register it under a view name that Spark SQL can query. These views are not global and cannot be shared between Spark applications. You can use a temporary view in the following two scenarios:

  1. If the schema is available for your data, use DataFrame.create[OrReplace]TempView. Some file formats already include a schema, for example, Parquet files or CSV files with a header. Read the file into a DataFrame, call create[OrReplace]TempView with a view name, and then query the data using the Spark SQL API.
  2. If the schema is not available for your data, set it when you create or convert the DataFrame, and then create the temporary view. By default, Spark assigns column names such as _1, _2, and so on, but you can set your own column names.
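
Both scenarios can be sketched in PySpark as follows. This is an illustrative sketch under stated assumptions: the function names, the sample rows, and the column names are hypothetical, and each function expects an existing SparkSession passed in as spark.

```python
# Hypothetical sample data for scenario 2.
PEOPLE_ROWS = [("Alice", 34), ("Bob", 45)]
PEOPLE_COLUMNS = ["name", "age"]


def view_from_parquet(spark, path, view_name):
    """Scenario 1: the file format (Parquet here) already carries a schema."""
    df = spark.read.parquet(path)
    df.createOrReplaceTempView(view_name)
    return spark.sql(f"SELECT * FROM {view_name}")


def view_from_rows(spark, rows, columns, view_name):
    """Scenario 2: no schema is available, so column names are supplied."""
    # Without the schema argument, Spark would name the columns _1, _2, ...
    df = spark.createDataFrame(rows, schema=columns)
    df.createOrReplaceTempView(view_name)
    return spark.sql(f"SELECT * FROM {view_name}")
```

For example, after view_from_rows(spark, PEOPLE_ROWS, PEOPLE_COLUMNS, "people"), a query such as spark.sql("SELECT name FROM people WHERE age > 40") runs against the view for the lifetime of that Spark session.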