Connecting Data Sources
Provides instructions for connecting HPE Ezmeral Unified Analytics Software to external data sources.
Connecting data sources enables federated access to data for users with the appropriate
permissions. HPE Ezmeral Unified Analytics Software
includes PrestoDB and CSI connectors, enabling connections to multiple types of data sources.
To connect to an external data source, select the data source type and
provide the required connection parameters and credentials.
IMPORTANT
- Only HPE Ezmeral Unified Analytics Software administrators can create data source connections.
- Each data source connection that you create must have a unique name. For example, you can create multiple Hive data source connections, but each one must have a different name.
- EzPresto does not support underscores ( _ ) in data source names. For example, hive_one is not supported; instead, use something like hiveone.
- Access to data in a data source is based on the username and password supplied when creating the data source connection. Once a data source is connected, it is accessible to all users with permission.
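The two naming rules above (unique name, no underscores) can be checked before you open the connection drawer. The following is a minimal sketch only: `validate_data_source_name` is a hypothetical helper, not part of HPE Ezmeral Unified Analytics Software, and it assumes that uniqueness is an exact, case-sensitive match, which the product may or may not enforce in the same way.

```python
def validate_data_source_name(name, existing_names):
    """Return a list of problems; an empty list means the name passes both rules.

    Hypothetical helper for illustration. It checks only the two rules stated
    in this topic and assumes exact, case-sensitive name comparison.
    """
    problems = []
    # Rule: EzPresto does not support underscores in data source names.
    if "_" in name:
        problems.append("name contains an underscore")
    # Rule: each data source connection must have a unique name.
    if name in existing_names:
        problems.append("name is already in use")
    return problems


if __name__ == "__main__":
    existing = {"hiveone"}
    for candidate in ("hive_one", "hiveone", "hivetwo"):
        print(candidate, validate_data_source_name(candidate, existing))
```

Returning a list of problems rather than a single boolean lets a caller report every violated rule at once.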
Complete the following steps to connect a data source:
- In the left navigation pane, select Data Engineering > Data Sources.
- Select the tab that corresponds to the type of data source that you want to connect:
- Structured Data (relational databases, such as MySQL and Hive)
- Object Store Data (S3 object stores, such as AWS S3 and MinIO)
- Data Volumes (mount volumes in file storage, such as HPE Ezmeral Data Fabric File Store and HPE GreenLake for File Storage)
- Complete the steps for the data type selected:
- Structured Data
- On the Structured Data tab, click Add New Data Source.
- Locate the tile with the type of data source that you want to connect, and click Create Connection. For example, if you want to connect to a Hive data source, locate the Hive tile and click Create Connection in the Hive tile.
- In the drawer that opens, enter the connection parameters and then click Connect.
TIP
For every data source that you connect, you can select the Enable Local Snapshot Table option. This option caches remote table data to accelerate queries on the tables. The cache is active for the duration of the configured TTL (time-to-live) or until the remote tables in the data source are altered.
- Object Store Data
- On the Object Store Data tab, click Add New Data Source.
- Locate the tile with the type of data source that you want to connect, and click Add <data-source>. For example, if you want to connect to an Amazon S3 data source, locate the Amazon S3 tile and click Add Amazon S3 in the tile.
- In the drawer that opens, enter the connection parameters and then click Add.
- Data Volumes
- On the Data Volumes tab, click New Volume.
- Locate the tile with the type of data source that you want to connect, and click Add <data-source>. For example, if you want to connect to an HPE GreenLake for File Storage data source, locate the HPE GreenLake for File Storage tile and click Add HPE GreenLake for File Storage in the tile.
- In the drawer that opens, enter the connection parameters and then click Add.
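The Enable Local Snapshot Table option described for structured data sources keeps cached table data until the TTL expires or the remote table is altered. The sketch below illustrates only that expiry rule. It is not how the product implements the cache: the `SnapshotCache` class, the `remote_version` marker used to detect an altered table, and the `fetch` callback are all assumptions made for illustration.

```python
import time


class SnapshotCache:
    """Toy cache: an entry is served only while it is younger than the TTL
    and the remote table's version marker is unchanged (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # table name -> (rows, time cached, remote version when cached)
        self._entries = {}

    def get(self, table, remote_version, fetch):
        now = self.clock()
        entry = self._entries.get(table)
        if entry is not None:
            rows, cached_at, cached_version = entry
            fresh = now - cached_at < self.ttl_seconds
            unchanged = cached_version == remote_version
            if fresh and unchanged:
                return rows  # cache hit
        # Cache miss: TTL expired, table altered, or never cached.
        rows = fetch()
        self._entries[table] = (rows, now, remote_version)
        return rows
```

A caller passes the current version marker on every read, so either an expired TTL or a changed marker triggers a fresh fetch from the remote source.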