Importing an External S3 Object Store

Describes how to import an external S3 object store into the global namespace.

Prerequisites

If Data Fabric accesses the internet via a proxy server, do the following.
  1. Specify the proxy settings in /opt/mapr/initscripts/mapr-s3server:
    export HTTPS_PROXY="http://<proxy server FQDN or proxy server IP>:8080"
    export http_proxy="http://<proxy server FQDN or proxy server IP>:8080"
    export HTTP_PROXY="http://<proxy server FQDN or proxy server IP>:8080"
    export https_proxy="http://<proxy server FQDN or proxy server IP>:8080"
  2. Run the following command to restart the S3 server service on the current node:
    maprcli node services -name s3server -action restart -nodes $(hostname -f) -json

Data Fabric can now access the internet through the proxy and connect to an external S3 object store.
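
The four export lines in step 1 differ only in the variable name, so generating them from a single host and port helps avoid a typo in one of them. The following is a minimal sketch, not part of Data Fabric: the function name proxy_exports and the host proxy.example.com are hypothetical, and the port defaults to 8080 only because that is the value used in the example above.

```shell
# Hypothetical helper (not a Data Fabric command): print the four proxy
# export lines for one proxy host so that all variants stay in sync.
#   $1 = proxy server FQDN or IP address
#   $2 = proxy port (defaults to 8080, the value used in the example)
proxy_exports() {
  proxy_host="$1"
  proxy_port="${2:-8080}"
  for var_name in HTTPS_PROXY http_proxy HTTP_PROXY https_proxy; do
    printf 'export %s="http://%s:%s"\n' "$var_name" "$proxy_host" "$proxy_port"
  done
}

# Example: review the output before adding it to
# /opt/mapr/initscripts/mapr-s3server.
proxy_exports proxy.example.com
```

The sketch only prints the lines. Editing the init script and restarting the service remain manual steps, as described in the prerequisites.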

About this task

A fabric manager or a fabric user can import an external S3 object store into the global namespace to transfer data from Data Fabric to the external S3 object store. You can import AWS S3, Google Cloud Platform (GCP), WEKA, Scality, VAST, and other S3-compliant object stores into the global namespace to consolidate your data from external S3 object stores on Data Fabric.

Use the following steps to import an external S3 object store into the global namespace by using the Data Fabric UI.

Procedure

  1. Log on to the Data Fabric UI.
  2. If you are a fabric manager, select Fabric manager on the home page. Skip this step if you are a fabric user.
  3. If you are a fabric manager, click Global namespace. Optionally, switch to the fabric user view, where the Import External S3 button appears in the Resources section. If you are a fabric user, the Import External S3 button is available only in the Resources section.
  4. Click Import External S3.
  5. Enter the name for the S3 object store in Name.
  6. Select the S3 vendor type from one of these values:
    • AWS
    • GCP
    • Generic (for WEKA, Scality, VAST, and other S3-compliant object stores)
  7. Depending on the vendor type you selected, fill in the remaining values by consulting the following tables. An asterisk (*) indicates a required field:
    • For AWS:
      Parameter Description
      Name* Name of the S3 object store.
      S3 vendor Choose AWS.
      Region* The AWS region.
      Access type For the AWS vendor type, select one of the following:
      • Access Credentials
      • Secure Token Service
      Access Credentials Selecting this value requires you to specify an access key and secret for access to the S3 object store. See the descriptions of these keys later in this table.
      Secure Token Service Selecting this value requires you to specify a Web identity role ARN. To configure the ARN, see Configuring STS for Data Fabric.

      STS is an access method that provides an alternative to the traditional access key and secret key. For more information, see Integrating the AWS Security Token Service (STS) with Data Fabric.

      Access key* The access key ID of a long-term credential for an Amazon IAM user. Together with the secret key, it enables access to S3 resources for all fabrics in the global namespace. For more information, see Managing access keys for IAM users in the Amazon documentation.
      Secret key* The secret access key that is paired with the access key. Together, the two keys enable access to S3 resources for all fabrics in the global namespace. For more information, see Managing access keys for IAM users in the Amazon documentation.
      Web identity role ARN* An Amazon resource name (ARN) that enables STS authentication. See Configuring STS for Data Fabric.

    • For GCP:
      Parameter Description
      Name* Name of the S3 object store.
      S3 vendor Choose GCP.
      Region* The GCP region.
      Access key* A key that enables access to S3 resources for all fabrics in the global namespace.
      Secret key* A key that enables access to S3 resources for all fabrics in the global namespace.
    • For Generic:
      Parameter Description
      Name* Name of the S3 object store.
      S3 vendor Choose Generic.
      Access key* A key that enables access to S3 resources for all fabrics in the global namespace.
      Secret key* A key that enables access to S3 resources for all fabrics in the global namespace.
      Hostname / IP Address* Enter the host names or IP addresses of the external S3 object store as a comma-separated list.
      S3 server port The server port. The default value is 9000.
      Use TLS encryption TLS encryption enables communication over a secure connection. TLS is enabled by default.
      S3 server certificate If the generic S3 object store is not CA certified, you must drag and drop the S3 server certificate into this box to enable secure communication. If the S3 server is CA certified, the certificate is not required.
  8. Click Import.
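
For the Generic vendor type in step 7, the host list, server port, and TLS setting together describe where the object store can be reached. The following sketch shows one way to expand those three values into one endpoint URL per host, for example to check connectivity before you fill in the form. It is an illustration only: the function name s3_endpoints and the example host names are hypothetical, and the assumption that each host is reached as scheme://host:port is not taken from the Data Fabric documentation.

```shell
# Hypothetical helper (not a Data Fabric command): expand the Generic
# vendor fields into one endpoint URL per host.
#   $1 = comma-separated host names or IP addresses
#   $2 = S3 server port (defaults to 9000, the default in the table)
#   $3 = "true" to use TLS (the default) or "false" for plain HTTP
s3_endpoints() {
  hosts="$1"
  port="${2:-9000}"
  use_tls="${3:-true}"
  if [ "$use_tls" = "true" ]; then
    scheme="https"
  else
    scheme="http"
  fi
  # Split on commas, trim surrounding whitespace, and skip empty entries.
  printf '%s\n' "$hosts" \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | while IFS= read -r host; do
        if [ -n "$host" ]; then
          printf '%s://%s:%s\n' "$scheme" "$host" "$port"
        fi
      done
}

# Example with two hypothetical hosts and the default port and TLS setting.
s3_endpoints "s3a.example.com, s3b.example.com"
```

With the defaults, the example prints https://s3a.example.com:9000 and https://s3b.example.com:9000, one per line. Passing false as the third argument switches the scheme to http, which mirrors clearing the Use TLS encryption option.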

Results

The S3 object store is imported into the global namespace. It appears in the list of resources in the global namespace, or in the list of resources that have your username as the owner.

Related maprcli Commands
To implement the features described on this page, the Data Fabric UI relies on the following maprcli command. The command is provided for general reference. For more information, see maprcli Commands in This Guide.