Query with S3 Select
Describes how to query objects with S3 Select.
You can query CSV, JSON, and Apache Parquet files.
Usage Notes
Review the following notes about using S3 Select before you run any queries.
- Parquet files
- Before you run any queries against Parquet files, set export MINIO_API_SELECT_PARQUET=on in the /opt/mapr/conf/env.sh file and restart the Object Store server. You can restart the Object Store server from the Services page in the Control System or from the CLI by running the following command:
/opt/mapr/bin/maprcli node services -nodes <space-delimited list of node names> -s3server restart
- JSON documents
- When you query a JSON document, you must include the --json-input parameter and type=document, as shown in the following example:
/opt/mapr/bin/mc sql --json-input type=document --query "select * from S3Object" alias0/mybucket/example5.json
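The Parquet note above can be sketched as a small shell helper. This is a minimal sketch, not a supported procedure: the function name enable_parquet_select is hypothetical, and the sketch assumes that env.sh is a plain shell file to which the export line can simply be appended. The helper adds the line only if it is not already present, so running it twice does not duplicate the setting.

```shell
#!/bin/sh
# Sketch only: enable_parquet_select is a hypothetical helper,
# not part of the product.

# File and setting named in the Parquet usage note.
ENV_FILE="${ENV_FILE:-/opt/mapr/conf/env.sh}"
PARQUET_LINE='export MINIO_API_SELECT_PARQUET=on'

# Append the export line to the env file unless an identical line exists.
# Usage: enable_parquet_select [FILE]
enable_parquet_select() {
    file="${1:-$ENV_FILE}"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    if grep -qxF "$PARQUET_LINE" "$file"; then
        echo "already set in $file"
    else
        printf '%s\n' "$PARQUET_LINE" >> "$file"
        echo "added to $file"
    fi
}
```

After the line is in place, the Object Store server still has to be restarted as described in the note before Parquet queries work.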
Using the CLI
Use the mc sql command to query objects.
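As a minimal sketch of the CLI route, the wrapper below runs mc sql against one object. The function name s3_select, the alias alias0, the bucket mybucket, and the object names are placeholders for illustration. The --query flag and the target argument follow the example in Usage Notes; the --csv-input flag and its rd, fh, and fd keys are taken from the upstream MinIO client and are an assumption here, so check mc sql --help on your installation before relying on them.

```shell
#!/bin/sh
# Sketch only: s3_select is a hypothetical helper, not a product command.

# Path to the mc binary, as used elsewhere on this page.
MC="${MC:-/opt/mapr/bin/mc}"

# Usage: s3_select TARGET QUERY [extra mc sql flags...]
s3_select() {
    target="$1"
    query="$2"
    shift 2
    # Extra flags (input format, compression, and so on) pass through unchanged.
    "$MC" sql "$@" --query "$query" "$target"
}

# Examples (placeholder names; these need a reachable Object Store):
#   s3_select alias0/mybucket/example1.csv "select * from S3Object s limit 5"
#   s3_select alias0/mybucket/example2.csv "select count(*) from S3Object s" \
#       --csv-input 'rd=\n,fh=USE,fd=;'
```

Keeping the query and target as separate arguments avoids quoting problems when the SQL text contains spaces or an asterisk.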
Using the Object Store Interface
- Log in to the Object Store Interface.
- Click the bucket icon in the left pane.
- From the Buckets page, click the bucket in which the object exists.
- Navigate to the Objects tab.
- View the list of objects.
- Scroll through the list of objects, or enter a name in the search field to search for the object.
- Select Query with S3 Select from the Actions menu of the object to query.
- Select the characteristics of the object, such as the format, the number of lines that the object spans, the CSV delimiter for the fields, and the compression type, if any.
- Select the output type, either CSV or JSON, and the CSV delimiter to use.
- Enter the query to run. The default query is SELECT * FROM s3object s LIMIT 5.
- Click Run SQL Query.
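The choices made in the steps above correspond roughly to options of mc sql, so a similar query can be scripted. The sketch below is an illustration under stated assumptions: the helper name run_select is hypothetical, and the --csv-input, --csv-output, --json-output, and --compression flags are taken from the upstream MinIO client rather than from this page, so verify them with mc sql --help. It covers only CSV and JSON input and fixes the CSV field delimiter to a semicolon.

```shell
#!/bin/sh
# Sketch only: run_select is a hypothetical helper. Flags other than
# --json-input and --query are assumed from the upstream MinIO client.

MC="${MC:-/opt/mapr/bin/mc}"

# Usage: run_select TARGET QUERY INPUT_FORMAT COMPRESSION OUTPUT_TYPE
#   INPUT_FORMAT: csv | json      COMPRESSION: none | gzip
#   OUTPUT_TYPE:  csv | json
run_select() {
    target="$1"; query="$2"; format="$3"; compression="$4"; output="$5"
    set --   # rebuild the argument list as mc sql flags

    # Object characteristics: input format.
    case "$format" in
        csv)  set -- "$@" --csv-input 'rd=\n,fh=USE,fd=;' ;;
        json) set -- "$@" --json-input type=document ;;
        *)    echo "unknown format: $format" >&2; return 1 ;;
    esac

    # Object characteristics: compression type, if any.
    case "$compression" in
        none) ;;
        gzip) set -- "$@" --compression GZIP ;;
        *)    echo "unknown compression: $compression" >&2; return 1 ;;
    esac

    # Output type.
    case "$output" in
        csv)  set -- "$@" --csv-output 'rd=\n,fd=;' ;;
        json) set -- "$@" --json-output 'rd=\n' ;;
        *)    echo "unknown output type: $output" >&2; return 1 ;;
    esac

    "$MC" sql "$@" --query "$query" "$target"
}

# Example (placeholder names; needs a reachable Object Store):
#   run_select alias0/mybucket/example5.json "SELECT * FROM s3object s LIMIT 5" json none json
```

Rejecting unknown values up front keeps a typo in the format or output type from turning into a confusing error from the server.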