HPE Ezmeral Data Fabric Database JSON ImportJSON

Imports one or more JSON documents into an HPE Ezmeral Data Fabric Database JSON table. The JSON documents must be flat text files.

Required Permissions

  • The readAce permission on the volume where the JSON documents to import are located.
  • The writeAce permission on the volume in which the destination table is located.

For information about how to set permissions on volumes, see Setting Whole Volume ACEs.

NOTE
The mapr user is not treated as a superuser. HPE Ezmeral Data Fabric Database does not allow the mapr user to run this utility unless that user is given the relevant permission or permissions with access-control expressions.

Syntax

mapr importJSON 
[-idfield <Name of ID field in JSON Data>]
[-bulkload <true|false>, default is false]
[-mapreduce <true|false>, default is true]
-src <text file or directory>
-dst <JSON table>

Parameters

Parameter Description
idfield The name of the field that contains the value to use for each document's _id field.

An _id field is inserted into each document that is imported into a table, if the document does not already contain one.

Documents that do not already contain an _id field must contain a field with a value that can be used for the inserted _id field.

For example, each document might have a product_ID field with a value that would be suitable for the _id field.

Use quotation marks around the name.

bulkload A Boolean value that specifies whether or not to perform a full bulk load of the table. The default is not to use bulk loading (false). To use bulk load, you must set the -bulkload parameter of the table to true by running the command maprcli table edit -path <path to table> -bulkload true.

This parameter cannot be set to true when the -mapreduce parameter is set to false.

mapreduce A Boolean value that specifies whether to use a MapReduce program to perform the copying operation. The default and preferred method is to use a MapReduce program (true).

src The path of a JSON document in text format or a directory of such documents.

If you specify a directory and that directory contains only the JSON files to import, use an asterisk at the end of the path, as in this example: /user/data/*

If you specify a directory and that directory contains both the JSON files to import and other files, use a more specific wildcard, such as *.json.

The path must be in the Data Fabric file system. To move files there from the Linux file system, use the command hadoop fs -copyFromLocal.

dst The path of the destination HPE Ezmeral Data Fabric Database JSON table.
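The option rules in the parameter table can be sketched as a small wrapper that checks a combination before anything is run. This is only an illustration: build_import_cmd is a hypothetical helper, not part of the product, and the paths and the product_ID field name are sample values. It prints the command line rather than executing it.

```shell
# Hypothetical helper: validates the options, then prints the
# mapr importJSON command line instead of running it.
# Arguments: src dst [idfield] [bulkload] [mapreduce]
build_import_cmd() {
  src="$1"
  dst="$2"
  idfield="${3:-}"
  bulkload="${4:-false}"   # documented default
  mapreduce="${5:-true}"   # documented default

  if [ -z "$src" ] || [ -z "$dst" ]; then
    echo "error: -src and -dst are required" >&2
    return 2
  fi

  # Rule from the parameter table: -bulkload true cannot be
  # combined with -mapreduce false.
  if [ "$bulkload" = "true" ] && [ "$mapreduce" = "false" ]; then
    echo "error: -bulkload true requires -mapreduce true" >&2
    return 1
  fi

  cmd="mapr importJSON"
  if [ -n "$idfield" ]; then
    # The field name is wrapped in quotation marks, as the table advises.
    cmd="$cmd -idfield \"$idfield\""
  fi
  cmd="$cmd -bulkload $bulkload -mapreduce $mapreduce -src $src -dst $dst"
  printf '%s\n' "$cmd"
}
```

For example, build_import_cmd '/user/data/*.json' /apps/products product_ID prints a command that uses the product_ID field as the source of each document's _id, while passing true and false as the fourth and fifth arguments is rejected.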

Example

Suppose you have the following three JSON documents in the /tmp/users directory in your Data Fabric file system:

$ hadoop fs -cat /tmp/users/bcummings.json
{"_id":"bcummings","first_name":"Bettie","last_name":"Cummings"}

$ hadoop fs -cat /tmp/users/gjones.json
{"_id":"gjones","first_name":"Gilberto","last_name":"Jones"}

$ hadoop fs -cat /tmp/users/jdoe.json
{"_id":"jdoe","first_name":"John","last_name":"Doe"}
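If you want to reproduce these sample files, one possible approach is to write them locally and then copy them into the Data Fabric file system with hadoop fs -copyFromLocal. The sketch below assumes a scratch directory (LOCAL_DIR and write_user are hypothetical names) and skips the copy step when no hadoop client is on the PATH.

```shell
# Create the three sample documents in a local scratch directory.
# LOCAL_DIR is a hypothetical location chosen for this sketch.
LOCAL_DIR="${LOCAL_DIR:-$(mktemp -d)}"

write_user() {
  # $1 = _id, $2 = first name, $3 = last name
  printf '{"_id":"%s","first_name":"%s","last_name":"%s"}\n' "$1" "$2" "$3" \
    > "$LOCAL_DIR/$1.json"
}

write_user bcummings Bettie Cummings
write_user gjones Gilberto Jones
write_user jdoe John Doe

# Copy to the Data Fabric file system only when the hadoop client exists.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir -p /tmp/users
  hadoop fs -copyFromLocal "$LOCAL_DIR"/*.json /tmp/users/
else
  echo "hadoop client not found; files left in $LOCAL_DIR"
fi
```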

The following command imports the three documents into the JSON table in the path /apps/users:

$ mapr importJSON -idField _id -src /tmp/users/* -dst /apps/users
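A general shell point about the wildcard in -src: an unquoted pattern is expanded by the local shell if it happens to match files on the local file system, whereas a quoted pattern is passed through unchanged. The following sketch shows that behavior with a scratch directory; demo_dir and count_args are hypothetical names, and whether quoting is needed in your environment depends on what exists at the same local path.

```shell
# Demonstrates local glob expansion using a scratch directory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.json" "$demo_dir/b.json"

# Prints how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $demo_dir/*.json)     # pattern expanded by the shell
quoted=$(count_args "$demo_dir/*.json")     # pattern passed through intact

echo "unquoted: $unquoted, quoted: $quoted"
rm -rf "$demo_dir"
```

Quoting the pattern, as in -src '/tmp/users/*', keeps it intact for the utility to interpret.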

You can run mapr dbshell to see the imported documents:

maprdb mapr:> find /apps/users
{"_id":"bcummings","first_name":"Bettie","last_name":"Cummings"}
{"_id":"gjones","first_name":"Gilberto","last_name":"Jones"}
{"_id":"jdoe","first_name":"John","last_name":"Doe"}
3 document(s) found.
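A bulk-load variant of the same import might look like the sketch below. It follows the -bulkload description above by first enabling bulk load on the table with maprcli table edit. Two points are assumptions rather than statements from this page: that a full bulk load is appropriate for the destination table, and that the table's -bulkload setting should be switched back to false afterwards. The run and bulk_import helpers are hypothetical; by default they only print the commands, and setting EXECUTE=1 runs them.

```shell
# Dry run by default: commands are printed, not executed.
# Set EXECUTE=1 on a cluster node to actually run them.
run() {
  if [ "${EXECUTE:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

bulk_import() {
  # $1 = source path or pattern, $2 = destination table
  run maprcli table edit -path "$2" -bulkload true || return 1
  run mapr importJSON -bulkload true -mapreduce true -src "$1" -dst "$2" || return 1
  # Assumption: return the table to normal operation after the load.
  run maprcli table edit -path "$2" -bulkload false
}

bulk_import '/tmp/users/*' /apps/users
```

In dry-run mode this prints the three commands in order, which makes it easy to review the sequence before running it against a real table.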