HPE Ezmeral Data Fabric Database JSON CopyTable
Copies data from one HPE Ezmeral Data Fabric Database JSON table to another HPE Ezmeral Data Fabric Database JSON table.
If the destination table does not exist, mapr copytable creates the destination table with the same metadata (column families and access control expressions) as the source table, and then copies data.
If the destination table exists, mapr copytable copies data only.
Required Permissions
The user ID that runs mapr copytable must have the following permissions, which you can grant with access-control expressions:
- The permission readAce on the volume where the source table is located, and the permission writeAce on the volume where the destination table is or will be located.
- The permission adminperm on the source table.
- The permission for column-family and column reads (readperm) on the data in the source table that you want to copy.
- When bulkload = false (default), the permission for column writes (writeperm) on the destination table.
- When bulkload = true, the permission to load the destination table with bulk loads (bulkloadperm).
- If the destination table does not yet exist: createrenamefamily on the source table.
For information about how to set permissions on volumes, see Setting Whole Volume ACEs.
For information about how to set permissions on tables, see Enabling Table and Stream Authorizations with ACEs.
The mapr user is not treated as a superuser. HPE Ezmeral Data Fabric Database does not allow the mapr user to run this utility unless that user is given the relevant permission or permissions with access-control expressions.
Syntax
mapr copytable
-src <source table path>
-dst <destination table path>
[-fromID <start key>]
[-toID <end key>]
[-bulkload <true|false> (default: false)]
[-mapreduce <true|false> (default: true)]
[-cmpmeta <true|false> (default: true)]
[-numthreads <number of threads> (default: 16)]
[-maxsplits <integer> (default: 2000)]
Parameters
Parameter | Description |
---|---|
src | The path of the table that you want to copy from. |
dst | The path of the table that you want to copy to. |
fromID | The start key, that is, the document ID from which copying starts. |
toID | The end key, that is, the document ID at which copying ends. |
bulkload | A Boolean value that specifies whether or not to perform a full bulk load of the table. The default is not to use bulk loading (false). To use bulk load, you must set the -bulkload parameter of the table to true by running the command maprcli table edit -path <path to table> -bulkload true. |
mapreduce | A Boolean value that specifies whether or not to use a MapReduce program to perform the copying operation. The default, preferred method is to use a MapReduce program (true). When this parameter is set to false, the copy runs as a client process that uses the number of threads specified by -numthreads. |
cmpmeta | A Boolean value that specifies whether or not to compare table metadata such as column families and ACEs. The default is to compare metadata (true). Such comparisons are done when the destination table exists before mapr copytable is run, and they check that the user ID that runs mapr copytable has the proper permissions on the destination table. Set the value of this parameter to false to skip the comparison. |
numthreads | When -mapreduce is false, this parameter specifies the number of threads allocated to perform the copying of data. The default is 16. If additional CPU resources are available, you might want to increase the number of threads to achieve better performance. |
maxsplits | Sets the maximum number of presplit tablets for the destination table. The default is 2000. If copytable fails with an Error NO ENTry message during table creation, the operation could not complete within the timeout (10 minutes); in that case, reduce the value of -maxsplits. This functionality requires a patch. See Applying a Patch. |
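The bulkload row describes a two-step sequence: enable bulk load on the destination table, then run the copy with -bulkload true. The following is a minimal sketch of that sequence, assuming the destination table already exists. The function name bulkload_copy_plan and the path /user1/tableB are hypothetical, and the function only prints the commands so that they can be reviewed before anyone runs them.

```shell
#!/usr/bin/env bash
# Print (do not run) the two commands for a bulk-load copy.
# bulkload_copy_plan is a hypothetical helper, not part of the mapr tooling.
bulkload_copy_plan() {
  local src="$1" dst="$2"
  # Step 1: enable bulk load on the destination table (command taken from the bulkload row).
  printf 'maprcli table edit -path %s -bulkload true\n' "$dst"
  # Step 2: run the copy with bulk loading turned on.
  printf 'mapr copytable -src %s -dst %s -bulkload true\n' "$src" "$dst"
}

# Example paths are illustrative only.
bulkload_copy_plan /user1/tableA /user1/tableB
```

Printing the commands rather than executing them keeps the sketch safe to try on a machine without a cluster; the user ID that eventually runs them still needs bulkloadperm on the destination table.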
Example
The following example copies documents starting from ID user000001 to ID user009999:
[user@hostname ~]$ mapr copytable -src /user1/tableA -dst /mapr/clusterB/vol1/tableB -fromID user000001 -toID user009999
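The same range copy could also run as a client process instead of a MapReduce program, by combining -mapreduce false with -numthreads. The sketch below assembles such a command line using only the documented parameters. The helper name client_copy_cmd is hypothetical, and the thread count of 32 is an arbitrary illustrative value, not a recommendation; the helper prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Build the range-copy command for a client-process (non-MapReduce) run.
# client_copy_cmd is a hypothetical helper; threads falls back to the documented default of 16.
client_copy_cmd() {
  local src="$1" dst="$2" from="$3" to="$4" threads="${5:-16}"
  printf 'mapr copytable -src %s -dst %s -fromID %s -toID %s -mapreduce false -numthreads %s\n' \
    "$src" "$dst" "$from" "$to" "$threads"
}

# 32 threads is an assumed value for a host with spare CPU capacity.
client_copy_cmd /user1/tableA /mapr/clusterB/vol1/tableB user000001 user009999 32
```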
Monitoring mapr copytable Operations
- If the copy table operation runs as a MapReduce v2 application, monitor the application using the ResourceManager UI.
- If the copy table operation runs as a client process, go to the Tables view of the destination table in the Control System. Then, on the Region tab, monitor the pace at which the number of rows increases.