Working with Bucket Volumes

Describes how to identify the volume associated with a bucket for offloading, mirroring, and creating snapshots.

Underlying each Object Store bucket is a volume. Every bucket created in an Object Store account is automatically associated with a volume. You can snapshot or mirror a bucket volume for disaster recovery. You can also offload data to reclaim storage space.

Offloading relates to data tiering. When you create an account in Object Store, specify the erasure coding scheme (ecscheme) in the storage_class. All buckets created in the account inherit the ecscheme. The underlying volumes are tiered automatically, so data in a bucket volume can be offloaded to a back-end volume to reclaim storage space.

Before you can snapshot, mirror, or offload a bucket, you must identify the volume associated with the bucket.

Identifying the Volume Associated with a Bucket

Before you mirror, snapshot, or offload a bucket, identify the name of the volume associated with the bucket. You can run either the mrconfig s3 bucketinfo command or the /opt/mapr/bin/mc admin account info command to find the volume name.

Using the mrconfig s3 bucketinfo command
  1. Run the mrconfig s3 bucketinfo command to get the volume ID (volid) of the volume hosting the bucket:
    /opt/mapr/server/mrconfig s3 bucketinfo <bucketName>
    //Example: /opt/mapr/server/mrconfig s3 bucketinfo acct01bkt01
    Note the volid in the output:
    bucketdirfid 20578.43.131282
    oltFid 20578.44.131284
    odtFid 20578.48.131292
    f2oFid 20578.51.131298
    volid 150046236
    creationTime 1644592034617
    accountName acct01
    Now that you have the volid, you can find the name of the volume.
  2. Run the volume list command, indicating the columns for which you want data and filtering on the volid:
    maprcli volume list -columns volumename,volumeid,mountdir -filter volumeid==150046236
    The output lists the requested columns (volume name, volume ID, and mount path, in that order):
    mapr.s3bucketVol.0000021b 150046236  /var/objstore/domains/primary/accounts/201/bucketVols/mapr.s3bucketVol.0000021b
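The two steps above can be chained in a script. The following is a minimal sketch, assuming the bucketinfo output keeps the "field value" layout shown in the sample; the extract_volid helper name is hypothetical and not part of the product.

```shell
# Hypothetical helper: read `mrconfig s3 bucketinfo` output on stdin
# and print the value of the volid field.
# Assumes each output line has the form "<field> <value>", as in the sample above.
extract_volid() {
  awk '$1 == "volid" { print $2; exit }'
}
```

For example, volid=$(/opt/mapr/server/mrconfig s3 bucketinfo acct01bkt01 | extract_volid) captures the volume ID, which you can then pass to maprcli volume list as -filter volumeid==$volid.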
Using the mc admin account info command
  1. Run the mc admin account info command to get the Id of the account that owns the bucket:
    /opt/mapr/bin/mc admin account info myalias myaccount
    Note the Id in the output. This is the account Id, which you use to get the volume information for the bucket:
    Name: myaccount
    Id: 1
    Admin: bob
    DefBucketPolicy: …
    Acl: { []}
    Quota: 102400
    AdvisoryQuota: 51200
    LabelName: default
    DareEnabled: false
    MinRepl: 1
    DesiredRepl: 3
    EcScheme: 2+1
    Size: 1108
    BucketCount: 1
    UserCount: 0
  2. Use the account Id (s3aId==1) to find information for the volumes, including the name:
    maprcli volume list -columns volumename,id,mountdir,ae,used -filter '[s3aId==1]'
    Note the volumename in the output:
    numFidMap  volumename                            numFile  numS3Bucket  volid      mountdir                                                                       used  numDir  numTable
    0          mapr.s3.internal.objecstore.account1  0        0            232398301  /var/objstore/domains/primary/accounts/1                                       0     2       7
    0          mapr.s3.bucketVol.0000002             0        1            59138624   /var/objstore/domains/primary/accounts/1/bucketVols/mapr.s3bucketVol.00000002  1108  0       4
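The account Id can also be turned into the filter expression without copying it by hand. This is a minimal sketch, assuming the account info output contains an "Id: <value>" line as in the sample; the account_filter helper name is hypothetical.

```shell
# Hypothetical helper: read `mc admin account info` output on stdin
# and print a maprcli filter expression for that account's volumes.
# Assumes the output contains a line of the form "Id: <value>".
account_filter() {
  awk '$1 == "Id:" { printf "[s3aId==%s]\n", $2; exit }'
}
```

For example, filter=$(/opt/mapr/bin/mc admin account info myalias myaccount | account_filter) produces [s3aId==1] for the sample account, which you can pass to maprcli volume list with -filter "$filter".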

Viewing Volume Information

Once you have the name of the volume associated with a bucket, run the volume info command to view volume details, such as data tiering information.

Run the volume info command with the name of the volume:
maprcli volume info -name <volumeName> -json

//Example: maprcli volume info -name mapr.s3bucketVol.0000021b -json
Note that the following example output is truncated, but you can see the data tiering details:
	"timeofday":"2022-02-22 07:00:49.530 GMT-0700 PM",
			"gateway":"Currently down",
			"ecstorevolume":" mapr.s3bucketVol.0000021b.236703387",

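To pull only the back-end volume name out of the volume info output, you can filter on the ecstorevolume field. This is a minimal sketch, assuming the field appears on a single line as in the truncated sample; the extract_ecstore_volume helper name is hypothetical.

```shell
# Hypothetical helper: read `maprcli volume info -json` output on stdin
# and print the value of the ecstorevolume field without surrounding quotes
# or leading spaces. Assumes the field and its value are on one line.
extract_ecstore_volume() {
  sed -n 's/.*"ecstorevolume" *: *" *\([^"]*\)".*/\1/p'
}
```

For example: maprcli volume info -name mapr.s3bucketVol.0000021b -json | extract_ecstore_volume
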
Manually Trigger Offloading for a Bucket Volume

When you create an account in Object Store, you configure the erasure coding (EC) topology, which specifies where the back-end volume resides. You also specify the topology where the front-end volume resides and the storage capacity for buckets. When you create a bucket, the system automatically creates the front-end and back-end volumes and configures data tiering.

Data is offloaded automatically to the back-end erasure-coded (EC) volumes when a bucket crosses the storage threshold set for buckets. To offload data and reclaim storage space before the bucket crosses the threshold, perform a manual offload.

After data is offloaded, you access it the same way you did before the offload.

To perform a manual offload, you need the name of the volume associated with the bucket.

To offload data from a bucket volume, run maprcli volume offload on the volume:
maprcli volume offload -name <volumeName> -json

//Example: maprcli volume offload -name mapr.s3bucketVol.0000021b -json
The command outputs the following information:
	"timeofday":"2022-02-22 07:00:49.530 GMT-0700 PM",
	"messages":[ "Successfully started offload."
To check the status of the tier job and offload, run maprcli volume tierjobstatus:
maprcli volume tierjobstatus -name <volumeName> -json

//Example: maprcli volume tierjobstatus -name mapr.s3bucketVol.0000021b -json
Once the offload completes, you can still access the data as you did before it was offloaded. For example, to find the tiering information for the offloaded data, run:
maprcli volume info -name <volumeName> -json | grep -i tier

//Example: maprcli volume info -name mapr.s3bucketVol.0000021b -json | grep -i tier
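In a script, you may want to confirm that the offload request was accepted before you poll for status. The following is a minimal sketch, assuming maprcli is on the PATH and that the success message matches the sample output above; the offload_volume wrapper name is hypothetical.

```shell
# Hypothetical wrapper: start an offload for the named volume and report
# whether the request was accepted.
# Assumes a successful request prints "Successfully started offload."
offload_volume() {
  volume_name=$1
  if maprcli volume offload -name "$volume_name" -json |
      grep -q 'Successfully started offload'; then
    echo "offload started for $volume_name"
  else
    echo "offload did not start for $volume_name" >&2
    return 1
  fi
}
```

For example, offload_volume mapr.s3bucketVol.0000021b returns a nonzero status if the success message is missing, so a calling script can stop before it runs maprcli volume tierjobstatus.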

Mirroring a Bucket Volume

Typically, you mirror data for disaster recovery purposes. You can mirror bucket volumes and then use an S3 interface to access buckets and objects in the mirrored volume. Currently, you cannot promote a mirrored bucket volume to a read/write mirror; you can only read data from the mirrored volume.

Before you mirror a bucket volume, identify the name of the volume associated with the bucket, as described in Identifying the Volume Associated with a Bucket. To mirror a bucket volume:

  1. Run the maprcli volume create command, indicating the source volume, path to the mirrored volume, and volume type:
    /opt/mapr/bin/maprcli volume create -name <mirrorVolumeName> -source <sourceVolumeName> -path <path/to/mirrorVolume> -type mirror
    //Example: /opt/mapr/bin/maprcli volume create -name mirvolbk2 -source <sourceVolumeName> -path /mirvolbk2 -type mirror
  2. Run the maprcli volume mirror start command to start volume mirroring:
    maprcli volume mirror start -name <mirrorVolumeName>
    //Example: maprcli volume mirror start -name mirvolbk2
  3. When mirroring completes, access data in the mirrored volume using an S3 interface, such as the mc ls command:
    /opt/mapr/bin/mc ls <alias>/filestore/<mirrorVolumeName>/
    //Example: /opt/mapr/bin/mc ls alias_m2/filestore/mirvolbk2/
ATTENTION: You must include the keyword filestore in the path to access the mirror.
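Steps 1 and 2 above can be wrapped in one function so that mirroring starts only if the mirror volume is created. This is a minimal sketch, assuming maprcli is on the PATH; the mirror_bucket_volume function name and its argument order are hypothetical.

```shell
# Hypothetical wrapper: create a mirror volume and start mirroring.
# Arguments: <mirrorVolumeName> <sourceVolumeName> <path/to/mirrorVolume>
mirror_bucket_volume() {
  mirror_name=$1
  source_volume=$2
  mirror_path=$3
  # Stop if the mirror volume cannot be created.
  maprcli volume create -name "$mirror_name" -source "$source_volume" \
    -path "$mirror_path" -type mirror || return 1
  maprcli volume mirror start -name "$mirror_name"
}
```

For example: mirror_bucket_volume mirvolbk2 <sourceVolumeName> /mirvolbk2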

Creating a Snapshot of a Bucket Volume

You can snapshot a bucket volume and then access objects in the snapshot. Snapshots provide a point-in-time copy of a volume. Only authorized users can access buckets and objects from a snapshot. Both the bucket policy (from the snapshotted bucket volume) and the IAM policy associated with the user must allow the user access to the buckets or objects. An authorized IAM user can access buckets and objects in snapshots; however, the system denies IAM users access to files.

Before you create a snapshot of a bucket volume, get the name of the volume and its mount path, as described in Identifying the Volume Associated with a Bucket.

To create a snapshot from a bucket volume, run the maprcli volume snapshot create command:
maprcli volume snapshot create -volume <volumeName> -snapshotname <snapshotName>
//Example: maprcli volume snapshot create -volume mapr.s3bucketVol.00000002 -snapshotname snap1

You can access snapshots in the .snapshot directory under the volume mount path, so you need the mount path to reach them. In the following example, the volume mount path is /var/objstore/domains/primary/accounts/1/bucketVols/mapr.s3bucketVol.00000002. You must also include the keyword filestore after the alias. In the following example, the alias and keyword are kalyanalias/filestore/.

To access the data in a snapshot, run:
/opt/mapr/bin/mc ls <alias>/filestore/<volumeMountPath>/.snapshot/<snapshotName>

//Example: /opt/mapr/bin/mc ls kalyanalias/filestore//var/objstore/domains/primary/accounts/1/bucketVols/mapr.s3bucketVol.00000002/.snapshot/snap1
Output returned:
[2022-02-22 10:41:27 PST]  3B /filestore/var/objstore/domains/primary/accounts/1/bucketVols/mapr.s3bucketVol.00000002/.snapshot/snap1/BucketListTable
[2022-02-22 10:41:47 PST]  0B /filestore/var/objstore/domains/primary/accounts/1/bucketVols/mapr.s3bucketVol.00000002/.snapshot/snap1/testac1

In the example, snap1 contains the BucketListTable and an account named testac1.

You can also perform operations on snapshots, such as copying data from an object in a snapshot to another directory. In the following example, bucket f1 data is copied to the /tmp/f11 directory:
/opt/mapr/bin/mc cp kalyanalias/filestore//var/objstore/domains/primary/accounts/1/bucketVols/mapr.s3bucketVol.00000002/.snapshot/snap1/testac1/f1 /tmp/f11
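
Because snapshot paths are long, it can help to build them from their parts. The following is a minimal sketch that joins the alias, the filestore keyword, the volume mount path, and the snapshot name in the same way as the examples above; the snapshot_path helper name is hypothetical.

```shell
# Hypothetical helper: print the mc path to a snapshot of a bucket volume.
# Arguments: <alias> <volumeMountPath> <snapshotName>
# The mount path begins with "/", so the result contains "filestore//",
# which matches the examples above.
snapshot_path() {
  alias_name=$1
  mount_path=$2
  snapshot_name=$3
  printf '%s/filestore/%s/.snapshot/%s\n' "$alias_name" "$mount_path" "$snapshot_name"
}
```

For example: /opt/mapr/bin/mc ls "$(snapshot_path kalyanalias /var/objstore/domains/primary/accounts/1/bucketVols/mapr.s3bucketVol.00000002 snap1)"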