expandaudit

Describes how to use the expandaudit utility to expand IDs captured in the audit logs to their corresponding names.

As you perform operations on the directories, files, and tables that you are auditing, the audit logs capture records of those operations. Those records identify the affected directories, files, and tables by means of file IDs, the volumes on which the operations took place by means of volume identifiers, and the users who performed the operations by means of user IDs. These IDs are used instead of names in the audit records because fetching the actual names of these objects and users in real-time is costly in terms of performance.

You can use the expandaudit utility to create copies of your log files in which the IDs are resolved into names and inserted into the audit records.

This utility acts on audit logs that exist in the current data-fabric cluster at the time that the utility is run.

Restrictions

This utility operates on audit logs for file system operations and HPE Ezmeral Data Fabric Database operations, which are logged in a local data-fabric volume on each node where the operations are performed. These operations are logged in FSAudit and DBAudit log files.

File identifiers are converted to names only when either of the following conditions is met:
  • The file exists at the time that expandaudit is run.
  • The file has been deleted but the deletion of the file was logged and the log files being processed by expandaudit include the record of the file deletion.

If a volume is deleted, expandaudit does not convert identifiers for files that were in the volume unless the creation of both the volume and the files was logged.

If the creation of a file is audited and the file is later renamed, the file ID is converted to the current name.

Permissions

Although the permissions on the tool are 755, the tool generates output only when run by root or the user mapr.

Syntax

/opt/mapr/bin/expandaudit
    [-volumename volume name]
    [-volumeid volume ids. Either volume name or id must be specified]
     -o output directory
    [-i input directory]
    [-d Specify for deleted volumes only]
    [-cluster cluster name]
    [-t number of threads used for parallel expansion across cluster nodes. default 10]
    For deleted volumes, user specified volume name will be used during expansion
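The syntax above can also be assembled programmatically. The following is a minimal Python sketch, not part of the utility: the helper name build_expandaudit_command, its keyword arguments, and the /audit_expanded output directory are hypothetical. It uses only the flags listed in the syntax, and it assumes that -d is a bare flag that takes no value. It checks two documented rules before building the argument list: a volume name or volume ID must be given, and -d requires a volume ID.

```python
EXPANDAUDIT = "/opt/mapr/bin/expandaudit"  # path taken from the syntax above


def build_expandaudit_command(output_dir, volume_name=None, volume_id=None,
                              input_dir=None, deleted=False,
                              cluster=None, threads=None):
    """Return the argument list for one expandaudit run (hypothetical helper)."""
    if volume_name is None and volume_id is None:
        raise ValueError("specify volume_name or volume_id")
    if deleted and volume_id is None:
        raise ValueError("-d requires a volume ID")

    argv = [EXPANDAUDIT]
    if volume_name is not None:
        argv += ["-volumename", volume_name]
    if volume_id is not None:
        argv += ["-volumeid", str(volume_id)]
    argv += ["-o", output_dir]          # the only mandatory option
    if input_dir is not None:
        argv += ["-i", input_dir]
    if deleted:
        argv.append("-d")               # assumption: -d takes no value
    if cluster is not None:
        argv += ["-cluster", cluster]
    if threads is not None:
        argv += ["-t", str(threads)]
    return argv


if __name__ == "__main__":
    cmd = build_expandaudit_command("/audit_expanded",
                                    volume_name="data_analysis")
    print(" ".join(cmd))
```

To run the command, the list could be passed to subprocess.run(cmd, check=True) on a cluster node, as root or as the user mapr.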

Parameters

Parameter Description
cluster The name of the cluster on which to run the command.
d Required for deleted volumes as it indicates that the volume is deleted. If you specify this parameter, you must specify a volume ID to be used during expansion. The deleted volume is tracked by the specified volume ID. You can optionally specify a volume name. This specified volume name is used for the expanded output.
o The directory in the data-fabric file system in which to create the copies of the audit logs. The directory must already exist.
The directory structure is:
<output directory>/<volume id>/<node>/<day>/<expanded audit log files>
The file names are the same as the names of the input files, though you might see the following extensions:
  • .part: If present, this extension is on the log file with the most recent date. The input log file that corresponds to this output file might still have been receiving new audit records at the time that the expandaudit utility was run. If the utility is run again with the same output directory, the utility will update the .part file by including the most recent records and converting the identifiers in those records.
  • .pending: This extension indicates files that contain one or more identifiers that the utility could not convert.
NOTE
Sometimes, you might see a file with both extensions, .part.pending, which indicates that there is a problem converting identifiers in the most recent audit file.
i The input directory that contains the cluster audit logs. The default value is /var/mapr/local/.
t The number of threads to use for parallel expansion across cluster nodes. The default value is 10.
volumename The name of the volume being audited. You must specify either the volumename or the volumeid parameter.
volumeid The ID of the volume being audited. You must specify either the volumename or the volumeid parameter.
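
The output layout described for the o parameter can be inspected with a short script. The following Python sketch is illustrative only: the helper name describe_expanded_file and the node, day, and file names in the example are hypothetical, and it assumes that a file carrying both extensions ends in .part.pending. It works on path strings only and does not access the file system.

```python
from pathlib import PurePosixPath


def describe_expanded_file(path, output_dir):
    """Split <output directory>/<volume id>/<node>/<day>/<file> into fields."""
    rel = PurePosixPath(path).relative_to(PurePosixPath(output_dir))
    if len(rel.parts) != 4:
        raise ValueError(f"unexpected layout: {path}")
    volume_id, node, day, file_name = rel.parts

    # .pending: one or more identifiers could not be converted.
    pending = file_name.endswith(".pending")
    stem = file_name[:-len(".pending")] if pending else file_name
    # .part: the input log may still have been receiving records.
    partial = stem.endswith(".part")

    return {
        "volume_id": volume_id,
        "node": node,
        "day": day,
        "file": file_name,
        "partial": partial,
        "pending": pending,
    }


if __name__ == "__main__":
    # Hypothetical path; real node, day, and file names will differ.
    example = "/audit_expanded/68048396/node1/2015_06_06/FSAudit.log.json.part"
    print(describe_expanded_file(example, "/audit_expanded"))
```

Files flagged as partial are candidates for a later rerun with the same output directory, and files flagged as pending contain identifiers that still need attention.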

Sample Expansion of a Record for File System Operations

Original record
{"timestamp":{"$date":"2015-06-06T13:02:23.746Z"},"operation":"GETATTR","uid":"1","ipAddress": 
"10.10.104.53","srcFid":"2049.652.263696","volumeId":68048396,"status":0}
Record processed by the expandaudit utility
{"timestamp":{"$date":"2015-06-06T13:02:23.746Z"},"operation":"GETATTR","user":
"userA","uid":"1","ipAddress":"10.10.104.53","srcPath":"/customers/US_Western_Region.json", 
"srcFid":"2049.3296.268968","volumeName":"data_analysis","volumeId":68048396,"status":0}
ATTENTION
Here, uid expands to user, srcFid expands to srcPath, and volumeId expands to volumeName. The original fields are also preserved in the output.
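
To see which fields the expansion adds, the two sample records can be compared directly. This is a small Python sketch that uses the records shown above; the helper name added_fields is hypothetical.

```python
import json

# Sample records copied from this section.
original = json.loads(
    '{"timestamp":{"$date":"2015-06-06T13:02:23.746Z"},"operation":"GETATTR",'
    '"uid":"1","ipAddress":"10.10.104.53","srcFid":"2049.652.263696",'
    '"volumeId":68048396,"status":0}'
)
expanded = json.loads(
    '{"timestamp":{"$date":"2015-06-06T13:02:23.746Z"},"operation":"GETATTR",'
    '"user":"userA","uid":"1","ipAddress":"10.10.104.53",'
    '"srcPath":"/customers/US_Western_Region.json",'
    '"srcFid":"2049.3296.268968","volumeName":"data_analysis",'
    '"volumeId":68048396,"status":0}'
)


def added_fields(before, after):
    """Return the keys present in the expanded record but not in the original."""
    return [key for key in after if key not in before]


if __name__ == "__main__":
    print(added_fields(original, expanded))
    # prints ['user', 'srcPath', 'volumeName']
```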

Sample Expansion of a Record for HPE Ezmeral Data Fabric Database Table Operations

Original record
{"timestamp":{"$date":"2015-06-06T13:08:54.474Z"},"operation":"DB_PUT","uid":"1","ipAddress":
"10.10.104.51","volumeId":68048396,"columnFamily":"fam63","columnQualifier":"col_96","tableFid":
"2049.56.262518","status":0}
Record processed by the expandaudit utility
{"timestamp":"{$date=2015-06-06T13:08:54.474Z}","operation":"DB_PUT","user":"userA","uid": 
"1","ipAddress":"10.10.104.51","volumeName":"mapr.cluster.root","volumeId":"68048396", 
"columnFamily":"fam63","columnQualifier":"col_96","tablePath":"/mytable","tableFid":"2049.56.262518", 
"status":"0"}
ATTENTION
Here, uid expands to user, volumeId expands to volumeName, and tableFid expands to tablePath. The original fields are also preserved in the output.
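
In the samples above, the original record stores timestamp as a nested {"$date": ...} object and volumeId and status as numbers, whereas the expanded record stores timestamp as the string "{$date=...}" and volumeId and status as strings. A script that reads both kinds of records may want to normalize them first. The following Python sketch handles only the two forms shown in these samples; whether every expanded record follows the same shape is an assumption, and the helper names are hypothetical.

```python
from datetime import datetime, timezone

_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # matches 2015-06-06T13:08:54.474Z


def parse_audit_timestamp(value):
    """Accept {"$date": "..."} or the string "{$date=...}" and return UTC time."""
    if isinstance(value, dict):
        text = value["$date"]
    else:
        text = value.strip()
        if text.startswith("{$date=") and text.endswith("}"):
            text = text[len("{$date="):-1]
    return datetime.strptime(text, _FORMAT).replace(tzinfo=timezone.utc)


def normalize_record(record):
    """Return a copy with a parsed timestamp and integer volumeId and status."""
    out = dict(record)
    out["timestamp"] = parse_audit_timestamp(record["timestamp"])
    for key in ("volumeId", "status"):
        if key in out:
            out[key] = int(out[key])
    return out


if __name__ == "__main__":
    raw = {"timestamp": "{$date=2015-06-06T13:08:54.474Z}",
           "operation": "DB_PUT", "volumeId": "68048396", "status": "0"}
    print(normalize_record(raw))
```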