Auditing in Data Fabric

Data Fabric lets you log audit records for cluster-administration operations and for operations on directories, files, streams, and tables.

The auditing capabilities in Data Fabric are critical for regulatory compliance and for understanding user behavior. Regulations often require proof of which user accessed which data. Logging user behavior helps identify suspicious activity on sensitive data.

What Information is Collected?

If you enable auditing, Data Fabric records information about data access, operations on data objects, and execution of maprcli commands, including the following:

  • All administrator activities that use maprcli commands, REST API calls, and actions performed on a cluster through the Control System
  • Authentication to the Control System
  • Operations on directories and files
  • Operations on HPE Ezmeral Data Fabric Database objects
  • Operations on HPE Ezmeral Data Fabric Streams

How is Auditing Typically Used?

By analyzing audit records, security analysts can answer questions such as these:

  • Who accessed customer records outside of business hours?
  • What actions did users take in the days before leaving the company?
  • What operations were performed without following change control?
  • Are users accessing sensitive files from protected or secured IP addresses?
  • Why do my reports sourced from the same underlying data look different?
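As an illustration of the first question, the following minimal Python sketch filters audit records for access outside business hours. It assumes the records are JSON objects, one per line, that have already been expanded so they carry user names and paths. The field names (timestamp, user, operation, path), the sample data, and the 09:00 to 17:00 window are all hypothetical and do not describe the actual audit log schema.

```python
import json
from datetime import datetime

# Hypothetical sample records; the field names are assumptions made for
# this illustration, not the real audit log layout.
SAMPLE_LOG = """\
{"timestamp": "2024-03-04T02:15:00+00:00", "user": "alice", "operation": "READ", "path": "/customers/records.csv"}
{"timestamp": "2024-03-04T10:30:00+00:00", "user": "bob", "operation": "READ", "path": "/customers/records.csv"}
{"timestamp": "2024-03-04T22:05:00+00:00", "user": "carol", "operation": "WRITE", "path": "/reports/q1.csv"}
"""


def outside_business_hours(records, path_prefix, start_hour=9, end_hour=17):
    """Return (user, operation, timestamp) for matching off-hours records."""
    hits = []
    for rec in records:
        ts = datetime.fromisoformat(rec["timestamp"])
        in_hours = start_hour <= ts.hour < end_hour
        if rec["path"].startswith(path_prefix) and not in_hours:
            hits.append((rec["user"], rec["operation"], rec["timestamp"]))
    return hits


# Parse one JSON object per non-empty line.
records = [json.loads(line) for line in SAMPLE_LOG.splitlines() if line.strip()]
after_hours = outside_business_hours(records, "/customers/")
for user, operation, timestamp in after_hours:
    print(user, operation, timestamp)
```

In practice the same filter could be expressed as a query in Apache Drill or another tool; the Python version only shows the shape of the analysis.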

Data scientists can analyze audit records to answer these questions:

  • Which data is used most frequently, is therefore of high value, and should be shared more broadly?
  • Which data is least commonly used, is therefore of low value, and could be purged?
  • Which data is underused, should be used more, and needs better advertising?
  • Which administrative actions are most commonly performed and are therefore candidates for automation?
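The usage questions above come down to counting how often each object appears in the audit records. The following sketch shows one way to do that in Python. It assumes records that have already been parsed into dictionaries with a hypothetical path field; the sample data is invented for the example.

```python
from collections import Counter

# Hypothetical, already-parsed audit records. The "path" and "operation"
# field names are assumptions for this sketch.
records = [
    {"path": "/data/sales.parquet", "operation": "READ"},
    {"path": "/data/sales.parquet", "operation": "READ"},
    {"path": "/data/inventory.parquet", "operation": "READ"},
    {"path": "/data/sales.parquet", "operation": "READ"},
    {"path": "/data/inventory.parquet", "operation": "WRITE"},
    {"path": "/data/archive.parquet", "operation": "READ"},
]


def access_counts(recs):
    """Count how many audit records refer to each path."""
    return Counter(rec["path"] for rec in recs)


counts = access_counts(records)

# Most frequently accessed object: a candidate for broader sharing.
most_used = counts.most_common(1)[0]

# Least frequently accessed object: a candidate for review or purging.
least_used = min(counts.items(), key=lambda item: item[1])

print("most used:", most_used)
print("least used:", least_used)
```

Counting the operation field instead of path would give a similar ranking of administrative actions that might be worth automating.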

How does Auditing Work?

For a comprehensive explanation of how auditing works, see How Does Auditing Work?.

What are the Levels of Auditing?

Levels of Auditing explains the two levels of auditing.

What are the Prerequisites to Enable Auditing?

Ensure that you perform the prerequisites mentioned in Managing Auditing before enabling auditing.

How to Enable or Disable Auditing of Data Access Operations?

To enable or disable auditing of data access operations, see Enabling and Disabling Auditing of Data Access Operations.

What is Audited for Data Access Operations?

Auditing Data Access Operations describes the data access operations that are audited.

How to Enable or Disable Auditing of Cluster Administration Operations?

To enable or disable auditing of cluster administration operations, see Enabling and Disabling Auditing of Cluster Administration.

What is Audited for Cluster Administration Operations?

Auditing Cluster Operations describes the operations that are audited on a cluster.

How to Selectively Audit Data Fabric Objects?

To selectively audit Data Fabric Objects, see Selective Auditing of File-System, Table, and Stream Operations Using the CLI.

How to Use Audit Logs?

After you enable auditing, audit records are written to the audit logs immediately. You can use Apache Drill or other tools to process these logs. The following diagram shows the workflow for processing audit logs of cluster-administration operations:

The next diagram shows the workflow for processing audit logs of filesystem and table operations.

The step "Expand IDs in log files periodically" refers to the use of the expandaudit utility. Raw audit logs contain file identifiers, volume identifiers, and user identifiers. The expandaudit utility looks up the names associated with those identifiers and writes them to new copies of the audit logs. In addition, the Data Fabric audit streaming feature uses an API to convert file and volume IDs. The information on audit log files can be used to interpret auditing messages.
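The idea of expanding identifiers into names can be sketched in a few lines of Python. This is only a conceptual illustration and does not show how the expandaudit utility is implemented or invoked. The field names (uid, volumeId, userName, volumeName) and the lookup tables are hypothetical; a real tool would resolve the identifiers against the cluster.

```python
import json

# Hypothetical lookup tables standing in for cluster metadata.
VOLUME_NAMES = {68048396: "projects"}
USER_NAMES = {0: "root", 5001: "alice"}


def expand_record(raw_line, volume_names, user_names):
    """Return a new record with names added next to the raw identifiers."""
    rec = json.loads(raw_line)
    expanded = dict(rec)  # copy, so the raw record is left untouched
    if "volumeId" in rec:
        # Fall back to the identifier itself when no name is known.
        expanded["volumeName"] = volume_names.get(rec["volumeId"], str(rec["volumeId"]))
    if "uid" in rec:
        expanded["userName"] = user_names.get(rec["uid"], str(rec["uid"]))
    return expanded


def expand_lines(raw_lines, volume_names, user_names):
    """Produce expanded copies of the raw log lines as JSON strings."""
    return [
        json.dumps(expand_record(line, volume_names, user_names))
        for line in raw_lines
        if line.strip()
    ]


raw_lines = [
    '{"operation": "MKDIR", "uid": 5001, "volumeId": 68048396}',
    '{"operation": "READ", "uid": 42, "volumeId": 7}',
]
expanded_lines = expand_lines(raw_lines, VOLUME_NAMES, USER_NAMES)
for line in expanded_lines:
    print(line)
```

The sketch writes expanded copies rather than modifying the input, mirroring the description above of names being placed in new copies of the audit logs.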

How to Stream Audit Logs?

To stream audit logs, see Streaming Audit Logs.

How to Enable or Disable Audit Streaming?

To enable or disable audit streaming, see Enabling and Disabling Audit Streaming Using the CLI.