Using Alternate Write Modes for HPE Ezmeral Data Fabric Database OJAI Connector

You can use alternate write modes supported by Data Fabric Database OJAI Connector for Apache Spark to save an Apache Spark DataFrame to a Data Fabric Database JSON table.

Normally, the Apache Spark DataFrameWriter class supports the following write modes:

  • Append
  • Overwrite
  • ErrorIfExists
  • Ignore
The HPE Ezmeral Data Fabric Database OJAI Connector for Apache Spark returns an OperationNotSupported exception if you attempt to use one of these modes. The following example returns the error:
import org.apache.spark.sql.SaveMode
import com.mapr.db.spark.sql._

df.write.mode(SaveMode.Append).saveToMapRDB("/tmp/userInfo")

The HPE Ezmeral Data Fabric Database OJAI Connector for Apache Spark provides the following alternative modes:

Insert
Inserts the data into the HPE Ezmeral Data Fabric Database table. Throws a DBException if a row with same _id value already exists in the table.
Overwrite
Overwrites the data in the table with the current DataFrame data. This operation drops the table and creates a new table with the data.
ErrorIfExists
Returns an exception (TableExistsException) if the table already exists. Otherwise, creates the table and inserts the data.
Ignore
Ignores the data in the table if the table already exists. Otherwise, creates the table and inserts the data.
InsertOrReplace
Replaces the row with the row in the DataFrame, if a row with the same _id already exists in the table. Otherwise, inserts the new row.

You cannot specify these modes using the Apache Spark SaveMode method. Doing so results in the same OperationNotSupported exception noted earlier. To use these modes, you must call the option method on a DataFrameWriter object. The following example sets the Insert mode:

df.write.option("Operation", "Insert").saveToMapRDB("/tmp/usersInfo")
NOTE
The UPDATE mode for HPE Ezmeral Data Fabric Database OJAI Connector is not supported and it results in an OperationNotSupported exception.