Using Alternate Write Modes for HPE Data Fabric Database OJAI Connector
You can use the alternate write modes supported by the HPE Data Fabric Database OJAI Connector for Apache Spark to save an Apache Spark DataFrame to an HPE Data Fabric Database JSON table.
Normally, the Apache Spark DataFrameWriter class supports the following write modes:
- Append
- Overwrite
- ErrorIfExists
- Ignore
The HPE Data Fabric Database OJAI Connector for Apache Spark returns an
OperationNotSupported exception if you attempt to use one of these modes.
The following example returns the error:

import org.apache.spark.sql.SaveMode
import com.mapr.db.spark.sql._
df.write.mode(SaveMode.Append).saveToMapRDB("/tmp/userInfo")

The HPE Data Fabric Database OJAI Connector for Apache Spark provides the following alternative modes:
- Insert: Inserts the data into the HPE Data Fabric Database table. Throws a DBException if a row with the same _id value already exists in the table.
- Overwrite: Overwrites the data in the table with the current DataFrame data. This operation drops the table and creates a new table with the data.
- ErrorIfExists: Returns an exception (TableExistsException) if the table already exists. Otherwise, creates the table and inserts the data.
- Ignore: Ignores the data in the table if the table already exists. Otherwise, creates the table and inserts the data.
- InsertOrReplace: Replaces the row with the row in the DataFrame if a row with the same _id already exists in the table. Otherwise, inserts the new row.
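
The difference between Insert and InsertOrReplace can be pictured with a small in-memory model. The sketch below is only an illustration, not the connector's implementation: the object name RowModeSketch, the Table alias, and the use of a plain Map keyed by _id are all assumptions made for this example. It also assumes that a clashing _id rejects the whole batch, which the description above does not specify.

```scala
// Toy model only: a "table" is a Map from _id to a document value.
// It mirrors the documented row-level behavior, not the connector's code.
object RowModeSketch {
  type Table = Map[String, String]

  // Insert: fails if any incoming _id is already present in the table.
  // (Assumption: the whole batch is rejected on a clash.)
  def insert(table: Table, rows: Table): Either[String, Table] = {
    val clashes = table.keySet.intersect(rows.keySet)
    if (clashes.nonEmpty)
      Left(s"duplicate _id: ${clashes.toSeq.sorted.mkString(", ")}")
    else
      Right(table ++ rows)
  }

  // InsertOrReplace: incoming rows replace existing rows with the same _id;
  // rows with a new _id are simply added.
  def insertOrReplace(table: Table, rows: Table): Table = table ++ rows
}
```

In this model, inserting a row whose _id is already in the table yields a Left (standing in for the DBException), while insertOrReplace with the same input returns a table in which the existing row has been replaced.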
You cannot specify these modes by passing an Apache Spark SaveMode value to the DataFrameWriter mode method. Doing so results in the same OperationNotSupported exception noted earlier. To use these modes, you must call the option method on a DataFrameWriter object. The following example sets the Insert mode:

df.write.option("Operation", "Insert").saveToMapRDB("/tmp/usersInfo")

NOTE: The UPDATE mode is not supported by the HPE Data Fabric Database OJAI Connector. Attempting to use it results in an OperationNotSupported exception.
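
The other alternative modes should be selectable the same way, by passing the mode name as the value of the "Operation" option. The following is a minimal sketch that uses InsertOrReplace. The object name, application name, and sample rows are hypothetical, and the code assumes that it runs on a cluster where the connector is on the classpath and the table path is writable.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import com.mapr.db.spark.sql._ // adds saveToMapRDB to DataFrameWriter

object InsertOrReplaceExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ojai-write-modes") // hypothetical application name
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows; rows are matched on the _id column.
    val df: DataFrame = Seq(("u1", "Ann"), ("u2", "Bob")).toDF("_id", "name")

    // Choose the connector's write mode with option, not with mode(SaveMode...).
    df.write
      .option("Operation", "InsertOrReplace")
      .saveToMapRDB("/tmp/usersInfo")

    spark.stop()
  }
}
```

Because InsertOrReplace replaces rows that share an _id, rerunning this job with the same rows should not raise the duplicate-row error described for Insert.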