Saving an Apache Spark DataFrame to an HPE Data Fabric Database JSON Table
To save an Apache Spark DataFrame to an HPE Data Fabric Database JSON table
in Scala, invoke the write method on the DataFrame object. This
returns a DataFrameWriter object, on which you can invoke the
saveToMapRDB method. For Java and Python, invoke the
saveToMapRDB method on the MapRDBJavaSession object or
SparkSession object, respectively.
If a row with the same ID already exists, the saveToMapRDB method updates or overwrites that row.
If you want an exception to be thrown in this case, use the insertToMapRDB method instead.
import com.mapr.db.spark.sql._
df.write.saveToMapRDB("/tmp/userInfo")
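The difference between saving and inserting described above can be modeled in a few lines. The following is only a sketch of the semantics in plain Python, with a dict standing in for the table; it does not call the connector, and save_row, insert_row, and DuplicateIdError are hypothetical names chosen for illustration. How the real connector behaves for a batch that fails partway through is not modeled here.

```python
class DuplicateIdError(Exception):
    """Hypothetical error type standing in for the connector's exception."""


def save_row(table, row, id_field="_id"):
    # Save semantics: write the row, replacing any row with the same ID.
    table[row[id_field]] = dict(row)


def insert_row(table, row, id_field="_id"):
    # Insert semantics: refuse to write if the ID is already present.
    if row[id_field] in table:
        raise DuplicateIdError(f"row with ID {row[id_field]!r} already exists")
    table[row[id_field]] = dict(row)


table = {}  # in-memory stand-in for a JSON table, keyed by row ID
save_row(table, {"_id": "u1", "name": "Ann"})
save_row(table, {"_id": "u1", "name": "Anna"})  # same ID: replaced silently

try:
    insert_row(table, {"_id": "u1", "name": "Annie"})
    insert_outcome = "inserted"
except DuplicateIdError:
    insert_outcome = "rejected"
```

In this model the second save replaces the first row, while the insert with the same ID is rejected and leaves the stored row unchanged.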
For EEP 4.1.0 and later, you can invoke the saveToMapRDB method directly
on the DataFrame object:
def saveToMapRDB(tableName: String, idFieldPath : String = "_id", createTable: Boolean = false, bulkInsert:Boolean = false): Unit
import org.apache.spark.sql.SparkSession
import com.mapr.db.spark.sql._
val df = spark.loadFromMapRDB("/tmp/user_profiles")
df.saveToMapRDB("/tmp/userInfo", createTable = true)
For Java, to save a DataFrame (Dataset<Row>), apply the following method on a
MapRDBJavaSession object:
def saveToMapRDB[T](df: DataFrame[T], tableName: String, idFieldPath: String, createTable: Boolean, bulkInsert: Boolean): Unit
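import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import com.mapr.db.spark.sql.api.java.MapRDBJavaSession;
// sparkSession is assumed to be an existing SparkSession
MapRDBJavaSession maprSession = new MapRDBJavaSession(sparkSession);
Dataset<Row> ds = maprSession.loadFromMapRDB("/tmp/user_profiles");
maprSession.saveToMapRDB(ds, "/tmp/userInfo");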
For Python, to save a DataFrame, apply the following method on a SparkSession object:
def saveToMapRDB(dataframe, table_name, id_field_path = default_id_field, create_table = False, bulk_insert = False)
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
df = spark.loadFromMapRDB("/tmp/user_profiles")
spark.saveToMapRDB(df, "/tmp/userInfo", create_table=True)
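The optional parameters in the Python signature above can also be passed by keyword. The following is a minimal sketch, assuming a session object that exposes saveToMapRDB with the parameter names shown in that signature. The helper name save_with_options is hypothetical, and the reading of id_field_path as the field that supplies the row ID is an assumption based on its "_id" default in the Scala signature.

```python
def save_with_options(session, df, table_path, id_field="_id"):
    # Hypothetical helper: forwards to the session's saveToMapRDB method,
    # naming every optional parameter from the signature explicitly.
    session.saveToMapRDB(
        df,
        table_path,
        id_field_path=id_field,  # assumed: field that supplies the row ID
        create_table=True,       # ask the connector to create the table
        bulk_insert=False,       # keep the documented default
    )
```

With a live session, a call such as save_with_options(spark, df, "/tmp/userInfo", id_field="user_id") would forward those arguments unchanged; user_id is a made-up column name used only for illustration.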