Integrate Spark-SQL (Spark 1.6.1) with Avro
You integrate Spark-SQL with Avro when you want to read and write Avro data. This information is for Spark 1.6.1 or earlier users.
About this task
Use the following steps to perform the integration. Previous versions of Spark do not require these steps.
Procedure
1. Download the Avro 1.7.7 JAR file to the Spark lib directory
   (/opt/mapr/spark/spark-<version>/lib). You can download the file from the
   Maven repository: http://mvnrepository.com/artifact/org.apache.avro/avro/1.7.7
2. Use one of the following methods to add the Avro 1.7.7 JAR to the classpath:

   - Prepend the Avro 1.7.7 JAR file to spark.executor.extraClassPath and
     spark.driver.extraClassPath in the spark-defaults.conf file
     (/opt/mapr/spark/spark-<version>/conf/spark-defaults.conf):

       spark.executor.extraClassPath /opt/mapr/spark/spark-<spark_version>/lib/avro-1.7.7.jar:<rest_of_path>
       spark.driver.extraClassPath /opt/mapr/spark/spark-<spark_version>/lib/avro-1.7.7.jar:<rest_of_path>

   - Specify the Avro 1.7.7 JAR file with command-line arguments on the
     spark-shell:

       /opt/mapr/spark/spark-<version>/bin/spark-shell \
         --packages com.databricks:spark-avro_2.10:2.0.1 \
         --driver-class-path /opt/mapr/spark/spark-<version>/lib/avro-1.7.7.jar \
         --conf spark.executor.extraClassPath=/opt/mapr/spark/spark-<version>/lib/avro-1.7.7.jar \
         --master <master-url>
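Once the JAR is on the classpath, the spark-avro package can read and write Avro data as a DataFrame data source. The following spark-shell sketch illustrates this; the file paths (/tmp/episodes.avro, /tmp/episodes-copy) are hypothetical placeholders, and the example assumes spark-avro 2.0.1 with Spark 1.6.x, where sqlContext is created for you in the shell:

    // Paste into spark-shell; sqlContext is provided automatically in Spark 1.6.x.
    // /tmp/episodes.avro and /tmp/episodes-copy are hypothetical example paths.

    // Read an Avro file into a DataFrame using the spark-avro data source.
    val df = sqlContext.read
      .format("com.databricks.spark.avro")
      .load("/tmp/episodes.avro")

    // Inspect the schema inferred from the Avro file.
    df.printSchema()

    // Write the DataFrame back out in Avro format.
    df.write
      .format("com.databricks.spark.avro")
      .save("/tmp/episodes-copy")

If the integration is misconfigured (for example, the Avro 1.7.7 JAR is not first on the classpath), the read typically fails with an Avro class-version error, which is a useful way to verify the steps above.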