HPE Ezmeral Data Fabric – Customer-Managed   7.3 Documentation

Structured Streaming in Spark

Starting in EEP 5.0.0, structured streaming is supported in Spark.

Spark streaming is integrated with HPE Ezmeral Data Fabric Streams for Apache Kafka.

Related Links

  • HPE Ezmeral Data Fabric Streams Clients and Tools
  • Prerequisites for Using Structured Streaming in Spark
    To deploy a structured streaming application in Spark, you must create an HPE Ezmeral Data Fabric Streams topic and install a Kafka client on all nodes in your cluster.
  • Using Structured Streaming to Create a Word Count Application
    The example in this section creates a dataset representing a stream of input lines from Kafka and prints out a running word count of the input lines to the console.
  • Writing a Structured Spark Stream to HPE Ezmeral Data Fabric Database JSON Table
    The example in this section writes a structured stream in Spark to an HPE Ezmeral Data Fabric Database JSON table.
  • Writing a Spark Stream Word Count Application to HPE Ezmeral Data Fabric Database
    The example in this section writes the results of a Spark streaming word count application to HPE Ezmeral Data Fabric Database.
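
The word count topics listed above rely on the idea of a running (cumulative) count that is updated as each new batch of input lines arrives. The following is a minimal sketch of that idea in plain Python. It does not use Spark, Kafka, or any HPE Ezmeral Data Fabric API, and the function name `running_word_count` and the sample input are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, List


def running_word_count(batches: Iterable[List[str]]) -> List[Counter]:
    """Return a snapshot of the cumulative word counts after each batch.

    Each element of `batches` is a list of input lines that arrived
    together. This only imitates the behavior of a streaming aggregation;
    it is an assumption-laden illustration, not the Spark implementation.
    """
    totals: Counter = Counter()
    snapshots: List[Counter] = []
    for lines in batches:
        for line in lines:
            # Split on whitespace and add the words to the running totals.
            totals.update(line.split())
        # Copy the totals so later batches do not change earlier snapshots.
        snapshots.append(Counter(totals))
    return snapshots


if __name__ == "__main__":
    sample_batches = [["apache spark", "spark streams"], ["streams"]]
    for number, snapshot in enumerate(running_word_count(sample_batches)):
        print(f"Batch {number}: {dict(sorted(snapshot.items()))}")
```

In a structured streaming application, the engine maintains this state between batches on the application's behalf; the sketch keeps it in an ordinary `Counter` only to make the accumulation visible.
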
(Topic last modified: 2022-01-18)
©Copyright 2023 Hewlett Packard Enterprise Development LP