About Release 8.0.0
This site contains documentation for HPE Data Fabric release 8.0.0, including installation, configuration, administration, and reference content, as well as content for the associated ecosystem components and drivers.
8.0.0 Installation
This section contains information about installing HPE Data Fabric software. It also contains information about how to migrate data and applications from an Apache Hadoop cluster to an HPE Ezmeral Data Fabric cluster.
8.0.0 Upgrade
This section describes how to upgrade HPE Data Fabric software.
8.0.0 Data Fabric
HPE Data Fabric is the industry-leading data platform for AI and analytics that solves enterprise business needs.
8.0.0 Administration
This section describes how to manage the nodes and services that make up a cluster.
8.0.0 Development
This section contains information related to application development for Ezmeral ecosystem components and HPE Data Fabric products, including the file system, Database (Key-Value and JSON), and Event Streams.
- Application Development Process
  Before you start developing applications on the HPE Data Fabric platform, consider how you will get the data into the platform, the storage format of the data, the type of processing or modeling that is required, and how the data will be accessed.
- File Store and Apps
  The following sections provide information about accessing the File Store with C and Java applications.
- HPE Data Fabric Database and Apps
  This section contains information about developing client applications for JSON and key-value tables.
- Apache Kafka Wire Protocol Service
  HPE Data Fabric Streams supports Apache Kafka Wire Protocol Service. Apache Kafka Wire Protocol Service is a TCP/IP service that emulates a Kafka cluster backed by HPE Data Fabric Streams. The service makes it possible for Apache Kafka clients written in any programming language to access topics in HPE Data Fabric Streams.
- Model Context Protocol (MCP)
- HPE Data Fabric Streams and Apps
  HPE Data Fabric Streams brings integrated publish and subscribe messaging to HPE Data Fabric.
- MapReduce and Apps
  This section contains information associated with developing YARN applications.
- Kubernetes Interfaces for Data Fabric
  This section describes how to leverage the capabilities of the Kubernetes Interfaces for Data Fabric.
- Ecosystem Components
  The following sections provide information about each open-source project that is supported by the HPE Data Fabric.
  - Ecosystem Packs
  - Apache Airflow
    This topic provides an overview of Apache Airflow on HPE Data Fabric.
  - AsyncHBase
  - Cascading
  - Apache Drill
  - Apache Flink
  - Hadoop
  - HBase
  - HBase Client and HPE Data Fabric Database Binary Tables
  - HCatalog
  - Hive
  - HttpFS
  - Hue
  - Livy
    Apache Livy is primarily used to provide integration between Hue and Spark.
  - HPE Data Fabric Streams Clients and Tools
    Describes the supported HPE Data Fabric Streams tools and clients.
    - KSQL
      KSQL is an open-source streaming SQL engine that implements continuous, interactive queries.
    - Kafka Streams
      Kafka Streams is a library for building Java or Scala streaming applications, specifically applications that transform input topics into output topics.
    - Kafka REST Proxy
      The Kafka REST Proxy provides a RESTful interface to HPE Data Fabric Streams clusters to consume and produce messages and to perform administrative operations.
    - Kafka Connect
      Kafka Connect is a utility for streaming data between HPE Data Fabric Streams and other storage systems.
    - Kafka Schema Registry
      Kafka Schema Registry provides a RESTful interface for storing and retrieving schemas.
    - Structured Streaming in Spark
      Starting in EEP 5.0.0, structured streaming is supported in Spark.
      - Prerequisites for Using Structured Streaming in Spark
        To deploy a structured streaming application in Spark, you must create a Data Fabric Streams topic and install a Kafka client on all nodes in your cluster.
      - Using Structured Streaming to Create a Word Count Application
        The example in this section creates a dataset representing a stream of input lines from Kafka and prints out a running word count of the input lines to the console.
      - Writing a Structured Spark Stream to HPE Data Fabric Database JSON Table
        The example in this section writes a structured stream in Spark to an HPE Data Fabric Database JSON table.
      - Writing a Spark Stream Word Count Application to HPE Data Fabric Database
        The example in this section writes a Spark stream word count application to HPE Data Fabric Database.
  - NiFi
    This topic provides an overview of Apache NiFi on HPE Data Fabric.
  - OTel
    This topic provides an overview of OpenTelemetry on HPE Data Fabric.
  - Apache Polaris
  - Ranger
  - Apache Spark
  - YARN
  - Zeppelin
- Maven and the HPE Data Fabric
  This section discusses topics associated with Maven and the HPE Data Fabric.
- Developer's Reference
  This section contains in-depth information for the developer.
- API Documentation
  HPE Data Fabric supports public APIs for file system, HPE Data Fabric Database, and HPE Data Fabric Streams. These APIs are available for application-development purposes.
Other Docs
This section contains release-independent information, including installer documentation, ecosystem release notes, interoperability matrices, security vulnerabilities, and links to documentation for other Data Fabric versions.
Glossary
Definitions for commonly used terms in HPE Data Fabric environments.

Structured Streaming in Spark

Starting in EEP 5.0.0, structured streaming is supported in Spark.
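
As a rough illustration of what a structured streaming job looks like, the sketch below outlines a running word count over a Kafka-style topic, similar in spirit to the word count example listed under this topic. It is a minimal sketch, not the documented example: `start_word_count_query`, the `topic` and `bootstrap_servers` parameters, and their values are hypothetical placeholders, and the function assumes PySpark and a Kafka source connector are available on the cluster. The pure-Python helper `running_word_count` is also hypothetical; it only imitates the cumulative result that the streaming query maintains, so the logic can be checked without a cluster.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def running_word_count(batches: Iterable[List[str]]) -> Iterator[Dict[str, int]]:
    """Yield cumulative word counts after each micro-batch of input lines.

    Plain-Python stand-in for the running aggregation; no Spark involved.
    Words are split on whitespace, which only approximates the Spark job below.
    """
    totals: Counter = Counter()
    for lines in batches:
        for line in lines:
            totals.update(line.split())
        # Emit a snapshot of all counts so far, like "complete" output mode.
        yield dict(totals)


def start_word_count_query(spark, topic: str, bootstrap_servers: str):
    """Start a streaming word count that prints running totals to the console.

    Assumptions (not taken from the documentation): `spark` is an existing
    SparkSession, and `topic` / `bootstrap_servers` are placeholder values
    supplied by the caller for the Kafka source.
    """
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.sql.functions import col, explode, split

    # Read the topic as an unbounded table and keep the message value as text.
    lines = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
        .selectExpr("CAST(value AS STRING) AS value")
    )

    # One row per word, then a running count per distinct word.
    words = lines.select(explode(split(col("value"), " ")).alias("word"))
    counts = words.groupBy("word").count()

    # "complete" mode rewrites the full result table after every trigger.
    return counts.writeStream.outputMode("complete").format("console").start()
```

Calling `list(running_word_count([["a b a"], ["b c"]]))` shows how the totals grow from one batch to the next. On a cluster, `start_word_count_query(spark, topic, bootstrap_servers)` returns a streaming query handle, and calling `awaitTermination()` on it keeps the job running.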

HPE Data Fabric 8.0.0 Software Documentation
Abstract	This site contains documentation for HPE Data Fabric Software version 8.0.0, including installation, configuration, administration, and reference content, as well as content for the associated bundled ecosystem components and drivers.
Published	July 2026
Edition	8.0.0
Topic last updated	2024-07-05
