Configuring the Kafka Storage Plugin
To configure Kafka as a data source in Drill, update the <drill_home>/jars/3rdParty directory with the required JAR files, restart Drill, and configure the kafka storage plugin in the Drill Web UI.
Verify that the nodes in your cluster meet the requirements and then complete the steps listed.
Requirements
- HPE Ezmeral Data Fabric 7.0 or later cluster
- Drill 1.16.1 or later installed on nodes
- The HPE Ezmeral Data Fabric Kafka client package (kafka-2.1.1, 2.6.1, or later) installed on at least one node. The Kafka client installation provides the following Kafka JAR files, which you copy into the <drill_home>/jars/3rdParty directory as described under Steps:
  NOTE: Kafka 2.1.1 is used as an example. The version of your Kafka JAR files may differ.
  - Kafka-2.1.1
    - kafka_2.11-2.1.1.200-mapr-710.jar
    - kafka-clients-2.1.1.200-mapr-710.jar
  - Kafka-2.6.1 (if you have eep-800 or later installed)
    - kafka_2.13-2.6.1.0-eep-800.jar
    - kafka-clients-2.6.1.0-eep-800.jar
    - kafka-eventstreams-0.1.0.0-eep-800.jar
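One way to check the last requirement is to list the Kafka JAR files that the client installation placed on a node. The following is a minimal sketch, assuming the client libraries live under /opt/mapr/kafka/kafka-*/libs (the location named in the steps); list_kafka_jars is a hypothetical helper, not part of Drill or the Data Fabric tooling.

```shell
#!/usr/bin/env bash
# Sketch: list the Kafka JAR files found in a Kafka client libs directory.
# list_kafka_jars is a hypothetical helper, not a Drill or Data Fabric command.
list_kafka_jars() {
  local libs_dir="$1"
  local jar
  for jar in "$libs_dir"/kafka*.jar; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$jar" ] || continue
    basename "$jar"
  done
}

# Example (path assumed from a default client installation):
# for d in /opt/mapr/kafka/kafka-*/libs; do list_kafka_jars "$d"; done
```

If the function prints nothing for a directory, the Kafka client package is probably not installed on that node, or it is installed in a different location.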
Steps
- Remove the specified JAR files from the <drill_home>/jars/3rdParty directory based on the Drill installation method:
  - If you installed Drill using RPM or Debian packages, remove only the JAR files that start with kafka, such as kafka-clients-<version>.jar and kafka_<version>.jar, from the <drill_home>/jars/3rdParty directory.
  - If you installed Drill using a TAR file, remove all the JAR files that start with mapr or kafka, such as maprdb-<version>-mapr.jar, maprfs-<version>-mapr.jar, kafka_<version>-mapr.jar, and kafka-clients-<version>.jar, from the <drill_home>/jars/3rdParty directory.
- (Perform this step only if you installed Drill using a TAR file.) Copy the following JAR files into the <drill_home>/jars/3rdParty directory:
  - Copy the mapr-streams-6.2.0.0-mapr.jar file from the /opt/mapr/lib directory into the <drill_home>/jars/3rdParty directory.
  - Copy the following Kafka JAR files from the /opt/mapr/kafka/kafka-*/libs directory into the <drill_home>/jars/3rdParty directory:
    NOTE: Kafka 2.1.1 is used as an example. The version of your Kafka JAR files may differ.
    - Kafka-2.1.1
      - kafka_2.11-2.1.1.200-mapr-710.jar
      - kafka-clients-2.1.1.200-mapr-710.jar
    - Kafka-2.6.1 (if you have eep-800 or later installed)
      - kafka_2.13-2.6.1.0-eep-800.jar
      - kafka-clients-2.6.1.0-eep-800.jar
      - kafka-eventstreams-0.1.0.0-eep-800.jar
- Issue the following command to restart Drill:
$ maprcli node services -name drill-bits -action restart -nodes <node hostnames separated by a space>
- Log in to the Drill Web UI, and configure the kafka storage plugin. See Kafka Storage Plugin for instructions.
NOTE: When configuring the kafka storage plugin, you must also include the following parameter in the storage plugin configuration:
"streams.consumer.default.stream": "<path-to-stream>"
Usage Example
This example shows a Drill query on a Streams data set that was made accessible to Drill through the kafka storage plugin.
The kafka storage plugin configuration includes the streams.consumer.default.stream parameter pointing to the /YelpStream directory, as shown:
"streams.consumer.default.stream": "/YelpStream"
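To show where this parameter might sit in a complete plugin definition, the following sketch prints a skeleton configuration for a given stream path. Placing the parameter inside "kafkaConsumerProps" is an assumption based on the general layout of Drill's kafka plugin configuration, and kafka_plugin_config is a hypothetical helper; a working configuration also needs the consumer properties described in Kafka Storage Plugin.

```shell
#!/usr/bin/env bash
# Sketch: print a skeleton kafka storage plugin configuration.
# kafka_plugin_config is a hypothetical helper. The placement of the
# parameter under "kafkaConsumerProps" is an assumption; other required
# consumer properties are intentionally omitted.
kafka_plugin_config() {
  local stream_path="$1"
  cat <<EOF
{
  "type": "kafka",
  "kafkaConsumerProps": {
    "streams.consumer.default.stream": "${stream_path}"
  },
  "enabled": true
}
EOF
}

kafka_plugin_config /YelpStream
```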
use kafka;
+-----+----------------------------------+
| ok | summary |
+-----+----------------------------------+
| true | Default schema changed to [kafka] |
+-----+----------------------------------+
show tables;
+-------------+---------------------------+
| TABLE_SCHEMA | TABLE_NAME |
+-------------+---------------------------+
| kafka | /YelpStream:UserTable |
| kafka | /YelpStream:ReviewTable |
| kafka | /YelpStream:BusinessTable |
+-------------+---------------------------+
The following query selects from the BusinessTable in the /YelpStream directory, limiting the results to one row of data:
select * from `/YelpStream:BusinessTable` limit 1;
+---+----------+-----------+----------+----+------------+-----+--------+---------+----+-------------+----+------------+-----+-----+----+----------+----------------+--------------+-----------------+-----------+
| _id | attributes | business_id | categories | city | full_address | hours | latitude | longitude | name | neighborhoods | open | review_count | stars | state | type | kafkaTopic | kafkaPartitionId | kafkaMsgOffset | kafkaMsgTimestamp | kafkaMsgKey |
+---+----------+-----------+----------+----+------------+-----+--------+---------+----+-------------+----+------------+-----+-----+----+----------+----------------+--------------+-----------------+-----------+
| --1emggGHgoG6ipd_RMb-g | {"Accepts Credit Cards":"true","Parking":{"garage":"false","lot":"true","street":"false","valet":"false","validated":"false"},"Price Range":"1","Ambience":{},"Good For":{},"Music":{}} | --1emggGHgoG6ipd_RMb-g | ["Food","Convenience Stores"] | Las Vegas | 3280 S Decatur Blvd
Westside
Las Vegas, NV 89102 | {"Friday":{},"Monday":{},"Saturday":{},"Sunday":{},"Thursday":{},"Tuesday":{},"Wednesday":{}} | 36.1305306 | -115.2072382 | Sinclair | ["Wes