Configuration Parameters

This topic describes configuration parameters that are either specific to HPE Ezmeral Data Fabric Streams or supported from Apache Kafka.

Table 1. AdminClient configuration parameters specific to HPE Ezmeral Data Fabric Streams
Parameter Description
This parameter, when set during creation of the AdminClient instance, ensures that the specified stream is using the AdminClient instance for all administrative operations.


/mapr/<cluster name>/<volume name>/<stream name>
Specifies the length of time in milliseconds to wait for a response from the HPE Ezmeral Data Fabric Streams server if soft mount is configured (fs.mapr.hardmount is set to false). Default: 120000. Minimum: 30000.
Applicable as of MapR 6.0.1, where it is used instead of fs.mapr.rpc.timeout.

For producer and consumer applications, set this value to greater than 50000 in both producers and consumers to avoid Message Fetch RPC overload.

use.brokers A boolean flag that specifies whether the Apache Kafka clients (Producer, Consumer, Admin) connect to Apache Kafka brokers or to HPE Ezmeral Data Fabric Streams services. Default: false

To connect the Apache Kafka clients to HPE Ezmeral Data Fabric Streams services, set this flag to false.

In this case, the bootstrap.servers property is optional and is ignored.

To connect the Apache Kafka clients to Apache Kafka brokers, set this flag to true.

In this case, the bootstrap.servers property is required and must be set in the client configuration and console scripts.
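The two use.brokers modes described above can be sketched as plain client properties. This is a minimal illustration that uses only java.util.Properties; the class name, method names, and the example broker address are hypothetical, and a real application would pass the resulting properties to its Kafka client constructor.

```java
import java.util.Properties;

public class UseBrokersSketch {

    /** Properties for connecting to HPE Ezmeral Data Fabric Streams services. */
    public static Properties forDataFabricStreams() {
        Properties props = new Properties();
        // false is the documented default; it is set explicitly here for clarity.
        props.setProperty("use.brokers", "false");
        // bootstrap.servers is optional and ignored in this mode, so it is left unset.
        return props;
    }

    /** Properties for connecting to Apache Kafka brokers. */
    public static Properties forKafkaBrokers(String bootstrapServers) {
        // bootstrap.servers is required when use.brokers is true.
        if (bootstrapServers == null || bootstrapServers.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "bootstrap.servers is required when use.brokers is true");
        }
        Properties props = new Properties();
        props.setProperty("use.brokers", "true");
        props.setProperty("bootstrap.servers", bootstrapServers);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(forDataFabricStreams());
        // The broker address below is a placeholder, not a real host.
        System.out.println(forKafkaBrokers("broker1.example.com:9092"));
    }
}
```

Failing fast on a missing bootstrap.servers value is a design choice of this sketch, not documented client behavior.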
Table 2. Consumer configuration parameters specific to HPE Ezmeral Data Fabric Streams
Parameter Description
streams.consumer.buffer.memory Specifies how much memory to use for caching pre-fetched messages. Messages that are in subscribed topics and partitions are pre-fetched and cached to improve performance. Default: 64MB
Specifies the path and name of the stream that the consumer subscribes to if the consumer does not specify a stream when subscribing to a topic.

Specifies the length of time in milliseconds to wait for a response from the HPE Ezmeral Data Fabric Streams server if a soft mount is configured (fs.mapr.hardmount is set to false). Default: 305000 Minimum: 300000

For producer and consumer applications, set this value to greater than 50000 in both producers and consumers to avoid Message Fetch RPC overload.

use.brokers A boolean flag that specifies whether the Apache Kafka clients (Producer, Consumer, Admin) connect to Apache Kafka brokers or to HPE Ezmeral Data Fabric Streams services. Default: false

To connect the Apache Kafka clients to HPE Ezmeral Data Fabric Streams services, set this flag to false.

In this case, the bootstrap.servers property is optional and is ignored.

To connect the Apache Kafka clients to Apache Kafka brokers, set this flag to true.

In this case, the bootstrap.servers property is required and must be set in the client configuration and console scripts.
Table 3. Consumer configuration parameters supported from Apache Kafka
Parameter Description
The frequency, in milliseconds, at which offsets are committed. Default: 1000 ms
auto.offset.reset Specifies what HPE Ezmeral Data Fabric Streams should do when there is no initial offset, such as when a consumer starts reading from a partition. Default: latest
Reset the offset to the offset of the earliest message in the partition.
Reset the offset to the offset of the latest message in the partition.
If true, periodically commits the highest offsets of the messages fetched by the consumer in all of the partitions for the topics that the consumer is subscribed to. Default: true
fetch.min.bytes The minimum amount of data the server should return for a fetch request. If insufficient data is available, the server will wait for this minimum amount of data to accumulate before answering the request.

This minimum applies to the total amount of data across everything that the consumer has subscribed to.

This parameter works in conjunction with the timeout interval that is specified in the poll function. If the minimum number of bytes is not reached by the time that the interval expires, the poll returns with nothing.

For example, suppose the value is set to 6 bytes and the timeout on a poll is set to 100ms. If there are 5 bytes available and no further bytes come in before the 100ms expire, the poll returns with nothing.

Default: 1 byte

The maximum amount of data the server should return for a fetch request. If the first record batch in the first non-empty partition of the fetch is larger than this configuration, the record batch is still returned to ensure that the consumer can make progress.
This parameter is new as of MapR 6.0.1.
The maximum amount of time the HPE Ezmeral Data Fabric Streams server will block before answering the fetch request if there isn't sufficient data to satisfy the requirement given by fetch.min.bytes.
A string up to 2457 bytes long that uniquely identifies the group of consumer processes to which this consumer belongs. By setting the same group ID, multiple consumer processes indicate that they are all part of the same consumer group. Putting consumers into groups provides benefits that are described in Consumer Groups.

It is possible for a single consumer to be in a group.

max.poll.records Places an upper bound on the number of records returned from each call to poll.
This parameter is new as of MapR 6.0.1.
max.partition.fetch.bytes The number of bytes of message data to attempt to fetch for each partition in each poll request. These bytes will be read into memory for each partition, so this parameter helps control the memory that the consumer uses. Default: 64KB

The size of the poll request must be at least as large as the maximum message size that the server allows or else it is possible for producers to send messages that are larger than the consumer can fetch.

If the first record batch in the first non-empty partition of the fetch is larger than this configuration, the record batch is still returned to ensure that the consumer can make progress.
This is a behavior change as of MapR 6.0.1.
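The fetch.min.bytes example and the per-partition memory note in Table 3 can be worked through with a little arithmetic. The following is a rough sketch, not client code: the class and method names are hypothetical, 64KB is assumed to mean 64 * 1024 bytes, and the memory figure is only an estimate because an oversized first record batch is still returned and can exceed it.

```java
public class FetchSettingsSketch {

    /**
     * Models the fetch.min.bytes rule: when the poll timeout expires, the poll
     * returns data only if at least fetchMinBytes bytes are available.
     */
    public static boolean pollReturnsData(long bytesAvailableAtTimeout, long fetchMinBytes) {
        return bytesAvailableAtTimeout >= fetchMinBytes;
    }

    /**
     * Rough estimate of message bytes read into memory for one poll request:
     * max.partition.fetch.bytes for each fetched partition.
     */
    public static long estimatedFetchMemoryBytes(int partitions, long maxPartitionFetchBytes) {
        return Math.multiplyExact((long) partitions, maxPartitionFetchBytes);
    }

    public static void main(String[] args) {
        // Example from the text: fetch.min.bytes = 6, only 5 bytes arrive within 100 ms.
        System.out.println(pollReturnsData(5, 6)); // prints false
        System.out.println(pollReturnsData(6, 6)); // prints true

        // 10 partitions at the 64KB default (assumed to be 64 * 1024 bytes).
        System.out.println(estimatedFetchMemoryBytes(10, 64 * 1024)); // prints 655360
    }
}
```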
Table 4. Producer configuration parameters specific to HPE Ezmeral Data Fabric Streams
Parameter Description
Messages are buffered in the producer for at most the specified time. A thread flushes all messages that have been buffered for longer than the specified time. Default: 3 * 1000 msec
streams.parallel.flushers.per.partition If enabled, the producer may have multiple parallel send requests to the server for each topic partition. If this setting is set to true, it is possible for messages to be sent out of order. Default: true
Specifies the stream that the producer will use by default if the producer does not provide the name of a stream when specifying a topic to write to.
/mapr/<cluster name>/<volume name>/<stream name>
fs.mapr.hardmount Specifies whether to use a hard mount or a soft mount for connections to the HPE Ezmeral Data Fabric Streams server.

Default: true (use a hard mount).

If a value for this parameter is set in the core-site.xml file, the value in that file is ignored.

fs.mapr.rpc.timeout Specifies the length of time in seconds to wait for a response from the HPE Ezmeral Data Fabric Streams server if the configuration parameter fs.mapr.hardmount is set to false. Default: 300. Minimum value: 30.
Applicable to MapR 6.0.0 and earlier. As of MapR 6.0.1, use

If a soft mount is used and the timeout expires while a producer waits for a response from the HPE Ezmeral Data Fabric Streams server, and the producer used the KafkaProducer.send(ProducerRecord<K,V> record, Callback callback) method, the callback is invoked with the error EAGAIN, which means "Resource temporarily unavailable."


Specifies the length of time in milliseconds to wait for a response from the HPE Ezmeral Data Fabric Streams server if soft mount is configured (fs.mapr.hardmount is set to false). Default: 30000 Minimum: 30000

For producer and consumer applications, set this value to greater than 50000 in both producers and consumers to avoid Message Fetch RPC overload.

use.brokers A boolean flag that specifies whether the Apache Kafka clients (Producer, Consumer, Admin) connect to Apache Kafka brokers or to HPE Ezmeral Data Fabric Streams services. Default: false

To connect the Apache Kafka clients to HPE Ezmeral Data Fabric Streams services, set this flag to false.

In this case, the bootstrap.servers property is optional and is ignored.

To connect the Apache Kafka clients to Apache Kafka brokers, set this flag to true.

In this case, the bootstrap.servers property is required and must be set in the client configuration and console scripts.
Table 5. Producer configuration parameters supported from Apache Kafka
Parameter Description
buffer.memory The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are generated faster than they can be delivered to the server, the producer will block. Default: 33554432
Producers can tag records with a client ID that identifies the producer. Consumers can then be aware of which producer sent a message or set of messages. Apache Drill or other analytic tools querying messages can include this ID in the filters for their queries. Default: No client ID.
The producer generally refreshes the topic metadata from the server when there is a failure. It will also poll for this data regularly. Default: 300 * 1000 msec
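As an illustration, the producer defaults in Table 5 could be written out as client properties. This sketch uses only java.util.Properties. The property names client.id and metadata.max.age.ms are the standard Apache Kafka producer names and are assumed here to correspond to the client ID and metadata refresh entries above; the class name, method names, and the example client ID are hypothetical.

```java
import java.util.Properties;

public class ProducerPropsSketch {

    /** Builds producer properties using the default values listed in Table 5. */
    public static Properties withDefaults(String clientId) {
        Properties props = new Properties();
        // Total bytes the producer can use to buffer records waiting to be sent.
        props.setProperty("buffer.memory", "33554432");
        // Assumed standard Kafka name for the regular metadata refresh interval.
        props.setProperty("metadata.max.age.ms", String.valueOf(300 * 1000));
        // By default there is no client ID, so it is set only when one is given.
        if (clientId != null) {
            props.setProperty("client.id", clientId);
        }
        return props;
    }

    /** Converts the configured buffer.memory value from bytes to MiB. */
    public static long bufferMemoryMiB(Properties props) {
        return Long.parseLong(props.getProperty("buffer.memory")) / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Properties props = withDefaults("orders-producer"); // hypothetical client ID
        System.out.println(bufferMemoryMiB(props)); // prints 32
    }
}
```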