JDBC Configuration Options
Use the following parameters to configure the Kafka Connect for HPE Ezmeral Data Fabric Streams JDBC connector; these parameters are modified in the quickstart-sqlite.properties file.
Configuration Modes
In standalone mode, JDBC connector configuration is specified in the quickstart-sqlite.properties file. Additional configurations such as the offset storage location and the port for the REST interface are specified in the connect-standalone.properties file. See Configuring in Standalone Mode.
/opt/mapr/kafka-connect-jdbc/kafka-connect-jdbc-<version>/etc/kafka-connect-jdbc/quickstart-sqlite.properties
/opt/mapr/kafka/kafka-<version>/config/connect-standalone.properties
/opt/mapr/kafka/kafka-<version>/config/connect-distributed.properties
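For reference, a minimal standalone source configuration might look like the following sketch. The connector class, SQLite connection URL, and names shown here are illustrative assumptions rather than required values; adjust them to your environment.

```properties
# Hypothetical minimal JDBC source connector configuration for standalone mode.
# All names and values below are examples only.
name=test-sqlite-jdbc
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1

# Database to copy from (a local SQLite file in this sketch).
connection.url=jdbc:sqlite:test.db

# Detect new rows using a strictly incrementing id column.
mode=incrementing
incrementing.column.name=id

# Topics are named <topic.prefix><table name>.
topic.prefix=test-sqlite-jdbc-
```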
JDBC Source Configuration Options
Parameters | Description |
---|---|
connection.url | JDBC connection URL for the database to load. |
connection.user | JDBC connection user. |
connection.password | JDBC connection password. |
connection.attempts | Maximum number of attempts to retrieve a valid JDBC connection. |
connection.backoff.ms | Backoff time in milliseconds between connection attempts. |
table.whitelist | List of tables to include in copying. If specified, table.blacklist may not be set. |
table.blacklist | List of tables to exclude from copying. If specified, table.whitelist may not be set. |
numeric.precision.mapping | Whether to attempt mapping numeric values by precision to integral types. |
schema.pattern | Schema pattern used to fetch table metadata from the database. |
mode | The mode for updating a table each time it is polled. Options include bulk, timestamp, incrementing, and timestamp+incrementing. NOTE: If you are using Hive JDBC with incrementing or timestamp mode, set the validate.non.null property to false because there are no "not null" columns in Hive. |
timestamp.column.name | The name of the timestamp column to use to detect new or modified rows. This column may not be nullable. |
validate.non.null | By default, the JDBC connector validates that all incrementing and timestamp tables have NOT NULL set for the columns being used as their ID/timestamp. If they do not, the connector fails to start. Setting this to false disables these checks. NOTE: If this parameter is false, specify exactly the columns that need to be imported to HPE Ezmeral Data Fabric Streams in the query parameter. For example, instead of "query" : "select * from table", use "query" : "select col1, col2 from table". |
incrementing.column.name | The name of the strictly incrementing column to use to detect new rows. An empty value indicates that the column should be autodetected by looking for an auto-incrementing column. This column may not be nullable. |
query | If specified, the query to perform to select new or updated rows. Use this setting to join tables, select subsets of columns in a table, or filter data. If used, this connector copies data only with this query; whole-table copying is disabled. Different query modes may still be used for incremental updates, but in order to properly construct the incremental query, it must be possible to append a WHERE clause to this query (that is, no WHERE clauses may be used). If you use a WHERE clause, it must handle incremental queries itself. |
poll.interval.ms | Frequency in milliseconds to poll for new data in each table. |
batch.max.rows | Maximum number of rows to include in a single batch when polling for new data. This setting can be used to limit the amount of data buffered internally in the connector. |
table.poll.interval.ms | Frequency in milliseconds to poll for new or removed tables, which may result in updated task configurations that start polling for data in added tables or stop polling for data in removed tables. |
topic.prefix | Prefix to prepend to table names to generate the name of the Kafka topic to publish data to, or, in the case of a custom query, the full name of the topic to publish to. |
table.types | By default, the JDBC connector detects only tables with type TABLE from the source database. This configuration accepts a comma-separated list of table types to extract. Typically, TABLE or VIEW is used. |
timestamp.delay.interval.ms | How long to wait after a row with a certain timestamp appears before it is included in the result. You may choose to add some delay to allow transactions with an earlier timestamp to complete. The first execution fetches all available records (for example, starting at timestamp 0) until the current time minus the delay. Every subsequent execution retrieves data from the last time data was fetched until the current time minus the delay. |
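As an illustration of how several of these source options interact, the following hedged sketch combines a custom query with timestamp+incrementing mode. The table, column, and topic names are placeholders, not values taken from the quickstart file.

```properties
# Hypothetical source settings combining a custom query with incremental polling.
connection.url=jdbc:sqlite:test.db
mode=timestamp+incrementing
timestamp.column.name=modified
incrementing.column.name=id

# Because validate.non.null is false, list the required columns explicitly
# in the query instead of using "select *".
validate.non.null=false
query=select id, modified, col1, col2 from example_table

# With a custom query, topic.prefix is the full name of the destination topic.
topic.prefix=example-topic
poll.interval.ms=5000
```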
JDBC Sink Configuration Options
Parameters | Description |
---|---|
connection.url | JDBC connection URL. |
connection.user | JDBC connection user. |
connection.password | JDBC connection password. |
insert.mode | The insertion mode to use. |
batch.size | Specifies how many records to attempt to batch together for insertion into the destination table, when possible. |
table.name.format | A format string for the destination table name, which may contain ${topic} as a placeholder for the originating topic name. For example, table_${topic} for the topic orders maps to the table name table_orders. |
pk.mode | The primary key mode; see also the pk.fields documentation for how the two settings interact. |
pk.fields | List of comma-separated primary key field names. The runtime interpretation of this configuration depends on pk.mode. |
fields.whitelist | List of comma-separated record value field names. If empty, all fields from the record value are used; otherwise, this list filters to the desired fields. Note that pk.fields is applied independently in the context of which field(s) form the primary key columns in the destination database, while this configuration applies to the other columns. |
auto.create | Whether to automatically create the destination table based on the record schema, if it is found to be missing, by issuing CREATE. |
auto.evolve | Whether to automatically add columns to the table schema when they are found to be missing relative to the record schema, by issuing ALTER. |
max.retries | The maximum number of times to retry on errors before failing the task. |
retry.backoff.ms | The time in milliseconds to wait following an error before a retry attempt is made. |
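To show how the sink options fit together, here is a comparable sketch for a sink connector. The connector class, topic, and table naming are assumptions for illustration only.

```properties
# Hypothetical JDBC sink connector configuration.
name=test-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=orders

connection.url=jdbc:sqlite:test.db

# Take primary key columns from fields of the record value.
pk.mode=record_value
pk.fields=id

# The topic orders maps to the destination table table_orders.
table.name.format=table_${topic}

# Create or extend the destination table to match the record schema if needed.
auto.create=true
auto.evolve=true
```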