User-Configurable Data Fabric Cluster Parameters
This article describes two methods for configuring Data Fabric cluster parameters:
Method 1: Template CR
This section refers to the Data Fabric Custom Resource (CR) template that HPE Ezmeral Runtime Enterprise reads when generating the CR used to create the Data Fabric cluster. Modifications to this CR template take effect only if they are made before the Data Fabric cluster is created. Kubernetes Administrator users can access this template at:
/opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg
This file is a partial CR specification in which some fields have been templatized for use by HPE Ezmeral Runtime Enterprise. Advanced users may modify the non-templatized fields. You cannot change CLDB and MFS pod specifications here. Hewlett Packard Enterprise recommends limiting modifications to either enabling/disabling services or changing service resource allocations. You may want to save a copy of the original
/opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg
file before making modifications.
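For example, a minimal way to save a backup copy on the host where the template resides (the .orig suffix is only an illustrative choice):
# Back up the original CR template before editing it
cp /opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg /opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg.orig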
For example, set the following values to avoid bringing up pods related to monitormetrics services when a Data Fabric cluster is created:
spec:monitoring:monitormetrics=false
spec:monitoring:opentsdb:count=0
spec:monitoring:grafana:count=0
spec:monitoring:elasticsearch:count=0
spec:monitoring:kibana:count=0
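As a quick, optional check before editing, you can confirm that the monitoring-related fields are present in the template. The exact spelling and nesting of the keys in the file is an assumption here; adjust the pattern to match what you see in the template:
# List lines mentioning the monitoring services (pattern is illustrative)
grep -nE "monitormetrics|opentsdb|grafana|elasticsearch|kibana" /opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg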
Note that leaving monitormetrics=true in the CR template and subsequently changing it to false in the downloaded cluster CR might not stop the metrics pods. After successful cluster creation, you can download the CR that was applied in the Kubernetes cluster using the HPE Ezmeral Runtime Enterprise web interface, and then either patch it or modify and reapply it, as described in Upgrading and Patching Data Fabric Clusters on Kubernetes.
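The supported procedure is described in the linked topic; purely as a hedged illustration, reapplying a modified copy of the downloaded CR with kubectl might look like the following, where the file name and namespace are placeholders:
# Reapply a modified Data Fabric CR (placeholders, not a documented command)
kubectl apply -f <downloaded_cr>.yaml -n <datafabric_namespace>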
All of the following cautions apply when modifying a template CR:
- Only advanced users should modify the default values for keys related to HPE Ezmeral Data Fabric services in the template CR.
- The CR template is in YAML format. Preserve all indentations, spaces, and other punctuation.
- Disabling essential items (for example, admincli) may cause the cluster to malfunction.
- When decreasing resource allocations for a service pod, be sure to keep the resource allocation above the minimum required for that pod to function.
Method 2: Using bd_mgmt_config
Kubernetes Administrator users can modify configuration key values for a Data Fabric cluster in order to fine-tune that cluster.
Key modification can cause performance loss and/or render the cluster inoperable. Do not modify the default key values unless you are familiar with the keys and how changing their values can affect the Data Fabric cluster.
When modifying a configuration key, change only the value. Always preserve the key name and the value format (for example, tuple, integer, or string).
Environment Setup
To modify a key, you must first execute the following commands on the Controller host to set up the environment:
ERTS_PATH=/opt/bluedata/common-install/bd_mgmt/erts-*/bin
NODETOOL=/opt/bluedata/common-install/bd_mgmt/bin/nodetool
NAME_ARG=`egrep '^-s?name' $ERTS_PATH/../../releases/1/vm.args`
RPCCMD="$ERTS_PATH/escript $NODETOOL $NAME_ARG rpcterms"
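As a quick, optional check that the variables resolved correctly (not part of the documented procedure), you can print the assembled command in the same shell session:
echo $RPCCMD
The output should show the escript and nodetool paths, the -name or -sname argument read from vm.args, and the rpcterms subcommand.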
Key Value Lookup
To look up the value of a configuration key, execute the following command:
$RPCCMD bd_mgmt_config lookup "<configuration_key_name>."
For example, the command:
$RPCCMD bd_mgmt_config lookup "datafabric_cldb_cpu_req_limit_percents."
returns something similar to:
{35,75}
Modifying a Key Value
To change the value of a configuration key, execute the following command:
$RPCCMD bd_mgmt_config update "<configuration_key_name>. <value>."
For example, the command:
$RPCCMD bd_mgmt_config update "datafabric_cldb_cpu_req_limit_percents. {50,70}."
returns the following if successful:
ok
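To confirm that the change took effect, you can look the key up again; for the example above, the lookup should now return the updated tuple:
$RPCCMD bd_mgmt_config lookup "datafabric_cldb_cpu_req_limit_percents."
{50,70}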
Available Keys
The following configuration keys are available (a combined update example follows the list):
- {datafabric_cldb_wakeup_timeout, 1500}
This integer value specifies, in seconds, how long the HPE Ezmeral Runtime Enterprise bootstrap add-on for HPE Ezmeral Data Fabric Kubernetes Edition waits for the cluster pods to come up after the Data Fabric CR is created and applied. Periodic status checks occur during this period. Cluster creation fails if the cluster does not come up within this period.
- {datafabric_cldb_cpu_req_limit_percents, {35, 75}}
This tuple value influences the requestcpu and limitcpu for an intended CLDB pod specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestcpu, limitcpu} values as percentages of the number of logical CPU cores in the system info of a CLDB node. The new or updated Data Fabric CR will specify X% of the node's logical CPU cores as the requestcpu for a CLDB pod and Y% as the limitcpu for a CLDB pod.
- {datafabric_cldb_mem_req_limit_percents, {60, 75}}
This tuple value influences the requestmemory and limitmemory for an intended CLDB pod specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestmemory, limitmemory} values as percentages of the total available memory in a CLDB node's system info. The new or updated Data Fabric CR will specify X% of the node's total available memory as the requestmemory for a CLDB pod and Y% as the limitmemory for a CLDB pod.
- {datafabric_mfs_cpu_req_limit_percents, {40, 70}}
This tuple value influences the requestcpu and limitcpu for an intended MFS Group specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestcpu, limitcpu} values as percentages of the number of logical CPU cores in an MFS node's system info. The new or updated Data Fabric CR will specify X% of the node's logical CPU cores as the requestcpu for each MFS Group and Y% as the limitcpu for each MFS Group.
- {datafabric_mfs_mem_req_limit_percents, {60, 75}}
This tuple value influences the requestmemory and limitmemory for an intended MFS Group specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestmemory, limitmemory} values as percentages of the total available memory in an MFS node's system info. The new or updated Data Fabric CR will specify X% of the node's total available memory as the requestmemory for each MFS Group and Y% as the limitmemory for each MFS Group.
- {datafabric_hilowperf_disktype_capacity_ratio, {2, 3}}
This configuration key is relevant only when nodes that can be used to schedule a Data Fabric cluster CLDB or MFS pod have multiple disk types (for example, hard disk, SSD, or NVMe) among the node's persistent disks. Normally, HPE Ezmeral Data Fabric on Kubernetes allows a node to be represented by only one disk type when it is considered for scheduling a CLDB or MFS pod.
This tuple value denotes a capacity ratio, x/y, that guides ECP policy in how the disktype and diskcount are specified in the diskinfo section of the specification for a CLDB pod-set or an MFS group. The Data Fabric CR will specify a higher-performing disk type to represent a node if that disk type is present in relatively sizable capacity.
If the capacity of the higher-performing disk type is x/y or more of the capacity of a lower-performing disk type (both disk types must be present among the node's persistent disks), then the node will be counted as having the higher disktype. The diskcount will equal the actual number of persistent disks of the higher-performing disk type that are present on the node. Thus, setting a low value for x/y (such as 1/100) can help force a preference for the higher-performing disk type.
If the higher-performing disk type is less than x/y of the lower-performing disk type, then the lower disktype will represent that node. If m disks of the higher type and n disks of the lower type are present in the node, the diskcount for the node will equal m+n, by convention.
Adjusting this value allows a user to force a higher-performing or a lower-performing disk type to be used to represent nodes used for CLDBs or MFSs.
- Example 1: If {x, y} is {1, 2} and a node's persistent disks include p NVMe disks totaling 500 GB, q SSDs totaling 5 TB, and r HDDs totaling 20 TB, the node will be counted as having a disktype of HDD with a diskcount of p+q+r, the sum of the counts of the disk types.
- Example 2: If {x, y} is {1, 2} and a node's persistent disks include p NVMe disks totaling 500 GB, q SSDs totaling 800 GB, and r HDDs totaling 1.2 TB, the node will be counted as having a disktype of NVMe with a diskcount of p, the actual number of NVMe disks present.
- Example 3: If {x, y} is {1, 2} and a node's persistent disks include p NVMe disks totaling 200 GB, q SSDs totaling 800 GB, and r HDDs totaling 1.2 TB, the node will be counted as having a disktype of SSD with a diskcount of p+q, the sum of the counts of the NVMe disks and SSDs present. In this example, changing {x, y} to {1, 5} would count the node as a disktype of NVMe with a diskcount of p. Changing {x, y} to {1, 1} would count the node as a disktype of HDD with a diskcount of p+q+r.
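As a combined illustration of the update syntax shown earlier, the following commands would raise the CLDB wakeup timeout and strongly bias disk-type selection toward higher-performing disks. The values here are examples only, not recommendations; each command returns ok if successful:
# Example only: allow more time for cluster pods to come up (seconds)
$RPCCMD bd_mgmt_config update "datafabric_cldb_wakeup_timeout. 2400."
# Example only: prefer higher-performing disk types by using a low x/y ratio
$RPCCMD bd_mgmt_config update "datafabric_hilowperf_disktype_capacity_ratio. {1,100}."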