User-Configurable Data Fabric Cluster Parameters
This article describes two methods for configuring Data Fabric cluster parameters:
Method 1: Template CR
This section refers to the Data Fabric Custom Resource (CR) template that HPE Ezmeral Runtime Enterprise reads when generating the CR for creating the Data Fabric cluster. Modifications to this CR template are effective if made before creating the Data Fabric cluster. Kubernetes Administrator users can access this template at:
/opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg
This file is a partial CR specification where some fields have been templatized for use by HPE Ezmeral Runtime Enterprise. Advanced users may modify the non-templatized fields. You cannot change CLDB and MFS pod specifications here. Hewlett Packard Enterprise recommends limiting modifications to either enabling/disabling services or changing service resource allocations. You may want to save a copy of the original /opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg file before making the modifications.
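For example, a backup copy could be saved with a command similar to the following (the .orig destination filename is arbitrary):
cp /opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg \
   /opt/bluedata/common-install/bd_mgmt/picasso_dataplatform_cr.cfg.orig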
For example, set the following values to avoid bringing up pods related to monitormetrics services when a Data Fabric cluster is created:
spec:monitoring:monitormetrics=false
spec:monitoring:opentsdb:count=0
spec:monitoring:grafana:count=0
spec:monitoring:elasticsearch:count=0
spec:monitoring:kibana:count=0
After successful cluster creation, you can download the CR that was applied in the Kubernetes cluster using the HPE Ezmeral Runtime Enterprise web interface, and can then either patch it or modify and reapply it, as described in Upgrading and Patching Data Fabric Clusters on Kubernetes. Note that leaving monitormetrics=true in the CR template and subsequently changing it to false in the downloaded cluster CR might not stop the metrics pods.
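For illustration only, reapplying a downloaded and edited CR might look like the following, assuming the CR was saved locally as datafabric-cluster-cr.yaml (a hypothetical filename) and that kubectl is configured for the target Kubernetes cluster; see Upgrading and Patching Data Fabric Clusters on Kubernetes for the supported procedure:
kubectl apply -f datafabric-cluster-cr.yaml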
All of the following cautions apply when modifying a template CR:
- Only advanced users should modify the default values for keys related to HPE Ezmeral Data Fabric services in the template CR.
- The CR template is in YAML format. Preserve all indentations, spaces, and other punctuation.
- Disabling essential items (for example, admincli) may cause the cluster to malfunction.
- When decreasing resource allocations for a service pod, be sure to keep the resource allocation above the minimum required for that pod to function.
Method 2: Using bd_mgmt_config
Kubernetes Administrator users can modify configuration key values for a Data Fabric cluster in order to fine-tune that cluster.
Key modification can cause performance loss and/or render the cluster inoperable. Do not modify the default key values unless you are familiar with the keys and how changing their values can affect the Data Fabric cluster.
When modifying a configuration key, change only the value. Always preserve the key name and the value format (for example, tuple, integer, or string).
Environment Setup
To modify a key, you must first execute the following commands on the Controller host to set up the environment:
ERTS_PATH=/opt/bluedata/common-install/bd_mgmt/erts-*/bin
NODETOOL=/opt/bluedata/common-install/bd_mgmt/bin/nodetool
NAME_ARG=`egrep '^-s?name' $ERTS_PATH/../../releases/1/vm.args`
RPCCMD="$ERTS_PATH/escript $NODETOOL $NAME_ARG rpcterms"
Key Value Lookup
To look up the value of a configuration key, execute the following command:
$RPCCMD bd_mgmt_config lookup "<configuration_key_name>."
For example, the command:
$RPCCMD bd_mgmt_config lookup "datafabric_cldb_cpu_req_limit_percents."
returns something similar to:
{35,75}
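The same pattern applies to any other key. For example, looking up the wakeup-timeout key described later in this article:
$RPCCMD bd_mgmt_config lookup "datafabric_cldb_wakeup_timeout."
returns the currently configured integer value, for example:
1500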
Modifying a Key Value
To change the value of a configuration key, execute the following command:
$RPCCMD bd_mgmt_config update "<configuration_key_name>. <value>."
For example, the command:
$RPCCMD bd_mgmt_config update "datafabric_cldb_cpu_req_limit_percents. {50,70}."
returns the following if successful:
ok
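To verify that the update took effect, look up the key again. Continuing the example above, the command:
$RPCCMD bd_mgmt_config lookup "datafabric_cldb_cpu_req_limit_percents."
should now return the updated value:
{50,70}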
Available Keys
The following configuration keys are available:
- {datafabric_cldb_wakeup_timeout, 1500}. This integer value specifies how long, in seconds, the HPE Ezmeral Runtime Enterprise bootstrap add-on for HPE Ezmeral Data Fabric Kubernetes Edition must wait after Data Fabric CR creation/application for the cluster pods to come up. Periodic status checks occur during this time period. Cluster creation fails if the cluster does not come up during this period.
- {datafabric_cldb_cpu_req_limit_percents, {35, 75}}. This tuple value influences the requestcpu and limitcpu for an intended CLDB pod specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestcpu, limitcpu} values as percentages of the number of logical CPU cores in a CLDB node's system info. The new or updated Data Fabric CR will specify X% of the node's logical CPU cores as the requestcpu for a CLDB pod, and Y% as the limitcpu for a CLDB pod. (A worked example appears after this list.)
- {datafabric_cldb_mem_req_limit_percents, {60, 75}}. This tuple value influences the requestmemory and limitmemory for an intended CLDB pod specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestmemory, limitmemory} values as percentages of the total available memory in a CLDB node's system info. The new or updated Data Fabric CR will specify X% of the node's total available memory as the requestmemory for a CLDB pod, and Y% as the limitmemory for a CLDB pod.
- {datafabric_mfs_cpu_req_limit_percents, {40, 70}}. This tuple value influences the requestcpu and limitcpu for an intended MFS Group specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestcpu, limitcpu} values as percentages of the number of logical CPU cores in an MFS node's system info. The new or updated Data Fabric CR will specify X% of the node's logical CPU cores as the requestcpu for each MFS Group, and Y% as the limitcpu for each MFS Group.
- {datafabric_mfs_mem_req_limit_percents, {60, 75}}. This tuple value influences the requestmemory and limitmemory for an intended MFS Group specified in the Data Fabric CR. The {X, Y} tuple denotes the {requestmemory, limitmemory} values as percentages of the total available memory in an MFS node's system info. The new or updated Data Fabric CR will specify X% of the node's total available memory as the requestmemory for each MFS Group, and Y% as the limitmemory for each MFS Group.
- {datafabric_hilowperf_disktype_capacity_ratio, {2, 3}}. This configuration key is only relevant when nodes that can be used to schedule a Data Fabric cluster CLDB or MFS pod have multiple disk types (for example, hard disk, SSD, or NVMe) among the node's persistent disks. Normally, HPE Ezmeral Data Fabric on Kubernetes only allows a node to be represented by one disk type when it is considered for scheduling a CLDB or MFS pod.
  This tuple value denotes a capacity ratio, x/y, which guides ECP policy in how the disktype and diskcount are specified in the diskinfo section of the specification for a CLDB pod-set or an MFS group. The Data Fabric CR will specify a higher-performing disk type to represent a node if that disk type is present in relatively sizable capacity.
  If the capacity of the higher-performing disk type is x/y or more of the capacity of a lower-performing disk type (both disk types must be present among the node's persistent disks), then the node will be counted as having the higher disktype. The diskcount will equal the actual number of persistent disks of the higher-performing disk type that are present on the node. Thus, setting a low value for x/y (such as 1/100) can help force a preference for the higher-performing disk type.
  If the capacity of the higher-performing disk type is less than x/y of that of the lower-performing disk type, then the lower disktype will represent that node. If m disks of the higher type and n disks of the lower type are present in the node, the diskcount for the node will equal m+n, by convention.
  Adjusting this value allows a user to force a higher-performing or a lower-performing disk type to be used to represent nodes used for CLDBs or MFSs.
  - Example 1: If {x, y} is {1, 2} and a node's persistent disks include p NVMe disks totaling 500 GB, q SSDs totaling 5 TB, and r HDDs totaling 20 TB, the node will be counted as having a disktype of HDD with a diskcount of p+q+r, the sum of the counts of the disk types.
  - Example 2: If {x, y} is {1, 2} and a node's persistent disks include p NVMe disks totaling 500 GB, q SSDs totaling 800 GB, and r HDDs totaling 1.2 TB, the node will be counted as having a disktype of NVMe with a diskcount of p, the actual number of NVMe disks present.
  - Example 3: If {x, y} is {1, 2} and a node's persistent disks include p NVMe disks totaling 200 GB, q SSDs totaling 800 GB, and r HDDs totaling 1.2 TB, the node will be counted as having a disktype of SSD with a diskcount of p+q, the sum of the counts of the NVMe disks and SSDs present. In this example, changing {x, y} to {1, 5} would count the node as having a disktype of NVMe with a diskcount of p. Changing {x, y} to {1, 1} would count the node as having a disktype of HDD with a diskcount of p+q+r.
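As a worked illustration of how the CPU percentage tuples above translate into CR values, consider a hypothetical CLDB node reporting 16 logical cores with the default {35, 75} setting (the node size here is an assumption for illustration only). The arithmetic can be checked from a shell:
CORES=16   # hypothetical number of logical CPU cores reported by the CLDB node
awk -v c="$CORES" 'BEGIN { printf "requestcpu: %.1f cores\nlimitcpu:   %.1f cores\n", c*0.35, c*0.75 }'
This prints requestcpu: 5.6 cores and limitcpu: 12.0 cores, that is, 35% and 75% of the node's 16 logical cores.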