Node Types
Depending on the size of your cluster, nodes may or may not perform specialized work.
In a production data-fabric cluster, some nodes are typically dedicated to cluster coordination and management, and other nodes are tasked with data storage and processing duties. An edge node provides user access to the cluster, concentrating open user privileges on a single host. In smaller clusters, the work is not so specialized, and a single node may perform data processing as well as management.
Nodes Running ZooKeeper and CLDB
High latency on a ZooKeeper node can increase the incidence of ZooKeeper quorum failures. A ZooKeeper quorum failure occurs when the cluster finds too few copies of the ZooKeeper service running. If the ZooKeeper node also runs other services, competition for computing resources can increase latency on that node. If your cluster experiences ZooKeeper quorum failures, consider reducing the number of other services running on the ZooKeeper node, or eliminating them entirely.
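To make "too few copies" concrete: ZooKeeper generally needs a strict majority of its configured servers to be running. The sketch below assumes that majority rule applies to your cluster. The function names are illustrative and are not part of any data-fabric tool.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of running ZooKeeper servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, running: int) -> bool:
    """True if the running servers are enough to form a quorum."""
    return running >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can be down while a quorum still exists."""
    return ensemble_size - quorum_size(ensemble_size)
```

Under this rule, an ensemble of three servers keeps its quorum with one server down, and an ensemble of five keeps it with two down. An ensemble of four tolerates only one failure, the same as three, which is why an odd number of ZooKeeper nodes is the usual choice.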
Nodes for Data Storage and Processing
Most nodes in a production cluster are data nodes. FileServer and NodeManager run on data nodes. Data nodes can be added to or removed from the cluster as requirements change over time.
Edge Nodes
Edge nodes provide a common user access point for the data-fabric webserver and other client tools. An edge node may or may not be part of the cluster, as long as it can reach the cluster nodes. Nodes on the same network can run client services and other services, but edge nodes and client nodes cannot host data-fabric monitoring components.
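One rough way to confirm that an edge node can reach cluster nodes is a TCP connection test, sketched below with Python's standard library. The helper name can_reach is illustrative. The host names and port numbers in the usage comment are placeholders, not values taken from this guide, so substitute the addresses and service ports configured for your cluster.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (hypothetical hosts and ports; replace with your own):
# for host, port in [("node1.example.com", 7222), ("node2.example.com", 5181)]:
#     print(host, port, "reachable" if can_reach(host, port) else "unreachable")
```

A successful connection shows only that the port is open from the edge node. It does not verify that the service behind the port is healthy.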