
Master Not Discovered

Severity: Critical
Elasticsearch Version: 8.5.0

Problem

The cluster cannot elect a master node. Until a master is elected, cluster-wide operations such as index creation, shard allocation, and settings updates are blocked, and API calls that need the master fail.

Root Cause

A network partition or discovery misconfiguration prevents master-eligible nodes from finding each other or voting, or too few master-eligible nodes are available to form a majority (quorum) of the voting configuration.
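
The quorum rule can be sketched as simple arithmetic. This is an illustration only, assuming the argument is the size of the voting configuration (which Elasticsearch manages itself); the function names are hypothetical and not part of any Elasticsearch tooling.

```shell
#!/usr/bin/env bash
# Illustrative helpers; names are hypothetical.

# votes_needed N: smallest majority of N voting nodes (more than half).
votes_needed() {
  local voting_nodes=$1
  echo $(( voting_nodes / 2 + 1 ))
}

# can_elect VOTING REACHABLE: succeeds when the reachable voting nodes
# still form a majority, so an election can complete.
can_elect() {
  local voting_nodes=$1 reachable=$2
  [ "$reachable" -ge "$(votes_needed "$voting_nodes")" ]
}
```

For example, `can_elect 3 2` succeeds while `can_elect 3 1` fails: a three-node voting configuration tolerates the loss of one node but not two.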

How to Detect

Symptoms

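  • Cluster health requests time out or return HTTP 503 with 'master_not_discovered_exception' instead of a green, yellow, or red status
  • Requests are rejected with a 'no master' cluster block, the node-local cluster state shows 'master_node' as null, or nodes appear disconnected
  • Logs contain warnings such as 'master not discovered yet' or 'master not discovered or elected yet'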

Commands

curl -X GET 'localhost:9200/_cluster/health?pretty'
curl -X GET 'localhost:9200/_cat/nodes?v'
curl -X GET 'localhost:9200/_cluster/state/master_node?local=true&pretty'
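
A minimal sketch for turning the cluster state check into a pass/fail test. It assumes the response is the JSON body of the cluster state API filtered to 'master_node', and that the node is reachable at localhost:9200 without TLS or authentication (8.x enables security by default, so a real cluster will likely need https and credentials). The 'local=true' parameter asks the node for its own view instead of routing to the master, which matters when no master exists. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# has_master: reads a cluster-state JSON body on stdin and succeeds only
# when "master_node" holds a non-empty string (i.e. it is not null).
has_master() {
  grep -Eq '"master_node"[[:space:]]*:[[:space:]]*"[^"]+"'
}

# Example against a live node (endpoint and lack of auth are assumptions):
#   curl -s 'localhost:9200/_cluster/state/master_node?local=true' | has_master \
#     && echo "master elected" \
#     || echo "no master visible from this node"
```

The pattern match is deliberately crude; a JSON-aware tool would be more robust if one is available on the host.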

Remediation Steps

  1. Verify network connectivity between nodes and ensure every node can reach the others on the transport port (9300 by default)
  2. Check Elasticsearch logs for errors related to voting or master election
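  3. Ensure quorum is maintained: more than half of the master-eligible nodes in the voting configuration must be running and reachable. Elasticsearch 7.0 and later manage this quorum automatically, and the 'discovery.zen.minimum_master_nodes' setting no longer exists in 8.x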
  4. Restart nodes in a controlled manner to allow re-election
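  5. Review 'discovery.seed_hosts' and 'cluster.name' on every node so each one can find the master-eligible nodes; set 'cluster.initial_master_nodes' only when bootstrapping a brand-new cluster, and remove any leftover 'discovery.zen.*' settings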
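  6. After a master is elected, remove permanently failed or retired master-eligible nodes from the cluster, using the voting configuration exclusions API (POST /_cluster/voting_config_exclusions) when removing half or more of them at once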
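
When reviewing node configuration on 8.x, it can help to scan for legacy 'discovery.zen.*' keys, since those settings were removed in 8.0 and quorum is now managed by Elasticsearch itself. The sketch below is an assumption-laden helper, not an official tool: the function name is hypothetical, the path to elasticsearch.yml varies by install, and it only catches flat dotted keys (not nested YAML).

```shell
#!/usr/bin/env bash
# find_legacy_zen_settings FILE: print "line:content" for every active
# (non-commented) discovery.zen.* key; exits non-zero when none are found.
find_legacy_zen_settings() {
  local config_file=$1
  grep -nE '^[[:space:]]*discovery\.zen\.' "$config_file"
}

# Example (the path is an assumption; adjust to the actual install):
#   find_legacy_zen_settings /etc/elasticsearch/elasticsearch.yml \
#     && echo "remove the settings listed above before starting the node"
```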

Prevention

  • Run at least three master-eligible nodes so the cluster keeps a voting majority when one fails; in 8.x quorum is managed automatically and 'discovery.zen.minimum_master_nodes' must not be set
  • Implement network redundancy and monitoring to prevent partitions
  • Regularly review cluster logs for early signs of election issues
  • Maintain consistent configuration across nodes
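
Configuration drift can be spot-checked by comparing a setting across copies of each node's elasticsearch.yml. The sketch below assumes the files have been collected locally and use flat 'key: value' lines; both function names and the example file names are hypothetical. A mismatched 'cluster.name', for instance, keeps a node from joining the others.

```shell
#!/usr/bin/env bash
# setting_value KEY FILE: print the value of the first flat "key: value"
# line for KEY, or nothing when the key is absent.
setting_value() {
  local key=$1 file=$2
  # Escape dots so they match literally in the regular expression.
  local pattern="^[[:space:]]*${key//./\\.}[[:space:]]*:"
  grep -E "$pattern" "$file" | head -n 1 \
    | sed -E 's/^[^:]*:[[:space:]]*//; s/[[:space:]]+$//'
}

# settings_match KEY FILE...: succeed when every file has the same value
# for KEY as the first file; report the first mismatch otherwise.
settings_match() {
  local key=$1 first current file
  shift
  first=$(setting_value "$key" "$1")
  for file in "$@"; do
    current=$(setting_value "$key" "$file")
    [ "$current" = "$first" ] || {
      echo "mismatch in $file: '$current' != '$first'"
      return 1
    }
  done
}

# Example (file names are placeholders):
#   settings_match cluster.name node1.yml node2.yml node3.yml
```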

Production Example

curl -X GET 'localhost:9200/_cluster/health?pretty'