Cluster Red Unassigned Shards
Severity: Critical
Elasticsearch Version: 8.5.0
Problem
Cluster health is RED because one or more primary shards are unassigned after a node failure
Root Cause
The failed node held primary shards that had no in-sync replica copy on another node, so Elasticsearch could not promote a replica and left those primaries unassigned
How to Detect
Symptoms
- Cluster health status is RED
- Primary shards (prirep = p) shown with state UNASSIGNED in _cat/shards
- Node failure logs indicating shard allocation issues
Commands
curl -X GET "localhost:9200/_cluster/health"
curl -X GET "localhost:9200/_cat/shards?v"
curl -X GET "localhost:9200/_cluster/allocation/explain"
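The commands above return every shard and explain an arbitrary unassigned one. A minimal sketch for narrowing the output to unassigned primaries and explaining a specific shard follows. The ES_URL variable and the function names are assumptions made for illustration, and any authentication or TLS options your cluster needs are left out.

```shell
# ES_URL is an assumed variable; adjust host, port, scheme and credentials.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print "index shard reason" for every unassigned primary.
# Expects _cat/shards rows on stdin with the columns:
# index shard prirep state unassigned.reason
filter_unassigned_primaries() {
  awk '$3 == "p" && $4 == "UNASSIGNED" { print $1, $2, $5 }'
}

# Fetch only the needed columns (no header row) and filter them.
list_unassigned_primaries() {
  curl -s -X GET "$ES_URL/_cat/shards?h=index,shard,prirep,state,unassigned.reason" \
    | filter_unassigned_primaries
}

# Build the request body that targets one primary shard.
explain_body() {
  printf '{"index": "%s", "shard": %d, "primary": true}' "$1" "$2"
}

# Ask why a specific primary (index name, shard number) is unassigned.
explain_primary() {
  curl -s -X GET "$ES_URL/_cluster/allocation/explain" \
    -H 'Content-Type: application/json' -d "$(explain_body "$1" "$2")"
}
```

For example, run list_unassigned_primaries first, then explain_primary my_index 0 for each shard it reports.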
Remediation Steps
- Identify unassigned primary shards using _cat/shards
- Check shard allocation explanations with _cluster/allocation/explain
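- If the explanation shows that a copy of the shard still exists on the failed node, bring that node back so the primary can recover from it
- Retry allocations that exhausted their attempts with POST /_cluster/reroute?retry_failed=true
- As a last resort, when the node cannot be recovered and no snapshot is available, allocate the primary manually with an allocate_stale_primary or allocate_empty_primary reroute command; both require "accept_data_loss": true and can lose data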
- Verify cluster health status after reroute
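The retry and verification steps can be sketched as below. This is an illustration under assumptions, not a definitive procedure: ES_URL and the function names are hypothetical, and the status is parsed with sed only to avoid depending on a JSON tool.

```shell
# ES_URL is an assumed variable; adjust host, port, scheme and credentials.
ES_URL="${ES_URL:-http://localhost:9200}"

# Ask the cluster to retry shards whose allocation attempts were exhausted.
retry_failed_allocations() {
  curl -s -X POST "$ES_URL/_cluster/reroute?retry_failed=true"
}

# Extract the "status" value from a _cluster/health JSON document on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Succeed (exit 0) once the cluster reports yellow or green.
cluster_not_red() {
  status="$(curl -s -X GET "$ES_URL/_cluster/health" | health_status)"
  [ "$status" = "green" ] || [ "$status" = "yellow" ]
}
```

A typical sequence would be retry_failed_allocations followed by cluster_not_red. The health API also accepts wait_for_status=yellow together with a timeout parameter if blocking until recovery is preferred over polling.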
Prevention
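- Keep at least one replica per index (index.number_of_replicas of 1 or more) and enable shard allocation awareness so that copies of a shard land in different zones or racks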
- Ensure sufficient node capacity and disk space
- Configure shard allocation filtering to prevent overloading nodes
- Regularly monitor cluster health and shard status
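The replica and awareness settings can be applied as in the following sketch. The attribute name "zone" is an assumption: it only takes effect if each node declares node.attr.zone in its elasticsearch.yml. ES_URL and the function names are likewise hypothetical.

```shell
# ES_URL is an assumed variable; adjust host, port, scheme and credentials.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the settings body that sets the replica count for an index.
replica_settings_body() {
  printf '{"index": {"number_of_replicas": %d}}' "$1"
}

# Apply a replica count to one index: set_replicas <index> <count>
set_replicas() {
  curl -s -X PUT "$ES_URL/$1/_settings" \
    -H 'Content-Type: application/json' -d "$(replica_settings_body "$2")"
}

# Enable allocation awareness on the assumed custom node attribute "zone".
enable_zone_awareness() {
  curl -s -X PUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d '{"persistent": {"cluster.routing.allocation.awareness.attributes": "zone"}}'
}
```

For example, set_replicas my_index 1 keeps one replica of every shard in my_index, so losing a single node leaves the cluster YELLOW rather than RED.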
Production Example
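Last resort: promote the stale copy held on node_name to primary. Any writes that the stale copy missed are lost, which is why accept_data_loss must be set explicitly.
curl -X POST "localhost:9200/_cluster/reroute" -H 'Content-Type: application/json' -d '{"commands": [{"allocate_stale_primary": {"index": "my_index", "shard": 0, "node": "node_name", "accept_data_loss": true}}]}'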