Disk Watermark Exceeded

Severity: High

Elasticsearch Version: 7.17.0

Problem

Shard allocation is refused because a disk watermark has been exceeded

Root Cause

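Disk usage on one or more data nodes exceeded the high watermark (default 90%), so Elasticsearch stops allocating shards to those nodes and tries to relocate shards away from them to avoid exhausting disk space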

How to Detect

Symptoms

Elasticsearch cluster health shows red or yellow status
Shard allocation failures in logs
Cluster state indicates shards unassigned due to disk issues

Commands

curl -X GET 'localhost:9200/_cluster/health?pretty'
curl -X GET 'localhost:9200/_cat/shards?v'
curl -X GET 'localhost:9200/_cluster/allocation/explain?pretty'
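The full `_cat/shards` listing can be long on a large cluster. As a minimal sketch, the helper below narrows it to unassigned shards and the reason each one is unassigned. The function names and the `ES_URL` variable are hypothetical, and the address is assumed to match the commands above.

```shell
# Assumption: Elasticsearch is reachable at the same address used above.
ES_URL="${ES_URL:-localhost:9200}"

# Keep only rows whose 4th column (state) is UNASSIGNED.
# Expects headerless _cat/shards output on stdin with the columns
# index, shard, prirep, state, unassigned.reason.
filter_unassigned() {
  awk '$4 == "UNASSIGNED"'
}

# Print every unassigned shard together with its unassigned reason.
list_unassigned_shards() {
  curl -s "${ES_URL}/_cat/shards?h=index,shard,prirep,state,unassigned.reason" \
    | filter_unassigned
}
```

Run `list_unassigned_shards` after sourcing the snippet, then use the allocation explain API on any shard it reports to confirm that the disk watermark is the blocking decider.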

Remediation Steps

Identify nodes with high disk usage using '_cat/allocation' API
Free disk space by deleting unnecessary data or snapshots
Raise the high watermark temporarily above its current value (default 90%), keeping it at or below the flood-stage watermark (default 95%): curl -X PUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"persistent": {"cluster.routing.allocation.disk.watermark.high": "93%"}}'
Reroute shards away from overutilized nodes: curl -X POST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{"commands": [{"move": {"index": "<index_name>", "shard": <shard_number>, "from_node": "<node_name>", "to_node": "<target_node>"}}]}'
Monitor shard reallocation and disk usage
Once disk space is freed, restore the watermark setting to its default by setting its value to null
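The first and last steps above can be sketched as shell helpers. This is an illustration under stated assumptions, not a definitive procedure: the function names and `ES_URL` are hypothetical, and the threshold of 90 only mirrors the default high watermark.

```shell
# Assumption: Elasticsearch is reachable at the same address used above.
ES_URL="${ES_URL:-localhost:9200}"

# Show the disk allocation settings currently in effect, including defaults.
show_disk_watermarks() {
  curl -s "${ES_URL}/_cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.disk&pretty"
}

# Print nodes whose disk usage is at or above a percentage threshold.
# Expects headerless "_cat/allocation?h=disk.percent,node" rows on stdin.
# The UNASSIGNED row has no disk figures, evaluates to 0, and is skipped.
nodes_over_threshold() {
  awk -v t="${1:-90}" '$1 + 0 >= t + 0 { print $2, $1 "%" }'
}

# List nodes above the given threshold (default 90, the default high watermark).
list_full_nodes() {
  curl -s "${ES_URL}/_cat/allocation?h=disk.percent,node" | nodes_over_threshold "$1"
}

# Restore the high watermark to its default by setting it to null.
reset_high_watermark() {
  curl -s -X PUT "${ES_URL}/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d '{"persistent": {"cluster.routing.allocation.disk.watermark.high": null}}'
}
```

For example, `list_full_nodes 85` would show nodes that have already crossed the default low watermark, which can give earlier warning than waiting for the high watermark.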

Prevention

Implement regular disk usage monitoring and alerts
Configure appropriate disk watermarks based on node capacity
Schedule routine cleanup of old indices and snapshots
Use index lifecycle management (ILM) policies to automate data retention
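As one possible way to automate retention, the sketch below builds a minimal ILM policy that deletes indices once they reach a given age and registers it through the `_ilm/policy` API. The function names, the policy name, and the 30-day default are assumptions chosen for illustration; real retention periods depend on your data requirements.

```shell
# Assumption: Elasticsearch is reachable at the same address used above.
ES_URL="${ES_URL:-localhost:9200}"

# Print an ILM policy body with a single delete phase.
# $1 = retention in days (hypothetical default: 30).
ilm_delete_policy() {
  printf '{"policy":{"phases":{"delete":{"min_age":"%sd","actions":{"delete":{}}}}}}' "${1:-30}"
}

# Create or update the policy under a given name.
# $1 = policy name, $2 = retention in days.
apply_ilm_policy() {
  curl -s -X PUT "${ES_URL}/_ilm/policy/$1" \
    -H 'Content-Type: application/json' \
    -d "$(ilm_delete_policy "$2")"
}
```

For example, `apply_ilm_policy logs-retention 30` would register a policy named `logs-retention` (a hypothetical name). The policy takes effect only for indices that reference it through the `index.lifecycle.name` setting, typically set in an index template.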

Production Example

curl -X PUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"persistent": {"cluster.routing.allocation.disk.watermark.high": "93%"}}'
