Description
The Percona MongoDB Exporter currently does not expose any metric related to the MongoDB fsyncLock state (i.e., when the server is locked using db.fsyncLock() for backup operations).
This makes it difficult to monitor or alert on situations where a MongoDB instance is stuck in a locked state or backups leave the node locked longer than expected.
Expected Behavior:
- The exporter should expose one or more Prometheus metrics indicating:
  - Whether the server is currently fsync-locked
  - The lock count returned by the fsync command
  - The duration the instance has been in fsyncLock (optional but useful)
Example potential metrics:
mongodb_fsync_lock_state (0/1)
mongodb_fsync_lock_count
mongodb_fsync_lock_seconds
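For illustration, here is a rough sketch of how these three metrics might look in the Prometheus text exposition format. The `fsyncStatus` struct and `renderFsyncMetrics` function are hypothetical names used only for this example and are not part of the exporter; a real implementation would presumably register gauges through the Prometheus Go client rather than format text by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// fsyncStatus is a hypothetical snapshot of the values the exporter would collect.
type fsyncStatus struct {
	Locked        bool    // true while the server is fsync-locked
	LockCount     int64   // lock count as reported by MongoDB
	LockedSeconds float64 // how long the lock has been observed
}

// renderFsyncMetrics formats the snapshot as Prometheus text exposition lines.
func renderFsyncMetrics(s fsyncStatus) string {
	state := 0
	if s.Locked {
		state = 1
	}
	var b strings.Builder
	fmt.Fprintf(&b, "# TYPE mongodb_fsync_lock_state gauge\nmongodb_fsync_lock_state %d\n", state)
	fmt.Fprintf(&b, "# TYPE mongodb_fsync_lock_count gauge\nmongodb_fsync_lock_count %d\n", s.LockCount)
	fmt.Fprintf(&b, "# TYPE mongodb_fsync_lock_seconds gauge\nmongodb_fsync_lock_seconds %g\n", s.LockedSeconds)
	return b.String()
}

func main() {
	// Example: a node that has been locked once for 42.5 seconds.
	fmt.Print(renderFsyncMetrics(fsyncStatus{Locked: true, LockCount: 1, LockedSeconds: 42.5}))
}
```

With gauges like these, an alert could fire when `mongodb_fsync_lock_state` stays at 1 longer than the expected backup window.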
Why This Matters
- Backups that rely on db.fsyncLock() can leave a node in a locked state if something fails.
- Operators currently have no way to detect this via Prometheus/Grafana.
- Monitoring this state is critical for:
  - Backup automation
  - Cluster availability
  - Detecting issues on MongoDB secondaries used for backups
  - Preventing prolonged write blocking due to accidental fsync-lock
Suggested Solution
Add support in the exporter to read and expose fsyncLock values from MongoDB diagnostic output by:
- Capturing the fsyncLock / lockCount returned by MongoDB admin commands.
  - Lock state command: db.adminCommand({ currentOp: 1 }).fsyncLock
- Mapping these fields to Prometheus metrics under a consistent naming scheme.
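A minimal sketch of the mapping step, assuming the `currentOp` reply has already been decoded into a generic map. As far as I understand, the `fsyncLock` field only appears in the reply while the server is locked, so a missing field is treated as "not locked" here. I am also assuming the server does not report how long it has been locked, so the duration is tracked on the exporter side from the first scrape that sees the lock. `isFsyncLocked` and `lockTracker` are hypothetical names, not existing exporter code.

```go
package main

import (
	"fmt"
	"time"
)

// isFsyncLocked reports whether a decoded currentOp reply has fsyncLock set.
// A missing or non-boolean field is treated as "not locked".
func isFsyncLocked(reply map[string]interface{}) bool {
	locked, ok := reply["fsyncLock"].(bool)
	return ok && locked
}

// lockTracker remembers when the lock was first observed, so a duration
// can be derived across scrapes (assumption: tracked by the exporter itself).
type lockTracker struct {
	locked bool
	since  int64 // Unix seconds of the first scrape that saw the lock
}

// observe records the current lock state and returns the seconds the lock
// has been observed so far (0 when the server is not locked).
func (t *lockTracker) observe(locked bool, nowUnix int64) int64 {
	if !locked {
		t.locked = false
		return 0
	}
	if !t.locked {
		t.locked = true
		t.since = nowUnix
	}
	return nowUnix - t.since
}

func main() {
	tracker := &lockTracker{}
	// Stand-in for a decoded reply from db.adminCommand({ currentOp: 1 }).
	reply := map[string]interface{}{"fsyncLock": true}
	locked := isFsyncLocked(reply)
	fmt.Println("locked:", locked)
	fmt.Println("seconds:", tracker.observe(locked, time.Now().Unix()))
}
```

One limitation of this approach: the measured duration starts at the first scrape that sees the lock, so it undercounts by up to one scrape interval and resets if the exporter restarts.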
Happy to help test or validate the feature if added.