Add Pressure Stall Information (PSI) metrics (reopened #2996) #3068
# Conflicts: # docs/system/system-metrics.md
Co-authored-by: James Thompson <[email protected]>
This PR was marked stale due to lack of activity. It will be closed in 7 days.
@thompson-tomo @braydonk @trask
@alpineQ can you rebase/merge in master as the doc templates have been updated.
@thompson-tomo any updates on this?
thompson-tomo
left a comment
Docs and definitions look good to me based on published guidance & clarification.
hi @alpineQ, this will need review and approval from @open-telemetry/semconv-system-approvers |
Closes #2995
Changes
This PR adds support for Linux Pressure Stall Information (PSI) metrics to the system semantic conventions.
PSI is a Linux kernel feature (available since kernel 4.20) that identifies and quantifies resource contention by measuring the time impact that CPU, memory, and I/O resource crunches have on workloads.
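For context, the kernel exposes PSI through `/proc/pressure/cpu`, `/proc/pressure/memory`, and `/proc/pressure/io`, where each line starts with `some` or `full` followed by `avg10=`, `avg60=`, `avg300=`, and `total=` fields. The sketch below shows one way that output might map onto the metric and attribute names listed under New Metrics and New Attributes. It is an illustration only, not the collector's implementation: `PsiSample`, `parse_psi`, and `SAMPLE` are hypothetical names, the sample values are made up, and the pressure value is passed through exactly as the kernel reports it, with no unit conversion.

```python
from typing import Iterator, NamedTuple


class PsiSample(NamedTuple):
    """One data point; a hypothetical container used only for this sketch."""
    metric: str       # metric name proposed in this PR
    resource: str     # system.psi.resource: cpu, memory, or io
    stall_type: str   # system.psi.stall_type: some or full
    window: str       # system.psi.window: 10s, 60s, 300s ("" for total_time)
    value: float


def parse_psi(resource: str, text: str) -> Iterator[PsiSample]:
    """Parse the contents of /proc/pressure/<resource> into samples."""
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        stall_type, fields = parts[0], parts[1:]
        for field in fields:
            key, _, raw = field.partition("=")
            if key.startswith("avg"):
                # avg10 / avg60 / avg300 -> windows 10s / 60s / 300s
                yield PsiSample("system.linux.psi.pressure", resource,
                                stall_type, key[3:] + "s", float(raw))
            elif key == "total":
                # cumulative stall time in microseconds
                yield PsiSample("system.linux.psi.total_time", resource,
                                stall_type, "", float(raw))


# Made-up sample in the /proc/pressure/memory format.
SAMPLE = """\
some avg10=1.50 avg60=0.75 avg300=0.10 total=123456
full avg10=0.50 avg60=0.25 avg300=0.00 total=7890
"""

samples = list(parse_psi("memory", SAMPLE))
for s in samples:
    print(s)
```

On a Linux host with PSI enabled, the same function could be fed the text read from `/proc/pressure/memory` instead of `SAMPLE`.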
New Metrics
system.linux.psi.pressure (Gauge): Measures resource pressure as a percentage of time that tasks were stalled over a time window (10s, 60s, or 300s)
system.linux.psi.total_time (Counter): Tracks the total cumulative stall time in microseconds since system boot
New Attributes
system.psi.resource: The resource type (cpu, memory, io)
system.psi.stall_type: The stall severity (some for partial stalls, full for complete stalls where all non-idle tasks are blocked)
system.psi.window: The time window for pressure calculation (10s, 60s, 300s)
Use Cases
PSI metrics enable:
References
Relevant issues and PRs
There are issues on this matter in:
And 2 PRs that I am proposing to address these issues:
Important
Pull request acceptance is subject to the triage process described in Issue and PR Triage Management.
PRs that do not follow the guidance above may be automatically rejected and closed.
Merge requirement checklist
[chore] Reopened #2996