Metrics thread table size #1456

Closed
erthalion opened this issue Dec 5, 2023 · 2 comments · Fixed by #1558
Assignees
Labels
optional Nice to have feature, but not a blocker

Comments

@erthalion
Contributor

erthalion commented Dec 5, 2023

While troubleshooting vanilla Falco and previous memory-related issues, it
proved useful to understand how the thread cache grows.
To make this more visible, expose the current thread table size as a new
Prometheus metric, e.g. rox_collector_events{type="threadCacheSize"}.
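
For illustration, here is a minimal sketch of what the exposed sample could look like in the Prometheus text exposition format. The helper name RenderThreadCacheSizeSample is hypothetical and not part of Collector; the real change would presumably register the value through the existing metrics code rather than format strings by hand.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Collector): renders one sample line in the
// Prometheus text exposition format, using the metric name and label value
// suggested in this issue.
std::string RenderThreadCacheSizeSample(std::uint64_t thread_count) {
  return "rox_collector_events{type=\"threadCacheSize\"} " +
         std::to_string(thread_count);
}
```

Passing a thread count of 1234 yields the line `rox_collector_events{type="threadCacheSize"} 1234`.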

The numbers we're interested in can be obtained via the libsinsp inspector
function get_thread_count(). Since this metric does not depend directly on the
event stream, we need to decide when exactly to read the counter's current value:
often enough to notice relevant spikes, but not so often that we do unnecessary
work when it changes slowly.

When troubleshooting vanilla Falco, the hacky solution I used was to log the thread
count with throttled logging, and that provided enough information. Thus, the
proposal is to update the thread count metric based on the number of processes
received, but with some throttling, e.g. capture the current thread table size
on every n-th process received.
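
The throttling idea could be sketched roughly as follows. This is only an illustration under assumptions: ThrottledThreadCountSampler and its members are hypothetical names, and the read callback stands in for whatever call returns the thread table size (for example the get_thread_count() function mentioned above).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper (not part of Collector): reads the thread table size
// only on every n-th process event, so the read cost is paid rarely.
class ThrottledThreadCountSampler {
 public:
  using ReadFn = std::function<std::uint64_t()>;

  ThrottledThreadCountSampler(std::uint64_t every_n, ReadFn read_thread_count)
      : every_n_(every_n), read_thread_count_(std::move(read_thread_count)) {
    assert(every_n_ > 0);  // a zero period would make the modulo undefined
  }

  // Call once per received process event.
  // Returns true when a fresh sample was taken on this call.
  bool OnProcessEvent() {
    if (++seen_ % every_n_ != 0) {
      return false;
    }
    last_sample_ = read_thread_count_();
    return true;
  }

  // Most recent sampled value; 0 until the first sample is taken.
  std::uint64_t last_sample() const { return last_sample_; }

 private:
  std::uint64_t every_n_;
  ReadFn read_thread_count_;
  std::uint64_t seen_ = 0;
  std::uint64_t last_sample_ = 0;
};
```

The trade-off is the one described above: a larger n means less work but a higher chance of missing a short spike between samples.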

Part of #1320

@erthalion erthalion changed the title prometheus metric for thread table size Metrics thread table size Dec 5, 2023
@ovalenti
Contributor

ovalenti commented Dec 5, 2023

The CollectorStatsExporter runs a dedicated loop that publishes the current counters/timers every 5s. Maybe this is acceptable as a time basis?

@erthalion
Contributor Author

> The CollectorStatsExporter runs a dedicated loop that publishes the current counters/timers every 5s. Maybe this is acceptable as a time basis?

Yeah, sounds reasonable.
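
A time-based variant along those lines might look like the sketch below. It is not the actual CollectorStatsExporter code: RunExporterLoop and all of its parameters are hypothetical, with callbacks standing in for the thread count read, the metric update, and the stop condition.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical stand-in for a stats exporter loop (not the real
// CollectorStatsExporter): publishes the thread table size once per interval
// until keep_running() returns false.
void RunExporterLoop(const std::function<std::uint64_t()>& read_thread_count,
                     const std::function<void(std::uint64_t)>& publish,
                     std::chrono::milliseconds interval,
                     const std::function<bool()>& keep_running) {
  while (keep_running()) {
    publish(read_thread_count());           // one read per tick
    std::this_thread::sleep_for(interval);  // e.g. 5000 ms for a 5s period
  }
}
```

With a fixed period the sampling cost no longer depends on event volume, at the price of missing spikes shorter than the interval.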

@ovalenti ovalenti self-assigned this Dec 5, 2023
@erthalion erthalion added the optional Nice to have feature, but not a blocker label Jan 17, 2024