Why Prometheus + Grafana over other monitoring options?

posted 2 months ago

silmarine@discuss.tchncs.de

I’ve been setting up and testing Prometheus and Grafana for about a week now, since that seems to be the universally accepted solution for self-hosted monitoring. But I’m starting to question why it is so accepted. On top of Prometheus not seeming useful on its own (needing Grafana to visualize and Alertmanager for alerts), it feels like with each thing I want to monitor I have to spin up another Docker container to export/gather the data. There are other options like LibreNMS that seem to have all that built into one container. So what does this Prometheus/Grafana stack have that other monitoring services don’t? Is it really worth having to set up each of these specialized exporters and dashboards? Or am I mistaken that it’s the main solution everyone uses? Are you using something different for monitoring?

Nundrum@yall.theatl.social

1 point

2 months ago

LibreNMS has a very different purpose from your other monitoring options - it’s network monitoring at a large scale, not a generic data storage / data visualization platform. If your goal is to monitor your selfhosted servers and services, this is going to be an odd fit and you’ll probably struggle against it.

Better fits for an out-of-the-box monitoring setup would be CheckMK or Zabbix.

These other “stacks” for monitoring are a little more bespoke. To cover it briefly:

Grafana is popular because it is a fantastic visualization platform. The backend data storage is pluggable.

There are many options for data storage, all of which are a little different. Graphite is push-based, and its StatsD compatibility makes it super simple to push your own metrics into it. Prometheus is pull-based. And InfluxDB is more of a time-series database.
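To make the push side of that distinction concrete, here is a minimal sketch of sending one metric in the plain-text StatsD line format (`name:value|type`) over UDP. The metric names are made up for illustration, and the host and port are assumptions (8125 is the usual StatsD default); adjust them to whatever your collector listens on.

```python
import socket


def statsd_line(name: str, value: float, metric_type: str = "c") -> str:
    """Format one metric in the StatsD line format: name:value|type."""
    return f"{name}:{value}|{metric_type}"


def push_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget a single StatsD line over UDP (the push model).

    The host and port are assumptions; nothing checks that a collector
    is actually listening, which is typical for UDP-based pushes.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

Calling `push_metric(statsd_line("backups.completed", 1))` at the end of a backup script would count one finished run. With a pull-based system the direction is reversed: the script or service exposes its numbers and waits to be scraped.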

Pax@lemmy.world

2 points

2 months ago

I’ve been using Zabbix for years now. Does what I need it to do.

Max-P@lemmy.max-p.me

3 points

2 months ago

Separate components that each do one thing, only that thing, and do it well are good. Extra containers are basically free.

The exporters provide the metrics. They can be standalone executables like the node exporter, or they can be built into the apps themselves easily since it’s just HTTP. It’s trivial to add metrics to just about anything without needing extra ports. The protocol is also easier and more efficient than SNMP.
Prometheus scrapes those metrics and stores them in its database. In other apps that’d be the role things like PostgreSQL have: you don’t really use it directly, but it’s no less important.
Grafana is the frontend you slap in front of Prometheus to actually display your metrics.
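Alertmanager sends the alerts. Prometheus evaluates the alerting rules against the metrics and hands firing alerts to Alertmanager, which groups, deduplicates, and routes the notifications. It’s separate because if your Prometheus box goes down, how are you gonna be alerted of that?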
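As a rough illustration of the "it’s just HTTP" point about exporters, here is a minimal sketch of one using only the Python standard library. It serves a single gauge in the Prometheus text exposition format on `/metrics`. The metric name `demo_load1` and the port are hypothetical, `os.getloadavg()` is Unix-only, and a real service would more likely use an official client library than hand-render the text.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import os


def render_metrics(samples: dict[str, float]) -> str:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect() -> dict[str, float]:
    """Gather current values at scrape time (metric name is made up)."""
    load1, _, _ = os.getloadavg()  # Unix-only
    return {"demo_load1": load1}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9877) -> None:
    """Block forever, answering scrapes on an arbitrary example port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Running `serve()` and pointing a Prometheus scrape job at that port would be enough for the value to show up as a series. The exporter keeps no history; it only reports the current value each time it is scraped.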

All 4 of those can be swapped with something else equivalent and it all still works. Don’t like the UI? Replace Grafana. Don’t like Prometheus? There’s VictoriaMetrics and InfluxDB.
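The pieces are loosely coupled partly because a frontend only needs to speak HTTP to the metrics store. As a hedged sketch of what that looks like, the following runs an instant query against Prometheus’s `/api/v1/query` endpoint and maps each series’ `instance` label to its value. The base URL is an assumption, and error handling is kept to a minimum.

```python
import json
import urllib.parse
import urllib.request


def parse_instant_vector(payload: dict) -> dict[str, float]:
    """Map each series' `instance` label to its sample value."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    results = {}
    for series in payload["data"]["result"]:
        _timestamp, raw_value = series["value"]  # value arrives as a string
        results[series["metric"].get("instance", "")] = float(raw_value)
    return results


def instant_query(base_url: str, promql: str) -> dict[str, float]:
    """Run an instant query against /api/v1/query and parse the reply."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_instant_vector(json.load(response))
```

For example, `instant_query("http://localhost:9090", "up")` (assuming Prometheus on its default port) would return something like a mapping from each scraped target to 1.0 or 0.0, depending on whether the last scrape succeeded.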

It looks silly on a small scale, but it scales up very well: a couple hundred VMs per Prometheus install, a node exporter on every VM, and a single Grafana cluster to visualize the data for the whole infrastructure at once.

That makes it all well liked in enterprise, which means there are exporters for damn near anything (even the Lemmy server has a built-in exporter I can scrape with Prometheus), which in turn makes it the easy solution for self-hosters too, and here we are.

I feel like it’s easier to set up than some of the all-in-one solutions I’ve used previously, despite being several components. They’re all components that basically just work out of the box.

godber@lemmy.az.social

1 point

2 months ago

The number one reason is that Grafana is king of the open source operational dashboards. Grafana works with so many backends and has worked so well for so long it’s hard to beat.

Then, when you start considering metric collection and storage, setting up a node exporter and the blackbox exporter covers 80% of your use cases. There are scaling and security advantages to Prometheus’ pull architecture too.

I share your annoyance with having to roll out multiple services, but I recently bundled them all together into a Docker Compose file that I had been considering sharing publicly. If you can wait a couple of weeks I can share that.

One other thought is that separating the development of the exporter components means that the teams with real expertise in the service being monitored can collect the best metrics, rather than a monitoring project making a half-assed metric collector for a service they have never used or managed.

Also, Grafana has built-in alerting, so Alertmanager can be skipped in some cases.

darkham@lemmy.ml

2 points

2 months ago

I was asking myself the same. Since everyone talks about these, I used them until I discovered CheckMK and others. Now I’m no longer using Grafana and Prometheus…

Selfhosted

!selfhosted@lemmy.world
