z3bra
I’ve just started digging into it myself ! Here’s my current setup (I’ll see how it scales in the long term):
- syslog on every host
- Telegraf collects and parses logs
- InfluxDB stores everything
- Grafana for dashboards
I run OpenBSD on all my servers, and configure all the services to log via syslog.
Then I configure syslog to send only the logs I care about (https, DNS, …) to a central telegraf instance, using the syslog protocol (RFC3164).
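To give an idea, the forwarding rules look roughly like this (hostname, port and program names are placeholders here, check syslog.conf(5) for the exact syntax on your release):
# append per-program forwarding rules to /etc/syslog.conf (placeholder values)
cat >> /etc/syslog.conf <<'EOF'
!httpd
*.*     @tcp://collector.example.com:6514
!unbound
*.*     @tcp://collector.example.com:6514
EOF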
On this collector, telegraf gets all these logs and parses them using custom grok patterns I’m currently building, to make sense out of every log line it receives. The parsed logs are in turn stored in InfluxDB, running on the same host.
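On the telegraf side, the skeleton is just a syslog input and an InfluxDB output (a rough sketch, as the exact options vary between telegraf versions, and I’m leaving the grok patterns out):
# minimal telegraf.conf sketch — listener port and database name are placeholders
cat >> /etc/telegraf/telegraf.conf <<'EOF'
[[inputs.syslog]]
  server = "tcp://:6514"

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "logs"
EOF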
I then use Grafana to query InfluxDB and create dashboards out of these logs. Grafana can also display the logs “as-is” so you can search through them (it’s not ideal though, as you simply regex-search the full message, but at least it’s on par with grep).
This setup is fairly new and seems to work very well. Telegraf is also very low on resource usage for now. I’ll have to keep adding grok patterns and send more application logs to it to see how it handles the load. I still have a few unanswered questions, but time will tell:
Q: Should I first collect via a central syslog before sending to telegraf ?
This would let syslog archive all logs in plain text, and rotate and compress them. I would also have only a single host to configure for sending logs to telegraf. However, this would eat up disk space, and could hide the original sending hostname of each log. I might try that someday.
Q: Should I run telegraf on each host ?
This would distribute the load of the grok parsing amongst all hosts, and all telegraf processes would then send directly to the central one for collection, or even straight into InfluxDB. I would also benefit from telegraf being installed on each host to collect more data (CPU, network stats, …). However, it makes the configuration more complex to handle.
Q: What is a good retention period ?
For now, InfluxDB doesn’t expire any data, as I don’t have much yet. In the long run, I should probably delete old data, but it’s hard to tell what “old” means in my case.
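At least setting a window is a one-liner once I make up my mind (InfluxDB 1.x syntax, the database name and duration are made up):
# hypothetical retention policy: keep 90 days of logs, then expire them
influx -execute 'CREATE RETENTION POLICY "ninety_days" ON "logs" DURATION 90d REPLICATION 1 DEFAULT'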
Q: Do I need an interface to read logs ?
I use this setup mostly for graphs, as Grafana can make sense of fields like “http_verb”, “http_code” and such. However, it is much more practical for me to dig into the logs right on the server, in /var/log. Having an interface like chronograf or graylog seems practical, but I feel like it’s overdoing it.
Bonus:
Short answer: don’t bother, it’s too complex to set up (unless your app is HTTP or supports the PROXY protocol). You’re better off reading your proxy logs instead.
Long answer: what you want is called “IP transparency” and requires your proxy to “spoof” the IP address of the client when forwarding packets to the remote server. Some proxies do it (Nginx Plus, Avi Vantage, Fortinet), but they’re paid products. I don’t know of free solutions, as I’ve only ever implemented it with the ones listed above.
This requires a fairly complex setup though:
0. IP address spoofing
The proxy must rewrite all the requests it forwards, spoofing the client’s IP address so that the traffic looks like it originates from the client at the TCP layer.
1. Backend server routing
As the packets will most likely originate from random IPs on the internet, your backend server must have a way to route the return traffic back to the proxy instead of its default gateway. Otherwise you’d implement what is called “Direct Server Return”, which won’t work in your case (packets would be dropped by the client, as they’d originate from your backend server directly and not from the proxy).
You have two solutions here:
- set your default gateway to the proxy over its VPN interface (don’t do that unless you truly understand all the implications of such a setup)
- use packet tagging and VRF on the backend server to route all traffic coming in from the VPN back out the VPN interface, as sketched below (I’m not even sure this would work with an IPsec VPN though, because of ACLs…)
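For reference, on a Linux backend the packet tagging approach would look something like this (interface name, mark and table number are made up, and I haven’t tested this exact snippet):
# mark connections coming in through the VPN, restore the mark on reply packets
iptables -t mangle -A PREROUTING -i vpn0 -j CONNMARK --set-mark 1
iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark
# send marked return traffic back through the VPN instead of the default gateway
ip rule add fwmark 1 table 100
ip route add default dev vpn0 table 100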
2. Intercept and route back return traffic
The proxy must know that it has to intercept this traffic, addressed to the client’s IP, as part of a proxied request. This requires a proxy that can bind to an IP that is not configured on the system.
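On Linux this part is a single sysctl away (the proxy still needs something like TPROXY rules to actually grab the return traffic):
# allow binding to addresses not configured on any local interface
sysctl -w net.ipv4.ip_nonlocal_bind=1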
So yeah, don’t do that unless you NEED to (trust me, I had to, and I hated setting it up).
Edit: apparently haproxy supports this feature, which they call transparent mode.
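I haven’t tried it myself, but judging from their docs it boils down to a “usesrc clientip” line in the backend (names and addresses below are placeholders, and it needs TPROXY support plus enough privileges):
# hypothetical haproxy.cfg fragment for transparent mode
cat >> /etc/haproxy/haproxy.cfg <<'EOF'
backend app
    # connect to the server using the client's address as source
    source 0.0.0.0 usesrc clientip
    server app1 10.0.0.10:8080
EOF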
ELI5
So it’s a Saturday afternoon, a very hot one, and you ask your daddy for an ice cream (hosted service). The shop you walk into is very bizarre though, as there is one vendor (TCP port) for each flavor (docker service/virtualhost). It’s tricky because they’re all roaming around the shop, and you don’t know who’s responsible for which flavor. Your dad is also not very comfortable paying these vendors directly, because they only accept cash and don’t provide any receipt (self-signed certificate/no TLS).
Fortunately, there is the manager (reverse proxy) ! She is right where you expect her: behind the counter (port 80/443), accepts credit cards and has a receipt machine (domain name + associated certificate). She also knows everyone on her team, and who’s responsible for each flavor !
So you and your dad go see the nice lady, ask for a strawberry + chocolate ice cream, and pay her directly. Once that’s done, she forwards your request to the vendors responsible for each flavor, and gives you back your ice cream + receipt. Life is good, and tasty !
Don’t even bother with a swap partition. Create an empty file on your / partition instead, so you can grow/shrink it as needed:
# create a 4 GiB swap file, then format and enable it
dd if=/dev/zero of=/SWAP bs=1M count=4096
chmod 600 /SWAP
mkswap /SWAP
swapon /SWAP
Is the flying puffy the techno-mage’s system ? If yes, what’s the hostname ?
I did a cool exercise some time ago: checking my top 10 most used commands, to see how I could “optimize” them, and maybe create a few aliases to save a few keystrokes. Turns out I don’t need that many aliases in the end:
alias v='vis' # my text editor
alias sv='doas vis'
alias ll='ls -l'
And that’s pretty much it ^^ I do have a lot of scripts though, some of them one-liners, but that keeps them shell independent, which I like :)
For reference, here is my analysis of my top 10 most used commands.
edit: I do have a bunch of git aliases though, for common subcommands I use. I prefer using them as git <alias> over shell-specific aliases, so I can always type git and not have to think about whether I need a git command or a shell alias.
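Declaring those is a one-liner each (these two are made-up examples, not my actual list):
git config --global alias.st 'status -sb'
git config --global alias.co checkout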
I used to do it, but not anymore, as it was kind of clunky in my case. But I used Dendrite as the Matrix server rather than Synapse, so that’s most likely the reason.
I ended up moving the database to a separate host rather than using SQLite, and it added way too much latency to the whole system. Storage was also a big issue in my case, as all the media received by the bridges is stored locally, and boy does it grow fast haha.
When Dendrite reaches a more mature state, I’ll 100% do it again though !