Hi! I’m new to self-hosting. Currently I am running a Jellyfin server on an old laptop. I am very curious to host other things in the future, like Immich or other services. I see a lot of mention of a program called Docker.
Even after searching for this on the internet, I am still not very clear on what it does.
Could someone explain this to me like I’m stupid? What does it do and why would I need it?
Also, what are other services that might be interesting to self-host in the future?
Many thanks!
EDIT: Wow! Thanks for all the detailed and super quick replies! I’ve been reading all the comments here and am concluding that (even though I am currently running only one service) it might be interesting to start using Docker to run all (future) services separately on the server!
A program isn’t just a program: in order to work properly, the context in which it runs — system libraries, configuration files, other programs it might need to help it such as databases or web servers, etc. — needs to be correct. Getting that stuff figured out well enough that end users can easily get it working on random different Linux distributions with arbitrary other software installed is hard, so developers eventually resorted to getting it working on their one (virtual) machine and then just (virtually) shipping that whole machine.
Yes, technically chroot and jails are wrappers around kernel namespaces / cgroups and so is docker.
But containers were born in a post chroot era as an attempt at making the same functionality much more user friendly and focused more on bundling cgroups and namespaces into a single superset, where chroot on its own is only namespaces. This is super visible in early docker where you could not individually dial those settings. It’s still a useful way to explain containers in general in the sense that comparing two similar things helps you define both of them.
Also, cgroups have evolved alongside containers at this point and work rather differently now compared to 18 years ago, when cgroups were invented and this differentiation mattered more than now. We’re at the point where differentiation between VMs and containers is getting really hard, since both more and more often rely on the same kernel features that were developed in recent years on top of cgroups.
a chroot is different, but it’s an easy way to get an idea of what docker is:
it also contains all the libraries and binaries that reference each other, such that if you call commands they use the structure of the chroot
this is far more relevant to a basic understanding of what docker does than explaining kernel namespaces. once you have the knowledge of “shipping around applications including dependencies”, then you can delve into isolation and other kinds of virtualisation
Isn’t all of this a complete waste of computer resources?
I’ve never used Docker but I want to set up an Immich server, and Docker is the only official way to install it. And I’m a bit afraid.
Edit: thanks for downvoting an honest question. Wtf.
It can be, yes. One of the largest complaints with Docker is that you often end up running the same dependencies a dozen times, because each of your dozen containers uses them. But the trade-off is that you can run a dozen different versions of those dependencies, because each image shipped with the specific version they needed.
Of course, the big issue with running a dozen different versions of dependencies is that it makes security a nightmare. You’re not just tracking exploits for the most recent version of what you have installed. Many images end up shipping with out-of-date dependencies, which can absolutely be a security risk under certain circumstances. In most cases the risk is mitigated by the fact that the services are isolated and don’t really interact with the rest of the computer. But it’s at least something to keep in mind.
If it were actual VMs, it would be a huge waste of resources. That’s really the purpose of containers. It’s functionally similar to running a separate VM specific to every application, except you’re not actually virtualizing an entire system like you are with a VM. Containers are actually very lightweight. So much so, that if you have 10 apps that all require database backends, it’s common practice to just run 10 separate database containers.
On the contrary. It relies on the premise of segregating binaries, config and data. But since it is only running one app, then it is a bare minimum version of it. Most containers systems include elements that also deduplicate common required binaries. So, the containers are usually very small and efficient. While a traditional system’s libraries could balloon to dozens of gigabytes, pieces of which are only used at a time by different software. Containers can be made headless and barebones very easily. Cutting the fat, and leaving only the most essential libraries. Fitting in very tiny and underpowered hardware applications without losing functionality or performance.
Don’t be afraid of it, it’s like Lego but for software.
It’s not. Imagine Immich required library X to be at Y version, but another service on the server requires it to be at Z version. That will be a PitA to maintain, not to mention that getting a service to run at all can be difficult due to a multitude of reasons in which your system is different from the one where it was developed so it might just not work because it makes certain assumptions about where certain stuff will be or what APIs are available.
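To make the version clash concrete, here is a toy Python sketch. It is only a model of the idea, not how Docker works internally, and the service names and the "Y"/"Z" versions are placeholders taken from the example above.

```python
# Toy model: each service pins one exact version of each library it needs.
# Service names and version labels are hypothetical placeholders.
REQUIREMENTS = {
    "immich": {"libX": "Y"},
    "other-service": {"libX": "Z"},
}


def shared_install_conflicts(requirements):
    """Return the libraries that cannot satisfy every service when only
    one version of each library can be installed system-wide."""
    wanted = {}
    for libs in requirements.values():
        for lib, version in libs.items():
            wanted.setdefault(lib, set()).add(version)
    return {lib: sorted(v) for lib, v in wanted.items() if len(v) > 1}


def per_container_installs(requirements):
    """Give every service its own private copy of its libraries,
    the way an image bundles its own dependencies."""
    return {service: dict(libs) for service, libs in requirements.items()}


print(shared_install_conflicts(REQUIREMENTS))
print(per_container_installs(REQUIREMENTS))
```

On a shared system the first function reports a conflict for libX. With one bundle per service there is nothing to reconcile, at the cost of storing libX twice.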
Docker eliminates all of those issues because it’s a reproducible environment, so if it runs on one system it runs on another. There’s a lot of value in that, and I’m not sure which resource you think is being wasted, but Docker is almost seamless, with not much overhead; you won’t feel it even on a Raspberry Pi Zero.
The main “wasted” resources here are storage space and maybe a bit of RAM; actual runtime overhead is very limited. It turns out storage and RAM are some of the cheapest resources on a machine, and you probably won’t notice the extra storage or RAM usage.
VMs are heavy, Docker containers are very light. You get most of the benefits of a VM with containers, without paying as high of a resource cost.
I’ve had immich running in a VM as a snap distribution for almost a year now and the experience has been leaps and bounds easier than maintaining my own immich docker container. There have been so many breaking changes over the few years I’ve used it that it was just a headache. This snap version has been 100% hands off “it just works”.
Interesting idea (snap over docker).
I wonder, does using snap still give you the benefit of not having to maintain specific versions of 3rd party software?
But why can I “just install a program” on my windows machine or on my phone and it is that easy?
You might notice that your Windows installation is like 30 gigabytes and there is a huge folder somewhere in the system path called WinSXS. Microsoft bends over backwards to provide you with basically all the versions of all the shared libs ever, resulting in a system that can run programs compiled from decades ago just fine.
In Linux-land usually we just recompile all of the software from source. Sometimes it breaks because Glibc changed something. Or sometimes it breaks because (extremely rare) the kernel broke something. Linus considers breaking the userspace API one of the biggest no-nos in kernel development.
Even so, depending on what you’re doing you can have a really old binary run on your Linux computer if the conditions are right. Windows just makes that surface area of “conditions being right” much larger.
As for your phone, all the apps that get built and run for it must target some kind of specific API version (the amount of stuff you’re allowed to do is much more constrained). Android and iOS both basically provide compatibility for that stuff in a similar way that Windows does, but the story is much less chaotic than on Linux and Windows (and even macOS), because a phone app is not allowed to do that much by comparison.
In case of phones, there’s less of a myriad of operating systems and libraries.
A typical Android app is (eventually) Java with some bundled dependencies and ties in to known system endpoints (for stuff like notifications and rendering graphics).
For Windows, these installers are usually responsible for getting the dependencies. That is why some installers are enormous (and the small ones are often web installers that download the rest, so they only look smaller).
Docker is more aimed at developers and server deployment, you don’t usually use docker for desktop applications. This is the area where you want to skip inconsistencies between environments, especially if these are hard to debug.
Caveat: I am not a programmer, just an enthusiast. Windows programs typically package all of the dependency libraries up with each individual program in the form of DLLs (dynamic-link libraries). If two programs both require the same dependency, they just both have a local copy in their directory.
So instead of having problems getting the fucking program to run, you have problems getting docker to properly build/run when you need it to.
At work, I have one program that fails to build an image because of a 3rd party package who forgot to update their pgp signature; one that builds and runs, but for some reason gives a 404 error when I try to access it on localhost; one that whoever the fuck made it literally never ran it, because the Dockerfile was missing some 7 packages in the apt install line.
Building from source is always going to come with complications. That’s why most people don’t do it. A docker compose file that ‘just’ downloads the stable release from a repo and starts running is dramatically more simple than cross-referencing all your services to make sure there are no dependency conflicts.
There’s an added layer of complexity under the hood to simplify the common use case.
There are two ends here, as a user and as a developer. As a user, Docker images just work, so as a developer you avoid almost every problem that would otherwise have your users struggling and giving up on your software.
Then as a developer, Docker can get complicated, because you need to build a “system” from scratch to run your program. If you’re using an unstable 3rd-party package or are missing packages, it means that without Docker those problems would be happening on the deploy servers instead of your local machine, and each server would have its own set of problems depending on which packages it didn’t have or had in the wrong version, and in fixing that for your service you might be breaking another service already running there.
Yeah, it’s another layer, and so there definitely is an https://xkcd.com/927/ aspect to it… but (at least in theory) only having problems getting Docker (1 program) to run is better than having problems getting N programs to run, right?
(I’m pretty ambivalent about Docker myself, BTW.)
…baby don’t hurt me… No more…
Docker enables you to create instances of an operating system running within a “container” which doesn’t access the host computer unless it is explicitly requested. This is done using a Dockerfile, which is a file that describes in detail all of the settings and parameters for said instance of the operating system. This might be packages to install ahead of time, or commands to create users, compile code, execute code, and more.
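To give a feel for what such a file looks like, here is a small Python sketch that prints a minimal Dockerfile. FROM, RUN, COPY, WORKDIR and CMD are real Dockerfile instructions, but everything else is an assumption for illustration: the render_dockerfile helper, the debian:stable base image, the python3 package and main.py are made up, and the apt-get line only makes sense on a Debian/Ubuntu-style base.

```python
import json


def render_dockerfile(base_image, packages, start_command):
    """Build the text of a minimal Dockerfile.

    Hypothetical helper; assumes a Debian/Ubuntu-style base image,
    since it installs packages with apt-get.
    """
    lines = [f"FROM {base_image}"]  # which image to start from
    if packages:
        # packages to install ahead of time
        lines.append(
            "RUN apt-get update && apt-get install -y " + " ".join(packages)
        )
    lines.append("COPY . /app")   # copy the program into the image
    lines.append("WORKDIR /app")  # run later commands from /app
    # exec-form CMD is a JSON array, so json.dumps produces valid syntax
    lines.append("CMD " + json.dumps(start_command))
    return "\n".join(lines) + "\n"


dockerfile_text = render_dockerfile(
    "debian:stable", ["python3"], ["python3", "main.py"]
)
print(dockerfile_text)
```

Real Dockerfiles are normally written by hand. The helper is only there so the example can be run and inspected.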
This instance of an operating system, usually a “server,” is great because you can throw the server away at any time and rebuild it with practically zero effort. It will be just like new. There are many reasons to want to do that; who doesn’t love a fresh install with the bare necessities?
On the surface (and the rabbit hole is deep!), Docker enables you to create an easily repeated formula for building a server so that you don’t get emotionally attached to a server.
Please don’t call yourself stupid. The common internet slang for that is ELI5 or “explain [it] like I’m 5 [years old]”.
I’ll also try to explain it:
Docker is a way to run a program on your machine, but in a way that the developer of the program can control.
It’s called containerization and the developer can make a package (or container) with an operating system and all the software they need and ship that directly to you.
You then need the software docker (or podman, etc.) to run this container.
Another advantage of containerization is that all changes stay inside the container except for directories you explicitly want to add to the container (called volumes).
This way the software can’t destroy your system and you can’t accidentally destroy the software inside the container.
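Here is a toy Python model of that behaviour. It is a sketch of the idea only, not Docker’s real storage code: the Container class, the paths and the file contents are all hypothetical.

```python
class Container:
    """Toy model: the container's own filesystem starts as a fresh copy of
    the image, while paths under a mounted volume live outside of it."""

    def __init__(self, image_files, volumes):
        self.files = dict(image_files)  # private, throwaway copy
        self.volumes = volumes          # mount path -> dict kept by the "host"

    def _volume_for(self, path):
        # Find the volume (if any) whose mount point contains this path.
        for mount, store in self.volumes.items():
            if path.startswith(mount.rstrip("/") + "/"):
                return store
        return None

    def write(self, path, data):
        store = self._volume_for(path)
        if store is not None:
            store[path] = data       # survives the container
        else:
            self.files[path] = data  # lost when the container is removed

    def read(self, path):
        store = self._volume_for(path)
        if store is not None:
            return store.get(path)
        return self.files.get(path)


image = {"/app/server": "binary"}
photos = {}  # stands in for a host directory mounted as a volume

first = Container(image, {"/data": photos})
first.write("/data/cat.jpg", "meow")
first.write("/tmp/cache", "scratch")

# Throw the container away and start a new one from the same image.
second = Container(image, {"/data": photos})
print(second.read("/data/cat.jpg"))
print(second.read("/tmp/cache"))
```

The file written under /data is still there in the second container, while the one written to /tmp is gone and the image itself never changed.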
I know it’s ELI5, but this is a common misconception and will lead you astray. They do not have the same level of isolation, and they have very different purposes.
For example, containers are disposable cattle. You don’t backup containers. You backup volumes and configuration, but not containers.
Containers share the kernel with the host, so your container needs to be compatible with the host (though most dependencies are packaged with images).
For self hosting maybe the difference doesn’t matter much, but there is a difference.
A million times this. A major difference between the way most vms are run and most containers are run is:
VMs write to their own internal disk, containers should be immutable and not be able to write to their internal filesystem
You can have 100 identical containers running, and if you are using your filesystem correctly, only one copy of that container image is on your hard drive. You can have two nearly identical containers running, and then only a small amount of the second container image (another layer) is wasting disk space.
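A rough Python sketch of that layer arithmetic, with made-up layer names and sizes in MB. It ignores the small writable layer each running container gets, so treat the numbers as illustrative only.

```python
# Layer names and sizes (in MB) are invented for illustration.
LAYER_SIZES_MB = {"base-os": 80, "runtime": 120, "app-a": 15, "app-b": 20}

# Two nearly identical images: they differ only in their top layer.
IMAGES = {
    "app-a": ["base-os", "runtime", "app-a"],
    "app-b": ["base-os", "runtime", "app-b"],
}


def shared_usage_mb(image_names):
    """Disk used when each distinct layer is stored once, no matter how
    many images or containers refer to it."""
    layers = set()
    for name in image_names:
        layers.update(IMAGES[name])
    return sum(LAYER_SIZES_MB[layer] for layer in layers)


def naive_usage_mb(image_names):
    """Disk used if every entry kept a full private copy of its layers."""
    return sum(
        LAYER_SIZES_MB[layer] for name in image_names for layer in IMAGES[name]
    )


hundred_copies = ["app-a"] * 100
print(shared_usage_mb(hundred_copies), naive_usage_mb(hundred_copies))
print(shared_usage_mb(["app-a", "app-b"]), naive_usage_mb(["app-a", "app-b"]))
```

With these numbers, 100 containers from the same image cost 215 MB instead of 21500 MB, and adding the second image only costs its extra 20 MB layer.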
Similarly containers and VMs use memory and cpu allocations differently and they run with extremely different security and networking scopes, but that requires even more explanation and is less relevant to self hosting unless you are trying to learn this to eventually get a job in it.
Thank you for the thorough response. After looking carefully at what you wrote I didn’t really see a difference between the term self-hosting and home network.
You said you have software that automatically downloads media. The way I see this, using movies for instance: if I own the movies and have them on my machine, then I can stream them over my network and have full control. Whereas if I “own” them on Amazon and stream them from there, they can track the viewing experience, push ads, or even remove the content completely. I understand that… But if I want a NEW movie, I’m back to Amazon to get it in the first place (or Netflix, or Walmart, etc. I get it). In fact, personally I’ve started actually buying discs of the movies/music I like most so that it can’t really be taken away and I can enjoy it even without an Internet connection. Am I missing something? Unless of course the media you are downloading is pirated.
I know I’m asking what seems to be a huge question but I’m really only asking for a broad description, sort of an ELI5 thing.