Since the term is a bit vague and surely nerdy, what do we mean by that? According to Wikipedia:
Self-hosting in the context of website management and online publishing is used to describe the practice of running and maintaining a website using a private web server.
This is certainly true if you think of web services mostly as websites, but a quick look at awesome-selfhosted opens up an ocean of web services and apps which can be run on your own server. This is not new whatsoever; before the times of SaaS it used to be the norm, mostly for organizations. The main difference is that nowadays everything has moved to the internet: not just company code hosting, some quirky project management tool or a forum, but basically everything we interact with via laptop or phone. This is nice, and the SaaS way of delivering those services made all this possible. But it also meant that the data itself, as well as the data processing, moved to a third party. Now everyone can make up their own mind and decide whether this is good or bad, but to take a step back, why do we even have to think about it?
Self-hosting is hard!
For all practical purposes, it really is. The whole concept of running software services on one's own server or infrastructure is often unknown, or perceived as outdated and possibly even dangerous (security patches, up-time, data loss, ...). The internet is at the centre of our daily life, yet very few people and organizations even dare to think about doing something the internet was built for. Public conversation is slowly picking up on data sovereignty, but tools are still lacking everywhere, and the SaaS deployment model is not helping much. And we get it, there is a huge gap between using a web service and having the technical knowledge required to also deploy and, more importantly, maintain it. Even running one such service (think of a kanban board provided as a web service) is hard, let alone several.
The sovereignty of the tools
The interesting parts for me were always in areas where it seemed like a technical issue prevents people from gaining control of, or the aforementioned sovereignty over, the tools and data everyone relies on. Technical problems can mostly be solved in a technical way, so why is that gap still there in 2021? Apple solved the deployment and updating of apps on mobile phones long ago. Just imagine having to head over to your local IT person or smartphone vendor to install an app. Of course, that would render the whole thing a lot less useful and far more inhibiting. Now if, for a moment, we look purely at the issue at hand, regardless of whether self-hosting makes sense or not, why isn't this solved yet?
Turns out there are a lot of details which need to align and be gotten right to make it work:
- Infrastructure needs to be performant, stable and easy to replace
- Relevant software has to exist
- A domain is required
- Security aspects have to be dealt with
- Backups and restore. Disasters strike eventually, but they shouldn't be fatal.
- Software needs to be installed and updated on a regular basis
- A user interface to deal with all that
Now, that list can possibly be extended quite a bit. But let's look at those from a 2021 perspective.
Infrastructure is the basis for the whole endeavour. We have essentially entered a race to the bottom when it comes to data centre infrastructure. VPS providers are everywhere. There are geographically local ones and servers for every use case and price category, and storage can be attached in a few clicks. Components can be cloned, rebuilt and destroyed magically in an instant. DigitalOcean paved the way from complex AWS environments to great ease of use. No more reading up again and again on what a security group or IAM rule is, just hand me that resource already. Those infrastructure bits are also amazingly stable and reliable. So this issue is basically solved, and it will only get cheaper from here on.
Relevant software is an obvious one: the projects and software behind the web services have to exist and have to be maintained. Only 10 years back, this was a different landscape. Next to WordPress and other PHP-based CMSs, there wasn't much tangible coming up. However, with the drastic improvement of technology stacks on the backend (Node.js, Django, Rails, Go, even PHP is now great to work with) also came a constant flow of great new apps. The aforementioned awesome-selfhosted list is a good example of that. It is easier than ever to create web services, and developers are starting to realize that potential.
A domain is mostly a matter of buying one, but also a matter of understanding what a domain is and how it works. There is some potential to make getting a domain and managing its DNS records more user-friendly and obvious, but that aspect is unfortunately still a bit in the techy or geeky space. The good news is that some registrars are already abstracting away the details, often bound to website (read: WordPress) offerings. This is certainly doable by now without actually having to deep-dive into the DNS world, since many registrars offer APIs for products to integrate with.
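To make the domain part a bit more concrete, here is a minimal sketch of the kind of check a product could run after creating a DNS record through a registrar API: does the hostname already point at my server? The function names, the hostname and the IP address are made up for illustration, and the sketch only looks at IPv4 through the system resolver.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return {info[4][0] for info in infos}


def points_to_server(
    hostname: str,
    server_ip: str,
    resolver: Callable[[str], Iterable[str]] = resolve_ipv4,
) -> bool:
    """True if hostname currently resolves to server_ip."""
    try:
        return server_ip in set(resolver(hostname))
    except socket.gaierror:
        # The name does not resolve (yet), e.g. the record was never created
        # or has not propagated to this resolver.
        return False
```

A call like `points_to_server("board.example.org", "203.0.113.10")` could then drive a simple "your domain is ready" indicator. The resolver is passed in as a parameter so the check can be exercised without touching the network.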
On the security side, strong defaults, properly set-up packages and a solid firewall configuration are all solvable problems. The devil is in the details, but those are for the most part best left to experts, who also follow vulnerability updates. The truth here is that distributions often already provide timely patches and have built automatic update mechanisms for lower-level components. The default settings often still have to be tweaked, though. This remains an area where everyone managing a server has to read up and properly re-configure services and apps, even though basically everyone ends up applying the very same changes.
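Since everyone ends up applying the same changes, those changes can be written down once and checked automatically. The following is a rough sketch of that idea for an sshd-style config. The option names are real `sshd_config` keywords, but the baseline itself is a made-up selection for illustration and not a vetted policy, and the parser is simplified (it treats keywords as case-sensitive and ignores `Match` blocks).

```python
from typing import Dict, List

# Hypothetical shared baseline: real sshd_config keywords, illustrative values.
SSHD_BASELINE: Dict[str, str] = {
    "PermitRootLogin": "no",
    "PasswordAuthentication": "no",
    "X11Forwarding": "no",
}


def parse_config(text: str) -> Dict[str, str]:
    """Parse 'Keyword value' lines, skipping blank lines and comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        # sshd uses the first value it finds for a keyword, so keep that one.
        settings.setdefault(parts[0], value)
    return settings


def audit(current: Dict[str, str], baseline: Dict[str, str]) -> List[str]:
    """List every baseline setting that is missing or set differently."""
    findings = []
    for key, wanted in baseline.items():
        actual = current.get(key)
        if actual != wanted:
            findings.append(f"{key}: expected {wanted!r}, found {actual!r}")
    return findings
```

Running `audit(parse_config(text), SSHD_BASELINE)` on the contents of a config file returns an empty list once the server matches the baseline. A setting that is not present in the file is reported as well, because in that case the daemon's built-in default applies and may differ from the baseline.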
Backups are the backbone of every deployment. Disaster strikes eventually. It may be a simple disk failure, an accidental deletion or some other unforeseen event. A backup which exists, is valid and is correct is necessary for recovery, and the recovery procedure has to be at least documented and tested as well. Oftentimes though, backups are an afterthought, since they don't affect the immediate service while everything is in good condition. Nevertheless, once unrecoverable data loss is experienced, the realization is stark. The easiest way currently is automated server snapshots, often supported by the server providers. They are usually a good trade-off, but fall short if a single app on the server needs to be restored. This then means either running one app or service per server, or implementing a more fine-grained backup and restore solution. The first option is less cost-effective, whereas the second can get really complex and time-consuming.
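As a rough illustration of the more fine-grained option, the sketch below archives the data directory of a single app and then checks that the backup "exists, is valid and correct" by comparing a checksum and reading every file back out of the archive. It assumes the whole state of the app lives in one directory, which is rarely true in practice (databases usually need a proper dump first), and all names are made up.

```python
import hashlib
import tarfile
import zlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large archives fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_app(data_dir: Path, archive: Path) -> str:
    """Archive one app's data directory and return the archive checksum."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(data_dir, arcname=data_dir.name)
    return sha256_of(archive)


def verify_backup(archive: Path, expected_checksum: str) -> bool:
    """Check the stored checksum, then read every file in the archive."""
    if sha256_of(archive) != expected_checksum:
        return False
    file_count = 0
    try:
        with tarfile.open(archive, "r:gz") as tar:
            for member in tar:
                if member.isfile():
                    # Reading the content forces decompression, which
                    # surfaces truncated or corrupted archives.
                    extracted = tar.extractfile(member)
                    while extracted.read(1 << 20):
                        pass
                    file_count += 1
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False
    return file_count > 0
```

The checksum returned by `backup_app` would be stored next to the archive, and `verify_backup` run on a schedule. This only proves the archive is readable; an actual trial restore into a scratch environment is still the real test.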
To keep this separate from the security updates: regular software updates are necessary not so much to ensure high up-time, but to benefit from added features and to keep the setup future-proof. Sometimes external APIs change, requiring a software component to adjust in order to ensure continuity. Other times it is simply a matter of providing a better service. Either way, SaaS has taught us how iterative updates add real benefit for the user. Self-hosting should not take this away; after all, upstream apps usually release quite often, and often with great additions in functionality. The issue is that updates can be disruptive and don't follow a common process. App-specific data migration scripts have to be called, underlying frameworks have to be updated alongside, the database version may or may not change, ... Due to that lack of a streamlined process and, depending on your backup strategy, the requirement to quickly roll back, updating apps and services is often neglected in the self-hosting space.
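One way to picture a more streamlined process is a small runner which takes a snapshot first, runs the app-specific steps in order, and rolls back as soon as one of them fails. This is only a sketch of the idea: the snapshot, the steps and the rollback are plain callables supplied by the caller, and what they actually do (stop containers, call migration scripts, restore a backup) is left open.

```python
from typing import Callable, List, Sequence, Tuple

# A named update step, e.g. ("migrate data", run_migrations).
Step = Tuple[str, Callable[[], None]]


def run_update(
    snapshot: Callable[[], None],
    steps: Sequence[Step],
    rollback: Callable[[], None],
) -> List[str]:
    """Run the steps in order; on the first failure, roll back and stop."""
    log: List[str] = []
    snapshot()
    log.append("snapshot taken")
    for name, action in steps:
        try:
            action()
        except Exception as error:
            log.append(f"{name} failed: {error}")
            rollback()
            log.append("rolled back to snapshot")
            return log
        log.append(f"{name} done")
    return log
```

The returned log is something a user interface could show as is. A real implementation would also need to handle a failing rollback, which this sketch ignores.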
As for the user interface, SSH+vim is really only good and accessible for a select few. Changing config files and using the terminal often leads to typos, especially during high-stress times such as when a service is down. Those tools undeniably give great power to the sysadmin, but some safety guards are important as well. This can be achieved by tying many of the things mentioned before into a solid web interface which guides the user and interleaves additional information and documentation links for advanced use cases. It also protects the user from accidental actions and, well, it simply often takes less time to achieve tasks through a well-designed interface.
Self-hosting is easy!