How do you keep up to date with changes in technology? How do you get your startup off the ground from a technical standpoint? What can you do to build out your tools and processes in a sustainable, low-risk way? Finding space for continual learning at work can be tough. If you, like me, have a passion for technology but have transitioned away from a technical role and miss fiddling with technology on a daily basis, you might consider creating a lab at home to keep your skills sharp and stay up to date on industry trends and best practice.
Why have a lab
Having a lab represents freedom: freedom to learn about new ideas in a safe space, freedom to use many different tools to make your life at home better, easier or more automated, and freedom to build something new that others could eventually be involved in.
A lab is anything you want it to be. In my case it’s an Intel NUC and a Raspberry Pi running various versions of Linux as a base, with many Docker containers on top. For you, it could be network storage equipment and backup power supplies, a big server running virtualisation tools or even a whole bunch of Internet of Things devices. Part of the joy of having a lab is that it can reflect your current feelings and priorities, and be flexible enough to change along with them.
Working at the right layer of complexity
For me, the most important things in a lab are that I’m using the right tool for the job and that I’m working at the right level of abstraction. In my case, I’m running a lab at home to automate parts of my life, to build my own software product (maybe one day I’ll turn it into a startup!) and to make the internet a better place for my family. To that end, I’ve got many tools running at home that fit under those categories (and some that don’t). Let’s first talk a little about what I’m not doing.
- I’m not currently focused on how networking works
- I’m not currently looking at configuration management and how that works (I’ve got some experience with Puppet though)
- I’m not currently interested in virtualisation, hardware security or hardware in general
Another way of saying this is that right now I care about applications and how they talk to each other. I don’t care about the underlying infrastructure (unless it has a material effect on the applications themselves).
So my chosen layer of complexity is the application layer. That means that, where possible, I should find a way of removing the complexity below that layer. This is not to say I don’t care about what hardware I’m running; it’s to say that I don’t care beyond a certain point. My low maintenance lab consists of three key devices right now.
It’s worth reiterating that my focus here is low maintenance. That means simple hardware and low running costs. The Pi costs about $8 in power per year and the NUC a little more, and all of this equipment runs cool and quiet. It’s also not overkill. For me it’s more important to have the right kit at the right time than it is to have plenty of room to grow. I could have started with a big network attached storage device and an enterprise grade server, but that’s a bunch of cash I don’t need to spend right now on a hobby that might not be around in a couple of years’ time. It’s going to be slightly more expensive to scale because of this. For example, the RAM on the NUC tops out at 16GB, and at the moment I have 8GB with one slot left. If I want 32GB of RAM I need either a new server box or another NUC. I’ll probably get another NUC. I think the tradeoff is worth it.
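If you want to sanity check a running cost like that for your own kit, the arithmetic is short. Here is a minimal sketch; the 3 W average draw and the $0.30 per kWh price are illustrative assumptions, not measurements, so swap in your own numbers.

```python
HOURS_PER_YEAR = 24 * 365


def annual_power_cost(average_watts: float, price_per_kwh: float) -> float:
    """Return the yearly electricity cost of a device that runs 24/7."""
    kwh_per_year = average_watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * price_per_kwh


if __name__ == "__main__":
    # Assumed values: roughly 3 W average draw at $0.30 per kWh.
    cost = annual_power_cost(average_watts=3.0, price_per_kwh=0.30)
    print(f"${cost:.2f} per year")  # prints "$7.88 per year"
```

With those assumed inputs the result lands close to the $8 figure, which is why small always-on boards are so cheap to leave running.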
Raspberry Pi - Network Services
The Raspberry Pi 3B was actually the first thing I bought. At the time I wanted to make a retro gaming box for my kids, but they’re still a little young for that. Now I treat it as a tool for network services. It’s my DNS server (so I can get gucci names for my services like $SERVICE.$MY_CUSTOM_DOMAIN.co.nz) as well as a network-wide ad blocker (thank you Pi-hole!). I also wanted to run Docker containers on it, but the ARM chip in it makes that a hassle. In the future I’d like to run a VPN from it so that I can access all of my home tooling while I’m away from home, but I never seem to get time to sit down and do that.
Intel NUC - General Compute and Storage
The NUC is my most recent addition; I picked it up in a sale last year. I’m pleased with how it’s performing and will probably scale out by adding NUCs until power management and things like that get annoying. It has a 7th generation i5 processor, which is to say it’s not a beastly multitasking machine. It’s not and never will be good for things like render farms or bitcoin mining. It is, however, good for a lot of moderate workloads. I’m currently running 30+ Docker containers (more on that soon), and CPU usage hovers at around 3% when idle and bursts up to about 30% when it’s working hard. With how powerful entry level CPUs are these days, processors generally aren’t your bottleneck. RAM is a very different picture. There is 8GB of RAM available right now, plus another 8GB of swap on the M.2 SSD inside the NUC. RAM usage sits at about 60% when idle and maxes out fairly regularly. RAM is my next purchase.
Bare Metal, Virtualisation and Containers
For context, there is no right or wrong choice for you here. It really depends on what you’re optimising for. In my case, I want the regular maintenance of the systems and applications I use to be minimal, and recovery from failure should be easy. That reflects the fact that I am highly likely to break the entire network, and that this should be encouraged: it’s where learning happens. I don’t care so much about speed and efficient use of resources since the cost of scaling isn’t prohibitive, and I don’t really care about isolating things as a primary effect of the choice I make.
Businesses tend to have different needs, so if you’re running a lab to skill up for work, your needs might differ.
For these reasons I chose containerisation as my go-to application paradigm. It gives me the ability to store my application config in VCS, making recovery from failure almost trivial. With good orchestration tools I can manage resources well enough, and it also means that I can isolate applications from my underlying hardware. Docker is the de facto standard for containerisation, and I’m already familiar with it through the work I did on continuous delivery for Healthlink and Spark.
That isolation from the host is important to me, as it means that I can treat my machines like cattle, not like pets. With Docker I can take advantage of the underlying OS (in this case, Ubuntu Server 18.04) in a way that keeps the OS clean and tidy. All changes sit inside containers and not on the host OS. I can swap Ubuntu for CentOS as many times as I want and my containers don’t care. I can upgrade the version of an application inside a container 300 times and the underlying OS doesn’t care. Things stay tidy. This also means that the process for recovering from a critical failure is small. Application configuration is in version control, so if the NUC grenades over the weekend, my recovery is to re-run that config on a new node. The hardest part of that is buying a new node because I hate spending money. The use of containers also mitigates some of the scaling pain. Rather than manually moving stuff from one machine to a new machine, or having to configure a bunch of stuff on a new machine, it’s easy to scale out the software side of the lab, all thanks to a tool called Rancher.
Rancher is the tool I use for container management and orchestration. I consider it my one stop shop for all things lab management, and it is my main view (I do have other views) into the state of my lab. As much as I love Rancher, this isn’t a sales pitch; it’s mainly about how I use it to solve the problems I want to solve.
The feature I use most with Rancher is orchestration. I specify the config of an image I want to run (via a compose file or via the GUI) and Rancher worries about what host it goes on, what other containers it’s grouped with and so on. When the container is running I can inspect it, open a shell in it or modify its config easily. Rancher also has an awesome concept, the catalogue, which contains many images and configs maintained by the community that you can run with one click in the style of Sandstorm, Bitnami and Cloudron. The difference is that rather than just deploying the application, you can deploy applications ready to scale, with pub/sub queues, Redis and so on already integrated into the application state.
Rancher comes with a load balancer based on HAProxy that works well out of the box. It integrates with Rancher functionality in ways that make it super easy to set up reverse proxies with SSL termination. I simply add my SSL certs to Rancher’s cert store, then select them from a drop-down menu within the load balancer. I can apply custom HAProxy config to automatically redirect http requests to https, and can configure path or link based routing.
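The redirect and path based routing boil down to a small decision per request, sketched here in Python. This is not HAProxy or Rancher config: the route table, the path prefixes and the `route` function are hypothetical and only illustrate the logic a reverse proxy applies.

```python
# Hypothetical route table: path prefix -> backend service name.
ROUTES = {
    "/docs": "mayan",
    "/stats": "netdata",
}


def route(scheme: str, host: str, path: str) -> str:
    """Decide what a reverse proxy would do with one incoming request."""
    # Plain http requests are redirected to https before any routing.
    if scheme != "https":
        return f"redirect https://{host}{path}"
    # Check longer prefixes first so the most specific route wins.
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return f"forward {ROUTES[prefix]}"
    # Nothing matched, so there is no backend to send this to.
    return "reject 404"
```

For example, `route("http", "lab.example", "/docs")` returns a redirect to the https URL, while `route("https", "lab.example", "/docs/bills")` is forwarded to the `mayan` backend.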
The final part of the Rancher puzzle for me is the config files I mentioned earlier. Rancher maintains vanilla Docker compatible docker-compose.yml files for every stack of applications it manages, and updates that compose file when you make changes to the configuration. It also provides a rancher-compose.yml that is very similar to Docker Compose, but provides Rancher specific config (e.g. start with 3 containers or 10 containers, and so on). This for me is crucial. All of that config is held in VCS, so if the machine grenades, my state is stored safely elsewhere.
That’s just the container config though. There is also config specific to the applications, which tends to be stored within the file system. In this case, Rancher makes it easy for me to volume out all of the config to the host. I don’t back that up currently but am looking at cloud backups for that as well. I’ll probably have a daily backup to some hardware I’m running locally, and maybe a weekly or monthly backup to a cloud backup service provider (the state of my applications changes very infrequently, so monthly or fortnightly is probably the best compromise there for cost vs currency).
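A two tier backup plan like that is easy to express as a schedule check. The sketch below is only one possible reading of it: the `backups_due` function and the choice of the 1st of the month for the cloud copy are assumptions for illustration, and the actual copying is left out.

```python
from datetime import date
from typing import List


def backups_due(day: date) -> List[str]:
    """Return which backup targets should run on the given day."""
    # The local hardware gets a copy every day.
    targets = ["local"]
    # Assumption: the cloud copy runs once a month, on the 1st.
    if day.day == 1:
        targets.append("cloud")
    return targets
```

A daily cron job could call this and only trigger the paid cloud upload when `"cloud"` is in the result, which keeps the cost side of the cost vs currency tradeoff predictable.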
What I’m Running
Right now I’ve got several stacks running within Rancher. I’ve grouped applications around themes for those stacks so that it’s easier to see what capability I have running and what I’m missing. The stacks are called:
- Home Admin
- Network Admin
- Media Automation
- Media Streaming
- Product Development
I’ll run through each of them at a high level:
Home admin for me is mainly about automating the things that relate to the physical space I live in, aka my house. I’ve got Mayan EDMS running as a paper removal machine. All of my bills, legal documents and other official docs get scanned and added to shelves within Mayan, so that I can search on the text inside the docs instead of just for PDF names and so on. I’ve docker-linked Mayan to its own Postgres container, and this is a pattern I repeat and abuse for all of the tools I run. As an aside, the reason that I do a container and database instance per tool is so that I have more flexibility to upgrade or blow away databases. It also means that the databases are clean, with only tables and data from one application. I chose Postgres because it’s what I’m familiar with. It can be a pain but it’s a pain I know!
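The one database per tool pattern can be sketched as a small generator for a compose-style service map. This is a minimal illustration, not the config Rancher actually writes: `stack_for`, the image name and the password are placeholders, and the app-side settings that point each tool at its database are omitted because they vary per tool. The `POSTGRES_*` variables are the ones the official postgres image reads.

```python
import json
from typing import Any, Dict


def stack_for(tool: str, image: str, db_password: str) -> Dict[str, Any]:
    """Build a compose-style service map: one app plus its own Postgres."""
    db_name = f"{tool}-db"
    return {
        "version": "2",
        "services": {
            tool: {
                "image": image,
                # Link the app to its private database under the alias "db".
                "links": [f"{db_name}:db"],
            },
            db_name: {
                "image": "postgres",
                "environment": {
                    "POSTGRES_DB": tool,
                    "POSTGRES_USER": tool,
                    "POSTGRES_PASSWORD": db_password,
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder image name and password, for illustration only.
    stack = stack_for("mayan", "example/mayan-edms", "change-me")
    print(json.dumps(stack, indent=2))
```

Because every tool gets its own database service, dropping or upgrading one database never touches another application’s data.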
I’m also running a janitor tool that periodically scans for images that have no active containers, and dangling data volumes. If it finds anything not in use, it deletes it. Since my config is all stored externally, I can be fairly ruthless in terms of space saving this way.
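The janitor’s job comes down to two set differences. Here is a minimal sketch that works on plain lists rather than calling Docker; the function names and sample data are hypothetical, and a real tool would pull these inventories from the Docker API before deleting anything.

```python
from typing import Dict, Iterable, List


def unused_images(images: Iterable[str], containers: Dict[str, str]) -> List[str]:
    """Return images that no container is based on.

    `containers` maps a container name to the image it runs.
    """
    in_use = set(containers.values())
    return sorted(set(images) - in_use)


def dangling_volumes(volumes: Iterable[str], mounts: Dict[str, List[str]]) -> List[str]:
    """Return volumes that no container mounts.

    `mounts` maps a container name to the volumes it mounts.
    """
    mounted = {vol for vols in mounts.values() for vol in vols}
    return sorted(set(volumes) - mounted)


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    images = ["postgres:10", "netdata:latest", "old-app:1.0"]
    containers = {"db": "postgres:10", "monitor": "netdata:latest"}
    print(unused_images(images, containers))  # prints ['old-app:1.0']
```

Anything returned by either function is a candidate for deletion, which is only safe because the config needed to recreate it lives elsewhere.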
Network admin is a bit of a small stack right now. It currently contains a netdata instance for host health monitoring - Rancher has the ability to manage hosts but the reporting from netdata is much more detailed. The stack also contains an instance of Rancher’s load balancer - this is where the reverse proxy and SSL termination functionality comes from.
Media Automation + Media Streaming
My media automation efforts are the most complex stack. I’m using SABnzbd, Sonarr, Radarr, Ombi (half-heartedly), Tautulli and Plex. My focus here is on making available all the Blu-rays, DVDs and other media that I’ve collected over the past 15 years or so: all the childhood movies whose discs got broken, the numerous copies of the Matrix trilogy that I somehow acquired and the stuff I generally grew up on and would like to share with my kids now that they’re getting old enough.
Product Development is the stack I work on most actively. It contains an instance of some software I’m developing in my spare time, a Postgres database, an SMTP mail server and interceptor (for testing email notifications that come from the software I’m building), and an instance of the Swagger Editor.
That’s a little bit of detail behind the thinking I went through when I decided I wanted to spend some time away from development so that I could focus on learning about product and wider business things. Your lab might have different needs and look different, and that diversity should be celebrated. There is no right or wrong lab!
I went with Docker because it fit my goals, which were about making the things I care less about easier. That frees me up to focus on the stuff that I do care about.
Rancher is cool.