Homelab & Nerding

Goodbye Azure Kubernetes, Hello Hetzner?

On to plan D!

After a few good months of experimentation since my last post, I’ve come to the conclusion that Azure is not the place to meet my goals.

Let’s recap the goals and how well Azure and my deployment skills were able to meet each:

  1. Migrate OhanaAlpine workloads – Azure: 3, My skills: 5

    I was able to get my static sites migrated – including fixing the build pipeline on one – plus SSL cert generation via Let’s Encrypt, and MariaDB and WordPress running as single-pod deployments. I learned so much doing this – which was one of the primary goals – but ultimately ran into some Azure limitations that were showstoppers. More on those below.
  2. Single node and ability to scale – Azure: 5, My skills: 4

    Single node was decent enough, and while I know Azure could scale out capably, I never got the single-node deployment working well enough to want to take it multi-node. Some Azure performance limitations stopped that.
  3. Less than $50/mo, and lower is better – Azure: 2, My skills: 3

    When I bailed, my bill was running close to $70/mo for 1x Standard_B2s worker node, a load balancer, Azure Disk and Azure File storage, and very minimal egress charges.
  4. No single point of failure / self-restoring pods – Azure: 4, My skills: 4

    It’s hard to get to this goal with a single node, so I worked to make the workloads self-healing in the event of a node rebuild. The GitLab Agent for Kubernetes – a good tool and a carryover from the Linode days – made deployments super-simple. If a node gets blown away, the agent pulls all the manifests from a GitLab repo and rebuilds the workloads. Works like a charm. The only catch is that it doesn’t put data back – that is, databases will be empty and WordPress will be stock out-of-the-box new. Using Azure Files as a read-write-many (RWX) filesystem was going to be key to this (see the sketch after this list), and it worked for manual reloads, but the performance wasn’t there to pursue the automation further, nor was it nearly good enough to use as the backing filesystem for WordPress’s wp-content directory.
  5. Flexible for other deployments – Azure: n/a, My skills: n/a

    I never got all of my core workloads going in a way that I was ready to call production capable so I never branched into other workloads.
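
Roughly, the Azure Files piece looked like the following – a trimmed-down sketch rather than my exact manifests, with a placeholder claim name (wp-content-pvc) and a generic WordPress image tag. The idea: an RWX claim on AKS’s built-in azurefile storage class, mounted as wp-content so a rebuilt node can reattach the same share.

  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: wp-content-pvc
  spec:
    accessModes:
      - ReadWriteMany            # Azure Files allows RWX; Azure Disk is RWO only
    storageClassName: azurefile  # built-in AKS storage class backed by Azure Files
    resources:
      requests:
        storage: 10Gi
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: wordpress
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: wordpress
    template:
      metadata:
        labels:
          app: wordpress
      spec:
        containers:
          - name: wordpress
            image: wordpress:latest
            volumeMounts:
              - name: wp-content
                mountPath: /var/www/html/wp-content   # the hot path that killed performance
        volumes:
          - name: wp-content
            persistentVolumeClaim:
              claimName: wp-content-pvc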

As alluded to above, WordPress with Azure Files as ANY part of the backing store is a no-go performance-wise. This was enough to put me off of Azure, as it’s near impossible to fix without seriously scaling up and spending $$$. With a Files share mapped into wp-content, I was getting 30-second page load times, and that’s with images failing to load. Mind you, that’s on Azure Files’ HDD-backed “Transaction Optimized” tier. It’s a known problem with Azure Files and large numbers of small files.

Plenty of other sites corroborate the Azure Files performance problem with large numbers of small files – which is exactly what a WordPress wp-content directory is.

The path forward is either to go with un-scalable Azure Disk, scale up to way-out-of-budget configurations, or move to a platform other than Azure.

And with that, I cancelled my Azure subscription and deleted all my data there. 

I put a question out to the collective wisdom of Reddit, and they came back with a few good options for providers that would meet my needs:

  • Hetzner – super-affordable and highly regarded, but no managed Kubernetes.
  • Digital Ocean – affordable with managed Kubernetes.
  • Civo
  • Vultr
  • Symbiosis.host
  • Linode

Right now, I’m leaning strongly towards Hetzner, even if that means spinning up my own Kubernetes cluster. Their pricing is such that I can do 3 control nodes, 3 worker nodes, a load balancer, and a mix of storage for $48/mo or so.
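
If I do end up rolling my own cluster, something like k3s in HA mode is the kind of thing I have in mind – this is just a sketch of the idea, not a decision, and the token, hostname, and node layout are all placeholders. The first control node starts the embedded-etcd cluster; the other two join it.

  # /etc/rancher/k3s/config.yaml on the first control node (placeholders throughout)
  cluster-init: true              # start a new embedded-etcd cluster
  token: "shared-cluster-secret"
  tls-san:
    - "k8s.example.com"           # the load balancer / DNS name in front of the servers

  # /etc/rancher/k3s/config.yaml on the second and third control nodes
  server: "https://k8s.example.com:6443"
  token: "shared-cluster-secret"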

Folks have pointed out that I could pay for static hosting and WordPress hosting for less cost and hassle than I’m going through with this, but I point out that the end product is only a small part of the goal. I’ve learned so much about Kubernetes and automation doing this so far, and I’m not done yet!

Homelab & Nerding

Goodbye Linode Kubernetes, Hello… Azure Kubernetes?

On to plan C!

After a fair amount of googling and noodling, I’ve come to the conclusion that Linode’s LKE Kubernetes service can’t do what I want it to, at least not in a way that doesn’t feel hacky or get expensive. My goals – which working on this migration has helped sharpen:

  1. Migrate my OhanaAlpine VPS docker workloads over to Kubernetes.
  2. Do so in a way that can run comfortably on a single node, or scale up to 3-5 for testing  / research / upgrades.
  3. Not be cost-prohibitive. << $50/mo, the lower the better.
  4. Not have a single point of failure, even in the single-node config. That means that if the single node got recycled, it’d be able to reconstitute itself – including data (DB, app data, files) from backup – as part of the rebuild if necessary.
  5. Be flexible for other deployments.

LKE hit all but #4, and I could not for the life of me figure out a way to do it that wasn’t kludgey. Here’s why:

  • Persistent storage doesn’t persist across node recycles. That is, if I put MariaDB tables out on a PV/PVC block storage volume, it doesn’t get re-attached if the node is recycled and built from scratch.
  • Linode doesn’t offer RWX access for PVs. That is, a block storage volume can only be attached to one node at a time.
  • Related, there’s no easy, obvious way to do shared storage across nodes / pods. I looked into Longhorn, which might do the trick, but it depends on at least one node in the cluster staying up. I know that should be the norm, but it violates #4.
  • I thought about S3 object storage, either as primary shared storage (I don’t think RWX is required for that) or as backup storage for Longhorn to bootstrap from (sketched below). It all felt overly complicated and rickety to set up. Yandex S3 was the lead option, and while it kinda got close, it wasn’t really a proven option. I may circle back to this one day.
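
For the record, the Longhorn-plus-S3 wiring I was considering would have looked roughly like this – just a sketch, with the bucket, region, endpoint, and keys all placeholders, and no claim that this is a proven setup:

  # Credentials secret for an S3-compatible backup target; Longhorn reads the
  # AWS_* keys from whatever secret its backup-target-credential-secret setting names.
  apiVersion: v1
  kind: Secret
  metadata:
    name: longhorn-backup-secret
    namespace: longhorn-system
  type: Opaque
  stringData:
    AWS_ACCESS_KEY_ID: "placeholder-key-id"
    AWS_SECRET_ACCESS_KEY: "placeholder-secret"
    AWS_ENDPOINTS: "https://storage.yandexcloud.net"   # S3-compatible endpoint

  # The target itself is a Longhorn setting (set via the UI or Helm values), e.g.:
  #   backup-target: s3://my-backup-bucket@ru-central1/
  #   backup-target-credential-secret: longhorn-backup-secret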

What I really wanted was a file storage service from Linode, sorta like EFS from AWS. If I could reliably and securely mount an NFS share in a pod or across pods, that would have solved most of my problems, or at least been a non-hacky way to achieve my goals. Why doesn’t Linode offer this?  Oh, I could have spun up my own, but that’s more cost and complexity. Not out of the question in the future but feels like too heavy of a lift for now.
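
Kubernetes can mount NFS natively, so the pod side of that would be trivial – something like this, where the server address and export path are made-up placeholders:

  apiVersion: v1
  kind: Pod
  metadata:
    name: wordpress
  spec:
    containers:
      - name: wordpress
        image: wordpress:latest
        volumeMounts:
          - name: wp-content
            mountPath: /var/www/html/wp-content
    volumes:
      - name: wp-content
        nfs:                          # built-in NFS volume source
          server: 10.0.0.10           # placeholder NFS server address
          path: /exports/wp-content   # placeholder export path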

So, what’s a hacker to do? Without changing my requirements (I’m looking at you, item #4…), looking at alternative Kubernetes hosting is the next step. Looking at Digital Ocean, AWS, GCP, and others, it seems like Azure (AKS) is the best way forward. I think it’s a super-capable platform, but I’m not totally crazy about it because it looks expensive for even a minimal cluster. It comes with a $200 credit to use in the first month and a bunch of free services for the first 12 months, and that should give me time to get things built and see what steady-state costs are going to be. I might yet fail on #3, but at least I’m learning, right?