Hello, Kubernetes!

As a learning exercise, and to get my personal sites off of the cobbled-together VPS where they’ve happily lived for a while, I took on migrating them all to a Kubernetes cluster. How hard could it be, right? Or rather, how many learning opportunities could there be in this endeavor? Let’s discuss a few.

K3S on a Linode Nanode will work, right? I figured I’d try it and see. On a train ride from NYC to NC, I built out K3S on two 1GB/1CPU VPSes, and it was… alright. I didn’t end up with enough useful capacity afterwards to actually deploy much, but it built. I tore that down and moved on to plan B – if I have to move up to a (slightly) more expensive VPS, why not have Linode run the Kubernetes control plane for me for the same price?

So far, Linode Kubernetes Engine (or LKE) has been solid. I configured it with the GitLab Agent for Kubernetes (aka agentk), and made that pull-based tool the core of my CI configuration. Once set up – and it’s straightforward to set up – I check a manifest YAML into the agent’s repository, and the agent pulls it down and applies it.

All was good until I kept failing to pull a container image from a private repository of mine. They say you learn by failing fast and often, and I did learn.

  1. A container image pulled from a private registry with no namespace descriptors in the manifest failed consistently with an error:

"Failed to pull image <registry URL>: rpc error: code = Unknown desc = Error response from daemon: Head <registry URL /w tag>: denied: access forbidden"

This failed no matter how I created the container registry secret, and regardless of where I configured its use.

  1. When I made that same project public in GitLab, after a few hours’ pause, it pulled and provisioned successfully. This made me think it might be an auth problem and not a connectivity problem between my LKE nodes and GitLab. Win. Deleted this and made the project private again.
  2. Created a new namespace. Added the GitLab project token as a secret in the new namespace and tried the same private project in that namespace, using namespace directives in the manifest YAML, and it worked. I have connection and authorization in an explicit namespace. Win.
  3. I created a newly named secret in default namespace, took the same project def and changed the namespace descriptors in the manifest YAML from newNamespace to default (and the imagePullSecrets), and it worked. Win.

So, I got it working in a sane and reproducible way, but I’m still not sure why it failed in the first place. It’s like agentk wasn’t looking in default for the imagePullSecrets until default was explicitly declared in the manifest. It’s not obvious to me where it was looking though.
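For reference, the two pieces that ended up working together can be sketched roughly as below. This is a minimal Python sketch, not my actual manifests: the names (gitlab-pull, my-site), the registry path, and the token are placeholders, and the output is JSON, which kubectl accepts the same way it accepts YAML. The one rule I am sure of is that a pod can only use an imagePullSecrets entry that lives in the pod’s own namespace, so the sketch sets the namespace explicitly on both objects. It does not explain why the implicit default failed for me.

```python
import base64
import json


def registry_secret(name, namespace, registry, username, token):
    """Build a kubernetes.io/dockerconfigjson Secret as a plain dict."""
    auth = base64.b64encode(f"{username}:{token}".encode()).decode()
    docker_config = {
        "auths": {
            registry: {"username": username, "password": token, "auth": auth}
        }
    }
    encoded = base64.b64encode(json.dumps(docker_config).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }


def deployment(name, namespace, image, pull_secret):
    """Build a one-replica Deployment that names its namespace and pull secret."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{"name": name, "image": image}],
                    # Must refer to a Secret in the same namespace as the pod.
                    "imagePullSecrets": [{"name": pull_secret}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    namespace = "default"
    secret = registry_secret(
        "gitlab-pull", namespace, "registry.gitlab.com", "deploy-user", "TOKEN"
    )
    deploy = deployment(
        "my-site", namespace, "registry.gitlab.com/me/my-site:latest", "gitlab-pull"
    )
    for manifest in (secret, deploy):
        print(json.dumps(manifest, indent=2))
```

In practice the token should never be checked into the agent’s repository in the clear; the sketch only shows how the namespace and the secret name have to line up.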

Manifests are now triggering the successful pull and deployment of both public and private packages, and I’ve learned an amazing amount about deployments and secrets and namespaces and private registry auth and Kubernetes details.

Next up, Ingress with Nginx!

(Parts of this were re-used from a bug comment I made on this thread. In retrospect, I don’t think the problem I was having was exactly the one in the bug description, but it’s close, and this thread helped me figure out what was going on with my private container auth.)


How I Work – An Update on Capture

In my last How I Work post, I was much enamored with Bullet Journaling – or BuJo for short – for note capture. As a recap, I had never used a note-taking system that worked the way my brain did, and as such, nothing ever stuck until Bullet Journaling. It’s solid, simple, flexible capture that just works. I never got the hang of organizing and tracking notes for retrieval after the fact, so my system ended up being easy-in, with serial page-flipping search to get anything out. Still, it was a major improvement over living in my head.

GoodNotes – Redefining Capture

My wife Susie turned me on to an iPad journaling app that she’d found through her Life Coach School and NoBS coaching communities called GoodNotes. GoodNotes is much like a paper journal in that you’re capturing notes and sketches, but rather than ink on paper, the app captures into digital ink. One of the fantastic features was being able to use PDF templates for pages. I had an effective Daily template that had some standard gratitude and goals questions that I found helpful to start the day with – many thanks to Robert Terekedis for his article on effective Digital Bujo (https://blog.euc-rt.me/2020-03-06-organizing-adapted-digital-bullet-journal-ipad-goodnotes/#gsc.tab=0). Robert’s work was a helpful foundation to get me started. Moving to GoodNotes on the iPad with Apple Pencil was a solid upgrade.

Importing PDFs to use as immutable templates is GoodNotes’ superpower.

Feature Rundown

  • Capture of text was as good as the paper Bujo.
  • Having easy access to more (digital) ink colors and line widths was a plus.
  • Erasing easily was a huge plus.
  • Copy and paste of inked text was fantastic for rearranging a day’s notes and for daily and monthly migrations. With a paper Bujo, rewriting everything at the first of a new month as part of migration always struck me as more hassle than benefit. With GoodNotes, it’s lasso/copy/paste and you’re done. That made migrations much easier.
  • Pasting pictures was very helpful. Often, after I’d captured notes from a meeting, I’d go back on the Mac GoodNotes client and paste in useful images and screencaps from the meeting. It was super useful to have notes and images together – not easily doable with the paper BuJo.
  • One of the big expected benefits I was looking forward to with GoodNotes was Search. Since GN converts inked writing to text behind the scenes, it’s all searchable. Unfortunately, the search is disappointingly basic. Sure, you could find the word "lasagne", and it’d show you all occurrences of that word, but you couldn’t search for "lasagne AND vegetable" to find those words on the same page, or "lasagne AND date in JAN2022" to find pages within date ranges. The final disappointment with search was a lack of hashtag searching. I tried tagging my notes with #ALMOND, and it wouldn’t search for the hash symbol – it just ignored it, making hashtagging notes not so useful.
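To make that search wish list concrete, here is a minimal Python sketch of the behavior I was hoping for. It has nothing to do with GoodNotes’ internals; the page structure (a date plus the recognized text) and the function names are assumptions made up for illustration. A page matches only if every term appears on it, optionally within a date range, and the hash symbol is kept as part of the token so #ALMOND is searchable as a tag.

```python
import re
from datetime import date

# Keep a leading '#' so hashtags stay distinct from ordinary words.
TOKEN = re.compile(r"#?\w+")


def tokens(text):
    """Lowercased set of words and hashtags found in the text."""
    return {t.lower() for t in TOKEN.findall(text)}


def search(pages, all_of, start=None, end=None):
    """Return pages containing every term in all_of, within [start, end]."""
    wanted = {term.lower() for term in all_of}
    hits = []
    for page in pages:
        if start is not None and page["date"] < start:
            continue
        if end is not None and page["date"] > end:
            continue
        if wanted <= tokens(page["text"]):
            hits.append(page)
    return hits


# Hypothetical sample notes.
pages = [
    {"date": date(2022, 1, 9), "text": "Vegetable lasagne for dinner #ALMOND"},
    {"date": date(2022, 2, 3), "text": "Lasagne again, meat this time"},
    {"date": date(2022, 1, 20), "text": "Almond milk on the grocery list"},
]

both = search(pages, ["lasagne", "vegetable"])
in_january = search(pages, ["lasagne"], date(2022, 1, 1), date(2022, 1, 31))
tagged = search(pages, ["#almond"])
```

Because the tag keeps its hash symbol, a search for #almond finds only the tagged page and not the grocery list that merely mentions almonds.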

The Verdict

The final verdict: good for capture, but not for long-term storage and retrieval. In my next How Do I Work post, I’ll cover what came next, Obsidian, along with the GoodNotes feature that made the transition relatively painless.


How Do I Work – v4.0

I’ve been taking a hard look at how I do work. By that, I mean how I choose what work to do and how I arrange my day to make it happen. There have been a few iterations over the last year, and I keep getting better and better at it.

All In My Head – v1.0

In the beginning, there was my not-great memory and my “feels”. My todos lived in my head (or my inbox) and my calendar had a scatter of hard scheduled meetings and open time. Project time had to happen in my scattered free time, and was largely done based on what I felt like doing in the moment, or what urgent thing had been dropped on me. That was good enough in some ways, but I wasn’t nearly as effective or productive as I could have been. Some days, it was hard to name a solid item that I had done that day, but I certainly stayed busy doing emails and the like.

I dabbled in GTD (Getting Things Done), in both paper and digital versions, and that never stuck. Capturing and organizing open items and tracking contexts never worked for me and took more effort than it was worth.

Effective Capture – v2.0

Then I discovered Bullet Journal. (Site / Book) It’s a light-process, flexibly structured system of capturing things – todo’s, notes of different kinds, events to be organized, and whatever else comes up. All it takes is a notebook and a pen. It clicked for me in really useful ways. My biggest win was finally capturing all the todo’s that used to (poorly) exist just in my head and getting them on paper. That alone has been life-changing.


Goodbye, CoreOS

For my now-retired at-home hosting, and for my first migration to the cloud, I used CoreOS as my hosting OS. It’s optimized to be a container server and for super-efficient automated deployments. It worked well, and I loved the elegance behind the design, even though I never did multi-node automated CI/CD deployments with it. I had had a problem with it since the cloud migration, where every 11 days after reboot, it would spike CPU and memory and get unresponsive for a few minutes at a time. I could never entirely nail down what was causing it, but as best I can tell, logging was getting stuck trying to log failed login attempts from SSH. That I had MariaDB/MySQL, a couple of low-activity WordPress sites, and automated Let’s Encrypt cert management running in 2G of memory wasn’t helping either.

Since I built that version 8 months or so ago, the CoreOS people, since acquired by Red Hat, announced the upcoming end-of-life of CoreOS on 26MAY2020. That, combined with my ongoing minor problems, led me to port everything to a new cloud host.

This time I went with a more conventional Linux, and ported everything over in a few hours, along with DR / rebuild documentation. In fact, this post is the first on the new platform. Here’s hoping it’s happily running in 12 days, and thanks for your service, CoreOS.


Personal Disaster Recovery – part one

About this time last year, Susie and I were knee-jerking to the fresh news of my cancer, and the first bucket list thing we jumped on was a long-discussed trip to Hawaii. It was an utterly amazing trip, and I’m so glad we were able to go. Thanks, too, to Delta for offering me very generous flight credits to be bumped – credits that paid for most of the airfare for both of us.

Susie and I at Waipio Valley, north of Hilo, HI. Life is good. #hansleythuglife

We skipped town just as hurricane Florence was hitting NC, and in preparation for the storm, I had powered down most of the tech in the house before we left, with instructions for the house-sitter on what to turn on first to get the house back online. In “normal people’s” houses, they turn their wireless router back on and they’re good to go. Not so in a nerd’s home. I had been playing with Windows Server, and had it hosting all of the DNS roles for both internal and external hosts, and it had worked pretty seamlessly up until this. When the house-sitter fired everything up, some glitch, either in the Xen server hosting everything, or in the Windows server startup kept it from completely starting, which meant that while the network was working, the router was handing out a dead address for DNS, which meant that nothing – Apple TV, the house-sitter’s laptop, etc. – was connecting to the outside world. So, Susie’s driving down from some gorgeous waterfall hike north of Hilo, HI, and I’m on the phone with the house-sitter, trying to talk her through how to set a manual DNS entry on all her devices so she could do work while she was here. We got her going – mostly – and this started a conversation with Susie about how we could simplify tech around the house so that if / when something ever happens to me, she won’t have to call in the techie friends to have them decode home tech and get it working again.

We’d talked about this before, but a little more abstractly, when the husband of a good grad school friend died suddenly in a one-car accident. She had techie friends come over and rip out most of the late husband’s custom home build and install a stock wireless router just so she and her kid could get on the internet. Susie didn’t want to be in that position if at all possible.

This time it wasn’t abstract. The goal was – without killing off technical tinkering, innovation, and learning – to simplify and document things at home so that should something ever happen to me, friends could step in and easily help keep her running. In the business world, companies talk and think about disaster recovery plans, and if they’re smart, they practice them occasionally – fail over to backup servers, restore things from backup, switch to alternative network feeds. It was time to think like that for all of our home systems, but first, we had to make sure that what we were running was set up in the best way.

Roughly speaking, what we had running at home that we had to analyze was:

1. FreeNAS – running 4 fairly old 4TB drives in a ZFS array, storing pics, movies, backups, and other files
2. Xen Server – running VMs for Windows Server, Win10 running Blue Iris for the home IP cameras, and two CoreOS instances hosting a bunch of Docker containers – WordPress, MariaDB, Ubiquiti’s UniFi controller, among others.
3. Home Networking – AT&T and Ubiquiti with a mix of APs in the house and garage.
4. Physical Security – Exterior IP cameras talking to a Blue Iris VM.

More on this saga in the next segment.


No time for you…

Ever since we made the move to AT&T Uverse Fiber – and happily said “See ya!” to TimeWarner / Spectrum – we’ve been a household with clocks adrift. NTP (Network Time Protocol) set up on servers didn’t work (although I blamed myself for botched configurations), NTP on security cameras didn’t work, and network time sync from Windows and macOS routinely failed. As you might expect, none of the system clocks were in sync, and some were off by enough to cause occasional problems. Then I ran across this forum post on AT&T’s forums, which detailed which services / ports AT&T blocks by default for its residential customers – and port 123, which NTP uses, was on the list, both inbound and outbound. The complete list is:

So, I get why they do this – a misconfigured NTP server or botnet node can hammer a target server to its knees in an accidental or intentional DDoS. Protecting people from themselves makes sense for NTP and for the other ports they regularly block.
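In hindsight, a quick test like the one below would have pointed at the network much sooner than re-checking my configurations did. It is a minimal Python sketch, not the tool I actually used: it sends one SNTP client request to UDP port 123 and reads the server’s transmit time from the reply. The server name pool.ntp.org and the 3 second timeout are my own choices for illustration. One caveat: the request leaves from a random source port, so it only tests traffic to port 123. A timeout is consistent with a block but doesn’t prove one, and a reply doesn’t guarantee that a full NTP daemon (which also uses 123 as its source port) will work.

```python
import socket
import struct
import time

NTP_PORT = 123
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """48-byte SNTP request: LI=0, version 3, mode 3 (client), rest zero."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet):
    """Return the server's transmit timestamp as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query(server="pool.ntp.org", timeout=3.0):
    """Ask one NTP server for the time; raises OSError on timeout or failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, NTP_PORT))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)


if __name__ == "__main__":
    try:
        server_time = query()
    except OSError as exc:
        print(f"No NTP reply ({exc}); UDP 123 may be blocked upstream.")
    else:
        print(f"NTP reply received, local clock offset {server_time - time.time():+.3f}s")
```

Running it from a machine on the home network and again from somewhere outside it (a VPS, a phone hotspot) makes the comparison: if it works everywhere except at home, the ISP is the likely suspect.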

Identifying the problem was half the battle. Getting AT&T to fix it was its own effort. What started with a chat session from the Uverse site went like this:

  1. Chat session with the billing / business group, who kindly said he couldn’t help, and transferred my chat to a tech support group.
  2. The first tech support group changed this from a chat into a call, and couldn’t handle my request, and bounced me to a higher tier support group.
  3. This upper tier support group turned out to be a premium internet support group who would only talk to me if I was paying for premium internet support – which he kindly offered to sign me up for on the call. Annoyed at the prospect of paying for something which I thought I already had, I declined. This guy transferred me to a different internet support group who he said could help.
  4. Internet support group #2 – a different group from item 2 above – couldn’t help me either, but instead transferred me to the Fiber Support group.
  5. A fine support rep from the Fiber Support group knew what I was asking (“Can you unblock port 123 for me?”) and how to do it! Once I agreed that I was taking on risk by unblocking this, she proceeded, and by the end of the 40 minute call, I had at least one of my machines able to set its clock from an NTP server. Success!

The total time on the phone / chat with AT&T to get this resolved was just under an hour, and that was once I knew what the problem was and that it was fixable.

As collateral damage, when the rep made the NTP unblock change on my fiber gateway, they ended up breaking all other inbound ports, disconnecting several of my servers from the internet. No amount of reconfiguring and resetting on my side was able to resolve this. Another call to AT&T fiber support and this was resolved on the first call, but that’s another hour or two of my life spent on troubleshooting and resolving something that wasn’t part of the problem to start with.

So, kudos for the right person at AT&T being able to fix this, but it took far too much work on my part to realize that this was an intentional thing on AT&T’s part and that there was a fix for it. Total time spent on configuration, troubleshooting, research, support call, and testing was easily 6-8 hours, but at least I can happily report that all our home systems now believe it’s the same time.



WordPress Works!

Finally, after much wrangling with Apache, PHP, WordPress and its multisite configuration, and SSL certificates, I think I have our little network of sites up, usable, and secure. Time will tell.