2022 State of the Tools

A picture of many tools on a white surface.
Photo by Cesar Carlevarino Aragon on Unsplash

It's time for this year's State of the Tools. Things are a bit in flux this year, as I'm working on transitioning my endpoints from Windows to macOS, but I'll run down where I'm at and where I'm headed; there have also been substantial changes in our personal computing infrastructure.

Hardware & OS

Organizing these posts can be a bit random, so let's start with systems themselves.

Work Endpoint Computing

I've moved entirely into my Surface Laptop 4 (i7, 32GB/1TB) running Windows 11 for work computing now, and reloaded my i9 to be a Linux compute server that lives on my desk. Since I work from home 1–2 days most weeks this is a lot more convenient than having different sets of windows and browser tabs at work and home. It also addresses the problem that Zoom refuses to allow me to log in to more than one computer at a time — I can mostly just be logged in on the laptop.

At the office, this connects to a Surface Dock 2 driving two 4K Dell monitors, and the laptop sits to the side to give me a third screen. I use a Surface Bluetooth ergo keyboard and the wireless version of the Kensington Expert Mouse. I've used some version or another of the Expert Mouse since about 2010, and it's a solid product that helps a lot with my RSI. The product has been on the market for probably 20 years at this point, and if they ever stop production it will be very sad.

I also have an M1 Mac Mini for software testing and driving my shared monitor, and an old MacBook Air for debugging on Intel-based Macs. I also still have the Surface Go 2, which I mostly use as a tablet. In my next work computer refresh, I plan to get a 14" MacBook Pro, and keep a Windows VM on a server somewhere for Windows-based testing and debugging. While I appreciated the travel weight benefits of a 10" tablet for my portable computing, life and the world change, and it's really more convenient to have a usable keyboard and screen that are still portable.

Work Backend Computing

I've moved more computing to remotely-accessed iron since switching to a purely laptop endpoint. My Dell workstation is now a compute server (8-core i9, 64GB RAM, 512GB SSD + 10TB HDD), and I put a new 8GB Turing-based Quadro card in it (pretty modest, but the largest card this computer's case and PSU can handle). Note to self: don't buy a small form factor box for heavy compute. This machine is only used by me, not the lab.

My research group has a dual Xeon server with 24 total cores for data storage and modest compute. The CPUs are ancient and only clocked at 2GHz, so it isn't very fast, but it has a lot of memory & a decent amount of disk, so it's just fine for storage and interactive computing. Large compute-intensive jobs we run on the university's clusters. I run medium jobs on my dedicated server.

Both of these machines are running RHEL8 with XFS filesystems; DVC can use the reflink feature in modern XFS to reduce storage requirements quite a bit.
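For anyone unfamiliar with reflinks, here is a minimal Python sketch of the idea: ask the filesystem to make a new file share the source file's data blocks, and fall back to an ordinary copy when that is not supported. This is only an illustration of the filesystem feature, not how DVC is implemented (DVC selects its link strategy through its `cache.type` setting). The helper name `reflink_or_copy` is hypothetical, and the `FICLONE` ioctl is Linux-specific.

```python
import fcntl
import os
import shutil
import tempfile

# Linux ioctl request number for FICLONE (defined in <linux/fs.h>).
FICLONE = 0x40049409


def reflink_or_copy(src: str, dst: str) -> str:
    """Clone src to dst via reflink if possible, else fall back to a byte copy.

    Returns "reflink" or "copy" to report which strategy was used.
    """
    try:
        with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
            # Ask the filesystem to make dst share src's data blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
        return "reflink"
    except OSError:
        # The filesystem (or OS) does not support cloning; copy the bytes.
        shutil.copyfile(src, dst)
        return "copy"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "data.bin")
        dst = os.path.join(tmp, "data-clone.bin")
        with open(src, "wb") as f:
            f.write(b"x" * 4096)
        # Prints which strategy the current filesystem allowed.
        print(reflink_or_copy(src, dst))
```

On a reflink-capable filesystem the clone takes almost no extra space until one of the two files is modified, which is where the storage savings come from.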

Personal Endpoint

We're diving a bit deeper into the Apple ecosystem this year, and bought an M2 MacBook Air to replace my aging (and showing its age) Surface Pro 4. I used a Mac for work at Texas State 2014–2016 and didn't really like it, but working with this thing I am pretty much sold. The battery lasts for ages, it's incredibly fast, and having a consistent CLI experience between local and remote systems is nice. The hardware is also very nice. I can even do a bit of light gaming on it.

We also still use a Windows desktop for most of our gaming and some personal computing work. I'm thinking hard about getting a Steam Deck and moving most of my gaming over to that in the next year or two, but we will see. I'm not planning to make that purchase in the next few months. Overall, I think the Deck (or the Deck 2 that I assume is coming) would be a pretty good fit for my particular gaming needs and style, especially driving the TV with a dock and my Xbox controller.

Personal Infrastructure

Our QNAP TS-269L NAS is still chugging along, and in addition to storing backups of teaching materials, it's doing extra duty as our router and firewall and boasting an upgrade to 8TB of mirrored storage. Its CPU is pretty meager: an Atom D2701 that, while 64-bit, doesn't have crypto extensions or even frequency scaling. For the time being, though, it is getting the job done.

It took over router duty when our NetGear router decided for whatever reason that software updates are optional. Even after I manually installed the latest firmware, it refused to detect and install subsequent updates. Security updates are pretty important for the home network perimeter, so I decided to move routing & firewall duties to a machine whose OS I could control directly and make sure was up to date and use the NetGear in AP-only mode. Since the NAS has two network ports, it's filling that role for now.

The NAS started the year running Alpine with XFS on Linux software RAID, but I tried out FreeBSD on it again with ZFS. This was a pretty ergonomic environment, and pf is a very approachable firewall language when it came time to make it a router (iptables has usually confused me), but there were a couple of big issues:

  • Significant performance problems when doing CPU- and disk-intensive work, such as synchronizing large file directories. I do not know the cause, but my suspicion is that its task scheduler is not as effective at handling significant load on underpowered hardware; filesystem and the transfer application would get priority over network routing, so our machines would fall off the network.
  • FreeBSD's security reputation is… spotty. The picture I've pieced together from various sources is basically that there are some good ideas, but without nearly the attention that Linux has had, there are likely many outstanding security-relevant bugs, and things don't get caught as quickly.

So back to Alpine Linux but keeping ZFS (which Alpine supports very well). After a few false starts, I got the necessary firewalling rules working in nftables (which is a substantial ergonomic improvement over iptables IMO). Alpine also has a pretty BSD flavor to its administration, due to using OpenRC and lots of classic shell scripts. I like the niceties of modern Linux for a lot of things, but for my basic network infrastructure at home Alpine is working quite well, and I haven't seen a recurrence of the performance problems I had on FreeBSD. And with the gcompat package, VS Code remote editing even works. I've also installed WireGuard for personal device VPNs, so I can browse (more) securely while traveling.

The Raspberry Pi 0 media widget isn't seeing a lot of use these days, but it's still there, and still running Alpine fine, with shairport-sync attempting to provide Apple streaming support but glitching out rather more frequently than I would like.

In the next year or two, I'm hoping to add a dedicated router (NanoPi R5C is the current leader), replace the NetGear with a WiFi 6 access point, and upgrade the NAS to something with a more modern CPU.

Mobile

Not much to say here. Still on iPhone & Apple Watch, no plans to change that.

Interactive Software

Across my various devices, I tend to rely on the same software for my interactive (graphical) computing:

  • Chrome for work browsing, Firefox for personal.
  • Office for editing, spreadsheets, and presentations; I do use Google Docs for work stuff that is lightweight and/or needs high collaboration.
  • VS Code for text editing.
  • Overleaf for LaTeX document preparation.
  • Paperpile for reference management.
  • DynaList for tracking work notes and my runway document.
  • FreeCommander for Windows file management, Marta on macOS (when I am not just using the system's native file browser).
  • The Affinity suite for graphics & photo editing.
  • Visio for diagramming (when PowerPoint or diagrams.net won't do). I'm not using Grapholite much any more. After switching to Mac I will look at OmniGraffle.
  • Drawboard PDF for tablet-based PDF reading and annotation; Acrobat Pro on laptop/desktop.
  • 1Password for password management.
  • iTerm 2 for my macOS terminal (Microsoft Terminal on Windows).

Command Line

I do a fair amount of work from the command line, both for running the software and analyses I'm doing and also for general file and data manipulations. Key pieces of this load:

  • zsh on *nix (including WSL), PowerShell on Windows.
  • nano for quick terminal editing; I've also been experimenting with Helix and Micro but the nano keybindings are pretty engrained now.
  • tmux for terminal multiplexing / disconnection.
  • mosh for connecting to servers at home; I will probably start using it more across the board. I'm now just using OpenSSH's direct PKCS#11 support for my YubiKey.
  • direnv to provide ergonomic project-specific configuration.
  • zoxide (on *nix) and ZLocation (PowerShell) for directory navigation.
  • htop for performance monitoring.
  • bat for file viewing.
  • fd for finding files.
  • ripgrep for searching files.
  • ncdu and dust for managing disk usage.
  • exa for listing files.
  • pandoc for Markdown processing.

Data Infrastructure

On the personal side, we moved off of Microsoft for our primary data infrastructure. E-mail is now on Fastmail and we're using iCloud for file sharing, calendars, contacts, tasks, etc. Microsoft was working fine for us, but with moving towards macOS it wasn't making as much sense, and we were able to cut our infrastructure & services bill a bit. We're using Syncthing for data transfer between endpoints and the NAS, Samba/CIFS for direct access to NAS storage, and Backblaze for endpoint backups.

For work, there are a few things in play:

  • Google Drive for cloud/synchronized document storage (and all the Google Doc storage, of course).
  • git and GitHub for source code and text document management. Most of my documents either live in Google Drive or a Git repo.
  • Data Version Control (dvc) for managing research data and pipelines.
  • Minio (S3-compatible storage server) for storing and synchronizing DVC-managed data.
  • rsync and robocopy for CLI data transfer; WinSCP for GUI transfer.
  • Kopia for endpoint backup, using a university CIFS drive as the backup target.

Programming and Science

I am still using Python + Pandas + NumPy as my primary software development environment for scientific computing & open-source work. I wouldn't say I love it, but it's widely-known, gets the job done, and has libraries for a lot of things. I've been working on filling a few gaps myself, such as the seedbank package for random number generator initialization. For plotting I'm mostly using plotnine, but sometimes seaborn and matplotlib.
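The general idea behind centralizing RNG initialization can be sketched with the standard library alone: start from one root seed and derive a distinct, reproducible seed for each component of an experiment. This is not seedbank's API; `derive_seed` and `make_rng` are hypothetical names used only for illustration, and hashing with SHA-256 is my assumption about one reasonable way to do the derivation.

```python
import hashlib
import random


def derive_seed(root_seed: int, *keys: str) -> int:
    """Derive a reproducible 64-bit seed from a root seed and string keys."""
    h = hashlib.sha256(str(root_seed).encode("utf-8"))
    for key in keys:
        # The separator keeps ("ab", "c") distinct from ("a", "bc").
        h.update(b"\x00" + key.encode("utf-8"))
    return int.from_bytes(h.digest()[:8], "big")


def make_rng(root_seed: int, *keys: str) -> random.Random:
    """Build an independent generator for one named component."""
    return random.Random(derive_seed(root_seed, *keys))


if __name__ == "__main__":
    # Each component gets its own stream, all traceable to one root seed.
    split_rng = make_rng(20221231, "data-split")
    eval_rng = make_rng(20221231, "evaluation")
    print(split_rng.random(), eval_rng.random())
```

Keying each generator by name means adding a new random component to a pipeline does not shift the random streams that existing components see.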

I use Rust for high-performance data processing code; this year, I finished rewriting the Book Data Tools to be completely in Rust (except for notebooks computing statistics on the data set) instead of a combination of Rust, Python, and SQL.

JavaScript is my primary language for web stuff; this web site is generated with a custom JavaScript codebase. I've been experimenting with Deno some as an alternative to node.js, and might start doing more with it. I also reach for JavaScript when doing some data processing that is heavily web-native, like extracting data from a Twitter archive dump.

I've also started playing around a bit more with Tcl for lightweight scripting. I've used it for a while for helping with some setup bits for my Unix shell environment, and have now written some admin scripts in it. It definitely has some warts and holes, but it's a rather pleasant language to do simple things in, and it's fantastic for building eDSLs.

I use Conda/Mamba (typically through the Mambaforge distribution these days) for managing my primary scientific software environments, as it allows me to get Python, Rust, R, and other tools as needed. These days, if something isn't in conda-forge or available on a pretty basic OS install, I don't depend on it in my research projects, so that students and collaborators can get all the requirements from environment.yml.

For projects that aren't in Conda, I use rustup for managing my Rust versions and asdf for managing most other interpreter environments (Python, Node, Ruby, etc.). I've been using pip-tools to pre-resolve Python dependencies for lack of a better solution.

Media Production

I'm not recording as many videos as I did the year I did the build-out of CS 533, but I do still record and edit a fair number.

Camtasia is my go-to recording & video production software; it's much easier to use for my use case (educational videos) than something like Premiere. I also have a USB shuttle to help with precise navigation while editing.

The game-changing addition this year was TechSmith Audiate. It's an audio editor and transcriber that allows you to edit an audio track by editing its text transcription. Corrections just correct the transcript without editing the underlying audio; deletions edit the audio track as well. This way I can more aggressively trim my videos to delete blank spaces, false starts, etc.; previously, I only edited out big gaps, because more detailed editing to clean up the speech is very labor-intensive. It integrates with Camtasia, so I can export the audio track from Camtasia to Audiate, clean it up, and round-trip back with an edit list that Camtasia uses to trim the corresponding video. This cleanup process also gives me working captions at the same time. I haven't gone back and re-edited all my videos with this, but it makes it much less time-consuming to produce both a cleaner video and edited captions for new work.
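To make the transcript-driven editing idea concrete, here is a small Python sketch of how deleting words from a timestamped transcript could turn into an edit list of time ranges to cut. It is purely conceptual: I don't know the actual format Audiate and Camtasia exchange, and `Word`, `cut_ranges`, and `kept_ranges` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the track
    end: float    # seconds from the start of the track


def cut_ranges(words, deleted):
    """Turn deleted word indices into a sorted list of (start, end) cuts."""
    cuts = []
    prev = None
    for i in sorted(set(deleted)):
        w = words[i]
        if prev is not None and i == prev + 1:
            # Consecutive deleted words: extend the last cut over the gap too.
            cuts[-1] = (cuts[-1][0], w.end)
        else:
            cuts.append((w.start, w.end))
        prev = i
    return cuts


def kept_ranges(duration, cuts):
    """Return the (start, end) spans of the track that survive the cuts."""
    kept = []
    pos = 0.0
    for start, end in cuts:
        if start > pos:
            kept.append((pos, start))
        pos = max(pos, end)
    if pos < duration:
        kept.append((pos, duration))
    return kept


if __name__ == "__main__":
    transcript = [
        Word("So", 0.0, 0.3),
        Word("um", 0.4, 0.7),
        Word("uh", 0.8, 1.0),
        Word("hello", 1.2, 1.6),
        Word("world", 1.7, 2.1),
    ]
    cuts = cut_ranges(transcript, deleted={1, 2})
    print(cuts)
    print(kept_ranges(2.5, cuts))
```

Because the cuts are expressed as time ranges rather than audio samples, the same list can be applied to the video track, which is what makes the round trip useful.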

For recording hardware, I'm using a Logitech Brio as my primary webcam, and a Blue Yeticaster as the microphone. I have a green screen that can clip to the back of my chair and some USB lights to facilitate chroma-key work. I also use my phone sometimes, with the Camo software that turns a phone into a webcam. This, combined with a desk-mounted boom arm and a QuadLock mount, allows me to do document camera work in my office either on video or on Zoom without needing to record and edit in a separate video stream. I also use my phone for mobile video recording.

I'm still using Unsplash and The Noun Project for sourcing most of my artwork & icons, with the occasional addition from OpenClipArt.

Analog Information

I'm still using Leuchtturm1917 A5 notebooks and a collection of fountain pens for my daily and weekly productivity management, note-taking, and journaling. I use a Pentel Orenz mechanical pencil for working out maths, sketching diagrams, and other things where I need to be able to erase.

I also bought a turntable this year and began collecting vinyl music. Tactile, analog engagement helps me ground myself and feel more connected to what I'm doing; it's been effective in my journaling, and I wanted to bring that to a more purposeful connection to music (in some contexts — I still make heavy use of streaming as well). I've been putting together an eclectic collection including P!nk, Sturgill Simpson, The New York Rock & Roll Orchestra, Chicago Transit Authority (before they changed their name — found that album for $1 in the bargain bin), Carly Rae Jepsen, and many others. My weekly ritual has now extended to include spinning an album to go with my whiskey and longhand planning.