Issues with OpenBSD on a t460s
As this was quite an annoying issue that I dragged around for more than a year, I made a TLDR that may be useful to those who are just looking for a workaround/fix. If you're interested in the whole story and the lessons I got from this ordeal, just continue reading.
How it all started
When coming to OpenBSD from Linux, I was used to searching online when faced with issues or just looking for guides on how to install/set up things. That is how I ended up using this seemingly thorough and complete guide as additional material to complement the great FAQ installation guide. Having multiple sources of information seemed like a good idea, especially when you are not yet used to having good documentation and amazing man pages. Now with some more experience, I know that simply using the FAQ and the man pages gives you enough information to install OpenBSD.
This was when OpenBSD was at its 6.7 or 6.8 release. The guide covered OpenBSD 6.4, so I thought it might still hold some valuable additions to the FAQ, notably the fix for screen tearing in the X11 setup part of the guide. For some strange reason I thought this tweak was essential for a nicer experience… Not that such a detail mattered at the time; I was a pretty happy user. OpenBSD is a beautiful, simple and secure system after all, and I am toning down my praise as I am keeping it for another blog entry.
The troubles start
Some time passed and one day, around the time 6.9 was being released, after a sysupgrade(8) (I usually run -current snapshots on my laptop/desktop) I started to have some weird issues. When browsing the web with "modern" web browsers like Chromium, Iridium, even Surf or the very cool Nyxt, my computer would freeze, except for the mouse pointer, which could still move but not interact with anything. After a few reboots and some trials the issue seemed to be linked to WebKit-based web browsers. After asking around a bit at #openbsd on Libera.Chat (Freenode at the time, RIP…) I realized that nobody else seemed to face such an issue and decided that I might just be better off completely ditching the "modern" web.
And just like that a very interesting journey started. I already had NetSurf installed and just started using it full-time. I was pretty happy with my JS-free life, although there were a few annoyances, like the admin panel of Vultr (the VPS provider I currently use) not being accessible without JavaScript… Any service requiring JavaScript to work, such as netbanking, was inaccessible too. That is how I came to realize the full extent to which people like Drew DeVault with SourceHut were a godsend. I already appreciated such efforts to debloat the web, but being forced to browse the web this way made me realize how important those efforts are.
Two issues arose after that. The first one was that, despite ditching those so-called "modern" web browsers, my computer would sometimes freeze randomly when watching videos in mpv. For some reason the freeze was much more likely to occur if the laptop was not plugged in… This made me question the nature of the issue, but since it was rather unlikely with the power cord connected, I didn't bother investigating and just made sure to plug in my laptop whenever watching videos. The second issue was more of a turning point. NetSurf would crash whenever I scrolled down on some Wikipedia articles. I don't remember if it would also freeze my whole computer, but I don't think so, since I never associated this specific issue with the whole ordeal until later, when I found the actual root cause. I just thought NetSurf was not very compatible with my setup at the time. So props to the guys behind NetSurf for not crashing my whole system! This was a turning point in that I had not much choice left. Things like lynx, w3m etc. did not really appeal to me; if I want to browse the web in a text-only way, I'd rather just use things like Gopher or Gemini and simply give up on the web altogether.
Here comes my interest in Plan 9. I was actually already playing around with 9front, connecting to my machine with the amazing Drawterm. And that is when I told myself: "Screw this, why not just use Mothra?". So I did, and holy cow did it work well! I'll keep my 9front stories for another time, but I wanted to highlight how far this single issue led me to explore.
Investigating the issue
This little arrangement with OpenBSD + Drawterm + Mothra for web browsing worked flawlessly, until the mpv crashes started to get more frequent. Even with the power cord connected! Although by this time I had already made it so that most of my computing needs would be fulfilled from within 9front, watching videos is still a significant part of what I use computers for. This is why I still needed OpenBSD, and why the issue getting worse was the last straw that forced me to investigate and do something about it.
So the first thing I did was fire up my IRC client, reach out to the people at #openbsd on Libera.Chat and ask if anybody had any clue what it might be. I got two very interesting suggestions from there. The first one was that it sounded awfully like a RAM corruption issue: some part of my RAM would be broken and the memory would get corrupted there. It made sense in that each time I fired up something memory-hungry like a "modern" web browser, it would fill the memory up to the faulty portion and, once it touched it, something would break and freeze my computer (I still thought the computer was actually crashing at that point in time). It also made sense in that mpv uses less memory, hence crashes less often.
With this in mind I got a USB key, flashed Memtest86+ (which is Free and Open Source Software, unlike the non-free MemTest86) and started it on my computer. Guess what? Some of the memory soldered on the motherboard was indeed faulty! Everything made sense to me and I had finally figured out what was wrong with my computer after a year. The annoying part was that I now needed to replace the motherboard. After looking for a bit, I managed to find a pretty good deal: 2 thinkpads t460s in good condition for 320$. You can find t460s motherboards for 100$ on eBay, but the seller agreed to run the memory test on both laptops, and that was a good guarantee to me that I would be able to fix my issue as well as have some spare parts in case something else went wrong.
The next week I got the two laptops and swapped the hard drive with a lot of enthusiasm, only to find out that the issue was still around. Apparently computing with corrupted memory is no fun. The data can get corrupted, so the "infection" can spread to the hard drive and some system files might be corrupted too. I was slowly losing hope though, as OpenBSD's FFS is usually pretty good at dealing with corrupted storage.
And to my great despair, after a full reinstallation the issue persisted… Little did I know that during the initial boot everything was fine, but as soon as I copied my previous configuration over, it started to misbehave. I didn't notice this, as I was going through both the OpenBSD FAQ and the other guide I had used the previous year as a complement.
At that point I really started to lose patience with OpenBSD; I really believed it might just be incompatible with the hardware. My partner, whose opinion matters so much to me, even suggested that I was getting too obsessed with trying to make OpenBSD work on a t460s. Maybe she was right, maybe I should give up either on OpenBSD or on the t460s. But first I still wanted to explore the second lead I had before giving up all hope… The idea was to make my machine freeze and then try to connect to it via SSH, which would already tell me whether the issue was just a graphical freeze or a complete system crash. When I tried to connect, it worked! Good news: I might finally learn more about the nature of the issue, as I had access to a running system. The first thing I did was poke at the logs, and there were indeed a few weird things, some mentions of await and Xorg timing out. A concurrency issue? At least something was wrong "just" with Xorg; the system still running and still being able to play music confirmed it. That's when I wrote to the OpenBSD bugs mailing list. Unfortunately nobody answered and I still didn't know what the issue was, although it was already very good that I had pinpointed it to something linked with Xorg.
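The log check described above can be sketched roughly as follows, to be run from an SSH session on the frozen machine. This is only a minimal sketch and not a record of the exact commands used at the time: the `scan_log` helper and its keyword pattern are assumptions made up for illustration, and the two paths are the usual OpenBSD locations for the Xorg log and the system log.

```shell
#!/bin/sh
# Print log lines that hint at a stuck X server.
# The keyword pattern is an illustrative guess, not an official list.
scan_log() {
    grep -iE 'timed out|timeout|error' "$1" 2>/dev/null
}

# Usual log locations on OpenBSD; unreadable or missing files are skipped.
for log in /var/log/Xorg.0.log /var/log/messages; do
    if [ -r "$log" ]; then
        echo "== $log"
        scan_log "$log" | tail -n 20
    fi
done
```

If the SSH login itself succeeds while the screen is frozen, that alone already suggests a graphical freeze rather than a full system crash; the log lines then help narrow it down.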
Finally a solution
After a few more days of obstinate searching I stumbled upon something that hinted at an issue with the Intel driver. After a bit of testing I realized that the "TearFree" option of the Intel driver was actually causing the freezes. And OMG, did I realize then how blindly I had trusted that advice to enable the Intel driver with the "TearFree" option! Not enabling the Intel driver at all, and thus using the default modesetting driver, made the issue completely disappear. And the best part was that there wasn't even a screen tearing issue to begin with… And that's how I finally fixed an annoying self-inflicted bug that I had dragged with me for more than a year.
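For anyone wanting to check whether their own setup carries the same leftover, here is a small sketch that lists the Xorg config snippets selecting the Intel driver or setting "TearFree". It is an illustration under assumptions: the `find_intel_snippets` helper and the `CONF_DIR` variable are made-up names, and the snippet shown in the comment is a typical example of such a config, not a copy of the file from the guide.

```shell
#!/bin/sh
# Directory holding Xorg config snippets; override CONF_DIR to test elsewhere.
CONF_DIR="${CONF_DIR:-/etc/X11/xorg.conf.d}"

# A typical snippet of the kind this looks for (assumed example):
#   Section "Device"
#       Identifier "Intel Graphics"
#       Driver "intel"
#       Option "TearFree" "true"
#   EndSection

# List files under the given directory that select the intel driver
# or set the TearFree option.
find_intel_snippets() {
    grep -RliE 'Driver[[:space:]]+"intel"|Option[[:space:]]+"TearFree"' "$1" 2>/dev/null
}

find_intel_snippets "$CONF_DIR" || echo "no intel/TearFree snippet found in $CONF_DIR"
```

Any file it lists can then be moved out of the directory (as root) before restarting X, so that Xorg falls back to the default modesetting driver.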
Takeaway
I would like to conclude with the moral of the story, basically what I learned from this year-long ordeal:
First of all, trust the OpenBSD devs! Don't change the defaults unless you think you know better than them. Second, don't trust some online blogs more than the OpenBSD devs. The likelihood that the man pages and the FAQ hold everything you need is very high. Don't try to take shortcuts by blindly following a seemingly thorough guide. You will ultimately lose more time, like I did. When you have a manual at your disposal, by all means, use it! Learning is also a requirement for you to control and understand your system; such is the path that leads to freedom. And lastly, when debugging a computer issue, don't lose faith, and trust in your ability to find a solution!
That's all. I touched on a lot of very interesting topics, each of which will most certainly get its own blog entry. And I managed to type this whole text in the Colemak layout without too much strain; the progress feels real. See you soon!
TLDR
The issue was that my thinkpad t460s seemed like it was crashing (it was actually freezing) when I opened a "modern" web browser, or sometimes randomly while watching videos in mpv. If I remember correctly the issue started somewhere around the release of OpenBSD 6.9.
The issue was caused by the Intel driver that I had enabled for its "TearFree" option. Completely removing the config from `/etc/X11/xorg.conf.d/` and letting OpenBSD default to the modesetting driver fixed it. The difference is pretty much unnoticeable and the freezes are gone.
The issue was also described on the OpenBSD mailing list.