I’m having a very frustrating problem, made up of a number of issues that compound one another. Before I go over it, I want to say that I have experience with Jenkins, Travis, CircleCI and Buildkite. I also want to say that none of those services ever had these problems, because what I’m about to describe is a basic feature.
I am building UnrealEngine docker images on a self-hosted GitHub runner. Why I have to do this to begin with is another frustration unrelated to GitHub, but it’s a requirement nonetheless. The self-hosted runner is satisfactory and I have no problems with it. Great work, it may be one of the best self-hosted CI agents I’ve used.
The problem, however, arises from the sheer size of a 6-10 hour full build of UnrealEngine (depending on the required build settings). The total length of the logs is between 30k and 60k lines, and the line length is well above average. Due to the nature of the build and issues surrounding UnrealEngine builds, changing the verbosity for the various dependencies and build stages is not an option.
Problem #1: The way GitHub loads logs in GitHub Actions seems poorly implemented. When attempting to scroll through the logs, a memory leak occurs that results in the browser crashing. I am working on a system with 64 GB of memory, no other open Chrome tabs, and free memory hovering around 90% before attempting to load the logs. This behavior was replicated in Brave, Firefox and even Edge, so it does not seem to be unique to Chrome. Any form of scrolling more than 100 lines at a time results in a memory leak. This includes the scroll wheel, Page Down and End. Because it’s a docker build, annotating the error in the job is easier said than done, and this is exacerbated by the fact that an abstraction outside of my control manages the initialization of the build. I already hear you typing to suggest I just make the logs less verbose, use different software, etc. Sometimes we don’t have control over these variables. If I did, I would do it, and this is one of those cases. So please keep this in mind if you’re about to suggest a workaround that depends on variables I can’t control. This frustration highlights another frustration.
Problem #2: When a GitHub action fails and you expand the job, GitHub loads the head of the logs. At some point someone decided it made sense for the logs to start at line #1. Maybe there’s a use case where this makes sense, but I challenge anyone to explain to me who cares what happens on line #1 of a log when a job has failed. I can find no reasonable explanation for how it is good UX to show the lines from the head of a file for a failed job. Common sense suggests to me that what a user is concerned with in a failed job, as they would be with a stack trace, is the cause of failure. When it comes to logs, 99% of the time that is at the bottom of the logs.
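To illustrate the behavior I mean, here is a minimal Python sketch of taking the tail of a log instead of the head. It is not anything from GitHub’s code; the function name `tail_lines` and the sample log are hypothetical and only there to show the idea of keeping the last N lines in bounded memory, however long the log is.

```python
from collections import deque
from typing import Iterable, List


def tail_lines(lines: Iterable[str], n: int = 100) -> List[str]:
    """Return the last n lines from an iterable of log lines.

    A deque with maxlen drops the oldest entry whenever a new one
    arrives, so memory use is bounded by n, not by the log length.
    """
    if n <= 0:
        return []
    return list(deque(lines, maxlen=n))


if __name__ == "__main__":
    # Hypothetical log: many uninteresting lines, failure at the end.
    log = [f"step {i}: ok" for i in range(1, 60000)]
    log.append("step 60000: ERROR build failed")
    for line in tail_lines(log, n=3):
        print(line)
```

With something like this, a failed job would open on the last few lines, which is usually where the error is, rather than on line #1.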
I post this in the hope that someone from GitHub will rectify the situation, because as a stopgap I have been forced to hack in a Graylog implementation to access the runner logs, and it’s extremely difficult to maintain. (I am very open to a better solution. There might be one, but I was unable to identify it!) I’m inches away from abandoning GitHub Actions and going to Buildkite, where the GUI treats basic usage as a basic requirement of the software.
Dear GitHub, please either invert the order of the log lines or consider loading the tail of the logs, as opposed to the head, on job error. This simple change would sidestep the more complex requirements and likely convoluted solutions that fixing problem #1 would involve, and it should not be terribly difficult to implement either. Thank you.