In part 3 of our ‘Understanding your graphs’ mini-series, we talked about GitHub Enterprise Authentication graphs. In part 4, we’re going to talk about GitHub Enterprise Application server and background job graphs.
The application servers section provides insight into the activity of the GitHub Enterprise services that provide data to users or integrations.
- Profile of active sessions connected to GitHub Enterprise backend services. This graph provides a summary of the volume and type of activity from users.
- Web unicorn sessions are often the largest portion of this graph, as users interact via the web UI and API.
- High error rates may indicate a problem with a service, or potential saturation due to request volume.
- Please reach out to GitHub Business Support if you regularly encounter errors on this graph.
- Service workers which are currently serving a request.
- User and integration daily activity trends are very visible in this graph.
- Plateaus for extended periods in this graph indicate worker saturation, and should be investigated for any request queueing.
- Worker counts automatically scale with system memory size at boot.
- Values in this graph indicate that requests had to wait for a worker process to become available before they could be processed and served.
- If requests are constantly queuing, users will notice delays in responsiveness, as well as encounter errors or timeouts more frequently.
- Regularly queued requests are a major indicator that the appliance is undersized for the volume of incoming requests.
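The saturation pattern described above (active workers sitting at a plateau while requests queue) can also be checked against exported samples. Below is a minimal sketch in Python, assuming you have per-minute samples of active workers and queued requests from your own monitoring. The function name, the `worker_count` value, and the five-minute threshold are illustrative assumptions, not part of GitHub Enterprise.

```python
from typing import List, Tuple


def find_saturation_windows(
    active: List[int],
    queued: List[int],
    worker_count: int,
    min_minutes: int = 5,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample index pairs, end exclusive, where every
    worker was busy and requests were queued for at least min_minutes
    consecutive samples. All inputs are hypothetical exported samples."""
    windows = []
    start = None
    n = min(len(active), len(queued))
    for i in range(n):
        # A sample counts as saturated when all workers are busy
        # and at least one request is waiting for a worker.
        saturated = active[i] >= worker_count and queued[i] > 0
        if saturated and start is None:
            start = i
        elif not saturated and start is not None:
            if i - start >= min_minutes:
                windows.append((start, i))
            start = None
    # Close a window that runs to the end of the samples.
    if start is not None and n - start >= min_minutes:
        windows.append((start, n))
    return windows
```

A short burst of queueing is ignored by this sketch. Only sustained plateaus are reported, which matches the advice to investigate plateaus that last for extended periods.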
The Application request / response section looks at the rate of requests, how quickly those requests are responded to, and with what status they returned.
- Per minute request counts, broken down by type.
- API is typically the highest on systems with many integrations or active CI and project management tools.
- Reflects the speed of web requests at the 90th percentile in milliseconds.
- Times of over a few seconds can indicate a poor user experience due to long browser load times, or slow API responses.
- Time spent in Ruby garbage collection within the GitHub Enterprise web application.
- Plateaus for extended periods of GC time may indicate a problem with the GitHub Enterprise application itself.
- Time spent on disk I/O by the data services that GitHub Enterprise depends on.
- Plateaus for extended periods of time may indicate system resource saturation.
- The number of responses per HTTP status code.
  - 2xx successful status codes will normally be the largest.
  - 401 Unauthorized codes will also be present in environments where API and Git over HTTP traffic is present, as initial requests from clients may not provide authentication headers.
  - 500 statuses indicate a potential issue with the GitHub Enterprise application, and should be investigated with support.
- Represents the number of application exceptions generated per minute.
- High rates of errors may indicate an issue impacting the GitHub Enterprise application.
- Number of tasks queued for background processing on the GitHub Enterprise appliance.
- Many user and application actions trigger jobs which run asynchronously on GitHub Enterprise, and are queued to be processed by resqued.
- Workers which process the maint_git-serv queues are paused during GitHub Enterprise Backup Utilities snapshot runs. It is normal to see the number for this queue increase while a snapshot is in progress. The queue should then drain rather quickly once the snapshot run is complete.
- As there are a finite number of resque worker processes, queues which never drain to 0 may indicate resource saturation or in some cases jobs which have gotten stuck, requiring manual intervention to clear.
- Many queues simultaneously having hundreds or thousands of jobs pending can indicate resource saturation. Queue length can also be inspected from the SSH admin console by running
When E-mail for notifications is enabled, this graph displays the length of the onboard postfix mail queues.
High numbers of deferred E-mail messages may indicate a problem with the configured SMTP server, or failures in mail delivery to specific user E-mail addresses.
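For the background job queues described earlier, the key distinction is between a queue that grows and then drains (as during a backup snapshot) and one that never reaches 0. Below is a minimal sketch in Python, assuming you collect queue depth samples with your own tooling. The function name, queue names, and sample values are hypothetical.

```python
from typing import Dict, List


def queues_never_drained(samples: Dict[str, List[int]]) -> List[str]:
    """Return the names of queues whose depth never reached 0 within the
    sampled window. Queues with no samples are skipped."""
    return sorted(
        name
        for name, depths in samples.items()
        if depths and min(depths) > 0
    )


# Hypothetical samples: one queue drains after a snapshot, one never drains.
example = {
    "maint_git-serv": [0, 40, 120, 0],
    "example_stuck": [5, 9, 14, 20],
}
print(queues_never_drained(example))
```

A queue flagged by this check is only a candidate for investigation. The sampled window needs to be long enough to cover normal bursts before a non-zero minimum means anything.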
Continue the conversation
There’s more to come in the “Understanding your graphs” mini-series. If you’d like to follow along, just subscribe to the “Understanding your graphs” label (link below). Please let us know if you have any questions in the comments.