
Understanding your graphs part 5 - Network and Storage

GitHub Staff

 

In part 4 of our 'Understanding your graphs' mini-series, we talked about GitHub Enterprise Application servers and Background job graphs. In part 5, we're going to talk about GitHub Enterprise Network and Storage graphs.

 

Network

 

The network interface graphs can be useful for profiling user activity and the throughput of traffic in and out of the GitHub Enterprise appliance.

 

Clients

Network clients graph

  • Breaks down the number of clients per TCP port, which is useful for examining how users are interacting with GitHub Enterprise.

 

Sockets

network sockets graph

  • Provides further detail on TCP connection states, which can be useful in troubleshooting network or load balancer issues in some cases.

 

Interface Throughput

eth0 interface throughput graph

  • The amount of data transferred inbound and outbound from the GitHub Enterprise appliance.
  • TX (outbound) traffic is most commonly higher than RX (inbound), especially when many systems are "polling" the API or Git repositories for changes.
  • Plateaus in this graph can be an indication of link saturation or reaching the maximum possible link throughput.
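
As a rough illustration of what a plateau looks like in the underlying numbers, the sketch below scans a series of throughput samples for sustained runs that stay close to the observed maximum. The function name `find_plateaus`, the 5% tolerance, and the minimum run length are all assumptions chosen for the example; they are not values used by GitHub Enterprise.

```python
from typing import List, Sequence, Tuple


def find_plateaus(
    samples: Sequence[float],
    tolerance: float = 0.05,  # assumed: "near the peak" means within 5% of it
    min_length: int = 5,      # assumed: at least 5 consecutive samples
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of runs that stay near the series maximum."""
    if not samples:
        return []
    peak = max(samples)
    if peak <= 0:
        return []
    threshold = peak * (1 - tolerance)

    plateaus: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(samples):
        if value >= threshold:
            if start is None:
                start = i  # a run near the peak begins here
        else:
            if start is not None and i - start >= min_length:
                plateaus.append((start, i - 1))
            start = None
    # Close a run that lasts until the final sample.
    if start is not None and len(samples) - start >= min_length:
        plateaus.append((start, len(samples) - 1))
    return plateaus


# Hypothetical throughput samples in Mbit/s.
throughput = [100, 400, 960, 980, 1000, 990, 970, 960, 300]
print(find_plateaus(throughput))
```

A flat top like this only suggests saturation; it is worth comparing the plateau value with the known capacity of the link before drawing conclusions.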

 

Interface Errors

eth0 interface errors graph

  • The presence of any errors may indicate a problem with the physical or virtual network card, or cables connected to the Hypervisor host system.
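
If you want to cross-check error counts outside of the graphs, a minimal sketch is shown below. It assumes a Linux host that exposes interface counters in the standard `/proc/net/dev` layout; whether the graphs themselves are built from this file is not stated in this article, and the helper name `parse_interface_errors` is hypothetical.

```python
from typing import Dict


def parse_interface_errors(proc_net_dev: str) -> Dict[str, Dict[str, int]]:
    """Extract receive and transmit error counters per interface.

    Expects text in the Linux /proc/net/dev layout: two header lines,
    then one "name: <8 receive fields> <8 transmit fields>" line per interface.
    """
    errors: Dict[str, Dict[str, int]] = {}
    for line in proc_net_dev.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, counters = line.partition(":")
        fields = counters.split()
        if len(fields) < 16:
            continue  # not a complete counter line
        errors[name.strip()] = {
            "rx_errs": int(fields[2]),   # third receive field
            "tx_errs": int(fields[10]),  # third transmit field
        }
    return errors


# Made-up sample data for illustration only.
sample = """Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 1000 10 3 0 0 0 0 0 2000 20 0 0 0 0 0 0
  tun0: 500 5 0 0 0 0 0 0 600 6 1 0 0 0 0 0
"""
for name, counts in parse_interface_errors(sample).items():
    print(name, counts)
```

On a real system the input would come from reading `/proc/net/dev`; the counters are cumulative, so it is the change between two readings that matters rather than the absolute value.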

 

Replication Throughput

tun0 interface throughput graph

  • The amount of data sent to, and received from, replica instances over the internal OpenVPN interface.

 

Replication Interface Errors

tun0 interface errors graph

  • Errors may occur here due to saturation or MTU problems on the physical link; however, these are generally not critical errors.

 

Storage

 

GitHub Enterprise repository performance depends heavily on the underlying storage system. Low-latency local SSD disks provide the highest performance. For more information on the GitHub Enterprise storage architecture, please see the System Overview guide on our documentation site.

 

Disk usage (Root Device)

root disk utilization graph

  • Disk space in bytes available for root volume storage.
  • Growth on this volume is generally due to logging, which is on a 24-hour rotation schedule.
  • The root volume reaching 100% usage can cause a system outage, or indicate a service issue which is causing extreme log growth.

 

Disk usage (Data Device dm-0)

data disk utilization graph

  • Disk space in bytes available for the user data volume.
  • All user profile data, pull request and issue metadata, repositories, and release assets are stored on this device.
  • The data volume reaching 85% usage will cause problems with the built-in search functionality of GitHub Enterprise. We recommend increasing the storage capacity of the data volume before it reaches 85% usage.
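
To make the 85% figure concrete, here is a small sketch of the arithmetic using Python's standard `shutil.disk_usage`. The function names and the path passed in are assumptions for illustration; substitute the mount point of the volume you care about.

```python
import shutil

SEARCH_WARNING_PERCENT = 85.0  # threshold taken from the text above


def usage_percent(total_bytes: int, used_bytes: int) -> float:
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def needs_more_capacity(path: str, limit: float = SEARCH_WARNING_PERCENT) -> bool:
    """Report whether the filesystem containing `path` is at or above `limit` percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage_percent(usage.total, usage.used) >= limit


# Worked example: 850 GB used on a 1000 GB volume is exactly at the threshold.
print(usage_percent(1000, 850))
```

Calling `needs_more_capacity` with a real mount point gives a yes/no answer that could feed an external alert, so capacity can be added before the threshold is reached.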

 

Disk latency (Root Device & Data Device dm-0)

root disk latency graph

data disk latency graph

  • For best IO performance, average latency values below 10ms are recommended.
  • Large spikes may be an indication of storage system saturation.
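
The two points above can be sketched as a simple check over a list of latency samples: compare the average against the 10ms recommendation, and flag samples that sit far above the typical value. The function name and the "five times the median" spike rule are assumptions made for this example, not definitions used by the graphs.

```python
from statistics import mean, median
from typing import Dict, List, Sequence, Union

RECOMMENDED_AVG_MS = 10.0  # recommendation from the text above


def summarize_latency(
    samples_ms: Sequence[float],
    spike_factor: float = 5.0,  # assumed: a spike is more than 5x the median
) -> Dict[str, Union[float, bool, List[int]]]:
    """Summarize disk latency samples given in milliseconds."""
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    average = mean(samples_ms)
    typical = median(samples_ms)
    spikes = [i for i, value in enumerate(samples_ms) if value > spike_factor * typical]
    return {
        "average_ms": average,
        "within_recommendation": average < RECOMMENDED_AVG_MS,
        "spike_indexes": spikes,
    }


# Hypothetical samples: mostly low latency with one large spike.
print(summarize_latency([2.0, 3.0, 2.0, 4.0, 40.0, 3.0]))
```

Note that an average can stay under 10ms while individual spikes are still large, which is why the sketch reports both.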

 

Disk operations (Root Device)

root disk ops graph

  • A sudden, abnormally large amount of time spent in root volume IO may indicate a logging issue or a general storage problem.

 

Disk operations (Data Device dm-0)

Data disk ops graph

  • A sudden, abnormally large amount of time spent in data volume IO may indicate a repository maintenance issue or a general storage problem.
  • The trend for reads generally follows the pattern of Git fetch or clone traffic on the system.

 

Disk pending operations (Root Device)

Root disk pending ops graph

  • Pending disk operations on the root device may indicate storage system saturation for the root volume.

 

Disk pending operations (Data Device dm-0)

Data Disk pending ops graph

  • Pending disk operations on the data device may indicate storage system saturation for the data volume.

 

Disk traffic (Root Device)

Root Disk traffic graph

  • Write traffic on the root volume is mostly due to logging and collectd graph data collection.
  • Read traffic on the root volume is typically very low; however, support bundle generation may cause temporary spikes.

 

Disk traffic (Data Device dm-0)

Data Disk traffic graph

  • Read and write trends depend on user and integration activity.
  • Plateaus in this graph may indicate storage system saturation.

 

Continue the conversation

 

There's more to come in the "Understanding your graphs" mini-series. If you'd like to follow along, just subscribe to the "Understanding your graphs" label (link below). Please let us know if you have any questions in the comments.