
High-Availability and Disaster Recovery for GitHub Enterprise

GitHub Staff

High-Availability and GitHub Enterprise Backup Utilities at a glance

 

High-Availability for GitHub Enterprise

 

High-Availability for GitHub Enterprise takes the form of an Active-Passive setup, where a High-Availability replica is a warm standby instance that is kept in sync asynchronously with your main production GitHub Enterprise instance, or primary. This allows you to have a redundant GitHub Enterprise instance in a different datacenter or in the Cloud, which is then available for planned or unplanned outages of your primary GitHub Enterprise instance. We strongly recommend planned failovers when possible, as any synchronization delay between the two instances at the time of an unplanned failover translates into data loss.

 

What High-Availability replication can do:

  • Increase availability by storing a redundant copy in a different location.
  • Help with datacenter cutovers or hardware migrations when the configuration doesn't change.

What it can't do:

  • Create staging servers to test upgrades or configuration changes. Please refer to the GitHub Enterprise Backup Utilities instead.
  • Scale out. Please refer to our Geo-Replication feature instead.
  • Fail over automatically. While technically possible, the potential data loss when promoting an outdated replica negates any advantage of automated failovers.
  • Provide full instance backups. Please refer to the GitHub Enterprise Backup Utilities instead.
  • Provide zero downtime upgrades. To prevent data divergence between the instances, they both need to be upgraded during the same maintenance window, with no user access during the upgrade process.

 

GitHub Enterprise Backup Utilities

 

The GitHub Enterprise Backup Utilities are a suite of tools developed by GitHub and made publicly available to take application-aware, consistent snapshots of your GitHub Enterprise instance. The GitHub Enterprise Backup Utilities also include deduplication features, which give you faster backup times and lower total space usage for later snapshots while preserving the integrity of each individual snapshot. This means that while the space used on disk is lower, each snapshot contains a full backup of your instance and can be used to restore a copy of your data.

 

They can be used either to restore a copy of your GitHub Enterprise instance to a staging server, to test upgrades and features, or as a main part of your disaster recovery strategy to produce full copies, down to the configuration settings, of your production instance. They can also be used to supplement the recovery features of your GitHub Enterprise instance and protect against accidental or malicious data loss.

 

As stated earlier, the GitHub Enterprise Backup Utilities are fully application-aware, and are strongly recommended over hypervisor level snapshots. Hypervisor level snapshots can lead to data corruption, reduced performance on your instance, and in more critical cases, full outage of your GitHub Enterprise instance.

 

Configuring your High-Availability replica

 

The High-Availability replica is designed to act as a primary after a controlled failover, and as such, needs to be provisioned with the same amount of memory, CPU cores, and storage as the primary GitHub Enterprise instance. As all data written to the primary will be replicated to your High-Availability instance, you need to ensure a high-capacity and, when possible, low-latency link is used between the two instances. This keeps replication lag to a minimum and limits data loss when failing over. We strongly recommend storing the replica instance on a different storage and/or hypervisor subsystem, or even in a different datacenter, to increase availability.

 

Not sure if your replica is up to date? You can monitor the replication status with ghe-repl-status. The ghe-repl-status command returns Nagios-compatible exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL), so you can run it at regular intervals with check_by_ssh or a similar tool to keep an eye on your replica. We know that granting SSH access to your monitoring system can be scary, so you can restrict what that key is allowed to do on your GitHub Enterprise instance by prefixing the public SSH key with the allowed command when you add it via the Management Console:

 

command="/usr/local/bin/ghe-repl-status" ssh-rsa AAAA....
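As a rough illustration of the monitoring idea above, here is a minimal sketch of a check script. The function names and the host argument are hypothetical, and the sketch assumes the replica's administrative shell is reachable as the admin user over port 122. Only the exit codes 0, 1, and 2 come from the text above; anything else (for example an SSH connection failure) is reported as UNKNOWN.

```shell
#!/usr/bin/env bash

# Map a ghe-repl-status exit code to its Nagios-style label.
# 0, 1 and 2 are the codes described above; other values are
# treated as UNKNOWN (an assumption of this sketch).
nagios_label() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Hypothetical helper: run ghe-repl-status on the replica over SSH
# and print a one-line summary. The host is passed as the first argument.
check_replica() {
  local host="$1" code=0
  ssh -p 122 "admin@${host}" -- ghe-repl-status >/dev/null 2>&1 || code=$?
  echo "replication $(nagios_label "$code")"
  return "$code"
}
```

You could then call check_replica with the hostname of your replica from your monitoring system. If the key is restricted with the forced command shown above, the remote side runs ghe-repl-status regardless of the command the client requests.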

 

In the event of a catastrophic failure of the primary instance, a well maintained replica GitHub Enterprise instance will lessen the overall disruption to services, and reduce the recovery and restoration efforts required to restore users' access to GitHub Enterprise. Monitoring the replication status should be an integral part of your Disaster Recovery strategy, and will ensure minimum or no data loss when failing over to the replica.

 

Configuring the GitHub Enterprise Backup Utilities

 

The GitHub Enterprise Backup Utilities cannot be installed on your GitHub Enterprise instance. Doing so would defeat the purpose of taking backups, as well as potentially impact the performance of your GitHub Enterprise instance. A Linux server with a recent version of Git and rsync is required, as well as a filesystem that supports both symbolic and hard links.

 

This server will need a reliable and fast link to your production GitHub Enterprise instance, with access over port 122 allo.... No other ports are necessary. You will need to reserve at least 5 times the storage allocated to your GitHub Enterprise instance, and more if you would like to keep additional backup snapshots. We also strongly recommend fast storage to avoid bottlenecks during the data transfer phase.
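To make the sizing rule concrete, here is a small sketch of the arithmetic. The function name is hypothetical, and the factor of 5 is the minimum stated above, so treat the result as a floor rather than a recommendation.

```shell
#!/usr/bin/env bash

# Minimum storage to reserve on the backup host, in GB, using the
# "at least 5 times the instance storage" rule from the text.
min_backup_storage_gb() {
  local instance_gb="$1"
  echo $(( instance_gb * 5 ))
}
```

For example, min_backup_storage_gb 500 prints 2500, so an instance with 500 GB of allocated storage calls for at least 2500 GB on the backup host, before accounting for any extra snapshots you want to retain.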

 

Once your server is ready, you only need to clone the GitHub Enterprise Backup Utilities repository locally, then edit backup.config to include .... Finally, generate a new SSH key, add it to your GitHub Enterprise instance, and you can start taking backups of your instance immediately.

 

You can monitor the progress of your backups as you start a backup task manually, or send the output to a log file, for instance for scheduled backups:

 

ghe-backup -v 1>>/opt/backup-utils/backup.log 2>&1
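For scheduled runs, it can help to record when each backup started and how it ended. The following is a minimal sketch of a wrapper that does this. The function name is hypothetical, and the command to run is passed in as arguments, so nothing here is specific to ghe-backup beyond the usage shown afterwards.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: run a command, append its output to a log file,
# and write start and finish markers that include the exit status.
run_logged_backup() {
  local log_file="$1"
  shift
  local status=0
  printf '%s backup started\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >>"$log_file"
  # Capture both stdout and stderr, as in the redirection shown above.
  "$@" >>"$log_file" 2>&1 || status=$?
  printf '%s backup finished with exit status %d\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" >>"$log_file"
  return "$status"
}
```

Assuming the utilities are cloned under /opt/backup-utils, a scheduled job could call run_logged_backup /opt/backup-utils/backup.log /opt/backup-utils/bin/ghe-backup -v, and your scheduler can then alert on a non-zero exit status.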

 

Benchmarks for each backup run are also stored in the benchmarks folder for every snapshot. This can help you schedule your backups and identify possible bottlenecks.

 

As mentioned earlier, each individual snapshot is both an incremental copy and a full copy of your instance. In practice, this means you can take any successful snapshot and copy it to a remote location for purposes such as backup redundancy, data archiving, or overlapping backup schedules.
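As one possible way to produce such a copy, here is a sketch that packs the most recent successful snapshot into a tarball you can ship elsewhere. It assumes the backup data directory contains a symbolic link named current pointing at the latest successful snapshot directory. The function name and both directory arguments are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: archive the snapshot that the "current" symlink
# points to (assumed layout) into DEST_DIR and print the archive path.
archive_current_snapshot() {
  local data_dir="$1" dest_dir="$2"
  local snapshot_id
  snapshot_id="$(readlink "$data_dir/current")" || return 1
  snapshot_id="$(basename "$snapshot_id")"
  # Archiving a single snapshot directory stores full file contents,
  # so the tarball is usable without the other snapshots.
  tar -C "$data_dir" -czf "$dest_dir/$snapshot_id.tar.gz" "$snapshot_id" || return 1
  printf '%s\n' "$dest_dir/$snapshot_id.tar.gz"
}
```

The resulting archive can then be copied to remote storage with whatever transfer tool you already use. Compressing a large snapshot takes time and space, so for very large instances copying the snapshot directory directly may be the better choice.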

 

Finally, be sure to test your backup restores often! For instance, you can deploy a blank GitHub Enterprise instance and select Migrate. You can then restore the latest backup snapshot with ghe-restore -v [IP]. Is your staging instance already configured, and you want a copy of the latest settings? Act with caution, and use ghe-restore -v -c [IP] to restore the configuration of your instance alongside your data.

 

Conclusion

 

The GitHub Enterprise Backup Utilities and High-Availability features complement each other and will help you be ready for any disaster coming your way, planned or unplanned. Not sure how to implement them? Feel free to comment here, or reach out to us via the Support Portal for any issue you may have.

2 Comments
Ground Controller Lvl 1

I'm running GitHub Enterprise on AWS. What AWS service would you recommend for the backup host? EC2 + RDS? Glacier? Something else? Are there any AWS architectural best practices to consider when setting up GitHub Enterprise Backup Utilities? Thank you.

GitHub Staff

Hi @clare-aire!

 

We recommend fast storage for backups to ensure they complete quickly, so Glacier is out of the question unless you use it to store older snapshots. You can find more about our requirements in the GitHub Enterprise Backup Utilities repository.

 

If you have specific questions which are not answered there, you can also open a ticket with GitHub Enterprise support.

 

Ta,

François