Need your expert opinion here. We are running a self-hosted GitHub Enterprise Server and are planning to ingest the audit/system logs into a log aggregator for monitoring purposes.
Based on the GitHub documentation, the recommended approach is to forward the logs in syslog format.
As the audit logs contain multiple categories, what would be the best way to parse these events? Is there a table schema or a "GitHub-like connector" that we can use to parse these syslog events?
Or is there a better way of transporting these logs to an external SIEM solution, preferably in JSON format?
Hi @kraken8585, in case it's of interest to you as an option, the audit log streaming feature (as seen on Cloud) is due to be delivered to GHES (self-hosted) this quarter.
This will be JSON, and perhaps a more favourable choice for other reasons as well.
Audit log streaming (Server) #344
Customers will be able to stream to Splunk, Azure Event Hub, Amazon S3 and Google Cloud Storage.
If you look at the GHEC documentation, I expect it will give you a feel for what will be delivered to GHES.
Thanks @byrneh - Until this is released, I'm planning to ingest the audit logs via the REST API.
However, there are a couple of roadblocks that I'm facing:

1. There is no cursor or pagination info available in the JSON output. This info is necessary to set up an automation that ingests only the latest logs from the server, based on the cursor/pagination state that I store during each ingestion. That would let me skip data that has already been ingested on my next API call.
2. I tried using the "after"/"before" query parameters, but I'm still getting all the logs instead of only the logs matching the date/time condition set in the query. Any idea what the time format of the "after"/"before" parameters is? I tried both epoch and YYYY-MM-DDTHH:MM:SSZ, but neither worked.
3. Is there a unique ID that represents each log entry?

If the above is not the best method, what alternative would you suggest for ingesting these logs until audit log streaming is released?
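For what it's worth, on the GitHub Cloud audit log API the `after`/`before` parameters are documented as opaque pagination cursors returned in the `Link` response header, not timestamps, which may be why date/time values are silently ignored; date filtering is done through the `phrase` search parameter instead (e.g. `created:>=2022-01-01`). Assuming GHES behaves the same way, a polling loop could resume from a stored cursor roughly like this sketch. The hostname, enterprise slug, endpoint path, and token below are placeholders, so please verify them against your GHES version's API docs:

```python
# Hedged sketch: resume audit log ingestion from a stored `after` cursor.
# Assumes the GHES endpoint mirrors the GHEC audit log API; the URL, slug,
# and token below are hypothetical placeholders.
import json
import re
import urllib.request

API_URL = "https://ghes.example.com/api/v3/enterprises/my-enterprise/audit-log"
TOKEN = "ghp_placeholder_token"  # hypothetical PAT with audit log scope


def extract_after_cursor(link_header):
    """Pull the `after` cursor out of the Link header's rel="next" entry.

    GitHub returns pagination cursors in the Link response header rather
    than in the JSON body, which is why the body appears to lack them.
    """
    for part in link_header.split(","):
        if 'rel="next"' in part:
            match = re.search(r"[?&]after=([^&>]+)", part)
            if match:
                return match.group(1)
    return None  # no next page


def fetch_page(after=None):
    """Fetch one page of audit log events; return (events, next_cursor).

    Persist next_cursor between runs and pass it back in as `after` to
    skip events that were already ingested.
    """
    url = API_URL + "?per_page=100"
    if after:
        url += "&after=" + after
    req = urllib.request.Request(url, headers={
        "Authorization": "token " + TOKEN,
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(req) as resp:
        events = json.load(resp)
        next_cursor = extract_after_cursor(resp.headers.get("Link", ""))
    return events, next_cursor
```

On the unique-ID question: each event carries an `@timestamp` plus fields like actor and action, and the cursor itself can serve as a de-duplication watermark, but I haven't seen a documented per-event GUID on GHES, so treat that as an open question.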