Storing HTTP Request/Response Logs

I have an API that handles around 20 requests/second, and I also supply webhooks. I need to implement content logging (request and response bodies) for both. The problem is that the logs are currently stored in MariaDB, which is reaching its limits now that the table has over 500 million rows.

An example of what I want is what GitHub does with its webhook logs, or what Stripe does with their logs.

I'm very curious about how they achieve this with the low latency needed to fetch an individual request. Can anyone push me in the right direction on which data store can handle these volumes, and which query engine is best suited for this?

My first thought was to push the requests, responses, and webhooks into Kafka and then store them somewhere. For the format, I'm choosing between JSON and canonical log lines, such as what Splunk does, but I haven't decided yet.
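
To make the format question concrete, here is a rough sketch of the same event written both ways. The field names (`request_id`, `duration_ms`, and so on) and both helper functions are made up for illustration, and the key=value layout is just one common reading of "canonical log line", not Splunk's or anyone else's actual format.

```python
import json


def to_json_line(event: dict) -> str:
    """Serialize an event as one compact JSON object per line."""
    return json.dumps(event, separators=(",", ":"), sort_keys=True)


def to_canonical_line(event: dict) -> str:
    """Serialize an event as space-separated key=value pairs.

    Values containing spaces, quotes, or '=' are JSON-quoted so the
    line can still be split unambiguously.
    """
    parts = []
    for key in sorted(event):
        value = str(event[key])
        if " " in value or '"' in value or "=" in value:
            value = json.dumps(value)
        parts.append(f"{key}={value}")
    return " ".join(parts)


# Hypothetical event; a real one would also carry the bodies.
event = {
    "request_id": "req_123",
    "method": "POST",
    "path": "/v1/webhooks",
    "status": 200,
    "duration_ms": 42,
}

print(to_json_line(event))
print(to_canonical_line(event))
```

The key=value form is easy to read and grep for flat fields, but nested request/response bodies would have to be embedded as a quoted string, which is one reason JSON may be the simpler fit for body logging.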

My current thoughts:

  1. HTTP Request -> My Application -> Kafka or AWS Kinesis -> S3? <- Presto or AWS Athena
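
As a rough sketch of the "-> S3" step in that pipeline: whatever consumes the stream could batch events into gzip-compressed, newline-delimited JSON objects under Hive-style `dt=`/`hour=` prefixes, a layout that Presto and Athena can use to skip partitions outside the queried time range. The `http-logs` prefix, the batch ID scheme, and the function names are assumptions for illustration; the actual upload (for example via an S3 client) is left out so the code runs with the standard library only.

```python
import gzip
import json
from datetime import datetime, timezone


def partition_key(ts: datetime, batch_id: str, prefix: str = "http-logs") -> str:
    """Build a Hive-style, time-partitioned object key (UTC)."""
    ts = ts.astimezone(timezone.utc)
    return f"{prefix}/dt={ts:%Y-%m-%d}/hour={ts:%H}/{batch_id}.json.gz"


def encode_batch(events: list) -> bytes:
    """Encode events as gzip-compressed newline-delimited JSON."""
    body = "\n".join(json.dumps(e, separators=(",", ":")) for e in events) + "\n"
    return gzip.compress(body.encode("utf-8"))


def decode_batch(blob: bytes) -> list:
    """Inverse of encode_batch, useful for spot checks."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


# Hypothetical batch of two logged exchanges.
events = [
    {"request_id": "req_1", "status": 200, "response_body": "{\"ok\":true}"},
    {"request_id": "req_2", "status": 500, "response_body": "error"},
]
key = partition_key(datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc), "batch-0001")
blob = encode_batch(events)
```

Note that this layout is aimed at time-range scans. Looking up a single request by ID with low latency, as in the GitHub and Stripe UIs, would likely need an additional index or a different store.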

Can someone point me in the right direction?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
