Build Streaming Data Architecture on AWS
I would like to design a streaming data architecture for my company's project.
Basically, we receive records containing user information, and we need to use this information to calculate some metrics in real time and then send these metrics to the marketing department for further analysis.
Currently, we have a backend team that receives these records, transforms them into useful data, and stores them in an S3 bucket. We also have a Lambda function that is triggered by this S3 bucket; inside this Lambda function we calculate the metrics and store them in RDS.
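To make the current setup concrete, here is a minimal sketch of an S3-triggered handler of the kind described above. It is illustrative only, not the actual production code: the JSON-lines file layout, the `user_id` field, `compute_metrics`, and the injected `read_object` / `save_metrics` callables are all assumptions made for the example.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def compute_metrics(rows):
    """Hypothetical metrics: total rows and number of distinct users."""
    return {
        "record_count": len(rows),
        "distinct_users": len({row["user_id"] for row in rows}),
    }


def process_event(event, read_object, save_metrics):
    """Compute metrics for every object named in the event.

    read_object(bucket, key) -> str and save_metrics(key, metrics) are
    injected so the sketch runs without AWS; in Lambda they would wrap
    an S3 read and an RDS write respectively.
    """
    for bucket, key in extract_s3_objects(event):
        body = read_object(bucket, key)
        rows = [json.loads(line) for line in body.splitlines() if line.strip()]
        save_metrics(key, compute_metrics(rows))
```

All of the work for one uploaded object happens inside a single invocation, which is why a large burst of records runs into the per-invocation limits.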
Normally, we receive about 6k-7k records per day, and this architecture works fine. The problem is that it is not easy to scale up because of the time and memory limits on Lambda functions. On some days we receive a lot of records in a short time, and the Lambda function is unable to process them within 15 minutes (the maximum execution time of a Lambda function).
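A rough way to see when a burst stops fitting is a back-of-the-envelope estimate like the sketch below. The 900-second figure is the Lambda maximum timeout; the per-record processing time is a made-up placeholder, since the real cost per record has not been measured here.

```python
import math

# 15 minutes: the maximum timeout of a single Lambda invocation.
LAMBDA_MAX_TIMEOUT_S = 900


def max_records_per_invocation(seconds_per_record):
    """How many records one invocation can process before timing out."""
    if seconds_per_record <= 0:
        raise ValueError("seconds_per_record must be positive")
    return math.floor(LAMBDA_MAX_TIMEOUT_S / seconds_per_record)


def invocations_needed(n_records, seconds_per_record):
    """How many separate invocations a burst of n_records would need."""
    per_call = max_records_per_invocation(seconds_per_record)
    if per_call < 1:
        raise ValueError("a single record does not fit in one invocation")
    return math.ceil(n_records / per_call)


# Assumed cost of 0.5 s per record (hypothetical, for illustration only).
print(max_records_per_invocation(0.5))
print(invocations_needed(7000, 0.5))
```

If a whole burst lands in one S3 object, it is handled by one invocation, so anything above the first number fails unless the work is split up.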
I have been looking at other architectures to solve this problem. One alternative is to use Spark instead of the Lambda function, because Spark has no such execution time limit and is easy to scale. But I am wondering whether Spark will work here and how much it would cost. If not, can you suggest other architectures?
Any help will be appreciated.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
