Parsing a JSON file from an S3 Bucket
I am relatively new to Amazon Web Services.
I need help parsing a JSON file from an S3 bucket using Python. I was able to read the JSON file from S3 using the S3 trigger connected to the Lambda function and display it in CloudWatch as well. I need help parsing the "Result" values from the JSON file and calculating their max, min and average.
Here is my JSON file:
Student = [{"Student_ID": 1,
"Name":"Erik",
"ExamSubject": "English",
"Result": 72.3,
"ExamDate": "9/12/2020",
"Sex": "M"},
{"Student_ID": 2,
"Name":"Daniel",
"ExamSubject": "English",
"Result": 71,
"ExamDate": "9/12/2020",
"Sex": "M"},
{"Student_ID": 3,
"Name":"Michael",
"ExamSubject": "English",
"Result": 62,
"ExamDate": "9/12/2020",
"Sex": "M"},
{"Student_ID": 4,
"Name":"Sven",
"ExamSubject": "English",
"Result": 73,
"ExamDate": "9/12/2020",
"Sex": "M"},
{"Student_ID": 5,
"Name":"Jake",
"ExamSubject": "English",
"Result": 84.15,
"ExamDate": "9/12/2020",
"Sex": "M"},
]
print(Student)
and here is the code I have used on the lambda function so far:
import json
import boto3
s3 = boto3.client('s3')
def lambda_handler(event, context):
    bucket = 'finalyearpro-aws'
    key = 'StudentResults.json'
    try:
        data = s3.get_object(Bucket=bucket, Key=key)
        json_data = data['Body'].read().decode('utf-8')
        print(json_data)
    except Exception as e:
        raise e
How do I extend this code so that it reads the "Result" values from the JSON file, analyses them (max, min, average) and displays the output in the Lambda console?
Solution 1:[1]
You can load the document using boto3.resource('s3').Object(...).get() and then parse it into Python objects with json.loads():
import json
import boto3
s3 = boto3.resource('s3')
def lambda_handler(event, context):
    bucket = 'finalyearpro-aws'
    key = 'StudentResults.json'
    obj = s3.Object(bucket, key)
    data = obj.get()['Body'].read().decode('utf-8')
    json_data = json.loads(data)
    print(json_data)
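Since the question mentions that the function is invoked by an S3 trigger, the bucket and key could probably be taken from the event instead of being hardcoded. The following is a minimal sketch, assuming the standard S3 notification layout (`Records[0].s3.bucket.name` and `Records[0].s3.object.key`); the helper name `bucket_and_key` and the sample event are hypothetical and only for illustration. Object keys arrive URL-encoded in S3 events, so the key is decoded before use.

```python
from urllib.parse import unquote_plus


def bucket_and_key(event):
    """Extract (bucket, key) from the first record of an S3 trigger event.

    Hypothetical helper; assumes the usual S3 notification structure.
    """
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Keys in S3 events are URL-encoded (spaces arrive as '+').
    key = unquote_plus(record["s3"]["object"]["key"])
    return bucket, key


# Hand-written sample event that mimics the shape of a real one.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "finalyearpro-aws"},
                "object": {"key": "StudentResults.json"},
            }
        }
    ]
}

print(bucket_and_key(sample_event))
```

Inside lambda_handler, the returned pair could then be passed to s3.Object(bucket, key) in place of the hardcoded values.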
Solution 2:[2]
json.loads(json_data) parses the JSON string and, for this data, returns a list of dicts. After that you can iterate over the list and compute whatever you need, e.g.
data = json.loads(json_data)
min([r['Result'] for r in data])
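To cover the max, min and average asked about in the question, the same idea can be extended as in the sketch below. It assumes the file contains only a JSON array of records; the file as posted begins with `Student =`, has a trailing comma before `]` and ends with `print(Student)`, all of which json.loads would reject, so those parts would need to be removed first. The function name `summarize_results` and the shortened sample string are hypothetical.

```python
import json


def summarize_results(json_text):
    """Return the max, min and average of the "Result" field.

    Hypothetical helper; expects json_text to be a JSON array of objects,
    each with a numeric "Result" key.
    """
    records = json.loads(json_text)
    results = [record["Result"] for record in records]
    return {
        "max": max(results),
        "min": min(results),
        "average": sum(results) / len(results),
    }


# Shortened sample based on the first three records in the question.
sample = (
    '[{"Student_ID": 1, "Result": 72.3},'
    ' {"Student_ID": 2, "Result": 71},'
    ' {"Student_ID": 3, "Result": 62}]'
)

summary = summarize_results(sample)
print(summary)
```

In the Lambda function, calling print(summarize_results(json_data)) on the decoded string should write the summary to the function's log output.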
Solution 3:[3]
boto3 has switched to a new resource format (see https://github.com/boto/boto3/issues/56). If you get the error 'S3' object has no attribute 'Object', try the following:
import boto3
import json
s3 = boto3.resource('s3')
obj = s3.Bucket('bucket-name').Object('object-key')
jsonStr = obj.get()['Body'].read().decode('utf-8')
jsonObj = json.loads(jsonStr)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Yaroslav Fyodorov |
| Solution 3 | Jason |
