Aggregating S3 object sizes and counts by file extension
Solution 1:[1]
You will have to use an AWS SDK for that: in your favorite language, write a script that lists the bucket's objects recursively, filters them down to the file formats you want,
and then exports the list as CSV or JSON, whichever you find more readable.
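A minimal sketch of that SDK approach in Python with boto3. The function and variable names (`matching_objects`, `export_csv`) are illustrative, not part of any library, and the bucket name and extension list are placeholders you would substitute.

```python
import csv
import os


def matching_objects(objects, extensions):
    """Keep only objects whose key ends with one of the given extensions."""
    return [
        o for o in objects
        if os.path.splitext(o["Key"])[1].lstrip(".") in extensions
    ]


def export_csv(bucket, extensions, out_path):
    """List every object in the bucket and export the matching ones as CSV."""
    import boto3  # AWS SDK for Python; only needed when S3 is actually queried

    s3 = boto3.client("s3")
    objects = []
    # Paginate, since list_objects_v2 returns at most 1000 keys per call.
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        objects.extend(page.get("Contents", []))
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["key", "size"])
        for obj in matching_objects(objects, extensions):
            writer.writerow([obj["Key"], obj["Size"]])
```

Something like `export_csv("mybucket", {"png", "csv"}, "objects.csv")` would then write the filtered listing to a file.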
Solution 2:[2]
Here's an idea for how to solve this with the awscli and a couple of other command-line tools (grep and awk, freely available on macOS and Linux).
aws s3 ls s3://mybucket --recursive \
| grep -v -E '^.+/$' \
| awk '{na=split($NF, a, "."); tot[a[na]] += $3; num[a[na]]++;} END {for (e in tot) printf "%15d %6d %s\n", tot[e], num[e], e};'
Step by step, aws s3 ls s3://mybucket --recursive results in output like this:
2021-11-24 12:45:39 57600 cat.png
2021-09-29 13:15:48 93651 dog.png
2021-09-29 14:16:06 1448 names.csv
2021-02-15 15:09:56 0 pets/
2021-02-15 15:09:56 135 pets/pets.json
Piping that through grep -v -E '^.+/$' removes the folders, and the result looks like this:
2021-11-24 12:45:39 57600 cat.png
2021-09-29 13:15:48 93651 dog.png
2021-09-29 14:16:06 1448 names.csv
2021-02-15 15:09:56 135 pets/pets.json
Finally, the AWK script is run for each line. It splits the last word of each line on the period character (split($NF, a, ".")) to work out the file extension (stored in a[na]). It then aggregates the file size by extension in tot[extension] and the file count by extension in num[extension]. At the end it prints the aggregated file size and file count per extension, which looks something like this:
151251 2 png
1448 1 csv
135 1 json
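If you'd rather not decode the awk one-liner, the same aggregation is straightforward in plain Python. This sketch consumes the `aws s3 ls --recursive` output as text; `totals_by_extension` is an illustrative name, and like the awk version it assumes keys contain no spaces.

```python
from collections import defaultdict


def totals_by_extension(listing):
    """Aggregate size and count per extension from `aws s3 ls --recursive` output."""
    tot = defaultdict(int)   # total bytes per extension
    num = defaultdict(int)   # object count per extension
    for line in listing.splitlines():
        fields = line.split()
        if not fields or fields[-1].endswith("/"):
            continue  # skip blank lines and folder placeholders
        size, key = int(fields[2]), fields[-1]
        # Text after the last dot, mirroring awk's split(): a dotless
        # filename counts as its own "extension", just as in the one-liner.
        ext = key.rsplit(".", 1)[-1]
        tot[ext] += size
        num[ext] += 1
    return {ext: (tot[ext], num[ext]) for ext in tot}
```

Fed the sample listing above, it returns `png` mapped to `(151251, 2)`, `csv` to `(1448, 1)`, and `json` to `(135, 1)`, matching the awk output.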
You could also solve this fairly simply e.g. in Python using the boto3 SDK.
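A sketch of that boto3 route, assuming a placeholder bucket name; `aggregate` and `aggregate_bucket` are illustrative names. Working from the API's `Key`/`Size` fields avoids parsing text output entirely.

```python
from collections import defaultdict


def aggregate(objects):
    """Sum sizes and counts per extension from list_objects_v2 'Contents' entries."""
    stats = defaultdict(lambda: [0, 0])  # extension -> [total bytes, count]
    for obj in objects:
        key = obj["Key"]
        if key.endswith("/"):
            continue  # folder placeholder object
        ext = key.rsplit(".", 1)[-1]
        stats[ext][0] += obj["Size"]
        stats[ext][1] += 1
    return dict(stats)


def aggregate_bucket(bucket):
    """Fetch every object in the bucket and aggregate by extension."""
    import boto3  # AWS SDK for Python; only imported when S3 is actually queried

    s3 = boto3.client("s3")
    objects = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        objects.extend(page.get("Contents", []))
    return aggregate(objects)
```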
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | priyanshu kumar |
| Solution 2 | |

