Does Amazon S3 Manifest File Support Parquet Format?

According to this AWS documentation, the Amazon S3 manifest file does not appear to support the Parquet format. I find this hard to believe, because Parquet is a very common file format and, from what I understand, it is the format you are supposed to use with Athena and Redshift. Here's another piece of documentation that mentions the S3 manifest file in relation to Redshift and the Parquet format, but I'm not sure exactly what it means: https://docs.aws.amazon.com/redshift/latest/dg/loading-data-files-using-manifest.html

I'm just trying to create a data set in Amazon QuickSight from some Parquet files in one of my S3 buckets. I tried omitting the globalUploadSettings field in my manifest and was able to pull the data in, but QuickSight doesn't know what type of file it is, so it just displays the contents as a bunch of � characters.

The manifest I currently have:

{
    "fileLocations": [
        {
            "URIPrefixes": [
                "https://s3.amazonaws.com/myBucket/myFolderWithData/"
            ]
        }
    ]
}


Solution 1:[1]

The S3 manifest file that QuickSight reads does not support the Parquet format, but you can use Athena as the data source for the data set instead, since Athena can read Parquet files.
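As a rough illustration of the Athena route mentioned above, the sketch below renders a CREATE EXTERNAL TABLE statement that points Athena at a folder of Parquet files. The helper name athena_parquet_ddl, the database and table names, and the columns are all hypothetical; only the S3 folder comes from the question. The real column list has to match the schema of your Parquet files.

```python
def athena_parquet_ddl(database, table, columns, s3_location):
    """Render an Athena CREATE EXTERNAL TABLE statement for Parquet files.

    `columns` maps column names to Athena/Hive types. The names used in the
    usage example are placeholders, not taken from the question.
    """
    if not s3_location.startswith("s3://") or not s3_location.endswith("/"):
        raise ValueError("LOCATION must be an s3:// folder URI ending in '/'")
    column_sql = ",\n".join(
        f"  `{name}` {dtype}" for name, dtype in columns.items()
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{column_sql}\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}'"
    )


if __name__ == "__main__":
    # Hypothetical database, table, and columns; adjust to your own data.
    print(athena_parquet_ddl(
        "my_database",
        "my_parquet_table",
        {"id": "bigint", "name": "string"},
        "s3://myBucket/myFolderWithData/",
    ))
```

Once a table like this exists in Athena, it can be selected as an Athena data source when creating the QuickSight data set, so no S3 manifest is involved.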

Importing File Data

You can use files in Amazon S3 or on your local (on-premises) network as data sources. QuickSight supports files in the following formats:

CSV and TSV – Comma-delimited and tab-delimited text files

ELF and CLF – Extended and common log format files

JSON – Flat or semistructured data files

XLSX – Microsoft Excel files

QuickSight supports UTF-8 file encoding, but not UTF-8 (with BOM).

Files in Amazon S3 that have been compressed with zip or gzip (www.gzip.org) can be imported as-is. If you used another compression program for files in Amazon S3, or if the files are on your local network, remove compression before importing them.

https://docs.aws.amazon.com/quicksight/latest/user/supported-data-sources.html
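To make the limitation concrete, here is a minimal sketch that builds a QuickSight manifest with a globalUploadSettings block and rejects formats the manifest cannot declare. The function name build_quicksight_manifest and the SUPPORTED_FORMATS set are my own; the set reflects the format values I understand the QuickSight manifest documentation to accept (CSV, TSV, CLF, ELF, JSON), so check it against the current docs before relying on it.

```python
import json

# Assumed list of values accepted by globalUploadSettings.format.
# Parquet is not among them, which is why the manifest in the question
# cannot tell QuickSight how to decode the files.
SUPPORTED_FORMATS = {"CSV", "TSV", "CLF", "ELF", "JSON"}


def build_quicksight_manifest(uri_prefixes, file_format="CSV", contains_header=True):
    """Return a QuickSight S3 manifest as a dict, or raise for unsupported formats."""
    fmt = file_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"{file_format!r} cannot be declared in a QuickSight manifest; "
            f"expected one of {sorted(SUPPORTED_FORMATS)}"
        )
    return {
        "fileLocations": [{"URIPrefixes": list(uri_prefixes)}],
        "globalUploadSettings": {
            "format": fmt,
            # The documented examples write this flag as a string.
            "containsHeader": "true" if contains_header else "false",
        },
    }


if __name__ == "__main__":
    prefix = "https://s3.amazonaws.com/myBucket/myFolderWithData/"
    print(json.dumps(build_quicksight_manifest([prefix], "CSV"), indent=4))
    try:
        build_quicksight_manifest([prefix], "PARQUET")
    except ValueError as err:
        print(err)
```

A manifest like this only helps if the files under the prefix really are in one of those text formats; it does not convert Parquet data.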

Solution 2:[2]

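For Parquet files, the S3 manifest file also needs to give the content length of each file, in a meta.content_length field. Note that this is the manifest format used by Redshift, as described at the link below, which is different from the QuickSight manifest in the question.

Link: https://docs.aws.amazon.com/redshift/latest/dg/loading-data-files-using-manifest.html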

S3 manifest file example for parquet format:

{
  "entries": [
    {"url":"s3://mybucket/unload/manifest_0000_part_00", "meta": { "content_length": 5956875 }},
    {"url":"s3://mybucket/unload/unload/manifest_0001_part_00", "meta": { "content_length": 5997091 }}
  ]
}
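A manifest of this shape can be generated rather than written by hand. The sketch below assumes you already have each object's S3 URL and size in bytes (for example from an S3 listing); the function name build_redshift_manifest is hypothetical, and the URLs and sizes in the usage example are the ones from the manifest above.

```python
import json


def build_redshift_manifest(objects):
    """Build a Redshift-style manifest from (s3_url, size_in_bytes) pairs.

    Each entry carries meta.content_length, as in the example manifest
    for Parquet files.
    """
    entries = []
    for url, size in objects:
        if not url.startswith("s3://"):
            raise ValueError(f"expected an s3:// URL, got {url!r}")
        if size < 0:
            raise ValueError(f"negative size for {url!r}")
        entries.append({"url": url, "meta": {"content_length": int(size)}})
    return {"entries": entries}


if __name__ == "__main__":
    manifest = build_redshift_manifest([
        ("s3://mybucket/unload/manifest_0000_part_00", 5956875),
        ("s3://mybucket/unload/unload/manifest_0001_part_00", 5997091),
    ])
    print(json.dumps(manifest, indent=2))
```

The content_length values must match the actual object sizes, so it is safer to read them from S3 than to type them in.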

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Slavik Muz
Solution 2 Sowmya