Connect timeout from Presto / Trino to Amazon S3
I currently have a Kubernetes setup outside of AWS in which a data lake residing in Amazon S3 is queried using Presto v348. The data is stored in Parquet format. A Hive metastore is the only additional component.
I encounter the following error and am at a loss as to how to troubleshoot the underlying issue:
io.prestosql.spi.PrestoException: Unable to execute HTTP request: Connect to s3-eu-central-1.amazonaws.com:80 [s3-eu-central-1.amazonaws.com] failed: connect timed out
This issue sometimes arises with bigger queries and, interestingly, puts the system into a state where all subsequent queries time out. In some cases roughly one in five attempts still succeeds. Smaller queries generally work fine. The situation improves after about 10-20 minutes. Restarting Presto does not shorten this 10-20 minute window, so I suspect the cause lies somewhere other than Presto itself.
I am aware that I might be hitting a performance ceiling, but it is not acceptable that queries simply time out instead of returning an error and that the whole system is unusable for 10-20 minutes.
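One way to check whether the timeouts come from the network path rather than from Presto would be a plain TCP probe run from inside a worker pod while the system is in the bad state. The following is only a minimal sketch under that assumption: `probe` and `failure_rate` are hypothetical helper names, not part of the setup described here, and the host and port are the ones from the error message.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt a single TCP connect; return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # OSError covers connect timeouts, refused connections and DNS failures.
        return False, time.monotonic() - start


def failure_rate(host, port, attempts=20, timeout=5.0, pause=0.0):
    """Run several probes in a row and return the fraction that failed."""
    failures = 0
    for _ in range(attempts):
        ok, _elapsed = probe(host, port, timeout)
        if not ok:
            failures += 1
        time.sleep(pause)
    return failures / attempts
```

For example, calling `failure_rate("s3-eu-central-1.amazonaws.com", 80, attempts=20, pause=0.5)` during the 10-20 minute window and again afterwards would show whether plain connects fail at a similar rate to the queries. If they do, the limit is more likely somewhere between the cluster and S3 than in the Presto or metastore configuration.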
I have already increased settings such as hive.s3.max-connections in Presto and fs.s3a.connection.maximum in the metastore config, but this doesn't seem to solve the problem. Beyond these, I have found no suggestions on how to tweak the setup to prevent the error from happening.
Presto connector config:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hive-metastore:9083
hive.metastore.username=prestodb
hive.s3.aws-access-key="S3_ACCESS_KEY"
hive.s3.aws-secret-key="S3_SECRET_KEY"
hive.s3.endpoint=s3-eu-central-1.amazonaws.com
hive.s3.ssl.enabled=false
hive.s3.path-style-access=true
hive.parquet.use-column-names=true
hive.allow-drop-table=true
hive.s3-file-system-type=PRESTO
hive.s3.max-connections=50000
hive.s3select-pushdown.max-connections=50000
hive.s3.connect-timeout=60s
hive.allow-rename-column=true
Metastore config:
core-site.xml: |
  <configuration>
    <property>
      <name>fs.s3a.connection.ssl.enabled</name>
      <value>false</value>
    </property>
    <property>
      <name>fs.s3a.access.key</name>
      <value>xxx</value>
    </property>
    <property>
      <name>fs.s3a.secret.key</name>
      <value>xxx</value>
    </property>
    <property>
      <name>fs.s3a.fast.upload</name>
      <value>true</value>
    </property>
    <property>
      <name>fs.s3a.connection.maximum</name>
      <value>50000</value>
    </property>
    <property>
      <name>fs.s3a.connection.establish.timeout</name>
      <value>60000</value>
    </property>
    <property>
      <name>fs.s3a.threads.max</name>
      <value>64</value>
    </property>
    <property>
      <name>fs.s3a.max.total.tasks</name>
      <value>128</value>
    </property>
  </configuration>
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow