Error connecting to GCS using private keys
The scenario: our Spark job runs on a Dataproc cluster in Project1, and from there we are trying to access a GCS bucket that belongs to Project2. We pass the private key of Project2's service account to the SparkSession, but the job fails with "Invalid PKCS8 data".

Dataproc version: 1.4
```java
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.private.key.id", "<private-key-id>");
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.private.key", "<private-key>");
session.sparkContext().hadoopConfiguration().set("fs.gs.auth.service.account.email", "<client-email>");
```
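For context, the `<private-key>` value is the `private_key` field from the Project2 service account JSON, so it has to keep its `-----BEGIN PRIVATE KEY-----` / `-----END PRIVATE KEY-----` wrapper and its newlines. We suspect the newlines may be getting lost on the way into the configuration. A minimal sketch, assuming the key reaches the driver with literal `\n` escapes and comes from a hypothetical `SA_PRIVATE_KEY` environment variable, of restoring real newlines before setting the property:

```java
// Hypothetical sources for the credential strings; any secret store would do.
String rawKey = System.getenv("SA_PRIVATE_KEY");
String keyId  = System.getenv("SA_PRIVATE_KEY_ID");
String email  = System.getenv("SA_CLIENT_EMAIL");

// If the key arrived with literal "\n" sequences (e.g. copied out of the JSON file),
// turn them back into real newlines; the connector parses the value as PEM text.
String pemKey = rawKey.replace("\\n", "\n");

org.apache.hadoop.conf.Configuration conf = session.sparkContext().hadoopConfiguration();
conf.set("fs.gs.auth.service.account.private.key.id", keyId);
conf.set("fs.gs.auth.service.account.private.key", pemKey);
conf.set("fs.gs.auth.service.account.email", email);
```

If the key already contains real newlines, the `replace` call is a no-op, so the sketch is safe either way.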
ERROR:
```
2022-02-17T16:19:09.231359147Z DEFAULT Invalid PKCS8 data.
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.privateKeyFromPkcs8(CredentialFactory.java:346)
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.getCredentialsFromSAParameters(CredentialFactory.java:310)
    at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.CredentialFactory.getCredential(CredentialFactory.java:393)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getCredential(GoogleHadoopFileSystemBase.java:1324)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.createGcsFs(GoogleHadoopFileSystemBase.java:1459)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configure(GoogleHadoopFileSystemBase.java:1443)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:467)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3242)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3291)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3259)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:470)
    at com.gcp.util.Day2Util.deleteGCSPartFile(Day2Util.java:430)
    at com.gcp.ReadGCSWithSA.main(ReadGCSWithSA.java:42)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:939)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:948)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```
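To separate a corrupted key from an escaping/configuration problem, the key string can be checked outside Spark entirely. A standalone sketch, again assuming the same key material is available in the hypothetical `SA_PRIVATE_KEY` environment variable:

```java
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;

public class CheckKey {
    public static void main(String[] args) throws Exception {
        // Hypothetical: the exact same string that is passed to fs.gs.auth.service.account.private.key.
        String pem = System.getenv("SA_PRIVATE_KEY");

        // Strip the PEM header/footer, literal "\n" escapes, and any whitespace,
        // leaving only the base64-encoded PKCS#8 DER body.
        String body = pem
                .replace("-----BEGIN PRIVATE KEY-----", "")
                .replace("-----END PRIVATE KEY-----", "")
                .replace("\\n", "")
                .replaceAll("\\s", "");

        byte[] der = Base64.getDecoder().decode(body);
        PrivateKey key = KeyFactory.getInstance("RSA")
                .generatePrivate(new PKCS8EncodedKeySpec(der));
        System.out.println("Key parsed OK: " + key.getAlgorithm());
    }
}
```

If this parses, the key material itself is fine and the problem most likely lies in how the string (header, footer, newlines) reaches the connector.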
Please let me know if there is any other way to pass the service account (SA) details. Note that we do not have the ability to pass a service account credential file.
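One alternative we are considering (a sketch only, not verified against the connector shipped with Dataproc 1.4): supply the same properties through `SparkConf` with the `spark.hadoop.` prefix when the session is built, so Spark copies them into the Hadoop configuration on the driver and executors. The placeholder values would in practice come from a secret store or environment variables, and this still requires the key string to keep its PEM wrapper and newlines:

```java
import org.apache.spark.sql.SparkSession;

// "spark.hadoop.*" entries are copied by Spark into the Hadoop Configuration,
// so nothing has to be set on sparkContext().hadoopConfiguration() afterwards.
SparkSession session = SparkSession.builder()
        .appName("ReadGCSWithSA")
        .config("spark.hadoop.fs.gs.auth.service.account.enable", "true")
        .config("spark.hadoop.fs.gs.auth.service.account.private.key.id", "<private-key-id>")
        .config("spark.hadoop.fs.gs.auth.service.account.private.key", "<private-key>")
        .config("spark.hadoop.fs.gs.auth.service.account.email", "<client-email>")
        .getOrCreate();
```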
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
