How to use Python to connect to an Oracle database with Apache Beam?
import apache_beam as beam
from apache_beam.io.jdbc import ReadFromJdbc

with beam.Pipeline() as p:
    result = (
        p
        | 'Read from jdbc' >> ReadFromJdbc(
            fetch_size=None,
            table_name='table_name',
            driver_class_name='oracle.jdbc.driver.OracleDriver',
            jdbc_url='jdbc:oracle:thin:@localhost:1521:orcl',
            username='xxx',
            password='xxx',
            query='select * from table_name'
        )
        | beam.Map(print)
    )
When I run the above code, the following error occurs:
ERROR:apache_beam.utils.subprocess_server:Starting job service with ['java', '-jar', 'C:\\Users\\YFater/.apache_beam/cache/jars\\beam-sdks-java-extensions-schemaio-expansion-service-2.29.0.jar', '51933']
ERROR:apache_beam.utils.subprocess_server:Error bringing up service
Solution 1:[1]
Apache Beam needs a Java expansion service in order to use JDBC from Python.
You get an error because Beam cannot launch the expansion service.
To fix this, install a Java runtime on the machine where you run Apache Beam, and make sure java is on your PATH.
If the problem persists after installing Java (or if you already have it installed), the JAR files Beam downloaded are probably corrupt (for example, the download was interrupted or the file was truncated because the disk was full). In that case, remove the contents of the $HOME/.apache_beam/cache/jars directory and re-run the Beam pipeline.
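The two checks above can be sketched in Python using only the standard library. This is a rough sketch, not part of Beam: the helper names find_java and clear_beam_jar_cache are made up for illustration, and the default cache location simply mirrors the $HOME/.apache_beam/cache/jars path mentioned above.

```python
import shutil
from pathlib import Path


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


def clear_beam_jar_cache(cache_dir=None):
    """Delete the files in Beam's JAR cache and return their names, sorted.

    Hypothetical helper: the default directory is assumed to be
    ~/.apache_beam/cache/jars, as described in the answer.
    """
    if cache_dir is None:
        cache_dir = Path.home() / ".apache_beam" / "cache" / "jars"
    cache_dir = Path(cache_dir)
    removed = []
    if cache_dir.is_dir():
        for entry in cache_dir.iterdir():
            if entry.is_file():
                entry.unlink()  # Beam downloads the JAR again on the next run
                removed.append(entry.name)
    return sorted(removed)


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("java not found on PATH; install a Java runtime first")
    else:
        print("java found at", java)
```

If java is found and the error persists, calling clear_beam_jar_cache() once and re-running the pipeline should force a fresh download of the expansion service JAR.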
Solution 2:[2]
Pass the classpath parameter to ReadFromJdbc, pointing it at the Oracle JDBC driver JAR.
Example: classpath=['~/.apache_beam/cache/jars/ojdbc8.jar'],
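A fuller sketch of this suggestion is below, assuming your installed Beam version's ReadFromJdbc accepts classpath (the question shows Beam 2.29.0, which may predate it). The helper names oracle_read_kwargs and run_pipeline are hypothetical, and expanding "~" with os.path.expanduser is a precaution on the assumption that the path may not be expanded for you.

```python
import os


def oracle_read_kwargs(driver_jar, jdbc_url, username, password,
                       table_name, query=None):
    """Build keyword arguments for ReadFromJdbc with the driver JAR on the classpath."""
    kwargs = {
        "table_name": table_name,
        "driver_class_name": "oracle.jdbc.driver.OracleDriver",
        "jdbc_url": jdbc_url,
        "username": username,
        "password": password,
        # Expand "~" here instead of relying on it being expanded downstream.
        "classpath": [os.path.expanduser(driver_jar)],
    }
    if query is not None:
        kwargs["query"] = query
    return kwargs


def run_pipeline(kwargs):
    """Run the pipeline from the question with the given ReadFromJdbc arguments."""
    # Imported lazily so the helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.io.jdbc import ReadFromJdbc

    with beam.Pipeline() as p:
        _ = (
            p
            | "Read from jdbc" >> ReadFromJdbc(**kwargs)
            | beam.Map(print)
        )
```

Usage would look like run_pipeline(oracle_read_kwargs('~/.apache_beam/cache/jars/ojdbc8.jar', 'jdbc:oracle:thin:@localhost:1521:orcl', 'xxx', 'xxx', 'table_name', query='select * from table_name')), with the placeholder credentials from the question replaced by real ones.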
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Iñigo González |
| Solution 2 | Rodrigo Brasil |
