PyArrow: setting column types with Table.from_pydict (schema)
With a PyArrow table created as `pyarrow.Table.from_pydict(d)`, all columns come out as string types. Creating a schema object as below [1] and passing it via `pyarrow.Table.from_pydict(d, schema=s)` results in errors such as:

```
pyarrow.lib.ArrowTypeError: object of type <class 'str'> cannot be converted to int
```

Is there a way to set column types on tables created from dictionaries? The context is writing Parquet files. A similar approach in Pandas is `df.astype(schema).dtypes`.
[1]
```python
import pyarrow as pa

schema = pa.schema([
    ('id', pa.int32()),
    ('message_id', pa.string()),
    ('transaction_id', pa.string()),
])
```
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow