PyArrow: setting column types with Table.from_pydict (schema)
With a PyArrow table created as `pyarrow.Table.from_pydict(d)`, all columns come out as string types. Creating a schema object as below [1] and passing it via `pyarrow.Table.from_pydict(d, schema=s)` results in errors such as:

```
pyarrow.lib.ArrowTypeError: object of type <class 'str'> cannot be converted to int
```

Is there a way to set column types on tables created from dictionaries? The context is writing Parquet files. A similar approach in Pandas is `df.astype(schema).dtypes`.
[1]
```python
import pyarrow as pa

schema = pa.schema([
    ('id', pa.int32()),
    ('message_id', pa.string()),
    ('transaction_id', pa.string()),
])
```
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow