Expected: StringType, Found: INT32

The error started after I added type casts. I typed the column as INT, not knowing that it contained empty values (''). Now I can't even run a select. How can I reverse this? I've tried casting, but it doesn't work.

dfNewRecords = spark.sql("""
      select
      cast(a.ped_pca as long) as ped_pca, 
      cast(a.transporte as long) as transporte, 
      cast(a.fornecimento as long) as fornecimento,
      cast(a.codigo_material as long) as codigo_material,

      from df_new  a
      where
      not exists ( select 1 from df_hist b
                   where
                      a.numero_nota_fiscal = b.numero_nota_fiscal
                      and a.centro = b.centro
                      and b.data_puxada between '{}' and '{}'
                      and a.codigo_material = b.codigo_material
                      and a.transporte = b.transporte
                      and a.cliente = b.cliente
                  )
      and (a.data_puxada <> 'Data Puxada' or a.data_puxada is not null)
   order by 5 desc""".format(date_ago,date_after)).repartition(16)

After I added the type casts, the following error appears: Parquet column cannot be converted. Column: [ped_pca], Expected: StringType, Found: INT32

Even a select without any casts now fails with the same error.



Solution 1:[1]

I faced the same issue. After some digging, at least in my use case, I found that the folder contained Parquet part files in which the same column had different types in different parts.

Set spark.sql.parquet.mergeSchema to true
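
A minimal sketch of what that setting can look like in PySpark. The function name `read_with_merged_schema` and its `path` argument are hypothetical, and `spark` is assumed to be an existing SparkSession. Schema merging reconciles compatible schemas; when the same column has genuinely conflicting types across files, the read may still fail.

```python
def read_with_merged_schema(spark, path):
    """Read a Parquet folder, asking Spark to merge the schemas of its part files.

    `spark` is an existing SparkSession; `path` is a placeholder for your folder.
    """
    # Per-read form of the option; it only affects this one read.
    return spark.read.option("mergeSchema", "true").parquet(path)


# Session-wide alternative (applies to every Parquet read in the session):
# spark.conf.set("spark.sql.parquet.mergeSchema", "true")
```

The per-read option is shown first because it keeps the change local to the folder that has the problem.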

There are more details available on how Spark reads Parquet files. To resolve this in my case, I deleted the part files written with the old schema, because the conversion wasn't possible (string to date).
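
Before deleting anything, it can help to find out which part files disagree. The sketch below is one possible way to do that, not the method used in this answer: it reads each part file on its own, so that Spark infers the schema from that file alone, and then reports the columns whose type differs between files. The function names and the `part_files` list are hypothetical, and `spark` is assumed to be an existing SparkSession.

```python
from collections import defaultdict


def find_conflicting_columns(file_schemas):
    """Return {column: {type: [files]}} for columns whose type differs across files.

    `file_schemas` maps a file path to a {column_name: type_name} dict.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for path, columns in file_schemas.items():
        for name, type_name in columns.items():
            seen[name][type_name].append(path)
    # Keep only the columns that were seen with more than one type.
    return {
        name: {t: sorted(paths) for t, paths in by_type.items()}
        for name, by_type in seen.items()
        if len(by_type) > 1
    }


def collect_file_schemas(spark, part_files):
    """Read every part file individually and record its column types."""
    schemas = {}
    for path in part_files:
        fields = spark.read.parquet(path).schema.fields
        schemas[path] = {f.name: f.dataType.simpleString() for f in fields}
    return schemas


# Hypothetical usage, with part_files being a list of paths to the part files:
# conflicts = find_conflicting_columns(collect_file_schemas(spark, part_files))
# for column, by_type in conflicts.items():
#     print(column, by_type)
```

The files listed under the unexpected type are the candidates for deletion. If the values can be converted, an alternative is to read those files on their own, cast the column to the expected type, and write them to a new location.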

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Sumit Pawar