Distinguish PySpark and Pandas DataFrames in Python type hints (PyCharm)

In PyCharm, type hints do not seem to trigger a warning when a pyspark.sql.DataFrame is used where a pandas.DataFrame is expected, or vice versa.

For example, the following code does not generate any warnings at all:

from pyspark.sql import DataFrame as SparkDataFrame
from pandas import DataFrame as PandasDataFrame

def test_pandas_to_spark(a: PandasDataFrame) -> SparkDataFrame:
    return a

def test_spark_to_pandas(b: SparkDataFrame) -> PandasDataFrame:
    return b.toPandas()

test_spark_to_pandas(PandasDataFrame({'a': [1, 2, 3]}))

Is this a known issue, and is it possible to fix?

BTW: I do have pyspark stubs installed: pyspark-stubs==2.4.0.post2



Solution 1:[1]

There is now a library called pandas-stubs, which provides type hints for pandas that static type checking tools can pick up.
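
As a rough sketch of how this could be verified: assuming pandas-stubs is installed (for example with pip install pandas-stubs) and the installed pyspark version or its stubs also provide type information, the question's functions can be rechecked with a static checker such as mypy. The function names and the comments about expected warnings are illustrative assumptions, not confirmed PyCharm output. The imports sit under TYPE_CHECKING and the annotations are written as strings, so the file also runs where neither library is installed.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static type checkers, never at runtime.
    from pandas import DataFrame as PandasDataFrame
    from pyspark.sql import DataFrame as SparkDataFrame


def pandas_to_spark(a: "PandasDataFrame") -> "SparkDataFrame":
    # Assumption: once both libraries expose type information, a checker
    # should flag this line, because a pandas frame is returned where a
    # Spark frame is declared.
    return a


def spark_to_pandas(b: "SparkDataFrame") -> "PandasDataFrame":
    # Consistent with the annotations: toPandas() converts a Spark
    # DataFrame into a pandas DataFrame.
    return b.toPandas()
```

Running a checker such as mypy over this file would then be expected to report the mismatch in pandas_to_spark. Whether PyCharm's own inspection reports it as well may depend on the PyCharm version.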

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 amin_nejad