Distinguish PySpark and Pandas DataFrames in Python type hints (PyCharm)
In PyCharm, type hints do not seem to trigger a warning when a pyspark.sql.DataFrame is used in place of a pandas.DataFrame, or vice versa.
For example, the following code does not generate any warnings at all:
from pyspark.sql import DataFrame as SparkDataFrame
from pandas import DataFrame as PandasDataFrame

def test_pandas_to_spark(a: PandasDataFrame) -> SparkDataFrame:
    # Returns a pandas DataFrame although a Spark DataFrame is declared.
    return a

def test_spark_to_pandas(b: SparkDataFrame) -> PandasDataFrame:
    return b.toPandas()

# Passes a pandas DataFrame where a Spark DataFrame is expected.
test_spark_to_pandas(PandasDataFrame({'a': [1, 2, 3]}))
Is this a known issue, and is it possible to fix?
BTW: I do have pyspark stubs installed: pyspark-stubs==2.4.0.post2
Solution 1:[1]
There is now a library called pandas-stubs, which provides type hints for pandas that static type checking tools can pick up.
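As a rough sketch of what this enables, assuming pandas-stubs is installed alongside the pyspark stubs mentioned in the question: once both DataFrame classes carry type information, a checker has two distinct types to compare and should be able to report the mismatch in the first function below. The `TYPE_CHECKING` guard, the string annotations, and the shortened function names are choices made for this illustration only (they let the snippet run even where neither package is installed). The exact warning, and whether PyCharm's own inspection shows it, depends on the tool and its version.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static checkers, so nothing is imported at runtime.
    from pandas import DataFrame as PandasDataFrame
    from pyspark.sql import DataFrame as SparkDataFrame


def pandas_to_spark(a: "PandasDataFrame") -> "SparkDataFrame":
    # Returns a pandas object where a Spark one is declared; with type
    # information for both libraries a checker can flag this line.
    return a


def spark_to_pandas(b: "SparkDataFrame") -> "PandasDataFrame":
    # Consistent with the annotations: toPandas() yields a pandas DataFrame.
    return b.toPandas()
```

Running a command-line checker such as mypy over the file is one way to confirm whether the two types are being told apart, independently of the IDE.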
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | amin_nejad |
