What is the accepted syntax for PySpark's SQL expression-based filters?
The PySpark documentation for filters says that it accepts "a string of SQL expression".
Is there a reference for the accepted syntax of this parameter? The best I could find is the page about the WHERE clause in the Spark SQL docs. Obviously some examples, like "id > 200", "length(name) > 3", or "id BETWEEN 200 AND 300", would work. But what about others? Filters like "age > (SELECT 42)" seem to work, so I assume nested expressions are OK. This just raises more questions:
- What databases can these nested expressions refer to? Is there a way I can create a nested SELECT expression referring to the current dataframe, e.g. to do something like "age > (SELECT avg(age) FROM <current_dataframe>)" as a filter? (I know there are other ways of achieving this; I am only interested in what SQL expressions can do. A minimal sketch of what I mean is below, after this list.)
- Are there other, more advanced things that are allowed in filter expressions?
- Finally, is there an online resource explaining this in more detail?
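For concreteness, here is a minimal sketch of the kinds of filters I am asking about. The DataFrame and its column names are just made-up placeholders, not taken from any real dataset:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# A toy DataFrame whose column names match the examples above.
df = spark.createDataFrame(
    [(150, "Al", 30), (250, "Bridget", 47), (300, "Cho", 61)],
    ["id", "name", "age"],
)

# Simple comparisons, built-in functions, and BETWEEN clearly work:
df.filter("id > 200").show()
df.filter("length(name) > 3").show()
df.filter("id BETWEEN 200 AND 300").show()

# A trivial scalar subquery also seems to be accepted:
df.filter("age > (SELECT 42)").show()

# What I would like to be able to write, if the current DataFrame could
# somehow be referenced from inside the expression (the name in angle
# brackets is a placeholder; this line does not work as written):
# df.filter("age > (SELECT avg(age) FROM <current_dataframe>)").show()
```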
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow