pyspark.sql.functions.nanvl#

pyspark.sql.functions.nanvl(col1, col2)[source]#

Returns col1 if it is not NaN, or col2 if col1 is NaN.

Both inputs should be floating point columns (DoubleType or FloatType).

New in version 1.6.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col1 : Column or column name

first column to check.

col2 : Column or column name

second column to return if first is NaN.

Returns
Column

value from the first column, or from the second if the first is NaN.

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1.0, float('nan')), (float('nan'), 2.0)], ("a", "b"))
>>> df.select("*", sf.nanvl("a", "b"), sf.nanvl(df.a, df.b)).show()
+---+---+-----------+-----------+
|  a|  b|nanvl(a, b)|nanvl(a, b)|
+---+---+-----------+-----------+
|1.0|NaN|        1.0|        1.0|
|NaN|2.0|        2.0|        2.0|
+---+---+-----------+-----------+
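Note that NaN and NULL are distinct: isnan(NULL) is false, so nanvl leaves nulls untouched (use coalesce for null handling). The semantics can be sketched in plain Python with a hypothetical helper (not part of PySpark):

```python
import math

def nanvl(x, y):
    # Return x unless it is NaN, in which case return y.
    # None (NULL) is not NaN, so it passes through unchanged,
    # mirroring Spark's nanvl behavior.
    if x is not None and math.isnan(x):
        return y
    return x

rows = [(1.0, float("nan")), (float("nan"), 2.0), (None, 3.0)]
print([nanvl(a, b) for a, b in rows])  # [1.0, 2.0, None]
```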