pyspark.sql.functions.tuple_sketch_summary_double

pyspark.sql.functions.tuple_sketch_summary_double(col, mode=None)

Returns the aggregated summary value from a DataSketches TupleSketch with double summaries.

New in version 4.2.0.

Parameters
col : Column or column name

The column containing a binary TupleSketch representation.

mode : Column or str, optional

The mode used to aggregate the summary values: “sum” (default), “min”, “max”, or “alwaysone”.

Returns
Column

The aggregated summary value.

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1, 10.0), (2, 20.0), (2, 30.0)], ["key", "value"])
>>> df.agg(sf.tuple_sketch_summary_double(
...     sf.tuple_sketch_agg_double("key", "value"))).show()
+------------------------------------------------------------------------------+
|tuple_sketch_summary_double(tuple_sketch_agg_double(key, value, 12, sum), sum)|
+------------------------------------------------------------------------------+
|                                                                          60.0|
+------------------------------------------------------------------------------+
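
The result above can be reasoned about with a small pure-Python model. This is only a conceptual sketch, not Spark's or DataSketches' implementation: the helper `model_summary` and the `COMBINERS` table are hypothetical names introduced for illustration. It assumes that values sharing a key are first combined per key and that the per-key summaries are then combined with the same mode. It ignores the sampling a real sketch performs once it holds more distinct keys than its capacity, and it leaves out “alwaysone” because its exact behavior is not described here.

```python
from functools import reduce

# Hypothetical combiners for three of the documented mode names.
COMBINERS = {
    "sum": lambda a, b: a + b,
    "min": min,
    "max": max,
}


def model_summary(rows, mode="sum"):
    """Exact, in-memory model of 'build a sketch, then summarize it'.

    Assumption: the same mode is used for both steps, as in the
    example above where both default to "sum".
    """
    combine = COMBINERS[mode]

    # Step 1: one summary per distinct key (models the aggregation step).
    per_key = {}
    for key, value in rows:
        value = float(value)
        per_key[key] = combine(per_key[key], value) if key in per_key else value

    # Step 2: fold the per-key summaries into one value (models the summary step).
    return reduce(combine, per_key.values())


rows = [(1, 10.0), (2, 20.0), (2, 30.0)]
result = model_summary(rows)
print(result)  # 60.0, matching the Spark example above
```

With “sum”, key 2 contributes 20.0 + 30.0 = 50.0 and key 1 contributes 10.0, which gives the 60.0 shown in the example output. Results for other modes in this model are assumptions about the semantics and should be checked against an actual Spark session.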