pyspark.sql.functions.width_bucket
- pyspark.sql.functions.width_bucket(v, min, max, numBucket)
Returns the bucket number into which the value of this expression would fall after being evaluated. Note that the input arguments must satisfy the conditions described under Parameters; otherwise, the method returns null.
New in version 3.5.0.
- Parameters
- v : Column or column name
value to compute the bucket number for in the histogram
- min : Column or column name
minimum value of the histogram
- max : Column or column name
maximum value of the histogram
- numBucket : Column, column name, or int
the number of buckets
- Returns
Column
the bucket number into which the value would fall after being evaluated
Examples
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([
...     (5.3, 0.2, 10.6, 5),
...     (-2.1, 1.3, 3.4, 3),
...     (8.1, 0.0, 5.7, 4),
...     (-0.9, 5.2, 0.5, 2)],
...     ['v', 'min', 'max', 'n'])
>>> df.select("*", sf.width_bucket('v', 'min', 'max', 'n')).show()
+----+---+----+---+----------------------------+
|   v|min| max|  n|width_bucket(v, min, max, n)|
+----+---+----+---+----------------------------+
| 5.3|0.2|10.6|  5|                           3|
|-2.1|1.3| 3.4|  3|                           0|
| 8.1|0.0| 5.7|  4|                           5|
|-0.9|5.2| 0.5|  2|                           3|
+----+---+----+---+----------------------------+
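The bucketing rule behind these results can be sketched in pure Python. This is an illustrative approximation of the equal-width histogram semantics (including the 0 and numBucket+1 overflow buckets and descending ranges where min > max), not part of PySpark itself; edge-case handling here is an assumption based on the documented behavior.

```python
import math


def width_bucket_sketch(v, lo, hi, n):
    """Approximate the bucket number for v in an equal-width histogram.

    Returns None (Spark: NULL) when the inputs violate the documented
    conditions: a non-positive bucket count, equal endpoints, or
    non-finite values.  Values below the range map to bucket 0; values
    beyond the range map to bucket n + 1.
    """
    if any(x is None for x in (v, lo, hi, n)):
        return None
    if n <= 0 or lo == hi:
        return None
    if any(not math.isfinite(x) for x in (v, lo, hi)):
        return None
    if lo < hi:  # ascending histogram
        if v < lo:
            return 0
        if v >= hi:
            return n + 1
        return int(n * (v - lo) / (hi - lo)) + 1
    # descending histogram (min > max), mirrored bucket order
    if v > lo:
        return 0
    if v <= hi:
        return n + 1
    return int(n * (lo - v) / (lo - hi)) + 1


# Reproduces the rows in the example above:
print(width_bucket_sketch(5.3, 0.2, 10.6, 5))  # 3: interior value, ascending
print(width_bucket_sketch(-2.1, 1.3, 3.4, 3))  # 0: below the range
print(width_bucket_sketch(8.1, 0.0, 5.7, 4))   # 5: above the range -> n + 1
print(width_bucket_sketch(-0.9, 5.2, 0.5, 2))  # 3: descending range overflow
print(width_bucket_sketch(1.0, 0.0, 10.0, 0))  # None: numBucket must be > 0
```

The sketch mirrors why row 3 yields 5 rather than null: an out-of-range value is not an error, it simply lands in the overflow bucket numBucket + 1.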