pyspark.sql.functions.randn

pyspark.sql.functions.randn(seed=None)

Generates a random column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters

seed (int, default: None): Seed value for the random generator.

Returns

Column: A column of i.i.d. samples from the standard normal distribution.

See also

pyspark.sql.functions.rand()
pyspark.sql.functions.randstr()
pyspark.sql.functions.uniform()

Notes

The function is non-deterministic in the general case.

Examples

Example 1: Generate a random column without a seed (the seed in the column name and the values differ on every run)

>>> from pyspark.sql import functions as sf
>>> spark.range(0, 2, 1, 1).select("*", sf.randn()).show()
+---+--------------------------+
| id|randn(3968742514375399317)|
+---+--------------------------+
|  0|      -0.47968645355788...|
|  1|       -0.4950952457305...|
+---+--------------------------+

Example 2: Generate a random column with a specific seed

>>> spark.range(0, 2, 1, 1).select("*", sf.randn(seed=42)).show()
+---+------------------+
| id|         randn(42)|
+---+------------------+
|  0| 2.384479054241...|
|  1|0.1920934041293...|
+---+------------------+