pyspark.sql.functions.locate

pyspark.sql.functions.locate(substr, str, pos=1)

Locate the position of the first occurrence of substr in a string column, searching from position pos onward.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
substr : literal string

the substring to search for

str : Column or column name

a Column of pyspark.sql.types.StringType

pos : int, optional

start position (1-based); defaults to 1

Returns
Column

position of the substring.

Notes

The returned position is 1-based, not zero-based. Returns 0 if substr is not found in str.
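The indexing convention can be illustrated with a small pure-Python model. This is only a sketch, not Spark's implementation: locate_model is a hypothetical helper name, it works on plain Python strings rather than Columns, and it assumes that the search includes position pos itself and that a start position below 1 yields 0.

```python
def locate_model(substr: str, s: str, pos: int = 1) -> int:
    """Hypothetical pure-Python model of the 1-based convention; not Spark code."""
    if pos < 1:
        # Assumption of this sketch: a start position below 1 is reported as "not found".
        return 0
    # str.find is zero-based and returns -1 on a miss, so adding 1 gives
    # a 1-based position on a hit and 0 on a miss.
    return s.find(substr, pos - 1) + 1


print(locate_model("b", "abcd", 1))  # 2
print(locate_model("b", "abcd", 3))  # 0
```

A return value of 0 can therefore never be a valid match position, which is why it is usable as the "not found" marker.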

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('abcd',)], ['s',])
>>> df.select('*', sf.locate('b', 's', 1)).show()
+----+---------------+
|   s|locate(b, s, 1)|
+----+---------------+
|abcd|              2|
+----+---------------+
>>> df.select('*', sf.locate('b', df.s, 3)).show()
+----+---------------+
|   s|locate(b, s, 3)|
+----+---------------+
|abcd|              0|
+----+---------------+