text Package

Classes

CharTokenizer

Text transforms that can be performed on data before training a model.

WordTokenizer

Description: The input to this transform is text, and the output is a vector of text containing the words (tokens) of the original text. The default separator is a space, but any other character (or several characters) can be specified instead.
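The behavior described above can be illustrated with a small standalone sketch in plain Python. This is not the package's own API: the function name `word_tokenize` and its `separators` parameter are hypothetical, chosen only for illustration. The sketch also assumes that "multiple characters" means a set of single-character separators, and that empty tokens produced by consecutive separators are dropped; the actual transform may differ on both points.

```python
from typing import Iterable, List


def word_tokenize(text: str, separators: Iterable[str] = (" ",)) -> List[str]:
    """Split text into tokens at any of the given single-character separators.

    Hypothetical helper for illustration only; not part of the package.
    Assumption: empty tokens (from repeated separators) are discarded.
    """
    seps = set(separators)
    tokens: List[str] = []
    current: List[str] = []
    for ch in text:
        if ch in seps:
            # A separator closes the token being built, if there is one.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    # Flush the final token when the text does not end with a separator.
    if current:
        tokens.append("".join(current))
    return tokens


print(word_tokenize("the quick brown fox"))       # ['the', 'quick', 'brown', 'fox']
print(word_tokenize("a,b;c", separators=",;"))    # ['a', 'b', 'c']
```

With the default argument the text is split on spaces only; passing several separator characters splits on any of them.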