ColumnConcatenator Class
Combines several columns into a single vector-valued column.
- Inheritance
- nimbusml.internal.core.preprocessing.schema._columnconcatenator.ColumnConcatenator
- nimbusml.base_transform.BaseTransform
- sklearn.base.TransformerMixin
Constructor
ColumnConcatenator(columns=None, **params)
Parameters
Name | Description |
---|---|
columns | A dictionary of key-value pairs, where each key is an output column name and each value is a list of input column names. The << operator can also be used to set this value (see Column Operator). For example, ColumnConcatenator(columns={'features': ['age', 'parity', 'induced']}). For more details see Columns. |
params | Additional arguments sent to the compute engine. |
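The shape of the `columns` argument described above can be illustrated with a small, self-contained check. This is only a sketch: `check_columns_mapping` is a hypothetical helper written for this page, not part of nimbusml, and the rules it enforces (string keys, non-empty lists of string names) are assumptions drawn from the parameter description rather than from the library's own validation.

```python
from typing import Dict, List


def check_columns_mapping(columns: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Hypothetical helper: validate an {output_name: [input_names]} mapping."""
    if not isinstance(columns, dict):
        raise TypeError("columns must be a dictionary")
    checked = {}
    for output_name, input_names in columns.items():
        # Each key names the vector-valued output column.
        if not isinstance(output_name, str):
            raise TypeError("output column name must be a string")
        # Each value lists the input columns to combine (assumed non-empty).
        if not isinstance(input_names, list) or not input_names:
            raise ValueError(f"{output_name!r} must map to a non-empty list")
        if not all(isinstance(name, str) for name in input_names):
            raise TypeError(f"input names for {output_name!r} must be strings")
        checked[output_name] = list(input_names)
    return checked


mapping = check_columns_mapping({'features': ['age', 'parity', 'induced']})
print(mapping)
```

A mapping that passes this check has the same form as the dictionary passed to the constructor in the Examples section.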
Examples
###############################################################################
# ColumnConcatenator
import numpy
from nimbusml import FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing.schema import ColumnConcatenator
# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path, sep=',', numeric_dtype=numpy.float32)
print(data.head())
# age case education induced parity pooled.stratum row_num ...
# 0 26.0 1.0 0-5yrs 1.0 6.0 3.0 1.0 ...
# 1 42.0 1.0 0-5yrs 1.0 1.0 1.0 2.0 ...
# 2 39.0 1.0 0-5yrs 2.0 6.0 4.0 3.0 ...
# 3 34.0 1.0 0-5yrs 2.0 4.0 2.0 4.0 ...
# 4 35.0 1.0 6-11yrs 1.0 3.0 32.0 5.0 ...
# transform usage
xf = ColumnConcatenator(columns={'features': ['age', 'parity', 'induced']})
# fit and transform
features = xf.fit_transform(data)
# print features
print(features.head())
# 'features' is a vector-type column. When a dataset with a vector-type
# column is the final output, the vector column is expanded into one column
# per slot.
# age case education features.age features.induced features.parity ...
# 0 26.0 1.0 0-5yrs 26.0 1.0 6.0 ...
# 1 42.0 1.0 0-5yrs 42.0 1.0 1.0 ...
# 2 39.0 1.0 0-5yrs 39.0 2.0 6.0 ...
# 3 34.0 1.0 0-5yrs 34.0 2.0 4.0 ...
# 4 35.0 1.0 6-11yrs 35.0 1.0 3.0 ...
Remarks
ColumnConcatenator creates a single vector-valued column from multiple columns. It can be applied to data before training a model. Concatenation can significantly speed up data processing when the number of columns reaches hundreds or thousands.
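To make the idea of a vector-valued column concrete, the following plain-Python sketch combines several scalar columns into one list per row. It is a conceptual illustration only: `concatenate_columns` is a hypothetical function, rows are modeled as dictionaries, and nothing here reflects how nimbusml performs the operation internally. The sample values come from the first two rows of the infert data shown in the Examples section.

```python
from typing import Dict, List


def concatenate_columns(rows: List[dict], columns: Dict[str, List[str]]) -> List[dict]:
    """Hypothetical sketch: add one vector-valued column per mapping entry."""
    result = []
    for row in rows:
        new_row = dict(row)  # keep the original columns untouched
        for output_name, input_names in columns.items():
            # The vector holds the input values in the order they are listed.
            new_row[output_name] = [float(row[name]) for name in input_names]
        result.append(new_row)
    return result


rows = [
    {'age': 26.0, 'parity': 6.0, 'induced': 1.0},
    {'age': 42.0, 'parity': 1.0, 'induced': 1.0},
]
result = concatenate_columns(rows, {'features': ['age', 'parity', 'induced']})
print(result[0]['features'])  # [26.0, 6.0, 1.0]
```

Downstream steps can then work with the single `features` entry instead of tracking each input column separately, which is the convenience the remarks above describe for datasets with many columns.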
Methods
get_params | Get the parameters for this operator. |
get_params
Get the parameters for this operator.
get_params(deep=False)
Parameters
Name | Description |
---|---|
deep | Default value: False |