PoissonRegressionRegressor Class


Train a Poisson regression model.

Inheritance:

nimbusml.internal.core.linear_model._poissonregressionregressor.PoissonRegressionRegressor -> PoissonRegressionRegressor

nimbusml.base_predictor.BasePredictor -> PoissonRegressionRegressor

sklearn.base.RegressorMixin -> PoissonRegressionRegressor

Constructor

PoissonRegressionRegressor(normalize='Auto', caching='Auto', l2_regularization=1.0, l1_regularization=1.0, optimization_tolerance=1e-07, history_size=20, enforce_non_negativity=False, initial_weights_diameter=0.0, maximum_number_of_iterations=2147483647, stochastic_gradient_descent_initilaization_tolerance=0.0, quiet=False, use_threads=True, number_of_threads=None, dense_optimizer=False, feature=None, label=None, weight=None, **params)

Parameters

Name	Description
feature	see Columns.
label	see Columns.
weight	see Columns.
normalize	Specifies the type of automatic normalization used: `"Auto"`: if normalization is needed, it is performed automatically. This is the default choice. `"No"`: no normalization is performed. `"Yes"`: normalization is performed. `"Warn"`: if normalization is needed, a warning message is displayed, but normalization is not performed. Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures the distances between data points are proportional and enables various optimization methods such as gradient descent to converge much faster. If normalization is performed, a `MaxMin` normalizer is used. It normalizes values to an interval [a, b] where `-1 <= a <= 0` and `0 <= b <= 1` and `b - a = 1`. This normalizer preserves sparsity by mapping zero to zero.
caching	Whether trainer should cache input training data.
l2_regularization	L2 regularization weight.
l1_regularization	L1 regularization weight.
optimization_tolerance	Tolerance parameter for optimization convergence. Lower values are slower but more accurate.
history_size	Memory size for L-BFGS. Lower values are faster but less accurate. The technique used for optimization here is L-BFGS, which uses only a limited amount of memory to compute the next step direction. This parameter indicates the number of past positions and gradients to store for the computation of the next step. Must be greater than or equal to `1`.
enforce_non_negativity	Enforce non-negative weights. This flag, however, does not put any constraint on the bias term; that is, the bias term can still be a negative number.
initial_weights_diameter	Sets the initial weights diameter that specifies the range from which values are drawn for the initial weights. These weights are initialized randomly from within this range. For example, if the diameter is specified to be `d`, then the weights are uniformly distributed between `-d/2` and `d/2`. The default value is `0`, which specifies that all the weights are set to zero.
maximum_number_of_iterations	Maximum iterations.
stochastic_gradient_descent_initilaization_tolerance	Run SGD to initialize LR weights, converging to this tolerance.
quiet	If set to true, produce no output during training.
use_threads	Whether or not to use threads. Default is true.
number_of_threads	Number of threads.
dense_optimizer	If `True`, forces densification of the internal optimization vectors. If `False`, enables the optimizer to use sparse or dense internal states as it finds appropriate. Setting `dense_optimizer` to `True` requires the internal optimizer to use a dense internal state, which may help alleviate load on the garbage collector for some varieties of larger problems.
params	Additional arguments sent to compute engine.
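
The `initial_weights_diameter` description above can be illustrated with a small standalone sketch. This is not NimbusML code: the helper name `init_weights` and its `seed` argument are hypothetical, and the sketch only mirrors the documented behavior (uniform draws between `-d/2` and `d/2`, all zeros when the diameter is `0`).

```python
import random


def init_weights(n_features, diameter=0.0, seed=None):
    """Draw n_features initial weights uniformly from [-diameter/2, diameter/2].

    Hypothetical helper for illustration; not part of NimbusML.
    """
    if diameter < 0:
        raise ValueError("diameter must be non-negative")
    rng = random.Random(seed)
    half = diameter / 2.0
    # With diameter == 0 the range collapses, so every weight is zero.
    return [rng.uniform(-half, half) for _ in range(n_features)]


zeros = init_weights(3)                         # default diameter of 0
spread = init_weights(3, diameter=1.0, seed=0)  # each weight lies in [-0.5, 0.5]
```

A larger diameter starts the optimizer from a more varied point, while the default of `0` starts every run from the same all-zero weights.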

Examples


   ###############################################################################
   # PoissonRegressionRegressor
   from nimbusml import Pipeline, FileDataStream
   from nimbusml.datasets import get_dataset
   from nimbusml.feature_extraction.categorical import OneHotVectorizer
   from nimbusml.linear_model import PoissonRegressionRegressor

   # data input (as a FileDataStream)
   path = get_dataset('infert').as_filepath()

   data = FileDataStream.read_csv(path)
   print(data.head())
   #    age  case education  induced  parity ... row_num  spontaneous  ...
   # 0   26     1    0-5yrs        1       6 ...       1            2  ...
   # 1   42     1    0-5yrs        1       1 ...       2            0  ...
   # 2   39     1    0-5yrs        2       6 ...       3            0  ...
   # 3   34     1    0-5yrs        2       4 ...       4            0  ...
   # 4   35     1   6-11yrs        1       3 ...       5            1  ...

   # define the training pipeline
   pipeline = Pipeline([
       OneHotVectorizer(columns={'edu': 'education'}),
       PoissonRegressionRegressor(feature=['parity', 'edu'], label='age')
   ])

   # train, predict, and evaluate
   metrics, predictions = pipeline.fit(data).test(data, output_scores=True)

   # print predictions
   print(predictions.head())
   #       Score
   # 0  35.158913
   # 1  35.191872
   # 2  35.158913
   # 3  35.172092
   # 4  32.845158

   # print evaluation metrics
   print(metrics)
   #    L1(avg)    L2(avg)  RMS(avg)  Loss-fn(avg)  R Squared
   # 0  4.154053  24.429028  4.942573     24.429028   0.110628

Remarks

Poisson regression is a parameterized regression method. It assumes that the log of the conditional mean of the dependent variable is a linear function of the independent variables. Assuming that the dependent variable follows a Poisson distribution, the parameters of the regressor can be estimated by maximizing the likelihood of the obtained observations.
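
As a rough illustration of the model described above, the following pure-Python sketch fits a Poisson regression by minimizing the negative log-likelihood. It is a simplification under stated assumptions: it uses plain gradient descent rather than the L-BFGS optimizer this class uses, it omits the L1 and L2 regularization terms, and the function names (`predict_mean`, `neg_log_likelihood`, `fit`) and the toy data are made up for the example.

```python
import math


def predict_mean(weights, bias, x):
    """Conditional mean: exp(bias + w . x), i.e. the log of the mean is linear in x."""
    return math.exp(bias + sum(w * xi for w, xi in zip(weights, x)))


def neg_log_likelihood(weights, bias, X, y):
    """Average Poisson negative log-likelihood, dropping the constant log(y!) term."""
    total = 0.0
    for x, yi in zip(X, y):
        mu = predict_mean(weights, bias, x)
        total += mu - yi * math.log(mu)
    return total / len(X)


def fit(X, y, lr=0.02, n_iter=5000):
    """Fit by plain gradient descent (illustrative stand-in for L-BFGS)."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, yi in zip(X, y):
            # Gradient of the NLL for one sample is (mu - y) * x.
            err = predict_mean(weights, bias, x) - yi
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        bias -= lr * grad_b / len(X)
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
    return weights, bias


# Toy data where the counts double with each unit of x, so log(mean) = x * log(2).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1, 2, 4, 8]
weights, bias = fit(X, y)
loss_at_zero = neg_log_likelihood([0.0], 0.0, X, y)
loss_fitted = neg_log_likelihood(weights, bias, X, y)
```

On this toy data the fitted weight approaches `log(2)` and the bias approaches `0`, so `loss_fitted` is lower than `loss_at_zero`.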

Reference

Poisson regression

Methods

get_params

Get the parameters for this operator.

get_params(deep=False)

Parameters

Name	Description
deep	Default value: False
