NaiveBayesClassifier Class

Machine Learning Naive Bayes Classifier

Inheritance
nimbusml.internal.core.naive_bayes._naivebayesclassifier.NaiveBayesClassifier → NaiveBayesClassifier
nimbusml.base_predictor.BasePredictor → NaiveBayesClassifier
sklearn.base.ClassifierMixin → NaiveBayesClassifier

Constructor

NaiveBayesClassifier(normalize='Auto', caching='Auto', feature=None, label=None, **params)

Parameters

Name Description
feature

see Columns.

label

see Columns.

normalize

Specifies the type of automatic normalization used:

  • "Auto": if normalization is needed, it is performed automatically. This is the default choice.

  • "No": no normalization is performed.

  • "Yes": normalization is performed.

  • "Warn": if normalization is needed, a warning message is displayed, but normalization is not performed.

Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures that the distances between data points are proportional and enables various optimization methods, such as gradient descent, to converge much faster. If normalization is performed, a MaxMin normalizer is used. It maps values into an interval [a, b] where -1 <= a <= 0, 0 <= b <= 1, and b - a = 1. This normalizer preserves sparsity by mapping zero to zero.

caching

Whether the trainer should cache the input training data.

params

Additional arguments sent to the compute engine.
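The interval constraint described under `normalize` (width 1, zero mapped to zero) can be satisfied by dividing each value by the range max - min. The sketch below illustrates that scheme with plain Python; it is an illustration of the stated property, not nimbusml's internal MaxMin implementation:

```python
def maxmin_normalize(values):
    # Divide by the range (max - min) so the mapped interval [a, b]
    # has width 1 with -1 <= a <= 0 and 0 <= b <= 1, and zero stays zero,
    # which preserves sparsity.
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Degenerate case: all values equal; map everything to zero.
        return [0.0 for _ in values]
    return [v / span for v in values]

scaled = maxmin_normalize([-2.0, 0.0, 3.0])
# range is 3 - (-2) = 5, so the values become [-0.4, 0.0, 0.6]
```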

Examples


   ###############################################################################
   # NaiveBayesClassifier
   from nimbusml import Pipeline, FileDataStream
   from nimbusml.datasets import get_dataset
   from nimbusml.feature_extraction.categorical import OneHotVectorizer
   from nimbusml.naive_bayes import NaiveBayesClassifier

   # data input (as a FileDataStream)
   path = get_dataset('infert').as_filepath()

   data = FileDataStream.read_csv(path)
   print(data.head())
   #    age  case education  induced  parity ... row_num  spontaneous  ...
   # 0   26     1    0-5yrs        1       6 ...       1            2  ...
   # 1   42     1    0-5yrs        1       1 ...       2            0  ...
   # 2   39     1    0-5yrs        2       6 ...       3            0  ...
   # 3   34     1    0-5yrs        2       4 ...       4            0  ...
   # 4   35     1   6-11yrs        1       3 ...       5            1  ...


   # define the training pipeline
   pipeline = Pipeline([
       OneHotVectorizer(columns={'edu': 'education'}),
       NaiveBayesClassifier(feature=['age', 'edu'], label='induced')
   ])

   # train, predict, and evaluate
   metrics, predictions = pipeline.fit(data).test(data, output_scores=True)

   # print predictions
   print(predictions.head())
   #   PredictedLabel   Score.0   Score.1   Score.2
   # 0               2 -5.297264 -5.873055 -4.847996
   # 1               2 -5.297264 -5.873055 -4.847996
   # 2               2 -5.297264 -5.873055 -4.847996
   # 3               2 -5.297264 -5.873055 -4.847996
   # 4               0 -1.785266 -3.172440 -3.691075

   # print evaluation metrics
   print(metrics)
   #   Accuracy(micro-avg)  Accuracy(macro-avg)   Log-loss  Log-loss reduction ...
   # 0             0.584677             0.378063  34.538776       -3512.460882 ...
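The Score.0, Score.1, and Score.2 columns in the predictions above are per-class log scores, and PredictedLabel is the class with the largest score. A minimal check using the scores copied from row 0 of the sample output:

```python
# Per-class log scores from row 0 of the predictions shown above.
scores = {0: -5.297264, 1: -5.873055, 2: -4.847996}

# The predicted label is the argmax over the class scores.
predicted = max(scores, key=scores.get)
# predicted == 2, matching PredictedLabel for row 0
```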

Remarks

Naive Bayes is a probabilistic classifier that can be used for multiclass problems. Using Bayes' theorem, the conditional probability of a sample belonging to a class can be calculated from the sample counts for each feature-value combination. However, the Naive Bayes classifier is feasible only if the number of features, and the number of values each feature can take, are relatively small. It also assumes that the features are strictly independent.
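The count-based calculation described above can be sketched on a toy categorical dataset. This is a generic illustration of the technique (with add-one smoothing, and hypothetical feature values), not nimbusml's trainer:

```python
import math
from collections import Counter, defaultdict

# Toy data: (feature values) -> label. Feature values are hypothetical.
samples = [
    (('sunny', 'hot'), 'no'),
    (('sunny', 'mild'), 'no'),
    (('rain', 'mild'), 'yes'),
    (('rain', 'hot'), 'yes'),
    (('rain', 'mild'), 'yes'),
]

# Count labels and, per (feature position, label), the feature values seen.
label_counts = Counter(lbl for _, lbl in samples)
feat_counts = defaultdict(Counter)
for feats, lbl in samples:
    for i, v in enumerate(feats):
        feat_counts[(i, lbl)][v] += 1

def log_posterior(feats, lbl, alpha=1.0):
    # log P(label) + sum_i log P(feature_i | label), with add-one smoothing.
    total = sum(label_counts.values())
    lp = math.log(label_counts[lbl] / total)
    for i, v in enumerate(feats):
        counts = feat_counts[(i, lbl)]
        vocab = len({s[0][i] for s in samples})  # distinct values at position i
        lp += math.log((counts[v] + alpha) / (sum(counts.values()) + alpha * vocab))
    return lp

scores = {lbl: log_posterior(('rain', 'mild'), lbl) for lbl in label_counts}
best = max(scores, key=scores.get)  # 'yes' wins for this toy data
```

This also makes the feasibility caveat concrete: the counting tables grow with the number of features and the number of distinct values per feature.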

Reference

Naive Bayes

Methods

decision_function

Returns the score values.

get_params

Get the parameters for this operator.

decision_function

Returns the score values.

decision_function(X, **params)

get_params

Get the parameters for this operator.

get_params(deep=False)

Parameters

Name Description
deep
Default value: False