eland.ml.MLModel.import_model#

classmethod MLModel.import_model(es_client: Union[str, List[str], Tuple[str, ...], Elasticsearch], model_id: str, model: Union[DecisionTreeClassifier, DecisionTreeRegressor, RandomForestRegressor, RandomForestClassifier, XGBClassifier, XGBRanker, XGBRegressor, LGBMRegressor, LGBMClassifier], feature_names: List[str], classification_labels: Optional[List[str]] = None, classification_weights: Optional[List[float]] = None, es_if_exists: Optional[str] = None, es_compress_model_definition: bool = True) → MLModel#

Transform and serialize a trained third-party model into Elasticsearch. The model can then be used for inference in the Elastic Stack.

Parameters#

es_client: Elasticsearch client argument(s)
  • elasticsearch-py connection parameters, or

  • an elasticsearch-py Elasticsearch instance

model_id: str

The unique identifier of the trained inference model in Elasticsearch.

model: An instance of a supported Python model. We support the following model types:
  • sklearn.tree.DecisionTreeClassifier
    • NOTE: When calculating the probabilities of a given classification label, Elasticsearch utilizes softmax, whereas scikit-learn normalizes the results. We try to account for this during model serialization, but probabilities may differ slightly in the predictions.

  • sklearn.tree.DecisionTreeRegressor

  • sklearn.ensemble.RandomForestRegressor

  • sklearn.ensemble.RandomForestClassifier

  • lightgbm.LGBMRegressor
    • Categorical fields are expected to already be processed

    • Only the following objectives are supported:
      • “regression”

      • “regression_l1”

      • “huber”

      • “fair”

      • “quantile”

      • “mape”

  • lightgbm.LGBMClassifier
    • Categorical fields are expected to already be processed

    • Only the following objectives are supported:
      • “binary”

      • “multiclass”

      • “multiclassova”

  • xgboost.XGBClassifier
    • Only the following objectives are supported:
      • “binary:logistic”

      • “multi:softmax”

      • “multi:softprob”

  • xgboost.XGBRanker
    • Only the following objectives are supported:
      • “rank:map”

      • “rank:ndcg”

      • “rank:pairwise”

  • xgboost.XGBRegressor
    • Only the following objectives are supported:
      • “reg:squarederror”

      • “reg:linear”

      • “reg:squaredlogerror”

      • “reg:logistic”

      • “reg:pseudohubererror”
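The softmax note under sklearn.tree.DecisionTreeClassifier above can be illustrated with plain Python. This is a rough sketch, not eland's actual serialization code: normalized leaf-class counts and a softmax over the same raw counts generally disagree, while a log transform of the normalized probabilities lets softmax recover them exactly, which is one way a serializer can account for the difference.

```python
import math

def softmax(values):
    """Standard softmax: exponentiate each value, then normalize."""
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def normalize(counts):
    """scikit-learn-style leaf probabilities: counts divided by their sum."""
    total = sum(counts)
    return [c / total for c in counts]

leaf_counts = [3.0, 1.0]            # hypothetical class counts at one leaf
sk_probs = normalize(leaf_counts)   # [0.75, 0.25]
es_probs = softmax(leaf_counts)     # ~[0.881, 0.119] -- not the same

# Applying softmax to log-probabilities recovers the normalized values,
# since softmax(log(p)) == p / sum(p) == p when the p values sum to 1.
recovered = softmax([math.log(p) for p in sk_probs])
```

Note that a zero class count has no finite logarithm, which is one reason a serialized model's probabilities can still differ slightly from scikit-learn's.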

feature_names: List[str]

Names of the features (required)

classification_labels: List[str]

Labels of the classification targets

classification_weights: List[float]

Weights of the classification targets

es_if_exists: {‘fail’, ‘replace’}, default ‘fail’

How to behave if the model already exists

  • fail: Raise a ValueError

  • replace: Overwrite the existing model

es_compress_model_definition: bool

If True, uses ‘compressed_definition’, which sends gzipped JSON instead of raw JSON to reduce the amount of data transferred over the wire in HTTP requests. Defaults to True.
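To see why compression is the default, here is a self-contained sketch; the payload below is a toy stand-in, not a real eland model definition. Serialized tree models are deeply nested, repetitive JSON, which compresses very well under gzip, so even after base64 encoding the compressed form is far smaller than the raw JSON.

```python
import base64
import gzip
import json

# Toy stand-in for a serialized tree; a real model definition is similar in
# spirit (repetitive, nested JSON) but much larger.
definition = {
    "trained_model": {
        "tree": {
            "feature_names": ["f0", "f1", "f2", "f3", "f4"],
            "tree_structure": [
                {"node_index": i, "leaf_value": 0.5} for i in range(200)
            ],
        }
    }
}

raw = json.dumps(definition).encode("utf-8")
compressed = base64.b64encode(gzip.compress(raw))

# Even after base64 expansion (~4/3), the gzipped payload is far smaller
# than the raw JSON, and the original bytes round-trip losslessly.
```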

Examples#

>>> from sklearn import datasets
>>> from sklearn.tree import DecisionTreeClassifier
>>> from eland.ml import MLModel
>>> # Train model
>>> training_data = datasets.make_classification(n_features=5, random_state=0)
>>> test_data = [[-50.1, 0.2, 0.3, -0.5, 1.0], [1.6, 2.1, -10, 50, -1.0]]
>>> classifier = DecisionTreeClassifier()
>>> classifier = classifier.fit(training_data[0], training_data[1])
>>> # Get some test results
>>> classifier.predict(test_data)
array([0, 1])
>>> # Serialize the model to Elasticsearch
>>> feature_names = ["f0", "f1", "f2", "f3", "f4"]
>>> model_id = "test_decision_tree_classifier"
>>> es_model = MLModel.import_model(
...   'http://localhost:9200',
...   model_id=model_id,
...   model=classifier,
...   feature_names=feature_names,
...   es_if_exists='replace'
... )
>>> # Get some test results from Elasticsearch model
>>> es_model.predict(test_data)
array([0, 1])
>>> # Delete model from Elasticsearch
>>> es_model.delete_model()