SkLearn2PMML 
Python package for converting Scikit-Learn pipelines to PMML.
Features
This package is a thin Python wrapper around the JPMML-SkLearn library.
News and Updates
The current version is 0.129.2 (13 March, 2026):
pip install sklearn2pmml==0.129.2
See the NEWS.md file.
Prerequisites
- Java 11 or newer. The Java executable must be available on the system path.
- Python 3.8 or newer.
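The two prerequisites can be checked from Python before installing. The following is a minimal sketch using only the standard library; the function names `parse_java_major` and `check_prerequisites` are hypothetical helpers for illustration and are not part of SkLearn2PMML. It assumes that `java -version` prints a quoted version string such as `"11.0.2"` or `"1.8.0_292"`, which is common but may vary between Java vendors.

```python
import re
import shutil
import subprocess
import sys

def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major

def check_prerequisites(min_python=(3, 8), min_java=11):
    """Return a list of problems; an empty list means nothing was detected."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or newer is required" % min_python)
    if shutil.which("java") is None:
        problems.append("No 'java' executable found on the system path")
    else:
        # Java typically writes its version banner to stderr
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
        major = parse_java_major(result.stderr + result.stdout)
        if major is not None and major < min_java:
            problems.append("Java %d or newer is required, found Java %d" % (min_java, major))
    return problems

if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

If the version banner cannot be parsed, the sketch reports no Java version problem rather than guessing.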
Installation
Installing a release version from PyPI:
pip install sklearn2pmml
Alternatively, installing the latest snapshot version from GitHub:
pip install --upgrade git+https://github.com/jpmml/sklearn2pmml.git
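To confirm which version ended up installed, the distribution metadata can be queried from Python. This is a small sketch based on the standard `importlib.metadata` module (available since Python 3.8); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata

def installed_version(dist_name="sklearn2pmml"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

if __name__ == "__main__":
    print(installed_version())
```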
Usage
Command-line application
The sklearn2pmml module is executable.
The main application loads the estimator object from the Pickle file (-i or --input; supports joblib, pickle or dill variants), performs the conversion, and saves the result to a PMML file (-o or --output):
python -m sklearn2pmml --input pipeline.pkl --output pipeline.pmml
Getting help:
python -m sklearn2pmml --help
On some platforms, the Pip package installer additionally makes the main application available as a top-level command:
sklearn2pmml --input pipeline.pkl --output pipeline.pmml
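When the conversion needs to be triggered from another Python program, the same command line can be launched as a subprocess. The sketch below only uses the `--input` and `--output` options shown above; the helper names `build_convert_command` and `convert` are hypothetical, and running the module through `sys.executable` is an assumption that the package is installed in the current interpreter's environment.

```python
import subprocess
import sys

def build_convert_command(input_path, output_path):
    """Build the argument list for `python -m sklearn2pmml`."""
    return [
        sys.executable, "-m", "sklearn2pmml",
        "--input", str(input_path),
        "--output", str(output_path),
    ]

def convert(input_path, output_path):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_convert_command(input_path, output_path), check=True)

if __name__ == "__main__":
    convert("pipeline.pkl", "pipeline.pmml")
```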
Library
A typical workflow can be summarized as follows:
- Create a PMMLPipeline object, and populate it with pipeline steps as usual. The sklearn2pmml.pipeline.PMMLPipeline class extends the sklearn.pipeline.Pipeline class with the following functionality:
  - If the PMMLPipeline.fit(X, y) method is invoked with pandas.DataFrame or pandas.Series object as an X argument, then its column names are used as feature names. Otherwise, feature names default to "x1", "x2", .., "x{number_of_features}".
  - If the PMMLPipeline.fit(X, y) method is invoked with pandas.Series object as an y argument, then its name is used as the target name (for supervised models). Otherwise, the target name defaults to "y".
- Fit and validate the pipeline as usual.
- Optionally, compute and embed verification data into the PMMLPipeline object by invoking PMMLPipeline.verify(X) method with a small but representative subset of training data.
- Convert the PMMLPipeline object to a PMML file in local filesystem by invoking the sklearn2pmml.sklearn2pmml(estimator, pmml_path) utility method.
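The naming rules described in the workflow above can be illustrated with a few lines of plain Python. This is only a sketch of the documented defaults, not the library's actual implementation; the function names are hypothetical, and `None` stands in for "the argument carries no names" (for example, a NumPy array).

```python
def default_feature_names(number_of_features):
    """Fallback names: "x1", "x2", .., "x{number_of_features}"."""
    return ["x" + str(i) for i in range(1, number_of_features + 1)]

def resolve_feature_names(column_names, number_of_features):
    """Prefer the column names carried by X, else fall back to the defaults."""
    if column_names is not None:
        return list(column_names)
    return default_feature_names(number_of_features)

def resolve_target_name(series_name):
    """Prefer the name carried by y, else fall back to "y"."""
    return series_name if series_name is not None else "y"

if __name__ == "__main__":
    print(resolve_feature_names(None, 4))
    print(resolve_target_name("Species"))
```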
Developing a simple decision tree model for the classification of iris species:
import pandas
iris_df = pandas.read_csv("Iris.csv")
iris_X = iris_df[iris_df.columns.difference(["Species"])]
iris_y = iris_df["Species"]
from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml.pipeline import PMMLPipeline
pipeline = PMMLPipeline([
    ("classifier", DecisionTreeClassifier())
])
pipeline.fit(iris_X, iris_y)
from sklearn2pmml import sklearn2pmml
sklearn2pmml(pipeline, "DecisionTreeIris.pmml", with_repr = True)
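A quick sanity check on the generated file is to parse it as XML and list the declared fields. The sketch below relies only on the standard library and on the general PMML layout (a PMML root element with DataField elements in its data dictionary); the helper names are hypothetical, and the namespace is stripped so that the check does not depend on a particular PMML schema version.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the "{namespace}" prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def summarize_pmml_root(root):
    """Return the PMML version and the (name, optype, dataType) of each DataField."""
    if local_name(root.tag) != "PMML":
        raise ValueError("Not a PMML document: root element is %r" % local_name(root.tag))
    fields = [
        (elem.get("name"), elem.get("optype"), elem.get("dataType"))
        for elem in root.iter()
        if local_name(elem.tag) == "DataField"
    ]
    return {"version": root.get("version"), "data_fields": fields}

def summarize_pmml(path):
    """Parse a PMML file from disk and summarize it."""
    return summarize_pmml_root(ET.parse(path).getroot())

if __name__ == "__main__":
    print(summarize_pmml("DecisionTreeIris.pmml"))
```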
Developing a more elaborate logistic regression model for the same:
import pandas
iris_df = pandas.read_csv("Iris.csv")
iris_X = iris_df[iris_df.columns.difference(["Species"])]
iris_y = iris_df["Species"]
from sklearn_pandas import DataFrameMapper
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn2pmml.decoration import ContinuousDomain
from sklearn2pmml.pipeline import PMMLPipeline
pipeline = PMMLPipeline([
    ("mapper", DataFrameMapper([
        (["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"], [ContinuousDomain(), SimpleImputer()])
    ])),
    ("pca", PCA(n_components = 3)),
    ("selector", SelectKBest(k = 2)),
    ("classifier", LogisticRegression(multi_class = "ovr"))
])
pipeline.fit(iris_X, iris_y)
pipeline.verify(iris_X.sample(n = 15))
from sklearn2pmml import sklearn2pmml
sklearn2pmml(pipeline, "LogisticRegressionIris.pmml", with_repr = True)
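Since this example calls `pipeline.verify(...)`, it may be useful to confirm that verification data was actually embedded in the output. The sketch below assumes that the data is stored in a ModelVerification element, which is the element the PMML specification defines for this purpose; the helper names are hypothetical and the check ignores XML namespaces.

```python
import xml.etree.ElementTree as ET

def has_model_verification(root):
    """Return True if any element in the tree is named ModelVerification."""
    return any(
        elem.tag.rsplit("}", 1)[-1] == "ModelVerification"
        for elem in root.iter()
    )

def pmml_file_has_verification(path):
    """Parse a PMML file from disk and look for embedded verification data."""
    return has_model_verification(ET.parse(path).getroot())

if __name__ == "__main__":
    print(pmml_file_has_verification("LogisticRegressionIris.pmml"))
```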
Documentation
Integrations:
- Training Scikit-Learn GridSearchCV StatsModels pipelines
- Converting Scikit-Learn H2O.ai pipelines to PMML
- Converting customized Scikit-Learn estimators to PMML
- Training Scikit-Learn StatsModels pipelines
- Upgrading Scikit-Learn XGBoost pipelines
- Training Python-based XGBoost accelerated failure time models
- Converting Scikit-Learn PyCaret 3 pipelines to PMML
- Training Scikit-Learn H2O.ai pipelines
- One-hot encoding categorical features in Scikit-Learn XGBoost pipelines
- Training Scikit-Learn TF(-IDF) plus XGBoost pipelines
- Converting Scikit-Learn TF(-IDF) pipelines to PMML
- Converting Scikit-Learn Imbalanced-Learn pipelines to PMML
- Converting logistic regression models to PMML
- Stacking Scikit-Learn, LightGBM and XGBoost models
- Converting Scikit-Learn GridSearchCV pipelines to PMML
- Converting Scikit-Learn TPOT pipelines to PMML
- Converting Scikit-Learn LightGBM pipelines to PMML
Extensions:
- Extending Scikit-Learn with feature cross-references
- Extending Scikit-Learn with UDF expression transformer
- Extending Scikit-Learn with CHAID models
- Extending Scikit-Learn with prediction post-processing
- Extending Scikit-Learn with outlier detector transformer
- Extending Scikit-Learn with date and datetime features
- Extending Scikit-Learn with feature specifications
- Extending Scikit-Learn with GBDT+LR ensemble models
- Extending Scikit-Learn with business rules model
Miscellaneous:
- Upgrading Scikit-Learn decision tree models
- Measuring the memory consumption of Scikit-Learn models
- Benchmarking Scikit-Learn against JPMML-Evaluator
- Analyzing Scikit-Learn feature importances via PMML
De-installation
Uninstalling:
pip uninstall sklearn2pmml
License
SkLearn2PMML is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.
If you would like to use SkLearn2PMML in a proprietary software project, then it is possible to enter into a licensing agreement which makes SkLearn2PMML available under the terms and conditions of the BSD 3-Clause License instead.
Additional information
SkLearn2PMML is developed and maintained by Openscoring Ltd, Estonia.
Interested in using Java PMML API software in your company? Please contact info@openscoring.io
