PMML4S

PMML4S is a PMML (Predictive Model Markup Language) scoring library for Scala. It provides both Scala and Java Evaluator APIs for PMML.

Table of Contents

- Features
  - Models support
  - Transformations support
- Installation
  - SBT users
  - Maven users
- Use in Scala
- Use in Java
- Use PMML in Spark
- Use PMML in PySpark
- Use PMML in Python
- Deploy PMML as REST API
- Attentions
- Support
- License

Features

PMML4S is a lightweight, clean and efficient implementation based on the PMML specification from 2.0 through to the latest 4.4.1.

Models support

It supports the following models:

Not yet supported models:

Transformations support

It supports the following transformations:

- Normalization
- Discretization
- Value mapping
- Text Indexing
- Functions

Not yet supported transformations:

- Aggregation
- Lag

Installation

SBT users

libraryDependencies += "org.pmml4s" %%  "pmml4s" % pmml4sVersion
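
A minimal build.sbt sketch, assuming you define the version yourself: pmml4sVersion in the line above is not a built-in sbt setting, and the "x.y.z" below is a placeholder rather than a real release number. The %% operator appends your project's Scala binary version to the artifact name, which is what the Maven artifactId pmml4s_${scala.version} spells out by hand.

```scala
// build.sbt
// Placeholder: replace with an actual PMML4S release version.
val pmml4sVersion = "x.y.z"

// %% appends the Scala binary version (for example _2.13) to "pmml4s".
libraryDependencies += "org.pmml4s" %% "pmml4s" % pmml4sVersion
```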

Maven users

<dependency>
  <groupId>org.pmml4s</groupId>
  <artifactId>pmml4s_${scala.version}</artifactId>
  <version>${pmml4s.version}</version>
</dependency>

Use in Scala

PMML4S is really easy to use. Just do one or more of the following:

Load model.

import org.pmml4s.model.Model
import scala.io.Source

// Load a model from an IO source. Various sources are supported, e.g. a URL that points to a PMML model.
val model = Model(Source.fromURL(new java.net.URL("http://dmg.org/pmml/pmml_examples/KNIME_PMML_4.1_Examples/single_iris_dectree.xml")))

import org.pmml4s.model.Model

// Load a model with one of the helper methods, which accept a pathname, a file object, a string, an array of bytes, or an input stream.
val model = Model.fromFile("single_iris_dectree.xml")
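
Loading can fail, for instance when the file is missing. The sketch below wraps a loader in Try so the caller gets an Either instead of an exception. loadSafely is a hypothetical helper, not part of PMML4S, and the loader argument stands in for Model.fromFile so that the snippet runs without the library.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper: run a loader and turn any exception into a message.
// In real code the loader would be something like: p => Model.fromFile(p)
def loadSafely[M](path: String)(loader: String => M): Either[String, M] =
  Try(loader(path)) match {
    case Success(model) => Right(model)
    case Failure(error) => Left(s"Could not load '$path': ${error.getMessage}")
  }

// Stand-in loaders, used here only to show both outcomes.
val ok = loadSafely("single_iris_dectree.xml")(path => s"model from $path")
val failed = loadSafely("missing.xml")(_ => throw new java.io.FileNotFoundException("missing.xml"))
```

With PMML4S on the classpath, a call such as loadSafely("single_iris_dectree.xml")(p => Model.fromFile(p)) would follow the same pattern.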

Call predict(values) to score new values. The input can take several forms, and the result always has the same type as the input.

values in a Map:

scala> val result = model.predict(Map("sepal_length" -> 5.1, "sepal_width" -> 3.5, "petal_length" -> 1.4, "petal_width" -> 0.2))
result: Map[String,Any] = Map(probability -> 1.0, probability_Iris-versicolor -> 0.0, probability_Iris-setosa -> 1.0, probability_Iris-virginica -> 0.0, predicted_class -> Iris-setosa, node_id -> 1)
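
Because the result is a Map[String, Any], values need a type check before use. The following is a small sketch of one way to do that, with the result map copied from the session above rather than produced by a live model. Treating node_id as a string is an assumption, based on the quoted value in the JSON output shown later in this section.

```scala
// Result copied from the REPL output above; in real code it comes from model.predict(...).
val result: Map[String, Any] = Map(
  "probability" -> 1.0,
  "probability_Iris-versicolor" -> 0.0,
  "probability_Iris-setosa" -> 1.0,
  "probability_Iris-virginica" -> 0.0,
  "predicted_class" -> "Iris-setosa",
  "node_id" -> "1" // assumed to be a string
)

// Option-based lookups avoid exceptions when an output field is absent.
val predictedClass: Option[String] = result.get("predicted_class").map(_.toString)

// collect keeps the value only if it really is a Double.
val probability: Option[Double] = result.get("probability").collect { case d: Double => d }

println(s"class=$predictedClass, probability=$probability")
```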

values in a list of pairs of keys and values:

scala> val result = model.predict("sepal_length" -> 5.1, "sepal_width" -> 3.5, "petal_length" -> 1.4, "petal_width" -> 0.2)
result: Seq[(String, Any)] = ArraySeq((predicted_class,Iris-setosa), (probability,1.0), (probability_Iris-setosa,1.0), (probability_Iris-versicolor,0.0), (probability_Iris-virginica,0.0), (node_id,1))

values in an Array:

The values are assumed to be in the same order as the input fields list, and the results are in the same order as the output fields list.

scala> val inputNames = model.inputNames
inputNames: Array[String] = Array(sepal_length, sepal_width, petal_length, petal_width)

scala> val result = model.predict(Array(5.1, 3.5, 1.4, 0.2))
result: Array[Any] = Array(Iris-setosa, 1.0, 1.0, 0.0, 0.0, 1)

scala> val outputNames = model.outputNames
outputNames: Array[String] = Array(predicted_class, probability, probability_Iris-setosa, probability_Iris-versicolor, probability_Iris-virginica, node_id)
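
Since array results are positional, pairing them with outputNames makes them easier to read. A minimal sketch, using the arrays printed above as literal stand-ins for model.outputNames and model.predict(...); node_id is written as a string here, which is an assumption.

```scala
// Stand-ins copied from the REPL session above.
val outputNames: Array[String] = Array(
  "predicted_class", "probability", "probability_Iris-setosa",
  "probability_Iris-versicolor", "probability_Iris-virginica", "node_id")
val result: Array[Any] = Array("Iris-setosa", 1.0, 1.0, 0.0, 0.0, "1")

// Pair each output name with the value at the same position.
val named: Map[String, Any] = outputNames.zip(result).toMap

println(named("predicted_class"))
```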

values in the JSON format:

Two styles are supported. The JSON string can contain more than one record to predict, and the result is a JSON string in the same style as the input.

 - `records` : list like [{column -> value}, … , {column -> value}]
 - `split` : dict like {'columns' -> [columns], 'data' -> [values]}

scala> val result = model.predict("""[{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}, {"sepal_length": 7, "sepal_width": 3.2, "petal_length": 4.7, "petal_width": 1.4}]""")
result: String = [{"probability":1.0,"probability_Iris-versicolor":0.0,"probability_Iris-setosa":1.0,"probability_Iris-virginica":0.0,"predicted_class":"Iris-setosa","node_id":"1"},{"probability":0.9074074074074074,"probability_Iris-versicolor":0.9074074074074074,"probability_Iris-setosa":0.0,"probability_Iris-virginica":0.09259259259259259,"predicted_class":"Iris-versicolor","node_id":"3"}]

scala> val result = model.predict("""{"columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "data":[[5.1, 3.5, 1.4, 0.2], [7, 3.2, 4.7, 1.4]]}""")
result: String = {"columns":["predicted_class","probability","probability_Iris-setosa","probability_Iris-versicolor","probability_Iris-virginica","node_id"],"data":[["Iris-setosa",1.0,1.0,0.0,0.0,"1"],["Iris-versicolor",0.9074074074074074,0.0,0.9074074074074074,0.09259259259259259,"3"]]}
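
To make the difference between the two styles concrete, here is a sketch that builds each payload from the same rows. toRecordsJson and toSplitJson are hypothetical helpers, not part of PMML4S, and they assume numeric values and column names that need no JSON escaping. For anything beyond that, a real JSON library is the safer choice.

```scala
// Hypothetical helper: one JSON object per row, keyed by column name.
def toRecordsJson(columns: Seq[String], rows: Seq[Seq[Double]]): String =
  rows.map { row =>
    columns.zip(row)
      .map { case (column, value) => "\"" + column + "\": " + value }
      .mkString("{", ", ", "}")
  }.mkString("[", ", ", "]")

// Hypothetical helper: column names once, then one array of values per row.
def toSplitJson(columns: Seq[String], rows: Seq[Seq[Double]]): String = {
  val cols = columns.map(column => "\"" + column + "\"").mkString("[", ", ", "]")
  val data = rows.map(_.mkString("[", ", ", "]")).mkString("[", ", ", "]")
  s"""{"columns": $cols, "data": $data}"""
}

val columns = Seq("sepal_length", "sepal_width", "petal_length", "petal_width")
val rows = Seq(Seq(5.1, 3.5, 1.4, 0.2), Seq(7.0, 3.2, 4.7, 1.4))

println(toRecordsJson(columns, rows))
println(toSplitJson(columns, rows))
```

Either string can then be passed to model.predict as in the session above.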

values in a PMML4S Series:

import org.pmml4s.data.Series
import org.pmml4s.util.Utils

// The input schema lists the input fields with their names and data types; you can prepare data based on it.
scala> val inputSchema = model.inputSchema
inputSchema: org.pmml4s.common.StructType = StructType(StructField(sepal_length,double), StructField(sepal_width,double), StructField(petal_length,double), StructField(petal_width,double))

// There are several factory methods to construct a Series object.
// 1. values in a Map
scala> val result = model.predict(Series.fromMap(Map("sepal_length" -> "5.1", "sepal_width" -> "3.5", "petal_length" -> "1.4", "petal_width" -> "0.2"), inputSchema))
val result: org.pmml4s.data.Series = [Iris-setosa,1,1,0,0,1],[(predicted_class,string),(probability,real),(probability_Iris-setosa,real),(probability_Iris-versicolor,real),(probability_Iris-virginica,real),(node_id,string)]

// 2. values in an Array
scala> val result = model.predict(Series.fromArray(Array(5.1, 3.5, 1.4, 0.2), inputSchema))
val result: org.pmml4s.data.Series = [Iris-setosa,1,1,0,0,1],[(predicted_class,string),(probability,real),(probability_Iris-setosa,real),(probability_Iris-versicolor,real),(probability_Iris-virginica,real),(node_id,string)]

// 3. DataVals in a Seq
// Suppose the row is a record, held in a map, from an external columnar data source, e.g. a CSV file or a relational database.
scala> val row = Map("sepal_length" -> "5.1", "sepal_width" -> "3.5", "petal_length" -> "1.4", "petal_width" -> "0.2")

// You need to convert the data to the desired type defined by PMML, and keep the same order as defined in the input schema.
scala> val values = inputSchema.map(x => Utils.toDataVal(row(x.name), x.dataType))
val values: Seq[org.pmml4s.data.DataVal] = List(5.1, 3.5, 1.4, 0.2)

scala> val result = model.predict(Series.fromSeq(values))
result: org.pmml4s.data.Series = [Iris-setosa,1.0,1.0,0.0,0.0,1],[(predicted_class,string),(probability,double),(probability_Iris-setosa,double),(probability_Iris-versicolor,double),(probability_Iris-virginica,double),(node_id,string)]
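
The ordering step is the part that is easy to get wrong. The sketch below shows it in plain Scala: fieldOrder is a stand-in for the field names in model.inputSchema, and toDouble stands in for Utils.toDataVal, so nothing here calls PMML4S.

```scala
// Stand-in for the field names of model.inputSchema (an assumption for this sketch).
val fieldOrder: Seq[String] =
  Seq("sepal_length", "sepal_width", "petal_length", "petal_width")

// A record whose keys arrive in arbitrary order, e.g. from a CSV reader.
val row: Map[String, String] = Map(
  "petal_width" -> "0.2",
  "sepal_length" -> "5.1",
  "petal_length" -> "1.4",
  "sepal_width" -> "3.5"
)

// Look up each field by name in schema order, then convert to the target type.
val ordered: Seq[Double] = fieldOrder.map(name => row(name).toDouble)

println(ordered.mkString(", "))
```

Driving the lookup from the schema, rather than from the row, keeps the values aligned with the input fields regardless of how the source orders its columns.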

Which format to use?

Use whichever format suits your environment. In most cases you don't need to call Utils.toDataVal explicitly to convert data to the types defined by PMML; for the other formats, the conversion is performed automatically. For example, if the input values are strings rather than doubles, you still get the same correct results.

scala> val result = model.predict(Map("sepal_length" -> "5.1", "sepal_width" -> "3.5", "petal_length" -> "1.4", "petal_width" -> "0.2"))
result: Map[String,Any] = Map(probability -> 1.0, probability_Iris-versicolor -> 0.0, probability_Iris-setosa -> 1.0, probability_Iris-virginica -> 0.0, predicted_class -> Iris-setosa, node_id -> 1)

scala> val result = model.predict(Array("5.1", "3.5", "1.4", "0.2"))
result: Array[Any] = Array(Iris-setosa, 1.0, 1.0, 0.0, 0.0, 1)

Understand the result values.

You can see the names of output fields, like predicted_class, probability, actually those names are tri
