SynapseML

Simple and Distributed Machine Learning


Synapse Machine Learning

SynapseML (previously known as MMLSpark) is an open-source library that simplifies the creation of massively scalable machine learning (ML) pipelines. SynapseML provides simple, composable, and distributed APIs for a wide variety of machine learning tasks such as text analytics, vision, anomaly detection, and many others. SynapseML is built on the Apache Spark distributed computing framework and shares the same API as the SparkML/MLlib library, allowing you to seamlessly embed SynapseML models into existing Apache Spark workflows.
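Because SynapseML estimators follow the SparkML API, they can sit in an ordinary `Pipeline` next to built-in stages. The following is a minimal sketch, not an official recipe: it assumes a running Spark session with the SynapseML package on the classpath, and the helper name `build_pipeline` and the column names are hypothetical.

```python
def build_pipeline(feature_cols, label_col="label"):
    """Return a SparkML Pipeline that assembles features and trains LightGBM.

    `feature_cols` and `label_col` are illustrative column names, not part of
    SynapseML itself.
    """
    # Imported lazily so this module still loads where Spark is not installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from synapse.ml.lightgbm import LightGBMClassifier

    # Built-in SparkML stage: pack the input columns into one vector column.
    assembler = VectorAssembler(inputCols=list(feature_cols), outputCol="features")

    # SynapseML stage: used exactly like any other SparkML estimator.
    classifier = LightGBMClassifier(featuresCol="features", labelCol=label_col)

    return Pipeline(stages=[assembler, classifier])
```

With a training DataFrame `train_df` (assumed here), `build_pipeline(["age", "income"]).fit(train_df)` would return a fitted `PipelineModel` whose `transform` method scores new data.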

With SynapseML, you can build scalable and intelligent systems to solve challenges in domains such as anomaly detection, computer vision, deep learning, text analytics, and others. SynapseML can train and evaluate models on single-node, multi-node, and elastically resizable clusters of computers. This lets you scale your work without wasting resources. SynapseML is usable across Python, R, Scala, Java, and .NET. Furthermore, its API abstracts over a wide variety of databases, file systems, and cloud data stores to simplify experiments no matter where data is located.

SynapseML requires Scala 2.12, Spark 3.4+, and Python 3.8+.
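As a rough illustration of these requirements, the sketch below checks version strings against the minimums stated above. The function names are hypothetical and the parsing is deliberately simple; in practice the Spark version could come from `spark.version` in an active session.

```python
import re
import sys

# Minimums taken from the requirements line above.
MIN_SPARK = (3, 4)
MIN_PYTHON = (3, 8)
REQUIRED_SCALA = (2, 12)


def parse_version(text, parts=2):
    """Return the leading numeric components of a dotted version string."""
    numbers = []
    for piece in text.strip().split(".")[:parts]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    # Pad short strings such as "3" to the requested length.
    while len(numbers) < parts:
        numbers.append(0)
    return tuple(numbers)


def check_requirements(spark_version, scala_version, python_version=None):
    """Return a list of human-readable problems; an empty list means OK."""
    if python_version is None:
        python = tuple(sys.version_info[:2])  # the running interpreter
    else:
        python = parse_version(python_version)

    problems = []
    if parse_version(spark_version) < MIN_SPARK:
        problems.append(f"Spark {spark_version} is older than 3.4")
    if parse_version(scala_version) != REQUIRED_SCALA:
        problems.append(f"Scala {scala_version} is not a 2.12 release")
    if python < MIN_PYTHON:
        problems.append("Python is older than 3.8")
    return problems
```

For example, `check_requirements("3.5.0", "2.12.18")` checks the given Spark and Scala versions together with the interpreter that is currently running.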

| Topics  | Links                                             |
| :------ | :------------------------------------------------ |
| Build   | Build Status, codecov, Code style: black          |
| Version | Version, Release Notes, Snapshot Version          |
| Docs    | Website, Scala Docs, PySpark Docs, Academic Paper |
| Support | Gitter, Mail                                      |
| Binder  | Binder                                            |
| Usage   | Downloads                                         |

Features

<!-- markdownlint-disable MD033 -->

| <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/vw-blue-dark-orange.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/cog_services_on_spark_2.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/decision_tree_recolor.png"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/mmlspark_serving_recolor.svg"> |
| :---: | :---: | :---: | :---: |
| Vowpal Wabbit on Spark | The Cognitive Services for Big Data | LightGBM on Spark | Spark Serving |
| Fast, Sparse, and Effective Text Analytics | Leverage the Microsoft Cognitive Services at Unprecedented Scales in your existing SparkML pipelines | Train Gradient Boosted Machines with LightGBM | Serve any Spark Computation as a Web Service with Sub-Millisecond Latency |

| <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/microservice_recolor.png"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/emails/onnxai-ar21_crop.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/emails/scales.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/bindings.png"> |
| :---: | :---: | :---: | :---: |
| HTTP on Spark | ONNX on Spark | Responsible AI | Spark Binding Autogeneration |
| An Integration Between Spark and the HTTP Protocol, enabling Distributed Microservice Orchestration | Distributed and Hardware Accelerated Model Inference on Spark | Understand Opaque-box Models and Measure Dataset Biases | Automatically Generate Spark bindings for PySpark and SparklyR |

| <img width="150" src="https://mmlspark.blob.core.windows.net/graphics/emails/isolation forest 3.svg"> |
