SynapseML
Simple and Distributed Machine Learning
Install / Use
/learn @microsoft/SynapseMLREADME
Synapse Machine Learning
SynapseML (previously known as MMLSpark), is an open-source library that simplifies the creation of massively scalable machine learning (ML) pipelines. SynapseML provides simple, composable, and distributed APIs for a wide variety of different machine learning tasks such as text analytics, vision, anomaly detection, and many others. SynapseML is built on the Apache Spark distributed computing framework and shares the same API as the SparkML/MLLib library, allowing you to seamlessly embed SynapseML models into existing Apache Spark workflows.
With SynapseML, you can build scalable and intelligent systems to solve challenges in domains such as anomaly detection, computer vision, deep learning, text analytics, and others. SynapseML can train and evaluate models on single-node, multi-node, and elastically resizable clusters of computers. This lets you scale your work without wasting resources. SynapseML is usable across Python, R, Scala, Java, and .NET. Furthermore, its API abstracts over a wide variety of databases, file systems, and cloud data stores to simplify experiments no matter where data is located.
SynapseML requires Scala 2.12, Spark 3.4+, and Python 3.8+.
| Topics | Links |
| :------ | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Build |
|
| Version |
|
| Docs |
|
| Support |
|
| Binder |
|
| Usage |
|
Features
<!-- markdownlint-disable MD033 -->| <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/vw-blue-dark-orange.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/cog_services_on_spark_2.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/decision_tree_recolor.png"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/mmlspark_serving_recolor.svg"> | | :----------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------: | | Vowpal Wabbit on Spark | The Cognitive Services for Big Data | LightGBM on Spark | Spark Serving | | Fast, Sparse, and Effective Text Analytics | Leverage the Microsoft Cognitive Services at Unprecedented Scales in your existing SparkML pipelines | Train Gradient Boosted Machines with LightGBM | Serve any Spark Computation as a Web Service with Sub-Millisecond Latency |
| <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/microservice_recolor.png"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/emails/onnxai-ar21_crop.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/emails/scales.svg"> | <img width="800" src="https://mmlspark.blob.core.windows.net/graphics/Readme/bindings.png"> | | :----------------------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------: |:-----------------------------------------------------------------------------------------------------------------------:| | HTTP on Spark | ONNX on Spark | Responsible AI | Spark Binding Autogeneration | | An Integration Between Spark and the HTTP Protocol, enabling Distributed Microservice Orchestration | Distributed and Hardware Accelerated Model Inference on Spark | Understand Opaque-box Models and Measure Dataset Biases | Automatically Generate Spark bindings for PySpark and SparklyR |
| <img width="150" src="https://mmlspark.blob.core.windows.net/graphics/emails/isolation forest 3.svg"> |
Related Skills
tmux
337.3kRemote-control tmux sessions for interactive CLIs by sending keystrokes and scraping pane output.
claude-opus-4-5-migration
83.2kMigrate prompts and code from Claude Sonnet 4.0, Sonnet 4.5, or Opus 4.1 to Opus 4.5
terraform-provider-genesyscloud
Terraform Provider Genesyscloud
blogwatcher
337.3kMonitor blogs and RSS/Atom feeds for updates using the blogwatcher CLI.
