# Sparkdl

MLlib convolutional and feedforward neural network implementation with a high-level API and advanced optimizers.
## Deep Learning for Spark MLlib

Distributed deep learning on Spark, with a Keras-like API and MLlib integration. See the Spark Packages homepage for releases.
## Convolutional Neural Network

```scala
import org.apache.spark.ml.dl._
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val data = sqlContext.read.format("libsvm").load("path_to_dataset.txt")
val dataset = data.withColumnRenamed("label", "labels")

// Set up the architecture for the convolutional neural network
val model = new Sequential()
model.add(new Convolution2D(8, 1, 3, 3, 28, 28))
model.add(new Activation("relu"))
model.add(new Dropout(0.5))
model.add(new Dense(6272, 10))
model.add(new Activation("softmax"))
model.compile(loss="categorical_crossentropy",
  optimizer=new Optimizer().adam(lr=.001),
  metrics="Accuracy")
val trained = model.fit(dataset, num_iters=500)
```
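The `Dense(6272, 10)` input size follows from the convolution output: with 8 filters over a 28×28 image (assuming the layer preserves spatial size, as the `Convolution2D(8, 1, 3, 3, 28, 28)` arguments suggest), the flattened feature map holds 8 × 28 × 28 = 6272 values. A quick sanity check of that arithmetic, under those assumptions:

```scala
// Hypothetical sanity check of the flattened size feeding Dense(6272, 10).
// Assumes Convolution2D(8, 1, 3, 3, 28, 28) means 8 filters, 1 input channel,
// a 3x3 kernel, and a 28x28 input whose spatial size is preserved.
val filters = 8
val height = 28
val width = 28
val flattened = filters * height * width
println(flattened) // 6272
```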
## Feedforward Neural Network

```scala
import org.apache.spark.ml.dl._
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val data = sqlContext.read.format("libsvm").load("path_to_dataset.txt")
val dataset = data.withColumnRenamed("label", "labels")

// Set up the architecture for the feedforward neural network
val model = new Sequential()
model.add(new Dense(784, 100))
model.add(new Activation("relu"))
model.add(new Dense(100, 10))
model.add(new Activation("softmax"))
model.compile(loss="categorical_crossentropy",
  optimizer=new Optimizer().adam(lr=.001),
  metrics="Accuracy")
val trained = model.fit(dataset, num_iters=500)
```
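The 784 inputs match a flattened 28×28 image (e.g. MNIST). As a rough sense of model size, and assuming each `Dense(in, out)` holds an `in × out` weight matrix plus an `out`-length bias vector (an assumption about the layer internals), the parameter count works out as:

```scala
// Parameter count for Dense(784, 100) -> Dense(100, 10), assuming each
// Dense(in, out) layer stores in*out weights plus out biases.
val layer1 = 784 * 100 + 100 // 78500
val layer2 = 100 * 10 + 10   // 1010
val total = layer1 + layer2
println(total) // 79510
```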
## Installation Options

### Spark Packages

### Clone and Build

Clone the project with:

```shell
git clone https://github.com/JeremyNixon/sparkdl.git
```

`cd` into the repository and run

```shell
sbt assembly
```

to build the project.
Next, publish the build locally to Ivy or Maven. To publish to Ivy, run

```shell
sbt publish-local
```

or, to publish to Maven, run

```shell
sbt publishM2
```

Then start Spark with

```shell
./spark-shell --packages default:sparkdl_2.11:0.0.1
```

to run Spark with sparkdl.
## Contribution Guide

To contribute to the project, you'll need to build and modify your cloned fork. Once you've made your changes, run:

```shell
sbt assembly
```

to build. Then publish to your local Maven or Ivy repository under a different version number: modify the version in pom.xml and in build.sbt, for example from 1.0.0 to 1.0.1. Then call:

```shell
sbt publish-local
# sbt publishM2 publishes to Maven instead, producing a differently named artifact.
```

to publish to Ivy. Once your local copy has been published, you can load it through Spark Packages. This call will differ from the original: your Scala version is appended to the artifact name, the group ID becomes `default`, and the version number changes to the one you've provided. For example:

```shell
./spark-shell --packages default:sparkdl_2.11:1.0.1
```

This starts a shell where you can run your modified code.
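The version bump in `build.sbt` would look something like the following (the exact layout of the project's build file may differ; this is a sketch of the one key that matters):

```scala
// build.sbt: bump the version so the locally published artifact
// does not collide with the original 1.0.0 release.
version := "1.0.1"
```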
