Learnware
Based on the learnware paradigm, the learnware package supports the entire process including the submission, usability testing, organization, identification, deployment, and reuse of learnwares. Simultaneously, this repository serves as Beimingwu's engine, supporting its core functionalities.
Introduction
The learnware paradigm was proposed by Professor Zhi-Hua Zhou in 2016 [1, 2]. In the learnware paradigm, developers worldwide can share models with the learnware dock system, which effectively searches for and reuses learnware(s) to help users solve machine learning tasks efficiently without starting from scratch.
The learnware package provides a fundamental implementation of the central concepts and procedures within the learnware paradigm. Its well-structured design ensures high scalability and facilitates the seamless integration of additional features and techniques in the future.
In addition, the learnware package serves as the engine for the Beimingwu System and can be effectively employed for conducting experiments related to learnware.
[1] Zhi-Hua Zhou. Learnware: on the future of machine learning. Frontiers of Computer Science, 2016, 10(4): 589–590 <br/> [2] Zhi-Hua Zhou. Machine Learning: Development and Future. Communications of CCF, 2017, vol.13, no.1 (2016 CNCC keynote)
Learnware Paradigm
A learnware consists of a high-performance machine learning model and specifications that characterize the model, i.e., "Learnware = Model + Specification". These specifications, encompassing both semantic and statistical aspects, detail the model's functionality and statistical information, making it easier for future users to identify and reuse these models.
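The "Learnware = Model + Specification" idea can be pictured with a small, purely illustrative data structure. The class and field names below (`Specification`, `ToyLearnware`, `semantic`, `statistical`) are hypothetical and are not the classes used by the learnware package; this is only a sketch of how a model and its describing specifications travel together.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Specification:
    """Hypothetical container: semantic tags plus statistical information."""

    semantic: Dict[str, str]  # e.g. task type, data type, scenario
    statistical: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyLearnware:
    """Hypothetical illustration of "Learnware = Model + Specification"."""

    model: Callable[[List[float]], int]  # any callable that makes predictions
    specification: Specification


# A trivial threshold rule stands in for a trained model.
toy = ToyLearnware(
    model=lambda x: int(sum(x) > 1.0),
    specification=Specification(
        semantic={"task_type": "Classification", "data_type": "Table"},
        statistical={"n_features": 2},
    ),
)

# A future user reads the specification first, then reuses the model.
print(toy.specification.semantic["task_type"], toy.model([0.9, 0.4]))
```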
<div align="center"> <img src="./docs/_static/img/learnware_market.svg" width="700" height="auto" style="max-width: 100%;" /> </div>

The above diagram illustrates the learnware paradigm, which consists of two distinct stages:
- Submitting Stage: Developers voluntarily submit various learnwares to the learnware dock system, and the system conducts quality checks and further organization of these learnwares.
- Deploying Stage: When users submit task requirements, the learnware dock system automatically decides whether to recommend a single learnware or a combination of multiple learnwares and provides efficient deployment methods. In either case, the system offers convenient learnware reuse interfaces.
Framework and Infrastructure Design
<div align="center"> <img src="./docs/_static/img/learnware_framework.svg" width="700" height="auto" style="max-width: 100%;"/> </div>

The architecture is designed around four guidelines: decoupling, autonomy, reusability, and scalability. The above diagram illustrates the framework from the perspectives of both modules and workflows.
- At the workflow level, the `learnware` package consists of `Submitting Stage` and `Deploying Stage`.
| Stage | Workflow |
| ---- | ---- |
| Submitting Stage | The developers submit learnwares to the learnware market, which conducts usability checks and further organization of these learnwares. |
| Deploying Stage | The learnware market recommends learnwares according to users’ task requirements and provides efficient reuse and deployment methods. |
- At the module level, the `learnware` package is a platform that consists of `Learnware`, `Market`, `Specification`, `Model`, `Reuse`, and `Interface` modules.
| Module | Description |
| ---- | ---- |
| Learnware | The specific learnware, consisting of the specification module and the user model module. |
| Market | Designed for learnware organization, identification, and usability testing. |
| Specification | Generating and storing statistical and semantic information of learnware, which can be used for learnware search and reuse. |
| Model | Including the base model and the model container, which can provide unified interfaces and automatically create isolated runtime environments. |
| Reuse | Including the data-free reuser, data-dependent reuser, and aligner, which can deploy and reuse learnware for user tasks. |
| Interface | The interface for network communication with the Beimingwu backend.|
Quick Start
Installation
Learnware is currently hosted on PyPI. You can easily install learnware by following these steps:
pip install learnware
In the learnware package, besides the base classes, many core functionalities such as "learnware specification generation" and "learnware deployment" rely on the torch library. Users have the option to manually install torch, or they can directly use the following command to install the learnware package:
pip install learnware[full]
Note: Because local environments vary, installing learnware[full] does not guarantee that torch will be able to invoke CUDA on the user's machine.
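After installation, a quick check along the following lines can show whether torch is importable and whether it can see a CUDA device. The helper name `cuda_status` is made up for this sketch; the only torch call used is `torch.cuda.is_available()`.

```python
def cuda_status() -> str:
    """Report whether torch is importable and whether it can see a CUDA device."""
    try:
        import torch  # optional dependency, pulled in by learnware[full]
    except ImportError:
        return "torch not installed"
    return "cuda available" if torch.cuda.is_available() else "cpu only"


print(cuda_status())
```

If the result is "cpu only" on a machine with a GPU, the torch build or driver setup is the usual place to look, which is why installing torch manually can be preferable.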
Prepare Learnware
In the learnware package, each learnware is encapsulated in a zip package, which should contain at least the following four files:
- `learnware.yaml`: learnware configuration file.
- `__init__.py`: methods for using the model.
- `stat.json`: the statistical specification of the learnware. Its filename can be customized and recorded in `learnware.yaml`.
- `environment.yaml` or `requirements.txt`: specifies the environment for the model.
To facilitate the construction of a learnware, we provide a Learnware Template that users can use as a basis for building their own learnware. We've also detailed the format of the learnware zip package in Learnware Preparation.
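Before submitting, it can help to confirm that a zip package contains the files listed above. The following is a minimal sketch using only the standard library; the function name `missing_learnware_files` is hypothetical, it assumes the files sit at the top level of the archive, and it skips the statistical specification because that filename is configurable. It is not the usability check performed by the market.

```python
import os
import tempfile
import zipfile
from typing import List


def missing_learnware_files(zip_path: str) -> List[str]:
    """Return the required entries that are absent from a learnware zip.

    Only top-level names are checked. The statistical specification is not
    checked because its filename can be customized in learnware.yaml.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())

    missing = [f for f in ("learnware.yaml", "__init__.py") if f not in names]
    if not names & {"environment.yaml", "requirements.txt"}:
        missing.append("environment.yaml or requirements.txt")
    return missing


# Demo: build an incomplete package in a temporary directory and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo_learnware.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("learnware.yaml", "")
        zf.writestr("__init__.py", "")
    demo_missing = missing_learnware_files(demo_zip)

print(demo_missing)  # ['environment.yaml or requirements.txt']
```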
Learnware Package Workflow
Users can start a learnware workflow according to the following steps:
Initialize a Learnware Market
You can initialize a basic Learnware Market named "demo" using the code snippet below:
from learnware.market import instantiate_learnware_market
# instantiate a demo market
demo_market = instantiate_learnware_market(market_id="demo", name="easy", rebuild=True)
Upload Learnware
Before uploading your learnware to the Learnware Market, you'll need to create a semantic specification, semantic_spec. This involves selecting or inputting values for semantic tags to describe the features of your task and model.
For instance, the following code illustrates the semantic specification for a Scikit-Learn type model. This model is tailored for education scenarios and performs classification tasks on tabular data:
from learnware.specification import generate_semantic_spec
semantic_spec = generate_semantic_spec(
    name="demo_learnware",
    data_type="Table",
    task_type="Classification",
    library_type="Scikit-learn",
    scenarios="Education",
    license="MIT",
)
After preparing the semantic specification, you can insert your learnware into the learnware market using a single line of code:
demo_market.add_learnware(zip_path, semantic_spec)
Here, zip_path is the file path of your learnware zip package.
Semantic Specification Search
To identify learnwares that align with your task's purpose, you'll need to provide a semantic specification, user_semantic, that outlines your task's characteristics. The Learnware Market will then perform an initial search based on user_semantic, which filters learnwares by considering the semantic information of your task.
from learnware.market import BaseUserInfo

# construct user_info, which includes a semantic specification
# (here the semantic_spec built above stands in for user_semantic)
user_info = BaseUserInfo(id="user", semantic_spec=semantic_spec)

# search_learnware: performs semantic specification search when user_info doesn't include a statistical specification
search_result = demo_market.search_learnware(user_info)
single_result = search_result.get_single_results()

# single_result: the List of Tuple[Score, Learnware] returned by semantic specification search
print(single_result)
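To make the idea of filtering by semantic information concrete, here is a toy sketch that keeps only catalog entries whose tags agree with the user's tags. It is an assumption-laden illustration: the catalog, the tag names, and the `semantic_filter` function are hypothetical, and the matching logic actually used by the Learnware Market is not described here and may differ.

```python
from typing import Dict, List


def semantic_filter(
    catalog: List[Dict[str, str]], query: Dict[str, str]
) -> List[Dict[str, str]]:
    """Keep entries whose tags agree with every tag given in the query."""
    return [
        entry
        for entry in catalog
        if all(entry.get(key) == value for key, value in query.items())
    ]


# Hypothetical catalog of semantic tags, one dict per learnware.
catalog = [
    {"name": "lw_a", "data_type": "Table", "task_type": "Classification"},
    {"name": "lw_b", "data_type": "Image", "task_type": "Classification"},
    {"name": "lw_c", "data_type": "Table", "task_type": "Regression"},
]
query = {"data_type": "Table", "task_type": "Classification"}

matches = semantic_filter(catalog, query)
print([m["name"] for m in matches])  # ['lw_a']
```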
Statistical Specification Search
If you generate and provide a statistical specification file rkme.json, the Learnware Market will conduct learnware identification based on statistical information, and return more targeted models. Using the API we provided, you can easily generate this statistical specification locally.
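As background, a kernel mean embedding summarizes a dataset by a weighted set of points, and two such summaries can be compared by their distance in the kernel's feature space. The sketch below computes that squared distance with a Gaussian kernel in pure Python. It is a conceptual illustration only: the function names, the kernel choice, and the `gamma` value are assumptions, and it is not the package's RKME implementation or its search procedure.

```python
import math
from typing import Sequence

Point = Sequence[float]


def gaussian_kernel(x: Point, y: Point, gamma: float = 0.5) -> float:
    """k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def embedding_distance_sq(
    xs: Sequence[Point],
    x_weights: Sequence[float],
    zs: Sequence[Point],
    z_weights: Sequence[float],
    gamma: float = 0.5,
) -> float:
    """Squared feature-space distance between two weighted mean embeddings."""

    def cross(ps, pw, qs, qw):
        # Weighted sum of kernel values over all pairs (p, q).
        return sum(
            wp * wq * gaussian_kernel(p, q, gamma)
            for p, wp in zip(ps, pw)
            for q, wq in zip(qs, qw)
        )

    return (
        cross(xs, x_weights, xs, x_weights)
        - 2.0 * cross(xs, x_weights, zs, z_weights)
        + cross(zs, z_weights, zs, z_weights)
    )


# Four data points forming two clusters, with uniform weights.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]
data_w = [0.25] * 4

# A two-point weighted summary near the cluster centers, and an unrelated set.
summary, summary_w = [[0.05, 0.0], [1.05, 1.0]], [0.5, 0.5]
far, far_w = [[5.0, 5.0], [6.0, 6.0]], [0.5, 0.5]

near_dist = embedding_distance_sq(data, data_w, summary, summary_w)
far_dist = embedding_distance_sq(data, data_w, far, far_w)
print(near_dist < far_dist)  # True
```

A small summary that tracks the data's distribution ends up much closer to it than an unrelated set does, which is the intuition behind matching tasks to models through statistical specifications.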
For example, the code below executes learnware search when using Reduced Kernel Mean Embedding (RKME) as the statistical specification:
import os

import learnware.specification as specification
user_spec = specification.RKMETableSpecification()
# unzip_path: directory for unzipped learnware zipfile
user_spec.load(os.path.join(unzip_path, "rkme.json"))
user_info = BaseUserInfo(
semantic_spec=user_semantic, stat_info={"RKMETableSpec
