SemGen
A tool for semantics-based annotation and composition of biosimulation models
SemGen is an experimental software tool for automating the modular composition and decomposition of biosimulation models.
SemGen facilitates the construction of complex, integrated models, and the swift extraction of reusable submodels from larger ones. SemGen relies on the semantically-rich SemSim model description format to help automate these modeling tasks.
With SemGen, users can:
- Visualize models using D3 force-directed networks,
- Create SemSim versions of existing models and annotate them with rich semantic data,
- Automatically decompose models into interoperable submodels,
- Semi-automatically merge models into more complex systems, and
- Encode models in executable simulation formats.
Getting Started
These instructions will help you use SemGen to visualize, annotate, extract, and merge models.
Prerequisites
SemGen is a Java-based program and requires Java Runtime Environment version 1.7 (64-bit) or higher to execute.
To check your Java version, go to a command prompt and enter:
java -version
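If you want to script this check, the sketch below parses the text that `java -version` prints and compares it against the 1.7 minimum stated above. The function names are hypothetical and not part of SemGen, and the sample strings are illustrative. The 64-bit test assumes the JVM banner contains the text "64-Bit", as HotSpot and OpenJDK builds typically print.

```python
import re
from typing import Tuple


def parse_java_version(version_output: str) -> Tuple[int, int]:
    """Return (major, minor) from the first quoted version string.

    Handles both the legacy scheme ("1.7.0_80") and the newer one ("11.0.2").
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in output")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return major, minor


def meets_semgen_requirement(version_output: str) -> bool:
    """True if the reported Java version is 1.7 or higher."""
    # Tuple comparison: (1, 7) <= (1, 8) <= (11, 0), and (1, 6) < (1, 7).
    return parse_java_version(version_output) >= (1, 7)


def looks_64_bit(version_output: str) -> bool:
    """Assumption: 64-bit JVMs include "64-Bit" in their banner line."""
    return "64-Bit" in version_output


# Illustrative output only; your installation will print something different.
sample = (
    'java version "1.8.0_202"\n'
    "Java(TM) SE Runtime Environment (build 1.8.0_202-b08)\n"
    "Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)\n"
)
print(meets_semgen_requirement(sample), looks_64_bit(sample))
```

Note that `java -version` writes to standard error rather than standard output, so a wrapper that runs the command should capture stderr.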
Installing Pre-built Binaries
Simply download the appropriate build for your operating system from the releases page.
Windows: Download and run the Windows installer. You will then be able to run SemGen from the location where you installed it by double-clicking the SemGen.exe file, or if using installation defaults, from the Windows Start menu.
Mac: Open the SemGen .dmg file, and drag SemGen.app to the Applications folder. Double-click SemGen.app to start the program.
Linux: Unarchive the SemGen .tar.gz file. Double-click the SemGen.jar file in the main SemGen directory to start the program.
Building from Source
SemGen can be built from source using Apache Ant. From the root of the source directory, run the following two commands:
ant -buildfile build.xml build # compile the Java sources to .class files
ant -buildfile build.xml create_jar # bundle the .class files and third-party dependencies into a .jar
This will create the file SemSimAPI.jar in the root directory. You can run this file as follows to start the Py4J server:
java -classpath ./SemSimAPI.jar semsim.Py4J
Running SemGen
Here is a primer on how to use SemGen to load, visualize, annotate, extract, and merge models.
In SemGen, the Project tab will be your main workspace:

- Search: Hovering your cursor over the magnifying glass brings up the search bar. You can search for example models or currently visualized nodes by typing in search terms. The search can be performed over the name, the description, or the annotation.
- Project Actions: The menu on the left side contains project-level actions. This menu can be collapsed/expanded by clicking the chevron on the left edge.
- Stage Options: The menu on the right side contains visualization options, as well as additional information about the selected node. This menu can be collapsed/expanded by clicking the chevron on the right edge.
- Selection/Navigation: The buttons in the top right corner toggle click-and-drag between moving the visualization and selecting multiple nodes. Additionally, the mouse scroll wheel can be used to zoom in and out of the visualization.
Loading a model
To load a model, click the Open model button under Project Actions on the left-hand side. This will prompt you to select a model file to load (SemGen currently supports the SemSim, CellML, SBML, and JSim file formats):

Once you select a model, it will be loaded in SemGen and visualized as a model node:

Alternatively, SemGen comes with a library of example models. These can be accessed by using the search bar. Hover over the magnifying glass on the top left and type in terms to search for. Click the model name in the results to load the model:

Visualizing a model
Once a model is loaded in SemGen, there are several ways to visualize and explore the model.
Select the model you want to visualize by clicking the model node (the selected node will have a yellow ring around it). Then click one of the visualizations in the Project Actions menu on the left-hand side.
An entire model or submodel can be moved by clicking and dragging the hull surrounding the group of nodes. You can also adjust the view by clicking and dragging the whitespace around the model or zooming in and out using the mouse wheel.


NOTE: Occasionally, the layout algorithm may push a model's nodes drastically outside the viewing range. Re-clicking one of the visualization buttons in the Project Actions menu usually repositions the nodes inside the viewing range. See issue #214.
Submodels
The submodel visualization shows the hierarchical and/or compartmental organization of the model:

Each submodel node can be further expanded by double clicking it:

Dependencies
The dependency visualization shows the mathematical dependency network in the model:

Different node types can be hidden or shown in the Stage Options menu, which can be useful for visualizing large models:

PhysioMap
PhysioMap displays the physiological processes and their participants (sources, sinks, and mediators) based on the semantics of the biological processes and entities:
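As a rough illustration of the structure PhysioMap draws, the sketch below models a process with its sources, sinks, and mediators and turns it into labeled edges. This is not SemGen's internal representation: the `Process` class, the `process_edges` function, and the glucose phosphorylation example are all hypothetical and chosen only to show the three participant roles.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Process:
    """A physiological process and its participants (hypothetical structure)."""
    name: str
    sources: List[str] = field(default_factory=list)    # entities consumed
    sinks: List[str] = field(default_factory=list)      # entities produced
    mediators: List[str] = field(default_factory=list)  # entities that modulate


def process_edges(process: Process) -> List[Tuple[str, str, str]]:
    """Return (from_node, to_node, role) edges for one process.

    Sources and mediators point into the process node; the process node
    points to its sinks.
    """
    edges = []
    for entity in process.sources:
        edges.append((entity, process.name, "source"))
    for entity in process.sinks:
        edges.append((process.name, entity, "sink"))
    for entity in process.mediators:
        edges.append((entity, process.name, "mediator"))
    return edges


# Illustrative example, not taken from a SemGen model.
phosphorylation = Process(
    name="glucose phosphorylation",
    sources=["glucose"],
    sinks=["glucose 6-phosphate"],
    mediators=["hexokinase"],
)
for edge in process_edges(phosphorylation):
    print(edge)
```

Keeping the role on each edge is what would let a renderer style sources, sinks, and mediators differently.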

Annotator
Click here for a comprehensive Annotator tutorial.
With the Annotator tool, you can convert mathematical models into the SemSim format and annotate the model's codewords using concepts from online reference ontologies. Currently the Annotator can convert MML, SBML, and CellML models into the SemSim format. The Semantics of Biological Processes group maintains a protocol for annotating a model which can help guide the annotation process.
To annotate a model, click the Annotate button under Project Actions. This will create a new Annotation tab:

Composite annotations
Each composite annotation consists of a physical property term connected to a physical entity or physical process term. The physical entity term can itself be a composite of ontology terms. We recommend using only terms from the Ontology of Physics for Biology (OPB) for the physical property annotation components. For the physical entity annotations, we recommend using robust, thorough, and widely accepted online reference ontologies like the Foundational Model of Anatomy (FMA), Chemical Entities of Biological Interest (ChEBI), and Gene Ontology cellular components (GO-cc). For physical process annotations, we recommend creating custom terms and defining them by identifying their thermodynamic sources, sinks, and mediators from the physical entities in the model.
When you edit a composite annotation for a model codeword, the Annotator provides an interface for rapid searching and retrieval of reference ontology concepts via the BioPortal web service.
Example: Suppose you are annotating a beta cell glycolysis model that includes a codeword representing glucose concentration in the cytosol of the cell.
A detailed composite annotation would be:
OPB:Chemical concentration <propertyOf> CHEBI:glucose <part_of> FMA:Portion of cytosol <part_of> FMA:Beta cell
In this case we use the term Chemical concentration from the OPB for the physical property part of the annotation, and we compose the physical entity part by linking three concepts: one from ChEBI and two from the FMA. This example illustrates the post-coordinated nature of the SemSim approach to annotation and how it provides high expressivity for annotating model terms.
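To make the structure of a composite annotation concrete, here is a minimal sketch that splits the textual form shown above into a physical property term and a chain of physical entity terms. It assumes the `ONTOLOGY:label <relation> ONTOLOGY:label` notation used in this README. The class and function names are hypothetical, and this is neither the SemSim API nor the SemSim file format.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    ontology: str  # e.g. "OPB", "CHEBI", "FMA"
    label: str     # e.g. "Chemical concentration"


@dataclass(frozen=True)
class CompositeAnnotation:
    physical_property: Term
    entity_terms: Tuple[Term, ...]      # the (possibly composite) physical entity
    entity_relations: Tuple[str, ...]   # relations linking consecutive entity terms


def parse_term(text: str) -> Term:
    """Split "ONTOLOGY:label" at the first colon."""
    ontology, _, label = text.strip().partition(":")
    if not ontology or not label:
        raise ValueError("expected ONTOLOGY:label, got %r" % text)
    return Term(ontology, label)


def parse_composite(text: str) -> CompositeAnnotation:
    """Parse the README's textual notation into its components."""
    # The capturing group keeps the relation names in the result, so the
    # list alternates: term, relation, term, relation, term, ...
    parts = re.split(r"\s*<(\w+)>\s*", text.strip())
    terms = [parse_term(part) for part in parts[0::2]]
    relations = parts[1::2]
    if len(terms) < 2 or relations[0] != "propertyOf":
        raise ValueError("expected '<property> <propertyOf> <entity> ...'")
    return CompositeAnnotation(terms[0], tuple(terms[1:]), tuple(relations[1:]))


detailed = parse_composite(
    "OPB:Chemical concentration <propertyOf> CHEBI:glucose "
    "<part_of> FMA:Portion of cytosol <part_of> FMA:Beta cell"
)
print(detailed.physical_property)
print([term.ontology for term in detailed.entity_terms])
```

Making the classes frozen means two annotations built from the same terms compare equal, which is convenient when checking whether two codewords carry exactly the same annotation.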
The above example represents a very detailed composite annotation; however, such detail may not be necessary to disambiguate concepts in a given model. For example, there may not be any other portions of glucose within the model apart from that in the cytosol. In this case, one could use the first two terms in the composite annotation and still disambiguate the model codeword from the rest of the model's contents:
OPB:Chemical concentration <propertyOf> CHEBI:glucose
Although this annotation approach does not fully capture the biophysical meaning of the model codeword, SemGen is more likely to find semantic overlap between models if they use this shallower annotation style. This is mainly because the SemGen Merger tool currently only recognizes semantic equivalencies; it does not identify semantically similar terms in models that
