SemGen

A tool for semantics-based annotation and composition of biosimulation models


SemGen is an experimental software tool for automating the modular composition and decomposition of biosimulation models.

SemGen facilitates the construction of complex, integrated models, and the swift extraction of reusable submodels from larger ones. SemGen relies on the semantically-rich SemSim model description format to help automate these modeling tasks.

With SemGen, users can:

  • Visualize models using D3 force-directed networks,
  • Create SemSim versions of existing models and annotate them with rich semantic data,
  • Automatically decompose models into interoperable submodels,
  • Semi-automatically merge models into more complex systems, and
  • Encode models in executable simulation formats.

Getting Started

These instructions will help you use SemGen to visualize, annotate, extract, and merge models.

Prerequisites

SemGen is a Java-based program and requires Java Runtime Environment version 1.7 (64-bit) or higher to execute.

To check your Java version, go to a command prompt and enter:

java -version
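The same check can be scripted. The sketch below is a minimal, hypothetical helper (the function names are illustrative and not part of SemGen). It assumes the usual `java -version` output, where the version appears in double quotes, and compares the major version against the 1.7 minimum stated above. It does not check whether the runtime is 64-bit.

```python
import re
import subprocess

def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        return None
    parts = re.split(r"[._\-+]", match.group(1))
    # Legacy scheme: "1.7.0_80" means Java 7; modern scheme: "11.0.2" means Java 11.
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    return int(parts[0])

def meets_semgen_requirement(version_output, minimum=7):
    """True when the reported major version is at least `minimum` (7 = Java 1.7)."""
    major = parse_java_major(version_output)
    return major is not None and major >= minimum

def installed_java_output():
    """Run `java -version`; return its text, or "" if java is not on the PATH."""
    try:
        # `java -version` writes its report to stderr, not stdout.
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return ""
    return result.stderr
```

Calling `meets_semgen_requirement(installed_java_output())` then reports whether the Java found on the PATH looks recent enough.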

Installing Pre-built Binaries

Simply download the appropriate build for your operating system from the releases page.

Windows: Download and run the Windows installer. You will then be able to run SemGen from the location where you installed it by double-clicking the SemGen.exe file, or if using installation defaults, from the Windows Start menu.

Mac: Open the SemGen .dmg file and drag SemGen.app to the Applications folder. Double-click SemGen.app to start the program.

Linux: Unarchive the SemGen .tar.gz file. Double-click the SemGen.jar file in the main SemGen directory to start the program.

Building from Source

SemGen can be built from source using Apache Ant. From the root of the source directory, run the following two commands:

ant -buildfile build.xml build # compile the Java sources to .class files
ant -buildfile build.xml create_jar # bundle the .class files and third-party dependencies into a .jar

This will create the file SemSimAPI.jar in the root directory. You can run this file as follows to start the Py4J server:

java -classpath ./SemSimAPI.jar semsim.Py4J
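Before pointing a client at the server, it can help to confirm that something is actually listening. The sketch below is not part of SemSim, and `port_is_open` is a hypothetical helper. It assumes the server runs on the local machine and uses Py4J's default gateway port (25333), which this README does not state; adjust the port if your build differs.

```python
import socket

# Py4J's default gateway port; assumed here, not confirmed for semsim.Py4J.
PY4J_DEFAULT_PORT = 25333

def port_is_open(host="127.0.0.1", port=PY4J_DEFAULT_PORT, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A `True` result only shows that a process accepts connections on that port, not that it is the SemSim server.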

Running SemGen

Here is a primer on how to use SemGen to load, visualize, annotate, extract, and merge models.

In SemGen, the Project tab will be your main workspace:

  • Search: Hovering your cursor over the magnifying glass brings up the search bar. You can search for example models or currently visualized nodes by typing in search terms. The search can cover the name, the description, or the annotation.
  • Project Actions: The menu on the left side contains project-level actions. This menu can be collapsed/expanded by clicking the chevron on the left edge.
  • Stage Options: The menu on the right side contains visualization options, as well as additional information about the selected node. This menu can be collapsed/expanded by clicking the chevron on the right edge.
  • Selection/Navigation: The buttons in the top right corner toggle click-and-drag between moving the visualization and selecting multiple nodes. Additionally, the mouse scroll wheel can be used to zoom in/out of the visualization.

Loading a model

To load a model, click the Open model button under Project Actions on the left-hand side. This will prompt you to select a model file to load (SemGen currently supports the SemSim, CellML, SBML, and JSim file formats).

Once you select a model, it will be loaded in SemGen and visualized as a model node.

Alternatively, SemGen comes with a library of example models. These can be accessed by using the search bar. Hover over the magnifying glass on the top left and type in terms to search for. Click the model name in the results to load the model.

Visualizing a model

Once a model is loaded in SemGen, there are several ways to visualize and explore the model.

Select the model you want to visualize by clicking the model node (the selected node will have a yellow ring around it). Then click one of the visualizations in the Project Actions menu on the left-hand side.

An entire model or submodel can be moved by clicking and dragging the hull surrounding the group of nodes. You can also adjust the view by clicking and dragging the whitespace around the model or zooming in and out using the mouse wheel.

NOTE: Occasionally, the layout algorithm may push a model's nodes far outside the viewing range. Re-clicking one of the visualization buttons in the Project Actions menu usually repositions the nodes inside the viewing range. See issue #214.

Submodels

The submodel visualization shows the hierarchical and/or compartmental organization of the model.

Each submodel node can be further expanded by double-clicking it.

Dependencies

The dependency visualization shows the mathematical dependency network in the model.
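To make the idea of a mathematical dependency network concrete, the sketch below builds one for a toy model. The variable names and their relationships are invented for illustration and do not come from any SemGen example model, and the code says nothing about how SemGen stores the network internally. Each key is a codeword, and its value is the set of codewords its equation reads.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy model: each key is computed from the variables listed for it.
dependencies = {
    "flow": {"pressure", "resistance"},
    "pressure": {"volume", "compliance"},
    "volume": set(),
    "compliance": set(),
    "resistance": set(),
}

def dependency_edges(deps):
    """Return sorted (input, output) pairs, one per edge in the network."""
    return sorted((src, dst) for dst, srcs in deps.items() for src in srcs)

def evaluation_order(deps):
    """Return an order in which every variable comes after its inputs."""
    return list(TopologicalSorter(deps).static_order())

edges = dependency_edges(dependencies)
order = evaluation_order(dependencies)
```

The edge list is what a dependency view would draw as arrows. The evaluation order shows that the inputs (`volume`, `compliance`, `resistance`) must be known before `pressure` and `flow` can be computed.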

Different node types can be hidden or shown in the Stage Options menu, which can be useful for visualizing large models.

PhysioMap

PhysioMap displays the physiological processes and their participants (sources, sinks, and mediators) based on the semantics of the biological processes and entities.
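The process and participant structure can be pictured with a small data model. This is an illustrative sketch only: the `Process` class, the `physiomap_edges` helper, and the example reaction are hypothetical and are not SemGen's internal representation.

```python
from dataclasses import dataclass, field

@dataclass
class Process:
    name: str
    sources: list = field(default_factory=list)    # entities the process draws from
    sinks: list = field(default_factory=list)      # entities the process feeds into
    mediators: list = field(default_factory=list)  # entities that modulate the process

def physiomap_edges(processes):
    """Return (from_node, to_node, role) triples linking entities and processes."""
    edges = []
    for p in processes:
        edges += [(s, p.name, "source") for s in p.sources]
        edges += [(p.name, s, "sink") for s in p.sinks]
        edges += [(m, p.name, "mediator") for m in p.mediators]
    return edges

# Hypothetical example: glucose phosphorylation mediated by an enzyme.
reaction = Process(
    name="glucose phosphorylation",
    sources=["glucose", "ATP"],
    sinks=["glucose 6-phosphate", "ADP"],
    mediators=["glucokinase"],
)
edges = physiomap_edges([reaction])
```

Each triple corresponds to one link in a process map: sources point into the process, the process points to its sinks, and mediators attach to the process without being consumed.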

Annotator

Click here for a comprehensive Annotator tutorial.

With the Annotator tool, you can convert mathematical models into the SemSim format and annotate the model's codewords using concepts from online reference ontologies. Currently the Annotator can convert MML, SBML, and CellML models into the SemSim format. The Semantics of Biological Processes group maintains a protocol for annotating a model which can help guide the annotation process.

To annotate a model, click the Annotate button under Project Actions. This will create a new Annotation tab.

Composite annotations

Each composite annotation consists of a physical property term connected to a physical entity or physical process term. The physical entity term can itself be a composite of ontology terms. We recommend using only terms from the Ontology of Physics for Biology (OPB) for the physical property annotation components. For physical entity annotations, we recommend using robust, thorough, and widely accepted online reference ontologies like the Foundational Model of Anatomy (FMA), Chemical Entities of Biological Interest (ChEBI), and Gene Ontology cellular components (GO-cc). For physical process annotations, we recommend creating custom terms and defining them by identifying their thermodynamic sources, sinks, and mediators from the physical entities in the model.

When you edit a composite annotation for a model codeword, the Annotator provides an interface for rapid searching and retrieval of reference ontology concepts via the BioPortal web service.

Example: Suppose you are annotating a beta cell glycolysis model that includes a codeword representing glucose concentration in the cytosol of the cell.

A detailed composite annotation would be:

OPB:Chemical concentration <propertyOf> CHEBI:glucose <part_of> FMA:Portion of cytosol <part_of> FMA:Beta cell

In this case we use the term Chemical concentration from the OPB for the physical property part of the annotation, and we compose the physical entity part by linking three concepts: one from ChEBI and two from the FMA. This example illustrates the post-coordinated nature of the SemSim approach to annotation and how it provides high expressivity for annotating model terms.
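The structure of this composite annotation can be sketched as a small data model. The `Term` and `CompositeAnnotation` classes below are hypothetical and are not the SemSim API; they only show how a property term and a chain of entity terms combine into the string shown above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    ontology: str  # e.g. "OPB", "CHEBI", "FMA"
    label: str

    def __str__(self):
        return f"{self.ontology}:{self.label}"

@dataclass(frozen=True)
class CompositeAnnotation:
    physical_property: Term
    entity_chain: tuple  # Terms ordered from the most specific entity outward

    def __str__(self):
        chain = " <part_of> ".join(str(t) for t in self.entity_chain)
        return f"{self.physical_property} <propertyOf> {chain}"

glucose_conc = CompositeAnnotation(
    physical_property=Term("OPB", "Chemical concentration"),
    entity_chain=(
        Term("CHEBI", "glucose"),
        Term("FMA", "Portion of cytosol"),
        Term("FMA", "Beta cell"),
    ),
)
print(glucose_conc)
```

Keeping the entity part as an ordered chain is one way to reflect post-coordination: the meaning comes from how the reference terms are linked, not from a single pre-defined term.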

The above example represents a very detailed composite annotation; however, such detail may not be necessary to disambiguate concepts in a given model. For example, there may not be any other portions of glucose within the model apart from that in the cytosol. In this case, one could use the first two terms in the composite annotation and still disambiguate the model codeword from the rest of the model's contents:

OPB:Chemical concentration <propertyOf> CHEBI:glucose
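The trade-off between the detailed and the shortened form can be illustrated with a toy comparison. The sketch assumes a matching rule under which two codewords pair up only when their annotations are identical term for term. That rule is a simplification for illustration, not SemGen's Merger code, and both models are hypothetical.

```python
# Annotations as tuples: (property, entity, enclosing structures...).
# Both codewords and both models are hypothetical.
model_a_detailed = (
    "OPB:Chemical concentration",
    "CHEBI:glucose",
    "FMA:Portion of cytosol",
    "FMA:Beta cell",
)
model_b_shallow = ("OPB:Chemical concentration", "CHEBI:glucose")

def shorten(annotation, keep=2):
    """Keep only the first `keep` terms of a composite annotation."""
    return annotation[:keep]

def exactly_equivalent(a, b):
    """Assumed matching rule: annotations match only if every term is identical."""
    return a == b

detailed_match = exactly_equivalent(model_a_detailed, model_b_shallow)
shallow_match = exactly_equivalent(shorten(model_a_detailed), model_b_shallow)
```

Under this assumed rule the detailed annotation fails to match the shallow one, while the shortened form matches.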

Although this annotation approach does not fully capture the biophysical meaning of the model codeword, SemGen is more likely to find semantic overlap between models if they use this shallower annotation style. This is mainly because the SemGen Merger tool currently only recognizes semantic equivalencies; it does not identify semantically similar terms in models that
