SMECS
Software Metadata Extraction and Curation Software (SMECS)
Install / Use
/learn @NFDI4Energy/SMECSREADME
Software Metadata Extraction and Curation Software (SMECS)
| A web application to extract and curate research software metadata following the CodeMeta <https://codemeta.github.io/>_ (version 3.0 <https://raw.githubusercontent.com/codemeta/codemeta/3.0/codemeta.jsonld>) software metadata standard.
|
| SMECS facilitates the extraction of research software metadata from GitHub and GitLab repositories. It provides a user-friendly graphical interface for visualizing the retrieved metadata, enabling researchers and research software engineers to create high-quality metadata without reentering information already available elsewhere. The curated metadata is exported as CodeMeta-compliant JSON, ensuring integration with other tools and enhancing the discoverability, reuse, and impact of research software.
|
| 📄 For more details, see our Paper <http://dx.doi.org/10.14279/eceasst.v85.2708>.
|
| Authors: Stephan Ferenz, Aida Jafarbigloo
|
Phases in SMECS
| The workflow of SMECS consists of four sequential phases: Start, Extraction, Curation, and Export. | .. image:: https://github.com/NFDI4Energy/SMECS/blob/master/docs/Extraction_via_hermes-1.png :alt: SMECS Workflow :width: 1000px |
- Start Phase
In the Start phase, users provide two key inputs: - A repository link (GitHub or GitLab) - A personal access token for the corresponding platform SMECS can operate without user-provided tokens for some repositories by using internal default tokens. However: - For other GitLab instances, a user-provided token is always required. - Providing a token can enable SMECS to extract more detailed metadata from certain repositories. | 2. Extraction Phase
The Extraction phase uses HERMES <https://github.com/softwarepub/hermes>_ harvesting steps to retrieve metadata from multiple sources. For details on the metadata fields, see: Metadata Terms in SMECS <https://github.com/NFDI4Energy/SMECS/blob/master/static/schema/codemeta_schema.json>. Once the inputs from the Start phase are submitted, SMECS initiates metadata retrieval using four HERMES harvesters:
- GitHub
- GitLab
- CFF (Citation File Format <https://citation-file-format.github.io/>)
- CodeMeta
GitHub and GitLab metadata are harvested via the HERMES GitHub/GitLab plugin <https://github.com/softwarepub/hermes-plugin-github-gitlab>_.
All harvested metadata are mapped to CodeMeta using existing crosswalks from CodeMeta and HERMES, plus a custom crosswalk we created for GitLab. The metadata are then processed and merged via the HERMES processing step, producing a unified set of metadata. These results are displayed in the Curation phase. The HERMES-based approach ensures an interoperable, modular architecture that makes it easy to integrate additional harvesting sources in the future.
| 3. Curation Phase
The Curation phase allows users to edit and refine the extracted metadata. The metadata are displayed in a form-based interface organized into four main tabs: #. General Information #. Provenance #. Related Persons #. Technical Aspects
Key visualization and curation features include:
- Metadata Visualization & User-Friendly Interface: Metadata is displayed in a structured, easy-to-read format. The interface is intuitive, responsive, and allows smooth navigation through metadata fields.
- Missing Metadata Identification: SMECS flags fields where metadata is absent.
- Required Metadata Properties: Certain fields are marked as mandatory to ensure completeness of the final output.
- Editable Fields: Users can directly edit or correct metadata within the interface.
- Tagging Feature: Some fields allow multiple values for better metadata organization.
- Suggestion Lists: For selected fields, SMECS provides suggestions to reduce manual input and ensure consistency.
- Form-to-JSON Synchronization: Updates in the form are mirrored in the JSON view (one-directional) so users can track changes instantly.
- Export Phase
In the Export phase, the curated metadata can be downloaded as a CodeMeta 3.0–compliant JSON file. Users can: - Include this file in their repository to make their research software more FAIR - Use it for other purposes, such as uploading metadata to a software registry
| | Installation and Usage
Install from GitHub
-
Cloning the repository .. code-block:: shell
git clone https://github.com/NFDI4Energy/SMECS.git
-
Navigate to the Project Directory
.. code-block:: shellcd SMECS
-
Creating virtual environment
-
Ensure that
Python 3.10 or higher <https://www.python.org/>_ is installed on your system.- Windows: Check the version with
py --version. - Unix/MacOS: Check the version with
python3 --version.
- Windows: Check the version with
-
Create the virtual environment.
-
Windows: .. code-block:: shell
py -m venv my-env
-
Unix/MacOS: .. code-block:: shell
python3 -m venv my-env
| for more details visit
Creation of virtual environments <https://docs.python.org/3/library/venv.html>_ -
-
Activate virtual environment.
- Windows: .. code-block:: shell
my-env\Scripts\activate
- Unix/MacOS: .. code-block:: shell
source my-env/bin/activate
(Note that activating the virtual environment change the shell's prompt and show what virtual environment is being used.)
-
-
Managing Packages with pip
-
Ensure you can run pip from command prompt.
-
Windows: .. code-block:: shell
py -m pip --version
-
Unix/MacOS: .. code-block:: shell
python3 -m pip --version
-
-
Update pip Packages if required.
-
Windows: .. code-block:: shell
py -m pip install --upgrade pip
-
Unix/MacOS: .. code-block:: shell
python3 -m pip install --upgrade pip
-
-
Install a list of requirements specified in a Requirements.txt. * Windows: .. code-block:: shell
py -m pip install -r requirements.txt * **Unix/MacOS:** .. code-block:: shell python3 -m pip install -r requirements.txt
| for more details visit
Installing Packages <https://packaging.python.org/en/latest/tutorials/installing-packages/>_ |
| -
-
Running the project
- Open and run the project in an editor (e.g. VS code).
- Run the project.
-
Windows: .. code-block:: shell
py manage.py runserver
-
Unix/MacOS: .. code-block:: shell
python3 manage.py runserver
-
-
To see the output on the browser follow the link shown in the terminal. (e.g. http://127.0.0.1:8000/) | | Install through Docker
To get started with SMECS using Docker, follow the steps below:
-
Prerequisites
- Make sure
Docker <https://www.docker.com/products/docker-desktop/>_ is installed on your local machine.
- Make sure
-
Cloning the Repository .. code-block:: shell
git clone https://github.com/NFDI4Energy/SMECS.git
-
Navigate to the Project Directory .. code-block:: shell
cd smecs
-
Building the Docker Images .. code-block:: shell
docker-compose build
-
Starting the Services .. code-block:: shell
docker-compose up
-
Accessing the Application
- Navigate to
http://localhost:8000in your web browser.
- Navigate to
-
Stopping the Services .. code-block:: shell
docker-compose down | | Setting Up GitLab/GitHub Personal Token | To enhance the functionality of this program and ensure secure interactions with the GitLab/GitHub API, users are required to provide their personal access token. Follow these steps to integrate your token:
-
Generate a GitLab Token:
- Visit
Create a personal access token <https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html#create-a-personal-access-token>_ for more information on how to generate a new token.
- Visit
-
Generate a GitHub Token:
- Visit
Managing your personal access tokens <https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens>_ for more information on how to generate a new token. | | Tip for developers | If the page does not refresh correctly, clear the browser cache. You can force Chrome to pull in new data and ignore the saved ("cached") data by using the keyboard shortcutCmd+Shift+Ron Mac, andCtrl+F5orCtrl+Shift+Ron Windows. | Collaboration
- Visit
| We believe in the power of collaboration and welcome contributions from the community to enhance the SMECS workflow. Whether you have found a bug, have a feature idea, or want to share feedback, your contribution matters. Feel free to submit a pull request, open up an issue, or reach out with any questions or concerns.
|
| To see upcoming features in SMECS, please refer to our open issues <https://github.com/NFDI4Energy/SMECS/issues?q=is%3Aopen+is%3Aissue>.
| To stay updated on upcoming changes to the HERMES GitHub and GitLab Plugin <https://github.com/softwarepub/hermes-plugin-github-gitlab>, visit the project’s issues page <https://github.com/softwarepub/hermes-plugin-github-gitlab/issues>_. And if you have questions, suggestions, feedback, or need to report a bug, please open a new issue there <https://github.com/softwarepub/hermes-plugin-github-gitlab/issues>
Related Skills
proje
Interactive vocabulary learning platform with smart flashcards and spaced repetition for effective language acquisition.
YC-Killer
2.7kA library of enterprise-grade AI agents designed to democratize artificial intelligence and provide free, open-source alternatives to overvalued Y Combinator startups. If you are excited about democratizing AI access & AI agents, please star ⭐️ this repository and use the link in the readme to join our open source AI research team.
best-practices-researcher
The most comprehensive Claude Code skills registry | Web Search: https://skills-registry-web.vercel.app
groundhog
400Groundhog's primary purpose is to teach people how Cursor and all these other coding agents work under the hood. If you understand how these coding assistants work from first principles, then you can drive these tools harder (or perhaps make your own!).
