GitMiner
GitMiner is a Pharo Smalltalk library that helps developers analyze git repositories. With GitMiner, retrieving data about source code, diffs, developer identities, changed files, and commits in a Smalltalk environment will be simpler than ever :)
Originally developed by Stefano Campanella and Carmen Armenti in the REVEAL group.
Installation
[!NOTE]
GitMiner depends on the cloc service, so make sure it is also running on your machine. To install it, follow the instructions in the gitminer-cloc repository.
If you only want to mine git repositories, load the most recent stable version, v1.0.0:
Metacello new
    baseline: 'GitMiner';
    repository: 'github://USIREVEAL/gitminer:v1.0.0';
    load.
If you want to use the APIs to mine GitHub repositories, load the github branch instead:
Metacello new
    baseline: 'GitMiner';
    repository: 'github://USIREVEAL/gitminer:github';
    load.
Once the cloc service is running, make sure the miner points to its URL. To do so, execute the following code before any mining operation:
GMFlagStore uniqueInstance CLOCEndpoint: 'http://localhost:8080/'
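Because mining depends on the cloc service, it can help to confirm that something answers at the endpoint before configuring it. The following is a minimal sketch, not part of GitMiner: it assumes the service speaks plain HTTP at the URL shown above and uses ZnClient from the Zinc HTTP Components that ship with Pharo. Any HTTP response counts as "reachable"; the sketch does not verify that the responder is actually the cloc service, and the 5-second timeout is an arbitrary choice.

```smalltalk
| endpoint reachable |
endpoint := 'http://localhost:8080/'.

"Try a plain GET; a refused or timed-out connection signals a NetworkError."
reachable := [ ZnClient new url: endpoint; timeout: 5; get. true ]
    on: NetworkError
    do: [ :error | false ].

"Only point the miner at the endpoint if something answered."
reachable
    ifTrue: [ GMFlagStore uniqueInstance CLOCEndpoint: endpoint ]
    ifFalse: [ Error signal: 'No service answered at ' , endpoint ].
```

Run it in a Playground after starting the service; if it signals an error, check the port and host configured for gitminer-cloc.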
Usage
Mining Repositories
To start using GitMiner, create a new GMRepository instance from the path to your local git repository:
repo := GMRepository from: aGitRepoPath.
This creates a new repository instance and starts mining. Once the mining is done, you can access the mined data through the repository instance. You can optionally save the repository to a file for later use:
repo serializeTo: aPath.
If you want to update a repository that was already mined and serialized:
repo := GMRepository from: aGitRepoPath basedOn: aPathWithSerializedData.
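The mining, saving, and updating calls above can be combined into one incremental workflow: mine from scratch the first time, and build on the saved data afterwards. This is a sketch under stated assumptions. Both paths are hypothetical, and it assumes GMRepository accepts plain path strings (the README does not say whether strings or file references are expected). Only from:, from:basedOn:, and serializeTo: come from GitMiner; asFileReference and exists are standard Pharo file API.

```smalltalk
| repoPath dataPath repo |
repoPath := '/path/to/local/clone'.       "hypothetical path to a local git repository"
dataPath := '/path/to/mined-data.json'.   "hypothetical path for the serialized data"

"Reuse earlier results when a serialized file exists, otherwise mine from scratch."
repo := dataPath asFileReference exists
    ifTrue: [ GMRepository from: repoPath basedOn: dataPath ]
    ifFalse: [ GMRepository from: repoPath ].

"Save the (possibly updated) repository for the next run."
repo serializeTo: dataPath.
```

Running the same snippet again later takes the basedOn: branch, since the serialized file then exists.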
Loading Repositories
To load a previously serialized repository, you can use the following code:
repo := GMRepository deserializeFrom: aPathWithSerializedData.
CLI
You can also mine repositories using the command-line interface. For more information, see the GitMiner-CLI package.
[!NOTE]
The serialization format used by GitMiner is proprietary and, although it is based on JSON, it is not compatible with other tools.
Publications
GitMiner was used to support the following scientific research papers:
- C. Armenti and M. Lanza (2025), "Telling Software Evolution Stories With Sonification", International Conference on Program Comprehension (ICPC), pp. 398–402, IEEE. doi: 10.1109/ICPC66645.2025.00050
- C. Armenti and M. Lanza (2024), "Using Animations to Understand Commits", International Conference on Software Maintenance and Evolution (ICSME), pp. 660–665, IEEE. doi: 10.1109/ICSME58944.2024.00069
- C. Armenti and M. Lanza (2024), "Using Interactive Animations to Analyze Fine-grained Software Evolution", Working Conference on Software Visualization (VISSOFT), pp. 36–47, IEEE. doi: 10.1109/VISSOFT64034.2024.00014
- S. Campanella and M. Lanza (2024), "Hidden in the Code: Visualizing True Developer Identities", Working Conference on Software Visualization (VISSOFT), pp. 24–35, IEEE. doi: 10.1109/VISSOFT64034.2024.00013
