Doris
Git repository mining tool written in Java. Published under GPL 3.
License
Doris is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. Doris is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with Doris. If not, see <http://www.gnu.org/licenses/>.
About
Doris was created by Emil Carlsson as part of a bachelor thesis about problems encountered when mining software repositories. The main goal of the thesis was to find a mining tool that could handle Git, work with as few dependencies as possible, and provide an automated, reproducible extraction and measurement pipeline.
Recent changes
- Update to JGit 3.0 (by https://github.com/robinst)
- Optimization to mining process (by https://github.com/robinst)
- Fixed a bug that prevented tags from working.
- Added parameter object.
Coming updates
- A working -b flag to specify the branch. At the moment only the master branch can be mined.
Working JARs
The most common JRE on macOS is 1.6. To find out which JRE version you have, run java -version.
JRE 1.7
- Version 1.1.0 (Current version)
JRE 1.6
- Version 1.1.0 (Current version)
Dependencies
Doris is written in Java and requires Java (JRE 1.7 or newer) to be installed on the computer running it.
Usage guide
When parameters are used and no target directory is specified, Doris automatically creates a directory with the same name as the .git file used for mining. If no parameters are passed, Doris prompts for the URI to the .git file and for the target directory in which to store the mining results. All flags are appended after the command that starts Doris. When flags are used, the URI flag must be included as a minimum.
Notice: If you have downloaded a working JAR, its file name has the version number appended. Either replace doris.jar with doris-vX.Y.Z.jar in the commands below, or rename the JAR to remove the version number.
Run Doris on Windows:
C:\> java -jar c:\path\to\doris.jar
Run Doris on Unix-like OS:
$ java -jar doris.jar
Help
-h, --help [flag]
Shows help information. If a flag is appended, only the help information for that flag is shown.
URI
-u, --uri <link to .git-file>
Specifies the URI where the .git file can be found. The protocols that Doris can handle are http(s)://, git:// and file://. Example of formatting: git://github.com/GingerSwede/Doris.git.
Target
-t, --target <path to target directory>
Specifies the target directory where the mined commits should be stored. When omitted, Doris uses the current working directory and sets up a folder named after the .git file used in the URI.
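To illustrate the default described above, here is a minimal sketch of how a folder name could be derived from the URI. It is not taken from the Doris source: the class and method names are hypothetical, and it assumes that the folder name is the last path segment of the URI with a trailing .git removed. Doris itself may name the folder differently (for example by keeping the .git suffix).

```java
// Hypothetical helper, not part of Doris.
final class TargetNameSketch {

    /** Returns the last path segment of the URI, without a trailing ".git". */
    static String defaultFolderName(String uri) {
        String trimmed = uri;
        // Drop trailing slashes so "Doris.git/" behaves like "Doris.git".
        while (trimmed.endsWith("/")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        // lastIndexOf returns -1 when there is no slash, so the whole string is kept.
        String last = trimmed.substring(trimmed.lastIndexOf('/') + 1);
        // Assumption: the ".git" suffix is not part of the folder name.
        if (last.endsWith(".git")) {
            return last.substring(0, last.length() - ".git".length());
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println(defaultFolderName("git://github.com/GingerSwede/Doris.git"));
    }
}
```

Under this assumption, mining git://github.com/GingerSwede/Doris.git without -t would store results in a folder called Doris inside the current working directory.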
Start point
-s, --startpoint <commit sha-1>
Sets the commit from which Doris starts mining the repository. The full SHA-1 is required. If the SHA-1 value is incorrect, mining will never start.
End point
-e, --endpoint <commit sha-1>
Sets the commit at which Doris should stop mining. The full SHA-1 is required. If the SHA-1 value is incorrect, mining will not stop at that point. The endpoint commit itself is not included in the mining results.
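Because both -s and -e need the full SHA-1, and a wrong value either prevents mining from starting or prevents it from stopping, it can help to check the format of a value before passing it to Doris. The following is a small sketch with a hypothetical class name. It only checks that the value is 40 hexadecimal characters (the length of a full SHA-1); it does not check that the commit exists in the repository.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Doris.
final class Sha1Check {

    // A full SHA-1 written in hexadecimal is exactly 40 characters long.
    private static final Pattern FULL_SHA1 = Pattern.compile("[0-9a-fA-F]{40}");

    /** Returns true if the value looks like a full (unabbreviated) SHA-1. */
    static boolean isFullSha1(String value) {
        return value != null && FULL_SHA1.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Full commit name taken from the log example in this README.
        System.out.println(isFullSha1("08046e7b57f772f270619601d1a9420f76320066"));
        // Abbreviated values are rejected.
        System.out.println(isFullSha1("08046e7"));
    }
}
```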
Limit
-l, --limit <max number of commits>
Sets the maximum number of commits Doris should mine. The value must be an integer (e.g., 6, 10, 600).
No log
-n, --nolog
When this flag is passed, logging in Doris is turned off. This is recommended when mining larger repositories that will generate many commits. All information that Doris logs can also be obtained manually from the local copy of the .git file, which is stored in the same directory as the mining results.
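For an automated pipeline, Doris can also be started from another Java program. The sketch below assembles a command line from the flags documented above (-u, -t, -l, -n) and runs it with ProcessBuilder. The class name, the method name, the "results" target and the limit of 10 are illustrative assumptions; only the flags and the java -jar doris.jar invocation come from this README.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher, not part of Doris.
final class DorisLauncher {

    /** Builds the argument list; target may be null and limit may be 0 to omit those flags. */
    static List<String> command(String jar, String uri, String target, int limit, boolean noLog) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jar);
        // The URI flag is the minimum required when flags are used.
        cmd.add("-u");
        cmd.add(uri);
        if (target != null) {
            cmd.add("-t");
            cmd.add(target);
        }
        if (limit > 0) {
            cmd.add("-l");
            cmd.add(Integer.toString(limit));
        }
        if (noLog) {
            cmd.add("-n");
        }
        return cmd;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        List<String> cmd = command("doris.jar", "git://github.com/GingerSwede/Doris.git", "results", 10, false);
        // inheritIO forwards the output of Doris to this process's console.
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        System.exit(process.waitFor());
    }
}
```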
Important
If the -e and -l flags are used in combination, Doris stops at whichever criterion is reached first.
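The interaction between the two flags can be sketched as a simple loop. This is an illustration of the documented behaviour, not the actual Doris implementation: the class and method names are hypothetical, commits are represented as plain SHA-1 strings in mining order, a null endpoint means that -e was not given, and a negative limit means that -l was not given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of Doris.
final class StopCriteriaSketch {

    /** Returns the commits that would be mined, stopping at the first criterion reached. */
    static List<String> select(List<String> commits, String endpoint, int limit) {
        List<String> mined = new ArrayList<String>();
        for (String sha : commits) {
            // -e: stop at the endpoint; the endpoint commit itself is not included.
            if (sha.equals(endpoint)) {
                break;
            }
            // -l: stop once the maximum number of commits has been mined.
            if (limit >= 0 && mined.size() >= limit) {
                break;
            }
            mined.add(sha);
        }
        return mined;
    }

    public static void main(String[] args) {
        List<String> commits = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(select(commits, "d", 2));  // the limit is reached first
        System.out.println(select(commits, "b", 10)); // the endpoint is reached first
    }
}
```

In this sketch an endpoint that never matches simply lets the loop run to the limit or to the end of the history, which mirrors the note that an incorrect endpoint SHA-1 does not stop the mining.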
Log file
Unless the -n flag is used, Doris automatically logs basic information about each commit in an XML file. The log contains information about the parent commit, author, committer, commit message and commit time (given in UNIX time). Example:
<project project_name="ExampleRepository">
    <commit commit_name="08046e7b57f772f270619601d1a9420f76320066" commit_number="0" commit_time="1358168496">
        <author e_mail="john.doe@example.com" name="John Doe"/>
        <committer e_mail="john.doe@example.com" name="John Doe"/>
        <commit_message>
            Initial commit
        </commit_message>
    </commit>
</project>
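The log can be read with the XML parser that ships with Java. The sketch below is one possible way to do it and relies only on the element and attribute names visible in the example above (commit, author, commit_message, commit_name, commit_number, commit_time, name). The class name is hypothetical, the path of the log file is passed as an argument because its name is not stated here, and elements that are not shown in the example (such as the parent commit) are ignored.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical reader, not part of Doris.
final class DorisLogReader {

    /** Returns one summary line per commit element found in the log. */
    static List<String> summarize(InputStream xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse the log", e);
        }
        List<String> lines = new ArrayList<String>();
        NodeList commits = doc.getElementsByTagName("commit");
        for (int i = 0; i < commits.getLength(); i++) {
            Element commit = (Element) commits.item(i);
            Element author = (Element) commit.getElementsByTagName("author").item(0);
            String message = commit.getElementsByTagName("commit_message")
                    .item(0).getTextContent().trim();
            // commit_time is given in UNIX time (seconds since 1970-01-01 UTC).
            Instant time = Instant.ofEpochSecond(Long.parseLong(commit.getAttribute("commit_time")));
            lines.add(commit.getAttribute("commit_number") + " "
                    + commit.getAttribute("commit_name") + " "
                    + author.getAttribute("name") + " "
                    + time + " "
                    + message);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the XML log written by Doris.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String line : summarize(in)) {
                System.out.println(line);
            }
        }
    }
}
```

Each output line contains the commit number, the commit name, the author name, the commit time as a UTC timestamp and the trimmed commit message.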
