SameCodeFinder
A text scanner that finds the same or similar source code
SameCodeFinder is a static code text scanner that finds identical or similar code files in a large directory.
Feature
SameCodeFinder can detect identical functions in source code files and show the Hamming distance between two functions.
- Find the same code that needs to be extracted for reuse
- Show the Hamming distance between source code files (supports all kinds of source code)
- Show the Hamming distance between source code functions (currently supports Java and Objective-C)
The photo below shows the calculated result for MWPhotoBrowser.
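To make the Hamming distance figure more concrete, here is a minimal pure-Python sketch of a SimHash-style fingerprint and the distance between two fingerprints. It is an illustration only, not the code used by SameCodeFinder or by the simhash module: the names `simhash_sketch` and `hamming_distance`, the word-level tokenizer, and the use of MD5 are all assumptions made for this example.

```python
import hashlib
import re


def simhash_sketch(text, bits=64):
    """Return a SimHash-style integer fingerprint of `text` (illustrative only)."""
    weights = [0] * bits
    # Assumption: split the text into lowercase word tokens.
    for token in re.findall(r"\w+", text.lower()):
        # Hash each token to a stable integer; only the low `bits` bits are used.
        digest = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16)
        for i in range(bits):
            weights[i] += 1 if (digest >> i) & 1 else -1
    # A fingerprint bit is set when its accumulated weight is positive.
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two integer fingerprints differ."""
    return bin(a ^ b).count("1")
```

Identical text always yields a distance of 0, and similar text tends to yield a small distance, which is why a threshold on the distance can be used to flag likely duplicates.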

The result comes from the command
python SameCodeFinder.py ~/Projects/opensource/MWPhotoBrowser/ .m --max-distance=10 --min-linecount=3 --functions --detail
Usage
Install the Python implementation of SimHash
pip install simhash
Visit A Python Implementation of Simhash Algorithm if you want to know more about the module.
python SameCodeFinder.py [arg0] [arg1]
Required
[arg0] - Target directory of the files to scan
[arg1] - Suffix of the files to scan, e.g.
- .m - Objective-C file
- .swift - Swift file
- .java - Java file
Optional
--detail - Show the details of the scan process
--functions - Use functions as the unit of comparison
--max-distance=[input] - Maximum Hamming distance to keep; the default is 20
--min-linecount=[input] - For function scans, ignore any function whose total line count is less than min-linecount
--output=[input] - Customize the output file; the default is "out.txt"
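The sketch below shows roughly how the arguments above could fit together. It is not taken from SameCodeFinder.py: the helper names `find_files`, `long_enough`, and `close_pairs` are hypothetical, and it assumes that "keep" for --max-distance means a distance less than or equal to the limit.

```python
import itertools
import os


def find_files(target_dir, suffix):
    """Yield paths under `target_dir` whose names end with `suffix` (e.g. ".m")."""
    for root, _dirs, names in os.walk(target_dir):
        for name in sorted(names):
            if name.endswith(suffix):
                yield os.path.join(root, name)


def long_enough(function_source, min_linecount):
    """Return False for a function with fewer lines than `min_linecount`."""
    return len(function_source.splitlines()) >= min_linecount


def close_pairs(fingerprints, max_distance=20):
    """Return sorted (distance, name_a, name_b) tuples within `max_distance`.

    `fingerprints` maps a file or function name to an integer fingerprint.
    Assumption: a pair is kept when its distance is <= max_distance.
    """
    pairs = []
    items = sorted(fingerprints.items())
    for (name_a, fp_a), (name_b, fp_b) in itertools.combinations(items, 2):
        distance = bin(fp_a ^ fp_b).count("1")  # Hamming distance
        if distance <= max_distance:
            pairs.append((distance, name_a, name_b))
    return sorted(pairs)
```

Comparing every pair grows quadratically with the number of files or functions, so a lower --max-distance or a higher --min-linecount mainly trims the output rather than the amount of work in a sketch like this.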
Requirement
Python 2.6+, Pip 9.0+, simhash
License
SameCodeFinder is available under the MIT license. See the LICENSE file for more info.
