Octomender

Get repo recommendation based on your GitHub star history. (EoS)


GitHub repo recommender system.

Octomender = Octocat + Recommender

<a href="https://octomend.com">~~[HELP] Algorithm Testing~~</a> End of Service

~~The recommendation algorithm is deployed and being tested on octomend.com.~~

~~Visit octomend.com to help improve the recommendation.~~

The service has ended because GitHub published its own "Discover Repositories" service.

Dependencies

  • redis: An in-memory database that persists on disk

Core

  • hiredis: Minimalistic C client for Redis >= 1.2
  • OpenMP>=4.0: C/C++ API that supports multi-platform shared memory multiprocessing programming

Preprocessing

Website

  • Flask: A microframework for Python based on Werkzeug, Jinja 2 and good intentions
  • GitHub-Flask: Flask extension for authenticating users with GitHub and making requests to the API
  • gunicorn: A Python WSGI HTTP Server for UNIX
  • google-cloud-datastore: Low-level Java and Python client libraries for Google Cloud Datastore

Dataset

GitHub Archive

Build Core

cd core; make

Preprocessing

parse.py

Parse raw json data files into three pickle data files.

  • output-data-basename.user: map of user id (str) to user name (str)
  • output-data-basename.repo: map of repo id (int) to repo name (str)
  • output-data-basename.edge: list of tuples of user-repo edge (str, int)
Usage: parse.py {-m|--member|-w|--watch} {<input-json-directory>|<input-json-file>} <output-data-basename>
  -m, --member      parse MemberEvent.
  -w, --watch       parse WatchEvent.
Ex:    parse.py -m 2017-06-01-0.json data
Ex:    parse.py --watch json/2017-05/ data/2017-05

For the raw JSON data format, refer to the GitHub API v3 documentation.
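The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the code in parse.py: it handles only WatchEvent, assumes newline-delimited JSON input, and assumes the event fields `type`, `actor.id`, `actor.login`, `repo.id`, and `repo.name`. The function names `parse_events` and `dump_outputs` are hypothetical.

```python
import json
import pickle


def parse_events(lines, event_type="WatchEvent"):
    """Collect users, repos, and user-repo edges from JSON event lines."""
    users, repos, edges = {}, {}, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") != event_type:
            continue
        # Assumed field names; older archive files may use another layout.
        user_id = str(event["actor"]["id"])
        repo_id = int(event["repo"]["id"])
        users[user_id] = event["actor"]["login"]
        repos[repo_id] = event["repo"]["name"]
        edges.append((user_id, repo_id))
    return users, repos, edges


def dump_outputs(basename, users, repos, edges):
    """Write <basename>.user, <basename>.repo, and <basename>.edge."""
    for suffix, obj in ((".user", users), (".repo", repos), (".edge", edges)):
        with open(basename + suffix, "wb") as f:
            pickle.dump(obj, f)
```

The key types follow the output description above: user ids are stored as `str` and repo ids as `int`.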

parse_mp.py

Same as parse.py, but runs with multiprocessing. The default number of processes is 16.

Usage: parse_mp.py {-m|--member|-w|--watch} {<input-json-directory>|<input-json-file>} <output-data-basename> [n-process]
  -m, --member      parse MemberEvent.
  -w, --watch       parse WatchEvent.
  n-process         number of processes when multiprocessing.
Ex:    parse_mp.py -m 2017-06-01-0.json data
Ex:    parse_mp.py --watch json/2017-05/ data/2017-05 32
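One plausible way to parallelize the parsing is to hand one input file to each worker and concatenate the results. The sketch below uses `multiprocessing.Pool` under that assumption. It is not the code in parse_mp.py, and `parse_file` and `parse_files` are hypothetical names. It collects WatchEvent edges only and assumes the `actor.id` and `repo.id` fields.

```python
import json
from multiprocessing import Pool


def parse_file(path):
    """Parse one newline-delimited JSON file and return its WatchEvent edges."""
    edges = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            event = json.loads(line)
            if event.get("type") == "WatchEvent":
                # Assumed field names for user and repo ids.
                edges.append((str(event["actor"]["id"]), int(event["repo"]["id"])))
    return edges


def parse_files(paths, n_process=16):
    """Parse files in parallel and concatenate the per-file edge lists."""
    with Pool(processes=n_process) as pool:
        per_file = pool.map(parse_file, paths)
    return [edge for edges in per_file for edge in edges]
```

On platforms that spawn worker processes rather than fork them, call `parse_files` from inside an `if __name__ == "__main__":` block.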

mergedata.py

Merge multiple pickle data files into one.

Usage: mergedata.py <input-data-dir> <output-data-basename>
Ex:    mergedata.py data/2016-010203/ data/2016-Q1
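A merge of this kind might look like the sketch below. It assumes the input directory holds `.user`, `.repo`, and `.edge` pickle files in the formats listed under parse.py. Deduplicating and sorting the edges is an assumption made for illustration, not confirmed behaviour of mergedata.py, and `merge_data` is a hypothetical name.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Load a single pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)


def merge_data(input_dir):
    """Merge every .user, .repo, and .edge pickle found in input_dir."""
    users, repos, edges = {}, {}, set()
    for path in sorted(Path(input_dir).iterdir()):
        if path.suffix == ".user":
            users.update(load_pickle(path))
        elif path.suffix == ".repo":
            repos.update(load_pickle(path))
        elif path.suffix == ".edge":
            # Assumption: the same edge seen in two files is kept once.
            edges.update(load_pickle(path))
    return users, repos, sorted(edges)
```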

graph2redis.py

Insert graph data into a Redis database.

Usage: graph2redis.py <input-edgelist> <redis-port>
Ex:    graph2redis.py data/2016-Q1.edge 6379
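The key layout that graph2redis.py uses is not documented here. As an illustration only, the sketch below stores each edge in two Redis sets, one per direction. The key names `user:<id>` and `repo:<id>` and the function names are hypothetical. The `client` argument is assumed to be a redis-py client, or any object with a compatible `sadd` method.

```python
import pickle


def load_edges(path):
    """Load the pickled list of (user_id, repo_id) tuples."""
    with open(path, "rb") as f:
        return pickle.load(f)


def insert_edges(client, edges):
    """Store each user-repo edge in two sets, one per direction."""
    for user_id, repo_id in edges:
        # Hypothetical key schema, chosen for this example only.
        client.sadd(f"user:{user_id}", repo_id)
        client.sadd(f"repo:{repo_id}", user_id)
```

With redis-py installed and a server listening on port 6379, a call could look like `insert_edges(redis.Redis(host="localhost", port=6379), load_edges("data/2016-Q1.edge"))`.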

Thanks

importpython and reddit.

License

MIT
