XGitGuard

AI based Secrets Detection Python Framework


<h1 align="center"> xGitGuard </h1> <p align="center">AI-Based Secrets Detection<br> <i><b>Detect Secrets (API Tokens, Usernames, Passwords, etc.) Exposed on GitHub Repositories</b></i><br> Designed and Developed by Comcast Cybersecurity Research and Development Team</p>

Overview
xGitGuard Workflow
Features
- Credential-Detection-Workflow
- Keys/Token-Detection-Workflow
Install
Search Patterns
Usage
License

Overview

Detecting Publicly Exposed Secrets on GitHub at Scale
- xGitGuard is an AI-based system designed and developed by the Comcast Cybersecurity Research and Development team that detects secrets (e.g., API tokens, usernames, passwords, etc.) exposed on GitHub. xGitGuard uses advanced Natural Language Processing to detect secrets at scale and with appropriate velocity in GitHub repositories.
What are Secrets?
- Credentials
  - Usernames & passwords, server credentials, account credentials, etc.
- Keys/Tokens
  - Service API tokens (AWS, Azure, etc), encryption keys, etc.

xGitGuard Workflow

Features

Credential Detection Workflow

- Enterprise Credential Secrets Detection - Run secret detection on the given GitHub Enterprise account
- Public Credential Secrets Detection - Run secret detection on public GitHub

Keys&Token Detection Workflow

- Enterprise Keys and Tokens Secrets Detection - Run secret detection on the given GitHub Enterprise account
- Public Keys and Tokens Secrets Detection - Run secret detection on public GitHub

Install

Environment Setup

Install Python >= v3.6
Clone/Download the repository from GitHub
Traverse into the cloned xGitGuard folder
```
cd xGitGuard
```
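As a quick sanity check before installing the dependencies, the interpreter version can be compared against the minimum stated above. This is a minimal sketch and not part of xGitGuard; the helper name `python_is_supported` is hypothetical.

```python
import sys

# Minimum interpreter version stated in the install steps above.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("xGitGuard needs Python %d.%d or newer" % MIN_PYTHON)
    print("Python version OK:", sys.version.split()[0])
```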

Install Python Dependency Packages

python -m pip install -r requirements.txt

Check for Outdated Packages
```
pip list --outdated
```

Search Patterns

There are two ways to define configurations in xGitGuard:
- Config Files
- Command Line Inputs
For Enterprise GitHub Detection (Secondary Keyword + Extension), under the config directory:
- Secondary Keyword: secondary_keys.csv file or User Feed - list of Keys & Tokens
- Secondary Keyword: secondary_creds.csv file or User Feed - list of Credentials
- Extension: extensions.csv file or User Feed - list of file Extensions
For Public GitHub Detection (Primary Keyword + Secondary Keyword + Extension), under the config directory:
- Primary Keyword: primary_keywords.csv file or User Feed - list of primary Keys
- Secondary Keyword: secondary_keys.csv file or User Feed - list of Keys & Tokens
- Secondary Keyword: secondary_creds.csv file or User Feed - list of Credentials
- Extension: extensions.csv file or User Feed - list of file Extensions
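To get a feel for how the pattern lists above interact, the sketch below enumerates the combinations for the enterprise case (Secondary Keyword + Extension) and the public case (Primary Keyword + Secondary Keyword + Extension). The keyword values and function names are hypothetical, and the assumption that one search is issued per combination is ours, not something the README states.

```python
from itertools import product

# Illustrative values only; the real lists live in the CSV files under config.
primary_keywords = ["example.com"]          # hypothetical primary keyword
secondary_keywords = ["password", "token"]  # hypothetical secondary keywords
extensions = ["py", "yaml"]                 # hypothetical extensions


def enterprise_combinations(secondary, exts):
    """Pair every secondary keyword with every extension."""
    return list(product(secondary, exts))


def public_combinations(primary, secondary, exts):
    """Combine every primary keyword, secondary keyword, and extension."""
    return list(product(primary, secondary, exts))


enterprise = enterprise_combinations(secondary_keywords, extensions)
public = public_combinations(primary_keywords, secondary_keywords, extensions)
print(len(enterprise), "enterprise combinations;", len(public), "public combinations")
```

Because the count is a product of the list lengths, trimming any one list reduces the total number of combinations proportionally.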

Usage

Enterprise GitHub Secrets Detection
- Enterprise Credential Secrets Detection
- Enterprise Keys and Tokens Secrets Detection
Public GitHub Secrets Detection
- Public Credential Secrets Detection
- Public Keys and Tokens Secrets Detection

Enterprise GitHub Secrets Detection

API Configuration Setup

Set up the following system environment variable for accessing GitHub:
- GITHUB_ENTERPRISE_TOKEN - Enterprise GitHub API Token with full scopes of repository and user.
  - Refer to the GitHub documentation "How To Get GitHub API Token" for help
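Before starting a scan, it can help to confirm that the token variable is actually visible to Python. The following is a minimal sketch, not xGitGuard's own code; the helper name `get_enterprise_token` is hypothetical, and only the variable name GITHUB_ENTERPRISE_TOKEN comes from the text above.

```python
import os


def get_enterprise_token(environ=os.environ):
    """Read GITHUB_ENTERPRISE_TOKEN from the mapping or raise a clear error."""
    token = environ.get("GITHUB_ENTERPRISE_TOKEN", "").strip()
    if not token:
        raise RuntimeError("GITHUB_ENTERPRISE_TOKEN is not set")
    return token


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"GITHUB_ENTERPRISE_TOKEN": "dummy-token-for-illustration"}
print("token length:", len(get_enterprise_token(demo_env)))
```

Calling `get_enterprise_token()` with no argument reads the real process environment.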
Update the following configs with your Enterprise Name in the config file xgg_configs.yaml, located in the config data folder xgitguard\config\*
- enterprise_api_url: https://github.<<Enterprise_Name>>.com/api/v3/search/code
- enterprise_pre_url: https://github.<<Enterprise_Name>>.com/api/v3/repos/
- url_validator: https://github.<<Enterprise_Name>>.com/api/v3/search/code
- enterprise_commits_url: https://github.<<Enterprise_Name>>.com/api/v3/repos/{user_name}/{repo_name}/commits?path={file_path}
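The four URLs above differ only in the enterprise name, and the commits URL additionally carries placeholders in braces. The sketch below shows one way the values could be derived and the placeholders filled. The enterprise name `example` and the file path are hypothetical, and filling the template with `str.format` is an assumption about how such a template might be used, not a description of xGitGuard's internals.

```python
ENTERPRISE_NAME = "example"  # hypothetical; substitute your own enterprise name

BASE = f"https://github.{ENTERPRISE_NAME}.com/api/v3"

config = {
    "enterprise_api_url": f"{BASE}/search/code",
    "enterprise_pre_url": f"{BASE}/repos/",
    "url_validator": f"{BASE}/search/code",
    # The braces are kept literally here so they can be filled in later.
    "enterprise_commits_url": BASE
    + "/repos/{user_name}/{repo_name}/commits?path={file_path}",
}

# Fill the placeholders for one (hypothetical) file in a repository.
commits_url = config["enterprise_commits_url"].format(
    user_name="test_org",
    repo_name="public_docker",
    file_path="src/app.py",
)
print(commits_url)
```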

Running Enterprise Secret Detection

Traverse into the github-enterprise script folder
```
cd github-enterprise
```

Enterprise Credential Secrets Detection

Detections Without Additional ML Filter

By default, the Credential Secrets Detection script runs for the given Secondary Keywords and extensions without the ML filter.

# Run with Default configs
python enterprise_cred_detections.py

Detection With ML Filter

xGitGuard also has an additional ML filter where users can collect their organization/targeted data and train their model. Having this ML filter helps to reduce the false positives from the detection.

Pre-Requisite To Use the ML Filter

Users need to follow the process below to collect data and train the model before using the ML filter.

Follow ML Model Training

NOTE:

To use the ML filter, ML training is mandatory. This includes data collection, feature engineering, and model persisting.

How often to train depends on user requirements: training can be done once, or repeated periodically if the user needs to improve the data.

Command to Run Enterprise Credential Scanner with ML

# Run for the given Secondary Keywords and extensions with the ML model
python enterprise_cred_detections.py -m Yes

Command to Run Enterprise Credentials Scanner for targeted organization

# Run for a targeted org
python enterprise_cred_detections.py -o org_name        #Ex: python enterprise_cred_detections.py -o test_org

Command to Run Enterprise Credentials Scanner for targeted repo

# Run for a targeted repo
python enterprise_cred_detections.py -r org_name/repo_name     #Ex: python enterprise_cred_detections.py -r test_org/public_docker
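The `-r` value uses the `org_name/repo_name` form, and the argument list further down describes it as a comma-separated string. As a rough sketch of how such a value could be split and validated before a scan, consider the helper below. The function `parse_repo_targets` is hypothetical and is not part of xGitGuard.

```python
def parse_repo_targets(value):
    """Split a comma-separated 'org/repo' string into (org, repo) tuples."""
    targets = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        org, sep, repo = item.partition("/")
        if not sep or not org or not repo or "/" in repo:
            raise ValueError(f"expected org_name/repo_name, got {item!r}")
        targets.append((org, repo))
    return targets


print(parse_repo_targets("test_org/public_docker"))
```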

Command-Line Arguments for Credential Scanner

Run usage:
enterprise_cred_detections.py [-h] [-s Secondary Keywords] [-e Extensions] [-m Ml prediction] [-u Unmask Secret] [-o org_name] [-r repo_name] [-l Logger Level] [-c Console Logging]

optional arguments:
  -h, --help            show this help message and exit
  -s Secondary Keywords, --secondary_keywords Secondary Keywords
                          Pass the Secondary Keywords list as a comma-separated string
  -e Extensions, --extensions Extensions
                          Pass the Extensions list as a comma-separated string
  -m ML Prediction, --ml_prediction ML Prediction
                          Pass the ML Filter as Yes or No. Default is No
  -u Set Unmask, --unmask_secret To write secret unmasked, then set Yes
                          Pass the flag as Yes or No. Default is No
  -o pass org name, --org Pass the targeted org list as a comma-separated string
  -r pass repo name, --repo Pass the targeted repo list as a comma-separated string
  -l Logger Level, --log_level Logger Level
                          Pass the Logging level as for CRITICAL - 50, ERROR - 40 WARNING - 30 INFO - 20 DEBUG - 10. Default is 20
  -c Console Logging, --console_logging Console Logging
                          Pass the Console Logging as Yes or No. Default is Yes
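To make the options above easier to experiment with, here is a minimal `argparse` sketch that mirrors the documented flags. It is reconstructed from the help text only, so the real script may differ. The empty-string defaults for `-s`, `-e`, `-o`, and `-r` are assumptions, while the defaults for `-m`, `-u`, `-l`, and `-c` follow the text above.

```python
import argparse


def build_parser():
    """Rebuild the documented options; the real script may differ in detail."""
    parser = argparse.ArgumentParser(prog="enterprise_cred_detections.py")
    parser.add_argument("-s", "--secondary_keywords", default="",
                        help="Secondary Keywords as a comma-separated string")
    parser.add_argument("-e", "--extensions", default="",
                        help="Extensions as a comma-separated string")
    parser.add_argument("-m", "--ml_prediction", default="No",
                        help="ML filter: Yes or No")
    parser.add_argument("-u", "--unmask_secret", default="No",
                        help="Write secrets unmasked: Yes or No")
    parser.add_argument("-o", "--org", default="",
                        help="Targeted orgs as a comma-separated string")
    parser.add_argument("-r", "--repo", default="",
                        help="Targeted repos as a comma-separated string")
    parser.add_argument("-l", "--log_level", type=int, default=20,
                        help="50 CRITICAL, 40 ERROR, 30 WARNING, 20 INFO, 10 DEBUG")
    parser.add_argument("-c", "--console_logging", default="Yes",
                        help="Console logging: Yes or No")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["-m", "Yes", "-o", "test_org", "-l", "10"])
print(args.ml_prediction, args.org, args.log_level)
```

The numeric log levels match the constants in Python's standard `logging` module, so a value such as 10 corresponds to `logging.DEBUG`.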

Inputs used for search and scan

Note: Command-line argument keywords take precedence over the config files (the default). If no keywords are passed on the command line, data from the config files is used for the search.
- secondary_creds.csv file has a default list of credential-relevant patterns for search, which users can update based on their requirements.
- extensions.csv file has a default list of file extensions to be searched, which users can update based on their requirements.
GitHub search pattern for above examples: password +extension:py
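The precedence rule and the search pattern above can be sketched in a few lines. Both helper names are hypothetical, and the fallback values stand in for whatever the config files contain. Only the `keyword +extension:ext` shape is taken from the example pattern above.

```python
def resolve_list(cli_value, config_values):
    """Prefer the comma-separated CLI string; fall back to the config list."""
    if cli_value:
        return [item.strip() for item in cli_value.split(",") if item.strip()]
    return list(config_values)


def build_query(keyword, extension):
    """Format one search in the style of the documented example pattern."""
    return f"{keyword} +extension:{extension}"


# Hypothetical config contents, used only when nothing is passed on the CLI.
config_keywords = ["password"]
config_extensions = ["py"]

keywords = resolve_list("", config_keywords)        # nothing passed: config wins
extensions = resolve_list("yaml", config_extensions)  # CLI value wins
print([build_query(k, e) for k in keywords for e in extensions])
```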

Enterprise Keys and Tokens Secrets Detection

Detections Without Additional ML Filter

By default, the Keys and Tokens Secrets Detection script runs for the given Secondary Keywords and extensions without the ML filter.

# Run with Default configs
python enterprise_key_detections.py

Command to Run Enterprise Keys and Tokens Scanner for targeted organization

# Run for a targeted org
python enterprise_key_detections.py -o org_name        #Ex: python enterprise_key_detections.py -o test_org

Command to Run Enterprise Keys and Tokens Scanner for targeted repo

# Run for a targeted repo
python enterprise_key_detections.py -r org_name/repo_name     #Ex: python enterprise_key_detections.py -r test_org/public_docker

Detections With ML Filter

xGitGuard also has an additional ML filter where users can collect their organization/targeted data and train their model. Having this ML filter helps to reduce the false positives from the detection.
