ToolLibGen
A comprehensive framework for extracting, clustering, and aggregating reusable tools from Chain-of-Thought (CoT) reasoning data using Large Language Models.
Setup
pip install -r requirements.txt
touch .env
echo "OPENAI_API_KEY=<YOUR_OPENAI_KEY>" >> .env
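Before running the pipeline, it can help to confirm that the key is actually in place. The sketch below is a standard-library check, not part of ToolLibGen: `load_env_file` and `has_openai_key` are hypothetical helpers, and how the project itself reads `.env` is not stated in this README.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(path=".env"):
    """Return True if OPENAI_API_KEY is set in the environment or the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY", "")
    # The literal placeholder from the setup command does not count as a key.
    return bool(key) and key != "<YOUR_OPENAI_KEY>"
```

Calling `has_openai_key()` from the repository root returns False if the placeholder was never replaced with a real key.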
Pipeline Overview
The framework consists of three main stages:
- Tool Creation: Extract reusable tools from CoT reasoning data
- Clustering: Group similar tools into hierarchical clusters
- Aggregation: Merge and optimize tools within clusters to create final tool libraries
1. Tool Creation & Clustering
Extract reusable tools from question-answer pairs with CoT reasoning.
Basic Usage
cd src
export PYTHONPATH=$PYTHONPATH:$(pwd)
python create_specific_tool.py --file_path $INPUT_DATA --save_folder $SAVE_FOLDER --generation_model_name $MODEL_NAME --verification_model_name_lst $LLM_SOLVER_FOR_VERIFICATION
Arguments
- --file_path: Path to the input JSON file containing question-answer pairs
- --save_folder: Directory in which to save the extracted tools
- --generation_model_name: Model used for tool creation and clustering
- --verification_model_name_lst: Model used for verification
- --debug: Enable debug mode (process only a small sample)
An example input file is at src/data/example_for_aggregation.json
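When the tool-creation step is launched from a script rather than a shell, the documented flags can be assembled as an argument list. This is a minimal sketch: `build_create_tool_command` is a hypothetical helper, and the model names are placeholders. It passes a single verification model because the README does not say how multiple values for --verification_model_name_lst are written.

```python
import shlex


def build_create_tool_command(file_path, save_folder, generation_model,
                              verification_model, debug=False):
    """Assemble the create_specific_tool.py argument list from the documented flags."""
    cmd = [
        "python", "create_specific_tool.py",
        "--file_path", file_path,
        "--save_folder", save_folder,
        "--generation_model_name", generation_model,
        "--verification_model_name_lst", verification_model,
    ]
    if debug:
        # --debug is a bare switch: it takes no value.
        cmd.append("--debug")
    return cmd


# Placeholder model names; substitute the models you actually use.
cmd = build_create_tool_command(
    "data/example_for_aggregation.json", "outputs", "MODEL_NAME",
    "VERIFICATION_MODEL", debug=True,
)
print(shlex.join(cmd))  # shell-quoted form, for logging or copy-paste
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from inside src, with PYTHONPATH set as in the Basic Usage commands.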
Output Files
- *_extracted_tools.json: Flattened extracted tools
- *clustered_hierarchy*.json: Hierarchical cluster structure
- *clustered_assigned_tools*.json: Tool assignment
- *merged_tools.json*.json: All created tools and their assignments
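To locate these files after a run, the patterns above can be used as globs. The sketch below makes two assumptions: the outputs sit directly in the save folder (the README does not describe the layout), and the merged-tools file is matched with the looser pattern `*merged_tools*.json`. `find_outputs` is a hypothetical helper, not part of the framework.

```python
from pathlib import Path

# Glob patterns taken from the output file list; the last one is deliberately
# looser than the documented pattern (an assumption, see the text above).
OUTPUT_PATTERNS = {
    "extracted_tools": "*_extracted_tools.json",
    "cluster_hierarchy": "*clustered_hierarchy*.json",
    "tool_assignment": "*clustered_assigned_tools*.json",
    "merged_tools": "*merged_tools*.json",
}


def find_outputs(save_folder):
    """Map each output kind to the sorted file names found in save_folder."""
    folder = Path(save_folder)
    return {
        label: sorted(p.name for p in folder.glob(pattern))
        for label, pattern in OUTPUT_PATTERNS.items()
    }
```

An empty list for any key suggests that the corresponding stage did not finish or wrote its files elsewhere.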
2. Tool Aggregation
Merge and optimize tools within clusters to create final consolidated tool libraries.
Basic Usage
python aggregate_tools.py --file $merged_tools_json --model_name $MODEL_NAME --verification_model_name $VERIFICATION_MODEL
Arguments
- --file: Path to the clustered tools JSON file (the merged tools file produced in stage 1)
- --model_name: LLM used for aggregation
- --verification_model_name: LLM solver used for verification
Aggregation Process
- Blueprint Design: Create high-level design for each cluster
- SIB Processing: Process blocks in parallel
- Implementation: Generate optimized code
- Validation: Test tool functionality
- Optimization: Iterative refinement
- Library Generation: Create final tool libraries with OpenAI schemas
Output Structure
The final output includes:
- Tool Libraries: Optimized Python functions with OpenAI schemas
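For readers unfamiliar with the pairing of a Python function and an OpenAI schema, the sketch below shows one such pair in the OpenAI function-calling format. The tool (`kinetic_energy`), its schema, and the `call_tool` dispatcher are invented for illustration; the exact structure that ToolLibGen emits is not shown in this README.

```python
import json


def kinetic_energy(mass_kg: float, speed_m_s: float) -> float:
    """Return the kinetic energy in joules of a body with the given mass and speed."""
    return 0.5 * mass_kg * speed_m_s ** 2


# Schema describing the function above in the OpenAI function-calling format.
KINETIC_ENERGY_SCHEMA = {
    "type": "function",
    "function": {
        "name": "kinetic_energy",
        "description": "Compute kinetic energy in joules from mass and speed.",
        "parameters": {
            "type": "object",
            "properties": {
                "mass_kg": {"type": "number", "description": "Mass in kilograms."},
                "speed_m_s": {"type": "number", "description": "Speed in metres per second."},
            },
            "required": ["mass_kg", "speed_m_s"],
        },
    },
}


def call_tool(name, arguments_json, registry):
    """Dispatch a model-issued tool call: a tool name plus JSON-encoded arguments."""
    return registry[name](**json.loads(arguments_json))


registry = {"kinetic_energy": kinetic_energy}
result = call_tool("kinetic_energy", '{"mass_kg": 2, "speed_m_s": 3}', registry)
print(result)  # 9.0
```

The schema is what the model sees when deciding whether to call a tool; the registry maps the name the model returns back to the Python function.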
3. Evaluation
python eval.py --input_data_path $test_file_path --tool_path $library_path --model_name $model_to_test
An example test file is at src/data/example_test.json
