
Functionary

Chat language model that can use tools and interpret the results

Install / Use

/learn @MeetKai/Functionary

Functionary

<a href="https://meetkai.com/"> <img align="right" width="256" height="256" src="https://github.com/meetkai/functionary/assets/3749407/c7a1972d-6ad7-40dc-8000-dceabe6baabd"> </a>

Functionary is a language model that can interpret and execute functions/plugins.

The model determines when to execute functions, whether in parallel or serially, and can understand their outputs. It only triggers functions as needed. Function definitions are given as JSON Schema Objects, similar to OpenAI GPT function calls.
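
To make the format concrete, here is a sketch of a tool definition in the JSON Schema style described above. The `get_stock_price` function and its parameters are invented for illustration; the real README's weather example appears later in this document.

```python
import json

# A hypothetical tool definition in the JSON Schema format described above.
# "get_stock_price" and its parameters are made up for illustration.
get_stock_price_tool = {
    "type": "function",
    "function": {
        "name": "get_stock_price",
        "description": "Look up the latest price for a stock ticker",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "Ticker symbol, e.g. AAPL",
                }
            },
            "required": ["ticker"],
        },
    },
}

# The definition is plain JSON, so it serializes directly into the
# "tools" field of a chat completion request.
payload = json.dumps({"tools": [get_stock_price_tool]})
```

Because the definition is ordinary JSON, the same object works unchanged in curl payloads, the OpenAI Python client, and raw `requests` calls.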

Documentation and more examples: functionary.meetkai.com


Getting Started

Functionary can be deployed with either our vLLM or our SGLang server. Choose whichever fits your setup.

Installation

vLLM

pip install -e .[vllm]

SGLang

pip install -e .[sglang] --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer-python

Running the server

Small Model

vLLM

python3 server_vllm.py --model "meetkai/functionary-v4r-small-preview" --host 0.0.0.0 --port 8000 --max-model-len 8192

SGLang

python3 server_sglang.py --model-path "meetkai/functionary-v4r-small-preview" --host 0.0.0.0 --port 8000 --context-length 8192

Medium Model

Our medium models require 4x A6000 or 2x A100 80GB GPUs to run. Shard the model across them with --tensor-parallel-size (vLLM) or --tp (SGLang).

vLLM

# vLLM requires running this first: https://github.com/vllm-project/vllm/issues/6152
export VLLM_WORKER_MULTIPROC_METHOD=spawn

python server_vllm.py --model "meetkai/functionary-medium-v3.1" --host 0.0.0.0 --port 8000 --max-model-len 8192 --tensor-parallel-size 2

SGLang

python server_sglang.py --model-path "meetkai/functionary-medium-v3.1" --host 0.0.0.0 --port 8000 --context-length 8192 --tp 2

LoRA Support (Currently Only in vLLM)

Like upstream vLLM's LoRA support, our server can serve LoRA adapters both at startup and dynamically.

To serve a LoRA adapter at startup, run the server with the --lora-modules argument:

python server_vllm.py --model {BASE_MODEL} --enable-lora --lora-modules {name}={path} {name}={path} --host 0.0.0.0 --port 8000

To serve a LoRA adapter dynamically, use the /v1/load_lora_adapter endpoint:

python server_vllm.py --model {BASE_MODEL} --enable-lora --host 0.0.0.0 --port 8000
# Load a LoRA adapter dynamically
curl -X POST http://localhost:8000/v1/load_lora_adapter \
  -H "Content-Type: application/json" \
  -d '{
    "lora_name": "my_lora",
    "lora_path": "/path/to/my_lora_adapter"
  }'
# Example chat request to lora adapter
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my_lora",
    "messages": [...],
    "tools": [...],
    "tool_choice": "auto"
  }'
# Unload a LoRA adapter dynamically
curl -X POST http://localhost:8000/v1/unload_lora_adapter \
  -H "Content-Type: application/json" \
  -d '{
    "lora_name": "my_lora"
  }'
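
The same load/chat/unload cycle can be driven from Python. The sketch below only builds the endpoint URLs and payloads that mirror the curl examples above; the adapter name and path are placeholders, and an HTTP library such as `requests` would actually send them.

```python
# Minimal sketch of the LoRA lifecycle payloads from the curl examples above.
# The base URL, adapter name, and adapter path are placeholders.
BASE = "http://localhost:8000"

def load_adapter_request(name: str, path: str):
    """Return the (url, json_payload) pair for /v1/load_lora_adapter."""
    return f"{BASE}/v1/load_lora_adapter", {"lora_name": name, "lora_path": path}

def unload_adapter_request(name: str):
    """Return the (url, json_payload) pair for /v1/unload_lora_adapter."""
    return f"{BASE}/v1/unload_lora_adapter", {"lora_name": name}

# To actually issue the calls, pass each pair to e.g. requests.post(url, json=payload).
load_url, load_payload = load_adapter_request("my_lora", "/path/to/my_lora_adapter")
unload_url, unload_payload = unload_adapter_request("my_lora")
```

Once loaded, the adapter is addressed by its `lora_name` in the `model` field of chat requests, exactly as in the curl example above.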

Text-Generation-Inference (TGI)

We also provide a service that performs inference on Functionary models using Text-Generation-Inference (TGI). Follow these steps to get started:

  1. Install Docker following their installation instructions.

  2. Install the Docker SDK for Python

pip install docker

  3. Start up the Functionary TGI server

At start-up, the Functionary TGI server first tries to connect to an existing TGI endpoint. If one is already running, you can run the following:

python3 server_tgi.py --model <REMOTE_MODEL_ID_OR_LOCAL_MODEL_PATH> --endpoint <TGI_SERVICE_ENDPOINT>

If the TGI endpoint does not exist, the Functionary TGI server will start a new TGI endpoint container with the address provided in the endpoint CLI argument via the installed Docker Python SDK. Run the following commands for remote and local models respectively:

python3 server_tgi.py --model <REMOTE_MODEL_ID> --remote_model_save_folder <PATH_TO_SAVE_AND_CACHE_REMOTE_MODEL> --endpoint <TGI_SERVICE_ENDPOINT>
python3 server_tgi.py --model <LOCAL_MODEL_PATH> --endpoint <TGI_SERVICE_ENDPOINT>

  4. Make either OpenAI-compatible or raw HTTP requests to the Functionary TGI server.

Docker

If you're having trouble with dependencies and you have the NVIDIA Container Toolkit installed, you can build and run the environment in Docker instead:

cd <ROOT>

# vLLM
sudo docker build -t functionary-vllm -f dockerfiles/Dockerfile.vllm .
sudo docker run --runtime nvidia --gpus all -p 8000:8000 functionary-vllm

# SGLang
sudo docker build -t functionary-sglang -f dockerfiles/Dockerfile.sgl .
sudo docker run --runtime nvidia --gpus all -p 8000:8000 functionary-sglang

OpenAI Compatible Usage

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="functionary")

client.chat.completions.create(
    model="meetkai/functionary-v4r-small-preview",
    messages=[{"role": "user",
            "content": "What is the weather for Istanbul?"}
    ],
    tools=[{
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        }
                    },
                    "required": ["location"]
                }
            }
        }],
    tool_choice="auto"
)
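
When the model decides to call a tool, the assistant message carries tool calls that your code is expected to execute and feed back as "tool" role messages. A minimal dispatch sketch, operating on dict-shaped data in the OpenAI chat-completions convention (the weather lookup is a stub; the OpenAI Python client returns objects rather than dicts, so adapt the field access accordingly):

```python
import json

# Stub implementation of the tool declared above; a real version would
# call an actual weather API.
def get_current_weather(location: str) -> dict:
    return {"location": location, "temperature_c": 21, "condition": "sunny"}

TOOL_REGISTRY = {"get_current_weather": get_current_weather}

def run_tool_calls(tool_calls):
    """Execute each tool call and build the 'tool' role messages to send back."""
    messages = []
    for call in tool_calls:
        fn = TOOL_REGISTRY[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        result = fn(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages

# Simulated assistant tool call, shaped like an OpenAI-style response:
calls = [{
    "id": "call_0",
    "function": {"name": "get_current_weather",
                 "arguments": '{"location": "Istanbul"}'},
}]
follow_up = run_tool_calls(calls)
```

Appending `follow_up` to the message history and calling `client.chat.completions.create` again lets the model interpret the tool output and produce its final answer.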

Raw Usage

<details> <summary>Details (click to expand)</summary>
import requests

data = {
    'model': 'meetkai/functionary-v4r-small-preview',  # the value of the "--model" argument used when deploying server_vllm.py or server.py
    'messages': [
        {
            "role": "user",
            "content": "What is the weather for Istanbul?"
        }
    ],
    'tools': [  # for functionary-7b-v2 we use "tools"; for functionary-7b-v1.4 we use "functions" = [{"name": "get_current_weather", "description": ..., "parameters": ...}]
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ],
    'tool_choice': "auto"
}

response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json=data,
    headers={"Content-Type": "application/json"}
)
print(response.json())
</details>