PrivacyAsst: Safeguarding User Privacy in Tool-Using Large Language Model Agents

This repository is the official implementation of PrivacyAsst: Safeguarding User Privacy in Tool-Using Large Language Model Agents, Xinyu Zhang, Huiyu Xu, Zhongjie Ba, Zhibo Wang, Yuan Hong, Jian Liu, Zhan Qin, Kui Ren. IEEE Transactions on Dependable and Secure Computing (TDSC), 2024.

Introduction

Swift advancements in large language model (LLM) technologies lead to widespread research and applications, particularly in integrating LLMs with auxiliary tools, known as tool-using LLM agents. However, amid user interactions, the transmission of private information to both LLMs and tools poses considerable privacy risks to users. In this paper, we delve into current privacy-preserving solutions for LLMs and outline three pivotal challenges for tool-using LLM agents: generalization to both open-source and closed-source LLMs and tools, compliance with privacy requirements, and applicability to unrestricted tasks. To tackle these challenges, we present PrivacyAsst, the first privacy-preserving framework tailored for tool-using LLM agents, encompassing two solutions for different application scenarios. First, we incorporate a homomorphic encryption scheme to ensure computational security guarantees for users as a safeguard against both open-source and closed-source LLMs and tools. Moreover, we propose a shuffling-based solution to broaden the framework's applicability to unrestricted tasks. This solution employs an attribute-based forgery generative model and an attribute shuffling mechanism to craft privacy-preserving requests, effectively concealing individual inputs. Additionally, we introduce an innovative privacy concept, t-closeness in image data, for privacy compliance within this solution. Finally, we implement PrivacyAsst, accompanied by two case studies, demonstrating its effectiveness in advancing privacy-preserving artificial intelligence.

Quick Start

Our project is based on HuggingGPT, so the system requirements and usage are similar.

For User:

First choose your secrect key XA.
Run user/tCloseness/public_key_generate.py to generate your Diffie Hellman public key, and fill in diff_hellmen.key_user.
Replace openai.api_key, huggingface.token, and diff_hellmen.key_user in server/configs/config.default.yaml with your personal OpenAI Key, your Hugging Face Token, and your Diffie Hellman Public Key or put them in the environment variables OPENAI_API_KEY and HUGGINGFACE_ACCESS_TOKEN respectively. Then run the following commands.

For Server:

# setup env
cd server
conda create -n jarvis python=3.8
conda activate jarvis
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt

# download models. Make sure that `git-lfs` is installed.
cd models
bash download.sh # required when `inference_mode` is `local` or `hybrid`. 

# run server
cd ..
python models_server.py --config configs/config.default_enc.yaml # required when `inference_mode` is `local` or `hybrid`
python awesome_chat_enc.py --config configs/config.default_enc.yaml --mode server # for text-davinci-003

For CLI:

cd server
python awesome_chat_enc.py --config configs/config.default_enc.yaml --mode cli

New Components

1. server/configs/config.default_enc.yaml

diff_hellmen:   # optional: if the encryption of the content is required
    key_user: REPLACE_WITH_YOUR_DEFFIE_HELLMAN_KEY_HERE

parse_task: >- #1 Task Planning Stage needs to add "encryption-image-classification".

2. server/data/p0_models.jsonl

Add two models: Encryption-based Solution (TenSEAL/encrypt-cnn-mnist) and Shuffling-based Solution (runwayml/stable-diffusion-v1-5-enc)

{"downloads": 3523663, "id": "runwayml/stable-diffusion-v1-5-enc", "likes": 6367, "pipeline_tag": "text-to-encimage", "task": "text-to-encimage", "meta": {"license": "creativeml-openrail-m", "tags": ["stable-diffusion", "stable-diffusion-diffusers", "text-to-encimage"], "inference": true, "extra_gated_prompt": "This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.\nThe CreativeML OpenRAIL License specifies: \n\n1. You can't use the model to deliberately produce nor share illegal or harmful outputs or content \n2. CompVis claims no rights on the outputs you generate, you are free to use them and are accountable for their use which must not go against the provisions set in the license\n3. You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M to all your users (please read the license entirely and carefully)\nPlease read the full license carefully here: https://huggingface.co/spaces/CompVis/stable-diffusion-license\n    ", "extra_gated_heading": "Please read the LICENSE to access this model"}, "description": "\n\n# Stable Diffusion v1-5 Model Card\n\nStable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input.\nFor more information about how Stable Diffusion functions, please have a look at [\ud83e\udd17's Stable Diffusion blog](https://huggingface.co/blog/stable_diffusion).\n\nThe **Stable-Diffusion-v1-5** checkpoint was initialized with the weights of the [Stable-Diffusion-v1-2](https:/steps/huggingface.co/CompVis/stable-diffusion-v1-2) \ncheckpoint and subsequently fine-tuned on 595k steps at resolution 512x512 on \"laion-aesthetics v2 5+\" and 10% dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598).\n\nYou can use this both with the [\ud83e\udde8Diffusers library](https://github.com/huggingface/diffusers) and the [RunwayML GitHub repository](https://github.com/runwayml/stable-diffusion).\n\n### Diffusers\n```py\nfrom diffusers import StableDiffusionPipeline\nimport torch\n\nmodel_id = \"runwayml/stable-diffusion-v1-5\"\npipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)\npipe = pipe.to(\"cuda\")\n\nprompt = \"a photo of an astronaut riding a horse on mars\"\nimage = pipe(prompt).images[0]  \n    \nimage.save(\"astronaut_rides_horse.png\")\n```\nFor more detailed instructions, use-cases and examples in JAX follow the instructions [here](https://github.com/huggingface/diffusers#text-to-image-generation-with-stable-diffusion)\n\n### Original GitHub Repository\n\n1. Download the weights \n   - [v1-5-pruned-emaonly.ckpt](https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt) - 4.27GB, ema-only weight. uses less VRAM - suitable for inference\n   - [v1-5-pruned.ckpt](https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned.ckpt) - 7.7GB, ema+non-ema weights. uses more VRAM - suitable for fine-tuning\n\n2. Follow instructions [here](https://github.com/runwayml/stable-diffusion).\n\n## Model Details\n- **Developed by:** Robin Rombach, Patrick Esser\n- **Model type:** Diffusion-based text-to-image generation model\n- **Language(s):** English\n- **License:** [The CreativeML OpenRAIL M license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) is an [Open RAIL M license](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses), adapted from the work that [BigScience](https://bigscience.huggingface.co/) and [the RAIL Initiative](https://www.licenses.ai/) are jointly carrying in the area of responsible AI licensing. See also [the article about the BLOOM Open RAIL license](https://bigscience.huggingface.co/blog/the-bigscience-rail-license) on which our license is based.\n- **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses a fixed, pretrained text encoder ([CLIP ViT-L/14](https://arxiv.org/abs/2103.00020)) as suggested in the [Imagen paper](https://arxiv.org/abs/2205.11487).\n- **Resources for more information:** [GitHub Repository](https://github.com/CompVis/stable-diffusion), [Paper](https://arxiv.org/abs/2112.10752).\n- **Cite as:**\n\n      @InProceedings{Rombach_2022_CVPR,\n          author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\\\"orn},\n          title     = {High-Resolution Image Synthesis With Latent Diffusion Models},\n          booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},\n          month     = {June},\n          year      = {2022},\n          pages     = {10684-10695}\n      }\n\n# Uses\n\n## Direct Use \nThe model is intended for research purposes only. Possible research areas and\ntasks include\n\n- Safe deployment of models which have the potential to generate harmful content.\n- Probing and understanding the limitations and biases of generative models.\n- Generation of artworks and use in design and other artistic processes.\n- Applications in educational or creative tools.\n- Research on generative models.\n\nExcluded uses are described below.\n\n ### Misuse, Malicious Use, and Out-of-Scope Use\n_Note: This section is taken from the [DALLE-MINI model card](https://huggingface.co/dalle-mini/dalle-mini), but applies in the same way to Stable Diffusion v1_.\n\n\nThe model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotype

PrivacyAsst

Install / Use

README