RADAR

Code for our NeurIPS 2023 paper, RADAR: Robust AI-Text Detection via Adversarial Learning. We tested RADAR on 8 LLMs, including Vicuna and LLaMA. The results show that RADAR attains good detection performance on LLM-generated text while remaining robust against paraphrasing.

RADAR_AI_Detection

Live demo for RADAR: RADAR-Demo

Environment Build

    # Go to the env directory
    cd env
    # Create a conda environment with the conda-managed packages installed
    conda env create -f radar_core.yaml
    # Activate the conda environment
    conda activate radar_env
    # Install the remaining packages with pip
    pip install -r radar_requirements.txt

Use RADAR to get AI-generated probability

Our RADAR detector is trained from the RoBERTa-large model, so you can load and use it the same way you would use a RoBERTa-large sequence classifier. Here is an example of using RADAR to get the probability that a text was generated by Vicuna.

    import torch
    import torch.nn.functional as F
    import transformers

    # Use a GPU when available; otherwise fall back to the CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    detector = transformers.AutoModelForSequenceClassification.from_pretrained("TrustSafeAI/RADAR-Vicuna-7B")
    tokenizer = transformers.AutoTokenizer.from_pretrained("TrustSafeAI/RADAR-Vicuna-7B")
    detector.eval()
    detector.to(device)
    text_input = ["I'm not a chatbot"]
    with torch.no_grad():
        inputs = tokenizer(text_input, padding=True, truncation=True, max_length=512, return_tensors="pt")
        inputs = {k: v.to(device) for k, v in inputs.items()}
        # Column 0 of the class probabilities is read as the AI-generated probability
        output_probs = F.log_softmax(detector(**inputs).logits, -1)[:, 0].exp().tolist()
        print("Probability of AI-generated texts is", output_probs)
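
To make the `F.log_softmax(...)[:, 0].exp()` expression above more concrete, here is a minimal pure-Python sketch of the same arithmetic. It assumes, as in the example above, that index 0 of the logits is the AI-generated class. The helper name `ai_probability` and the sample logits are made up for illustration and are not part of RADAR.

```python
import math

def ai_probability(logits):
    """Return the softmax probability of class 0 for one row of logits."""
    # Subtract the maximum logit for numerical stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return exps[0] / sum(exps)

# Hypothetical logits for two texts, ordered as [AI-generated, human-written]
batch_logits = [[2.0, 0.0], [0.0, 0.0]]
probs = [ai_probability(row) for row in batch_logits]
print(probs)
```

Taking `exp` of a log-softmax value gives the ordinary softmax probability, so equal logits give a probability of 0.5.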

Paraphrase the AI-text to evade detection

We prompt gpt-3.5-turbo or gpt-4 to paraphrase the AI-generated text so that it reads more like human-written text.

    # Note: openai.ChatCompletion is the interface of the openai package before version 1.0
    import openai
    openai.api_key = "your_api_key"

    def _openai_response(text, openai_model):
        # Get a paraphrase of text from the OpenAI model
        # openai_model can be "gpt-3.5-turbo" or "gpt-4"
        system_instruct = {"role": "system", "content": "Enhance the word choices in the sentence to sound more like that of a human."}
        user_input = {"role": "user", "content": text}
        messages = [system_instruct, user_input]
        kwargs = {"messages": messages, "model": openai_model}
        r = openai.ChatCompletion.create(**kwargs)["choices"][0].message.content
        return r
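
One way to see how paraphrasing affects a detector is to score each text before and after paraphrasing and compare the two. The sketch below is only an illustration of that idea, not the evaluation protocol from the paper. `detection_drop`, `fake_paraphrase`, and `fake_score` are hypothetical names, and the two stand-in functions exist only so the snippet runs offline. In practice you would pass a paraphraser such as the function above and a scorer built on the RADAR detector.

```python
def detection_drop(texts, paraphrase, score):
    """Compare detector scores before and after paraphrasing.

    paraphrase: callable str -> str (for example, a wrapper around an LLM API)
    score: callable list[str] -> list[float] of AI-generated probabilities
    """
    before = score(texts)
    after = score([paraphrase(t) for t in texts])
    # One (before, after, drop) tuple per input text
    return [(b, a, b - a) for b, a in zip(before, after)]

# Stand-in callables (assumptions for illustration only)
def fake_paraphrase(text):
    return text.replace("utilize", "use")

def fake_score(texts):
    # Toy scorer: treats texts containing "utilize" as likely AI-generated
    return [0.9 if "utilize" in t else 0.2 for t in texts]

results = detection_drop(["We utilize a model.", "Hi there."], fake_paraphrase, fake_score)
print(results)
```

A robust detector should show only a small drop between the first and second value of each tuple.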

Calculate the Detection AUROC

To evaluate the detector, compute its detection AUROC from the AI-generated probabilities it assigns to human-written and AI-generated texts.

    from sklearn.metrics import auc, roc_curve

    def get_roc_metrics(human_preds, ai_preds):
        # human_preds: list of AI-generated probabilities predicted for human-written texts
        # ai_preds: list of AI-generated probabilities predicted for AI-generated texts
        # Human-written texts are labeled 0 and AI-generated texts are labeled 1
        labels = [0] * len(human_preds) + [1] * len(ai_preds)
        fpr, tpr, _ = roc_curve(labels, human_preds + ai_preds, pos_label=1)
        roc_auc = auc(fpr, tpr)
        return fpr.tolist(), tpr.tolist(), float(roc_auc)
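
As a sanity check on what the AUROC measures, it can also be computed without scikit-learn as the fraction of (AI text, human text) pairs in which the AI text receives the higher score, with ties counted as one half. The sketch below is an illustrative cross-check under that definition. The function name `pairwise_auroc` and the sample scores are made up and do not come from the RADAR experiments.

```python
def pairwise_auroc(human_preds, ai_preds):
    """AUROC as the fraction of (AI, human) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for a in ai_preds:
        for h in human_preds:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(ai_preds) * len(human_preds))

# Hypothetical AI-generated probabilities for three human and three AI texts
human_preds = [0.1, 0.4, 0.35]
ai_preds = [0.8, 0.9, 0.3]
print(pairwise_auroc(human_preds, ai_preds))
```

In this toy data, 7 of the 9 pairs are ranked correctly, so the value is 7/9. The quadratic loop is fine for small checks, but `roc_curve` is the better choice for large evaluation sets.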

Examples

We provide some examples of using RADAR in radar_examples.ipynb. Refer to it to become more familiar with the RADAR workflow.

Citation

If you find RADAR useful, please cite the following paper:

@inproceedings{DBLP:conf/nips/HuCH23,
  author       = {Xiaomeng Hu and
                  Pin{-}Yu Chen and
                  Tsung{-}Yi Ho},
  title        = {{RADAR:} Robust AI-Text Detection via Adversarial Learning},
  booktitle    = {Advances in Neural Information Processing Systems 36: Annual Conference
                  on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans,
                  LA, USA, December 10 - 16, 2023},
  year         = {2023}
}

Contact

Feel free to contact Xiaomeng Hu if you have any questions.
