FinGPT: Open-Source Financial Large Language Models

Let us not expect Wall Street to open-source LLMs or open APIs, due to FinTech institutes' internal regulations and policies.

Blueprint of FinGPT

https://huggingface.co/FinGPT

Project Contributors

FinGPT is an open-source financial large language model project developed and maintained by the AI4Finance Foundation.

Key contributors include:

Hongyang (Bruce) Yang – research and development on financial large language models and related applications
[other contributors…]

What's New:

[Model Release] Nov, 2023: We release FinGPT-Forecaster! 🔥Demo, Medium Blog & Model are available on Huggingface🤗!
[Paper Acceptance] Oct, 2023: "FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets" is accepted🎉 by Instruction Workshop @ NeurIPS 2023
[Paper Acceptance] Oct, 2023: "FinGPT: Democratizing Internet-scale Data for Financial Large Language Models" is accepted🎉 by Instruction Workshop @ NeurIPS 2023
[Model Release] Oct, 2023: We release the financial multi-task LLMs 🔥 produced when evaluating base-LLMs on FinGPT-Benchmark
[Paper Acceptance] Sep, 2023: "Enhancing Financial Sentiment Analysis via Retrieval Augmented Large Language Models" is accepted🎉 by ACM International Conference on AI in Finance (ICAIF-23)
[Model Release] Aug, 2023: We release the financial sentiment analysis model 🔥
[Paper Acceptance] Jul, 2023: "Instruct-FinGPT: Financial Sentiment Analysis by Instruction Tuning of General-Purpose Large Language Models" is accepted🎉 by FinLLM 2023@IJCAI 2023
[Paper Acceptance] Jul, 2023: "FinGPT: Open-Source Financial Large Language Models" is accepted🎉 by FinLLM 2023@IJCAI 2023
[Medium Blog] Jun 2023: FinGPT: Powering the Future of Finance with 20 Cutting-Edge Applications

Why FinGPT?

1). Finance is highly dynamic. BloombergGPT trained an LLM on a mixture of finance data and general-purpose data, which took about 53 days at a cost of around $3M. Retraining a model like BloombergGPT every month or every week is costly, so lightweight adaptation is highly favorable. FinGPT can be fine-tuned swiftly to incorporate new data, and the cost falls significantly, to less than $300 per fine-tuning.

2). Democratizing Internet-scale financial data is critical, for example to allow timely model updates (monthly or weekly) through an automatic data curation pipeline. BloombergGPT has privileged data access and APIs, while FinGPT presents a more accessible alternative. It prioritizes lightweight adaptation, leveraging the best available open-source LLMs.

3). The key technology is "RLHF (Reinforcement Learning from Human Feedback)", which is missing in BloombergGPT. RLHF enables an LLM to learn individual preferences (risk-aversion level, investing habits, personalized robo-advisor, etc.), and it is the "secret" ingredient of ChatGPT and GPT-4.

Milestone of AI Robo-Advisor: FinGPT-Forecaster

Try the latest FinGPT-Forecaster demo at our HuggingFace Space.

The dataset for FinGPT-Forecaster: https://huggingface.co/datasets/FinGPT/fingpt-forecaster-dow30-202305-202405

Enter the following inputs:

ticker symbol (e.g. AAPL, MSFT, NVDA)
the date from which the prediction should start (yyyy-mm-dd)
the number of past weeks of market news to retrieve
whether to add the latest basic financials as additional information

Click Submit, and you will receive a well-rounded analysis of the company and a prediction of next week's stock price movement.
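
The four inputs listed above can be represented as a small validated record before they are passed to a model. The following is a minimal sketch only, not the demo's actual code: the class `ForecasterInputs`, the helper `parse_inputs`, and all field names are hypothetical, and the assumption that the news window ends on the prediction date is ours.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ForecasterInputs:
    """Hypothetical container for the four demo inputs (not the demo's API)."""
    ticker: str                      # e.g. "AAPL", "MSFT", "NVDA"
    prediction_date: date            # entered as yyyy-mm-dd
    past_weeks: int                  # weeks of market news to retrieve
    use_basic_financials: bool = False

    def __post_init__(self):
        # Normalize the ticker and reject obviously invalid values.
        self.ticker = self.ticker.strip().upper()
        if not self.ticker:
            raise ValueError("ticker must be non-empty")
        if self.past_weeks < 1:
            raise ValueError("past_weeks must be at least 1")

    def news_window(self):
        """Return (start, end) dates of the news window.

        Assumption: the window ends on the prediction date.
        """
        start = self.prediction_date - timedelta(weeks=self.past_weeks)
        return start, self.prediction_date


def parse_inputs(ticker, day, past_weeks, use_basic_financials=False):
    """Build ForecasterInputs from raw form values; `day` is 'yyyy-mm-dd'."""
    return ForecasterInputs(
        ticker, date.fromisoformat(day), past_weeks, use_basic_financials
    )


if __name__ == "__main__":
    inputs = parse_inputs("aapl", "2023-11-01", 3, use_basic_financials=True)
    print(inputs.ticker, inputs.news_window())
```

Validating the date string with `date.fromisoformat` rejects malformed entries early, before any news retrieval would take place.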

For a more detailed and customized implementation, please refer to FinGPT-Forecaster.

FinGPT Demos:

Current State of the Art for Financial Sentiment Analysis

FinGPT V3 (Updated on 10/12/2023)
- What's new: The best trainable and inferable FinGPT for sentiment analysis on a single RTX 3090, which is even better than GPT-4 and ChatGPT fine-tuning.
- The FinGPT v3 series are LLMs fine-tuned with the LoRA method on the News and Tweets sentiment analysis dataset; they achieve the best scores on most of the financial sentiment analysis datasets at low cost.
- FinGPT v3.3 uses llama2-13b as the base model; FinGPT v3.2 uses llama2-7b as the base model; FinGPT v3.1 uses chatglm2-6B as the base model.
- Benchmark Results:
  | Weighted F1 | FPB | FiQA-SA | TFNS | NWGI | Devices | Time | Cost |
  | ------------------------------------------------------------ | :-------: | :-------: | :-------: | :-------: | :----------------: | :---------: | :------------: |
  | FinGPT v3.3 | 0.882 | 0.874 | 0.903 | 0.643 | 1 × RTX 3090 | 17.25 hours | $17.25 |
  | FinGPT v3.2 | 0.850 | 0.860 | 0.894 | 0.636 | 1 × A100 | 5.5 hours | $22.55 |
  | FinGPT v3.1 | 0.855 | 0.850 | 0.875 | 0.642 | 1 × A100 | 5.5 hours | $22.55 |
  | FinGPT (8bit) | 0.855 | 0.847 | 0.879 | 0.632 | 1 × RTX 3090 | 6.47 hours | $6.47 |
  | FinGPT (QLoRA) | 0.777 | 0.752 | 0.828 | 0.583 | 1 × RTX 3090 | 4.15 hours | $4.15 |
  | OpenAI Fine-tune | 0.878 | 0.887 | 0.883 | - | - | - | - |
  | GPT-4 | 0.833 | 0.630 | 0.808 | - | - | - | - |
  | FinBERT | 0.880 | 0.596 | 0.733 | 0.538 | 4 × NVIDIA K80 GPU | - | - |
  | Llama2-7B | 0.390 | 0.800 | 0.296 | 0.503 | 2048 × A100 | 21 days | $4.23 million |
  | BloombergGPT | 0.511 | 0.751 | - | - | 512 × A100 | 53 days | $2.67 million |
  
  Cost per GPU hour: For A100 GPUs, the AWS p4d.24xlarge instance, which is equipped with 8 A100 GPUs, is used as the benchmark for estimating costs. Note that BloombergGPT also used p4d.24xlarge. As of July 11, 2023, the hourly rate for this instance was $32.773, so the estimated cost per GPU hour is $32.773 divided by 8, or approximately $4.10. Using this value as the reference unit price (1 GPU hour), the estimated cost of BloombergGPT is 512 × 53 × 24 = 651,264 GPU hours × $4.10 = $2,670,182.40. For the RTX 3090, we assume a cost of approximately $1.00 per hour, which is actually much higher than the price of GPUs available on platforms like vast.ai.
- Reproduce the results by running [benchm
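
The cost estimates in the benchmark table follow from a simple GPU-hour calculation. The sketch below reproduces that arithmetic from the figures stated in the cost note; the function name `training_cost` and the constant names are illustrative choices, not part of the FinGPT codebase.

```python
# Reference prices taken from the cost note above.
P4D_HOURLY_USD = 32.773      # AWS p4d.24xlarge hourly rate as of July 11, 2023
GPUS_PER_P4D = 8             # A100 GPUs per p4d.24xlarge instance

# Cost per A100 GPU hour, rounded to cents as in the note (about $4.10).
A100_GPU_HOUR_USD = round(P4D_HOURLY_USD / GPUS_PER_P4D, 2)

# Assumed RTX 3090 price per hour, as stated in the note.
RTX3090_GPU_HOUR_USD = 1.0


def training_cost(num_gpus, hours, usd_per_gpu_hour):
    """Return (gpu_hours, estimated_cost_usd) for a training run."""
    gpu_hours = num_gpus * hours
    return gpu_hours, round(gpu_hours * usd_per_gpu_hour, 2)


if __name__ == "__main__":
    runs = {
        "BloombergGPT": (512, 53 * 24, A100_GPU_HOUR_USD),
        "Llama2-7B": (2048, 21 * 24, A100_GPU_HOUR_USD),
        "FinGPT v3.2": (1, 5.5, A100_GPU_HOUR_USD),
        "FinGPT v3.3": (1, 17.25, RTX3090_GPU_HOUR_USD),
    }
    for name, (gpus, hours, rate) in runs.items():
        gpu_hours, cost = training_cost(gpus, hours, rate)
        print(f"{name}: {gpu_hours:,} GPU hours, ${cost:,.2f}")
```

Rounding the A100 rate to $4.10 before multiplying matches the note's own calculation; using the unrounded rate would give slightly lower totals.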
