Spectre
GPU-accelerated Factors analysis library and Backtester
spectre is a GPU-accelerated Parallel quantitative trading library, focused on performance.
- Fast GPU factor engine; see Benchmarks below.
- Pure Python code based on PyTorch, so it integrates smoothly with deep learning models.
- Compatible with alphalens and pyfolio.
Python 3.7+, PyTorch 1.3+, Pandas 1.0+ recommended
Installation
pip install --no-deps git+https://github.com/Heerozh/spectre.git
Dependencies:
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
conda install pyarrow pandas tqdm plotly requests bs4 lxml
Benchmarks
My Machine:
- i9-7900X @ 3.30GHz, 20 Cores
- DDR4 3800MHz
- 3090: GIGABYTE GeForce RTX 3090 GAMING OC 24G
- 2080Ti: RTX 2080Ti Founders
Running on Quandl 5 years, 3196 Assets, total 3,637,344 bars.
| | spectre (CUDA/3090) | spectre (CUDA/2080Ti) | spectre (CPU) | zipline.pipeline |
|----------------|-------------------------------|-------------------------------|----------------------------|-----------------------|
|SMA(100) | 87.9 ms ± 3.35 ms (33.9x) | 144 ms ± 974 µs (20.7x) | 2.68 s ± 36.1 ms (1.11x) | 2.98 s ± 14.4 ms (1x) |
|EMA(50) win=229 | 166 ms ± 3.25 ms (50.5x) | 270 ms ± 1.89 ms (31.0x) | 4.37 s ± 46.4 ms (1.74x) | 8.38 s ± 56.8 ms (1x) |
|(MACD+RSI+STOCHF).rank.zscore | 184 ms ± 7.83 ms (77.7x) | 282 ms ± 1.33 ms (50.7x) | 6.01 s ± 28.1 (2.38x) | 14.3 s ± 277 ms (1x) |
- The CUDA memory used in the spectre benchmark is 1.8G, returned by cuda.max_memory_allocated().
- Benchmarks exclude the initial run, so the timings do not include copying data to VRAM, which saves about 300 ms.
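The multipliers in parentheses appear to be each timing relative to the zipline.pipeline baseline in the same row. A quick check for the SMA(100) row, using only figures from the table (the helper name `speedup` is illustrative, not part of spectre):

```python
def speedup(baseline_seconds: float, measured_seconds: float) -> float:
    """Return how many times faster the measured run is than the baseline."""
    return baseline_seconds / measured_seconds

# SMA(100) row: zipline.pipeline takes 2.98 s (the 1x baseline).
zipline_sma = 2.98
sma_speedups = {
    "CUDA/3090": speedup(zipline_sma, 87.9e-3),   # 87.9 ms
    "CUDA/2080Ti": speedup(zipline_sma, 144e-3),  # 144 ms
    "CPU": speedup(zipline_sma, 2.68),            # 2.68 s
}

for device, ratio in sma_speedups.items():
    print(f"{device}: {ratio:.2f}x")
```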
Quick Start
DataLoader
First you need data. You can use CsvDirLoader to read your CSV files.
spectre also has a built-in Yahoo downloader; symbols=None downloads all S&P 500 components.
from spectre.data import YahooDownloader
YahooDownloader.ingest(start_date="2001", save_to="./prices/yahoo", symbols=None, skip_exists=True)
You can now load that data with spectre.data.ArrowLoader('./prices/yahoo/yahoo.feather').
Factor and FactorEngine
from spectre import factors
from spectre.data import ArrowLoader
loader = ArrowLoader('./prices/yahoo/yahoo.feather')
engine = factors.FactorEngine(loader)
engine.to_cuda()
engine.add(factors.SMA(5), 'ma5')
engine.add(factors.OHLCV.close, 'close')
df = engine.run('2019-01-11', '2019-01-15')
df
| | | ma5| close|
|-------------------------|---------|-----------|-----------|
|date |asset| | |
|2019-01-14 00:00:00+00:00| A| 68.842003| 70.379997|
| | AAPL| 151.615997| 152.289993|
| | ABC| 75.835999| 76.559998|
| | ABT| 69.056000| 69.330002|
| | ADBE| 234.537994| 237.550003|
| ...| ...| ...| ...|
|2019-01-15 00:00:00+00:00| XYL| 68.322006| 69.160004|
| | YUM| 91.010002| 90.000000|
| | ZBH| 102.932007| 102.690002|
| | ZION| 43.760002| 44.320000|
| | ZTS| 85.846001| 84.500000|
Factor Analysis
from spectre import factors
import math
risk_free_rate = 0.04 / 252
excess_logret = factors.LogReturns() - math.log(1 + risk_free_rate)
universe = factors.AverageDollarVolume(win=120).top(100)
# Barra MOMENTUM
ema126 = factors.EMA(half_life=126, inputs=[excess_logret])
rstr = ema126.shift(11).sum(252)
MOMENTUM = rstr
# Barra Volatility
ema42 = factors.EMA(half_life=42, inputs=[excess_logret])
dastd = factors.STDDEV(252, inputs=[ema42])
VOLATILITY = dastd
# run engine
from spectre.data import ArrowLoader
loader = ArrowLoader('./prices/yahoo/yahoo.feather')
engine = factors.FactorEngine(loader)
engine.set_filter( universe )
engine.add( MOMENTUM, 'MOMENTUM' )
engine.add( VOLATILITY, 'VOLATILITY' )
engine.to_cuda()
%time factor_data, mean_return = engine.full_run("2013-01-02", "2018-01-19", periods=(1,5,10,))
<img src="https://github.com/Heerozh/spectre/raw/media/full_run.png" width="800" height="600">
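The factor definitions above turn a 4% annual risk-free rate into a daily log return and specify EMA decay by half-life. The plain-Python sketch below works through that arithmetic. It assumes the textbook half-life convention (a bar's weight halves every `half_life` bars); the README does not confirm that spectre's EMA is implemented exactly this way, and `half_life_weights` is a hypothetical helper, not a spectre function.

```python
import math

# Daily risk-free log return, as in the snippet above:
# 4% a year spread over 252 trading days.
risk_free_rate = 0.04 / 252
daily_rf_logret = math.log(1 + risk_free_rate)

def half_life_weights(half_life: float, n: int) -> list:
    """Normalized exponential weights for n bars, newest bar first.

    Assumption: the weight of a bar halves every `half_life` bars.
    This is the conventional definition, not taken from spectre's source.
    """
    decay = 0.5 ** (1.0 / half_life)
    raw = [decay ** k for k in range(n)]
    total = sum(raw)
    return [w / total for w in raw]

weights = half_life_weights(half_life=126, n=253)
print(f"daily risk-free log return: {daily_rf_logret:.8f}")
print(f"weight ratio 126 bars apart: {weights[126] / weights[0]:.3f}")
```

For rates this small the log return is almost identical to the simple daily rate, which is why subtracting it from LogReturns() is a reasonable way to form an excess return.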
Diagram
You can also view your factor structure graphically:
factors.BBANDS(win=5).normalized().rank().zscore().show_graph()
<img src="https://github.com/Heerozh/spectre/raw/media/factor_diagram.png" width="800" height="360">
The thickness of the line represents the length of the Rolling Window, kind of like "bandwidth".
If you call engine.to_cuda(enable_stream=True), the branches are calculated
simultaneously, but VRAM usage increases proportionally.
Compatible with alphalens
The return value of full_run is compatible with alphalens:
import alphalens as al
...
factor_data, _ = engine.full_run("2013-01-02", "2018-01-19")
clean_data = factor_data[['{factor_name}', 'Returns']].droplevel(0, axis=1)
al.tears.create_full_tear_sheet(clean_data)
Back-testing
Back-testing uses FactorEngine's results as data and market events as triggers.
You can find other examples in the ./examples directory.
from spectre import factors, trading
from spectre.data import ArrowLoader
import pandas as pd, math
class MyAlg(trading.CustomAlgorithm):
    def initialize(self):
        # your factors
        risk_free_rate = 0.04 / 252
        excess_logret = factors.LogReturns() - math.log(1 + risk_free_rate)
        universe = factors.AverageDollarVolume(win=120).top(100)

        # Barra MOMENTUM Risk Factor
        ema126 = factors.EMA(half_life=126, inputs=[excess_logret])
        rstr = ema126.shift(11).sum(252)
        MOMENTUM = rstr.zscore(mask=universe)

        # Barra Volatility Risk Factor
        ema42 = factors.EMA(half_life=42, inputs=[excess_logret])
        dastd = factors.STDDEV(252, inputs=[ema42])
        VOLATILITY = dastd.zscore(mask=universe)

        # setup engine
        engine = self.get_factor_engine()
        engine.to_cuda()
        engine.set_filter( universe )
        engine.add( (MOMENTUM + VOLATILITY).to_weight(), 'alpha_weight' )

        # schedule rebalance before market close
        self.schedule_rebalance(trading.event.MarketClose(self.rebalance, offset_ns=-10000))

        # simulation parameters
        self.blotter.capital_base = 1000000
        self.blotter.set_commission(percentage=0, per_share=0.005, minimum=1)
        # self.blotter.set_slippage(percentage=0, per_share=0.4)

    def rebalance(self, data: 'pd.DataFrame', history: 'pd.DataFrame'):
        data = data.fillna(0)
        self.blotter.batch_order_target_percent(data.index, data.alpha_weight)

        # close positions in assets that are no longer in our universe.
        removes = self.blotter.portfolio.positions.keys() - set(data.index)
        self.blotter.batch_order_target_percent(removes, [0] * len(removes))

        # record data for debugging / plotting
        self.record(aapl_weight=data.loc['AAPL', 'alpha_weight'],
                    aapl_price=self.blotter.get_price('AAPL'))

    def terminate(self, records: 'pd.DataFrame'):
        # plotting results
        self.plot(benchmark='SPY')

        # plotting the relationship between AAPL price and weight
        ax1 = records.aapl_price.plot()
        ax2 = ax1.twinx()
        records.aapl_weight.plot(ax=ax2, style='g-')

loader = ArrowLoader('./prices/yahoo/yahoo.feather')
%time results = trading.run_backtest(loader, MyAlg, '2014-01-01', '2019-01-01')
<img src="https://github.com/Heerozh/spectre/raw/media/backtest.png" width="800" height="630">
The result is awful, but you get the idea.
The return value of run_backtest is compatible with pyfolio:
import pyfolio as pf
pf.create_full_tear_sheet(results.returns, positions=results.positions.value, transactions=results.transactions,
live_start_date='2017-01-03')
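The clean-up step in rebalance above relies on the fact that a dict's keys() view supports set operations. A minimal stand-alone illustration, using made-up holdings and a made-up universe rather than real blotter objects:

```python
# Hypothetical current holdings: asset -> number of shares.
positions = {"AAPL": 100, "MSFT": 50, "XOM": 30}

# Hypothetical assets that passed the universe filter on this bar.
todays_universe = ["AAPL", "MSFT", "GOOG"]

# dict.keys() supports set difference, so this yields the held assets
# that are missing from today's universe.
removes = positions.keys() - set(todays_universe)

# One zero target weight per asset that should be closed.
target_weights = [0] * len(removes)

print(sorted(removes), target_weights)
```

Because every target weight is zero, the unordered nature of the resulting set does not matter when the two sequences are paired up.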
API
Note
Differences to zipline:
- To optimize for the GPU, the CustomFactor.compute function calculates the results for all bars at once. The inputs are therefore not just historical data, so you need to be careful to prevent look-ahead bias. You can also use engine.test_lookahead_bias to run some checks.
- spectre normally uses the float32 data type for GPU performance.
- spectre's FactorEngine arranges data by bars, so Return(win=10) means a 10-bar return, which may actually span more than 10 days if some assets did not trade during the period. You can change this behavior by aligning the data: fill missing bars with NaNs in your DataLoader; see the align_by_time parameter of CsvDirLoader.
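Since compute sees every bar at once, it is easy to leak future data by accident. One generic way to detect this is to alter the data after a cutoff and confirm that outputs before the cutoff do not move. The sketch below shows that idea in plain Python; it is an independent illustration with hypothetical function names, and the README does not describe how engine.test_lookahead_bias works internally.

```python
def demean_full(series):
    """Biased: subtracts the mean of ALL bars, including future ones."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]

def demean_expanding(series):
    """Unbiased: subtracts the mean of the bars seen so far (inclusive)."""
    out, running = [], 0.0
    for i, x in enumerate(series, start=1):
        running += x
        out.append(x - running / i)
    return out

def has_lookahead(factor, series, cutoff):
    """Alter the data from `cutoff` onward and check whether earlier outputs change."""
    altered = series[:cutoff] + [x * 10 for x in series[cutoff:]]
    return factor(series)[:cutoff] != factor(altered)[:cutoff]

closes = [10.0, 11.0, 12.0, 11.5, 13.0, 14.0]
biased = has_lookahead(demean_full, closes, cutoff=3)
unbiased = has_lookahead(demean_expanding, closes, cutoff=3)
print(biased, unbiased)
```

A factor that only reads past bars produces identical values before the cutoff no matter what happens afterwards, so the check reports a leak only for the full-series version.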
Differences to common chart:
- If there is adjustment data, prices are re-adjusted every day, so a factor such as MA will differ from stock chart software, which adjusts only according to the last day. If you want prices adjusted by the last day, use something like 'AdjustedColumnDataFactor(OHLCV.close)' as the input data. This is much faster because the adjustment is done only once, but it introduces look-ahead bias.
- Factors that use the close data are delayed by 1 bar.
- spectre's EMA uses the same algorithm as zipline and DataFrame.ewm(span=...); when span is greater than 100, it differs slightly from the common EMA.
- spectre's RSI uses the same algorithm as zipline, for consistency in benchmarks.
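To make the EMA note more concrete, the sketch below contrasts two formulations in plain Python: the recursive EMA that charting tools commonly use, and a normalized weighted mean over a finite window (the benchmark table lists EMA(50) with win=229, which suggests a finite window). The normalized weights are the same as in pandas' ewm(span=..., adjust=True) when the window covers the whole history. How spectre chooses its window is not stated in this README, so `win` here is purely illustrative.

```python
def ema_recursive(values, span):
    """Chart-style EMA: y[t] = a*x[t] + (1-a)*y[t-1], seeded with x[0]."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for x in values[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

def ema_windowed(values, span, win):
    """Weighted mean of the last `win` bars with normalized decaying weights.

    `win` is an illustrative parameter, not spectre's actual window rule.
    """
    alpha = 2.0 / (span + 1)
    out = []
    for t in range(len(values)):
        window = values[max(0, t - win + 1): t + 1][::-1]  # newest bar first
        weights = [(1 - alpha) ** k for k in range(len(window))]
        out.append(sum(w * x for w, x in zip(weights, window)) / sum(weights))
    return out

# Synthetic, slowly trending prices (not real market data).
prices = [100.0 + (i % 7) + 0.1 * i for i in range(300)]
recursive = ema_recursive(prices, span=120)
windowed = ema_windowed(prices, span=120, win=150)
print(f"last-bar difference: {abs(recursive[-1] - windowed[-1]):.4f}")
```

With a large span, the weights decay slowly, so the bars cut off by a finite window still carry noticeable weight in the recursive version. That is one plausible reason the two can diverge for long spans.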
Factors
Built-in Technical Indicator Factors list
Returns(inputs=[OHLCV.close])
LogReturns(inputs=[OHLCV.close])
SimpleMovingAverage = MA = SMA(win=5, inputs=[OHLCV.close])
VWAP(inputs=[OHLCV.close, OHLCV.volume])
ExponentialWeightedMovingAverage = EMA(span=5, inputs=[OHLCV.close])
AverageDollarVolume(win=5, inpu
