Spectre

GPU-accelerated Factors analysis library and Backtester

spectre

spectre is a GPU-accelerated, parallel quantitative trading library focused on performance.

Fast GPU factor engine; see Benchmarks below.
Pure Python code based on PyTorch, so it integrates with deep learning models smoothly.
Compatible with alphalens and pyfolio.

Python 3.7+, PyTorch 1.3+, Pandas 1.0+ recommended

Installation

pip install --no-deps git+https://github.com/Heerozh/spectre.git

Dependencies:

conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
conda install pyarrow pandas tqdm plotly requests bs4 lxml

Benchmarks

My machine:

i9-7900X @ 3.30GHz, 20 Cores
DDR4 3800MHz
3090: GIGABYTE GeForce RTX 3090 GAMING OC 24G
2080Ti: RTX 2080Ti Founders

Run on 5 years of Quandl data: 3,196 assets, 3,637,344 bars in total.

| | spectre (CUDA/3090) | spectre (CUDA/2080Ti) | spectre (CPU) | zipline.pipeline |
|----------------|-------------------------------|-------------------------------|----------------------------|-----------------------|
|SMA(100) | 87.9 ms ± 3.35 ms (33.9x) | 144 ms ± 974 µs (20.7x) | 2.68 s ± 36.1 ms (1.11x) | 2.98 s ± 14.4 ms (1x) |
|EMA(50) win=229 | 166 ms ± 3.25 ms (50.5x) | 270 ms ± 1.89 ms (31.0x) | 4.37 s ± 46.4 ms (1.74x) | 8.38 s ± 56.8 ms (1x) |
|(MACD+RSI+STOCHF).rank.zscore | 184 ms ± 7.83 ms (77.7x) | 282 ms ± 1.33 ms (50.7x) | 6.01 s ± 28.1 (2.38x) | 14.3 s ± 277 ms (1x) |

The spectre benchmark used 1.8 GB of CUDA memory, as reported by cuda.max_memory_allocated().
The benchmarks exclude the initial run, so the timings do not include copying data to VRAM; this saves about 300 ms.
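As a rough illustration of the methodology described above (timing repeated runs while discarding the first one), here is a minimal CPU-only sketch. The `benchmark` and `speedup` helpers are hypothetical names and are not part of spectre. On a GPU, the timed function would also need to wait for queued kernels to finish (for example with `torch.cuda.synchronize()`), which this sketch omits.

```python
import statistics
import time


def benchmark(fn, repeats=7, warmup=1):
    """Time fn(), discarding `warmup` initial calls (e.g. a first run that also copies data)."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def speedup(baseline_seconds, candidate_seconds):
    """Ratio reported in parentheses in the table: baseline time / candidate time."""
    return baseline_seconds / candidate_seconds


# SMA(100) row: zipline.pipeline at 2.98 s versus spectre on the 3090 at 87.9 ms.
print(round(speedup(2.98, 0.0879), 1))  # 33.9

# Timing an arbitrary stand-in workload.
mean_s, std_s = benchmark(lambda: sum(range(100_000)), repeats=5)
print(f"{mean_s * 1e3:.3f} ms ± {std_s * 1e3:.3f} ms")
```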

Quick Start

DataLoader

First you need data. You can use CsvDirLoader to read your CSV files.

spectre also has a built-in Yahoo downloader; symbols=None downloads all S&P 500 components.

from spectre.data import YahooDownloader
YahooDownloader.ingest(start_date="2001", save_to="./prices/yahoo", symbols=None, skip_exists=True)

You can now load the data with spectre.data.ArrowLoader('./prices/yahoo/yahoo.feather').

Factor and FactorEngine

from spectre import factors
from spectre.data import ArrowLoader
loader = ArrowLoader('./prices/yahoo/yahoo.feather')
engine = factors.FactorEngine(loader)
engine.to_cuda()
engine.add(factors.SMA(5), 'ma5')
engine.add(factors.OHLCV.close, 'close')
df = engine.run('2019-01-11', '2019-01-15')
df

| | | ma5| close|
|-------------------------|---------|-----------|-----------|
|date |asset| | |
|2019-01-14 00:00:00+00:00| A| 68.842003| 70.379997|
| | AAPL| 151.615997| 152.289993|
| | ABC| 75.835999| 76.559998|
| | ABT| 69.056000| 69.330002|
| | ADBE| 234.537994| 237.550003|
| ...| ...| ...| ...|
|2019-01-15 00:00:00+00:00| XYL| 68.322006| 69.160004|
| | YUM| 91.010002| 90.000000|
| | ZBH| 102.932007| 102.690002|
| | ZION| 43.760002| 44.320000|
| | ZTS| 85.846001| 84.500000|

Factor Analysis

from spectre import factors
import math

risk_free_rate = 0.04 / 252
excess_logret = factors.LogReturns() - math.log(1 + risk_free_rate)
universe = factors.AverageDollarVolume(win=120).top(100)

# Barra MOMENTUM
ema126 = factors.EMA(half_life=126, inputs=[excess_logret])
rstr = ema126.shift(11).sum(252)
MOMENTUM = rstr

# Barra Volatility
ema42 = factors.EMA(half_life=42, inputs=[excess_logret])
dastd = factors.STDDEV(252, inputs=[ema42])
VOLATILITY = dastd

# run engine
from spectre.data import ArrowLoader
loader = ArrowLoader('./prices/yahoo/yahoo.feather')
engine = factors.FactorEngine(loader)

engine.set_filter( universe )
engine.add( MOMENTUM, 'MOMENTUM' )
engine.add( VOLATILITY, 'VOLATILITY' )

engine.to_cuda()
%time factor_data, mean_return = engine.full_run("2013-01-02", "2018-01-19", periods=(1,5,10,))
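To make the arithmetic behind `excess_logret` and `half_life` concrete, here is a small standard-library sketch. It does not use spectre. The `excess_log_returns` and `alpha_from_half_life` helpers are hypothetical, the prices are made up, and the half-life conversion assumes the usual definition (a weight halves every `half_life` bars), which may not match spectre's EMA exactly.

```python
import math

ANNUAL_RISK_FREE = 0.04
TRADING_DAYS = 252

# Same constants as the example above: a daily rate, then its log form.
daily_rf = ANNUAL_RISK_FREE / TRADING_DAYS
daily_rf_log = math.log(1 + daily_rf)


def excess_log_returns(closes):
    """Per-bar log return minus the per-bar log risk-free rate."""
    return [math.log(cur / prev) - daily_rf_log
            for prev, cur in zip(closes, closes[1:])]


def alpha_from_half_life(half_life):
    """Smoothing factor such that a weight decays to one half after `half_life` bars (assumed definition)."""
    return 1.0 - 0.5 ** (1.0 / half_life)


closes = [100.0, 101.0, 100.5, 102.0]  # made-up closing prices
returns = excess_log_returns(closes)
print(returns)

# Log returns add up over time, so the total is easy to check by hand.
total = math.log(closes[-1] / closes[0]) - len(returns) * daily_rf_log
print(abs(sum(returns) - total) < 1e-12)

print(alpha_from_half_life(126), alpha_from_half_life(42))
```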

Diagram

You can also view your factor structure graphically:

factors.BBANDS(win=5).normalized().rank().zscore().show_graph()

The thickness of the line represents the length of the Rolling Window, kind of like "bandwidth".

With engine.to_cuda(enable_stream=True), the branches are computed concurrently, but VRAM usage increases proportionally.

Compatible with alphalens

The return value of full_run is compatible with alphalens:

import alphalens as al
...
factor_data, _ = engine.full_run("2013-01-02", "2018-01-19")
clean_data = factor_data[['{factor_name}', 'Returns']].droplevel(0, axis=1)
al.tears.create_full_tear_sheet(clean_data)

Back-testing

Back-testing uses the FactorEngine's results as data and market events as triggers.

You can find other examples in the ./examples directory.

from spectre import factors, trading
from spectre.data import ArrowLoader
import pandas as pd, math


class MyAlg(trading.CustomAlgorithm):
    def initialize(self):
        # your factors
        risk_free_rate = 0.04 / 252
        excess_logret = factors.LogReturns() - math.log(1 + risk_free_rate)
        universe = factors.AverageDollarVolume(win=120).top(100)

        # Barra MOMENTUM Risk Factor
        ema126 = factors.EMA(half_life=126, inputs=[excess_logret])
        rstr = ema126.shift(11).sum(252)
        MOMENTUM = rstr.zscore(mask=universe)

        # Barra Volatility Risk Factor
        ema42 = factors.EMA(half_life=42, inputs=[excess_logret])
        dastd = factors.STDDEV(252, inputs=[ema42])
        VOLATILITY = dastd.zscore(mask=universe)

        # setup engine
        engine = self.get_factor_engine()
        engine.to_cuda()
        engine.set_filter( universe )
        engine.add( (MOMENTUM + VOLATILITY).to_weight(), 'alpha_weight' )

        # schedule rebalance before market close
        self.schedule_rebalance(trading.event.MarketClose(self.rebalance, offset_ns=-10000))

        # simulation parameters
        self.blotter.capital_base = 1000000
        self.blotter.set_commission(percentage=0, per_share=0.005, minimum=1)
        # self.blotter.set_slippage(percentage=0, per_share=0.4)

    def rebalance(self, data: 'pd.DataFrame', history: 'pd.DataFrame'):
        data = data.fillna(0)
        self.blotter.batch_order_target_percent(data.index, data.alpha_weight)

        # closing asset position that are no longer in our universe.
        removes = self.blotter.portfolio.positions.keys() - set(data.index)
        self.blotter.batch_order_target_percent(removes, [0] * len(removes))

        # record data for debugging / plotting
        self.record(aapl_weight=data.loc['AAPL', 'alpha_weight'],
                    aapl_price=self.blotter.get_price('AAPL'))

    def terminate(self, records: 'pd.DataFrame'):
        # plotting results
        self.plot(benchmark='SPY')

        # plotting the relationship between AAPL price and weight
        ax1 = records.aapl_price.plot()
        ax2 = ax1.twinx()
        records.aapl_weight.plot(ax=ax2, style='g-')

loader = ArrowLoader('./prices/yahoo/yahoo.feather')
%time results = trading.run_backtest(loader, MyAlg, '2014-01-01', '2019-01-01')

It's awful, but you get the idea.
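The `alpha_weight` column in the algorithm above comes from adding two z-scored factors and converting the sum to portfolio weights. The sketch below shows one common convention for that conversion: demean the scores, then scale them so the absolute weights sum to 1. Whether `to_weight()` does exactly this is an assumption, as is the use of the population standard deviation in `zscore`. The helper names and input values are made up for illustration.

```python
import statistics


def zscore(values):
    """Cross-sectional z-score using the population standard deviation (assumed)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def to_weights(scores):
    """Demean, then scale so that the absolute weights sum to 1 (dollar-neutral long/short)."""
    mean = statistics.mean(scores)
    demeaned = [s - mean for s in scores]
    gross = sum(abs(d) for d in demeaned)
    return [d / gross for d in demeaned]


# Made-up factor values for four assets.
momentum = [0.12, -0.03, 0.30, 0.05]
volatility = [0.20, 0.35, 0.15, 0.25]

alpha = [m + v for m, v in zip(zscore(momentum), zscore(volatility))]
weights = to_weights(alpha)
print(weights)
```

Under this convention, positive weights are long positions and negative weights are short positions, and the result is the kind of target-percent vector that `batch_order_target_percent` receives in the example.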

The return value of run_backtest is compatible with pyfolio:

import pyfolio as pf
pf.create_full_tear_sheet(results.returns, positions=results.positions.value, transactions=results.transactions,
                          live_start_date='2017-01-03')

API

Note

Differences from zipline:

For GPU optimization, the CustomFactor.compute function calculates the results for all bars at once. The inputs therefore contain more than just historical data, so take care to avoid look-ahead bias. You can also use engine.test_lookahead_bias to run some tests.
spectre normally uses the float32 data type for GPU performance.
spectre's FactorEngine arranges data by bars, so Return(win=10) means a 10-bar return, which may span more than 10 days if some assets did not trade during the period. You can change this behavior by aligning the data, that is, by filling missing bars with NaNs in your DataLoader; see the align_by_time parameter of CsvDirLoader.
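The look-ahead risk described above can be checked by recomputing a factor with the later bars removed and comparing the overlapping results. The sketch below illustrates that idea in plain Python. It is not how `engine.test_lookahead_bias` is implemented, and all function names here are hypothetical.

```python
def sma_all_bars(closes, win):
    """Trailing SMA for every bar at once; bars without a full window get None."""
    out = []
    for i in range(len(closes)):
        if i + 1 < win:
            out.append(None)
        else:
            window = closes[i + 1 - win:i + 1]
            out.append(sum(window) / win)
    return out


def leaky_centered_mean(closes, win):
    """Deliberately wrong: a centered window reads bars after the current one."""
    half = win // 2
    out = []
    for i in range(len(closes)):
        lo, hi = max(0, i - half), min(len(closes), i + half + 1)
        window = closes[lo:hi]
        out.append(sum(window) / len(window))
    return out


def has_lookahead(factor_fn, closes, win, cut):
    """True if values for bars [0, cut) change when the bars from `cut` onward are removed."""
    with_future = factor_fn(closes, win)[:cut]
    without_future = factor_fn(closes[:cut], win)
    return with_future != without_future


closes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]  # made-up prices
print(has_lookahead(sma_all_bars, closes, win=3, cut=5))         # False
print(has_lookahead(leaky_centered_mean, closes, win=3, cut=5))  # True
```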

Differences from common charting software:

If adjustment data is present, prices are re-adjusted every day, so factors such as MA will differ from stock charting software, which adjusts only relative to the last day. If you want prices adjusted relative to the last day, use something like 'AdjustedColumnDataFactor(OHLCV.close)' as the input data. This is much faster because the adjustment is applied only once, but it introduces look-ahead bias.
Factors that use the close data are delayed by 1 bar.
spectre's EMA uses the same algorithm as zipline and Dataframe.ewm(span=...); when span is greater than 100, the result differs slightly from the common EMA.
spectre's RSI uses the same algorithm as zipline, for consistency in the benchmarks.
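To show how two EMA definitions can disagree, the sketch below contrasts the weighted-average form that pandas documents for `ewm(span=..., adjust=True)` with the recursive form found in many charting tools, both with alpha = 2 / (span + 1). It uses plain Python and made-up data. Attributing spectre's difference from the "common EMA" to a finite window is an assumption, not something the note above states.

```python
def ema_weighted(values, span):
    """Weighted average with weights (1 - alpha)**age over the supplied window."""
    alpha = 2.0 / (span + 1.0)
    num = den = 0.0
    for age, value in enumerate(reversed(values)):
        weight = (1.0 - alpha) ** age
        num += weight * value
        den += weight
    return num / den


def ema_recursive(values, span):
    """Recursive form, seeded with the first value of the window."""
    alpha = 2.0 / (span + 1.0)
    ema = values[0]
    for value in values[1:]:
        ema = alpha * value + (1.0 - alpha) * ema
    return ema


window = [float(i) for i in range(1, 301)]  # 300 bars of a made-up ramp

for span in (5, 50, 200):
    diff = abs(ema_weighted(window, span) - ema_recursive(window, span))
    print(span, diff)
```

With a short span, the oldest bars carry almost no weight and the two forms agree closely. As the span approaches the window length, the treatment of the oldest bars matters more and the gap widens.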

Factors

Built-in Technical Indicator Factors list

Returns(inputs=[OHLCV.close])
LogReturns(inputs=[OHLCV.close])
SimpleMovingAverage = MA = SMA(win=5, inputs=[OHLCV.close])
VWAP(inputs=[OHLCV.close, OHLCV.volume])
ExponentialWeightedMovingAverage = EMA(span=5, inputs=[OHLCV.close])
AverageDollarVolume(win=5, inpu
