MishformerLens
MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing a custom, numerically inaccurate Transformer architecture.
MishformerLens is currently highly experimental and at version 0.0.x.
Status as of 5th October: https://www.diffchecker.com/TaW9IAhJ shows the diff between https://colab.research.google.com/github/neelnanda-io/TransformerLens/blob/main/demos/Exploratory_Analysis_Demo.ipynb and MishformerLens/mishformer_lens/mishformer_lens_expoloratory_analysis_demo.py. The differences are essentially just formatting.
Note that only GPT-2 and Pythia (and GPT-NeoX) are currently supported, and there is no weight processing such as LayerNorm folding (fold LN); loading is roughly equivalent to TransformerLens's from_pretrained_no_processing. Adding fold LN and similar processing should be fairly easy (with a small risk of numerical problems); adding support for every single model family will be a lot harder.
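For readers unfamiliar with "fold LN": the sketch below illustrates the basic identity behind folding a LayerNorm's learned scale and bias into the linear layer that follows it. It is not the TransformerLens or MishformerLens implementation; the function names, shapes, and random weights are all assumptions chosen for illustration, and it uses NumPy rather than PyTorch to stay small.

```python
import numpy as np


def center_and_normalize(x, eps=1e-5):
    """LayerNorm without the learned scale and bias."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def layer_norm(x, gamma, beta, eps=1e-5):
    """Standard LayerNorm: normalize, then apply scale gamma and bias beta."""
    return center_and_normalize(x, eps) * gamma + beta


def fold_layer_norm(gamma, beta, W, b):
    """Fold LN scale/bias into a following linear layer y = x @ W + b.

    (n * gamma + beta) @ W + b == n @ (gamma[:, None] * W) + (beta @ W + b)
    """
    W_folded = gamma[:, None] * W
    b_folded = beta @ W + b
    return W_folded, b_folded


# Hypothetical toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
d_model, d_out = 8, 4
x = rng.normal(size=(3, d_model))
gamma = rng.normal(size=d_model)
beta = rng.normal(size=d_model)
W = rng.normal(size=(d_model, d_out))
b = rng.normal(size=d_out)

# Original computation: full LayerNorm followed by the linear layer.
unfolded = layer_norm(x, gamma, beta) @ W + b

# Folded computation: parameter-free normalization, then the folded layer.
W_folded, b_folded = fold_layer_norm(gamma, beta, W, b)
folded = center_and_normalize(x) @ W_folded + b_folded

max_abs_diff = np.abs(unfolded - folded).max()
print(f"max abs difference: {max_abs_diff:.3e}")
```

The two results agree up to floating-point rounding but are generally not bitwise identical, because the operations are reordered. That reordering is the kind of small numerical risk mentioned above.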
TODO(v0.1): write this up in full
Roadmap
v0.1: make this usable for most TransformerLens models, including everything upstreamed to TL.
Models we want to support: Pythia, Gemma, and possibly Llama.
v1: PyPI, full testing, library ready for development.
Installation notes
TODO(v0.1): clean up
# Essential installs:
#
# Install transformers==4.45.1
# pip install this fork of TransformerLens 2.7.1: https://github.com/ArthurConmy/TransformerLens/tree/mishformer-lens-changes # TODO(v0.1): upstream TransformerLens changes
# Install https://github.com/google-deepmind/mishax at commit hash 617972a2f83f14b3b76288477974d95563fe5e7d
# Install this repo
#
# For various notebook stuff you may need to install:
#
# Install IPython
# Install plotly==5.24.1
# Install nbformat>=4.20.0
N.B. I may upstream some changes so the above could be inaccurate.
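Since the pins above may drift, a quick environment check can help when debugging install problems. The following is a minimal sketch, not part of MishformerLens: the helper name find_mismatches and the PINNED dictionary are hypothetical, and only the two exact version pins from the notes above are checked. The TransformerLens fork and the mishax commit cannot be verified by version string alone, so they are left out.

```python
from importlib import metadata

# Exact pins copied from the installation notes above.
PINNED = {"transformers": "4.45.1", "plotly": "5.24.1"}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

The get_version parameter exists only so the check can be exercised without touching the real environment; by default it uses importlib.metadata.version.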
