MishformerLens

MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing a custom, numerically inaccurate Transformer architecture.
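
For readers unfamiliar with the term, the snippet below is a toy illustration of what "AST patching" means in general: parse existing source, insert a hook call into the syntax tree, and compile the result, leaving the original arithmetic untouched. It is not the mechanism used by MishformerLens or mishax. The names SOURCE, RecordTotal, patch and _record are hypothetical and exist only for this sketch.

```python
import ast

# Stand-in for library source code that we do not want to fork or rewrite.
SOURCE = '''
def scaled_sum(xs):
    total = sum(xs)
    return total * 2
'''


class RecordTotal(ast.NodeTransformer):
    """Insert `_record("total", total)` right after `total` is assigned."""

    def visit_Assign(self, node):
        target = node.targets[0]
        if isinstance(target, ast.Name) and target.id == "total":
            hook = ast.parse('_record("total", total)').body[0]
            # Returning a list splices both statements into the function body.
            return [node, hook]
        return node


def patch(source, record):
    """Compile a patched copy of `source` whose hook calls `record`."""
    tree = RecordTotal().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {"_record": record}
    exec(compile(tree, "<patched>", "exec"), namespace)
    return namespace["scaled_sum"]


captured = {}
patched_scaled_sum = patch(SOURCE, captured.__setitem__)
result = patched_scaled_sum([1, 2, 3])
print(result, captured)
```

The patched function computes the same value as the original; the only change is that the intermediate value is also handed to the hook.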

MishformerLens is currently highly experimental and at version 0.0.x.

Status as of 5th October: https://www.diffchecker.com/TaW9IAhJ shows the difference between the TransformerLens demo at https://colab.research.google.com/github/neelnanda-io/TransformerLens/blob/main/demos/Exploratory_Analysis_Demo.ipynb and MishformerLens/mishformer_lens/mishformer_lens_expoloratory_analysis_demo.py. The differences are essentially just formatting.

Note that only GPT-2 and Pythia (and GPT-NeoX) are currently supported, with no weight preprocessing such as folding LayerNorm (fold LN); this is essentially just the from_pretrained_no_preprocessing path. Adding fold LN and similar options should be fairly easy (with a small risk of numerical problems), whereas adding support for every model family will be much harder.
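
As background on fold LN, here is a minimal pure-Python sketch of the general idea: the learned scale and shift of a LayerNorm are absorbed into the linear layer that follows it, so the two computations agree up to floating-point error. This is an illustration under assumptions, not code from MishformerLens or TransformerLens; the helper names layer_norm, linear and fold_ln are hypothetical.

```python
import math


def layer_norm(x, gamma, beta, eps=1e-5):
    """LayerNorm over a single vector with learned scale and shift."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [g * (v - mean) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gamma, beta)]


def linear(W, bias, x):
    """Compute y = W x + bias, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(W, bias)]


def fold_ln(W, bias, gamma, beta):
    """Absorb gamma and beta into the following linear layer."""
    W_folded = [[w * g for w, g in zip(row, gamma)] for row in W]
    bias_folded = [b + sum(w * bt for w, bt in zip(row, beta))
                   for row, b in zip(W, bias)]
    return W_folded, bias_folded


x = [0.5, -1.0, 2.0]
gamma, beta = [1.5, 0.5, 2.0], [0.1, -0.2, 0.3]
W, bias = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.05, -0.1]

# Unfolded: full LayerNorm, then the linear layer.
unfolded = linear(W, bias, layer_norm(x, gamma, beta))

# Folded: parameter-free normalisation, then the folded linear layer.
W_folded, bias_folded = fold_ln(W, bias, gamma, beta)
plain = layer_norm(x, [1.0] * len(x), [0.0] * len(x))
folded = linear(W_folded, bias_folded, plain)

max_diff = max(abs(a - b) for a, b in zip(unfolded, folded))
print(max_diff)
```

The two outputs are mathematically identical, but the order of floating-point operations differs, which is the kind of small discrepancy the note about numerical problems refers to.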

TODO(v0.1): write this up in full

Roadmap

v0.1: make this usable for most TransformerLens models, including everything upstreamed to TL.

Models we want to support: Pythia, Gemma, and possibly Llama.

v1: PyPI, full testing, library ready for development.

Installation notes

TODO(v0.1): clean up

# Essential installs:
#
# Install transformers==4.45.1
# pip install this fork of TransformerLens 2.7.1: https://github.com/ArthurConmy/TransformerLens/tree/mishformer-lens-changes  # TODO(v0.1): upstream TransformerLens changes
# Install https://github.com/google-deepmind/mishax at commit hash 617972a2f83f14b3b76288477974d95563fe5e7d
# Install this repo
#
# For various notebook stuff you may need to install:
#
# Install IPython
# Install plotly==5.24.1
# Install nbformat>=4.20.0

N.B. I may upstream some changes so the above could be inaccurate.
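
Since the notes above pin exact versions, a quick check of the active environment may be useful. The sketch below uses only the standard library and compares the two exact pins listed above (transformers and plotly); the PINS table and the check_pins helper are hypothetical and are not part of this repo.

```python
from importlib import metadata

# Exact pins copied from the installation notes above.
PINS = {"transformers": "4.45.1", "plotly": "5.24.1"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: found_version_or_None} for every pin that does not match."""
    problems = {}
    for package, wanted in pins.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None  # the package is not installed at all
        if found != wanted:
            problems[package] = found
    return problems


for package, found in check_pins(PINS).items():
    print(f"{package}: expected {PINS[package]}, found {found or 'not installed'}")
```

Nothing is printed when the environment matches the pins. The fork of TransformerLens and the mishax commit are not checked here, because installed package metadata does not reliably record which fork or commit a package came from.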
