Deep Learning Study Group
Official github of papers, code, etc. for our Deep Learning Study Group.
Every week we do a two-hour deep dive into a recent deep learning paper.
We have been meeting weekly for over 10 years and have covered over 500 papers (listed below).
The primary goal is to facilitate technical and mathematical discussion
of each paper, in a supportive environment, to help each other
get the most out of it. This requires that some participants
read the paper beforehand, but anyone is welcome to attend
and listen without having read it. Papers are selected from
attendee suggestions, and we vote on the next week's paper.
Meeting time - Tuesdays, 6:30 pm California time on Zoom.
Zoom and Discord links are on the meetup page:
https://www.meetup.com/handsonprogrammingevents/
======== 2026 ========
Paper for March 24, 2026:
Attention Residuals
https://arxiv.org/pdf/2603.15031
For March 17, 2026:
We will walk through the code for Karpathy's autoresearch:
https://github.com/karpathy/autoresearch/
He discusses his experiments here, and many others have posted about uses for it:
https://x.com/karpathy
A blog post on using it:
https://medium.com/modelmind/getting-started-with-andrej-karpathys-autoresearch-full-guide-c2f3a80b9ce6
There are also many YouTube videos on it.
Karpathy's code only runs on H100s, so I patched it to run on consumer GPUs (tested on a 5090 with 32 GB):
https://github.com/davidmacmillan/autoresearch.git
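As a rough rule of thumb for judging whether a training run can fit on a 32 GB consumer card, you can estimate the memory needed for weights, gradients, and Adam optimizer state. The function below is an illustrative back-of-the-envelope sketch, not code from the autoresearch repo:

```python
def estimate_training_mem_gb(n_params: float,
                             weight_bytes: int = 2,   # bf16 weights
                             grad_bytes: int = 2,     # bf16 gradients
                             optim_bytes: int = 8) -> float:
    """Rough lower bound on training memory in GB:
    weights + gradients + Adam moments (fp32 m and v = 8 bytes/param).
    Activations and framework overhead are extra."""
    total_bytes = n_params * (weight_bytes + grad_bytes + optim_bytes)
    return total_bytes / 1024**3

# A ~1B-parameter model needs roughly 11 GB before activations,
# so it can plausibly fit on a 32 GB card with a modest batch size.
print(round(estimate_training_mem_gb(1e9), 1))
```

Under these assumptions, the remaining headroom goes to activations, which is why patches for consumer GPUs typically shrink the batch size or enable gradient checkpointing.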
Paper for March 10, 2026:
ROMA: Recursive Open Meta-Agent Framework for Long-Horizon Multi-Agent Systems.
http://arxiv.org/abs/2602.01848
Blog:
https://www.sentient.xyz/blog/recursive-open-meta-agent
YouTube:
https://www.youtube.com/watch?v=ghoYOq1bSE4
Paper for March 3, 2026:
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
https://arxiv.org/abs/2602.08222
Paper for February 24, 2026:
Recursive Language Models
https://arxiv.org/pdf/2512.24601
Blog
https://alexzhang13.github.io/blog/2025/rlm/
Github
https://github.com/alexzhang13/rlm
Documentation
https://alexzhang13.github.io/rlm/
Paper for February 17, 2026:
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation
https://arxiv.org/pdf/2601.21420
Paper for February 10, 2026:
Reinforcement Learning via Self-Distillation
https://arxiv.org/pdf/2601.20802
Paper for February 3, 2026:
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
https://arxiv.org/pdf/2601.07372
Paper for January 27, 2026:
mHC: Manifold-Constrained Hyper-Connections
https://arxiv.org/pdf/2512.24880v1
There are multiple YouTube videos, including:
https://www.youtube.com/watch?v=jYn_1PpRzxI
Background material: Hyper-Connections
https://arxiv.org/abs/2409.19606
Paper for January 20, 2026:
Digital Red Queen: Adversarial Program Evolution in Core War with LLMs
https://arxiv.org/pdf/2601.03335
Website:
https://pub.sakana.ai/drq
Code:
https://github.com/SakanaAI/drq
There are many YouTubes on this work.
Paper for January 13, 2026:
Hessian structure of neural networks
https://arxiv.org/abs/2505.02809
Blog: Loss functions and optimizers – Adam and Muon and the Hessian of the loss function
https://securemachinery.com/2025/12/18/loss-functions-and-optimizers/
Paper (a blog) for January 6, 2026:
When Models Manipulate Manifolds: The Geometry of a Counting Task
https://transformer-circuits.pub/2025/linebreaks/index.html
======== 2025 ========
Paper for December 30, 2025:
NVIDIA-Nemotron-3-White-Paper.pdf
https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-White-Paper.pdf
For additional background, if interested:
https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-Nano-Technical-Report.pdf
Paper for December 23, 2025:
The Path Not Taken: RLVR Provably Learns Off the Principals
https://arxiv.org/pdf/2511.08567
YouTube:
https://www.youtube.com/watch?v=iYpQJK5KLlw
Additional material
https://github.com/davidmacmillan/DeepLearningStudyGroup/blob/master/2025-12-23%20Supervised%20fine-tuning%20vs.%20reinforcement%20learning%20with%20verified%20rewards%20_%20Claude.pdf
Paper for December 16, 2025:
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
https://arxiv.org/abs/2503.14858
Additional background - Project site:
https://wang-kevin3290.github.io/scaling-crl/
Code:
https://github.com/wang-kevin3290/scaling-crl
Helpful CRL background info by one of the authors:
"Contrastive Learning as Goal-Conditioned Reinforcement Learning"
https://arxiv.org/pdf/2206.07568
Paper for December 9, 2025
PaTH Attention: Position Encoding via Accumulating Householder Transformations
https://arxiv.org/pdf/2505.16381
December 2, 2025
No meeting December 2 due to NeurIPS
Paper for November 25, 2025:
Nested Learning: The Illusion of Deep Learning Architectures
https://abehrouz.github.io/files/NL.pdf
Blog on Nested Learning paper
https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/
Paper for November 18, 2025:
DeepSeek-OCR: Contexts Optical Compression
https://arxiv.org/pdf/2510.18234
Paper for November 11, 2025:
Kimi linear attention
https://arxiv.org/pdf/2510.26692
Slides: https://github.com/davidmacmillan/DeepLearningStudyGroup/blob/master/2025-11-11%20Kimi%20Linear%20%26%20Kimi%20Delta%20Attention.pdf
Paper for November 4, 2025:
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
https://arxiv.org/pdf/2510.05592
Paper for October 28, 2025:
Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin.
http://arxiv.org/abs/2510.06477
Paper for October 21, 2025:
Less is More: Recursive Reasoning with Tiny Networks
https://arxiv.org/pdf/2510.04871
Paper for October 14, 2025:
Bootstrapping Task Spaces for Self-Improvement
https://arxiv.org/pdf/2509.04575
Paper for October 7, 2025:
Small Language Models are the Future of Agentic AI
https://arxiv.org/abs/2506.02153
There are many YouTube videos on this paper, including one by an author:
https://www.youtube.com/watch?v=9xgRTznP21E
Sept. 30, 2025
No paper this week. Instead we did an in-person social event (dinner) on Tuesday Sept. 30 at 6:30 PM in Mountain View, CA.
Paper for Sept. 23, 2025:
Real-Time Detection of Hallucinated Entities in Long-Form Generation
https://arxiv.org/pdf/2509.03531
Paper for September 16, 2025:
Why Language Models Hallucinate
https://www.arxiv.org/abs/2509.04664
Paper for Sept. 9, 2025:
DataRater: Meta-Learned Dataset Curation
https://arxiv.org/pdf/2505.17895
Paper for Sept. 2, 2025:
A Survey on Diffusion Language Models
https://arxiv.org/pdf/2508.10875
Paper for August 26, 2025:
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
https://arxiv.org/pdf/2507.19457
Paper for August 19, 2025
Hierarchical Reasoning Models
https://arxiv.org/abs/2506.21734
There are multiple YouTube videos, including one by Gabriel Mongaras:
https://www.youtube.com/watch?v=TUsbk8vPDoM
Github:
https://github.com/sapientinc/HRM
Paper for August 12, 2025:
Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
https://arxiv.org/abs/2507.14805
Paper for August 5, 2025:
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought
https://arxiv.org/pdf/2505.12514
Paper for July 29, 2025:
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
https://arxiv.org/pdf/2507.10524
Paper for July 22, 2025:
Kimi k1.5: Scaling Reinforcement Learning with LLMs
https://arxiv.org/pdf/2501.12599
There are also multiple YouTubes.
Additional Kimi info, if interested:
Kimi-VL Technical Report
https://arxiv.org/pdf/2504.07491
Paper for July 15, 2025:
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal
https://arxiv.org/abs/2503.14269
Two blogs and a paper for July 8, 2025:
Blog #1 - Gemma 3n model overview
https://ai.google.dev/gemma/docs/gemma-3n
Blog #2 - Introducing Gemma 3n: The developer guide
https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide/
MatFormer: Nested Transformer for Elastic Inference
https://arxiv.org/pdf/2310.07707
There are multiple YouTubes on Gemma 3n and MatFormer.
Paper for July 1, 2025:
MELODI: Exploring Memory Compression for Long Contexts (DeepMind, Oct. 2024)
https://arxiv.org/abs/2410.03156
Open Review:
https://openreview.net/forum?id=TvGPP8i18S
Paper for June 24, 2025:
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
https://arxiv.org/pdf/2410.20672
OpenReview:
https://openreview.net/forum?id=WwpYSOkkCt
Paper for June 17, 2025:
Concise Reasoning via Reinforcement Learning
https://arxiv.org/pdf/2504.05185
For June 10, 2025:
Good news - no homework this week!!!
At the meeting, one of our members, Ted, will present MultiDecode,
original work he has done on speeding inference, including for RAG.
Papers for June 3, 2025: