Deep Learning Study Group
Official github of papers, code, etc. for our Deep Learning Study Group.
Every week we do a two-hour deep dive into a recent deep learning paper.
We have been meeting weekly for over 10 years and have covered over 500 papers (listed below).
The primary goal is to facilitate technical and mathematical discussion
of each paper, in a supportive environment, to help each other
get the most out of it. This requires that some participants
read the paper beforehand, but anyone is welcome to attend
and listen without having read it. Papers are selected from
attendee suggestions, and we vote on the next week's paper.
Meeting time - Tuesdays, 6:30 pm California time on Zoom.
Zoom and Discord links are on the meetup page:
https://www.meetup.com/handsonprogrammingevents/
======== 2026 ========
Paper for March 24, 2026:
Attention Residuals
https://arxiv.org/pdf/2603.15031
For March 17, 2026:
We will walk through the code for Karpathy's autoresearch:
https://github.com/karpathy/autoresearch/
He discusses his experiments here, and many others have posted about uses for it:
https://x.com/karpathy
A blog post on using it:
https://medium.com/modelmind/getting-started-with-andrej-karpathys-autoresearch-full-guide-c2f3a80b9ce6
There are also many YouTube videos on it.
Karpathy's code only runs on H100s, so I patched it to run on consumer GPUs (tested on a 5090 with 32 GB):
https://github.com/davidmacmillan/autoresearch.git
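As a rough rule of thumb for judging whether a training run can fit on a 32 GB consumer card, you can estimate the memory needed for weights, gradients, and Adam optimizer state. The function below is an illustrative back-of-the-envelope sketch, not code from the autoresearch repo:

```python
def estimate_training_mem_gb(n_params: float,
                             weight_bytes: int = 2,   # bf16 weights
                             grad_bytes: int = 2,     # bf16 gradients
                             optim_bytes: int = 8) -> float:
    """Rough lower bound on training memory in GB:
    weights + gradients + Adam moments (fp32 m and v = 8 bytes/param).
    Activations and framework overhead are extra."""
    total_bytes = n_params * (weight_bytes + grad_bytes + optim_bytes)
    return total_bytes / 1024**3

# A ~1B-parameter model needs roughly 11 GB before activations,
# so it can plausibly fit on a 32 GB card with a modest batch size.
print(round(estimate_training_mem_gb(1e9), 1))
```

Under these assumptions, the remaining headroom goes to activations, which is why patches for consumer GPUs typically shrink the batch size or enable gradient checkpointing.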
Paper for March 10, 2026:
ROMA: Recursive Open Meta-Agent Framework for Long-Horizon Multi-Agent Systems.
http://arxiv.org/abs/2602.01848
Blog:
https://www.sentient.xyz/blog/recursive-open-meta-agent
YouTube:
https://www.youtube.com/watch?v=ghoYOq1bSE4
Paper for March 3, 2026:
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
https://arxiv.org/abs/2602.08222
Paper for February 24, 2026:
Recursive Language Models
https://arxiv.org/pdf/2512.24601
Blog
https://alexzhang13.github.io/blog/2025/rlm/
Github
https://github.com/alexzhang13/rlm
Documentation
https://alexzhang13.github.io/rlm/
Paper for February 17, 2026:
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation
https://arxiv.org/pdf/2601.21420
Paper for February 10, 2026:
Reinforcement Learning via Self-Distillation
https://arxiv.org/pdf/2601.20802
Paper for February 3, 2026:
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
https://arxiv.org/pdf/2601.07372
Paper for January 27, 2026:
mHC: Manifold-Constrained Hyper-Connections
https://arxiv.org/pdf/2512.24880v1
There are multiple YouTube videos, including:
https://www.youtube.com/watch?v=jYn_1PpRzxI
Background material: Hyper-Connections
https://arxiv.org/abs/2409.19606
Paper for January 20, 2026:
Digital Red Queen: Adversarial Program Evolution in Core War with LLMs
https://arxiv.org/pdf/2601.03335
Website:
https://pub.sakana.ai/drq
Code:
https://github.com/SakanaAI/drq
There are many YouTubes on this work.
Paper for January 13, 2026:
Hessian structure of neural networks
https://arxiv.org/abs/2505.02809
Blog: Loss functions and optimizers – Adam and Muon and the Hessian of the loss function
https://securemachinery.com/2025/12/18/loss-functions-and-optimizers/
Paper (a blog) for January 6, 2026:
When Models Manipulate Manifolds: The Geometry of a Counting Task
https://transformer-circuits.pub/2025/linebreaks/index.html
======== 2025 ========
Paper for December 30, 2025:
NVIDIA-Nemotron-3-White-Paper.pdf
https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-White-Paper.pdf
For additional background, if interested:
https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-Nano-Technical-Report.pdf
Paper for December 23, 2025:
The Path Not Taken: RLVR Provably Learns Off the Principals
https://arxiv.org/pdf/2511.08567
YouTube:
https://www.youtube.com/watch?v=iYpQJK5KLlw
Additional material
https://github.com/davidmacmillan/DeepLearningStudyGroup/blob/master/2025-12-23%20Supervised%20fine-tuning%20vs.%20reinforcement%20learning%20with%20verified%20rewards%20_%20Claude.pdf
Paper for December 16, 2025:
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
https://arxiv.org/abs/2503.14858
Additional background - Project site:
https://wang-kevin3290.github.io/scaling-crl/
Code:
https://github.com/wang-kevin3290/scaling-crl
Helpful CRL background info by one of the authors:
"Contrastive Learning as Goal-Conditioned Reinforcement Learning"
https://arxiv.org/pdf/2206.07568
Paper for December 9, 2025
PaTH Attention: Position Encoding via Accumulating Householder Transformations
https://arxiv.org/pdf/2505.16381
December 2, 2025
No meeting December 2 due to NeurIPS
Paper for November 25, 2025:
Nested Learning: The Illusion of Deep Learning Architectures
https://abehrouz.github.io/files/NL.pdf
Blog on Nested Learning paper
https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/
Paper for November 18, 2025:
DeepSeek-OCR: Contexts Optical Compression
https://arxiv.org/pdf/2510.18234
Paper for November 11, 2025:
Kimi linear attention
https://arxiv.org/pdf/2510.26692
Slides: https://github.com/davidmacmillan/DeepLearningStudyGroup/blob/master/2025-11-11%20Kimi%20Linear%20%26%20Kimi%20Delta%20Attention.pdf
Paper for November 4, 2025:
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
https://arxiv.org/pdf/2510.05592
Paper for October 28, 2025:
Attention Sinks and Compression Valleys in LLMs are Two Sides of the Same Coin.
http://arxiv.org/abs/2510.06477
Paper for October 21, 2025:
Less is More: Recursive Reasoning with Tiny Networks
https://arxiv.org/pdf/2510.04871
Paper for October 14, 2025:
Bootstrapping Task Spaces for Self-Improvement
https://arxiv.org/pdf/2509.04575
Paper for October 7, 2025:
Small Language Models are the Future of Agentic AI
https://arxiv.org/abs/2506.02153
There are many YouTube videos on this paper, including one by an author:
https://www.youtube.com/watch?v=9xgRTznP21E
Sept. 30, 2025
No paper this week. Instead we did an in-person social event (dinner) on Tuesday Sept. 30 at 6:30 PM in Mountain View, CA.
Paper for Sept. 23, 2025:
Real-Time Detection of Hallucinated Entities in Long-Form Generation
https://arxiv.org/pdf/2509.03531
Paper for September 16, 2025:
Why Language Models Hallucinate
https://www.arxiv.org/abs/2509.04664
Paper for Sept. 9, 2025:
DataRater: Meta-Learned Dataset Curation
https://arxiv.org/pdf/2505.17895
Paper for Sept. 2, 2025:
A Survey on Diffusion Language Models
https://arxiv.org/pdf/2508.10875
Paper for August 26, 2025:
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
https://arxiv.org/pdf/2507.19457
Paper for August 19, 2025
Hierarchical Reasoning Models
https://arxiv.org/abs/2506.21734
There are multiple YouTube videos, including one by Gabriel Mongaras:
https://www.youtube.com/watch?v=TUsbk8vPDoM
Github:
https://github.com/sapientinc/HRM
Paper for August 12, 2025:
Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
https://arxiv.org/abs/2507.14805
Paper for August 5, 2025:
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought
https://arxiv.org/pdf/2505.12514
Paper for July 29, 2025:
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
https://arxiv.org/pdf/2507.10524
Paper for July 22, 2025:
Kimi k1.5: Scaling Reinforcement Learning with LLMs
https://arxiv.org/pdf/2501.12599
There are also multiple YouTubes.
Additional Kimi info, if interested:
Kimi-VL Technical Report
https://arxiv.org/pdf/2504.07491
Paper for July 15, 2025:
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal
https://arxiv.org/abs/2503.14269
Two blogs and a paper for July 8, 2025:
Blog #1 - Gemma 3n model overview
https://ai.google.dev/gemma/docs/gemma-3n
Blog #2 - Introducing Gemma 3n: The developer guide
https://developers.googleblog.com/en/introducing-gemma-3n-developer-guide/
MatFormer: Nested Transformer for Elastic Inference
https://arxiv.org/pdf/2310.07707
There are multiple YouTubes on Gemma 3n and MatFormer.
Paper for July 1, 2025:
MELODI: Exploring Memory Compression for Long Contexts (DeepMind, Oct. 2024)
https://arxiv.org/abs/2410.03156
Open Review:
https://openreview.net/forum?id=TvGPP8i18S
Paper for June 24, 2025:
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
https://arxiv.org/pdf/2410.20672
OpenReview:
https://openreview.net/forum?id=WwpYSOkkCt
Paper for June 17, 2025:
Concise Reasoning via Reinforcement Learning
https://arxiv.org/pdf/2504.05185
For June 10, 2025:
Good news - no homework this week!!!
At the meeting, one of our members, Ted, will present MultiDecode,
original work he has done on speeding inference, including for RAG.
Papers for June 3, 2025: