hdDeepLearningStudy

Papers, code, etc. for the deep learning study group
See group discord - https://discord.gg/HuWVmMgmqS
zoom link - On the meetup page
meeting time - 6:30 pm California time

Tuesday, November 21, 2023

paper: MemGPT: Towards LLMs as Operating Systems https://arxiv.org/pdf/2310.08560.pdf
blog: MemGPT - https://memgpt.ai/
youtube: https://www.youtube.com/watch?v=nQmZmFERmrg

Tuesday, November 14, 2023

paper: https://openreview.net/pdf?id=S1KGaTSOTS - ClusterFormer: Clustering As A Universal Visual Learner

Tuesday, November 7, 2023

paper: https://arxiv.org/pdf/2310.12962.pdf - An Emulator for Fine-Tuning Large Language Models using Small Language Models

Tuesday, October 31, 2023

paper: https://www.nature.com/articles/s42256-023-00711-8 - From attribution maps to human-understandable explanations through Concept Relevance Propagation

Tuesday, October 24, 2023

paper: https://arxiv.org/pdf/2209.12951.pdf - Liquid Structural State-Space Models

Tuesday, October 17, 2023

paper: Liquid Time-Constant Networks https://arxiv.org/abs/2006.04439
youtube: https://www.youtube.com/watch?v=IlliqYiRhMU
shorter video: https://www.youtube.com/watch?v=RI35E5ewBuI

Tuesday, October 10, 2023

paper: 3D Gaussian Splatting for Real-Time Radiance Field Rendering https://arxiv.org/abs/2308.04079
youtube: Superb 2-minute video on the paper https://www.youtube.com/watch?v=HVv_IQKlafQ
youtube: SIGGRAPH 2023 talk on the paper (5 minutes) https://www.youtube.com/watch?v=T_kXY43VZnk&t=3s
Authors' project page, including links to code: https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

Tuesday, October 3, 2023

paper: https://arxiv.org/abs/2112.04035 - Relating transformers to models and neural representations of the hippocampal formation
another paper: https://amygdala.psychdept.arizona.edu/labspace/JclubLabMeetings/JeanMarc-Build-cognitive-maps.pdf - How to build a cognitive map
youtube: https://www.youtube.com/watch?v=9qOaII_PzGY&t=413s - How Your Brain Organizes Information
youtube: https://www.youtube.com/watch?v=cufOEzoVMVA - Can We Build an Artificial Hippocampus?
paper: https://www.cell.com/cell/fulltext/S0092-8674(20)31388-X - The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation

Tuesday, September 26, 2023

paper: https://research.nvidia.com/labs/par/Perfusion/ - Perfusion: Key-Locked Rank One Editing for Text-to-Image Personalization

Tuesday, September 19, 2023

paper: https://arxiv.org/pdf/2210.09276.pdf - Imagic: Text-Based Real Image Editing with Diffusion Models
youtube: https://www.youtube.com/watch?v=PzHMjCtuPuo
blog: https://imagic-editing.github.io/

Tuesday, Sept 12, 2023

paper: https://arxiv.org/abs/2307.02486 - LongNet: Scaling Transformers to 1,000,000,000 Tokens
Blog: https://syncedreview.com/2023/07/10/microsofts-longnet-scales-transformer-to-one-billion-tokens

Tuesday, Sept 5, 2023

paper: https://arxiv.org/pdf/2308.08708.pdf - Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

Tuesday, August 29, 2023

paper: https://arxiv.org/pdf/2307.15936.pdf - A Theory for Emergence of Complex Skills in Language Models
youtube: https://www.youtube.com/watch?v=0D23NeBjCeQ

Tuesday, August 22, 2023

Paper: https://arxiv.org/pdf/2206.04843.pdf -- Neural Laplace: Learning diverse classes of differential equations in the Laplace domain
Slides and video from ICML 2022: https://icml.cc/virtual/2022/oral/16728

Wednesday, August 16, 2023

paper: https://arxiv.org/abs/2308.03296 - Studying Large Language Model Generalization with Influence Functions
blog: https://www.anthropic.com/index/influence-functions

Wednesday, August 9, 2023

paper: Simple and Controllable Music Generation (MusicGen) https://arxiv.org/pdf/2306.05284.pdf
blog: https://about.fb.com/news/2023/08/audiocraft-generative-ai-for-music-and-audio/
blog: https://ai.meta.com/blog/audiocraft-musicgen-audiogen-encodec-generative-ai-audio/

Wednesday, August 2, 2023

paper: https://arxiv.org/abs/2205.10343 - Towards Understanding Grokking: An Effective Theory of Representation Learning
blog: https://ericjmichaud.com/grokking-squared/
blog: https://www.beren.io/2022-01-11-Grokking-Grokking/
blog: https://www.beren.io/2022-04-17-Understanding_Overparametrized_Generalization/

Wednesday, July 26, 2023

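paper: Mixture of experts (reportedly similar to GPT-4): https://arxiv.org/abs/2305.14705 - Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models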

blog: Mixture-of-Experts with Expert Choice Routing -
https://ai.googleblog.com/2022/11/mixture-of-experts-with-expert-choice.html

blog: Introducing Pathways: A next-generation AI architecture
https://blog.google/technology/ai/introducing-pathways-next-generation-ai-architecture/

Wednesday, July 19, 2023

We're going to cover Chapter 16 Deep Networks for Classification from the following book:
https://book-wright-ma.github.io/Book-WM-20210422.pdf - High-Dimensional Data Analysis with Low-Dimensional Models
blog: https://terrytao.wordpress.com/2007/04/13/compressed-sensing-and-single-pixel-cameras/#more-25

Wednesday, July 12, 2023

We're going to cover the 4th chapter of this book.
https://book-wright-ma.github.io/Book-WM-20210422.pdf - High-Dimensional Data Analysis with Low-Dimensional Models

Wednesday, July 5, 2023

We're going to cover the 1st chapter of this book.
https://book-wright-ma.github.io/Book-WM-20210422.pdf - High-Dimensional Data Analysis with Low-Dimensional Models
Blog: https://terrytao.wordpress.com/2007/04/13/compressed-sensing-and-single-pixel-cameras/#more-25

Wednesday, June 28, 2023

paper: https://arxiv.org/pdf/2305.17126.pdf - Large Language Models as Tool Makers
youtube: https://www.youtube.com/watch?v=qWI1AJ2nSDY
youtube: https://www.youtube.com/watch?v=KXlPzMRTfMk
youtube: https://www.youtube.com/watch?v=srDVNbxPgZI

Wednesday, June 21, 2023

Consciousness as a Memory System https://pubmed.ncbi.nlm.nih.gov/36178498/

Wednesday, June 14, 2023

paper: https://arxiv.org/abs/1804.08838 - Measuring the Intrinsic Dimension of Objective Landscapes
Blog: https://www.uber.com/blog/intrinsic-dimension/
more good stuff on intrinsic dimension:
Nature paper: https://www.nature.com/articles/s41598-017-11873-y
Wikipedia: https://en.wikipedia.org/wiki/Intrinsic_dimension
Application - Yann LeCun at 57:15 on whether text fully represents a world model:
https://www.youtube.com/watch?v=SGzMElJ11Cc
vs. differing view from Ilya Sutskever at 15:30
https://www.youtube.com/watch?v=SjhIlw3Iffs
Applying intrinsic dimension to scaling laws in training / loss:
https://jmlr.csail.mit.edu/papers/volume23/20-1111/20-1111.pdf
https://arxiv.org/abs/2102.06701

Wednesday, June 7, 2023

Paper: https://arxiv.org/pdf/2305.16291.pdf - Voyager: An Open-Ended Embodied Agent with Large Language Models
Twitter: nice overview by an author https://twitter.com/DrJimFan/status/1662117784023883777
Code: https://github.com/MineDojo/Voyager
website: https://voyager.minedojo.org/

Wednesday, May 31, 2023

paper: https://arxiv.org/pdf/2203.15556.pdf - Training Compute-Optimal Large Language Models
blog: https://www.lesswrong.com/posts/6Fpvch8RR29qLEWNH/chinchilla-s-wild-implications
blog: https://www.harmdevries.com/post/model-size-vs-compute-overhead/
news (CNBC, on Google's PaLM 2): https://www.cnbc.com/2023/05/16/googles-palm-2-uses-nearly-five-times-more-text-data-than-predecessor.html

Wednesday, May 24, 2023

paper: https://arxiv.org/abs/2212.09720 - The case for 4-bit precision: k-bit Inference Scaling Laws
paper: https://arxiv.org/pdf/2210.17323.pdf - GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Wednesday, May 17, 2023

paper: https://arxiv.org/pdf/2106.09685.pdf - LoRA: Low-Rank Adaptation of Large Language Models

Wednesday, May 10, 2023

paper: https://arxiv.org/pdf/2210.03629.pdf - ReAct: Synergizing Reasoning and Acting in Language Models
blog: https://www.pinecone.io/learn/locality-sensitive-hashing/

Wednesday, May 3, 2023

paper: https://arxiv.org/pdf/2201.11903.pdf - Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
paper: https://arxiv.org/pdf/2210.03629.pdf - ReAct: Synergizing Reasoning and Acting in Language Models
blog: https://www.pinecone.io/learn/locality-sensitive-hashing/

Wednesday, Apr 26, 2023

docs: https://python.langchain.com/en/latest/modules/agents.html
paper: https://arxiv.org/pdf/2210.03629.pdf - ReAct: Synergizing Reasoning and Acting in Language Models
blog: https://www.pinecone.io/learn/locality-sensitive-hashing/

Wednesday, Apr 19, 2023

Blog: https://yoheinakajima.com/task-driven-autonomous-agent-utilizing-gpt-4-pinecone-and-langchain-for-diverse-applications/
Code: https://github.com/hwchase17/langchain

Wednesday, Apr 12, 2023

Paper: Eliciting Latent Predictions from Transformers with the Tuned Lens https://arxiv.org/abs/2303.08112

Wednesday, Apr 5, 2023

Paper: https://openreview.net/pdf?id=lMMaNf6oxKM - Recipe for a General, Powerful, Scalable Graph Transformer
youtube: https://www.youtube.com/watch?v=DiLSCReBaTg

Wednesday, Mar 29, 2023

Paper: https://proceedings.neurips.cc/paper/2021/hash/f1c1592588411002af340cbaedd6fc33-Abstract.html - Do Transformers Really Perform Badly for Graph Representation?
video: https://www.youtube.com/watch?v=FKuQpPIRjLk - review by authors
video: https://www.youtube.com/watch?v=xQ5ltOOxoFg

Wednesday, Mar 22, 2023

Paper: https://arxiv.org/abs/2212.07359 - Post-hoc Uncertainty Learning using a Dirichlet Meta-Model
youtube: https://www.youtube.com/watch?v=nE8XJ1f0zO0

Wednesday, Mar 15, 2023

Paper: https://arxiv.org/abs/2202.05262 - Locating and Editing Factual Associations in GPT
blog: https://rome.baulab.info/
Yannic video: https://www.youtube.com/watch?v=_NMQyOu2HTo
