Resources on Deep Learning algorithms for Image Processing and Generative tasks

books

Computer Vision: Algorithms and Applications, 2nd ed, Richard Szeliski, 2022

Dive into Deep Learning, Interactive deep learning book with code, math, and discussions, Aston Zhang, Zachary Lipton, Mu Li, Alexander Smola, online version

Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016

Understanding Deep Learning, Simon J. Prince, 2023

(book site URL: https://udlbook.github.io/udlbook/)

Deep Learning for Computer Vision: Image Classification, Object Detection and Face Recognition in Python, Jason Brownlee, 2019

articles

Problems in Image Recognition and Machine Vision

Semantic Segmentation

R-CNN: Rich Feature Hierarchies for Accurate Object Detection, Ross Girshick et al, UC Berkeley, 2014

Highly Accurate Dichotomous Image Segmentation, Xuebin Qin et al, 2022

Anomaly Detection

Exploring EfficientAD: Accurate Visual Anomaly Detection at Millisecond-Level Latencies: A Brief Overview, Vincent Liu, Medium, 2024

Anomalib v1.0.1: Unveiling Anomaly Detection on Plastic Surfaces, Vincent Liu, Medium, 2024

related repo: https://github.com/openvinotoolkit/anomalib

Models and Neural Architectures

CNNs

ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton, NIPS, 2012

Deconstructing Convolutional Neural Networks

Feature Visualization, Chris Olah et al, OpenAI, 2017

Zoom In: An Introduction to Circuits, Chris Olah et al, OpenAI, 2020

as pdf here

An Overview of Early Vision in InceptionV1, Chris Olah et al, OpenAI, 2020

as pdf here

Curve Detectors, Nick Cammarata et al, OpenAI, 2020

as pdf here

Naturally Occurring Equivariance in Neural Networks, Chris Olah et al, OpenAI, 2020

High-Low Frequency Detectors, Ludwig Schubert et al, OpenAI, 2021

Curve Circuits, Nick Cammarata et al, OpenAI, 2021

Visualizing Weights, Chelsea Voss et al, OpenAI, 2021

Branch Specialization, Chelsea Voss et al, OpenAI, 2021

Weight Banding, Michael Petrov et al, OpenAI, 2021

Going deeper with convolutions, Christian Szegedy et al, Google, 2014

Visualizing and Understanding Convolutional Networks, Matthew D. Zeiler et al, Courant Institute, NYU, 2013

Network Dissection: Quantifying Interpretability of Deep Visual Representations, David Bau et al, CSAIL MIT, 2017

Visualizing Higher Level Features of a Deep Network, Dumitru Erhan et al, Universite de Montreal, 2009

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, Karen Simonyan et al, Visual Geometry Group, U. of Oxford, 2014

Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, Anh Nguyen et al, U. of Wyoming, 2015

What Causes Polysemanticity? An Alternative Origin Story of Mixed Selectivity from Incidental Causes, Anonymous

Singular Neural Networks

Why Your Neural Network is Still Singular and What You Can Do About It, Jakub Dworakowski, Pablo Rodriguez Bertorello, Stanford U., 2019

Distilling Singular Learning Theory, Liam Carroll, June 2023

The RLCT Measures the Effective Dimension of Neural Networks, Liam Carroll, June 2023

Why Neural Networks Obey Occam's Razor, Liam Carroll, June 2023

Neural Networks Are Singular, Liam Carroll, June 2023

Phase Transitions in Neural Networks, Liam Carroll, June 2023

Autoencoders

Neural Networks: Unleashing the Power of Latent Space Compression, Julien Pascal, Medium, May 2023

Autoencoders, Dor Bank, 2021

Autoencoders, Unsupervised Learning, and Deep Architectures, Pierre Baldi, 2012

Neural Networks and Principal Component Analysis: Learning from Examples Without Local Minima, Pierre Baldi, Kurt Hornik, 1988

The Sparse Autoencoder, Andrew Ng, Lecture Notes CS294A

Tutorial On Principal Component Analysis, Jonathon Shlens, Google Research, 2014

LSTM, RNNs and Seq2Seq Modeling

Long Short-Term Memory, Sepp Hochreiter et al., 1997

LSTM Can Solve Hard Long Time Lag Problems, Sepp Hochreiter, Juergen Schmidhuber, NIPS, 1996

Recurrent Models of Visual Attention, Volodymyr Mnih et al, 2014

Sequence to Sequence Learning with Neural Networks, Sutskever et al, Google Research, 2014

Generating Sequences with Recurrent Neural Networks, Alex Graves, U. of Toronto, 2014

The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy's blog, May 2015

Understanding LSTM: a tutorial into Long Short-Term Memory, R. Staudemeyer et al., 2019

A Tutorial on Training RNNs covering BPPT, RTRL, EKF, and the "echo state network" approach, Herbert Jaeger, 2002

Understanding LSTM: Colah's Blog, 2015

as pdf here

Transformers

A Mathematical Framework for Transformer Circuits, Nelson Elhage et al, Anthropic, 2021

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, A. Dosovitskiy, 2021

Attention Is All You Need, Vaswani et al, Google Brain, 2017

Do Vision Transformers See Like Convolutional Neural Networks? M. Raghu, Google Brain, 2022

How Do Vision Transformers Work? N. Park et al, 2022

The Annotated Transformer - delving into Vaswani's paper "Attention Is All You Need", 2018

as pdf here

The Illustrated Transformer, Jay Alammar's blog, 2021

as pdf here

The Transformer - Attention Is All You Need - Michal Chromiak's blog, 2017

Transformers for Image Recognition at Scale, Neil Houlsby and Dirk Weissenborn, Dec 2020, blog

as pdf here

An Introduction to Transformers: an NLP Perspective, T. Xiao et al, 2023

Transforming Auto-encoders, G. Hinton, A. Krizhevsky, et al., 2011

Understanding Transformer Reasoning Capabilities via Graph Algorithms, Clayton Sanford et al, 2024

Lecture 2 on Transformers from CMU CS 10-423 (GenAI) given in Jan 2024

Alternative Architectures to Transformers

MLP-Mixer: An all-MLP Architecture for Vision, I. Tolstikhin et al, Google, 2021

Generative models

Introduction to Diffusion Models for Deep Learning, Ryan O'Connor, 2022 (online blog)

What are Diffusion Models? Lilian Weng, OpenAI, 2021 (online blog)

Diffusion Models for Video Generation, Lilian Weng, OpenAI, 2024 (online blog)

Step-By-Step Diffusion: An Elementary Tutorial, P. Nakkiran et al, 2024

Introduction to Flow Matching, Tor Fjelde, Emile Mathieu, Vincent Dutordoir, 2024 (online blog)

Building Diffusion Model's theory from ground up, Ayan Das, ICLR blogposts, 2024

as a pdf: here

Perspectives on Diffusion, Sander Dieleman, 2023

as a pdf: here

Interpreting and Improving Diffusion Models from an Optimization Perspective, Frank Permenter et al, Toyota Research Institute, 2024

Lightweight Diffusion Models: A Survey, W. Song et al, 2024

Flow Matching For Generative Modeling, Y. Lipman et al, Meta AI, 2023

Generative Modeling by Estimating Gradients of the Data Distribution, Yang Song, Stanford, 2021 (online blog)

Deep Unsupervised Learning Using Nonequilibrium Thermodynamics, Jascha Sohl-Dickstein et al, Stanford U., 2015

Tutorial on Diffusion Models for Imaging and Vision, Stanley Chan, 2024

On Error Propagation of Diffusion Models, Y. Li, Mihaela van der Schaar, U of Cambridge, 2024

Differential Diffusion: Giving Each Pixel Its Strength, Eran Levin, Ohad Fried, Tel Aviv University, 2024

Consistency Models, Y. Song et al, 2023

Understanding Diffusion Models: A Unified Perspective, Calvin Luo, Google Brain, 2022

Diffusion Models Beat GANs on Image Synthesis, Prafulla Dhariwal, Alex Nichol, OpenAI, 2021

Generative Models of Images and Neural Networks, William Smith Peebles, PhD Thesis, 2023

Text-to-image Diffusion Models in Generative AI: A Survey, Chenshuang Zhang, Chaoning Zhang, Mengchun Zhang, In So Kweon, 2023

Generative Modeling by Estimating Gradients of the Data Distribution, Y. Song et al, Stanford U., 2020

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models, Alex Nichol et al, 2022

Denoising Diffusion Probabilistic Models, J. Ho et al, UC Berkeley, 2020

Denoising Diffusion Implicit Models, J. Song et al, 2021

Sampling, Diffusion, and Stochastic Localization, Andrea Montanari, 2023

Demystifying Variational Diffusion Models, Fabio De Sousa Ribeiro et al, Imperial College, 2024

Score-Based Generative Modeling Through Stochastic Differential Equations, Y. Song et al, Stanford U., Google, 2021

Improved Techniques for Training Score-Based Generative Models, Y. Song, S. Ermon, 2020

Reverse Time Stochastic Differential Equations for Generative Modeling, Ludwig Winkler, 2021

Reverse Time Diffusion Equation Models, Brian D. O. Anderson, U. of Newcastle, 1980

On The Mathematics of Diffusion Models, David McAllester, 2023

Navier-Stokes, Fluid Dynamics, and Image and Video Inpainting, M. Bertalmio et al, 2001

An Image Inpainting Technique Based on the Fast Marching Method, A. Telea, 2004

Lecture 7 on Diffusion Models from CMU CS 10-423 (GenAI) given in Feb 2024

Denoising Diffusion Model from scratch using PyTorch, Mickael Boillaud, 2024, Medium

as pdf: here

related repo: https://github.com/Camaltra/this-is-not-real-aerial-imagery/tree/main/src/ai

related paper: Denoising Diffusion Probabilistic Models, J. Ho et al, UC Berkeley, 2020

related paper: ACC-UNet: A Completely Convolutional UNet model for the 2020s, Nabil Ibtehaz et al, Purdue U., 2023

related paper: A ConvNet for the 2020s, Z. Liu et al, Meta FAIR, 2022

related paper: U-Net: Convolutional Networks for Biomedical Image Segmentation, Olaf Ronneberger et al, U. of Freiburg, 2015

Will Diffusion Models Be The Next Frontier of Deep Learning, Devansh, Medium, 2024

Variational Inference

Variational Inference: Foundations and Innovations, David Blei, Columbia U., slides, 2018

Variational Inference: A Review for Statisticians, David Blei et al, Columbia U., 2018

Automatic Variational Inference in Stan, Alp Kucukelbir et al, Columbia U., 2018

Automatic Differentiation Variational Inference, Alp Kucukelbir et al, Columbia U., 2017

Lecture 8 Diffusion Modeling + Variational Inference from CMU CS 10-423 (GenAI) given in Feb 2024

Python library containing Variational Inference algorithms such as ADVI: PyMC

Variational Autoencoders

Intuitively Understanding Variational Autoencoders, Irhum Shafkat, Towards Data Science, 2018

as a pdf file here

Diffusion Models as a kind of VAE, Angus Turner, 2021, online article

Understanding Variational Autoencoders, Joseph Rocca, Towards Data Science, Sept 2019

as a pdf file here

Variational AutoEncoders (VAE) with PyTorch, Alexander Van de Kleut (online blog)

VAEs in Reinforcement Learning, Nicholsonjm, Medium, 2024

related Gym environment: https://gymnasium.farama.org/environments/mujoco/swimmer/

related paper: Variational State Encoding as Intrinsic Motivation in Reinforcement Learning, Martin Klissarov et al, McGill, ICLR 2019

Tutorial on Variational Autoencoders, Carl Doersch, Carnegie Mellon, UC Berkeley, 2021

Convolutional Variational Autoencoder with TensorFlow, online TensorFlow page

Introduction to Variational Autoencoders, Diederik P. Kingma, Max Welling, 2019

Auto-Encoding Variational Bayes, Diederik P. Kingma, Max Welling, 2022

The Sparse Autoencoder, Andrew Ng, Lecture Notes CS294A

Lecture 9 on Variational Autoencoders from CMU CS 10-423 (GenAI) given in Feb 2024

The Diffusion Transformer

Scalable Diffusion Models with Transformers, William Peebles, UC Berkeley, 2022

Masked Diffusion Transformer is a Strong Image Synthesizer, S. Gao et al, Sea AI Lab, Nankai U., 2023

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation, S. Mo et al, Huawei, NeurIPS 2023

FiT: Flexible Vision Transformer for Diffusion Model, Z. Lu et al, Feb 2024

DiffiT: Diffusion Vision Transformers for Image Generation, A. Hatamizadeh et al, 2024

Diffusion Transformer Explained: Exploring the architecture that brought transformers into image generation, Mario Larcher, Feb 28, 2024

Diffusion Transformer (DiT) Models: A Beginner’s Guide, Akruti Acharya, March 18, 2024

Reinforcement Learning from Human Feedback (RLHF)

Deep Reinforcement Learning from Human Preferences, Paul Christiano et al, OpenAI, 2017

Training Language Models to Follow Instructions With Human Feedback, L. Ouyang et al, OpenAI, 2022

Fine-Tuning Language Models from Human Preferences, Daniel M. Ziegler et al, OpenAI, 2020

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, Y. Bai et al, Anthropic, 2022

Learning to Summarize from Human Feedback, Nisan Stiennon et al, OpenAI, 2022

Illustrating Reinforcement Learning from Human Feedback (RLHF), Hugging Face article, 2022, Nathan Lambert, Louis Castricato, Leandro von Werra, Alex Havrilla

Learning from human preferences, Dario Amodei, OpenAI blog, 2017

Reinforcement Learning from Human Feedback, Wikipedia

A General Theoretical Paradigm to Understand Learning from Human Preferences, M. Azar et al, Google DeepMind, 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model, Rafael Rafailov et al, Stanford U., 2023

SLiC-HF: Sequence Likelihood Calibration with Human Feedback, Y. Zhao et al, Google DeepMind, 2023

KTO: Model Alignment as Prospect Theoretic Optimization, K. Ethayarajh et al, Stanford U., 2024

ORPO: Monolithic Preference Optimization without Reference Model, Hong, 2024

Lecture 12 on RLHF from CMU CS 10-423 (GenAI) given in Feb 2024

classes, class notes, tutorials and videos

Stanford CS231n

CS231n: Convolutional Neural Networks for Visual Recognition: Stanford CS class

github repo: https://github.com/cs231n/cs231n.github.io

CMU CS 10-423

CMU CS 10-423 GenAI class taught in Jan-Feb 2024

slides for the class: slides

MIT 6.S191

MIT 6.S191: Introduction to Deep Learning, Alexander Amini, years 2020-2023 playlist

2023

MIT 6.S191 (2023): Introduction to Deep Learning

MIT 6.S191 (2023): Recurrent Neural Networks, Transformers, and Attention

MIT 6.S191 (2023): Convolutional Neural Networks

MIT 6.S191 (2023): Deep Generative Modeling

MIT 6.S191 (2023): Robust and Trustworthy Deep Learning

MIT 6.S191 (2023): Reinforcement Learning

MIT 6.S191 (2023): Deep Learning New Frontiers

MIT 6.S191 (2023): Text-to-Image Generation

MIT 6.S191 (2023): The Modern Era of Statistics

MIT 6.S191 (2023): The Future of Robot Learning