ML Sprint: Transformers Wiki!

Hi everybody!
As part of our first community hangout, we’re excited to be hosting a few sprints. This is one of them:

The plan with ML Sprints is to run week-long activities where our community will contribute to projects.

This is one of three wikis that we’re inviting you to contribute to! This wiki is meant to serve as a collection of the best resources to learn about Transformer models and their applications.

This is a wiki, which means all of you can edit it, so please do!

Papers:
Attention Is All You Need (2017) (see the attention sketch after this list)
End-to-End Object Detection with Transformers (2020)
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2021)
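
For a quick feel of the core mechanism these papers build on, here is a minimal PyTorch sketch of the scaled dot-product attention introduced in “Attention Is All You Need” (an illustration, not the paper’s reference code; the tensor shapes below are just a toy example):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq_len, d_k) tensors."""
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) so the softmax stays well-behaved.
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ v                   # weighted sum of value vectors

# Toy usage: batch of 2, 4 heads, sequence length 5, head dim 8.
q = k = v = torch.randn(2, 4, 5, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 4, 5, 8])
```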

Blog posts:
The Annotated Transformer
Transformer Deep Dive
The Illustrated Transformer

Explanation videos:
Attention Is All You Need by Yannic Kilcher
GPT-2 by Yannic Kilcher
BERT by Yannic Kilcher
RoBERTa by Yannic Kilcher

Kaggle Notebooks:
Utilizing Transformer Representations Efficiently
On Stability of Few-Sample Transformer Fine-Tuning
Speeding up Transformer w/ Optimization Strategies


If you just want to get the hang of Transformers in one post, it would definitely be this one from Jay Alammar: The Illustrated Transformer


This playlist from Ms. Coffee Bean is informative as well.

The Transformer explained by Ms. Coffee Bean

I specifically like this diagram from Chris McCormick on what you need to know to understand Transformers:

  1. A great lecture by Dr. Rachel Thomas about the fundamental idea behind Transformers. YouTube link
  2. “Attention Is All You Need” paper read-through by Yannic Kilcher. YouTube link

These are some of the paper walk-throughs everyone should go through at least once:

  1. Attention Is All You Need by Yannic Kilcher
  2. GPT-2 by Yannic Kilcher
  3. BERT by Yannic Kilcher
  4. RoBERTa by Yannic Kilcher

These kernels are good for learning from the application point of view:

  1. Different ways to utilize transformer representations: Utilizing Transformer Representations Efficiently (a minimal pooling sketch follows this list)
  2. Stabilizing the training of transformer models: On Stability of Few-Sample Transformer Fine-Tuning
  3. Speeding up transformer training: Speeding up Transformer w/ Optimization Strategies
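
To make item 1 concrete, here is a hedged sketch of a few common ways to pool a transformer’s hidden states into a single sentence representation (it assumes the transformers library and the bert-base-uncased checkpoint; the kernel itself explores more variants):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

inputs = tokenizer("Transformers are fun!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Strategy 1: the [CLS] token of the last hidden layer.
cls_repr = outputs.last_hidden_state[:, 0]

# Strategy 2: mean-pool all tokens of the last layer, ignoring padding.
mask = inputs["attention_mask"].unsqueeze(-1)
mean_repr = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)

# Strategy 3: concatenate the [CLS] token from the last four layers.
last_four_cls = torch.cat([h[:, 0] for h in outputs.hidden_states[-4:]], dim=-1)

print(cls_repr.shape, mean_repr.shape, last_four_cls.shape)
# torch.Size([1, 768]) torch.Size([1, 768]) torch.Size([1, 3072])
```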