Outperformer

Repository containing the implementations related to my blog post series on scaling Transformers.

Check out the blog posts for a detailed explanation of the code and additional information (like reference papers and related codebases).

For now, the codebase is split across 3 files:

  • implementation of fast attention in the fast_attention.py file (the idea is sketched just below this list)
  • implementation of reversible layers in the reversible.py file (also sketched below)
  • implementation of a headless Reformer + Performer model (a BERT-like MLM with the above modifications) in the performer.py file
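
As an illustration of the first point, here is a minimal sketch (not the repository's actual API) of the idea behind fast attention: softmax attention is approximated with positive random features, so the quadratic attention matrix never has to be materialized, in the spirit of the Performer's FAVOR+ mechanism. Function names, shapes and defaults below are assumptions made for the example.

```python
import torch

def positive_random_features(x, projection):
    # Map x to positive random features approximating the softmax kernel:
    # phi(x) = exp(w^T x - ||x||^2 / 2) / sqrt(m). Real implementations also
    # stabilize the exponent and redraw / orthogonalize the projections.
    m = projection.shape[0]
    x = x * x.shape[-1] ** -0.25               # fold in the 1/sqrt(d) attention scaling
    proj = x @ projection.T                    # (batch, seq, m)
    norm = (x ** 2).sum(dim=-1, keepdim=True) / 2
    return torch.exp(proj - norm) / m ** 0.5

def fast_attention(q, k, v, num_features=256):
    # Linear-complexity attention: phi(q) @ (phi(k)^T v), computed without
    # ever forming the (seq x seq) attention matrix.
    projection = torch.randn(num_features, q.shape[-1])      # random Gaussian features
    q_prime = positive_random_features(q, projection)        # (batch, seq, m)
    k_prime = positive_random_features(k, projection)        # (batch, seq, m)
    kv = torch.einsum("bsm,bsd->bmd", k_prime, v)            # (batch, m, dim)
    normalizer = q_prime @ k_prime.sum(dim=1).unsqueeze(-1)  # (batch, seq, 1)
    return (q_prime @ kv) / (normalizer + 1e-6)
```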
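
And for the second point, a minimal sketch of what a reversible layer buys you (again, not the repository's actual code): activations are split into two streams, and the inputs of a block can be recomputed from its outputs during the backward pass instead of being stored, as in RevNets and the Reformer. The class name and the sub-layers `f` and `g` (e.g. attention and feed-forward) are assumptions made for the example.

```python
import torch.nn as nn

class ReversibleBlock(nn.Module):
    # y1 = x1 + F(x2) ; y2 = x2 + G(y1), which can be inverted exactly:
    # x2 = y2 - G(y1) ; x1 = y1 - F(x2).
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f = f
        self.g = g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Recompute the inputs from the outputs, so activations of inner
        # blocks never need to be kept in memory during training.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2
```

Stacking reversible blocks this way keeps activation memory roughly constant in depth, since only the final pair of activations has to be stored.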

If you have any questions (and couldn't find an answer in the posts), feel free to open an issue!

Regarding contributions, bug reports (and fixes) are greatly appreciated - although I hope there won't be any :p I don't know yet in which direction this repository will go, whether it will stay as is or incorporate additional features, so if you have ideas, please open an issue to discuss them! Any new feature should be in the spirit of the existing code: aimed at scaling Transformer MLMs through architectural innovations.

If you end up contributing, please review the guidelines first.

All of this is released under the MIT License, so feel free to use it as you wish :D