A state-of-the-art decoder language model

I pulled together the current best practices and modifications for LLMs and implemented them in a minimalist style. The result is useful as a baseline model to test research ideas against.

See here for project link and full description