Datasets: LLM Data Hub; The Pile; RedPajama Data; SlimPajama (600+GB dedup'd); RefinedWeb (Falcon's); The Stack (StarCoder's)
(Note: Training sets may or may not be legal to use if they include copyrighted works. The copyright aspects of LLMs are still being debated, though. Use what you believe is legal.)
I'm going to list the models with highly permissive licenses first. They can usually be used by businesses.
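Most of the datasets above are on the Hugging Face Hub, so you can inspect a sample by streaming instead of downloading hundreds of gigabytes. A minimal sketch with the `datasets` library; the dataset ID and the "text" field name are my best guesses at the Hub listing and may differ from the official one:

```python
# Sketch: stream a few records from SlimPajama instead of downloading the full ~600GB.
# Assumes the Hugging Face `datasets` library; the dataset ID and "text" field
# are my best guesses at the Hub listing.
from datasets import load_dataset

ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)
for i, record in enumerate(ds):
    print(record["text"][:200])  # print the start of each document
    if i >= 2:
        break
```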
LLaMA-2: LLaMA 2 is here; 70B 4-bit GPTQ (works on 2x24GB VRAM); LLaMA 2 70B GPTQ full context on 2 3090s; LLongMA: LLaMA 2 8k; LLaMA-2 7B Uncensored QLoRA; LLaMA-2-7B-32K; LLongMA-2 16k
(Note: This is the most recent model from Facebook/Meta, with sizes ranging from consumer-grade up to high-end. While quite capable, it seems to have many issues from whatever alignment/morality they built into it. Remove the default system prompt if you use it.)
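If you do use the chat variants, the prompt template matters. Here's a minimal sketch of the LLaMA-2 chat format with the system block dropped entirely, which is one way to remove Meta's default system prompt; the [INST]/<<SYS>> template is the documented chat format, but exact handling depends on your loader:

```python
# Sketch: build a LLaMA-2-chat prompt with no system prompt.
# Omitting the <<SYS>> block drops the default alignment preamble entirely.
def llama2_chat_prompt(user_message: str, system_prompt: str = "") -> str:
    if system_prompt:
        return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"
    return f"[INST] {user_message} [/INST]"  # no system block at all

print(llama2_chat_prompt("List three uses for a 24GB GPU."))
```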
Falcon: project page; HuggingFace article (7B/40B); HuggingFace article (180B)
(Note: Falcon-40B outperformed other open models for a long time. Recently, they released a 180B version.)
(Note: MPT was trained on a lot of data using their own training platform, which you can use too. It's open-sourced for business use, and they built a business around it. That's a great business model that I'd like to see others imitate.)
Warning: The rest of the models on this page may have license restrictions. I'm including them both as examples and for any use you can get out of them.
Instruction-following Models: WizardLM; MPT-Instruct 30B; Databricks Dolly-12B; LLaMA-2-70B-Instruct2
Coding models: StarCoder 15B; WizardCoder 15B; CodeT5 16B
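These all run through the standard `transformers` text-generation pipeline. A minimal sketch using StarCoder; note that bigcode/starcoder is gated behind a license acceptance on the Hub, and a 15B model realistically needs a large GPU (swap in a smaller checkpoint to try the same call on CPU):

```python
# Sketch: code completion with a coding model via the transformers pipeline.
# bigcode/starcoder is ~15B parameters and gated; device_map="auto" needs `accelerate`.
from transformers import pipeline

generator = pipeline("text-generation", model="bigcode/starcoder", device_map="auto")
prompt = "def fibonacci(n):\n    "
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```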
Chatbots: Guanaco description, model links, and LLaMA-2-70B version; MPT-Chat 30B; LLaMA-2-Chat; FastChat-T5-3B
(Note: The Guanaco models are among the highest-performing according to many people who run them both with and without GPUs. A CPU-only sketch follows below.)
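Running them without a GPU usually means grabbing a quantized GGML/GGUF file and using llama.cpp. A minimal sketch with the llama-cpp-python bindings; the model path is a placeholder for whichever quantized Guanaco (or other Llama-family) file you download, and the "### Human/### Assistant" format is Guanaco's usual prompt style:

```python
# Sketch: CPU-only inference on a quantized Llama-family model via llama-cpp-python.
# model_path is a placeholder; point it at a GGML/GGUF file you've downloaded.
from llama_cpp import Llama

llm = Llama(model_path="./guanaco-7b.q4_k_m.gguf", n_ctx=2048)
out = llm("### Human: What is a LoRA adapter?\n### Assistant:", max_tokens=128)
print(out["choices"][0]["text"])
```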
Uncensored Models: Rationale; WizardLM-Uncensored (with examples); list of uncensored models
(Note: The WizardLM-Uncensored link is one of my favorites because you can see the definite political bias in these models. The uncensored model, by contrast, just says yes to everything.)
OpenOrca 13B: Preview; Chat Preview
BLOOM (open w/ 1000+ collaborators): BLOOM-176B-LORA-8-bit (353GB -> 180GB)
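The 353GB -> 180GB drop is roughly what loading the weights in 8-bit buys you. A minimal sketch of the same idea at a size most machines can handle, using transformers with bitsandbytes and a small BLOOM checkpoint standing in for the 176B one (8-bit loading needs a CUDA GPU):

```python
# Sketch: load a BLOOM checkpoint with 8-bit weights to roughly halve memory use.
# Requires bitsandbytes and a CUDA GPU; bloom-560m stands in for the 176B model here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))
```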
GLM family (Chinese): ChatGLM-6B article
T5 Family: t5-large; LaMini-Flan-T5-248M
(Note: Several small models, including sub-1B-parameter ones, were built this way.)
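T5-style models are sequence-to-sequence, so they load with the Seq2Seq classes rather than the causal-LM ones above, and the small ones run fine on CPU. A minimal sketch; the model ID is my best guess at the Hub name for LaMini-Flan-T5-248M:

```python
# Sketch: run a small instruction-tuned T5 variant on CPU.
# The model ID is my best guess at the Hub name for LaMini-Flan-T5-248M.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "MBZUAI/LaMini-Flan-T5-248M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Explain what an embedding is in one sentence.", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```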
Smaller coding models: GGML for Falcoder 7B, SantaCoder 1B, and TinyStarCoder 160M; Replit Code 3B
Tiny models: Baby LLaMA-2; TinyStarCoder (164M, trained for 6 epochs on 100GB of data total); GPT2023 (a 124M GPT-2 model trained on 2.23B tokens)