Quick introductions: Purpose and Applications of LLM's; Large Language Models: The Complete Guide in 2023
(Note: The Complete Guide has many examples, esp in the linked paper.)
Basic terms: prompting, tokens, context window/length, temperature, alignment, parameters, security (e.g., the DAN attack), and more
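A few of these terms are easiest to see in code. Below is a minimal sketch (my own, not from the linked introductions) using the Hugging Face transformers library with GPT-2 as a small stand-in model; it shows how a prompt becomes tokens, where the context window lives, and what temperature controls.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # small stand-in model; any causal LM works the same way
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "Large language models are"
    inputs = tokenizer(prompt, return_tensors="pt")        # the prompt becomes token IDs
    print(inputs["input_ids"].shape[1], "prompt tokens")
    print("context window:", model.config.n_positions)     # max tokens the model can attend to (1024 for GPT-2)

    # temperature rescales the logits before sampling: lower = more predictable,
    # higher = more varied; do_sample=True turns sampling on at all.
    output = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))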
Prompt Engineering: Microsoft's guide with many techniques; LangChain; Wikipedia page
Prompts and Models: FlowGPT for prompts; HuggingFace for models
Best software to run them: GPT4All; oobabooga; kobold; MLC LLM; llama.cpp; exllama
How Transformers Work (Towards Data Science)
(Note: Towards Data Science is a great resource for everything from Python to data analysis.)
Different ways of training LLM's, and why prompting is none of them (Towards Data Science)
All You Need to Know to Build Your First LLM App (2023) (Towards Data Science)
Courses: HuggingFace NLP course (free); Practical Deep Learning for Coders (free); DeepLearning.ai's courses (some by Andrew Ng)
If running them locally, the 3090 and 4090 GPU's are the most commonly recommended; many also suggest two 3090's over a single 4090.
If fine-tuning a lot or training, almost everyone recommends just
renting a cloud instance with A100's or H100's.
Best GPU's for Deep Learning: An In-Depth Analysis
H100 Supply and Demand (Hacker News)
Hardware for Deep Learning Series
Faster Training and Inference: Habana Gaudi-2 vs NVIDIA A100 80GB
Below are links I've found while digging around on this topic. They
haven't been evaluated or organized. I'm just saving you search time.
Train Your Own Private ChatGPT Model for the Cost of a Starbucks Coffee
How to train your own
Large Language Models
Open-source Text Generation and LLM Ecosystem (HuggingFace)
Training NanoGPT (126M) with Deep Lake (ActiveLoop.ai)
High-performance LLM Training at 1000 GPU Scale with Alpa and Ray (anyscale)
Numbers every LLM developer should know
How to create a custom LLM (NVIDIA) (prompting, too)
LLaMA 2 Scaling Laws and Training Costs (Reddit)
Build Your Own Large Language Model Like Dolly (Databricks) (on-demand content)
Teach your LLM to answer with facts, not fiction (myscale.com)
The Novice's LLM Training Guide
Unleashing the Power of Fine-tuning
Getting open LLM's to return a single value or JSON
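Related to the link above, here is a rough, generic sketch of one common pattern (mine, not necessarily the post's approach): ask the model for JSON only, then parse and retry. The generate_text function is a hypothetical placeholder for whatever local backend you run.

    import json

    def generate_text(prompt: str) -> str:
        # Hypothetical placeholder: wire this to your local backend
        # (llama.cpp, oobabooga's API, transformers, etc.).
        raise NotImplementedError

    def ask_for_json(question: str, retries: int = 3) -> dict:
        prompt = (
            "Answer the question below. Respond with ONLY a JSON object of the "
            'form {"answer": <number>} and nothing else.\n\n'
            f"Question: {question}\nJSON:"
        )
        for _ in range(retries):
            raw = generate_text(prompt)
            # Models often wrap JSON in extra prose; keep just the outermost {...}.
            start, end = raw.find("{"), raw.rfind("}")
            if start != -1 and end > start:
                try:
                    return json.loads(raw[start:end + 1])
                except json.JSONDecodeError:
                    pass
        raise ValueError("model never returned valid JSON")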
Guide to Fine-Tuning LLM Models on Custom Data
I trained a 65B Model on My Texts
Effortless Fine-Tuning of Large Language Models with Open-Source H2O LLM Studio
IA3: New, LoRA-like Fine-tuning
Illustrating Reinforcement Learning from Human Feedback (HuggingFace)
Presenting "The Muse" - a logit sampler that makes LLM's more creative
How to overcome issues with token limits on document summarization? (Reddit)
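The usual workaround discussed in threads like the one above is map-reduce style summarization: split the document into chunks that fit the context window, summarize each, then summarize the summaries. A minimal sketch (my own; the summarize call is a hypothetical hook into your model):

    def summarize(text: str) -> str:
        # Hypothetical hook: send the text to your LLM with a "Summarize:" prompt.
        raise NotImplementedError

    def chunk(text: str, max_chars: int = 6000) -> list[str]:
        # Crude character-based splitting; in practice you'd count tokens
        # and split on paragraph or sentence boundaries.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    def summarize_long_document(doc: str) -> str:
        partial_summaries = [summarize(c) for c in chunk(doc)]   # "map" step
        return summarize("\n".join(partial_summaries))           # "reduce" step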
How to implement stopping criteria in the transformers library (StackOverflow)
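For reference, a hedged sketch of the standard approach with Hugging Face transformers (the StackOverflow answer may differ in detail): subclass StoppingCriteria and halt once a chosen stop string shows up in the decoded text.

    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              StoppingCriteria, StoppingCriteriaList)

    class StopOnString(StoppingCriteria):
        def __init__(self, tokenizer, stop_string: str):
            self.tokenizer = tokenizer
            self.stop_string = stop_string

        def __call__(self, input_ids, scores, **kwargs) -> bool:
            # Decode what has been generated so far and check for the stop string.
            text = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
            return self.stop_string in text

    tokenizer = AutoTokenizer.from_pretrained("gpt2")   # example model
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    inputs = tokenizer("Q: What is a token?\nA:", return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=100,
        stopping_criteria=StoppingCriteriaList([StopOnString(tokenizer, "\nQ:")]),
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))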
This section is about how people started using compression algorithms in this space. There was a flurry of posts on them; I've collected some below, with a small sketch of the core idea after the links.
Faster Neural Networks Straight from JPEG (uber.com)
Low-Resource Text Classification: A Parameter-Free Classification Method with Compressors (the tweet)
ziplm: Gzip-Backed Language Model (Hacker News) (has good links)
Rebuttal to above: Bad numbers in GZIP beats BERT paper (Hacker News)
Decoding the ACL Paper: Gzip and KNN Rival Bert in Text Classification (Hacker News)
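The core idea behind these posts, condensed into a toy sketch (not the paper's exact code): measure how well two texts compress together (normalized compression distance) and classify with k-nearest neighbors.

    import gzip

    def clen(s: str) -> int:
        return len(gzip.compress(s.encode()))

    def ncd(a: str, b: str) -> float:
        # Normalized compression distance: small when a and b share structure.
        ca, cb, cab = clen(a), clen(b), clen(a + " " + b)
        return (cab - min(ca, cb)) / max(ca, cb)

    def classify(text: str, train: list[tuple[str, str]], k: int = 3) -> str:
        # train is a list of (text, label) pairs; majority vote over the k nearest.
        neighbors = sorted(train, key=lambda ex: ncd(text, ex[0]))[:k]
        labels = [label for _, label in neighbors]
        return max(set(labels), key=labels.count)

    train = [("the team won the match", "sports"),
             ("stocks fell sharply today", "finance"),
             ("the striker scored twice last night", "sports"),
             ("the central bank raised interest rates", "finance")]
    print(classify("the goalkeeper made a great save", train, k=1))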
PrivateGPT: Asking Questions on Your Own Files
Mipsology: Zebra converts some types of models to run on FPGA's for fast inference. Another article.
Classifying customer messages with LLMs vs traditional ML (trygloo.com)
A Tiny, Large Language Model Coded and Hallucinating
Myth of Context Length (Twitter)
Many Languages, One Deep Learning Model (Towards Data Science)
(Note: This HN comment says Google's flaxformer already addresses this. Produces the same output, too.)
Working with AI in Context (July 2023) (Medium)