This blog post is a brief summary of the points covered at the Large Language Models meetup hosted at the Google Developer Space.

News You Can Use - Data, Inference and Training

  • Google’s FLAN-T5 2022
    • Fully open source
    • Instruction fine-tuned
    • Available in multiple sizes
    • Strangely underrated
    • SAT reading scores for the pre-trained models
    • Google Blog post
    • Available on Hugging Face (loading sketch below)
    • Googlers say the 11B model is unfairly good
    • Surprisingly powerful models available
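As a quick illustration, here is a minimal sketch of loading one of the FLAN-T5 checkpoints from Hugging Face with the transformers library (the flan-t5-large size here is an arbitrary choice, not from the talk):

```python
# Minimal sketch: load an instruction-tuned FLAN-T5 checkpoint from Hugging Face.
# Swap in google/flan-t5-xxl for the 11B model if you have the memory.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = "Answer the following question. Who wrote The Old Man and the Sea?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```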
  • Data Selection
    • Large scale pretraining
    • Corpus fine-tuning
    • Task fine-tuning
  • Two easy fixes:
    • N-gram filter
      • Data Selection for Language Models via Importance Resampling (2023)
      • Basic idea
        • Train on data that is similar to your target data
      • DSIR helps in creating good training samples from random text on the internet (sketch below)
      • 82.2 for the RoBERTa baseline, whereas DSIR reaches 83
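A toy sketch of the DSIR idea: score raw web documents by how much they look like the target corpus under hashed n-gram bag models, then resample. The bucket count, n-gram order and smoothing are illustrative assumptions rather than the paper's exact configuration:

```python
# Toy sketch of DSIR-style data selection with hashed n-gram features.
import hashlib, math, random
from collections import Counter

BUCKETS = 10_000  # illustrative assumption

def buckets(text: str, n: int = 2) -> list[int]:
    toks = text.lower().split()
    grams = toks + [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return [int(hashlib.md5(g.encode()).hexdigest(), 16) % BUCKETS for g in grams]

def log_probs(docs: list[str]) -> list[float]:
    counts = Counter(b for d in docs for b in buckets(d))
    total = sum(counts.values())
    # add-one smoothing so unseen buckets don't zero out a document
    return [math.log((counts[b] + 1) / (total + BUCKETS)) for b in range(BUCKETS)]

def dsir_select(raw_docs: list[str], target_docs: list[str], k: int) -> list[str]:
    lp_target, lp_raw = log_probs(target_docs), log_probs(raw_docs)

    def log_weight(doc: str) -> float:
        # log p_target(x) - log p_raw(x) under the bag-of-hashed-ngrams models
        return sum(lp_target[b] - lp_raw[b] for b in buckets(doc))

    def gumbel() -> float:
        return -math.log(-math.log(random.random()))

    # Gumbel-top-k: sample k documents without replacement, proportional to the weights
    keyed = sorted(raw_docs, key=lambda d: log_weight(d) + gumbel(), reverse=True)
    return keyed[:k]
```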
    • Prefix detox
      • Pretraining Language Models with Human Preferences
      • Simple idea
        • Instead of training on plain text
        • Rate the pretraining text
      • Training - use two control tokens during training (one for good text, one for bad)
      • Inference - condition only on the "good" control token (sketch below)
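A minimal sketch of the control-token idea; the token names and the scoring heuristic are illustrative assumptions, not the paper's exact setup:

```python
# Minimal sketch of conditional pretraining with control tokens.
GOOD, BAD = "<|good|>", "<|bad|>"  # illustrative token names

def score_text(text: str) -> float:
    """Toy stand-in for a real preference / toxicity scorer (assumption)."""
    flagged = {"badword1", "badword2"}
    return 0.0 if flagged & set(text.lower().split()) else 1.0

def to_training_example(text: str, threshold: float = 0.5) -> str:
    # Training: rate the pretraining text and prepend the matching control token.
    tag = GOOD if score_text(text) >= threshold else BAD
    return f"{tag} {text}"

def to_inference_prompt(prompt: str) -> str:
    # Inference: always condition on the "good" control token.
    return f"{GOOD} {prompt}"

print(to_training_example("some scraped text with badword1 in it"))   # tagged <|bad|>
print(to_inference_prompt("Write a friendly reply to this email."))   # tagged <|good|>
```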
  • Running Large Models
    • Large models can be difficult to run
    • Low precision works surprisingly well
      • use float16
    • Most practical to least practical
      • 8-bit quantisation
      • LLM.int8() - 8-bit Matrix Multiplication for Transformers at Scale
        • Metric is mean zero-shot accuracy
        • Metrics are within the standard error of the original models
      • FlexGen (4-bit)
        • High-throughput Generative Inference of Large Language Models with a Single GPU
        • Lowers the resource requirements of LLM inference
        • Metric is tokens per second - why is this metric important?
      • 1-bit quantization
        • Binarized Neural Machine Translation
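For the 8-bit case, a minimal sketch of loading a model through the transformers + bitsandbytes integration (the model name is an arbitrary choice, and it assumes a CUDA GPU plus the accelerate and bitsandbytes packages):

```python
# Minimal sketch: load a causal LM with 8-bit weights via bitsandbytes.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-6.7b"  # illustrative choice, not from the talk
tokenizer = AutoTokenizer.from_pretrained(model_name)

# float16 already halves memory; load_in_8bit quantises the weights further.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,
)

inputs = tokenizer("Large models can be difficult to run because", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```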
  • Training Large Models
    • Gradient Checkpointing
      • Saves around 50 percent of GPU memory (see the combined sketch below)
    • Key idea
      • Don’t train the large model
      • Train a parasite model
    • LoRA - Low-Rank Adaptation of Large Language Models
      • Reduce trainable parameters by 10,000x
      • Reduce GPU memory
      • minLoRA
      • LoRA and Stable Diffusion
      • PEFT library
    • PEFT for Whisper
    • ControlNet
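A minimal sketch that combines gradient checkpointing with LoRA fine-tuning via the PEFT library; the base model and LoRA hyperparameters are illustrative assumptions:

```python
# Minimal sketch: gradient checkpointing + LoRA fine-tuning via PEFT.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")  # illustrative base model

# Gradient checkpointing: recompute activations in the backward pass,
# trading compute for a large reduction in GPU memory.
model.gradient_checkpointing_enable()

# LoRA: freeze the base model and train small low-rank adapter matrices instead.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The last call typically reports well under one percent of the weights as trainable, which is where the big reduction in trainable parameters and optimizer memory comes from.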
  • Privacy
    • Offsite-Tuning - Transfer Learning without Full Model
  • How to make use of big-iron models without having to invest in big iron?

Supercharge ML experiments with PyTorch Lightning - Vivek Kalyan

LangChain

  • open source
  • Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge
  • LLMs generate text conditioned on a prompt
  • RLHF
  • LLMs can’t access the traditional software stack
  • An LLM alone is not enough
  • Each call to an LLM is independent
  • Conversation state needs to be passed back in each time
  • LLMs have a finite context window we can pass text into (see the sketch below)
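A minimal sketch of what that means in practice: with a bare LLM you have to pack the whole conversation back into the prompt on every call (the `complete` function below is a hypothetical stand-in for any completion API):

```python
# Minimal sketch: each LLM call is stateless, so the running conversation
# has to be packed back into the prompt every time.
def complete(prompt: str) -> str:
    # Hypothetical stand-in for a real completion endpoint.
    return "(model reply)"

history: list[tuple[str, str]] = []  # (user, assistant) turns

def chat(user_message: str) -> str:
    # Rebuild the full prompt from the conversation so far; the context window
    # is finite, so long conversations eventually need truncation or summarisation.
    transcript = "".join(f"User: {u}\nAssistant: {a}\n" for u, a in history)
    prompt = f"{transcript}User: {user_message}\nAssistant:"
    reply = complete(prompt)
    history.append((user_message, reply))
    return reply

print(chat("What is LoRA?"))
print(chat("And how does it relate to PEFT?"))  # the first turn is re-sent here
```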
  • LangChain is an open source project that builds structure around language models
  • Allows fully featured apps that interact with the software stack
  • Manage LLMs and prompts
  • Integrate with APIs, databases and data sources
  • Supports Python and TypeScript
  • Seven components
    • LLMs
    • Prompt templates
    • Tools
    • Chains
    • Memory
    • Agents
    • Index
  • Everything starts with a prompt
  • Everything in LangChain is based on prompts
  • Need to use prompts
  • Prompt engineering
    • Getting the model to generate text conditioned on some text
    • The prompt that we input has a massive influence on the output
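A minimal sketch of a LangChain prompt template (the template text itself is just an example):

```python
# Minimal sketch: a reusable prompt template in LangChain.
from langchain.prompts import PromptTemplate

template = PromptTemplate(
    input_variables=["product"],
    template="Suggest a good name for a company that makes {product}.",
)

print(template.format(product="eco-friendly water bottles"))
# -> "Suggest a good name for a company that makes eco-friendly water bottles."
```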
  • InstructGPT - Training Language Models to Follow Instructions with Human Feedback
  • Few-shot learning
  • Tools are individual components that LangChain can use in Chains
  • A chain is made up of linked tools
  • Chains
    • Generic
    • Utility
    • Async chains
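A minimal sketch of a generic chain, wrapping an LLM and a prompt template in an LLMChain (it assumes an OpenAI API key in the environment; any LangChain LLM wrapper would work):

```python
# Minimal sketch: the simplest generic chain - prompt template + LLM.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.7)
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a one-sentence summary of {topic}.",
)

chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(topic="the LoRA fine-tuning technique"))
```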
  • Fact extraction on a TechCrunch article
  • Generate knowledge graph triples from the extracted facts
  • PALChain
    • Give it a natural-language question
    • It turns the question into code
    • The code, when run, returns the answer
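A minimal PALChain (program-aided language model) sketch; the example question is made up, and depending on the LangChain version PALChain may live in langchain_experimental instead:

```python
# Minimal sketch: PAL - the LLM writes a small Python program
# whose execution produces the final answer.
from langchain.llms import OpenAI
from langchain.chains import PALChain

llm = OpenAI(temperature=0)
pal_chain = PALChain.from_math_prompt(llm, verbose=True)

question = (
    "Jan has three times the number of pets as Marcia. "
    "Marcia has two more pets than Cindy. "
    "If Cindy has four pets, how many pets do the three have in total?"
)
print(pal_chain.run(question))
```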
  • Model generates URL