Kickstart Your AI Weekend: 8 bits for a Byte

Enhance your Friday with substantial bits of knowledge

'8 bits for a byte' is your weekend AI read. Our comeback Meetup on 4/25 was a big success! Thanks for your support in making it happen!

Here's what I enjoyed reading, watching and learning this week:

  • AI Innovations: Dive into the latest developments with the "Artificial Intelligence Index Report" by Stanford. Read the report.

  • Where AI Jobs Are Booming: Explore the key cities leading in AI employment with insights from Axios. Check out the hotspots.

  • AI Orchestration: Learn about orchestration techniques for Large Language Models and Retrieval-Augmented Generation applications through this detailed video. Watch the video. (A minimal RAG sketch follows the chapter list below.)

    Chapters

    • 00:01:25 - Orchestration in the context of AI and LLM apps

    • 00:09:42 - Deepset Cloud vs. Haystack open source project

    • 00:12:19 - Haystack usage patterns: RAG and other apps

    • 00:17:24 - Retrieval Augmented Generation (RAG)

    • 00:23:48 - Tuning RAG requires experiments at scale

    • 00:28:28 - RAG Evaluation Metrics

    • 00:34:08 - Hallucination

    • 00:38:32 - Information Extraction

    • 00:43:57 - Data Quality

    • 00:46:00 - Streaming and (near) real-time

    • 00:54:43 - Haystack 2.0
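    If you want a feel for the retrieve-then-generate pattern the video covers before watching, here is a minimal, library-agnostic Python sketch. The document store, scoring function, and generate() call are toy stand-ins of my own, not Haystack APIs; an orchestrator like Haystack wires up production-grade versions of these pieces for you.

    ```python
    # Minimal retrieve-then-generate (RAG) sketch in plain Python.
    # All names here are illustrative placeholders, not a real framework's API.

    from collections import Counter
    import math

    DOCS = [
        "Haystack is an open source framework for building LLM applications.",
        "Retrieval-Augmented Generation grounds model answers in retrieved documents.",
        "Evaluation metrics for RAG include faithfulness and answer relevance.",
    ]

    def score(query: str, doc: str) -> float:
        """Toy lexical-overlap score; a real pipeline would use BM25 or embeddings."""
        q, d = Counter(query.lower().split()), Counter(doc.lower().split())
        overlap = sum((q & d).values())
        return overlap / math.sqrt(len(doc.split()) + 1)

    def retrieve(query: str, k: int = 2) -> list[str]:
        """Return the k highest-scoring documents for the query."""
        return sorted(DOCS, key=lambda d: score(query, d), reverse=True)[:k]

    def build_prompt(query: str, context: list[str]) -> str:
        """Stuff the retrieved context into a grounding prompt."""
        joined = "\n".join(f"- {c}" for c in context)
        return f"Answer using only the context below.\nContext:\n{joined}\nQuestion: {query}"

    def generate(prompt: str) -> str:
        """Placeholder for an LLM call (e.g. a hosted or local model request)."""
        return f"[LLM would answer here, given a prompt of {len(prompt)} characters]"

    if __name__ == "__main__":
        question = "What does Retrieval-Augmented Generation do?"
        print(generate(build_prompt(question, retrieve(question))))
    ```

    Swapping the toy scorer for a dense retriever and the placeholder generate() for a real model call is exactly the kind of wiring, tuning, and evaluation the video walks through.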

  • Understanding AI Hallucinations: Gain a deeper understanding of the challenges in building reliable AI, focusing on the phenomenon of hallucinations. Read the blog post.

  • Google's Prompting Guide: Access the guide designed specifically for use with Gemini for Google Workspace.

  • LLM Video Training: Update your knowledge on building Large Language Models with "A little guide to building Large Language Models in 2024," a comprehensive video training. Start Learning. (A toy data-parallelism sketch follows the chapter list below.)

    • Super Bonus: presentation deck.

Chapters:

  • 00:00:00 Intro

  • 00:00:59 Workflow for LLMs

  Part 1: Training: data

  • 00:01:17 Data preparation - intro and good recent resources on data preparation

  • 00:05:28 A web scale pretraining corpus - goals and challenges

  • 00:11:29 Web scale data sources – focus on recent datasets

  • 00:18:01 Language and quality filtering

  • 00:24:34 Diving into data deduplication

  • 00:27:40 Final data preparation for training

  • 00:31:31 How to evaluate data quality at scale

  • 00:36:29 The datatrove and lighteval libraries

  Part 2: Training: modeling

  • 00:38:18 Introduction to modeling techniques for LLM training

  • 00:39:09 When the model is too big: parallelism

  • 00:40:00 Data parallelism

  • 00:41:18 Tensor parallelism

  • 00:44:38 Pipeline parallelism

  • 00:47:00 Sequence parallelism and references on 4D parallelism

  • 00:47:52 Synchronisation: GPU-CPU and GPU-GPU challenges

  • 00:52:14 Flash attention v1 and v2

  • 00:56:23 Stable training recipes

  • 00:59:12 New architectures: Mixture-of-experts

  • 01:03:13 New architectures: Mamba

  • 01:04:49 The nanotron library

  Part 3: Fine-tuning: RLHF and alignment

  • 01:06:15 RLHF in 2024

  • 01:08:23 PPO, DPO and REINFORCE

  Part 4: Fast inference techniques

  • 01:11:23 Quantization, speculative decoding and compilation: overview and resources

  End

  • 01:14:36 Sharing your model, datasets and demo – final words
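To connect the parallelism chapters (around 00:40:00) to something concrete, here is a toy NumPy sketch of data parallelism: each "worker" holds a full copy of the weights, computes gradients on its own shard of the batch, and the gradients are averaged before every replica applies the same update. This is an illustration of the idea only, not nanotron or torch.distributed code.

```python
# Toy data-parallelism illustration: shard the batch across workers,
# average their gradients (the "all-reduce"), apply one shared update.

import numpy as np

def worker_gradient(weights: np.ndarray, x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Gradient of mean squared error for a linear model y ~ x @ weights."""
    pred = x @ weights
    return 2.0 * x.T @ (pred - y) / len(y)

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_w

weights = np.zeros(4)
n_workers, lr = 4, 0.1
shards = np.array_split(np.arange(len(x)), n_workers)  # one micro-batch per worker

for step in range(200):
    grads = [worker_gradient(weights, x[idx], y[idx]) for idx in shards]
    avg_grad = np.mean(grads, axis=0)   # stand-in for the all-reduce step
    weights -= lr * avg_grad            # every replica applies the same update

print(np.round(weights, 3))  # approaches [1.0, -2.0, 0.5, 3.0]
```

Tensor and pipeline parallelism, covered next in the video, instead split the model itself across devices when a full replica no longer fits in memory.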

Until next Friday, take it one bit at a time!
