Kickstart Your AI Weekend: 8 bits for a Byte

Enhance your Friday with substantial bits of knowledge

'8 bits for a byte' is your weekend AI read. Our comeback Meetup on 4/25 was a big success! Thanks for your support in making it happen!

Here's what I enjoyed reading, watching and learning this week:

  • AI Innovations: Dive into the latest developments with the "Artificial Intelligence Index Report" by Stanford. Read the report.

  • Where AI Jobs Are Booming: Explore the key cities leading in AI employment with insights from Axios. Check out the hotspots.

  • AI Orchestration: Learn about orchestration techniques for Large Language Models and Retrieval-Augmented Generation applications through this detailed video. Watch the video. (A minimal RAG sketch follows the chapter list below.)

    Chapters

    • 00:01:25 - Orchestration in the context of AI and LLM apps

    • 00:09:42 - Deepset Cloud vs. Haystack open source project

    • 00:12:19 - Haystack usage patterns: RAG and other apps

    • 00:17:24 - Retrieval Augmented Generation (RAG)

    • 00:23:48 - Tuning RAG requires experiments at scale

    • 00:28:28 - RAG Evaluation Metrics

    • 00:34:08 - Hallucination

    • 00:38:32 - Information Extraction

    • 00:43:57 - Data Quality

    • 00:46:00 - Streaming and (near) real-time

    • 00:54:43 - Haystack 2.0
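    If you want a feel for the retrieve-then-generate pattern the video covers before watching, here is a minimal, library-agnostic Python sketch. The document store, scoring function, and generate() call are toy stand-ins of my own, not Haystack APIs; an orchestrator like Haystack wires up production-grade versions of these pieces for you.

    ```python
    # Minimal retrieve-then-generate (RAG) sketch in plain Python.
    # All names here are illustrative placeholders, not a real framework's API.

    from collections import Counter
    import math

    DOCS = [
        "Haystack is an open source framework for building LLM applications.",
        "Retrieval-Augmented Generation grounds model answers in retrieved documents.",
        "Evaluation metrics for RAG include faithfulness and answer relevance.",
    ]

    def score(query: str, doc: str) -> float:
        """Toy lexical-overlap score; a real pipeline would use BM25 or embeddings."""
        q, d = Counter(query.lower().split()), Counter(doc.lower().split())
        overlap = sum((q & d).values())
        return overlap / math.sqrt(len(doc.split()) + 1)

    def retrieve(query: str, k: int = 2) -> list[str]:
        """Return the k highest-scoring documents for the query."""
        return sorted(DOCS, key=lambda d: score(query, d), reverse=True)[:k]

    def build_prompt(query: str, context: list[str]) -> str:
        """Stuff the retrieved context into a grounding prompt."""
        joined = "\n".join(f"- {c}" for c in context)
        return f"Answer using only the context below.\nContext:\n{joined}\nQuestion: {query}"

    def generate(prompt: str) -> str:
        """Placeholder for an LLM call (e.g. a hosted or local model request)."""
        return f"[LLM would answer here, given a prompt of {len(prompt)} characters]"

    if __name__ == "__main__":
        question = "What does Retrieval-Augmented Generation do?"
        print(generate(build_prompt(question, retrieve(question))))
    ```

    Swapping the toy scorer for a dense retriever and the placeholder generate() for a real model call is exactly the kind of wiring, tuning, and evaluation the video walks through.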

  • Understanding AI Hallucinations: Gain a deeper understanding of the challenges in building reliable AI, focusing on the phenomenon of hallucinations. Read the blog post.

  • Google's Prompting Guide: Access the guide designed specifically for use with Gemini for Google Workspace.

  • LLM Video Training: Update your knowledge on building Large Language Models with "A little guide to building Large Language Models in 2024," a comprehensive video training. Start Learning. (A toy data-parallelism sketch follows the chapter list below.)

    • Super Bonus: presentation deck.

Chapters:

  • 00:00:00 Intro

  • 00:00:59 Workflow for LLMs

  Part 1: Training: data

  • 00:01:17 Data preparation - intro and good recent resources on data preparation

  • 00:05:28 A web scale pretraining corpus - goals and challenges

  • 00:11:29 Web scale data sources – focus on recent datasets

  • 00:18:01 Language and quality filtering

  • 00:24:34 Diving into data deduplication

  • 00:27:40 Final data preparation for training

  • 00:31:31 How to evaluate data quality at scale

  • 00:36:29 The datatrove and lighteval libraries

  Part 2: Training: modeling

  • 00:38:18 Introduction to modeling techniques for LLM training

  • 00:39:09 When the model is too big: parallelism

  • 00:40:00 Data parallelism

  • 00:41:18 Tensor parallelism

  • 00:44:38 Pipeline parallelism

  • 00:47:00 Sequence parallelism and references on 4D parallelism

  • 00:47:52 Synchronisation: GPU-CPU and GPU-GPU challenges

  • 00:52:14 Flash attention v1 and v2

  • 00:56:23 Stable training recipes

  • 00:59:12 New architectures: Mixture-of-experts

  • 01:03:13 New architectures: Mamba

  • 01:04:49 The nanotron library

  Part 3: Fine-tuning: RLHF and alignment

  • 01:06:15 RLHF in 2024

  • 01:08:23 PPO, DPO and REINFORCE

  Part 4: Fast inference techniques

  • 01:11:23 Quantization, speculative decoding and compilation: overview and resources

  End

  • 01:14:36 Sharing your model, datasets and demo – final words
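To connect the parallelism chapters (around 00:40:00) to something concrete, here is a toy NumPy sketch of data parallelism: each "worker" holds a full copy of the weights, computes gradients on its own shard of the batch, and the gradients are averaged before every replica applies the same update. This is an illustration of the idea only, not nanotron or torch.distributed code.

```python
# Toy data-parallelism illustration: shard the batch across workers,
# average their gradients (the "all-reduce"), apply one shared update.

import numpy as np

def worker_gradient(weights: np.ndarray, x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Gradient of mean squared error for a linear model y ~ x @ weights."""
    pred = x @ weights
    return 2.0 * x.T @ (pred - y) / len(y)

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_w

weights = np.zeros(4)
n_workers, lr = 4, 0.1
shards = np.array_split(np.arange(len(x)), n_workers)  # one micro-batch per worker

for step in range(200):
    grads = [worker_gradient(weights, x[idx], y[idx]) for idx in shards]
    avg_grad = np.mean(grads, axis=0)   # stand-in for the all-reduce step
    weights -= lr * avg_grad            # every replica applies the same update

print(np.round(weights, 3))  # approaches [1.0, -2.0, 0.5, 3.0]
```

Tensor and pipeline parallelism, covered next in the video, instead split the model itself across devices when a full replica no longer fits in memory.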

Until next Friday, take it one bit at a time!
