Lecture 8: The GPT Tokenizer: Byte Pair Encoding
In this lecture, we will learn about Byte Pair Encoding (BPE), the tokenizer that powers modern LLMs such as GPT-2, GPT-3 and GPT-4.

The key reference book, which this video series follows very closely, is Build a Large Language Model from Scratch, published by Manning Publications. All schematics and their descriptions are borrowed from this book. It serves as a comprehensive guide to understanding and building large language models, covering key concepts, techniques, and implementations. Affiliate links for purchasing the book will be added soon.

0:00 Why we need Byte Pair Encoder (BPE)
2:55 Word and character level tokenizers
11:37 Sub-word tokenization
16:05 Byte Pair Encoder (BPE) Algorithm
21:33 BPE for Large Language Models
22:42 BPE practical demonstration
40:51 Implementing BPE in Python
47:47 Key takeaways

Entire Code file link: https://drive.google.com/file/d/1ukW7...
OpenAI BPE Implementation (tiktoken): https://github.com/openai/tiktoken

=================================================
✉️ Join our FREE Newsletter: https://vizuara.ai/our-newsletter/
=================================================

Vizuara philosophy:
As we learn AI/ML/DL material, we will share thoughts on what is actually useful in industry and what has become irrelevant. We will also share which subjects contain open areas of research, so that interested students can start their research journey there. If you are confused or stuck in your ML journey, it may be that courses and offline videos are not inspiring enough. What might inspire you is seeing someone else learn and implement machine learning from scratch. No cost. No hidden charges. Pure old school teaching and learning.

=================================================
🌟 Meet Our Team: 🌟
🎓 Dr. Raj Dandekar (MIT PhD, IIT Madras department topper) 🔗 LinkedIn: / raj-abhijit-dandekar-67a33118a
🎓 Dr. Rajat Dandekar (Purdue PhD, IIT Madras department gold medalist) 🔗 LinkedIn: / rajat-dandekar-901324b1
🎓 Dr. Sreedath Panat (MIT PhD, IIT Madras department gold medalist) 🔗 LinkedIn: / sreedath-panat-8a03b69a
🎓 Sahil Pocker (Machine Learning Engineer at Vizuara) 🔗 LinkedIn: / sahil-p-a7a30a8b
🎓 Abhijeet Singh (Software Developer at Vizuara, GSOC 24, SOB 23) 🔗 LinkedIn: / abhijeet-singh-9a1881192
🎓 Sourav Jana (Software Developer at Vizuara) 🔗 LinkedIn: / souravjana131
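
=================================================
For readers who want a feel for the Byte Pair Encoding algorithm covered in the lecture before watching, here is a minimal sketch in Python. It is not the code from the lecture's file and not how tiktoken is implemented; it assumes a byte-level starting vocabulary (ids 0 to 255) and repeatedly merges the most frequent adjacent pair into a new id. The function names (`most_frequent_pair`, `merge_pair`, `train_bpe`, `decode`) are hypothetical, chosen only for this illustration. Production tokenizers such as GPT-2's also pre-split text with a regular expression before merging, which this sketch leaves out.

```python
from collections import Counter


def most_frequent_pair(ids):
    """Return the most common adjacent pair of ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` (left to right) with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text`.

    Returns the compressed id sequence and the merge table {pair: new_id}.
    """
    ids = list(text.encode("utf-8"))  # assumption: start from raw bytes, ids 0..255
    merges = {}
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:
            break  # fewer than two ids left, nothing to merge
        new_id = 256 + step
        ids = merge_pair(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


def decode(ids, merges):
    """Expand ids back to text by rebuilding the vocabulary in merge order."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():  # dicts preserve insertion order
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


if __name__ == "__main__":
    ids, merges = train_bpe("aaabdaaabac", num_merges=3)
    print(ids, merges)
    print(decode(ids, merges))
```

On the toy string above, the first merge is the pair of two `a` bytes, since it occurs more often than any other adjacent pair. Each merge shortens the sequence while keeping it exactly reversible, which is the trade-off between vocabulary size and sequence length that sub-word tokenization is about.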