LLM-FineTuning-Large-Language-Models


LLM (Large Language Models) FineTuning Projects and notes on common practical techniques

Fine-tuning LLM (and YouTube Video Explanations)

| Notebook 🟠 | YouTube Video 🟠 |
| --- | --- |
| CodeLLaMA-34B - Conversational Agent | YouTube Link |
| Inference Yarn-Llama-2-13b-128k with KV Cache to answer quiz on very long textbook | YouTube Link |
| Mistral 7B FineTuning with PEFT and QLoRA (see the QLoRA sketch below this table) | YouTube Link |
| Falcon finetuning on openassistant-guanaco | YouTube Link |
| Fine Tuning Phi 1_5 with PEFT and QLoRA | YouTube Link |
| Web scraping with Large Language Models (LLM) - AnthropicAI + LangChainAI | YouTube Link |
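Several of the notebooks above (Mistral 7B, Phi 1_5, Falcon) follow the same PEFT + QLoRA recipe. Below is a minimal, hedged sketch of that recipe; the model id, LoRA hyperparameters and target module names are illustrative assumptions, not the exact values used in any particular notebook.

```python
# Minimal QLoRA sketch: load the base model in 4-bit and train only small LoRA adapters.
# Model id and hyperparameters below are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-7B-v0.1"  # placeholder base model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # store base weights as 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,   # do the matmuls in bf16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

model = prepare_model_for_kbit_training(model)  # cast norms / enable input grads for k-bit training

lora_config = LoraConfig(
    r=16,                        # rank of the low-rank update matrices
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections (Llama/Mistral naming)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # only the adapters are trainable; base weights stay frozen
```

From here the model can be handed to a standard HF Trainer (or trl's SFTTrainer) together with a tokenized instruction dataset.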

Fine-tuning LLM

| Notebook | Colab |
| --- | --- |
| 📌 Finetune codellama-34B with QLoRA | Open In Colab |
| 📌 Mixtral Chatbot with Gradio | |
| 📌 togetherai api to run Mixtral | Open In Colab |
| 📌 Integrating TogetherAI with LangChain 🦙 | Open In Colab |
| 📌 Mistral-7B-Instruct_GPTQ - Finetune on finance-alpaca dataset 🦙 | Open In Colab |
| 📌 Mistral 7b FineTuning with DPO (Direct Preference Optimization) (see the DPO sketch below this table) | Open In Colab |
| 📌 Finetune llama_2_GPTQ | |
| 📌 TinyLlama with Unsloth and RoPE Scaling on the dolly-15 dataset | Open In Colab |
| 📌 Tinyllama fine-tuning with Taylor Swift song lyrics | Open In Colab |
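The DPO notebook above trains on preference pairs. As a rough, hedged sketch (the model id and the tiny inline dataset are placeholders, and the DPOTrainer keyword arguments follow the older trl 0.7-style API, which newer trl versions have moved into a DPOConfig):

```python
# Hedged DPO sketch: each training row is a (prompt, chosen, rejected) triple, and
# trl's DPOTrainer optimizes the policy against a frozen reference copy of the model.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"   # placeholder; the notebook uses Mistral 7B
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_id)
ref_model = AutoModelForCausalLM.from_pretrained(model_id)   # frozen reference for the KL-style term

# Toy preference dataset; real runs use e.g. an Anthropic-HH / UltraFeedback style dataset
train_dataset = Dataset.from_list([
    {"prompt": "Explain LoRA in one sentence.",
     "chosen": "LoRA adds small trainable low-rank adapters on top of frozen weights.",
     "rejected": "LoRA is a type of GPU."},
])

trainer = DPOTrainer(
    model,
    ref_model,
    args=TrainingArguments(output_dir="dpo-out", per_device_train_batch_size=1,
                           num_train_epochs=1, remove_unused_columns=False, report_to="none"),
    beta=0.1,                      # how strongly the policy is kept close to the reference model
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```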

LLM Techniques and utils - Explained

LLM Concepts
📌 DPO (Direct Preference Optimization) training and its datasets
📌 4-bit LLM Quantization with GPTQ
📌 Quantize with HF Transformers (sketch after this list)
📌 Understanding rank r in LoRA and related Matrix_Math
📌 Rotary Embeddings (RopE) is one of the Fundamental Building Blocks of LlaMA-2 Implementation
📌 Chat Templates in HuggingFace (sketch after this list)
📌 How Mixtral 8x7B is a dense 47Bn param model (sketch after this list)
📌 The concept of validation log perplexity in LLM training - a note on fundamentals (sketch after this list)
📌 Why we need to identify target_layers for LoRA/QLoRA (sketch after this list)
📌 Evaluate Tokens per sec (sketch after this list)
📌 Traversing through nested attributes (or sub-modules) of a PyTorch module (sketch after this list)
📌 Implementation of Sparse Mixtures-of-Experts layer in PyTorch from Mistral Official Repo
📌 Util method to extract a specific token's representation from the last hidden states of a transformer model (sketch after this list)
📌 Convert PyTorch model's parameters and tensors to half-precision floating-point format
📌 Quantizing 🤗 Transformers models with the GPTQ method
📌 Quantize Mixtral-8x7B so it can run in 24GB GPU
📌 What is GGML or GGUF in the world of Large Language Models?
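A minimal sketch of the Transformers-side GPTQ flow referenced by the quantization notes above ("4-bit LLM Quantization with GPTQ" and "Quantize with HF Transformers"). It assumes the optimum and auto-gptq packages are installed, and the model id is a small placeholder rather than Mixtral:

```python
# Hedged sketch: post-training 4-bit GPTQ quantization via the HF Transformers integration.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"          # small placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)

gptq_config = GPTQConfig(
    bits=4,            # 4-bit weights
    dataset="c4",      # calibration samples used to minimise per-layer quantization error
    tokenizer=tokenizer,
)

# Weights are quantized layer by layer while loading; the result can be saved or pushed as usual.
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", quantization_config=gptq_config
)
quantized_model.save_pretrained("opt-125m-gptq-4bit")
tokenizer.save_pretrained("opt-125m-gptq-4bit")
```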
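For the chat-templates note, a small sketch of tokenizer.apply_chat_template (the model id is a placeholder chat model):

```python
# Chat templates turn role/content messages into the exact prompt string
# (special tokens included) that the chat model was fine-tuned on.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # placeholder

messages = [
    {"role": "user", "content": "What is LoRA in one sentence?"},
    {"role": "assistant", "content": "A small low-rank adapter trained on top of frozen weights."},
    {"role": "user", "content": "And QLoRA?"},
]

# tokenize=False returns the formatted string; add_generation_prompt appends the
# assistant header so the model knows it is its turn to answer.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```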
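On the Mixtral note: the name 8x7B suggests 8 x 7 = 56Bn parameters, but only the feed-forward blocks are replicated per expert while attention, embeddings and norms are shared, so the total comes to roughly 47Bn, of which about 13Bn are active per token with top-2 routing. A back-of-the-envelope check using the published config values:

```python
# Rough parameter count for Mixtral 8x7B from its published config
# (hidden 4096, 32 layers, FFN 14336, 8 experts, top-2 routing, GQA with 8 KV heads).
hidden, layers, ffn = 4096, 32, 14336
experts, active_experts = 8, 2
vocab, heads, kv_heads = 32000, 32, 8
head_dim = hidden // heads

expert_ffn = 3 * hidden * ffn                              # gate, up and down projections per expert
attn = hidden * (2 * hidden + 2 * kv_heads * head_dim)     # q and o full width, k and v grouped-query
embeddings = 2 * vocab * hidden                            # input embeddings + LM head

total  = layers * (experts * expert_ffn + attn) + embeddings
active = layers * (active_experts * expert_ffn + attn) + embeddings
print(f"total  = {total / 1e9:.1f}Bn")    # roughly 46.7Bn, not 56Bn
print(f"active = {active / 1e9:.1f}Bn")   # roughly 12.9Bn used per token (top-2 of 8 experts)
```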
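For the validation log perplexity note, a sketch of the usual definition: perplexity is the exponential of the mean per-token cross-entropy over a held-out set (the model and dataloader are assumed to exist):

```python
# Validation perplexity = exp(mean negative log-likelihood per token).
import math
import torch

@torch.no_grad()
def validation_perplexity(model, dataloader, device="cuda"):
    model.eval()
    total_nll, total_tokens = 0.0, 0
    for batch in dataloader:
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)
        # HF causal-LM heads return the mean token-level cross-entropy as .loss;
        # for an exact figure, pad positions in the labels should be set to -100.
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=input_ids)
        n_tokens = int(attention_mask.sum())
        total_nll += out.loss.item() * n_tokens
        total_tokens += n_tokens
    return math.exp(total_nll / total_tokens)
```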
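For the target_layers and nested-attribute notes, one combined sketch: list the leaf names of every nn.Linear layer (the usual LoRA/QLoRA targets) and fetch a sub-module by its dotted path. The model id and the example path assume a Llama-style architecture:

```python
# Find candidate LoRA target_modules and traverse nested sub-modules by dotted name.
from functools import reduce
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # placeholder

# Leaf names of every nn.Linear - on Llama-style models this surfaces
# q_proj / k_proj / v_proj / o_proj / gate_proj / up_proj / down_proj (and lm_head).
linear_leaf_names = {name.split(".")[-1]
                     for name, module in model.named_modules()
                     if isinstance(module, nn.Linear)}
print(sorted(linear_leaf_names))

def get_nested_attr(root, dotted_path):
    """Walk a 'model.layers.0.self_attn'-style path one attribute at a time."""
    return reduce(getattr, dotted_path.split("."), root)

print(type(get_nested_attr(model, "model.layers.0.self_attn")))
```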
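For the tokens-per-second note, a rough timing sketch around model.generate (the model id and prompt are placeholders; on GPU the cuda synchronisation matters for honest timings):

```python
# Rough decode throughput: generated tokens divided by wall-clock generation time.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"   # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

inputs = tokenizer("Explain the KV cache in one paragraph.", return_tensors="pt").to(model.device)

if torch.cuda.is_available():
    torch.cuda.synchronize()
start = time.perf_counter()
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
if torch.cuda.is_available():
    torch.cuda.synchronize()
elapsed = time.perf_counter() - start

new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/sec")
```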
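And for the util that extracts a specific token's representation, a sketch that grabs the last non-padding token's hidden vector from last_hidden_state (the encoder model id is a placeholder):

```python
# Pull one token's hidden vector (here: the last non-pad token per sequence)
# out of the final hidden states of a transformer.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "distilbert-base-uncased"   # placeholder encoder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

batch = tokenizer(["short text", "a slightly longer piece of text"],
                  padding=True, return_tensors="pt")
hidden = model(**batch).last_hidden_state            # (batch, seq_len, hidden_dim)

last_idx = batch["attention_mask"].sum(dim=1) - 1    # index of the last real token per row
last_token_repr = hidden[torch.arange(hidden.size(0)), last_idx]   # (batch, hidden_dim)
print(last_token_repr.shape)
```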

Other Smaller Language Models
