Introduction to large language models (Lesson 01) #33
Conversation
Converted back to draft, this is way too wordy so I'm going to trim it down a bit before subjecting someone to reviewing this. 😄
roshansuresh left a comment:
Overall the intro lesson reads really well! The intro table of contents shouldn't include Abstraction layers there I think. I've left some small comments in the chapters.
> - **1. Natural language processing**: an overview of the general field of natural language processing
> - **2. Introduction to LLMs**: what are large language models, and how do they work?
> - **3. LLM architecture**: a deeper dive into what makes LLMs tick.
> - **4. Demo: Visualizing embeddings** Visualizing how meanings are represented in LLMs.
For clarity, I would have it say "Demo - Visualizing embeddings:"
> Ultimately, NLP aims to make human-computer interaction as intuitive as human-to-human exchanges, so it can be used in fields as diverse as healthcare diagnostics, explaining complex legal documents, and personalized education.
> ### NLP Methods
Perhaps we can call this "NLP Methods - A History" or something?
> There are really *two* different learning modes for LLMs. First, by training on huge bodies of text in the next-word-prediction task, we end up with what are often called *foundational*, *pretrained*, or *base* models. These are general purpose models that embody information from extremely broad sources.
> However, these foundational models don't work well in special-purpose jobs like personal assistants, chatbots, etc. A *second* second training step fine-tunes these models on smaller, labeled datasets for specific applications.
This is actually a really important point. I think we should reiterate this in the chatbots lesson
Also, a typo in second second
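For anyone following the thread, a minimal sketch of what that second (fine-tuning) step typically looks like in PyTorch. This is illustrative only and not from the lesson: the model and dataset are placeholders, and the model is assumed to return an object with a `.loss` attribute when given labels, as in common instruction-tuning setups.

```python
# Toy sketch: continue training a *pretrained* model on a small labeled dataset.
import torch
from torch.utils.data import DataLoader

def fine_tune(pretrained_model, labeled_dataset, epochs=3, lr=1e-5):
    optimizer = torch.optim.AdamW(pretrained_model.parameters(), lr=lr)
    loader = DataLoader(labeled_dataset, batch_size=8, shuffle=True)
    pretrained_model.train()
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            loss = pretrained_model(**batch).loss  # supervised loss on labeled examples
            loss.backward()
            optimizer.step()
    return pretrained_model
```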
> If you have ever been using ChatGPT and it has asked you to rank two responses, this is OpenAI collecting data for future rounds of RLHF.
> The result with fine-tuning is the production of specialized models built on top of the same foundation. While in this course we will not go through the process of building your own LLM, the excellent book [Build a Large Language Model from Scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch) by Sebastian Raschka, walks you through this in detailk using PyTorch if you are interested. The above picture is adapted from his book.
typo in "detailk*
> In the next section we will dig into the details about how LLMS actually work. As we said, it isn't just that they are *large*, but their *architecture*, that makes them so powerful.
typo in "LLMS" (maybe fine but its probably better to have it be LLMs)
> For a quick overview of LLM function, [check out this video](https://www.youtube.com/watch?v=5sLYAQS9sWQ).
> ## 4. Demo: Visualizing embeddings
> In the following demonstration we will visualize text embeddings based on their semantic similarity. Instead of similareity of single words like `apple` and `phone`, we will look at similarities among entire *sentences*.
typo in "instead of similareity"
> We first need to load our OpenAI API key so you can use the embedding model that is part of their suite of models. We will discuss this more in the lesson on chat completions
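As a side note for reviewers, the embedding call being described presumably looks something like the sketch below. This assumes the official `openai` Python package with an `OPENAI_API_KEY` set in the environment; the sentences and model name are made-up examples, not taken from the lesson.

```python
# Fetch sentence embeddings from OpenAI's embeddings endpoint.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

sentences = [
    "A group of toys comes to life when their owner leaves the room.",
    "An ex-cop battles thieves in a Los Angeles skyscraper on Christmas Eve.",
]
response = client.embeddings.create(
    model="text-embedding-3-small",  # illustrative model choice
    input=sentences,
)
vectors = [item.embedding for item in response.data]
print(len(vectors), len(vectors[0]))  # number of sentences, embedding dimension
```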
> > TODO: handle this discussion better -- put in README for this week and discuss handling of api keys, where to put it, that it will recursively seach parent dsirectories up to root). Please be sure not to share your API key!
You may want to remove this
> ```python
> plt.show()
> ```
> This mapping shows that similar movies tend to cluster together in the embedding space. E.g., Christmas movies in one region of the space, romcons in another (note Love Actually is a Christmas Romcom). Feel free to drop new summaries in to see where they fall in this movie map.
Do we show the plot here or just ask them to run this code themselves?
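For context, the plot in question is presumably produced by something like the following hedged sketch: PCA is used here for the 2-D reduction, though the lesson may use a different method, and `vectors` and `sentences` are assumed to come from the embedding step quoted above.

```python
# Reduce the embedding vectors to 2-D and scatter-plot them as a "movie map".
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

coords = PCA(n_components=2).fit_transform(np.array(vectors))

fig, ax = plt.subplots()
ax.scatter(coords[:, 0], coords[:, 1])
for (x, y), label in zip(coords, sentences):
    ax.annotate(label[:30], (x, y), fontsize=8)  # truncate long summaries for readability
ax.set_title("Text summaries in embedding space")
plt.show()
```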
> > To fill in later: brief motivational preview here. Briefly explain why this lesson matters, what students will be able to do by the end, and what topics will be covered. Keep it tight and motivating.
> > For an introduction to the course, and a discussion of how to set up your environment, please see the [Welcome](../README.md) page.
We can remove the prior statement
> 2. [OpenAI Chat Completions API](02_open_ai_api.md)
>    Intro and overview of openai api chat completions endpoint. Go over required params (messages/model), but also the important optional params (max_tokens, temperature, top_p etc). Mention responses endpoint (more friendly to tools/agents). Discuss and demonstrate use of moderations endpoint.
> 3. [Abstraction layers](03_abstractions.md)
We don't have a separate Abstraction layers chapter as far as I know
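Since the outline above mentions the required and optional Chat Completions parameters, here is a minimal sketch of what that chapter's example call might look like. It assumes the `openai` package and an `OPENAI_API_KEY` in the environment; the model name and prompts are illustrative, not from the lesson.

```python
from openai import OpenAI

client = OpenAI()

# Required params: model and messages; common optional params shown below.
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful teaching assistant."},
        {"role": "user", "content": "In one sentence, what is an embedding?"},
    ],
    max_tokens=60,    # cap on generated tokens
    temperature=0.7,  # sampling randomness
    top_p=1.0,        # nucleus-sampling cutoff
)
print(completion.choices[0].message.content)

# The moderations endpoint mentioned in the outline can screen text for policy issues.
moderation = client.moderations.create(input="Some user-supplied text to screen")
print(moderation.results[0].flagged)
```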
Overview of natural language processing and LLMs. Closes #27
Will mainly contribute `python-200/lessons/05_AI_intro/01_intro_nlp_llms.md`.