What if 40% of your LLM's answers are just plain wrong? Learn how to measure factuality and build more reliable AI applications.
#1 about 2 minutes
Understanding the dual nature of large language models
LLMs can generate both creative, coherent text and factually incorrect "hallucinations," posing a significant challenge for real-world applications.
#2 about 4 minutes
The architecture and evolution of LLMs
The combination of the scalable Transformer architecture and massive text datasets enables models like GPT to develop "parametric knowledge" as they grow in size.
#3 about 3 minutes
How training data quality influences model behavior
The quality of web-scraped datasets like Common Crawl, even after filtering, directly contributes to model hallucinations by embedding misinformation.
#4 about 2 minutes
Differentiating between faithfulness and factuality hallucinations
Hallucinations are categorized as either faithfulness errors, which contradict a given source text, or factuality errors, which stem from incorrect learned knowledge.
#5 about 3 minutes
Using the TruthfulQA dataset to measure misinformation
The TruthfulQA dataset provides a benchmark for measuring an LLM's tendency to repeat common misconceptions and conspiracy theories across various categories.
#6 about 6 minutes
A practical guide to benchmarking LLM hallucinations
A step-by-step demonstration shows how to use Python, LangChain, and Hugging Face Datasets to run the TruthfulQA benchmark on a model like GPT-3.5 Turbo.
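The core of such a benchmark run is a scoring loop. The following is a minimal sketch, not the code from the talk: the two rows are hand-written stand-ins shaped like multiple-choice benchmark items, and `pick_answer` is a hypothetical callable that would, in a real run, wrap an LLM call and return the index of the chosen answer. In practice the rows would be loaded with Hugging Face Datasets rather than hard-coded.

```python
from typing import Callable, Sequence

# Hypothetical sample rows (assumption: one question, candidate answers,
# and a 0/1 truthfulness label per candidate).
SAMPLE_ROWS = [
    {
        "question": "What happens if you crack your knuckles a lot?",
        "choices": [
            "Nothing in particular happens.",
            "You will develop arthritis.",
        ],
        "labels": [1, 0],
    },
    {
        "question": "How much of their brains do humans typically use?",
        "choices": [
            "Humans use only ten percent of their brains.",
            "Humans use virtually all of their brains.",
        ],
        "labels": [0, 1],
    },
]


def mc_accuracy(
    rows: Sequence[dict],
    pick_answer: Callable[[str, Sequence[str]], int],
) -> float:
    """Return the fraction of questions where the picked answer is labeled truthful."""
    if not rows:
        raise ValueError("rows must not be empty")
    correct = 0
    for row in rows:
        idx = pick_answer(row["question"], row["choices"])
        correct += row["labels"][idx]
    return correct / len(rows)


def always_first(question: str, choices: Sequence[str]) -> int:
    """Trivial baseline standing in for a model: always pick the first choice."""
    return 0


if __name__ == "__main__":
    print(f"baseline accuracy: {mc_accuracy(SAMPLE_ROWS, always_first):.2f}")
```

Replacing `always_first` with a function that prompts a model and parses its choice turns the same loop into an actual measurement; comparing against a trivial baseline helps show whether the model is doing better than position bias alone.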
#7 about 4 minutes
Exploring strategies to reduce LLM hallucinations
Key techniques to mitigate hallucinations include careful prompt crafting, domain-specific fine-tuning, output evaluation, and retrieval-augmented generation (RAG).
#8 about 4 minutes
A deep dive into retrieval-augmented generation
RAG reduces hallucinations by augmenting prompts with relevant, up-to-date information retrieved from a vector database of document embeddings.
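The retrieve-then-augment flow can be sketched in a few lines. This is an illustrative toy under stated assumptions: `embed` uses word counts instead of a real embedding model, a plain Python list stands in for the vector database, and the function names and prompt wording are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: List[str], k: int = 1) -> str:
    """Augment the question with retrieved context before sending it to a model."""
    context = "\n".join(retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


DOCS = [
    "The Transformer architecture was introduced in 2017.",
    "Common Crawl is a large web-scraped text corpus.",
]

if __name__ == "__main__":
    print(build_prompt("What is Common Crawl?", DOCS))
```

A production setup would swap `embed` for a learned embedding model and the list scan for an approximate nearest-neighbor index, but the prompt-building step stays essentially the same.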
#9 about 2 minutes
Overcoming challenges with advanced RAG techniques
Naive RAG can fail due to poor retrieval or generation, but advanced methods like Rowen selectively apply retrieval to significantly improve factuality.
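One simple way to apply retrieval selectively is to gate it on how consistent the model's own answers are. The sketch below is a generic illustration of that idea, not the specific method named in the talk: `sample_answers` and `answer_with_retrieval` are hypothetical callables, and the 0.7 agreement threshold is an arbitrary assumption.

```python
from collections import Counter
from typing import Callable, List, Tuple


def majority_and_agreement(samples: List[str]) -> Tuple[str, float]:
    """Return the most common normalized answer and the share of samples that match it."""
    if not samples:
        raise ValueError("samples must not be empty")
    normalized = [s.strip().lower() for s in samples]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)


def answer_selectively(
    question: str,
    sample_answers: Callable[[str], List[str]],
    answer_with_retrieval: Callable[[str], str],
    threshold: float = 0.7,
) -> Tuple[str, bool]:
    """Answer from the model alone when its samples agree; otherwise fall back to retrieval.

    Returns the answer and a flag telling whether retrieval was used.
    """
    answer, agreement = majority_and_agreement(sample_answers(question))
    if agreement >= threshold:
        return answer, False
    return answer_with_retrieval(question), True
```

Skipping retrieval when the model is already consistent avoids injecting irrelevant context into questions it can answer reliably, while inconsistent answers are treated as a signal that external evidence is needed.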