Your RAG pipeline has security holes you haven't considered. Learn to defend against data poisoning and a new class of vector store attacks.
#1 about 3 minutes
Using RAG to enrich LLMs with proprietary data
Retrieval-augmented generation (RAG) is the key to making large language models useful for enterprises by providing them with up-to-date, proprietary information.
#2 about 4 minutes
The challenge of parsing complex document structures
Simple document parsers can misinterpret layouts like multi-column text, leading to corrupted data and incorrect outputs from the language model.
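The multi-column failure can be illustrated with a small sketch. It is not taken from the talk: the fragment coordinates, the `column_boundary` value, and both function names are hypothetical, and a real parser would derive layout from the document itself rather than from a fixed threshold.

```python
# Text fragments from a two-column page as (x, y, text).
# Hypothetical layout data, assumed for illustration only.
FRAGMENTS = [
    (0, 0, "Revenue grew"),
    (300, 0, "Costs fell"),
    (0, 1, "by 12% in Q3."),
    (300, 1, "by 4% in Q3."),
]


def naive_reading_order(fragments):
    """Read strictly row by row, ignoring the column layout."""
    ordered = sorted(fragments, key=lambda f: (f[1], f[0]))
    return " ".join(text for _, _, text in ordered)


def column_aware_reading_order(fragments, column_boundary=150):
    """Read the whole left column first, then the right column."""
    ordered = sorted(
        fragments, key=lambda f: (f[0] >= column_boundary, f[1], f[0])
    )
    return " ".join(text for _, _, text in ordered)


print(naive_reading_order(FRAGMENTS))
# Revenue grew Costs fell by 12% in Q3. by 4% in Q3.
print(column_aware_reading_order(FRAGMENTS))
# Revenue grew by 12% in Q3. Costs fell by 4% in Q3.
```

The row-by-row output attaches both percentages to the wrong sentence, which is the kind of corrupted text that would then be embedded and retrieved.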
#3 about 3 minutes
Using Docling to convert documents into structured formats
Docling is an open-source tool that acts like an advanced OCR service, converting various binary document formats into a structured, parsable tree.
#4 about 7 minutes
Demo of a basic RAG ingestion pipeline
A live demonstration shows how a Quarkus application uses Docling to ingest a PDF, generate embeddings, and store the resulting chunks and vectors in Redis.
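The ingestion steps (chunk, embed, store) can be sketched in a few lines. This is not the Quarkus application from the demo: the hashing-based `embed` function is a toy stand-in for a real embedding model, a plain dictionary stands in for Redis, and the input is assumed to be text that a converter such as Docling has already extracted. All function names are hypothetical.

```python
import hashlib
import math

DIMENSIONS = 64  # toy vector size, assumed for illustration


def embed(text):
    """Toy embedding: hash each word into a bucket, then L2-normalise."""
    vector = [0.0] * DIMENSIONS
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % DIMENSIONS] += 1.0
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]


def chunk(text, max_words=8):
    """Split text into fixed-size word windows."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def ingest(document_id, text, store):
    """Store each chunk with its vector under a per-chunk key."""
    for index, piece in enumerate(chunk(text)):
        store[f"{document_id}:{index}"] = {
            "text": piece,
            "vector": embed(piece),
        }


def search(query, store, top_k=1):
    """Return the texts of the chunks most similar to the query."""
    query_vector = embed(query)
    scored = [
        (sum(a * b for a, b in zip(query_vector, entry["vector"])), key)
        for key, entry in store.items()
    ]
    scored.sort(reverse=True)
    return [store[key]["text"] for _, key in scored[:top_k]]


store = {}  # in-memory stand-in for the vector store
ingest("report", "Docling converts PDF files into a structured tree. "
       "Redis stores the resulting chunks and their vectors.", store)
print(search("Redis stores the resulting chunks and their vectors.", store))
```

Because every vector is normalised, the dot product in `search` is the cosine similarity, which is the kind of distance measure a vector store typically relies on for retrieval.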
#5 about 3 minutes
Securing RAG against data poisoning and leaks
To prevent data poisoning and sensitive data leaks, it is crucial to sanitize documents, verify their signatures, and use tools for PII masking.
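Two of these safeguards can be sketched with the Python standard library. Both parts are assumptions rather than the tooling shown in the talk: the HMAC tag is a simplified stand-in for real document signatures (which usually rely on asymmetric keys and certificates), and the two regular expressions cover only simple email and phone formats, far less than a dedicated PII-masking tool.

```python
import hashlib
import hmac
import re

# Deliberately simple patterns; real PII detection needs much more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def mask_pii(text):
    """Replace email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def sign(document, key):
    """Compute an HMAC-SHA256 tag (stand-in for a document signature)."""
    return hmac.new(key, document, hashlib.sha256).hexdigest()


def is_authentic(document, tag, key):
    """Accept the document only if its tag matches, in constant time."""
    return hmac.compare_digest(sign(document, key), tag)


key = b"demo-key"  # hypothetical shared secret
original = b"Contact jane.doe@example.com or +49 170 1234567."
tag = sign(original, key)

print(is_authentic(original, tag, key))          # True
print(is_authentic(original + b"!", tag, key))   # False
print(mask_pii(original.decode("utf-8")))
# Contact [EMAIL] or [PHONE].
```

Checking authenticity before ingestion keeps tampered documents out of the store, and masking before embedding keeps personal data out of both the stored chunks and the vectors derived from them.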
#6 about 4 minutes
Mitigating vector store attacks and encryption challenges
Vector stores are vulnerable to attacks such as close vector modification and vector reversal. Standard encryption destroys the distance relationships between vectors that similarity search depends on, so protecting stored vectors requires specialized solutions.
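The property that matters here is that distances survive the transformation. The sketch below shows that property with a keyed shuffle of dimensions combined with sign flips. It is a toy and not a secure scheme, and it is not the solution used in the talk; the names `make_key` and `protect` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_key(dimensions, seed):
    """Derive a secret dimension order and sign pattern from a seed."""
    rng = random.Random(seed)
    order = list(range(dimensions))
    rng.shuffle(order)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dimensions)]
    return order, signs


def protect(vector, key):
    """Apply the keyed shuffle and sign flips (toy, NOT secure)."""
    order, signs = key
    return [signs[i] * vector[source] for i, source in enumerate(order)]


key = make_key(4, seed=7)
a = [0.1, 0.9, -0.3, 0.5]
b = [0.2, 0.8, -0.1, 0.4]

before = euclidean(a, b)
after = euclidean(protect(a, key), protect(b, key))
print(math.isclose(before, after))  # True
```

The transformed vectors can still be ranked by distance without the key. A conventional cipher offers no such guarantee, because its ciphertext is designed to look unrelated to the input, which is why nearest-neighbour search over conventionally encrypted vectors stops working.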
#7 about 5 minutes
Demo of a secure ingestion pipeline in action
A final demonstration showcases a secure pipeline that verifies document signatures, anonymizes sensitive data, and encrypts vectors before storing them.