Your Elasticsearch query returns the top 10 results. But what if the real top result is missing entirely? Here's why.
#1 about 7 minutes
Understanding the CAP theorem for distributed systems
The CAP theorem states that a distributed data store can provide at most two of three guarantees at the same time: consistency, availability, and partition tolerance.
#2 about 3 minutes
Introducing the FAB theory for datastore tradeoffs
The FAB theory proposes another set of tradeoffs for data stores, where you can only pick two of three attributes: fast, accurate, or big.
#3 about 7 minutes
How terms aggregation trades accuracy for speed
By default, Elasticsearch's terms aggregation may return inaccurate counts: to improve performance, each shard considers only its top local results.
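The effect can be reproduced without a cluster. The following is a minimal sketch in plain Python, not Elasticsearch's actual implementation: each list stands in for one shard, and the names `distributed_terms_agg` and `exact_terms_agg` are made up for illustration. The parameters `size` and `shard_size` mirror the options of the same name on the terms aggregation, but the tiny values used here are chosen only to make the error visible; real defaults are larger.

```python
from collections import Counter


def distributed_terms_agg(shards, size, shard_size):
    """Merge only the top `shard_size` terms reported by each shard."""
    merged = Counter()
    for docs in shards:
        # Each shard sees only its own documents and reports its local top terms.
        for term, count in Counter(docs).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


def exact_terms_agg(shards, size):
    """Count every term across all shards before picking the top `size`."""
    total = Counter()
    for docs in shards:
        total.update(docs)
    return total.most_common(size)


# Hypothetical data: "c" is the runner-up globally, but third on every shard.
shards = [
    ["a"] * 10 + ["b"] * 6 + ["c"] * 5,
    ["a"] * 10 + ["d"] * 6 + ["c"] * 5,
    ["a"] * 10 + ["e"] * 6 + ["c"] * 5,
]

approx = distributed_terms_agg(shards, size=2, shard_size=2)
exact = exact_terms_agg(shards, size=2)

print("approximate:", approx)
print("exact:      ", exact)
```

In this toy data the term "c" has 15 occurrences overall, yet it never makes any shard's local top two, so it is missing from the merged result entirely. Raising `shard_size` in the sketch recovers it, at the cost of every shard returning more data.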
#4 about 8 minutes
Inconsistent relevance scores in distributed full-text search
Full-text search relevance scores using TF-IDF can be inconsistent because inverse document frequency is calculated per-shard, not globally.
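A small worked example shows how per-shard statistics skew the score. This is a sketch under stated assumptions: the `idf` function uses a common textbook variant of the formula rather than Lucene's exact scoring, and the document counts are invented for illustration.

```python
import math


def idf(doc_count, doc_freq):
    """Textbook inverse document frequency: rarer terms get a higher weight."""
    return math.log(doc_count / (1 + doc_freq)) + 1


# Hypothetical statistics for one search term on two shards of the same index.
shard_a = {"doc_count": 100, "doc_freq": 1}   # term is rare on shard A
shard_b = {"doc_count": 100, "doc_freq": 50}  # term is common on shard B

idf_a = idf(**shard_a)
idf_b = idf(**shard_b)

# What the weight would be if the statistics were computed over the whole index.
idf_global = idf(
    shard_a["doc_count"] + shard_b["doc_count"],
    shard_a["doc_freq"] + shard_b["doc_freq"],
)

print(f"shard A: {idf_a:.2f}, shard B: {idf_b:.2f}, global: {idf_global:.2f}")
```

Two identical documents would therefore receive different scores depending on which shard they happen to live on, because the same term counts as rare on shard A and as common on shard B.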
#5 about 2 minutes
Using a single shard to ensure data accuracy
Forcing an index to use a single shard guarantees accurate aggregations and relevance scores by eliminating distributed calculations, but sacrifices horizontal scaling.
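As a rough illustration, the request body for creating such an index could be built as follows. The setting `number_of_shards` is the Elasticsearch index setting that controls the primary shard count; the index name in the comment is hypothetical, and how the body is sent (client library, HTTP call) is left out on purpose.

```python
import json

# Body for an index-creation request, e.g. PUT /my-index (hypothetical index name).
index_settings = {
    "settings": {
        "index": {
            # One primary shard: aggregations and scoring see all documents at once.
            "number_of_shards": 1,
        }
    }
}

print(json.dumps(index_settings, indent=2))
```

The primary shard count is fixed when the index is created, so this is a decision to make up front rather than a switch to flip later.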
#6 about 1 minute
Why you must consciously choose your data tradeoffs
It is crucial to understand and explicitly choose the tradeoffs in your data systems, like those described by the CAP theorem and the FAB theory, to avoid unexpected behavior.