Mastering AI-Driven Problem Solving in Engineering with Observability
Stop guessing during outages. Get a step-by-step blueprint for debugging complex systems, from microservice failures to AI model hallucinations.
#1 · about 2 minutes
Understanding observability and the need for a process
Observability provides insight into system health and performance, but most teams still lack a methodical process for resolving issues in complex environments.
#2 · about 2 minutes
Navigating the complexity of highly distributed systems
A real-world example of a distributed trace highlights the challenges of debugging systems with thousands of microservices, databases, and daily deployments.
#3 · about 4 minutes
Understanding the four core telemetry data types
Effective problem-solving requires leveraging the distinct strengths of metrics, events, logs, and distributed traces to gain a complete picture of system behavior.
#4 · about 5 minutes
Key data sources and platform capabilities for observability
A comprehensive observability strategy involves monitoring all application layers and utilizing platform features like workloads, change tracking, and AI-driven intelligence.
#5 · about 1 minute
Prioritizing changes and errors for faster resolution
Insights from a Microsoft Azure study reveal that most production issues stem from software faults or bad data, making rollbacks a common and effective first solution.
#6 · about 6 minutes
A step-by-step framework for debugging complex systems
Follow a structured process for incident resolution by first checking for changes and errors, then examining local and remote dependencies before using traces to investigate further.
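The triage order above can be sketched as a simple ordered walk. This is a hedged illustration, not the speaker's actual tooling: the incident fields and check names are hypothetical stand-ins for real platform queries.

```python
def triage(incident: dict) -> dict:
    """Walk the checks in the framework's suggested order and
    return the first stage that yields a promising lead."""
    checks = [
        ("recent changes", incident.get("recent_deploys", [])),
        ("error spikes", incident.get("new_errors", [])),
        ("local dependencies", incident.get("unhealthy_local_deps", [])),
        ("remote dependencies", incident.get("unhealthy_remote_deps", [])),
    ]
    for stage, findings in checks:
        if findings:
            return {"stage": stage, "findings": findings}
    # Nothing obvious in changes, errors, or dependencies:
    # fall back to inspecting distributed traces directly.
    return {"stage": "inspect traces", "findings": []}

incident = {
    "recent_deploys": ["checkout v2.3.1"],
    "new_errors": ["504 from payments"],
}
result = triage(incident)
```

Here the recent deploy is surfaced first, matching the "check changes before anything else" heuristic from the previous chapter.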
#7 · about 3 minutes
Strategies for mitigating AI model hallucinations
Combat AI hallucinations by constraining model inputs and outputs, providing additional context through retrieval-augmented generation (RAG), and eventually fine-tuning the model.
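A minimal sketch of the RAG idea described above: retrieve relevant context, then constrain the model's prompt to that context. The toy keyword retriever and the document store are illustrative assumptions; a real system would use embedding search and a live LLM client.

```python
# Hypothetical knowledge base standing in for a real document store.
DOCS = {
    "refunds": "Refunds are processed within 5 business days.",
    "shipping": "Standard shipping takes 3-7 days.",
}

def retrieve(question: str) -> list[str]:
    """Toy retriever: return documents whose key appears in the question."""
    return [text for key, text in DOCS.items() if key in question.lower()]

def build_prompt(question: str) -> str:
    """Ground the model: supply retrieved context and constrain the output."""
    context = retrieve(question)
    grounding = "\n".join(context) if context else "No relevant documents found."
    return (
        "Answer using ONLY the context below. If the context does not "
        "contain the answer, say you don't know.\n\n"
        f"Context:\n{grounding}\n\nQuestion: {question}"
    )

prompt = build_prompt("How long do refunds take?")
```

Constraining the output ("only the context below") and supplying retrieved grounding are the two levers the chapter names; fine-tuning comes later, once prompt-level controls are exhausted.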
#8 · about 3 minutes
Deciding when to build versus buy LLM solutions
Evaluate the trade-offs between using consumption-based AI tools and building smaller, custom LLMs based on factors like request volume, cost, and data privacy.
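The request-volume trade-off lends itself to a back-of-the-envelope calculation. All prices below are made-up placeholders to show the break-even structure, not real vendor rates.

```python
def monthly_api_cost(requests: int, cost_per_1k: float) -> float:
    """Consumption-based pricing: pay per thousand requests."""
    return requests / 1000 * cost_per_1k

def monthly_selfhost_cost(gpu_hourly: float, hours: float = 730.0) -> float:
    """Self-hosting a smaller custom model: roughly a flat GPU bill."""
    return gpu_hourly * hours

def breakeven_requests(cost_per_1k: float, gpu_hourly: float) -> float:
    """Requests per month above which self-hosting becomes cheaper."""
    return monthly_selfhost_cost(gpu_hourly) / cost_per_1k * 1000

# Example with placeholder numbers: $0.02 per 1k requests vs a $1.50/hr GPU box.
threshold = breakeven_requests(0.02, 1.50)
```

Below the threshold, consumption pricing wins on cost; above it, a self-hosted model starts to pay off, and data-privacy requirements can shift the decision toward self-hosting regardless of volume.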