Mastering AI-Driven Problem Solving in Engineering with Observability
Stop guessing during outages. Get a step-by-step blueprint for debugging complex systems, from microservice failures to AI model hallucinations.
#1 · about 2 minutes
Understanding observability and the need for a process
Observability provides insight into system health and performance, but most teams still lack a methodical process for resolving issues in complex environments.
#2 · about 2 minutes
Navigating the complexity of highly distributed systems
A real-world example of a distributed trace highlights the challenges of debugging systems with thousands of microservices, databases, and daily deployments.
#3 · about 4 minutes
Understanding the four core telemetry data types
Effective problem-solving requires leveraging the distinct strengths of metrics, events, logs, and distributed traces to gain a complete picture of system behavior.
#4 · about 5 minutes
Key data sources and platform capabilities for observability
A comprehensive observability strategy involves monitoring all application layers and utilizing platform features like workloads, change tracking, and AI-driven intelligence.
#5 · about 1 minute
Prioritizing changes and errors for faster resolution
Insights from a Microsoft Azure study reveal that most production issues stem from software faults or bad data, making rollbacks a common and effective first solution.
#6 · about 6 minutes
A step-by-step framework for debugging complex systems
Follow a structured process for incident resolution by first checking for changes and errors, then examining local and remote dependencies before using traces to investigate further.
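The triage order above can be sketched as a simple ordered walk. This is a hedged illustration, not the speaker's actual tooling: the incident fields and check names are hypothetical stand-ins for real platform queries.

```python
def triage(incident: dict) -> dict:
    """Walk the checks in the framework's suggested order and
    return the first stage that yields a promising lead."""
    checks = [
        ("recent changes", incident.get("recent_deploys", [])),
        ("error spikes", incident.get("new_errors", [])),
        ("local dependencies", incident.get("unhealthy_local_deps", [])),
        ("remote dependencies", incident.get("unhealthy_remote_deps", [])),
    ]
    for stage, findings in checks:
        if findings:
            return {"stage": stage, "findings": findings}
    # Nothing obvious in changes, errors, or dependencies:
    # fall back to inspecting distributed traces directly.
    return {"stage": "inspect traces", "findings": []}

incident = {
    "recent_deploys": ["checkout v2.3.1"],
    "new_errors": ["504 from payments"],
}
result = triage(incident)
```

Here the recent deploy is surfaced first, matching the "check changes before anything else" heuristic from the previous chapter.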
#7 · about 3 minutes
Strategies for mitigating AI model hallucinations
Combat AI hallucinations by constraining model inputs and outputs, providing additional context through retrieval-augmented generation (RAG), and eventually fine-tuning the model.
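A minimal sketch of the RAG idea described above: retrieve relevant context, then constrain the model's prompt to that context. The toy keyword retriever and the document store are illustrative assumptions; a real system would use embedding search and a live LLM client.

```python
# Hypothetical knowledge base standing in for a real document store.
DOCS = {
    "refunds": "Refunds are processed within 5 business days.",
    "shipping": "Standard shipping takes 3-7 days.",
}

def retrieve(question: str) -> list[str]:
    """Toy retriever: return documents whose key appears in the question."""
    return [text for key, text in DOCS.items() if key in question.lower()]

def build_prompt(question: str) -> str:
    """Ground the model: supply retrieved context and constrain the output."""
    context = retrieve(question)
    grounding = "\n".join(context) if context else "No relevant documents found."
    return (
        "Answer using ONLY the context below. If the context does not "
        "contain the answer, say you don't know.\n\n"
        f"Context:\n{grounding}\n\nQuestion: {question}"
    )

prompt = build_prompt("How long do refunds take?")
```

Constraining the output ("only the context below") and supplying retrieved grounding are the two levers the chapter names; fine-tuning comes later, once prompt-level controls are exhausted.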
#8 · about 3 minutes
Deciding when to build versus buy LLM solutions
Evaluate the trade-offs between using consumption-based AI tools and building smaller, custom LLMs based on factors like request volume, cost, and data privacy.
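The request-volume trade-off lends itself to a back-of-the-envelope calculation. All prices below are made-up placeholders to show the break-even structure, not real vendor rates.

```python
def monthly_api_cost(requests: int, cost_per_1k: float) -> float:
    """Consumption-based pricing: pay per thousand requests."""
    return requests / 1000 * cost_per_1k

def monthly_selfhost_cost(gpu_hourly: float, hours: float = 730.0) -> float:
    """Self-hosting a smaller custom model: roughly a flat GPU bill."""
    return gpu_hourly * hours

def breakeven_requests(cost_per_1k: float, gpu_hourly: float) -> float:
    """Requests per month above which self-hosting becomes cheaper."""
    return monthly_selfhost_cost(gpu_hourly) / cost_per_1k * 1000

# Example with placeholder numbers: $0.02 per 1k requests vs a $1.50/hr GPU box.
threshold = breakeven_requests(0.02, 1.50)
```

Below the threshold, consumption pricing wins on cost; above it, a self-hosted model starts to pay off, and data-privacy requirements can shift the decision toward self-hosting regardless of volume.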