Tanmay Bakshi

Oct 12, 2020 • WeAreDevelopers LIVE

What do language models really learn

Are language models truly creative, or just powerful mathematical optimizers? This talk reveals what LLMs actually learn beyond the hype.

#1 about 7 minutes

The fundamental challenge of modeling natural language

Language models aim to create intuitive human-computer interfaces, but this is difficult because language syntax doesn't fully capture semantic meaning.

#2 about 3 minutes

How deep learning models learn by transforming data

Deep learning works by performing a series of transformations on input data to warp its vector space until it becomes linearly separable.
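The idea of warping a space until it becomes linearly separable can be illustrated with the classic XOR problem. The sketch below is not taken from the talk: the `warp` function is a hand-written transformation standing in for what a trained network would learn, and the weight grid and the names `separates` and `linear_predict` are illustrative assumptions.

```python
from itertools import product

# XOR: the label is 1 only when exactly one input is 1.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def linear_predict(weights, bias, features):
    """Threshold a weighted sum: the simplest linear classifier."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def separates(weights, bias, transform):
    """True if the linear classifier labels every XOR point correctly."""
    return all(linear_predict(weights, bias, transform(x)) == y for x, y in XOR)

def identity(x):
    return x

def warp(x):
    """Hand-crafted stand-in for a learned transformation: add the product x1*x2."""
    x1, x2 = x
    return (x1, x2, x1 * x2)

# Search a small grid of weights and biases in the raw 2-D space.
grid = [i / 2 for i in range(-6, 7)]
raw_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if separates((w1, w2), b, identity)
]

# In the warped 3-D space a single linear rule is enough.
warped_ok = separates((1, 1, -2), -0.5, warp)

print(len(raw_solutions))  # 0
print(warped_ok)           # True
```

No line through the raw inputs classifies all four points, but one extra transformed feature makes a single linear rule sufficient. A deep network learns such transformations from data rather than having them written by hand.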

#3 about 3 minutes

Why the training objective is key to model behavior

The training objective, or incentive, dictates exactly what a model learns and can lead to unintended outcomes if not designed carefully.
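A small illustration of how the objective alone changes what is learned, assuming a made-up dataset and the simplest possible "model" (a single constant prediction). None of the names or numbers below come from the talk.

```python
# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 100]

# The "model" is one constant; training means choosing the best candidate.
candidates = range(0, 101)

def squared_error(c):
    """Objective A: sum of squared differences."""
    return sum((x - c) ** 2 for x in data)

def absolute_error(c):
    """Objective B: sum of absolute differences."""
    return sum(abs(x - c) for x in data)

best_for_squared = min(candidates, key=squared_error)
best_for_absolute = min(candidates, key=absolute_error)

print(best_for_squared)   # 22 (the mean, pulled toward the outlier)
print(best_for_absolute)  # 3  (the median, which ignores the outlier)
```

The data and the model are identical in both cases. Only the incentive differs, and the two "trained" models disagree sharply about what a typical value is.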

#4 about 8 minutes

From Word2Vec and LSTMs to modern transformers

The evolution from non-contextual embeddings like Word2Vec and slow, sequential models like LSTMs to the parallel, deeply contextual transformer architecture addressed major NLP challenges.

#5 about 7 minutes

A practical demo of a character-level BERT model

A scaled-down, character-level transformer model demonstrates the 'fill in the blank' pre-training task by predicting masked characters in artist names.
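The 'fill in the blank' task can be sketched as a data-preparation step: hide some characters and keep the originals as prediction targets. This is a simplified sketch, not the demo's code. The mask symbol, the 30% masking rate, the placeholder string, and the function names are all assumptions, and BERT's actual masking scheme has more cases than plain replacement.

```python
import random

MASK = "_"  # assumed mask symbol for this sketch

def mask_characters(text, mask_rate=0.3, seed=0):
    """Hide a fraction of characters; return the masked text and the hidden targets."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    n_masked = max(1, round(len(text) * mask_rate))
    positions = sorted(rng.sample(range(len(text)), n_masked))
    masked = list(text)
    targets = {}
    for pos in positions:
        targets[pos] = text[pos]  # what the model would be trained to predict
        masked[pos] = MASK
    return "".join(masked), targets

def restore(masked, targets):
    """Fill every masked position with its target (a perfect model's output)."""
    chars = list(masked)
    for pos, char in targets.items():
        chars[pos] = char
    return "".join(chars)

name = "example artist"  # placeholder, not a name from the demo
masked_name, targets = mask_characters(name)
print(masked_name)
print(targets)
print(restore(masked_name, targets) == name)  # True
```

During pre-training, the model sees only `masked_name` and is scored on how well it predicts the characters stored in `targets`.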

#6 about 2 minutes

What language models implicitly learn about language structure

By analyzing a model's internal weights, we can see it learns phonetic relationships and syntactic structures without ever being explicitly trained on them.

#7 about 7 minutes

Why current generative models don't truly 'write'

Generative models like GPT are excellent at predicting the next word based on statistical patterns but lack the underlying thought process required for true, creative writing.
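Next-word prediction from statistical patterns can be shown in its most basic form with a bigram counter. GPT is a large neural network, not a lookup table, so this is only a minimal stand-in under that assumption. The corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count which word follows which across all sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny made-up corpus.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
counts = train_bigram(corpus)

print(predict_next(counts, "the"))      # cat
print(predict_next(counts, "sat"))      # on
print(predict_next(counts, "unicorn"))  # None
```

The predictor produces plausible continuations without any plan for what the sentence should say, which is the limitation this chapter describes.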

#8 about 4 minutes

Exploring the future with Blank Language Models

Blank Language Models (BLMs) offer a new training approach by filling in text in any order, forcing the model to consider both past and future context.
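A rough sketch of what filling in text in any order can look like: start from a single blank, and at each step replace one blank with a word while optionally opening new blanks beside it. The actions below are scripted by hand as an assumption. In a real BLM a trained network would choose the blank, the word, and whether to create new blanks. The blank symbol and function name are hypothetical.

```python
BLANK = "__"  # assumed blank symbol for this sketch

def fill_step(canvas, blank_index, word, left_blank, right_blank):
    """Replace one blank with a word, optionally leaving new blanks on either side."""
    if canvas[blank_index] != BLANK:
        raise ValueError("position %d is not a blank" % blank_index)
    replacement = (
        ([BLANK] if left_blank else [])
        + [word]
        + ([BLANK] if right_blank else [])
    )
    return canvas[:blank_index] + replacement + canvas[blank_index + 1:]

# Hand-scripted actions standing in for model decisions:
# (blank index, word, create left blank?, create right blank?)
scripted_actions = [
    (0, "learn", True, True),        # __ learn __
    (2, "structure", False, False),  # __ learn structure
    (0, "models", True, False),      # __ models learn structure
    (0, "language", False, False),   # language models learn structure
]

canvas = [BLANK]
history = [list(canvas)]
for action in scripted_actions:
    canvas = fill_step(canvas, *action)
    history.append(list(canvas))

for state in history:
    print(" ".join(state))
```

The middle word is written first and the sentence grows outward, so each decision is made with context on both sides rather than only the words to the left.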

#9 about 3 minutes

The need for better tooling to accelerate ML research

The complexity of implementing novel architectures like BLMs highlights the need for better infrastructure and compiled languages like Swift for TensorFlow to speed up innovation.



