What is Statistics? (Michael I. Jordan) | AI Podcast Clips

AQUAPiHahVY • 2020-02-25

an absurd question but what is
statistics so the here it's a little bit
it's somewhere between math and science
and technology it's somewhere in that
convex hull so it's some principles that
allow you to make inferences that have
got some reason to be believed and also
principles that allow you to make decisions
where you can have some reason to
believe you're not gonna make errors so
all that requires some assumptions about
what do you mean by an error what do you
mean by you know the probabilities and
but and you know it struck me after you
start making some assumptions you're led
to conclusions that yes I can guarantee
that you know you you know if you do
this in this way your probability of making
an error will be small your probability of
continuing to not make errors over time
will be small and probability you found
something that's real will be small will
be high the decision-making is a big
part it may be the big part yeah so the
original so statistics you know short
history was that you know it kind of
goes back sort of as a formal
discipline you know 250 years or so it
was called inverse probability because
around that era probability was
developed sort of especially to
explain gambling situations of course
and interesting so you would say well
given the state of nature is this
there's a certain roulette wheel that has a
certain mechanism in it what kind of
outcomes do I expect to see and especially
if I do things long long amounts of time
what outcomes will I see and the
physicists started to pay attention to this
and then people said well given and
let's turn the problem around what if I
saw certain outcomes could I infer what
the underlying mechanism was that's an
inverse problem and in fact for quite a
while statistics was called inverse
probability that was the name of the
field and I believe that it was Laplace
who was working in Napoleon's government
who was trying to needed to do a census
of France learn about the people there
so he went and gathered data and he
analyzed that data to determine policy
and said let's call this field that does
this kind of thing statistics cuz the
the word state is in there in French
that's état but you know it's the
study of data for the state
anyway that caught on and it's been
called statistics ever since but but by the
time it got formalized it was sort of in
the 30s and around that time there was
game theory and decision theory
developed nearby and people in that era
didn't think of themselves as either
computer science or statistics or
control or econ they were all they were
all the above and so you know von Neumann
is developing game theory but also
thinking of it as decision theory Wald
is an econometrician developing
decision theory and then you know turned
that into statistics and so it's all
about here's a here's not just data and
you analyze it here's a loss function
here's what you care about here's the
question you're trying to ask here is a
probability model and here's the risk
you will face if you make certain
decisions and to this day in most
advanced statistical curricula you teach
decision theory as the starting point
and then it branches out into the two
branches of Bayesian and frequentist but
um that's it's all about decisions in
statistics what is the most beautiful
mysterious maybe surprising idea that
you've come across yeah good question um
I mean there's a bunch of surprising
ones there's something it's way too
technical for this thing but something
called James-Stein estimation which is
kind of surprising and really takes time
to wrap your head around can you try to
make me I think I don't want to even
want to try um let me just say a
colleague Steve, Stephen Stigler at the
University of Chicago wrote a really
beautiful paper on James-Stein
estimation which helps to, it's viewed as a
paradox
it kind of defeats the mind's attempts to
understand it but you can and Steve has
a nice perspective on that there so one
of the troubles with statistics is that
it's like in physics or in quantum
physics you have multiple
interpretations there's a wave and
particle duality in physics and you get
used to that over time but it still kind
of haunts you that you don't really you
know quite understand the relationship
the electron's a wave and the electron's a
particle well hmm well the same thing
happens here there is Bayesian ways of
thinking and frequentist and they are
different they they all they sometimes
become sort of the same in practice but
they're philosophically different and then in
some practice they are not the same at
all they give you rather different
answers and so it is very much like wave
and particle duality and that is
something you have to kind of get used
to in the field
can you define Bayesian and frequentist
yeah decision theory you can make I have
a like I have a video that people could
see it's called are you a Bayesian or a
frequentist and kind of help try to make
it really clear it comes from decision
theory so you know decision theory you
talk about loss functions which are
function of data X and parameter theta
as well a function of two arguments okay
now neither one of those arguments is
known you don't know the data a priori
it's random and the parameter is unknown
all right so you have this function of
two things you don't know and you're
trying to say I want that function to be
small I want small loss all right well
what are you gonna do so you sort of say
well I'm gonna average over these
quantities or maximize over them or
something so that you know I turn that
uncertainty into something certain so
you could look at the first argument and
average over it or you could look at the
second argument and average over it that's
Bayesian versus frequentist so the the
frequentist says I'm gonna look at the X
the data and I'm gonna take that as
random and I'm gonna average over the
distribution so I take the expectation of the
loss under X theta is held fixed alright
that's called the risk and so it's
looking at all the other data sets you
could get all right and saying how well
will a certain procedure do under all
those data sets
that's called a frequentist guarantee
all right so I think it is very
appropriate when like you're building a
piece of software and you're shipping it
out there and people will use it on all
kinds of data sets you want to have a
stamp a guarantee on it that as people
run it on many many data sets that you
never even thought about that
ninety-five percent of the time it will do
the right thing
perfectly reasonable the Bayesian
perspective says well no I'm gonna look
at the other argument of the loss
function the theta part ok that's
unknown and I'm uncertain about it so I
could have my own personal probability
for what it is
you know how many tall people are there
out there I'm trying to infer the
average height of the population well I
have an idea of roughly what the height
is so I'm gonna average over the theta
so now that loss function has only now
again one argument's gone now it's a
function of X and that's what a Bayesian
does is they say well let's just focus
on the particular X
we got the data set we got we condition
on that conditional on the X I say
something about my loss
that's a Bayesian approach to things and
the Bayesian will argue that it's not
relevant to look at all the other data
sets you could have gotten and average
over them the frequentist approach it's
really only the data set you got all
right and I do agree with that
especially in situations where you're
working with a scientist you can learn a
lot about the domain and you really only
focus on certain kinds of data and you
gathered your data and you make
inferences I don't agree with it though
that it you know in the sense that there
are needs for frequentist guarantees
you're writing software people are using
it out there you want to say something
so these two things have got to fight
each other a little bit but they have to
blend so long story short there's a set
of ideas that are right in the middle
that are called empirical Bayes and
empirical Bayes sort of starts with the
Bayesian framework it's it's kind of
arguably philosophically more you know
reasonable and kosher write down a bunch
of the math that kind of flows from that
and then realize there's a bunch of
things you don't know because it's the
real world and you don't know everything
so you're uncertain about certain
quantities at that point ask is there a
reasonable way to plug in an estimate
for those things okay and in some cases
there's quite a reasonable thing to do
to plug in there's a natural thing you
can observe in the world that you can
plug in and then do a little bit more
mathematics and assure yourself it's
really good so based on math or based on human
expertise what are good
they're both going in the Bayesian
framework allows you to put a lot of
human expertise in but the math kind of
guides you along that path and then kind
of reassures at the end you could put
that stamp of approval under certain
assumptions this thing will work so
but you asked the question what was my favorite
you know or what was the most surprising nice
idea so one that is more accessible is
something called false discovery rate
which is you know you're making not just
one hypothesis test or making one
decision you're making a whole bag of
them and in that bag of decisions you
look at the ones where you made a
discovery you announced that something
interesting had happened all right that's
gonna be some subset of your big bag and in
the ones you made a discovery which
subset of those are bad they're false
false discoveries you'd like the fraction
of your false discoveries among
discoveries to be small that's a
different criterion than accuracy or
precision or recall or sensitivity and
specificity it's a different quantity
those latter ones that are almost all of
them have more of a frequentist flavor
they say given the truth is that the
null hypothesis is true here's what
accuracy I would get or given that the
alternative is true here's what I would
get it's kind of going forward from the
state of nature to the data the Bayesian
goes the other direction from the data
back to the state of nature and that's
actually what false discovery rate is it
says given you made a discovery
okay that's conditioned on your data
what's the probability of the hypothesis
it's going the other direction and so
the classical frequentist looks at that and says
I can't know that there's some priors
needed in that and the empirical
Bayesian goes ahead and plows forward
and starts writing down these formulas
and realizes at some point some of those
things can actually be estimated in a
reasonable way no and so it's kind of
it's a beautiful set of ideas so this
kind of line of argument has come out it's
not certainly mine but it sort of came
out from Robbins around 1960 Brad Efron
has written beautifully about this in
various papers and books and and the FDR
is you know Benjamini in Israel
John Storey did this Bayesian
interpretation and so on so I've just
absorbed these things over the years and
find it a very healthy way to think
about statistics
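
As a rough illustration of the false discovery rate idea described above, here is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. The speaker names FDR and Benjamini but does not walk through any procedure, so the choice of this rule, the function name `benjamini_hochberg`, and the example p-values are all assumptions made for illustration. Under independent tests the rule is known to keep the expected fraction of false discoveries among all discoveries at or below `alpha`.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True means it is declared a discovery."""
    m = len(p_values)
    # Indices of the hypotheses, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank

    # Everything ranked at or below the cutoff is a discovery.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests run together (a "bag" of decisions).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
discoveries = benjamini_hochberg(p_values, alpha=0.05)
print(discoveries)
```

With these made-up p-values only the two smallest are declared discoveries, even though five of them fall below 0.05 individually. The threshold each p-value must clear depends on its rank within the whole bag, which is what makes this a criterion about the set of discoveries rather than about any single test.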

Summary


# Uncovering the Essence of Statistics: From Decision Theory to the Bayesian vs. Frequentist Debate

### Executive Summary
This video discusses the definition and evolution of statistics as a discipline that sits somewhere between mathematics, science, and technology. It opens with a brief history of statistics and the formalization of decision theory in the 1930s, then explains the fundamental difference between the frequentist and Bayesian approaches. It also presents empirical Bayes as a middle ground and introduces the false discovery rate (FDR), a criterion for making many data-driven decisions at once.

### Key Takeaways
*   **Definition of statistics:** Statistics is a set of principles for making inferences and decisions that there is reason to trust, with a small probability of error. This requires assumptions about what counts as an error and about the probabilities involved.
*   **Origins:** As a formal discipline, statistics goes back about 250 years and was first known as "inverse probability". The name "statistics" comes from the word "state" (French *état*): the study of data for the state, originally for a census.
*   **Decision theory:** Statistics was formalized around the 1930s, alongside game theory and decision theory, and is built around loss functions and risk.
*   **Frequentist vs. Bayesian:**
    *   *Frequentist*: treats the data as random and the parameter as fixed, and gives guarantees over all the datasets a procedure might be run on (suited to shipped software).
    *   *Bayesian*: treats the parameter as uncertain and conditions on the specific dataset that was obtained (suited to scientific inference in a well-understood domain).
*   **Empirical Bayes:** A middle ground that starts from the Bayesian framework, plugs in estimates from observed data for the quantities that are unknown, and then uses further mathematics to check that the result works under stated assumptions.
*   **False discovery rate (FDR):** A criterion for running many hypothesis tests at once: the fraction of false discoveries among the discoveries announced.
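
The empirical Bayes recipe above (write down the Bayesian answer, then plug in an estimate for what you do not know) can be sketched with James-Stein estimation, which the speaker mentions separately. Linking the two is a standard textbook reading rather than something stated in the video, so treat it as an assumption of this sketch. If each mean is given a Normal(0, A) prior and each observation has unit variance, the Bayesian estimate is `(1 - 1/(1 + A)) * x`; since A is unknown, `(p - 2) / sum(x_i^2)` is plugged in as an estimate of `1/(1 + A)`. The function names, the positive-part variant, and the simulated means are all hypothetical.

```python
import random


def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein estimate: shrink the observations toward 0."""
    p = len(x)
    norm2 = sum(v * v for v in x)
    if norm2 == 0.0:
        return list(x)
    # Plug-in estimate of the Bayesian shrinkage factor, clipped at 0.
    shrink = max(0.0, 1.0 - (p - 2) * sigma2 / norm2)
    return [shrink * v for v in x]


def total_squared_error(estimate, theta):
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


rng = random.Random(1)
p = 50
theta = [rng.gauss(0.0, 1.0) for _ in range(p)]  # hypothetical true means, held fixed

# Check the estimator the frequentist way: average the loss over many datasets.
n_trials = 2000
raw_total = 0.0
js_total = 0.0
for _ in range(n_trials):
    x = [rng.gauss(t, 1.0) for t in theta]  # one noisy observation per mean
    raw_total += total_squared_error(x, theta)
    js_total += total_squared_error(james_stein(x), theta)

raw_risk = raw_total / n_trials
js_risk = js_total / n_trials
print(f"risk of using the raw observations: {raw_risk:.1f}")
print(f"risk of the James-Stein estimate:   {js_risk:.1f}")
```

The simulation holds the true means fixed and averages over datasets, so an estimator motivated by Bayesian reasoning is being checked with a frequentist risk calculation. That mirrors the blend of the two viewpoints described in the video.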

### Detailed Breakdown

#### 1. Definition and History of Statistics
Statistics sits somewhere between mathematics, science, and technology (in their "convex hull", as the speaker puts it). At its core it is a set of principles for making inferences and decisions that there is reason to trust, with a small probability of error, which requires assumptions about what an error is and about the probabilities involved.
*   **Early history:** As a formal discipline, statistics is about 250 years old. It was first known as "inverse probability": ordinary probability predicts outcomes from a known mechanism, while inverse probability tries to infer the mechanism from the outcomes observed.
*   **Etymology:** According to the speaker, it was Laplace, working in Napoleon's government on a census of France, who named the field. The name "statistics" comes from "state" (French *état*), because it was the study of data for the state.

#### 2. Decision Theory and Formalization
Around the 1930s, statistics was formalized alongside game theory and decision theory. Figures such as von Neumann and Wald played major roles.
*   **Main components:** Decision theory is the starting point in most advanced statistics curricula. Its elements are the data, a loss function, the question being asked, a probability model, and the risk incurred by a given decision.
*   **Two main branches:** From this starting point, statistics branches into two schools: frequentist and Bayesian.
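
A compact way to write these components down, in standard decision-theoretic notation (the symbols are notational assumptions, not taken from the video): let $X \sim P_\theta$ be the data, $\delta$ a procedure that maps data to a decision, and $L(\theta, \delta(X))$ the loss. The risk is the expected loss over datasets, with $\theta$ held fixed:

$$
R(\theta, \delta) = \mathbb{E}_{X \sim P_\theta}\big[ L(\theta, \delta(X)) \big]
$$

The Bayesian counterpart conditions on the observed data $x$ and averages over $\theta$ under a posterior distribution $\pi(\theta \mid x)$:

$$
\rho(\delta \mid x) = \mathbb{E}_{\theta \sim \pi(\cdot \mid x)}\big[ L(\theta, \delta(x)) \big]
$$

Both start from the same two-argument loss. They differ only in which argument is averaged out and which is held fixed.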

#### 3. Comparing Frequentist and Bayesian
The difference between the two can be explained through decision theory, where the loss is a function of the data (X) and the parameter (theta):
*   **Frequentist approach:**
    *   Treats the data (X) as random.
    *   Treats the parameter (theta) as fixed.
    *   Averages the loss over the distribution of the data X.
    *   Focuses on "risk" and on performance guarantees across all the datasets that could have been obtained.
    *   *Use:* Well suited to shipping software or to general-purpose guarantees.
*   **Bayesian approach:**
    *   Treats the parameter (theta) as unknown and uncertain.
    *   Averages the loss over the parameter theta.
    *   Conditions on the specific dataset that was obtained.
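
The two lists above can be made concrete with a small Python sketch that estimates an average height under squared-error loss. Everything specific here is an assumption for illustration: the Normal model, the Normal prior, the numbers, and the function names. The frequentist calculation fixes theta and averages the loss over simulated datasets. The Bayesian calculation fixes one observed dataset and averages the loss over theta.

```python
import random
import statistics


def frequentist_risk(estimator, theta, sigma, n, n_datasets, rng):
    """Monte Carlo estimate of E_X[(estimator(X) - theta)^2], theta held fixed."""
    total = 0.0
    for _ in range(n_datasets):
        x = [rng.gauss(theta, sigma) for _ in range(n)]
        total += (estimator(x) - theta) ** 2
    return total / n_datasets


def posterior(x, sigma, prior_mean, prior_sd):
    """Posterior mean and variance of theta for a Normal prior and Normal data."""
    precision = 1.0 / prior_sd**2 + len(x) / sigma**2
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_sd**2 + sum(x) / sigma**2)
    return post_mean, post_var


def posterior_expected_loss(estimate, x, sigma, prior_mean, prior_sd):
    """E_theta[(estimate - theta)^2 | x]: x is held fixed, theta is averaged out."""
    post_mean, post_var = posterior(x, sigma, prior_mean, prior_sd)
    return post_var + (estimate - post_mean) ** 2


rng = random.Random(0)
sigma, n = 10.0, 25  # hypothetical: heights in cm, 25 people measured

# Frequentist view: fix theta, average the loss over many possible datasets.
risk = frequentist_risk(statistics.fmean, theta=170.0, sigma=sigma,
                        n=n, n_datasets=20000, rng=rng)

# Bayesian view: fix the one dataset we observed, average over theta.
observed = [rng.gauss(170.0, sigma) for _ in range(n)]
post_mean, post_var = posterior(observed, sigma, prior_mean=165.0, prior_sd=20.0)
loss_of_sample_mean = posterior_expected_loss(
    statistics.fmean(observed), observed, sigma, 165.0, 20.0)
loss_of_post_mean = posterior_expected_loss(
    post_mean, observed, sigma, 165.0, 20.0)

print(f"frequentist risk of the sample mean: {risk:.2f} (theory: {sigma**2 / n:.2f})")
print(f"posterior expected loss, sample mean:    {loss_of_sample_mean:.3f}")
print(f"posterior expected loss, posterior mean: {loss_of_post_mean:.3f}")
```

The first number is a statement about the procedure across datasets never seen. The last two are statements about a single dataset given a prior. Under squared-error loss the posterior mean minimizes the posterior expected loss, which is why it never does worse than the sample mean in the second comparison.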

## Conclusion & Closing Message
Statistics has grown from an administrative tool of the state into a scientific foundation for decision-making. Understanding decision theory, the fundamental difference between the frequentist and Bayesian approaches, and methods such as empirical Bayes and FDR helps keep errors in data-based inference small. Which approach fits best depends on the context of the problem: long-run performance guarantees in one case, analysis of a specific scientific dataset in another.
