George Hotz: Comma.ai, OpenPilot, and Autonomous Vehicles

George Hotz: Comma.ai, OpenPilot, and Autonomous Vehicles | Lex Fridman Podcast #31

iwcYp-XT7UI • 2019-08-05

the following is a conversation with
George Hotz he's the founder of comma.ai
a machine learning based vehicle
automation company he is most certainly
an outspoken personality in the field of
AI and technology in general he first
gained recognition for being the first
person to carrier-unlock an iPhone and
since then he's done quite a few
interesting things at the intersection
of hardware and software this is the
artificial intelligence podcast if you
enjoy it subscribe on YouTube give it
five stars on iTunes support it on
patreon or simply connect with me on
Twitter at lexfridman spelled F-R-I-D-M
A-N and I'd like to give a special
thank you to Jennifer from Canada for
her support of the podcast on patreon
merci beaucoup Jennifer she's been a
friend and an engineering colleague for
many years since I was in grad school
your support means a lot and inspires me
to keep this series going and now here's
my conversation with George Hotz do you
think we're living in a simulation
yes but it may be unfalsifiable what do
you mean by unfalsifiable so if the
simulation is designed in such a way
that they did like a formal proof to
show that no information can get in and
out and if their hardware is designed
for anything in the simulation to
always keep the hardware in spec it may
be impossible to prove whether we're in
a simulation or not
so they've designed it such that it's a
closed system you can't get outside of
the system
well maybe it's one of three worlds
we're either in a simulation which can
be exploited we're in a simulation which
not only can't be exploited but like the
same thing's true about VMs a really
well-designed VM you can't even detect
if you're in a VM or not that's
brilliant
so we're yeah so the simulation is
running in a virtual machine but now in
reality all VMs have ways to detect
that's the point I mean is it
yeah you've done quite a bit of hacking
yourself and so you should know that
really any complicated system will have
ways in and out so this isn't
necessarily true going forward I spent
my time away from comma I learned
Coq it's dependently typed like it's a
language for writing math proofs and if
you write code that compiles in a
language like that
it is correct by definition the types
check it's correct and so it's possible
that the simulation is written in a
language like this in which case yeah
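To illustrate the "types check, it's correct" point, here is a minimal sketch in Lean 4, used purely as one example of a dependently typed language. The names Vec and Vec.head are illustrative and not from the conversation. The length of the vector is part of its type, so asking for the head of an empty vector is rejected by the type checker instead of failing at run time.

```lean
-- A list whose length is tracked in its type.
inductive Vec (α : Type) : Nat → Type where
  | nil : Vec α 0
  | cons {n : Nat} : α → Vec α n → Vec α (n + 1)

-- `head` only accepts vectors of length n + 1, so no empty case is needed.
def Vec.head {α : Type} {n : Nat} : Vec α (n + 1) → α
  | .cons x _ => x

-- Type checks: the vector has length 1.
example : Nat := Vec.head (Vec.cons 7 Vec.nil)
```

Calling Vec.head on Vec.nil does not compile, which is the sense in which code that type checks here is correct with respect to the property stated in its type.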
yeah but that can't be sufficiently
expressive a language like that oh it
can it can be yeah okay well so all
right so the simulation doesn't have to
be Turing complete if it has a scheduled
end date looks like it does actually
with entropy and you know I don't think
that a simulation that results in
something as complicated as the universe
would have a formal proof of correctness
right it's possible of course we have
no idea how good their tooling is and we
have no idea how complicated the
universe computer really is it may be
quite simple it's just very large right
it's very it's definitely very large but
the fundamental rules might be super
simple yeah Conway's Game of Life kind of
stuff right so if you could hack it so
imagine the simulation that is hackable
if you could hack it what would you
change about the you know like how would
you approach hacking a simulation the
reason I gave that talk by the way I'm
not familiar with the talk you gave I
just read that you talked about escaping
the simulation or something like that so maybe
you can tell me a little bit about the
theme and the message there too it wasn't
a very practical talk about how to
actually escape a simulation it was more
about a way of restructuring an
us-versus-them narrative if we continue
on the path we're going with technology
I think we're in big trouble
like as a species and not just as a
species but even as me as an individual
member of the species so if we could
change rhetoric to be more like to think
upwards like to think about that we're
in a simulation and how we could get out
already we'd be on the right path what
would you actually do once you do that well I
assume I would have acquired way more
intelligence in the process of doing
that so I'll just ask that so the the
thinking upwards what kind of ideas what
kind of breakthrough ideas do you think
thinking in that way could inspire and
what did you say upwards upwards into
space are you thinking sort of
exploration in all forms the space
narrative that held for the modernist
generation doesn't hold as well for the
postmodern generation
what's the space narrative are we talking
about the same space the three-dimensional
space no no space like going up space like
building like Elon Musk like we're
gonna build rockets we're gonna go to
Mars we're gonna colonize the universe
and the narrative you're referring I was born
in the Soviet Union you're referring to
the race to space the race to space
explore okay that was a great modernist
narrative
it doesn't seem to hold the same weight
in today's culture I'm hoping for good
postmodern narratives that replace it so
think let's think so you work a lot with
AI so AI is one formulation of that
narrative there could be also I don't
know how much you do in VR and AR
yeah that's another I know less
about it but every time I play with it
and our research it's fascinating that
virtual world are you are you interested
in the virtual world I would like to
move to a virtual reality in terms of
your work
no I would like to physically move there
the apartment I can rent in the cloud is
way better than the apartment I can rent
in the real world well it's all relative
isn't it because others will have very
nice apartments too so you'll be
inferior in the virtual world that's not
how I view the world right I don't view
the world I mean it's a very like
almost zero-sum-ish way to view the
world say like my great apartment isn't
great because my neighbor has one too no
my great apartment is great because like
look at this dishwasher man yeah you
just touch the dish and it's washed
right and that is great in and of itself
if I have the only apartment or if
everybody had the apartment I don't care
so you have fundamental gratitude
the world first learned of geohot George
Hotz in August 2007 maybe before then
but certainly in August 2007 when you
were the first person to carrier
unlock an iPhone how did you get into
hacking what was the first system you
discovered vulnerabilities for and broke
into so that was really kind of the
first thing I had a book in
2006 called Gray Hat Hacking and I guess
I realized that if you acquired these
sort of powers you could control the
world but I didn't really know that much
about computers back then I started with
electronics the first iPhone hack was
physical hardware you had to open it
up and pull an address line high and it
was because I didn't really know about
software exploitation I learned that all
in the next few years and I got very
good at it but back then I knew about
like how memory
chips are connected to processors and
you knew about software and programming
I didn't didn't know oh really so
your view of the world and computers
was physical was hardware
actually if you read the code that I
released with that
in August 2007 it's atrocious
what language was it C C yes and in a
broken sort of state machine-esque C I didn't
know how to program yeah so how did you
learn to program
what was your journey cuz I mean we'll
talk about it you've live streamed some of
your programming this is a chaotic
beautiful mess how did you arrive at
that years and years of practice I
interned at Google the summer
after the iPhone unlock and I did a
contract for them where I built hardware
for for Street View and I wrote a
software library to interact with it and
it was terrible code and for the first
time I got feedback from people who I
respected saying you know like don't
write code like this now of course just
getting that feedback is not enough the
way that I really got good was I wanted
to write this thing like that could
emulate and then visualize like ARM
binaries because I wanted to hack the
iPhone better and I didn't like that I
couldn't like see what that I couldn't
single step through the processor
because I had no debugger on there
especially for the low level things like
the boot ROM and the bootloader so I
tried to build this tool to do it
and I built the tool once and it was
terrible I built the tool a second time
it was terrible
I built the tool a third time this was by the
time I was at Facebook it was kind of
okay
and then I built the tool a fourth time
when I was a Google intern again in 2014
and that was the first time I was like
this is finally usable how do you
pronounce this QIRA QIRA yeah
so it's essentially the most efficient
way to visualize the change of state of
the computer as the program is running
that's what I mean by debugger yeah it's
a timeless debugger so you can rewind
just as easily as going forward think
about if you're using gdb you have to
put a watch on a variable if you want to
see if that variable changes in QIRA
you can just click on that variable and
then it shows every single time when
that variable was changed or accessed
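The idea described above can be sketched with a toy run log in Python. This is an illustration of the concept only, not how the actual debugger is implemented; the TraceLog class and its method names are made up for this example. Every write is recorded with a step number, so the full history of a variable is available without setting a watchpoint in advance, and any earlier state can be rebuilt.

```python
from collections import defaultdict


class TraceLog:
    """Toy run log: records every write so state can be inspected at any step."""

    def __init__(self):
        self.step = 0
        self.writes = defaultdict(list)  # name -> [(step, value), ...]

    def write(self, name, value):
        """Record one write and advance the step counter."""
        self.step += 1
        self.writes[name].append((self.step, value))

    def history(self, name):
        """Every (step, value) at which `name` was written."""
        return list(self.writes[name])

    def state_at(self, step):
        """Rebuild all variable values as they were after `step` writes."""
        state = {}
        for name, events in self.writes.items():
            for s, value in events:  # events are stored in step order
                if s <= step:
                    state[name] = value
        return state


log = TraceLog()
log.write("x", 1)
log.write("y", 10)
log.write("x", 2)
log.write("x", 3)

print(log.history("x"))  # every change to x
print(log.state_at(2))   # "rewind" to just after the second write
```

A real tool has to record register and memory writes for every executed instruction, which is why log size becomes the limiting factor for large programs.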
think about it like git for your
computer's run log so there's
like a deep log of the state of the
computer as the program runs and you can
rewind why isn't that maybe it is maybe
you can educate me why isn't that kind
of debugging used more often ah because
the tooling is bad
well two things one if you're trying to
debug chrome chrome is a 200 megabyte
binary that runs slowly on desktops so
that's going to be really hard to use
for that but it's really good to use for
like CTFs and for boot roms and for
small parts of code so it's it's hard if
you're trying to debug like massive
systems what's the CTF and what's the
boot ROM the boot ROM is the first code
that executes it's the minute you give
power to your iPhone okay and CTFs were
these competitions that I played capture
the flag capture the flag I was going
to ask you about that what are those
I watched a couple videos on
YouTube those look fascinating what have
you learned about maybe at the high
level of vulnerability of systems from
these competitions the like I feel like
like in the heyday of CTFs you had all
of the best security people in the world
challenging each other and coming up
with new toy exploitable things over
here and then everybody okay who can
break it and when you break it you get
like there's like a file on the server
called flag and then there's a program
running listening on a socket that's
vulnerable so you write an exploit you
get shell and then you cat flag and then
you type the flag into like a web-based
scoreboard and you get points so the
goal is essentially to find an exploit
in the system that allows you to run
shell to run arbitrary code on that
system that's one of the categories
that's like the pwnable category
pwnable
yeah pwnable it's like you know you
pwn the program it's a program
yeah yeah you know first of all I
apologize I'm gonna I'm gonna say it's
because I'm Russian but maybe you can
help educate me some video game like
misspelled own way back in the day and
there's just I wonder if there's a
definition I'll have to go to urban
dictionary for it okay so what was the
heyday of CTFs by the way but was it
what decade are we talking about I think
like I mean maybe I'm biased because
it's the era that that that I played but
like 2011 to 2015 because the modern CTF
scene is similar to the modern
competitive programming scene you have
people who like do drills you have
people who practice and then once you've
done that you've turned it less into a
game of generic computer skill and more
into a game of okay you memorize you you
drill on these five categories and then
before that it wasn't it didn't have
like as much attention as it had I don't
know they were like I won $30,000 once
in Korea for one of these competitions
holy crap yeah they were they were that so that
means I mean money's money but that
means there was probably good people
there exactly yeah are the challenges
human constructed or are they grounded
in some real flaws in real systems
usually they're human constructed but
they're usually inspired by real flaws
what kind of systems I imagine it's
really focused on mobile like what has
vulnerabilities these days is it
primarily mobile systems like Android
oh everything does yeah of course the
price has kind of gone up because less
and less people can find them and what's
happened in security is now if you want
to like jailbreak an iPhone you don't
need one exploit anymore you need nine
nine chained together what would you mean yeah
Wow okay so it's really so what's the
but what's the benefit speaking higher
level philosophically about hacking I
mean it sounds from everything I've seen
about you you just love the challenge
and you don't want to do anything you
don't want to bring that exploit out
into the world and do any actual let
it run wild you just want to solve it
and then you go on to the next thing oh
yeah I mean doing criminal stuff's not
really worth it and I'll actually use
the same argument for why I don't do
defense for why I don't do crime
if you want to defend a system say the
system has ten holes right if you find
nine of those holes as a defender you
still lose because the attacker gets in
through the last one if you're an
attacker you only have to find one out
of the ten but if you're a criminal if
you log on with a VPN nine out of the
ten times but one time you forget you're
done because you're caught okay because
you only have to mess up once to be
caught as a criminal yeah that's why I'm
not a criminal
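The asymmetry in this argument can be made concrete with a little arithmetic. The sketch below is a simplification: it assumes each attempt is independent and reads "nine out of ten times" as a 0.9 chance of being careful on any single attempt. The function name is made up for this example.

```python
def p_never_caught(p_careful: float, attempts: int) -> float:
    """Probability of zero slip-ups across `attempts` independent tries."""
    return p_careful ** attempts


# One slip is enough to be caught, so the odds decay geometrically.
for n in (1, 10, 50):
    print(n, round(p_never_caught(0.9, n), 4))
```

Under these assumptions the chance of a clean record drops to roughly a third after ten attempts and is well under one percent after fifty, which matches the point that anyone who has to be right every time eventually loses.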
but okay let me I was having a
discussion with somebody just at a high
level about nuclear weapons actually why
we haven't blown ourselves up yet
and my feeling is all the smart people
in the world look at the distribution of
smart people smart people are generally
good and then this other person I was
talking to Sean Carroll the physicist
and he was saying no good and bad
people are evenly distributed amongst
everybody my sense was good hackers are
in general good people and they don't
want to mess with the world what's your
sense I'm not even sure about that like
I have a nice life crime wouldn't get me
anything
but if you're good and you have these
skills you probably have a nice life too
right like you can use them for other things
but is there an ethical is there some is
there a little voice in your head that
says well yeah if you could hack
something to where you could hurt people
and you could earn a lot of money doing
it though not hurt physically perhaps
but disrupt their life in some kind of way
isn't there a little voice that says
well two things one I don't really care
about money
so like the money wouldn't be an
incentive the thrill might be an
incentive but when I was 19 I read Crime
and Punishment right that was another
that was another great one that talked
me out of ever really doing crime cuz
it's like that's gonna be me I'd get
away with it but it would just run through my
head even if I got away with it you know
and then you do crime for long enough
you'll never get away with it that's
right in the end that's a good reason to
be good I wouldn't say good I just say
I'm not bad you're a talented programmer
and a hacker in a good positive sense of
the word you've played around
found vulnerabilities in various systems
what have you learned broadly about the
design of systems and so on from that
from that whole process you learn to not
take things for what people say they are
but you look at things for what they
actually are
yeah I understand that's what you tell
me it is but what does it do man and you
have nice visualization tools to really
know what it's really doing oh I wish
I'm a better programmer now than I was
in 2014 I said QIRA that was the first
tool that I wrote that was usable I
wouldn't say the code was great I still
wouldn't say my code is great so how was
your evolution as a programmer except
practice you started with C at
which point did you pick up Python
because you're pretty big in Python
now yeah in college I went
to Carnegie Mellon when I was 22 I
went back I'm like I'm gonna take all
your hardest CS courses we'll see how I
do right like did I miss anything by not
having a real undergraduate education
took operating systems compilers AI and
their like freshman weeder math
course and operating systems some of
those classes you mentioned
actually they're great at least the
2012 circa 2012 operating systems and
compilers were two of the best classes
I've ever taken in my life because you
write an operating system and you write
a compiler I wrote my operating system
in C and I wrote my compiler in Haskell
but somehow I picked up
Python that semester as well I started
using it for the CTFs actually that's
when I really started to get into CTFs
and CTFs are all a race against the
clock so I can't write things in C oh
there's a clock component so you really
want to use the programming language you
can be fastest in 48 hours pwn as
many of these challenges as you can pwn
yeah you got like a hundred points a
challenge whatever team gets the most
you were both at Facebook and Google
for a brief stint yeah with Project
Zero actually at Google for five months
where you developed QIRA what was Project
Zero about in general I'm
just curious about the security efforts
in these companies
well Project Zero started the same time
I went there what year was that
2015 2015 so that was right at the
beginning of Project Zero it's small it's
Google's offensive security team I'll
try to give I'll try to give the best
public facing explanation that I can so
the idea is basically these
vulnerabilities exist in the world
nation states have them some high
powered bad actors have them
sometimes people will find these
vulnerabilities and submit them in bug
bounties to the companies but a lot of
the companies don't really care they don't even
fix the bug there's no it doesn't hurt
for there to be a vulnerability so
project zero is like we're gonna do it
different we're going to announce a
vulnerability and we're going to give
them 90 days to fix it and then whether
they fix it or not we're gonna drop the
drop the zero day oh wow we're gonna
drop the weapon that's so cool that is
so cool I love the deadlines oh
that's so cool give them real deadlines
yeah and I think it's done a lot for
moving the industry forward I watched
your coding sessions that you streamed
online you code things up basic
projects usually from scratch I would
say sort of as a programmer myself just
watching you that you type really fast
and your brain works in both brilliant
and chaotic ways I don't know if that's
always true but certainly for the live
streams so it's it's interesting to me
because I'm more I'm much slower and
systematic and careful and you just move
I mean probably an order of magnitude
faster so I'm curious is there a method to
your madness
is this just who you are there's pros
and cons there's pros and cons to my
programming style and I'm aware of them
like if you ask me to like like get
something up and working quickly with
like an API that's kind of undocumented
I will do this super fast because I will
throw things at it until it works if you
ask me to take a vector and rotate it 90
degrees and then flip it over the XY
plane I'll spam program for two hours
and won't get it oh because it's
something that you could do with a sheet
of paper think through design and then
just you really just throw stuff at the
wall and you get so good at it that it
usually works I should become better at
the other kind as well sometimes I'll do
things methodically it's nowhere near as
entertaining on the Twitch streams I do
exaggerate it a bit on the Twitch streams
as well the Twitch streams I mean what do
you want to see a gamer you want to
see actions per minute right I'll show
you APM for programming yes I recommend
people go watch it I think I watched
probably several hours of you
like I've actually left you programming
in the background while I was
programming because you made me you it
was it was like watching a really good
gamer
it's like energizes you because you're
like moving so fast it so it's it's
awesome it's inspiring and so it made me
jealous that like because my own programming
is inadequate in terms of speed oh
I was like so I'm twice as frantic on the
live streams as I am when I code without
oh it's super entertaining so I I wasn't
even paying attention to what you were
coding which is great it's just watching
you switch windows and Vim I guess is
yeah it's Vim and screen I developed the workflow
at Facebook and stuck with it how do you learn
new programming tools ideas techniques
these days what's your like methodology
for learning new things so I wrote for
comma the distributed file systems out
in the world are extremely complex like
if you want to install something like
like like Ceph Ceph is I think the like
open infrastructure distributed file
system or there's like newer ones like
SeaweedFS but these are all like 10,000
plus line projects I think some of them
are even 100,000 line and just
configuring them is a nightmare so I
wrote I wrote one
it's 200 lines and it uses like
nginx for the volume servers and has this little
master server that I wrote in Go and
the way I wrote this if I would say that
I'm proud per line of any code I wrote
maybe there's some exploits that I think
are beautiful and then this this is 200
lines and just the way that I thought
about it I think was very good and the
reason it's very good is because that
was the fourth version of it that I
wrote and I had three versions that I
threw away you mentioned did you say Go I
wrote it in Go yeah in Go so is that a
functional language I forget what Go is
Go is Google's language right it's not
functional it's like in a way
it's C++ but easier it's strongly
typed it has a nice ecosystem around it
when I first looked at it I was like
this is like Python but it takes twice
as long to do anything yeah
now that openpilot is migrating to
C but it still has large Python
components I now understand why Python
doesn't work for large code bases and
why you want something like Go
interesting so why why doesn't Python
work for so even mostly speaking for
myself at least like we do a lot of
stuff basically demo level work with
autonomous vehicles and most of the work
is Python yeah why doesn't Python work
for large code bases because well lack
of type checking is a big one so errors
creep in yeah and like you don't know
the compiler can tell you like nothing
right so everything is either you know
like like syntax errors fine but if you
misspell a variable in Python the
compiler won't catch that there's like
linters that can catch it some of the
time
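A small Python example makes the misspelled-variable point concrete. The function and the typo below are invented for illustration. Defining the function succeeds because Python only checks syntax at that stage, and the misspelled name is reported only when the line containing it actually runs.

```python
def total_cost(prices):
    total = 0
    for price in prices:
        total += price
    if total > 1000:
        return totl * 0.9  # typo: should be `total`
    return total


print(total_cost([1, 2, 3]))  # works: the branch with the typo never runs

try:
    total_cost([600, 600])  # total is 1200, so the line with the typo executes
except NameError as err:
    print("only caught at run time:", err)
```

A linter or static type checker would usually flag the undefined name before the code runs, but nothing in the language itself requires that step.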
there's no types this is really the
biggest downside and then well Python's
slow but that's not related to it well
maybe it's kind of related to it so its
lack of types so what's in your toolbox
these days is it Python what else Go I
need to move to something else my
adventure into dependently typed
languages I love these languages they
just have like syntax from the 80s what
do you think about JavaScript
yes thanks Nick tomorrow typescript
javascript is the whole ecosystem is
unbelievably confusing
npm updates a package from 0.2.2
to 0.2.5 and that breaks your
Babel linter which translates your es5
into es6 which doesn't run on so why do
I have to compile my JavaScript again
huh it may be the future though if you
think about I mean I've embraced
JavaScript recently because just like
I've continually embraced PHP it seems
that these worst possible languages live
on for the longest like cockroaches never
die yeah well it's in the browser and
it's fast it's fast yeah it's in the
browser and compute might become
you know the browser it's unclear what
the role of the browser is in terms of
distributed computation in the future so
javascript is definitely here to stay
yeah interesting if
autonomous vehicles will run on JavaScript
one day I mean you have to consider
these possibilities well all our debug
tools are JavaScript
we actually just open-sourced them we
have a tool Explorer with which you can
annotate your disengagements and we
have a tool Cabana which lets you analyze
the CAN traffic from the car so
basically any time you're visualizing
something about the log you're using
javascript yeah well the web is the best
UI toolkit by far yeah so and then
you know what you're coding in
JavaScript we have a React guy he's good
React nice let's get into it so let's
talk autonomous vehicles you founded
comma.ai let's at a high level how did
you get into the world of vehicle
automation can you also just for people
who don't know tell the story of comma
yeah sure so I was working at this AI
startup and a friend approached me and
he's like dude I don't know where this
is going but the coolest applied AI
problem today is self-driving cars I'm
like well absolutely do you want to meet
with Elon Musk and he's looking for
somebody to build a vision system for
Autopilot this is when they were still
on AP1 they were still using Mobileye
Elon back then was looking for a
replacement and he brought me in and we
talked about a contract where I would
deliver something that meets Mobileye
level performance I would get paid
twelve million dollars if I could
deliver it tomorrow and I would lose 1
million dollars for every month I didn't
deliver yeah so I was like ok this is a
great deal this is a super exciting
challenge you know what even if it takes
me 10 months I get two million dollars
it's good maybe I can finish up in five
maybe I don't finish it at all and I get
paid nothing and I'll work for twelve
months for free so maybe I just take a
pause on that I'm also curious about
this because I've been working on
robotics for a long time and I'm curious
to see a person like you just step in
and sort of somewhat naive but brilliant
right so that's though that's the best
place to be because you basically
full-steam take on a problem how
confident how from that time because you
know a lot more now at that time how
hard do you think it is to solve all of
autonomous driving I remember I
suggested to Elon in the meeting
putting a
GPU behind each camera to keep the
compute local this is an incredibly
stupid idea I leave the meeting 10
minutes later and I'm like I could have
spent a little bit of time thinking
about this problem you would just send
all your cameras to one big GPU you're
much better off doing that oh sorry you
said behind every camera you have a
small GPU I was like oh I'll put the
first few layers of my conv there oh
like why did I say that that's possible
it's possible but it's a bad idea it's
not obviously a bad idea pretty obviously bad
but whether it's actually a bad idea or
not I left that meeting with Elon like
beating myself up I'm like why did I say
something stupid yeah you hadn't
at least like thought through every
aspect yes he's very sharp too like
usually in life I get away with saying
stupid things and then kind of course
correct right away he called me out
about it and like usually in life I get
away with saying stupid things and then
like people will you know people a lot
of times people don't even notice and
I'll like correct it and bring the
conversation back but with Elon it was
like nope like okay well that's not at
all why the contract fell through I was
much more prepared the second time I met
him yeah but in general how hard did
you think it is like 12 months is
a tough timeline oh I just thought
I'd clone Mobileye EyeQ3 I didn't
think I'd solve level five self-driving
or anything so the goal there was to do
lane-keeping good good lane keeping I
saw my friend showed me the outputs from
a Mobileye and the outputs from a Mobileye
was just basically two lanes and a
position of a lead car mm-hm
like I can I can gather a dataset and
train this net in in weeks and I did
well first time I tried the
implementation of Mobileye in the Tesla
I was really surprised how good it is
it's quite incredibly good because I
thought it's just because I've done a
lot of computer vision I thought it'd be a
lot harder to create a system that
that's stable so I was personally
surprised you know have to admit it
because I was kind of skeptical before
trying it because I thought it would go
in and out a lot more it would get
disengaged a lot more and it's pretty
robust so how hard is the
problem when you tackled it
I think AP1 was great like Elon talked
about disengagements on the 405 down in
LA with like the lane marks were kind of
faded and the Mobileye system would
drop out like I had something up and
working that I would say was like the
same quality in three months same
quality but how do you know you you say
stuff like that yeah confidently but you
can't and I love it but well the
question is you can't you're kind of
going by feel because you tested it out
absolutely like like I would take I
borrowed my friend's Tesla yeah
I would take AP1 out for a drive yeah
and then I would take my system out for
a drive and it seems reasonably like the same
so the 405 how hard is it to
create something that could actually be
a product that's deployed I mean I've
read an article where Elon
responded said something about you saying
that to build Autopilot is more
complicated than a single George Hotz
level job how hard is that job to create
something that would work across
globally well I don't think globally is the
challenge but Elon followed that up by
saying it's gonna take two years in a
company of ten people yeah and Here I am
four years later with a company of
twelve people and I think we still have
another two to go two years so yeah so
what do you think what do you think
about how Tesla is progressing with
Autopilot v2 v3
I think we've kept pace with them pretty
well
I think Navigate on Autopilot is terrible
we had some demo features internally of
the same stuff and we would test it and
I'm like I'm not shipping this even as
like open-source software to people what
do you think is terrible
Consumer Reports does a great job of
describing it like when it makes a lane
change it does it worse than a human
you shouldn't ship things like Autopilot
openpilot they lane keep better than a
human if you turn it on for a stretch of
highway like an hour long it's never
gonna touch a lane line human will touch
probably a lane line twice you just
inspired me I don't know if you're
grounded in data on that I read your paper
okay but no but that's interesting uh I
wonder actually how often we touch Lane
lines in general like a little bit cuz
it is okay I could answer that question
pretty easily with the comma data set
yeah I'm curious I've never answered it
I don't know I just two is like my
personal it feels right that's interesting
because every time you touch the lane
that's the source of a little bit of
stress and kind of lane-keeping is
removing that stress that's all to me
the big the biggest value-add honestly
is just removing the stress of having to
stay in lane and I think honestly I
don't think people fully realize first
of all that that's a big value add but
also that that's all it is
and that not only I find it a huge value
add I drove down when we moved to San
Diego I drove down in an Enterprise
rent-a-car and I missed it so I missed
having the system so much it's so much
more tiring to drive without it it's it
is that Lane centering that's the key
feature yeah
and in a way it's the only feature that
actually adds value to people's lives
and autonomous vehicles today Waymo
does not add value to people's lives
it's a more expensive slower Uber
maybe someday it'll be this big cliff
where it adds value but I don't usually
do this vessei I haven't talked to is
that this is good because I haven't I
have intuitively but I think we're
making it explicit now I I actually
believe that really good lane-keeping is
a reason to buy a car will be a reason
to buy a car is a huge value add I've
never until we just started talking
about it haven't really quite realized
that I've felt with Elon's chase of
level four is not the correct chase it
was on because you should just say Tesla
has the best as if from a testing
perspective say Tesla has the best
lane-keeping comma ai should say comma
ai is the best lane keeping and that is
it yeah yeah does do you think well you
have to do the longitudinal as well
you can't just lane keep you have to do
ACC but ACC is much more forgiving
than lane keep especially on the highway oh
by the way are you uh comma ai's
camera only correct oh no we use the
radar we from the car you were able to
get to open it um we can do camera
only now it's gotten to the point but we
leave the radar there as like a it's
it's fusion now okay so let's maybe talk
through some of the system specs on the
hardware or what it what's what's the
hardware side of what you're providing
what's the capabilities in the software
side with openpilot and so on so
openpilot as the box that we sell that
it runs on it's a phone in a plastic
case it's nothing special we sell it
without the software so you're like you
know you buy the phone it's just easy
it'll be easy setup but it's sold with
no software
openpilot right now is about to be 0.6
when it gets to 1.0 I think we'll be
ready for a consumer product we're not
gonna add any new features we're just
gonna make the lane-keeping really
really good
so what do we have right now it's a
snapdragon 820
it's a Sony IMX298 forward-facing
camera driver monitoring camera it's
just a selfie cam on the phone and a
CAN transceiver a little thing
called pandas and they talk over USB to
the phone and then they have three
CAN buses that they talk to the car one
of those CAN buses is the radar CAN bus
one of them is the main car CAN bus and
the other one is the proxy camera CAN bus
we leave the existing camera in place so
we don't turn AEB off right now we
still turn AEB off if you're using our
longitudinal but we're gonna fix that
before 1.0 you got it wow that's cool so
and it's CAN both ways so how are you able
to control vehicles so we proxy the
vehicles that we work with already have
Lane Keeping Assist system so Lane
Keeping Assist can mean a huge variety
of things it can mean it will apply a
small torque to the wheel after you've
already crossed a lane line by a foot
which is the system in the older Toyotas
versus like I think Tesla still calls it
Lane Keeping Assist where it'll keep you
perfectly in the center of the lane on
the highway you can control like you
would a joystick the cars so
these cars already have the capability
of drive-by-wire so is it is it trivial
to convert a car so that
openpilot is able to control the
steering oh a new car or a car that we
so we have support now for 45 different
makes of cars what are the cars in
general mostly Hondas and Toyotas we
support almost every Honda and Toyota
made this year and then a bunch of GMs a
bunch of Subarus but it doesn't have
to be like a Prius it could be a Corolla as
well okay the 2020 Corolla is the best
car with openpilot it just came out
there the actuator has less lag than the
older Corolla
I think I started watching a video with
you I mean the way you make videos is
awesome you're literally at the dealership
streaming for an hour
yeah and basically like if stuff goes a
little wrong you're just like you just
go with it yeah I love it what's real
yeah that's real that's that's it's
that's so beautiful and it's so in
contrast to the way other companies
would put together a video like that
I like to do it like that good I mean if
you become super rich one day and
successful I hope you keep it that way
because I think that's actually what
people love that kind of genuine oh it's
all that has value to me yeah money
has no if I sell out to like make money
and I sold out it doesn't matter what do
I get a yacht I don't want a yacht and I think
Tesla actually has a small inkling of
that as well with autonomy day they did
reveal more than I mean of course
there's marketing communications you can
tell but it's more than most companies
will reveal which is I hope they go
towards a direction more other companies
GM Ford oh yes Tesla's gonna win level
5 they really are so let's talk about it
you think you're focused on level 2
currently currently we're gonna be one
to two years behind Tesla getting to
level five okay we're Android right
we're Android you're Android I'm just saying
once Tesla gets it we're one to two
years behind
I'm not making any timeline on when
Tesla's that's right you did that's
brilliant
I'm sorry Tesla investors if you think
you're gonna have an autonomous robot
taxi fleet by the end of the year yes
that I'll bet against so what
do you think about this the most level
four companies are kind of just doing
their usual safety driver during full
autonomy kind of testing and then Tesla
is basically trying to go from
lane-keeping to full autonomy what do
you think about that approach how
successful would it be a ton better
approach because Tesla is gathering data
on a scale that none of them are they're
putting real users behind the behind the
wheel of the car
it's I think the only strategy that
works the incremental well so there's a
few components to Tesla's approach that's
more than just the incremental what you
spoke with is the one is the software so
over-the-air software updates necessity
I mean Waymo Cruise have those too those
aren't but those differentiate
from the automakers right no lane
keeping assist systems have no cars with
lane keeping system have that except
Tesla yeah and the other one is the data
the other direction which is the ability
to query the data I don't think they're
actually collecting as much data as people
think but the ability to turn on
collection and turn it off so I'm both
in the robotics world in the the
psychology human factors world many
people believe that level 2 autonomy is
problematic because of the human factor
like the more the task is automated the
more there's a vigilance decrement you
start to fall asleep you start to become
complacent start texting more and so on
do you worry about that
because if we're talking about
transition from lane-keeping to full
autonomy if you're spending eighty
percent of the time not supervising
machine do you worry about what that
means to the safety of the drivers one
we don't consider open pilot to be 1.0
until we have 100% driver monitoring you
you can cheat right now our driver
monitoring system there's a few ways to
cheat it they're pretty obvious we're
working on making that better before we
ship a consumer product that can drive
cars I want to make sure that I have
driver monitoring that you can't cheat
what does a successful driver
monitoring system look like is
it all about just keeping your
eyes on the road um well a few things so
that's what we went with at first for
driver monitoring I'm checking I'm
actually looking at where your head is
looking the camera's not that high
resolution eyes are a little bit hard to
get well head is this big I mean that is
good and actually a lot of it just as
psychology wise to have that monitor
constantly there it reminds you that you
have to be paying attention but we want
to go further we just hired someone
full-time to come onto the driver
monitoring I want to detect phone in
frame and I want to make sure you're not
sleeping
how much does the camera see of the body
this one not enough not enough the next
one everything
that's interesting fisheye because we have
we're doing just data collection not
real-time but fisheye is beautiful
for being able to capture the body and
the smartphone is really like the
biggest problem I'll show you I can show
you one of the pictures from from our
finder system
awesome so you're basically saying the
driver monitoring will be the answer to
that um I think the other point that the
original paper makes is good as well you're
not asking a human to supervise a
machine without giving them the means they
can take over at any time right our safety
model you can take over we disengage on
both the gas or the brake we don't
disengage on steering I don't feel you
have to but we disengage on gas or brake
so it's very easy for you to take over
and it's very easy for you to re-engage
that switching should be super cheap
yeah the cars that require even
autopilot requires a double press that's
almost I said I like that yeah and then
then the cancel um to cancel in
autopilot you either have to press
cancel which no one knows where that is
so they press the brake but a lot of
times you don't want to press the
brake you want to press the gas
so you should cancel on gas or wiggle
the steering wheel which is bad as well
wow that's brilliant I haven't heard
anyone articulate that point oh
this is all I think about
it's because I think I think actually
Tesla has done a better job than most
automakers at making that frictionless
but you just described that it could be
even better I love super cruise as an
experience once it's engaged yeah I
don't know if you've used it but getting
the thing to try to engage yeah I've
used I've driven super cruise a
lot so what's your thoughts on the
super cruise system you disengage
super cruise and it falls back to ACC so
my car's like still accelerating it
feels weird otherwise when you actually
have super cruise engaged on the highway
it is phenomenal we bought that Cadillac
we just sold it but we bought it just to
like experience this and I wanted
everyone in the office to be like this
is what we're striving to build GM
pioneering with the driver monitoring
you know you like their driver
monitoring system it has some bugs
if there's a sun shining back here it'll
be blind to you but overall mostly yeah
that's so cool you know the stuff that's
uh I don't often talk to people that
because it's such a rare car
unfortunately they bought one yes
possibly for us we lost like 25k in the
depreciation but I feel it's worth it
I was very pleasantly surprised that GM
system was so innovative and really that
wasn't advertised much wasn't talked
about much yeah and I was nervous that
it would die that they would disappear
well they put it on the wrong car
they should've put it on the bolt and
not some weird Cadillac that nobody
bought I think that's gonna be into
they're saying at least is going to be
into their entire fleet so what do you
think about it if as long as we're on
the driver monitoring what do you think
about you know Elon Musk's claim that driver
monitoring is not needed normally I love
his claims that one is stupid
that one is stupid and you know he's not
gonna have his level five fleet by the
end of the year hopefully he's like okay
I was wrong I'm gonna add driver
monitoring because when these systems
get to the point that they're only
messing up once every thousand miles
you absolutely need driver monitoring so
let me play devil's because I agree with
you but let me play devil's advocate so
one possibility is that without driver
monitoring people are able to monitor
the self-regulate monitor themselves you
know that so your idea is seeing all the
people sleeping in Teslas uh yeah well
I'm a little skeptical of all the people
sleeping in Tesla's because I have I've
stopped paying attention to that kind of
stuff because I want to see real data
there's too much glorified it doesn't
feel scientific to me so I want to know
you know what how many people are really
sleeping in Tesla's vs. sleeping I've I
was driving here sleep-deprived in a car
with no automation I was falling asleep
I agree that it's hypey it's just like
you know what if you under I've am
wondering I think I rented a my last
autopilot experience was I rented a
model
three in march and drove it around the
wheel thing is annoying and the reason
the wheel thing is annoying we use the
wheel thing as well but we don't
disengage on wheel for Tesla you have to
touch the wheel just enough to
trigger the torque sensor to tell it
that you're there but not enough as to
disengage it which don't use it for two
things
don't disengage on wheel you don't have
to that whole experience wow beautifully
put all those elements even if you
don't have driver monitoring that whole
experience needs to be better driver
monitoring I think would make I mean I
think super cruise is a better
experience once it's engaged over
autopilot
I think super cruise's transitions
to engagement and disengagement are
significantly worse yeah so there's a
tricky thing because if I were to
criticize super cruise is uh it's a
little too crude and uh I think it's
like six seconds or something if you
look off-road it'll start warning you
it's some ridiculously long period of
time and just the way it I think it's
basically it's a binary it should be adaptive
yeah it's it just needs to learn more
about you and needs to communicate what
it sees about you more like I'm not you
know Tesla shows what it sees about the
external world it would be nice if
super cruise would tell us what it sees
about the internal world it's even worse
than that you press the button to engage
and it just says super cruise
unavailable yeah why why yeah that
transparency is good we've renamed the
driver monitoring packet to driver state
service state we have car state packet
which has the state of the car driver
state packet which has the state of the driver so
what is it estimate their BAC
must be do you think that's possible
with computer vision absolutely so to me
it's an open question I haven't
looked into it too much actually I
quite seriously looked at the literature
it's not obvious to me that from the
eyes and so on you can tell you might
need stuff from the car as well yeah
you might need how they're controlling
the car right and that's fundamentally
at the end of the day what you care
about you but I think especially when
people are really drunk they're not
controlling the car nearly as
smoothly as they would look at them
walking right there the car is like an
extension of the body so I think you
could totally detect and if you could
fix people who are drunk distracted asleep
if you fix those three yeah this is
that's huge so what are the current
limitations of open pilot what are the
main problems that still need to be
solved um we're hopefully fixing a few
of them in 0.6 we're not as good as
autopilot at stopped cars so if you're coming
up to a red light at like 55 so it's the
radar stopped car problem which is
responsible for two autopilot accidents it's
hard to differentiate a stopped car from
a like signpost yeah a static object um so
you have to fuse you have to do this
visually there's no way from the radar
data to tell the difference maybe you
could make a map but I don't really believe in
mapping at all anymore um really
you don't believe in mapping no so you
basically the openpilot solution is
saying react to the environment
just like human beings do and then
eventually when you want to do navigate
on openpilot I'll train the net to look
at Waze I'll run Waze in the background
are you using GPS at all we use
it to ground truth we use it to very
carefully ground truth the paths we have
a stack which can recover relative to
10 centimeters over one minute and then
we use that to ground truth exactly
where the car went in that local part of
the environment but it's all local how
are you testing in general just for
yourself like experiments stuff
where are you located San
Diego San Diego yeah okay so you
basically drive around there then
collect some data and watch the performance
we have a simulator now and we have our
simulator's really cool our simulator is
not it's not like a unity based
simulator our simulator lets us load in
real state what do you mean we can load in a
drive and simulate what the system would
have done on the historical data ooh
nice interesting so what yeah right now
we're only using it for testing but as
soon as we start using it for training
what's your feeling about the real world
versus simulation do you like simulation
for training if this moves to training
Chuck
we have to distinguish two types of
simulators right there's a simulator
that like is completely fake I could
get my car to drive around in GTA mm-hmm
um I feel that this kind of simulator is
useless you're never there's so many my
analogy here is like okay fine you're
not solving the computer vision problem
but you're solving the computer graphics
problem right and you don't think you
can get very far about creating ultra
realistic graphics no because you can
create ultra realistic graphics of the
road now create ultra realistic
behavioral models of the other cars oh
well I'll just use my self-driving no
you won't you need real you need actual
human behavior because that's what
you're trying to learn driving
does not have a spec the definition of
driving is what humans do when they
drive whatever Waymo does I don't
think it's driving right well I think
Waymo and others if there's
any use for reinforcement learning I've
seen it used quite well I study
pedestrians a lot too is try to train
models from real data of how pedestrians
move and try to use reinforcement
learning models to make pedestrians move
in human-like ways by that point you've
already gone so many layers you detected
a pedestrian did you did you hand code
the feature vector of their state did
you guys learn anything from computer
vision before deep learning well okay
you know I feel like this is a
perception to you is the sticking point
does that mean what what's what's the
hardest part of the stack here

Summary

# Exclusive Interview with George Hotz: From iPhone Hacking and the Future of Autonomous Cars to the AI Singularity

### Executive Summary
This video covers the career and thinking of George Hotz, the well-known hacker who was the first to unlock the iPhone and the founder of comma.ai, a machine-learning-based vehicle automation company. The interview explores Hotz's technical approach to autonomous driving (openpilot), which centers on *end-to-end learning* without LiDAR, his criticism of competitors' strategies such as Tesla's and Waymo's, and his controversial philosophical views on simulation theory, AI ethics, and a future in which humans merge with machines.

### Key Takeaways
*   **Technical Philosophy:** Hotz believes the best approach to autonomous driving is *end-to-end learning* (similar to AlphaGo), in which the system learns directly from camera input to control output. He criticizes the split between perception and planning that many large companies use.
*   **Industry Criticism:** Hotz regards LiDAR as an expensive and unnecessary "crutch", and criticizes Tesla's *Navigate on Autopilot* feature as rigid and mechanical compared with how humans drive.
*   **Safety Approach:** Unlike Elon Musk, Hotz stresses the importance of a strict *Driver Monitoring System* (DMS). He states that openpilot is a Level 2 system that requires the driver to stay alert at all times.
*   **Business Model & Data:** comma.ai's long-term vision is to become a car insurance company that uses driver data to assess risk, and to sell high-quality datasets to automakers.
*   **The Future of AI:** Hotz predicts that the singularity (when silicon computing power surpasses biology) will arrive around 2038, and imagines a future in which humans "merge" with AI through a deep symbiotic relationship.
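
The safety model described in the interview (disengage on gas or brake, do not disengage on steering, keep re-engagement cheap) can be sketched as a small state function. This is only an illustrative sketch, not openpilot code: the `Inputs` fields and the `next_engaged` function are hypothetical names, and dropping engagement when the driver is inattentive is an assumption added here to reflect the stated goal of driver monitoring that cannot be cheated.

```python
from dataclasses import dataclass


@dataclass
class Inputs:
    """One control-loop sample. All field names are hypothetical."""
    gas_pressed: bool
    brake_pressed: bool
    steering_override: bool  # driver is applying torque to the wheel
    driver_attentive: bool   # assumed output of a driver monitoring model
    engage_button: bool


def next_engaged(engaged: bool, inp: Inputs) -> bool:
    """Return whether the system is engaged after this sample."""
    # Either pedal hands control back to the driver immediately.
    if inp.gas_pressed or inp.brake_pressed:
        return False
    # Assumption: an inattentive driver cannot start or keep an engagement.
    if not inp.driver_attentive:
        return False
    # Steering input alone never disengages; one button press engages.
    return engaged or inp.engage_button
```

For example, `next_engaged(True, Inputs(False, False, True, True, False))` stays engaged because only the steering wheel was touched, while pressing either pedal returns `False`. Keeping the whole decision in one pure function makes the take-over and re-engage paths easy to check by hand.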

---

### Detailed Breakdown

#### 1. Background, Hacking, and Simulation Theory
George Hotz is known as the founder of comma.ai and an outspoken figure in the AI world. He first gained worldwide recognition in August 2007 as the first person to unlock the iPhone.
*   **Simulation Theory:** Hotz believes we may be living in a simulation. He argues that complex systems usually have holes that can be exploited, but a simulation could be written in "provably correct" languages (dependently typed languages) so that it cannot be hacked.
*   **"Thinking Upwards":** Hotz promotes a narrative of breaking out of the simulation or solving humanity's problems. Where the modernists focused on outer space (rockets), the current generation focuses on AI and VR.
*   **VR Preference:** He would rather have a "cloud" apartment in a virtual world than a physical one, seeing the digital world as a limitless space that is not *zero-sum*.

#### 2. Technical Journey: From Project Zero to comma.ai
*   **Project Zero:** Hotz worked on Google's offensive security team for 5 months. The team gives companies a 90-day deadline to fix a vulnerability before it is published, a strategy that pushes the security industry forward.
*   **Coding Style:** He admits that his coding is fast and "chaotic" when streaming, but says this is effective for undocumented APIs. He has learned many languages (C, Python, Haskell, Go) and likes Go for its strong typing, but criticizes Python for large codebases because of its lack of type checking.
*   **Origins of comma.ai:** It all started when a friend challenged him to build a self-driving system. He met Elon Musk and was offered a bold contract: a $12 million reward if he succeeded in replacing Mobileye, or a penalty of $1 million for every month of delay.

#### 3. Autonomous Driving Development Strategy vs. Competitors
Hotz lays out the fundamental differences between his approach and that of the industry giants:
*   **Contract with Elon:** The initial target was to clone Mobileye (the *lane keeping* feature) within 12 months, not to reach Level 5 (full autonomy) directly. He built a prototype comparable to Tesla Autopilot version 1 in 3 months.
*   **Tesla vs. Competitors:** Hotz predicts that Tesla will win the race to Level 5 because of the scale at which it collects data from real users. However, he criticizes Tesla's *Navigate on Autopilot* (NoA), which performs every *lane change* in the same rigid way, unlike a human.
*   **LiDAR vs. Cameras:** Hotz agrees with Elon Musk that LiDAR is an expensive "crutch". He puts more trust in computer vision (cameras) and in reacting to the environment, similar to how humans drive without HD (High Definition) maps.
*   **Waymo & Cruise:** Hotz acknowledges that Waymo is 3-5 years ahead technologically, but doubts its business model, which is highly *capital intensive* and lacks network effects compared with services like Uber.

#### 4. Technical: Perception, Planning, and Simulation
*   **The Perception vs. Planning Debate:** Many companies split the software stack into perception (seeing objects) and planning (making decisions). Hotz opposes this, arguing that the two must be solved together as a *joint problem* through a latent space vector, because reducing the real world to a simple list of objects throws away a lot of important context.

## Conclusion & Closing Message
This interview presents George Hotz's revolutionary vision for autonomous driving and the future of AI, which emphasizes *end-to-end learning* and rejects expensive solutions such as LiDAR. Through his criticism of the big industry players and his philosophical views on the singularity, Hotz challenges us to rethink the relationship between humans and technology.


file updated 2026-02-13 13:23:55 UTC