Transcript

U_AREIyd0Fc • Boris Sofman: Waymo, Cozmo, Self-Driving Cars, and the Future of Robotics | Lex Fridman Podcast #241
Kind: captions
Language: en
the following is a conversation with
boris sofman who is the senior director
of engineering and head of trucking at
waymo the autonomous vehicle company
formerly the google self-driving car
project
before that boris was the co-founder and
ceo of anki a robotics company that
created cozmo which in my opinion is one
of the most incredible social robots
ever built
it's a toy robot but one with an
emotional intelligence that creates a
fun and engaging human robot interaction
it was truly sad for me to see anki shut
down when it did
i had high hopes for those little robots
we talk about this story and the future
of autonomous trucks vehicles and
robotics in general
i spoke with steve viscelli recently on
episode 237 about the human side of
trucking this episode looks more at the
robotic side
this is the lex fridman podcast to
support it please check out our sponsors
in the description
and now here's my conversation with
boris
sofman
who is your favorite robot in science
fiction books or movies
wall-e and r2-d2 where they were able to
convey such an incredible degree of
intent emotion and kind of character
attachment without
having any language whatsoever
and just purely through the
richness of emotional interaction so
those were fantastic and then uh
the terminator series just like really
really
pretty wide wide range right uh but uh i
kind of love this uh dynamic where you
have this like incredible terminator
itself that arnold played but uh and
then he was kind of like the inferior
like previous generation version that
was like totally outmatched uh you know
in terms of kind of specs by the new one
but you know still kind of like held his
own and so it was kind of interesting
where you you realize how many how many
levels there are on the spectrum from
human to kind of potentials and ai and
robotics to uh futures and so yeah that
movie really uh as much as it was like
kind of a dark world in a way was
actually quite fascinating gets the
imagination going well from an
engineering perspective both the movies
you mentioned wall-e and
terminator the first one
is probably achievable you know humanoid
robot
maybe not with like the realism in terms
of skin and so on but
that humanoid form we have the humanoid
form it seems like a compelling form
maybe the challenge is that it's just super
expensive to build but you can
imagine maybe not a machine of war but
you could imagine terminator type robots
walking around
and then the same obviously with wall-e
you've basically so for people who don't
know you uh created the company anki
that created a small robot with a big
personality called cozmo that just it
does exactly what wall-e does which is
somehow with
very few basic
visual tools is able to communicate a
depth of emotion and that's fascinating
but then again the humanoid form is
uh super compelling so like uh cozmo is
very distant from a humanoid form and
then the terminator has a humanoid form
you can imagine both of those actually
being in our society it's true and it's
interesting because um it was very
intentional to go really far away from
human form when you think about a
character like cozmo or like wall-e
where
you can completely rethink uh the
constraints you put on that character um
what tools you leverage and then how you
actually create a personality uh and a
level of intelligence interactivity that
actually
matches the constraints that you're
under whether it's mechanical or sensors
or ai of the day this is why i almost
was always really surprised by how much
energy people put towards trying to
replicate human form in a robot because
you actually take on some pretty
significant um kind of constraints and
downsides when you do that
um the first of which is obviously the
cost where it's just the the
articulation of a human body is just so
like magical um in both the precision as
well as the dimensionality that to
replicate that even in this quote
reasonably close form takes like a giant
amount of joints and actuators and uh
motion and you know sensors and
encoders and so forth but then um you're
almost like setting an expectation that
the closer you try to get to human form
the more you expect the strengths to
match and that's not the way ai works is
there's places where you're way stronger
and there's places where you're weaker
and by moving away from human form you
can actually
change the rules and embrace your
strengths and bypass your weaknesses and
at the same time the human form
like has way too many degrees of freedom
to play with it's it's kind of
counterintuitive just as you're saying
but when you have fewer constraints
it's almost harder to master the the
communication of emotion like you see
this with cartoons like stick figures
you can communicate quite a lot with
just
very minimal like two dots for eyes
and a line for for a smile i think like
you can almost communicate arbitrary
levels of emotion with just two dots and
a line yeah and like that's enough and
if you focus on just that
you can communicate the full range and
then you like if you do that then you
can focus on the actual magic
of of uh
human and
dot line interaction versus all the
engineering mess that's right like
dimensionality voice all these sort of
things actually become a crutch where
you get lost in a search space almost um
and so some of the best animators that
we've worked with um they almost like
study when they come up uh you know kind
of in building their expertise by
forcing these um projects where all you
have is like a ball that can like kind
of jump and manipulate itself or like
really really like aggressive
constraints where you're forced to kind of
extract the deepest level of emotion and
so in a lot of ways um you know we
thought when we thought about cozmo
like you're right like if we had to
like describe it in like one small
phrase it was bringing a pixar character
to life in the real world it's uh it's
what we were going for and um in a lot
of ways what was interesting is that
with like wall-e which we studied
incredibly deeply and in fact some of
our team were you know kind of had
worked previously at um at pixar and on
that project um they intentionally
constrained wall-e as well even though
in an animated film you could do
whatever you wanted to because it forced
you to like
really saturate the smaller amount of
dimensions but uh you sometimes end up
getting a far more beautiful output um
because you're pushing at the extremes
of this emotional space in a way that
you just wouldn't because you get lost
in a surface area if you have like
something that is just infinitely
articulable so if we backtrack a little
bit and uh you thought of cozmo in 2011
and 2013 actually uh designed and built
it what is anki what is cozmo
i guess who is cozmo and
uh what was the vision behind this
incredible little robot we started
uh anki back in
like while we were still in graduate
school so myself and my two co-founders
we were phd students uh in the robotics
institute at carnegie mellon um and so
we were uh studying robotics ai machine
learning kind of different you know
different uh uh areas one of my
co-founders working on walking robots uh
you know for a period of time and so we
all had a um
really deep kind of passion for
applications of robotics and ai where um
there's like a spectrum where there's
people that get like really fascinated
by the theory of ai and machine learning
robotics where um whether it gets
applied in the near future or not is
less of a kind of factor on them but
they love the pursuit of like the
challenge and that's necessary and
there's a lot of incredible
breakthroughs that happen there we're
probably closer to the other end of the
spectrum where we love the technology
and the um and all the evolution of it
but we were really driven by
applications like how can you really
reinvent experiences and functionality
and build value that wouldn't have been
possible without these approaches and
and that's what drove us and we had a
kind of some experiences through
previous jobs and internships where we
like got to see the applied side of
robotics and at that time there were
actually relatively few applications of
robotics um that were outside of um
you know pure research or industrial
applications um military applications
and so forth there were very few outside
of it so maybe you know irobot was
like one exception and maybe there were
a few others but for the most part there
weren't that many and so we got excited
about consumer applications of robotics
where you could leverage
way higher levels of intelligence
through software to create value and
experiences that were just not possible
in
in those fields today
and we saw
kind of a pretty wide range of
applications
that varied in the complexity of what it
would take to actually solve those and
what we wanted to do was to
commercialize this into a company but
actually
do a bottoms-up approach where we could
have a huge impact in a space that was
ripe to have an impact at that time and
then build up off of that and move into
other areas and entertainment became the
place to start because um
you had relatively little innovation in
the toy space
in the entertainment space you had these
really rich
experiences in video games and uh and
movies but there was like this chasm in
between and so we thought that
we could really reinvent that experience
and there was a
really fascinating transition
technically that was happening at the
time where the cost of components was
plummeting because of the mobile phone
industry and then the smartphone
industry and so the cost of a
microcontroller of a camera of a motor
of memory of
microphones was dropping by
orders of magnitude and then on top of
that with the iphone coming out in 2000
uh i think it was 2007 i believe
um
it started to become apparent within a
couple of years that this could become a
really incredible
interface device and the brain with much
more computation behind a physical world
experience that wouldn't have been
possible previously
and so
um we really got excited about that and
how we push all the complexity from the
physical world into software by using
really inexpensive components but
putting huge amounts of complexity into
the ai side and so cozmo became our
second product and then the one that
we're probably most proud of the idea
there was to create a physical character
that had enough understanding and
awareness of the physical world around
it in the context that mattered to feel
like
like he was alive um and
to be able to have these like emotional
kind of connections and experiences with
people that you would typically only
find uh inside of a movie and the
motivation very much was was pixar like
we had an incredible uh respect and
appreciation for what they were able to
um build in this like really beautiful
fashion and film um but it was always
like you know one it was virtual and
two it was like a story on rails that
had no interactivity to it it was very
fixed and it obviously had a magic to it
but where you really start to hit a
different level of experience is when
you're actually able to physically
interact with that robot and then that
was your idea with anki like the first
product was the cars
so basically you take
you take a toy
you add intelligence into it in the same
way you would add intelligence into ai
systems within a video game but you're
now bringing it into the physical space so
the idea is is really brilliant which is
you're basically bringing video games to
life exactly that's exactly right we
literally use that exact same phrase
because in the case of drive this was a
parallel of the racing genre and the
goal was to effectively have a physical
racing experience but have a virtual
state at all times that matches what's
happening in the physical world and then
you can have a video game off of that
and you can have uh different characters
different traits for your
the cars
weapons and interactions and special
abilities and all these sort of things
that you think of virtually but then you
can have it physically and um one of the
things that we were like really
surprised by that really stood out and
immediately led us to really like kind
of accelerate the path towards um cozmo
is that things that feel like they're
really constrained and simple in the
physical world they have an amplified
impact on people where the exact same
experience virtually would not have
anywhere near the impact but seeing it
physically really stood out and so
effectively with drive we
were creating a video game engine for
the physical world um and then with
cozmo we expanded that video game engine
to create a character and and
kind of an animation and interaction
engine
on top of it that allowed us to start to
create these much more rich experiences
and a lot of those elements were uh
almost like a proving ground for what
would human robot interaction feel like
in a domain that's much more forgiving
where you can make mistakes in a game
it's okay if like uh if you know a car
goes off the track or if cozmo makes
a mistake um and what's funny is
actually we were so worried about that
in reality we realized very quickly that
those mistakes can be endearing and if
you make a mistake as long as you
realize you made a mistake and have the
right emotional reaction to it it builds
even more empathy with the character
that's brilliant exactly so when uh the
the thing you're optimizing for is fun
you have so much more freedom to fail
to explore
and and also in the toy space like all
this is really brilliant like i got to
ask you backtrack
it seems for a roboticist
to take this jump
into the direction of fun
is a brilliant move because when you
have the freedom to explore to design
all those kinds of things
and you can also build cheap robots
like you don't have to like if you're
not chasing perfection
and like
toys it's understood that you can go
cheaper which means in robotics terms it's still
expensive but it's actually affordable
by a large number of people so it's a
really brilliant space to explore yeah
that's right it's uh and in fact we
realized pretty quickly that like
perfection is actually not fun yeah
because like in a traditional
roboticist sense the first kind of path
planner and uh this is the you know the
part that i worked on out of the
gate was like a lot of the kind of ai
systems where you have these you know
vehicles and
you know cars racing kind of making
optimal maneuvers to try to kind of get
ahead and you realize very quickly that
like that's actually not fun because you
want the like
chaos from mistakes and the
and so you start to kind of
intentionally almost add noise to the
system uh in order to kind of create
more of a realism in the exact same way
the human player might start really
ineffective and inefficient and then
start to kind of increase their quality
bar as they
as they progress and
there is a really really aggressive
constraint that's forced on you by
being a consumer product where the price
point matters a ton particularly in like
kind of an entertainment where um
you know you you can't make a thousand
dollar product unless you're going to
meet the expectations of a
thousand dollar product and so um in
order to make this work like your cost
of goods had to be like like you know
well under a hundred dollars uh uh in
the case of cozmo we got it under fifty
dollars end-to-end fully packaged and
delivered and it was under two hundred
dollars
at retail yeah
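
The idea mentioned above, intentionally adding noise to an otherwise optimal racing planner and shrinking that noise as the opponent's quality bar rises, could be sketched roughly as follows. This is only an illustration and not Anki's actual code: the function name `noisy_speed`, the `skill` parameter, and the 30 percent `max_error` are all assumptions made for the example.

```python
import random


def noisy_speed(optimal_speed: float, skill: float, rng: random.Random,
                max_error: float = 0.3) -> float:
    """Perturb an optimal target speed so an AI car makes human-like mistakes.

    skill is clamped to [0, 1]; 1.0 means the car follows the optimal plan
    exactly, 0.0 means the speed may be off by up to max_error (a fraction).
    All names and numbers here are hypothetical.
    """
    skill = min(1.0, max(0.0, skill))
    # The error band shrinks linearly as skill grows.
    error_scale = max_error * (1.0 - skill)
    return optimal_speed * (1.0 + rng.uniform(-error_scale, error_scale))


if __name__ == "__main__":
    rng = random.Random(42)
    for skill in (0.0, 0.5, 1.0):
        print(skill, round(noisy_speed(2.0, skill, rng), 3))
```

Raising `skill` over the course of a game would mimic a human player who starts out inefficient and gradually improves.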
so uh okay if we sit down like at these
early stages
if you go back to that
and you're sitting down and thinking
about what cozmo looks like from a
design perspective and from a cost
perspective i imagine that was part of
the conversation
first of all what came first did you
have a cost in mind is there a target
you're trying to chase
did you have a vision in mind like size
did you have because there's a lot of
unique qualities to cozmo so for people
who don't know they should definitely
check it out
there's a display there's eyes on the
little display and those eyes can it's
pretty uh low resolution eyes right but
they're still able to convey a lot of
emotion and there's this arm
like a lift that sort of lifts stuff
but there's something about arm movement
that adds even more kind of depth
it's like uh the face communicates
emotion and sadness and disappointment
and happiness and then the arm
kind of communicates i'm trying here
yeah i'm doing my best
exactly so it's um
uh it's interesting because like um
cozmo has only four degrees of
freedom and two of them are the two
treads which is for basic movement and
so you literally have only
a head that goes up and down a lift that
goes up and down and then your two
wheels uh and you have sound uh and a
screen yeah and a low resolution screen
and with that it's actually pretty
incredible what you can uh what you can
come up with where like you said it's a
uh it's a really interesting give and
take because there's a lot of ideas far
beyond that obviously as you can imagine
where like you said how big is it how
many degrees of freedom what does it
look like um uh what does he sound like
how does he communicate it's it's a
formula that actually scales way beyond
entertainment this is the formula for
human
kind of robot interface more generally
is you almost have this triangle between
um the physical aspects of it the
mechanics the industrial design what's
mass producible the cost constraints and
so forth
you have the ai side of
how do you understand the world around
you interact intelligently with it
execute what you want to execute so
perceive the environment
make intelligent decisions and
and move forward and then you have the
character side of it
um
most uh companies that have done anything in
human robot interaction really uh missed
the mark or under invest in the
character side of it um they over invest
in the mechanical side of it uh
you know and then varied results on the
ai side of it and so the thinking is
that if you put more mechanical flexibility
into it you're gonna do better um you
don't necessarily you actually create a
much higher bar uh for a high roi
because now your price point goes up
your expectations go up and if the ai
can't meet it or the overall experience
isn't there you missed the mark
um so like how did you through those
conversations get the cost down so much
and make it so simple like that
there's a big theme here because you
come from the mecca of robotics which is
carnegie mellon university robotics like
for all the people i've interacted with
that come from there or just from
you know the world experts at robotics
they don't
they would never build something like
cozmo yeah and so where did that come
from so the simplicity it came from this
combination of a team that we had it was
it was quite cool because like we and by
the way you ask anybody that's like
experienced in the like kind of you know
toy entertainment space they'll say you'll never
sell a product over 99 dollars um that was
fundamentally false and we believed it
to be false it was just that the experience
had to kind of you know meet the mark
and so we pushed past that amount but
there was a pressure where the higher
you go the more seasonal you become and
the tougher it becomes and so on the
cost side we very quickly partnered up
with some previous contacts that we
worked with where just as an example
our head of mechanical engineering um
was one of the earliest heads of
engineering at logitech and has a
billion units of consumer products in
circulation that he's worked on yeah so
like crazy low cost high volume consumer
product experience with a really great
mechanical engineering team and just a
very practical mindset where we were not
going to compromise on feasibility in
the market in order to chase something
that would be an enabler and we pushed a
huge amount of expectations onto the
software team where yes we're going to
use cheap
noisy motors and sensors but we're gonna
fix it in the um on the software side
then we found on the design and
character side there was a faction that
was more from like a game design
background that thought that it should
be very games driven cozmo where you
create a whole bunch of games
experiences and it's all about like game
mechanics and then there was um a
faction which my co-founder and i who were the
most involved in this like really
believed in which was character driven
and the argument is that you will never
compete with what you can do virtually
from a game standpoint but you actually
on the character side put this into your
wheelhouse and put it more towards your
advantage
because a physical character has a
massively
higher impact uh physically than
virtually this is okay can i just
pause on that because this is so
brilliant when i uh for people who don't
know cozmo
plays games with you
but there's also a depth of character
and i actually when i was
you know playing with it
i wondered
exactly what is the compelling aspect of
this because to me obviously i'm i'm
biased but to me the character i guess
what i enjoyed most honestly
or what
got me to return to it is the character
that's right but that's that's a
fascinating discussion of uh
you're right ultimately
you cannot compete
on the
quality of the gaming experience it's too
restrictive the physical world is just
too restrictive and uh you don't have a
graphics engine it's like all this but
on the character side
we uh and clearly we moved in that
direction is like kind of the the the
winning path and um
we partnered up with this uh really we
immediately like went towards pixar and
carlos baena he um had
been at pixar for nine years he'd worked
on tons of the movies including wall-e
and others and just immediately kind of
spoke the language and just clicked on
how you think about that like kind of
magic and drive and then we built out
a team uh
you know with him as like a really kind
of prominent kind of driver of this with
different types of backgrounds and
animators and character developers where
um we put these constraints on the team
but then got them to really try to
create magic despite that and we
converged on this system that was at the
overlap of character and the character
ai
where if you imagine the
dimensionality of emotions happy
sad angry surprised confused uh um
scared like you think of these extreme
emotions
we almost like kind of put this
challenge to kind of populate this
library of responses on how do you show
the extreme
response that like goes to the extreme
spectrum on angry or frustrated or
whatever and and so that gave us a lot
of intuition and learnings and um and
then we started parameterizing them
where it wasn't just a fixed recording
but they were parameterized and had
randomness to them where you could have
infinite permutations of happy and
surprised and so forth
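
A parameterized emotion response of the kind just described, one that is not a fixed recording but is re-sampled with some randomness each time, might look roughly like the sketch below. It is illustrative only: the `EmotionTemplate` fields, the numeric ranges, and the 15 percent jitter are assumptions for the example, not details of the real animation system.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionTemplate:
    """Hypothetical template: each field is a (low, high) range, not a fixed value."""
    name: str
    head_angle_deg: tuple[float, float]
    lift_height_mm: tuple[float, float]
    duration_s: tuple[float, float]


def _sample(bounds, intensity, rng, jitter=0.15):
    """Pick a value that tracks intensity, plus random jitter, clamped to bounds."""
    low, high = bounds
    span = high - low
    value = low + intensity * span + rng.uniform(-jitter, jitter) * span
    return min(high, max(low, value))


def sample_animation(template, intensity, rng):
    """Return one concrete animation; repeated calls give different variations."""
    intensity = min(1.0, max(0.0, intensity))
    return {
        "emotion": template.name,
        "head_angle_deg": _sample(template.head_angle_deg, intensity, rng),
        "lift_height_mm": _sample(template.lift_height_mm, intensity, rng),
        "duration_s": _sample(template.duration_s, intensity, rng),
    }


# Example template with made-up ranges.
HAPPY = EmotionTemplate("happy", (0.0, 40.0), (0.0, 60.0), (0.5, 2.0))

if __name__ == "__main__":
    rng = random.Random(7)
    for _ in range(3):
        print(sample_animation(HAPPY, 0.8, rng))
```

Because every call draws fresh jitter, the same "happy" template yields a slightly different performance each time while staying inside the animator's chosen ranges.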
and then we had a behavioral engine that
took the context from the real world
and would interpret it and then create
kind of probability mappings on what
sort of responses you would have that
actually made sense and so if cozmo saw
you for the first time in a day um he'd
be really surprised and happy in the
same way that the first time you walk in
and like your toddler sees you they're
so happy but they're not gonna be that
happy for the entirety of your next two
hours but like you have this like spike
in response or if you leave him alone
for too long he gets bored and starts
causing trouble and like nudging things
off the table um or if you beat him in a
game um the most enjoyable emotions are
him getting frustrated and grumpy to a
point where our testers and our
customers would be like
i had to let him win because i don't
want him to be upset and
so you start to like create this
feedback loop where you see how powerful
those emotions are and just to give you
an example something as simple as eye
contact um you don't think about it in a
movie just like it kind of happens like
you know camera angles and so forth um
but that's not really a prominent source
of interaction
what happens when a physical character
like cozmo when he makes eye contact
with you um
it built a universal kind of connection
kids all the way through adults um and
it was truly universal it was not like
people stopped caring after 10 12 years
old
and so
we started doing experiments and we
found something as simple as
increasing the amount of eye contact
like the amount of times in a minute
that he'll look over for your approval
to like kind of make eye contact just by
i think doubling it we increased the play
time engagement by 40 percent
like you see these sort of like kind of
interactions where you build that
empathy and and so we studied pets we
studied um virtual characters there's
like a lot of times actually dogs are
one of the most perfect uh um
influencers behind these sort of
interactions and what we realized is
that the games were not there to
entertain you the games were to create
context to bring out the character and
if you think about the types of games
that you know that you played they're
relatively simple but they were always
ones to create scenarios of either
tension or winning or losing or surprise
or whatever the case might be and they
were purely there to just like create
context to where an emotion could feel
intelligent and not random
and in the end it was all about the
character
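
The behavioral engine described earlier, which takes context from the real world and turns it into a probability mapping over responses that make sense, could be sketched along these lines. This is a toy illustration under stated assumptions: the context keys (`hours_since_seen`, `minutes_idle`, `just_lost_game`), the thresholds, and the weights are invented for the example and are not from the actual product.

```python
import random


def response_weights(context: dict) -> dict:
    """Map observed context to unnormalized weights over candidate responses.

    Keys, thresholds, and weights are hypothetical.
    """
    weights = {"neutral": 1.0}
    # First sighting of the day: a spike of surprise and happiness.
    if context.get("hours_since_seen", 0.0) >= 8.0:
        weights["happy_surprised"] = 6.0
    # Left alone too long: boredom and mischief become likely.
    if context.get("minutes_idle", 0.0) >= 10.0:
        weights["bored_mischief"] = 4.0
    # Just lost a game: frustration becomes likely.
    if context.get("just_lost_game", False):
        weights["frustrated"] = 5.0
    return weights


def to_probabilities(weights: dict) -> dict:
    """Normalize weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def choose_response(context: dict, rng: random.Random) -> str:
    """Sample one response according to the context-dependent probabilities."""
    probs = to_probabilities(response_weights(context))
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(to_probabilities(response_weights({"just_lost_game": True})))
    print(choose_response({"hours_since_seen": 12.0}, rng))
```

Sampling rather than always picking the top response keeps the character from feeling scripted, while the context-driven weights keep an emotion from appearing random.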
so yeah there's so many
elements to play with here so you said
dogs what lessons do we draw from cats
who don't seem to give a damn about you
is that just another character is this
another it's just another character and
so you you could almost like in early
aspirations we thought it would be
really incredible if you had a diversity
of characters where you almost help
encourage which direction it goes just
like in a role-playing game um and you
had uh like think of like the you know
seven dwarfs sort of and uh um and
initially we even thought that it would
be amazing if like the other like
you know like their characters actually
help them have strengths and
weaknesses and some you know like
whatever they end up doing like some are
scared some are
you know
arrogant some are uh
you know super warm and like kind of
friendly and in the end we focused on
one because it made it very clear that
hey we got to build out enough depth
here because you're
kind of trying to expand
it's almost like how long can you
maintain a fiction that this character
is alive um to where the person's
explorations don't hit a boundary um
which happens almost immediately with
with typical toys um and you know even
with video games uh how long can we
create that immersive experience to
where you expand the boundary and one of
the things we realized is that you're um
just way more forgiving when something
has a personality and it's physical
that is the key
that unlocks
uh robotics interacting you know in the
physical world more generally is that
when you don't have a
personality and you make a mistake as a
robot the stupid robot made a mistake
why is it not perfect when you have a
character and you make a mistake you
have empathy and it becomes endearing
and you're way more forgiving and that
was the key that i think goes
far far beyond entertainment it actually
builds the depth of the personality the
mistakes so let me ask the the movie her
question then
how and so
cozmo feels like the early days of
something that will obviously be
prevalent throughout society at a scale
that
we cannot even imagine
my sense is
it seems obvious
that these kinds of characters will
permeate society and we will be
friends with them we'll be interacting
with them in different ways in the
way we i mean you don't think of it this
way but when you play video games
they're often cold and
impersonal
but but even then
uh you think about role-playing games
you become friends with certain
characters in that game they
don't remember much about you
they're just telling a story
it's exactly what you're saying they
they exist in that virtual world but if
they acknowledge that you exist in this
physical world if the characters in the
game remember that you exist that you
like for me like lex they understand
that i'm a human being who has like
hopes and dreams and so on it seems like
there's going to be a like
billions
if not trillions of cozmos in the world
so if we look at that future
there are several questions to ask how
intelligent does that future cozmo need
to be
to create
fulfilling
relationships like friendships yeah it's
a great question and and part of it was
a recognition it's going to take time to
get there because it has to be a lot
more intelligent um because what's good
enough to
be a magical experience for
uh you know an eight-year-old um it's a
higher bar to be a companion
like a pet in the home or to help with
functional interface in an office
environment or in a home or uh and so
forth and so and the idea was that you
build on that and you kind of get there
and as technology becomes more prevalent
and less expensive and so forth you can
start to kind of work up to it um but
you know you're absolutely right at the
end of the day um we almost equated it
to how uh the touchscreen created like
this really novel interface to you know
physical kind of devices like this
this is the extension of it where you
have
much richer physical interaction in the
real world this is this is the enabler
for it um and it shows itself in a few
kind of really obvious places so just
take something as simple as a voice
assistant um most people
will never tolerate uh an alexa or a
google home just starting a conversation
um proactively uh when you weren't kind
of expecting it because it it feels
weird it's like you were listening and
like and then now you're kind of it
feels intrusive but if you had a
character um like a cat that touches you
and gets your attention or toddler like
you never think twice about it what we
found really kind of immediately is that
um these types of characters like cozmo
they would like roam around and kind
of get your attention and we had a
future version that was always on
called vector people were way more
forgiving
and so you could initiate interaction in
a way that is not acceptable for
for machines in general um you know
there's a lot of ways to customize it
but it makes people who are skeptical of
technology much more comfortable with it
there was like there were a couple of
really really
prominent examples of this so when we
launched in europe and so we were in um
uh i think like a dozen countries if i
remember correctly but like we were we
went pretty aggressively in launching in
um germany and france and uh and uk and
we were very worried in europe because
there's obviously like a really a
socially higher bar for privacy and you
know security where you you've heard
about how many companies have had
troubles on uh things that
might have been okay in the u.s but like
are just not okay in germany and france
in particular um and so we were worried
about this because you have um you know
cozmo and um uh you know our future
product vector where you have
cameras you have microphones it's kind
of connected and like you're playing
with kids and like in these experiences
and you're like this is like ripe to be
like a nightmare if you're not careful
yes um and
uh and the journalists are like
notoriously like really really tough on
on these sort of things um
we were shocked and we prepared so much
for what we would have to encounter we
were shocked in that
not once from any journalist or
customer did we have any complaints
beyond like a really casual kind of
question
and it was because
of the character where um
when that conversation came up it was
almost like well of course he has to see
and hear how else is he going to be alive
and interacting with you
and it completely disarmed um this like
fear of technology that enabled this
interaction to be much more fluid and
again like entertainment was a proving
ground but that is like a you know
there's like ingredients there that
carry over to a lot of other uh
elements down the road that's hilarious
that we're a lot less concerned about
privacy if the if the thing has value and
charisma
i mean that's true for all of human to
human interaction too it's an
understanding of intent where like well
he's looking at me he can see me if he's
not looking at me he can't see me right
so it's almost like uh
um you're communicating intent and with
that intent
people are like kind of kind of more
understanding and calmer and it's a it's
interesting we just it was just the
earliest kind of version of starting an
experiment with this but um it was an
enabler and um and then and then you
have like completely different
dimensions where like you know kids with
autism had like an incredible connection
with cosmo that just went beyond
anything we'd ever seen and we have like
these just letters that we would receive
from parents and we had some research
projects kind of going on with some
universities on studying this but um
there are like there's an interesting
dimension there that got unlocked that
just hadn't existed before um that has
these really interesting kind of links
into society and
and a potential building block of future
experiences so
if you look out into the future do you
think we will have
beyond a particular game
you know a companion
like uh like her
like the movie her
or like a cosmo that's
kind of asks you how your day went too
right
you know like a friend
how many years away from that do you
think we are what's your intuition good
question so
i think the idea of a different type of
character like more closer to like kind
of a pet style companionship it will
come way faster um
and there's a few reasons one is like to
to do something like in her that's like
effectively almost general ai and the
bar is so high that if you miss it by a
bit you hit the uncanny valley where it
just becomes creepy and like and not um
not appealing um because the closer you
try to get to a human in form and
interface and
voice the harder it becomes whereas
you have way more flexibility on
still landing a really great experience
if you embrace the idea of a character
and that's why um one of the other
reasons why we didn't have a voice uh
and also why like a lot of video game
characters uh like
sims for example does not have a voice
when you uh when you think about it it
was it wasn't just a cost savings like
for them it was actually for all of
these purposes it was because when you
have a voice you immediately narrow down
the appeal to some particular
demographic or age range or um kind of
style or gender uh if you don't have a
voice people interpret what they want to
interpret
and an eight-year-old might get a very
different interpretation than a 40 year
old
but you create a dynamic range and so
you just you can lean into these
advantages much more um and something
that doesn't resemble a human and so
that'll come faster
i don't know when a human like that's
just uh still like just complete r&d
at this point the the chat interfaces
are getting way more interesting and
richer but it's still a long way to go
to kind of pass the test of
you know well let me like let's consider
like let me play devil's advocate
so google is a very large company that's
servicing
it's creating a very compelling product
that wants to provide a service to a lot of
people but let's go outside of that you
said characters yeah it feels like and
you also said that it requires general
intelligence to be a successful
participant in a relationship which
could explain why i'm single this is
very
but the i i honestly want to push back
on that a little bit
because i feel like is it possible
that if you're just good at playing a
character yeah you're in in a movie
there's a bunch of characters if you
just understand what creates compelling
characters
and then you you just are that character
and you exist in the world and other
people find you and they connect with
you just like you do when you talk to
somebody at a bar i like this character
this character is kind of shady i don't
like them you pick the ones that you
like and you know maybe it's somebody
that's uh reminds you of your father or
mother i don't know what it is but the
the freudian thing but there's some kind
of connection that happens and that's
that that's the cosmo you connect to
that's the future cosmo you connect and
that's
so i guess the statement i'm trying to
make is it possible to achieve a depth
of friendship without solving
general intelligence i think so it's
about intelligent kind of constraints
right and just uh you set expectations
and constraints such that in the space
that's left you can be successful and so
you can do that by having a very focused
domain that you can operate in for
example you're a customer support agent
for a particular product and you create
intelligence and a good interface around
that or uh you know kind of in the
personal companionship side you can't be
everything to
across the board
you you kind of solve those constraints
and i think uh i think it's possible my
my worry is like i
right now i don't see anybody that has
picked up on where kind of cosmo left
off yes and is pushing on it in the same
way and so i don't know if it's a sort
of thing where similar to like how you
know in dot com there were all these
concepts that we considered like you
know that didn't work out or like failed
or like were too early or whatnot and
then 20 years later you have these like
incredible successes on almost the same
concept like it might be that sort of
thing where like there's another pass at
it that happens in five years or in 10
years but um it does feel like that
appreciation of that like the
three-legged stool if you will between
like you know the hardware the ai and
the character um that balance it's hard
to
i'm not aware of of any pro anywhere
right now where like that same kind of
aggressive drive with the value on the
character is uh is happening and so to
me
just a prediction exactly as you said
something that looks awfully a lot like
cosmo not in the actual physical form
but in the three-legged stool
something like that in some number of
years would be a trillion dollar company
i don't understand like it's obvious to
me yeah that
like
character not just as robotic companions
but in all our computers they'll be
there it's like uh
clippy was like
two legs of that stool or something like
that yeah i mean that those are all
different attempts and
what's really confusing to me is they
they're born
these attempts and they they everybody
gets excited and for some reason they
die and then nobody else tries to pick
it up and then maybe a few years later
a crazy guy like you comes around
with just enough brilliance and vision
to create this thing and it's born a lot
of people love it a lot of people get
excited but maybe the timing is not
right yet
and then and then when the timing is
right it just blows up and it just keeps
blowing up more and more until it just
blows up and i guess everything in the
full span of human civilization
collapses eventually
and that wouldn't surprise me at all and
like what's gonna be different in
another five years or ten years or whatnot
physical component costs will continue
to come down uh in price and you know
mobile devices and computation is going to
become more and more prevalent as well
as cloud as a big tool uh to offload
cost um ai is going to be a massive
transformation compared to what we dealt
with uh where um everything from voice
understanding to um uh to just
you know kind of a broader contextual uh
understanding and mapping of
of semantics and uh understanding scenes
and so forth and then the character side
will continue to kind of you know
progress as well because that magic does
exist it just exists in different forms
and you have just the brilliance of uh
that's happening in animation and you
know these other areas where um that is
that was a big unlock in um you know in
film obviously uh and so i think yeah
the pieces can reconnect and the
building blocks are actually gonna be
way more impressive than they were five
years ago so
so in 2019
uh anki the company that created cosmo
the company that you started had to shut
down
how did you feel at that time
yeah
it was tough uh that was a really
emotional stretch and it was really
tough year
like about a year ahead of that was
actually a pretty brutal stretch because
we were um
kind of like life or death on many many
moments um just navigating
these insane kind of just ups and downs
and um barriers and the thing that made
it
like um
like just rewinding a tiny bit like what
you know what ended up being really
challenging about it as a business
was um
from a commercial standpoint and
customer reception standpoint there's a
lot of things you could point to that
were like you know pretty big successes
sold millions of units uh like you got
to like pretty serious revenue like kind
of close to 100 million annual revenue
um uh number one kind of product in kind
of various categories
but it was pretty expensive it ended up
being very seasonal where something like
85 percent of our volume was in q4
because it was a you know a present and
and it was expensive to market it and
explain it and so forth um and even
though the volume was like really
sizeable and like the reviews were
really fantastic um
forecasting and planning for it and
managing the cash operations was just
brutal like it was absolutely brutal you
don't think about this when you're
starting a company or when you have a
few million in
you know in revenue because it's just
your biggest costs are kind of just your
head count and operations and
everything's ahead of you but we got to
a point where um
you know you if you look at the entire
year you have to
operate your company pay all you know
the people and so forth you have to pay
for the manufacturing the marketing and
everything else
to do your sales in mostly november
december and then get paid in december
january by retailers and those swings
were pretty um were really rough um and
just made it like so difficult because
the more successful we became the more
wild those swings became
because you'd have to like spend
you know tens of millions of dollars on
inventory tens of millions of dollars on
marketing and tens of millions of
dollars on payroll and everything else
and then there's the bigger dip and then
you're waiting for the q4 yeah and it's
not a business that like is recurring
kind of month-to-month and predictable
and it's just and then you're locking in
your forecast in july um you know maybe
august
if you're lucky um
and uh and it's also like very hit
driven and seasonal where like you don't
have the sort of continued uh kind of
slow growth like you do in some other uh
consumer electronics industries and so
before then like hardware kind of like
went out of favor too and so you had
fitbit and gopro dropped from 10 billion
revenue to 1 billion revenue and
hardware companies are getting valued at
like 1x revenue oftentimes um which is
tough right and so
we effectively kind of got caught in the
middle where we were trying to
quickly evolve out of entertainment and
move into some other categories but you
can't let go of that business because
like that's what you're valued on that's
what you're raising money on um but
there's no path to prop kind of pure
profitability just there because it was
you know such you know uh specific type
of price points and so forth and so
um
we tried really hard to make that
transition and um yeah we had a
financing round that fell apart at the
last second and effectively there was
just no path to kind of get through that
and get to the next kind of like holiday
season and so we ended up um
uh selling some of the assets and kind
of winding down the company it was uh it
was brutal like we i was very
transparent with the company like in the
the team while we were going through it
where actually despite how challenging
that period was very few people left i
mean like people loved the vision the
team the culture of the like kind of
chemistry and kind of what we were doing
there was just a huge amount of pride
there and we wanted to see it through
and we felt like we had a shot to kind
of get through these checkpoints um
we ended up uh and i mean by brutal i
mean like literally like days of cash
like three four different times uh
runway like in the year you know kind of
before it um where you're like
playing games of chicken on negotiating
credit line timelines and
like repayment
terms and how to get like a bridge loan
from an investor it's just like
level of stress that like is as hard as
things might be anywhere else like
you'll never come you know come close to
that where you feel that like
responsibility for you know 200 plus
people right um
and so we were very transparent during
our fundraise on who we're talking to
the challenges um that we have
how it's going and when things are going
well when things were tough um and so it
wasn't a complete shock when it happened
but it was just very emotional where
like i you know like
you know when we announced it finally
that like um you know we you know
basically we're just like watching kind
of like you know the runway and trying
to kind of time it and when we realized
that like we didn't have any more outs
we wanted to like kind of wind it down
make sure that it was like clean and you
know we could like kind of take care of
people the best we could but yeah like
broke down crying at all you know hands
and somebody else had to step in for a
bit and like it was just very very
emotional but the beautiful part is like
afterwards like everybody stayed at the
office to like two three in the morning
just like drinking and hanging out and
telling stories and celebrating and it
was just like
one of the best uh for many people was
like the best kind of work experience
that they had and there was a lot of
pride in what we did and there wasn't
anything obvious we could point to that
like hey if only we had done that
different things would have been
completely different it was just like
the physics didn't line up uh and uh
um but the experience was pretty uh
incredible but it was hard like it was
uh it had this feeling that there was
this like incredible beauty in both the
technology and products and the team
that um
uh
you know there's there's a lot there
that like in the you know right context
could have been
uh pretty incredible but it was um
emotional just
yeah just thinking i mean just looking
at this company like you said the
product and technology but the vision
the implementation you got the cost down
very low
yeah and the compelling the nature of
the product was great so many robotics
companies failed at this at they
the robot was too expensive it didn't
have the personality it didn't really
provide any value like a sufficient
value to justify the price so like you
succeeded where basically every single
other robotics company or most of them
that are like going the category of
social robotics have kind of failed
and
i mean it's uh it's quite tragic i
remember uh
reading that i'm not sure if i talked to
you before that happened or not
but i remember you know i'm distant from
this
i remember being heartbroken reading
that
because like
if
if cosmo's not going to succeed
what is going to succeed because that to
me was incredible
like
it was an incredible idea
cost is down the minimum the
the
it's just like the most minimal design
in physical form that you could do it's
really compelling the balance of games
so it's a it's a fun toy it's a great
gift
for all kinds of age groups right it's
just it's compelling in every single way
and it seemed like uh it was a huge
success and it it failing was
i don't know there was heartbreak on
many levels for me just as an external
observer
is i was thinking how hard is it to run
a business
that's that's what i was thinking like
if this failed this must have failed
because uh it's obviously not like
yeah it's b it's business yeah maybe
it's some aspect of the manufacturing
and so on but i'm now realizing it's
also not just that it's
yeah
sales marketing also it's everything
right like how do you explain something
that's like a new category to people
that like have all these
predispositions and so like uh you know it it
had some of the hardest elements of
if you were to pick a business it had
some of the hardest uh um customer
dynamics because like to sell a 150
product you got to convince both the
child to want it and the parents to
agree that it's valuable so you're
having like this dual prong marketing
challenge you have manufacturing you
have like really high precision on the
components that you need you have the ai
challenges so there were a lot of tough
elements but it's this feeling where like
just really great alignment of unique
strength across kind of like all these
different areas just an incredible like
you know kind of character and animation
team between this like carlos and
there's like a character director day
that came on board and like you know
really great people there the ai side
the um uh
the manufacturing the you know where um
like never missing a launch right and
actually you know kind of hitting
that quality was um yeah it was it was
heartbreaking but uh
here's one neat thing is like we we had
so much like fan mail from kind of kids
parents like i actually like there was a
bunch they collected in the end yeah
that um i actually saved and like i
never it was too emotional to open it
and i still haven't opened it um and so
i actually have this giant envelope of
like a stack this much of like letters
from you know kids and families just
like every you know
permutation you can imagine and so
planning to kind of i don't know maybe
like a five year you know five year
eight some year reunion just inviting
everybody over and we'll just like kind
of dig into it and um kind of bring back
some memories but um you know good
impact and uh um well i i think there
will be companies uh maybe waymo and
google will be somehow involved that
will carry this flag forward and will uh
will make you proud whether you're
involved or not i think this is one of
the greatest robotics companies in the
history of robotics so you should be
proud it's still tragic to know that
you know because you read all the
stories of apple and and
let's see spacex and like companies that
were just on the verge of failure
several times through that story and
they just it's almost like a roll of the
dice they succeeded and here's the roll
of the dice that just happened to go
and that's the appreciation that like
when you really like talk to a lot of
the
founders like everybody goes through
those moments and sometimes it really is
a matter of like you know timing a
little bit of luck like some things are
just out of your control and um
uh and you you get a much deeper
appreciation for um just the
dimensionality of of that challenge but
um the great thing is that like a lot of
the team actually like stayed together
and so um they were actually a couple of
companies that we we kind of kept big
chunks of the team together and we
actually kind of helped align this uh um
you know to help people out as well um
and one of them was waymo where uh
a majority of the ai and robotics team
actually had the exact background uh
that you would look for in like kind of
av space it was a space that a lot of
us like you know were you know worked on
in grad school were always passionate
about and ended up uh you know maybe the
time you know
serendipitous timings from another
perspective where like uh um kind of
landed in a really unique um
circumstance it's actually been quite
exciting too
so it's interesting to ask you just your
thoughts uh cosmo still lives on
under digital dream labs i think
is that
are you tracking the progress there or
is it too much pain
is it are you is that something that
you're excited to see where that goes
so keeping an eye on it of course just
out of pure curiosity and obviously just
kind of care for product line i think um
it's deceptive how complex it is to
manufacture and evolve that product line
um and the amount of
experiences that are required to
complete the picture and be able to move
that forward and i think that's going to
make it pretty hard to do something
really substantial with it it would be
cool if like even the product in the way
it was was able to be manufactured yes
again that would be
yeah which would be neat um but uh it's
i think it was it's deceptive how tricky
that is on like everything from the
quality control the details and um and
then like technology changes that force
you to
reinvent and update certain things um so
uh i haven't been super close to it but
just kind of keeping an eye on it yeah
it's really interesting how
it's deceptively difficult just as
you're saying for example
those same folks uh and i've spoken with
them they're
they partnered up with rick and morty uh
creators to uh to do the butter robot
yes i love the idea i just recently
i've kind of half-assed watched rick and
morty previously but now i just watched
like the first season it's such a
brilliant show i i like
i did not understand how brilliant that
show is and obviously i think in season
one is where the butter robot comes
along for just a few minutes or whatever
but i just fell in love with the butter
robot the sort of the
that particular character just like you
said there's characters you can create
personalities you can create and that
particular
a robot
who's doing a particular task
realizes
you know
this like realizes that's the
existential question this the myth of
sisyphus question that uh camus writes
about it's like is this all there is
because he moves butter
but you know
that realization that's a that's a
beautiful little realization for a robot
that my purpose is very
limited with this particular task it's
absurd it's humor of course it's darkness
it's a beautiful mix but so they want to
release that butter robot
but something tells me
that to do the same depth of personality
as cosmo had the same richness it would
be on the manufacturing on the ai
on the storytelling on the design it's
going to be very very difficult it could
be a cool sort of uh toy
for rick and morty fans
but to create the same depth of
existential angst yeah that the butter
robot symbolizes is is really
that's the brave effort you succeeded at
with cosmo but it's not easy it's really
tough and
you can fail on almost any one of the
kind of dimensions and like uh and yeah
it takes you know
yeah unique convergence of a lot of
different skill sets to try to pull that
off yeah
on this topic let me ask you for some
advice
because uh as i've been watching rick
and morty i i told myself i have to
build the butter robot just as a hobby
project and so uh i got a nice platform
for it with treads and and there's a
camera that moves up and down and so on
um i'll probably paint it
but
the question i'd like to ask there's
obvious technical questions i'm fine
with communication the personality
storytelling all those kinds of things
i think i understand the process of that
but how do you know
when you got it right
so with with cosmo how did you know
this is great like or um something is
off like yeah is this brainstorming with
the team
do you know it when you see it is it
like
love at first sight it's like this is
right or like i guess if we think of it
as an optimization space
is there an uncanny valley where like
that's not right or this is right or are
a lot of characters right yeah
we stayed away from uncanny valley just
by having such a different what like
mapping where it didn't try to look like
a dog or a human or anything like that
and so uh
you avoided having like a weird pseudo
similarity but not quite hitting the
mark um but you could like just fall
flat where just like a personality or a
you know character emotion just didn't
feel right and so it actually mirrored
very closely to kind of the iterations
that a character director of pixar would
have where you're
running through it and you can virtually
kind of like see what it'll look like we
we created a plug-in to where we
actually used like like maya the sim you
know the animation tools and then we
created a plug-in that
perfectly
matched it uh to the physical one and so
you could like test it out virtually and
then push a button and see it physically
play out and there's like subtle
differences and so you want to like make
sure that that feedback loop is super
easy to be able to test it live
um and then sometimes like
you would just feel it that it's right
and intuitively know and then you'd also
do we did user testing but
it was very very often that like the
into like if we found it magical it
would scale and be magical uh more
broadly there were not too many cases
where like
like we were pretty decent about not
like getting to it you know geeking out
or getting too attached to something
that was super unique to us um but
trying to kind of like you know put a
customer hat on and does it truly kind
of feel magical and so in a lot of ways
we just give a lot of um
autonomy to the
character team to really think about the
you know character board and mood boards
and storyboards and like what's the
background of this character and how
would they react um and they went
through a process that's actually pretty
familiar but now had to operate under
these unique constraints um but
the moment where it felt right um
kind of took a fairly similar journey
than like a as a character in an
animated film actually it's quite cool
well the the thing that's really
important to me and i wonder if it's
possible well i hope it's possible
pretty sure it's possible is
for me
even though i know how it works
to make sure there's sufficient
randomness in the process yeah
probably because it would be machine
learning based
that i'm surprised
that i don't i'm surprised by certain
reactions i'm surprised by certain
communication maybe that's in a form of
a question
um were you surprised by certain things
cosmo did like certain interactions
yeah we made it intentionally like
uh so that there would be some surprise
then like a decent amount of variability
in how
he'd respond in certain circumstances
and so in the end like it's um
this is this isn't general ai this is a
giant like spectrum and library of like
parametrized kind of emotional responses
and an emotional engine that would like
kind of map
your current state of the game your
emotions the world the people are
playing with you all so forth to what's
happening um but we could make it feel
spontaneous by creating enough
diversity uh and randomness uh but still
within the bounds of what felt felt like
very realistic um to make that work and
then what was really neat is that we
could get statistics on how much of that
space we were saturating um and then add
more animations and more diversity in
the places that would get hit more often
so that you stay ahead of the um you
know the curve and maximize the uh the
chance that it it stays feeling alive um
and so but then when you like combine it
like
the permutations and kind of like the
combinations of emotions stitched
together sometimes surprised us because
you see them in isolation but when you
actually see them and you see them live
you know relative to some event that
happened in the game or whatnot like it
was kind of cool to see the combination
of the two and um uh and not too
different in other robotics applications
where like you get you get so used to
thinking about like the modules of a
system and how things progress through a
tech stack
that the real magic is when all the
pieces come together and you start
getting
the right emergent behavior um in a way
that's easy to lose when you just kind
of go too deep into any one piece of it
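
a minimal sketch of the kind of emotion engine described above, assuming a small library of animation variants per emotional state. the class name, the state names and the animation names are all hypothetical and are not taken from the actual cosmo software. it picks a variant at random but within bounds (no immediate repeats, less-played variants favored) and it tracks how heavily each state is being hit so you can see where more animations would be worth adding.

```python
import random
from collections import Counter

# Hypothetical library: each emotional state maps to several animation variants.
LIBRARY = {
    "win": ["happy_dance", "fist_pump", "spin"],
    "lose": ["sulk", "tantrum"],
}


class EmotionEngine:
    """Illustrative only: bounded randomness plus usage statistics."""

    def __init__(self, library, seed=None):
        self.library = library
        self.rng = random.Random(seed)
        self.plays = Counter()  # (state, animation) -> times played
        self.last = {}          # state -> animation played most recently

    def react(self, state):
        options = self.library[state]
        # Avoid repeating the previous animation when an alternative exists.
        candidates = [a for a in options if a != self.last.get(state)] or options
        # Favor variants that have been played less often so far.
        weights = [1.0 / (1 + self.plays[(state, a)]) for a in candidates]
        choice = self.rng.choices(candidates, weights=weights, k=1)[0]
        self.plays[(state, choice)] += 1
        self.last[state] = choice
        return choice

    def saturation(self):
        # Average plays per available variant for each state.
        # A high value suggests that state needs more animations.
        totals = Counter()
        for (state, _), count in self.plays.items():
            totals[state] += count
        return {s: totals[s] / len(v) for s, v in self.library.items()}


engine = EmotionEngine(LIBRARY, seed=0)
sequence = [engine.react("win") for _ in range(6)]
print(sequence)
print(engine.saturation())
```

the weighting rule here is just one simple choice for illustration. the point is only that variety can come from a finite parametrized library as long as you measure which parts of it are getting worn out.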
yeah when the system is sufficiently
complex there is something like emergent
behavior and that's where the magic is
you as a human being you can still
appreciate the beauty of that magic of
the fine at the system level first of
all thank you for humoring me on this uh
it's really really
uh fascinating i think a lot of people
would love this i i'd love to just one
last thing on the butter robot i promise
in terms of uh speech yeah
cosmo is able to communicate so much
with just movement and face
do you think speech
is too much of a degree of freedom
like is speech a feature or a bug of uh
deep
uh interaction
emotional interaction yeah
for a product
it's too deep right now it's just not
real uh it would immediately break the
fiction because the state of the art is
just not good enough um and
that's on top of just narrowing down the
demographic where like the way you speak
to an adult versus the way you speak to a
child is very different
yet a dog is able to appeal to everybody
and so right now there is no speech
system that is like rich enough and and
subtly realistic enough to feel
appropriate um and so we very very
quickly kind of like moved away from it
now speech understanding is a different
matter where understanding intent that's
a really valuable input um
but
giving it back requires like a you know
way way higher bar given kind of where
today's
world is and so
that realization that you can do
surprisingly much with
uh either no speech or kind of tonal
like the way you know wally r2d2 and
kind of other characters are able to um
it's quite powerful and it generalizes
um across cultures and across ages
really really well i think we're going
to be in that
world for a little while where it's
still very much an unsolved problem on
how to like make something it touches on
the uncanny valley thing so if you have legs
and you're a big humanoid looking thing
you have very different expectations and
a much narrower degree of what's going
to be acceptable by society than if
you're a you know robot like uh like
cosmo or wally or some other
form where you can kind of like reinvent
the character speech has that same
property where speech is so well
understood um in terms of expectations
by humans that you have far less
flexibility on how to deviate from that
and lean into your strengths and avoid
weaknesses but i wonder if there is
obviously there's certain kinds of
speech that
activates the uncanny valley and breaks
the illusion faster so
i guess my intuition is
we will solve
certain we would be able to
create some speech-based personalities
sooner than others so for example i
could i could think of
a robot that doesn't know english and is
learning english right yeah those kinds
of personalities where you're like uh
you're intentionally kind of like
getting a toddler level of uh speech so
that's exactly right so you can have
like
uh
tie it into the experience where uh it
is a more limited character or you
embrace the lack of emotions as part or
the lack of sorry dynamic range in the
speech kind of capabilities emotions as
like part of the character itself and
you've seen that in like kind of
fictional characters as well yeah um but
that's why this podcast works and
yeah like you kind of had that with like
um i don't know i guess like you know
data and some of the other yeah
like um but yeah so you have to and that
becomes a constraint that lets you
meet the bar um see i i honestly think
like also if you add uh drunk
and angry
that gives you more constraints that
allow you to be
dumber from an nlp perspective like
there's certain aspects so if you modify
human behavior like let's just so forget
the sort of artificial thing where you
don't know english toddler thing
we if you just look at the full range of
humans
i think we there's certain
situations where we put up
with uh like lower level of intelligence
in our communication like if somebody's
drunk we understand this issue that
they're probably under the influence
like we understand that they're not
going to be making any sense anger is
another one like that i'm sure there's a
lot of other kind of situation
yeah maybe uh yeah again language lost
in translation
that kind of stuff that i think if you
if you play with that
uh what is it the ukrainian boy that
passed the turing test you know play
with those ideas i think that's really
interesting and then you can create
compelling characters but you're right
that's a dangerous sort of road to walk
because uh you're adding degrees of
freedom that can get you in trouble yeah
and that's why like you have these um
big pushes that like for most of the
last decade plus like where you'd have
like
full like
human replicas of robots really being
down to like skin and like kind of in
some places um
my personal feeling is like man like
that's not the direction that's most
fruitful right now um beautiful art yeah
it's not in terms of a
uh rich deep fulfilling experience yeah
you're right yeah and in a way creating a
minefield of potential places to feel
off uh
and then and then you're sidestepping
where like the biggest kind of
functional ai challenges are to actually
have
you know kind of like really rich
productivity that actually kind of
justifies a you know kind of the higher
price points and that's that's part of
the challenges like yeah like robots are
going to get to like thousands of
dollars tens of thousands of dollars and
so forth but you can imagine what sort
of expectation of value that comes with
it um and so that's where
you want to be able to invest the
the the time and uh and depth and so
going down the full human replica route
um
creates a gigantic uh
uh
distraction and really really high bar
that can end up sucking up so much of
your resources
so it's weird to say but you happen to
be one of the greatest at this point
roboticist ever because you
created this little guy you were part
obviously of a great team that created
the the little guy with a deep
personality
and they're now
switching to
an entirely well maybe not entirely but
a different
fascinating impactful robotics problem
which is autonomous driving and more
specifically the biggest version of
autonomous driving which is autonomous
trucking
so you are at waymo now can you give us
a big picture overview what is waymo
what is waymo driver what is waymo one
what is waymo via
can you give an overview of the company
and the vision behind the company for
sure waymo by the way it's just it's
been eye-opening on just how incredible
that the people and the talent is and
how in one company you almost have to
create i don't know 30 companies worth
of like technology and capability to
like kind of solve the full spectrum of
it so um
yeah so i've been at waymo since um
2019 so about two and a half years so
waymo is uh focused on building what we
call a driver which is
creating the ability to have autonomous
driving across different environments
vehicle platforms domains and use cases
uh
you know as you know got started in uh
2009 it was a lot almost like an
immediate successor to the grand
challenge and urban challenges that were
like incredible uh kind of catalyst for
this whole space um and so google
started this project and then eventually
waymo spun out and so what waymo is
doing is creating uh the
systems both you know hardware software
infrastructure and everything that goes
into it to enable and to commercialize
autonomous driving this hits on consumer
transportation and ride sharing and kind
of vehicles and urban environments
and as you mentioned it hits on
autonomous trucking to
to transport goods so in a lot of ways
it's transporting people and
transporting goods um but at the end of
the day the underlying capabilities are
required to do that are surprisingly
better aligned than one might expect
where it's the fundamentals of um of
being able to understand the world
around you process it make intelligent
decisions and prove that we are at a
level of safety that enables uh
large-scale autonomy so from a branding
perspective sort of uh waymo driver is
the system that's irrespective of a
particular
uh
vehicle it's operating in there you have
a set of sensors that perceive the world
can act in that world and move this
whatever the vehicle is what's that
vehicle platform that's right and so in
the same way that you have a driver's
license and like your ability to drive
isn't tied to a particular make and
model of a car and of course there's
special licenses for other types of
vehicles but the fundamentals of a human
driver very very largely carry over
and then there's uniquenesses related to
a particular environment or domain or a
particular um vehicle type that kind of
add some extra additive challenges but
that's exactly right it's the underlying
systems that enable uh
a physical vehicle without a human
driver to uh very successfully
accomplish the tasks that previously um
what wasn't possible um without um you
know 100 percent human driving
and then there's
waymo one which is the transporting
people that's right from a brand
perspective and just in case we refer to
it so people know and then there's waymo
via which is the trucking component why
via by the way what is that what is that
what's is it just like a cool sounding
name that just yeah uh like is there
does there an interesting story there
just it is a pretty cool sounding name
it's a cool sounding name i mean when
you think about it it's just like well
we're gonna transport it via this and
that like so it's just kind of like an
allusion to um the mechanics of
transporting something yes cool um and
uh and it is a pretty good grouping and
the interesting thing is that even the
groupings kind of blur where waymo one
is like human transportation and uh
there's a fully autonomous service in
the phoenix area that like every day is
transporting people and it's pretty
incredible to like just you know see
that operate at reasonably large scale
and just kind of happen and then on the
via side it doesn't even have to be
like long-haul trucking is a like a
major focus of uh of ours but down the
road you can stitch together the vehicle
transportation as well for local
delivery um also and a lot of the
requirements for local delivery overlap
very heavily with consumer
transportation um
obviously uh you know given that you're
operating on a lot of the same roads um
and uh
and navigating the same safety
challenges and so um
yeah and waymo very much is a
multi-product company that
has
ambitions in both they have different
challenges and both are tremendous
opportunities but the cool thing is is
that there's a huge amount of leverage
and this kind of core technology stack
now gets pushed on by both sides and
that adds its own unique challenges but
the success case is that um the
challenges that you push on um they get
leveraged across all platforms and also
from an engineering perspective the
teams are integrated
it's a mix so there's a huge amount of
centralized kind of core teams that
support all applications and so you
think of something like the hardware
team that develops the lasers the
compute integrates into vehicle
platforms this is an experience that
carries over across um you know any
application that we'd have and they have
been flow with both then there's like
really unique um perception challenges
planning challenges like other you know
types of challenges where there's a huge
amount of leverage on a core tech stack but
then there's like dedicated teams that
think of how do you deal with a unique
challenge for example
an articulated trailer with varying
loads that completely changes the
physical dynamics of a vehicle that
doesn't exist on a car but becomes one
of the most important
kind of unique new challenges on a truck
so
what's the long-term dream
of waymo via
uh the autonomous trucking effort that
waymo is doing yeah so we're starting
with developing
uh
l4 autonomy for
class 8 trucks these are 53-foot
trailers that
capture like a big perc a pretty sizable
percentage of the goods transportation in
the country
long term the opportunity is obviously
to expand to much more diverse types of
vehicles
types of goods transportation and start
to really expand in both the volume and
the route feasibility that's possible
and so just like we did on the car side
you start with
a single route with a very specific
operating kind of domain and constraints
that allow you to solve the problem but
then over time you start to really try
to push
against those boundaries and open up
deeper feasibility across routes across
surface streets across environmental
conditions across the type of goods that
you carry the versatility of those goods
and how
little supervision is necessary to just
start to scale this network
and
long term there's actually it's a pretty
incredible enabler where um
you know today you have
already a giant shortage of truck
drivers it's uh over 80 000 truck driver
shortage that's expected to grow to
hundreds of thousands in the years ahead
you have
really really quickly increasing demand
from e-commerce and just just
distribution of uh where people are
located
um you have one of the deepest safety
challenges of um
of any profession in the u.s where um
there's a
huge huge kind of challenge around
fatigue and around kind of the long
routes that are driven
and even beyond kind of the cost and
necessity of it
there are fundamental constraints built
into our logistics network that are tied
to
the type of human constraints and
regulatory constraints that are
tied to trucking today for example our
limits on how long a driver can be
driving in a single day
before they're they're not allowed to
drive anymore which is a very important
safety constraint
what that does is it enforces
limitations on how far
jumps with a single driver could be and
makes you very subject to availability
of drivers which influences where
warehouses are built which influences
how goods are transported which
influences costs and so um
you start to have an opportunity on
everything from plugging into existing
fleets and brokerages and the existing
logistics network and just immediately
start to have a huge opportunity to add
value from
you know
cost and driving fuel insurance and
safety standpoint all the way to
completely reinventing the logistics
network um across the united states and
enabling something completely different
than what it looks like today yeah i had
uh
it'll be published before this had a great
conversation with steve viscelli who we
talked about the manual driving and he
echoed many of the same things that you
were talking about but we talked about
much of the
the fascinating human stories of truck
drivers he was also was a truck driver
for
for a bit as a grad student to try to
understand the depth of the problem he's
a fascinating wives
we have some drivers that have 4 million
miles of lifetime driving experience
it's pretty incredible and um
yeah it's uh
yeah learning from them like some of
them are on the road for 300 days a year
it's a very unique type of lifestyle so
there's fascinating stuff there just
like you said there's a shortage of
actually
people uh truck drivers
taking the job counter to what i
think is publicly believed
so there's an excess of jobs and a
shortage of people to take up those jobs
and just like you said it's such a
difficult problem
and these are experts at driving at
solving this particular problem and it's
fascinating to learn from them to
understand
you know how hard is this problem and
that's the question i want to ask you
from a perception from a robotics
perspective
what's your sense of how difficult is
autonomous trucking maybe
you can comment on which scenarios are
super difficult which are more
manageable is there is there a way to
kind of convert into words
how difficult the problem is yeah it's a
good question so there's um
and as you can expect it's a mix some
things become a lot uh uh
a lot
easier or at least more flexible um some
things are harder and so you know on the
things that are like uh the tailwinds
the benefits um
a big focus of
automating trucking especially initially
is really focusing on the long-haul
freeway stretch of it where that's where
a majority of the value is captured on a
freeway you have a lot more structure
and a lot more consistency across
freeways across the u.s
compared to surface streets where you
have a way higher dimensionality of what
can happen lack of structure lack of
consistency and variability across
cities so you can leverage that
consistency to
tackle at least in that respect a more
constrained ai problem which has some
benefits to it um you can itemize much
more of the sort of things you might
encounter and so forth and so
those are benefits is there a canonical
freeway
and city we should be thinking about
like
is there is there a standard thing
that's brought up in conversation often
like here's a stretch of road
um what is it like when people talk
about traveling across country they'll
talk about
new york to san francisco
is that the route like is there a
stretch of road that's like nice and
clean
and then there's like cities with
difficulties in them that you kind of
think of as the canonical problem to
solve here right uh so starting with the
car side um
well waymo very intentionally picked the
phoenix area and the san francisco area
as a follow-on once we hit driverless where
when you think of consumer
transportation and ride sharing you know
kind of economy a big percentage of that
market is captured in the densest cities
in the united states and so really
pushing out and solving san francisco
becomes a really huge opportunity and uh
importance and um
and you know places one dot on kind of
like the spectrum of like kind of
complexity uh the phoenix area starting
with chandler and then like kind of
expanding more broadly in the phoenix uh
metropolitan area it's i believe the
fastest growing city in the us it's a uh
kind of a higher medium-sized city but
growing quickly
and still captures a really wide range
of kind of like complexities and so
getting to driverless there actually
exposes you to a lot of the building blocks
you need for the more complicated
environments and so in a lot of ways
there's a thesis that if you start to
kind of place a few of these kind of
dots where san francisco has these types
of unique challenges dense pedestrians
all this like complexity especially when
you get into the downtown areas and so
forth and phoenix has like
a really interesting kind of spectrum of
challenges maybe you know other ones
like la kind of add freeway focus and so
forth you start to kind of cover the
full set of features that you might
expect and it becomes faster and faster
if you have the right systems in the
right
organization to then open up the fifth
city and the tenth city and the 20th city on
trucking there's uh similar properties
where um obviously there's uniquenesses
and freeways when you get into really
dense environments and then
the real opportunity uh to then you know
get even more uh value is to think about
how you expand with like some of the
surface street challenges but for
example right now we're looking um we
have a big facility that we're uh
finishing building in q1 in uh dallas
area um that'll allow us to do testing
from the dallas area on routes like
dallas to houston dallas to phoenix um
going out east and dallas to austin
austin so that triangle um waymo should
come to austin
well waymo the car side was in austin
for a while yes i know yeah come back
yeah but uh trucking is actually texas
is one of the best places to start uh
because of both volume regulatory
weather there's a lot of benefits um on
trucking a huge opportunity is port of
la going east so in a lot of ways a lot
of the work is to start to stitch
together a network and converge to
port of la where you have the biggest
port in the united states um and the
amount of goods going east from there is
pretty tremendous and then obviously
there's you know kind of channels
everywhere and you have extra
complexities as you get into like snow
and inclement weather and so forth but
um what's interesting about trucking is
every single route segment that you add
increases the value of the whole network
and so it has this kind of network
effect and cumulative effect that's very
unique and so there's all these
dimensions that we think about um and so
in a lot of ways dallas has a really
unique hub that opens up a lot of
options has become a really valuable
lever so the million questions i get
asked first of all
you mentioned level four
for people who totally don't know
there's these levels of automation
that uh level four refers to uh
kind of the first step that you could
recognize as fully autonomous driving
level five is really fully autonomous
driving level four is kind of fully
autonomous driving and then there are
specific definitions depending on who
you ask what that actually means but
for you what does the level four mean
and you mentioned freeway let's say like
there's three parts of long-haul
trucking maybe i'm wrong in this but
there's freeway driving
there's like
truck stop
and then there's
more urban-y type of area so which of
those do you want to tackle
which of them do you include under level
four like how do you think about this
problem what do you focus on where's the
biggest impact to be had in the short
term
so the goal is to we get we got to get
to market as fast as we can because the
moment you get to market you just learn
so much and it influences everything
that you do and it is um
uh i mean one of the experiences that
carried over from before is that you add
constraints you figure out the right
compromises you do whatever it takes
because getting to market like is so
critical right and here with autonomous
driving you can get to market in so many
different ways that's right and so one
of the simplified simplifications that
we intentionally have put on is using
what we call transfer hubs where you can
imagine
depots uh that are
uh
at the entry points to metropolitan
areas like let's say dallas like the hub
that we're building which does a few
things that are very valuable so from a
first product standpoint you can
automate transfer hub to transfer hub
and that path from the transfer hub to
the
you know the full freeway route can be a
very intentional single route that you
can select for the features that you
feel you want to handle at that point in
time then you build the hub
specifically designed
for autonomous trucking and that's what's
going to happen actually like and you
get you need to come out in january and
check it out because it's going to be
really cool it's the not only is it our
main operating um headquarters for our
fleet there but it will be the first uh
fully ground-up design driverless hub
for autonomous drivers autonomous trucks
in terms of where do they enter where do
they depart how do you think about the
flow of people goods everything it's
like it's quite cool and it's really
beautiful on how it's thought through
and so early on it is totally reasonable
to do the last
five miles manually to get to the final
kind of depot to avoid having to solve
the general surface street problem which
is obviously very complex now when the
time comes and we are increasingly we're
already we're pushing on some of this
but we will increasingly be pushing on
surface street capabilities to build out
the value chain to go all the way deeper
to depot instead of transfer hub to
transfer hub and we have probably the
best advantages in the world because of
all the waymo experience on surface
streets but that's not the highest roi
right now where the highest roi is
hub to hub and get the routes going and
so when you ask what's l4
l4 can be applied to any domain
operating domain or scope but it's
effectively for the places where we say
we're ready for autonomous operation we
are 100
operating uh with uh through the as a
self-driving truck with no uh human
behind the wheel that is l4 autonomy and
it doesn't mean that you operate in
every condition it doesn't mean you
operate on every road but for a
particularly well-defined area uh
operating conditions routes kind of
domain you are fully autonomous and
that's the difference between l4 and l5
and most people would agree that at
least any time in the foreseeable future
l5 is just not even really worth
thinking about because there's always
going to be these extremes
and so it's a race and a almost like a
game where you think of what is the
sequence of expanded capabilities that
create the most value and teach us the
most and create this feedback loop where
we're building out and unlocking more
and more capability over time i gotta
ask you just curious so first of all i
have to when i'm allowed to visit the
dallas facility because it's super cool
it's like robot
on the giving and the receiving end it's
the truck is a robot and the the hub is
a robot yeah it's got to be very robot
friendly so yeah that's great
i will feel at home uh
the what's the sensor suite like on the
hub if you can just high level mention
it is
does the hub have like lidars and like
is is it is the truck doing most of the
intelligence or is the hub also
intelligent yeah so most of it will be
the truck and uh everything is like
connected like so we uh we have our
servers where we know exactly where
every truck is we know exactly what's
happening at a hub and so you can
imagine like a large back-end system
that over time starts to manage uh
timings goods delivery windows all these
sort of things and so you don't actually
uh
need to um there might be special cases
where that is valuable to equip some
sensors in the hub but a majority of the
intelligence is going to be on the truck
because um
whatever is relevant to the truck
relevance should be seen by the truck
and can be relayed
uh remotely for any sort of kind of
cognizance or decision making but
there's a distinct type of workflow
where um where do you check trucks where
do you want them to enter what if
there's many operating at once where's
the staging area to depart how do you
set up the flow of humans and human cars
and traffic so that you minimize the
interaction between humans and kind of
self-driving trucks uh and then how do
you even intelligently select the
locations of these transfer hubs that
are both really great service locations
for a metropolitan area and there could
be over time many of them for a
metropolitan area
while at the same time leaning into
the path of least resistance to lean
into your current capabilities and
strengths so that you minimize the
amount of work that's necessary to
unlock the next kind of big bar i have a
million questions so first is the goal
to have no human in the truck
the goal is to have no human in the
truck now of course right now we're
testing with expert operators and so
forth but um the goal is to um now there
might be circumstances where it makes
sense to have a human or uh and and
obviously these trucks can also be
manually driven so sometimes like our we
talk with our fleet partners about how
um you can buy a waymo equipped daimler
truck down the road and on the routes
that are autonomous it's autonomous on
the routes that are not it's um human
driven maybe there's l2 functionality
that add safety systems and so forth but
as soon as they become
as soon as we expand in software the
availability of driverless routes the
hardware is forward compatible to just
now start using them um in real time and
so you can imagine uh this mixed use but
at the end of the day the largest value
proposition is where you're um able to
have no constraints on how you can
operate this truck um and it's 100 percent
autonomous with nobody inside oh that's
amazing so the
let me ask on the logistics front
because you mentioned that also
opportunity to revamp or build from
scratch some of the ideas around
logistics
i don't want to throw too much shade but
from talking to steve my understanding
is
logistics is not perhaps as great as it
could be in the current uh trucking uh
environment i'm not maybe you can break
down why but there's probably competing
companies
there's just a mess maybe some of it is
literally just it's old school like they
it's just like it's not computer it's
not computerized
like
truckers are almost like contractors
there there's an independence and
there's not a nice interface where they
can communicate where they're going
where they're at
you know all those kinds of things and
so there it just feels like there's so
much opportunity to digitize everything
to where you could optimize the use of
human time optimize the use of all kinds
of resources
how much you thinking about that problem
how fascinating is that problem
how difficult does it how much
opportunity is there to revolutionize
the space of logistics in autonomous
trucking in trucking period it's pretty
fascinating it's uh this is one of the
most motivating aspects of all this
where like yes there's like a mountain
of problems that are like you wanna you
have to solve to get to like the first
checkpoints and first driverless and so
forth and inevitably like in a space
like this you plug in initially into the
existing kind of system and start to
kind of you know learn and iterate but
um that opportunity is massive and so
you know a couple of the factors that um
play into it so first of all um there's
obviously just the physical constraints
of driving time driver availability
some fleets have a 95 percent attrition rate you
know right now because of just
this demands and like you know kind of
gaps in competition and so forth and
then it's also incredibly fragmented
where
you would be shocked at like when you
when you look at industries like when
you think of the top 10 players like the
biggest fleets like the walmarts and
fedexes and so forth the percentage of
the overall trucking market that's
captured by the top 10 or 50 fleets is
surprisingly small um the average kind
of uh truck operation is like a one to
five truck you know family business um
and so and so there's just like a huge
amount of like
fragmentation which makes for um
really interesting challenges in kind of
stitching together through like bulletin
boards and brokerages and some people
run their own fleets and and this
world's kind of like evolving um but
it is one of the less digitized and
optimized worlds that there is
and the part that is optimized is
optimized to the constraints of today
and even within the constraints of today
this is a 900 billion dollar industry
in the u.s
and it's continuing to grow it feels
like from a business perspective
if i were to predict
that while trying to solve the
autonomous trucking problem waymo might
solve first the logistics problem
like because that that would already be
a huge impact yeah so on the way to
solving autonomous trucking
the human driven like there's so much
opportunity to
significantly improve the human driven
trucking the timing the logistics so you
use humans optimally the handoffs the
like you know well even that you i mean
you get really ambitious you start to
expand this beyond like how does the uh
fulfillment center work and like how
does the transfer hub work how does a
warehouse work to
i mean there's a lot of opportunities to
start to automate these chains and um
a lot of the inefficiency today is
because like you have a delay like port
of la has a bunch of ships right now
waiting outside of it because they can't
dock because there's not enough
labor inside of the port of la that
means there's a big backlog of trucks
which means there's a big backlog of
deliveries which means the drivers
aren't where they need to be and so you
have this like huge chain reaction and
your feasibility of readjusting in this
network is low because everything's tied
to humans and manual kind of processes
uh or distributed processes across a
whole bunch of players
and so
one of the biggest enablers is um yes we
have to solve autonomous trucking first
and that by the way that's not like an
overnight thing that's decades of
continued kind of expansion and work
but um the first checkpoint in the first
route is like is not that far off but
once you start enabling and you start to
learn about how
the
constraints of autonomous trucking which
are very different than the constraints of
human trucking and again strengths and
weaknesses
how do you then start to leverage that
and
rethink a flow of goods uh more broadly
and this is where like the learnings of
like really partnering with some of the
largest fleets in the us
and the sort of
learnings that they have about the
industry and the sort of needs that they
have and
what would change if you just
like really broke this one constraint
that like holds up the whole network or
what if you enabled this other
constraint
that actually drives the roadmap in a
lot of ways because um this is not like
an all or nothing problem it's uh you
know you start to kind of unlock more
and more functionality over time which
functionality most enables this
optimization ends up being kind of part
of the discussion but you're totally
right like you fast forward to like you
know five years ten years uh 15 years
and
you think about like very generalized
capability of automation and logistics
as well as the ability to like poke into
how those handoffs work
the efficiency goes far beyond just
direct cost of today's like unit
economics of a truck they go towards
reinventing the entire system um in the
same way that uh you know you see you
know these other industries that uh like
when you get to enough scale you can
really rethink um how you build around
your new set of capabilities not
the old set of capabilities yeah use the
analogy metaphor or whatever that
autonomous trucking is like email versus
mail and then with email you're still
doing the communication but it opens up
all kinds of comm
varieties of communication that
you didn't anticipate that's right
constraints are just completely
different um and yeah there's definitely
a property of that here um and we're
also still learning about it because
there there is a lot of really um
fascinating and sometimes really elegant
things that the industry has done where
there's companies whose entire existence
is around despite the constraints
optimizing as much as they can out of it
and those lessons do carry over but it's
an interesting kind of merger of worlds
to think about like well what if this
was completely different how would we
approach it
and the interesting thing is that
for a really really really long time
it's actually going to be the merger
between how to use autonomy and how to
use humans that leans into each each of
their strengths
yeah and then we're back to cozmo
human robot interaction so and the
interesting thing about waymo is because
there's the passenger vehicle the the
human the transportation of humans and
transportation of goods
you could see over time they might kind
of
meld together more
because you you'll probably have like
zero occupancy vehicles moving around so
you have transportation goods for short
distances and then
for slightly longer distances and then
slightly longer and then there'll be
this then you just see the difference
between a passenger vehicle and a truck
is just size and you can have different
sizes and all that kind of stuff and at
the core you can have a waymo driver
that doesn't as long as you have the
same sensor suite you can just think of
it as one problem and that's why over
time these do come kind of converge
where in a lot of ways a lot of the
challenges we're solving are freeway
driving which are going to carry over
very well to the vehicles to the car
side
um but there are like then unique
challenges like uh you have a very
different dynamics in your vehicle where
you have to see much further out in
order to have the proper like response
time because you have an 80,000 pound
fully loaded truck um that's a very very
different type of braking profile than a
than a car you have uh
really interesting kind of dynamic
limits because of the trailer where you
actually it's very very hard to like
physically like flip a car or do
something like physically like most risk
in a car is from just collisions um
it's very hard to like in any normal
operation to do something other than
like you know unless you hit something
it's actually kind of like roll over or
something on a truck you actually have
to drive much closer to the physical
bounds of the safety limits um
but you actually have like
real constraints because you could uh
you know you could have a really
interesting interactions between the
cabin and the trailer yeah there's
something called jackknifing if you turn
you know too quickly
you have roll risks and so forth and so
we spend a huge amount of time
understanding those boundaries and those
boundaries change based on the load that
you have which is also an interesting
difference you have to propagate
that through the algorithm so
that you're leveraging your dynamic
range but always staying within the
safety bounds but understanding what
those safety bounds are and so we have
this like really cool test facility
where we like take it to the max and
actually imagine a truck with these
giant training wheels on the back of the
trailer and you're pushing it past the
safety limits uh in order to like try to
actually see where it rolls and so you
you you define this high dimensional
boundary which then gets captured in
software to stay safe and actually do
the right thing but uh it's kind of
fascinating the sort of uh you know kind
of challenges you have there um but then
all of these things drive really
interesting challenges from perception
to um
unique behavior prediction challenges
and obviously in planner where you have
to think about merging and creating gaps
with a 53 foot trailer and so forth and
then obviously the platform itself is
very different where you have different
numbers of sensors sometimes types of
sensors and you also have unique blind
spots that you have because of the
trailer which you have to think about
and so it's a really interesting
spectrum and in the end
you try to capture these special cases
in a way that is clean augmentations
of the existing tech stack
because a majority of what we're solving
is actually generalizable to freeway
driving um and different platforms and
over time
they all start to kind of merge ideally
where the things that are unique are as
as minimal as possible and that's where
you get the most leverage and that's why
waymo can do
you know take on two trillion dollar
opportunities um and
have it be nowhere near 2x the cost or
investment or size in fact it's much
much smaller than that
because of the high degree of leverage
so what kind of sensor suite
if you can speak to that uh that a long
haul truck needs to have lidar vision
how many what are we talking about here
yeah so it's um more than the cars so
very loosely you can think of it as like 2x
but it varies depending on the sensor
and so we have like dozens of cameras
radar and then multiple lidar as well
you'll see one difference where the cars
have a central main sensor pod on the
roof in the middle and then some kind
of hood sensors for blind spots the
truck moves to two main sensor pods on
the outsides where you would typically
have the mirrors next to the driver
they effectively go as far out as
possible um kind of up to
the understanding of the front
kind of on the cabin not all the way in
the front but like kind of where the
mirrors for the driver would be and so
those are the main sensor pods and the
reason they're there is because if you
had one in the middle the trailer is
higher than the cabin and you would be
occluded with this like awkward wedge
too much occlusion too much occlusion
and so then you would add a lot of
complexity to the software yeah to make
up for that and and just unnecessary
components so many probably fascinating
design choices really cool because you
can probably bring up lidar higher
and have it in the center or something
you could have all kinds of choices you
have to make the decisions here yeah
that ultimately probably will define the
industry right but by having two on the
side there's actually multiple benefits
so one is like um you're just beyond the
trailer so you can see fully flush with
the trailer and so you eliminate most of
your blind spot except for right behind
the trailer um which is which is great
because now the software carries over
really well and the same perception
system you use on the car side largely
that architecture can carry over and you
can retrain some models and so forth but
you leverage it a lot it also actually
helps with redundancy where
there's a really nice built-in
redundancy for all the lidar cameras and
radar where you can afford to have any
one of them fail and you're still okay
and at scale every one of them will fail
um
and you will be able to detect when one
of them fails because they don't uh
because the redundancy
they're giving you the data that's
inconsistent with the rest of that's
right and it's not just like they no
longer give data it could be like
they're fouled or they stop giving data
where the
some electrical thing gets cut or you
know part of your compute goes down so
what's neat is that like you have way
more sensors part of it is field of view
and occlusions part of it is redundancy
and part of it is new use cases so
there's um uh new types of sensors uh to
optimize for long range and uh kind of
the the the sensing horizon that we look
for on our vehicles um that is unique to
trucks because it actually is like kind
of much like further out than um than a
car but a majority are actually used
across both cars and trucks and so we
use the same compute the same uh
fundamental baseline sensors cameras uh
radar um imus and so you get a great
leverage from all of the infrastructure
and the hardware development as a result
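The redundancy point above, that a failed or fouled sensor shows up as data inconsistent with the rest, can be illustrated with a toy cross-check. This is only a sketch under stated assumptions: the sensor names, the idea of one range reading per sensor for the same object, and the median-with-tolerance rule are all hypothetical, and nothing in the conversation says this is how waymo does it.

```python
from statistics import median
from typing import Dict, List, Optional


def flag_inconsistent(readings: Dict[str, Optional[float]],
                      tolerance_m: float) -> List[str]:
    """Return names of sensors that are silent or disagree with the consensus.

    readings maps a (hypothetical) sensor name to its range estimate in meters
    for the same object, or None if the sensor returned no data at all.
    """
    valid = {name: value for name, value in readings.items() if value is not None}
    # A silent sensor is always flagged (for example a cut cable).
    flagged = [name for name, value in readings.items() if value is None]
    if valid:
        # The median tolerates a single bad reading, so it serves as consensus.
        consensus = median(valid.values())
        flagged += [name for name, value in valid.items()
                    if abs(value - consensus) > tolerance_m]
    return sorted(flagged)


example = {
    "lidar_left": 50.2,
    "lidar_right": 49.9,
    "radar_left": 50.5,
    "camera_left": 71.0,   # fouled lens: still reports, but the value is off
    "radar_right": None,   # stopped reporting entirely
}
print(flag_inconsistent(example, tolerance_m=2.0))
```

The sketch covers both failure modes mentioned in the conversation: a sensor that stops giving data and a sensor that keeps giving data that no longer agrees with its neighbors.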
so what about cameras what role does so
lidar is this rich set of information
has its strengths um has some weaknesses
camera is this rich source of
information that has some strengths has
its weaknesses
what role does lidar play what role does
vision
cameras play
in this in this beautiful
problem of autonomous trucking ah it is
beautiful there's like so much that
comes together
and how much yeah at which point do they
come together yeah
so let's start with lidar so lidar has
been like waymo's um uh one of waymo's
big strengths and advantages where uh we
developed our own lidar uh in-house
we're many generations in both in cost
and functionality it is um uh the best
and you know in this in the space which
generation because i know there's this
there's uh this cool i mean i love
versions that are increasing uh which
version of the hardware stack is at
currently uh officially publicly uh so
uh so some parts iterate more than
others i'm trying to remember on the
sensor side so this the entire
self-driving system which includes
sensors and compute is fifth generation
yes um i can't wait until there's like
iphone style like announcements yeah for
like new versions of the waymo
hardware yeah well we try to be careful
because man when you change the hardware
it takes a lot to like retrain the
models and uh and everything so we just
went through that and going from the
pacificas to the jaguars and so the
jaguars and then the trucks are you know
have the same generation now um but yeah
the lidar is uh it's incredible and so
waymo has um leaned into that as a
strength and so a lot of the near-range
perception system
that obviously kind of carries over a
lot from the car side uh uses lidar as a
very prominent kind of like primary
sensor but then obviously
everything has its strengths and
weaknesses and so in the near range
lidar is a gigantic advantage um and it
has its weaknesses on you know when it
comes to occlusions in certain areas
rain and weather like you know things
like that but it's an incredible sensor
and it gives you incredible density
perfect location precision and
consistency which is a very valuable
property um to be able to uh to kind of
apply an ml approach can you elaborate
on consistency yeah when you have a camera
the position of the sun the time of the
day uh um various other properties can
have a big impact uh whether there's
glare the field of view things like that
um
so consistent the signal with uh
in the face of a changing external
environment the signal yeah daytime
night time
it's about 3d um
physical existence in effect like you're
you're seeing
beams of light that bounce physically
bounce off of something and come back
and so whatever the
conditions are like the shape of a
human
sensor reading from a human or from a
car or from an animal like you have um a
reliability there which ends up being
valuable for kind of like the long tail
of challenges yeah now
lidar is the first sensor to drop off in
terms of range and ours has a really
good range but at the end of the day um
it drops off and so particularly for um
for trucks on top of the general
redundancy that you want for near range
with and complements through cameras and
radar for occlusions and for
complementary information and so forth
when you get to long range you have to
be radar and camera primary because your
lidar data will fundamentally drop off
after a period of time and you have to
be able to see um kind of objects
further out now uh cameras have uh
the the incredible range um where you
get a high density high resolution
camera you can get data you know well
past a kilometer and it's like really um
potentially a huge value now the signal
drops off the noise is higher detecting
is harder classifying is harder and one
that you might think about localizing
is harder because you can be off by
like
two meters in where something's located
a kilometer away and that's the
difference between being on the shoulder
and being in your lane and so you have
like interesting challenges there that
you have to solve which have a bunch of
approaches that come into it um radar is
interesting because um
uh
uh because it also has longer range than
um than lidar uh and it gives you speed
information so it becomes very very
useful for dynamic information of
traffic flow uh vehicle motions animals
pedestrians like uh just things that
might be um useful signals um and uh
it helps with weather conditions where
radar actually penetrates weather
conditions in a better way than um other
sensors and so it's just it's kind of
interesting where we've kind of started
to converge towards not thinking about a
problem as a lidar problem or a camera
problem or radar problem but it's a
fusion problem where
these are all like large scale ml
problems where you put data into the
system and in many cases you just look
for the signals that might be present in
the union of all of these and
leave it to the system as much as
possible to start to really identify how
to um how to extract that and then
there's places we have to intervene and
actually
include more but um
no single sensor is in a great position
to like really solve this problem and
then without a huge extra challenge
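The long-range camera point above, that being off by two meters at a kilometer is the difference between the shoulder and your lane, can be turned into a quick geometry check. The sketch below is a back-of-the-envelope calculation only: the function names are hypothetical, and it assumes the error is purely a bearing (angular) error on a flat road.

```python
import math


def bearing_error_deg(lateral_error_m: float, range_m: float) -> float:
    """Angular error that produces a given sideways error at a given range."""
    return math.degrees(math.atan2(lateral_error_m, range_m))


def lateral_error_m(bearing_error_in_deg: float, range_m: float) -> float:
    """Sideways position error caused by a given angular error at a given range."""
    return range_m * math.tan(math.radians(bearing_error_in_deg))


# The figures from the conversation: 2 meters of error at 1 kilometer.
angle = bearing_error_deg(2.0, 1000.0)
print(f"2 m at 1 km corresponds to about {angle:.3f} degrees of bearing error")

# The same angular error matters far less at short range.
print(f"at 50 m that angle is about {lateral_error_m(angle, 50.0):.2f} m sideways")
```

The takeaway is that the angular precision needed to place an object in the right lane at a kilometer is on the order of a tenth of a degree, which is why localization is singled out as harder at long range.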
that's fascinating um
there's a question that's probably still
an open question is at which point do
you fuse them
do you
do
do you solve the perception problem for
each sensor suite individually the
lidar suite and the camera suite or do
you
do some kind of heterogeneous fusion or
do you fuse at the very beginning
is there a good answer or at least an
inkling of intuitions you can accomplish
yeah so people refer to this as like um
early fusion or late fusion so late
fusion might be that you have like the
the camera pipeline the lidar pipeline
and then you like fuse them and like
when it gets to like final
you know semantics and classification
and tracking you like kind of fuse them
together and and figure out which one's
best um there's more and more evidence
that um uh that early fusion is
important um and that is because uh
late fusion does not allow you to pick
up on the complementary strengths and
weaknesses of the sensors um weather is
a great example where um if you do late
fusion you have an incredibly hard
problem for any single sensor in rain to
solve that problem um because you have
reflections from the lidar um you have
uh
you know weird kind of noise from the
camera blah blah blah right but the
combination of all of them can help you
filter and help you get to the real
signal that then gets you as close as
possible to the original stack
and be much more fluid about the
strengths and weaknesses where
um you know your camera is much more
susceptible to like kind of uh fouling
on the on the actual lens from
you know like rain or random stuff
whereas like you might be a little bit
more resilient in other sensors and so
there's an element of
logic that always happens late in the
game but that fusion early on actually
especially as you move towards ml and
large-scale data-driven approaches just
maximizes your ability to pull out the
best signal you can out of each modality
before you start making constraining
decisions that end up being hard to
unwind late in the stack so how much
of this is a machine learning problem
what role does ml machine learning
play in this whole
problem of autonomous driving autonomous
trucking
it's um
massive and it's increasing over time
you know if you go back to um
you know the grand challenge days in the
early days of kind of av development
there was ml but it was not in like kind
of the mass scale data style of ml it
was like
learning models but in a more structured
kind of way and it was a lot of
heuristic and search-based approaches
and planning and so forth you can make a
lot of progress
with these types of approaches kind of
across the board an almost deceptive
amount of progress we can get pretty far
but then you start to really
grind the further you get in some parts
of stack
if you don't have an ability to absorb a
massive amount of experience in a way
that scales very sublinearly in terms of
human labor and human attention and so
when you look at the stack
the perception side is probably the
first to get really revolutionized by ml
and it goes back many years because
ml for like computer vision and these
types of approaches has
kind of took off um was a lot of the
like early kind of push in um in deep
learning and so there's always a debate
on you know the spectrum between kind of
like end to end ml which
you know is a little bit kind of like
too far to how you architect it to where
you have modules but enough ability to
think about long tail problems and so
forth but at the end of the day um you
have
big parts of the system that are very ml and
data driven and we're increasingly
moving that direction all the way across
the board including
behavior
where
even when it's not like
a gigantic ml problem that covers like a
giant swath end to end more and more
parts of the system have this property
where you want to be able to put more
data into it and it gets better
and that has been one of the
realizations as you drive tens of
millions of miles and try to like solve
new expansions of domains without
regressing in your old ones it becomes
intractable for a human to approach that
in the way that traditionally robotics
has kind of approached some elements of
the of the tech stack so are you trying
to um
create a data pipeline specifically for
the trucking problem this is it like how
much leveraging of the autonomous
driving is there in terms of data
collection yeah and
how unique
is the data required for the trucking
problem so we uh we we use all the same
infrastructure um so labeling workflows
ml workflows everything so that actually
carries over quite well um
we heavily reuse the data even where
almost every model that we have on a
truck we started with the latest car
model cool and um so it's almost like a
good background model yeah it's like you
can think of like you despite the
different domain and different numbers
of sensors and position of sensors
there's a lot of signals that carry over
across driving and so it's almost like
pre-training and getting a big boost out
of the gate where you can reduce the
amount of data you need by a lot
and it goes both ways actually and so
we're increasingly thinking about our
data
strategy on how we leverage both of
these
so you think about um you know how other
agents react to a truck yeah it's a
little bit different but the
fundamentals are actually like what will
other vehicles in the road do there's a
lot of carryover that's possible and in
fact
just to give you an example uh we're
constantly kind of like adding more data
from the trucking side but as of right
now
when we think of our like one of our
models behavior prediction for other
agents on the road like vehicles
85 percent of that data comes from cars
and a lot of that 85 percent comes from surface
streets
because we just had so much of it and it
was really valuable and so we're adding
in more and more particularly in the
areas where we need more data but you
get a huge boost out of the gate just
all different visual characteristics of
roads lane markings pedestrians all that
that's still relevant it's all still
relevant and then just the fundamentals
of how you know you detect the car
does it really change that much whether
you're detecting it from a car or a
truck um the fundamentals of how a
person will walk around your vehicle is
it it'll change a little bit but the
basics like there's a lot of signal in
there that as a starting point to a
network can actually be very valuable
now we do have some very unique
challenges where there's a sparsity of
events on a freeway um the frequency of
events happening on a freeway whether
it's you know interesting
you know objects in the road or
incidents or or even like from a human
benchmark like how often does a human
have an accident on a freeway is far
more sparse than on a surface street and
so that leads to really interesting data
problems where
you can't just drive infinitely to
encounter all the different permutations
of things you might encounter and so
there you get into interesting
tools like structured testing and data
collection data augmentation and so
forth and so there's really interesting
kind of technical challenges that
push some of the research um that
enables um these new suites of
approaches what role does simulation
play
really good question so waymo simulates
about a thousand miles for every mile it
drives um so you think of in both so
across the board across the board yeah
uh so you think of for example well if
we've driven you know over 20 million
miles that's over 20 billion miles in
simulation now how do you use simulation
um
it's multi-purpose so
uh you use it for basic development so
you want to make sure you have
regression prevention and protection of
everything you're doing right um that
that's an easy one
when you encounter something interesting
in the world let's say there was an
issue with how the vehicle behaved
versus an ideal human um you can play
that back in simulation and start
augmenting your system and seeing how
you would have reacted to that scenario
with this improvement or this new area
you can create scenarios that become
part of your regression set after that
point right um then you start getting
into like really really high kind of
hill climbing where um you say hey i
need to improve this system i have these
metrics that are really correlated with
final performance how do i know how well
i'm doing
uh operation the actual physical driving
is the least efficient form of testing
and it's expensive it's time consuming
so grabbing a large scale
batch of historical data and simulating
it to get a signal of over these last or
just a random sample of 100,000 miles how
has this metric changed versus where we
are today you can do that far more
efficiently in simulation than just
driving with that new system on board
right
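The hill-climbing idea above, re-simulating a random sample of logged miles to see how a metric changed versus the current system, could be sketched roughly as follows. Everything here is hypothetical: the segment format, the event-counting callables that stand in for a simulator run, and the per-1000-miles metric are assumptions for illustration, not anything described in the conversation.

```python
import random
from typing import Callable, Sequence, Tuple

Segment = Tuple[str, float]  # (segment id, logged miles): a made-up log format


def events_per_1000_miles(segments: Sequence[Segment],
                          count_events: Callable[[Segment], int]) -> float:
    """Rate of some undesirable event per 1000 simulated miles."""
    total_miles = sum(miles for _, miles in segments)
    total_events = sum(count_events(segment) for segment in segments)
    return 1000.0 * total_events / total_miles


def compare_on_sample(segments: Sequence[Segment],
                      baseline: Callable[[Segment], int],
                      candidate: Callable[[Segment], int],
                      sample_size: int,
                      seed: int = 0) -> Tuple[float, float]:
    """Replay the same random sample of logs through both software versions."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    sample = rng.sample(list(segments), sample_size)
    return (events_per_1000_miles(sample, baseline),
            events_per_1000_miles(sample, candidate))


# Toy data standing in for simulator results on four logged segments.
logged = [("seg-a", 10.0), ("seg-b", 20.0), ("seg-c", 30.0), ("seg-d", 40.0)]
baseline_counts = {"seg-a": 1, "seg-b": 0, "seg-c": 2, "seg-d": 1}
candidate_counts = {"seg-a": 0, "seg-b": 0, "seg-c": 1, "seg-d": 1}

old_rate, new_rate = compare_on_sample(
    logged,
    baseline=lambda seg: baseline_counts[seg[0]],
    candidate=lambda seg: candidate_counts[seg[0]],
    sample_size=4,
)
print(f"baseline: {old_rate} per 1000 mi, candidate: {new_rate} per 1000 mi")
```

Running both versions on the same sample is the design choice that matters: it removes the variation between drives, so any change in the metric comes from the software change rather than from which miles happened to be driven.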
and then you go all the way to the
validation phase where to actually see
your human relative safety of like how
well you're performing on the car side
or the trucking side relative to a human
um
a lot of that safety case is actually
driven by
uh taking all of the physical
operational driving which probably
includes a lot of interventions where
like where the operate the driver took
over just in case um
and then you simulate those forward
and see if would anything have happened
and in most cases the answer is no but
you you can simulate it forward and you
can even start to do really interesting
things where you
add virtual agents to create harder
environments you can fuzz the locations
of physical agents you can muck with the
scene and stress test the scenario from
a whole bunch of different dimensions
and effectively you're trying to like
more efficiently sample this like
infinite dimensional space but try to
encounter the problems as fast as
possible because what most people don't
realize is the hardest problem in
autonomous driving is actually the
evaluation problem in many ways not the
actual autonomy problem and so if you
could in theory evaluate perfectly and
instantaneously
you can solve that problem in a really
fast feedback loop
quite well but the hardest part is being
really smart about this suite of
approaches on how can you get an
accurate signal on how well you're doing
as quickly as possible in a way that
correlates to physical driving that's in
the evaluation problem which metric are
you evaluating towards we're talking
about safety and some
what are the performance metrics that
we're talking about so in the end you
care about end safety like that's in the
end what keeps you
like um that's what's deceptive where uh
there's a lot of companies that have
like a great demo
the path from like a really great demo
to being able to go driverless
can be deceptively long even when that
demo looks like it's driverless quality
and the difference is is that
the thing that keeps you from going
driverless is not the stuff you
encounter on a demo it's the stuff that
you encounter once in a hundred thousand
miles or 500,000 miles and so
that is at the root of what is
most challenging about going driverless
because
any issue you encounter you can go and
fix it but how do you know you didn't
create five other issues that you
haven't encountered yet so
those learnings like those were painful
learnings in waymo's history that waymo
went through and
led to us then finally being able to go
driverless in phoenix and now are at the
heart of how we develop
evaluation is simultaneously evaluating
final kind of end safety of how ready
are you to go driverless
which may be as
you know direct as what is your
collision
human relative kind of collision rate uh
for all these types of scenarios and and
uh uh and severities to make sure that
you're better than a human bar you know
by by a good amount um but that's not
actually the most useful for development
for development it's much more kind of
analog metrics that
are part of the art of finding
how
what what are the properties of driving
that give you a way quicker signal
that's more sensitive than a collision
that can correlate to the quality
you care about and push the feedback
loop to all of your development a lot of
these are for example comparisons to
human drivers like manual drivers how do
you how do you do relative to human
driver in various dimensions of various
um circumstances
can i ask a tricky question so
if i brought you a truck how would you
test it
okay alan turing came along and you said
this one's can't tell if it's a human
driver or yeah exactly
yeah but
not the human because
because you know humans are flawed but
yeah how do you actually know you're
ready basically how do you know it's
good enough um
yeah and by the way this is the reason
why like um
waymo released the safety framework
for the car side because like
one it sets the bar so nobody cuts below
it um and does something bad for the
field that and that causes an accident
two it's to start the conversation on
like framing what does this need to look
like same thing we'll end up doing for
the trucking side um there it ends up
being um
different demand different
portfolio of approaches there's easy
things like are you compliant with all
these like fundamental rules of the road
like you never drive above the speed
limit that's actually pretty easy like
you can fundamentally prove that it's
either impossible to violate that rule
or that in these like you can um
itemize the scenarios where that comes
up and you can do a test and show that
you you know you pass that test and
therefore you can handle that scenario
and so those are like traditional
structured testing kind of system
engineering approaches where you can
just quant like
fault rates is another example where
when something fails how do you deal
with it you're not going to drive and
randomly wait for it to fail you're
going to force a failure and make sure
that you can handle it on closed courses
and simulation or on the road
and
and run through all the permutations of
failures which you can often times for
some parts of the system itemize like
hardware
the hardest part is behavioral where
you have
just infinite
situations that could in theory happen
and you want to figure out the the
combinations of approaches that you know
that can work there you can probably
pass the turing test pretty quickly even
if you're not like completely ready for
driverless because the events that are
really
kind of like hard will not happen that
often just to give you a perspective
uh a human has a serious accident on a
freeway uh like a truck driver on a
freeway has uh there's a serious event
happens once every 1.3 million miles and
something that actually has like a
really serious injury is 28 million
miles and so those are really rare and
so you could have a driver that looks
like it's ready to go but you have no
signal on on what happens there and so
that's where you start to get creative
on combinations of
sampling and statistical arguments
focused structured arguments where you
can kind of
simulate those scenarios and show that
you can handle them and
metrics that are correlated with what
you care about but you can measure much
more quickly and get to a right answer
and that's what makes it pretty hard and
in the end um you end up borrowing a lot
of properties um from
uh aerospace and like space shuttles and
so forth where you don't get the chance
to launch it a million times just to say
you're ready because it's too expensive
to fail um and so you go through
a huge amount of kind of structured
approaches in order to validate it and
then by
by thoroughness you can make a strong
argument that you're ready to go
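The rarity figures above (a serious event roughly once every 1.3 million miles, a serious injury roughly once every 28 million miles) explain why plain driving gives almost no signal. The sketch below makes that concrete under one loud assumption: events are modeled as a Poisson process with a constant rate per mile. That is a textbook simplification chosen for illustration, not the statistical method used by waymo, and the 15-mile demo length is just an illustrative short drive.

```python
import math


def prob_at_least_one_event(miles_driven: float, miles_per_event: float) -> float:
    """Chance of seeing one or more events, assuming a constant Poisson rate."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-miles_driven / miles_per_event)


def miles_needed_with_zero_events(miles_per_event: float, confidence: float) -> float:
    """Event-free miles needed before a rate as bad as the benchmark is unlikely.

    If the true rate were one event per miles_per_event, the chance of seeing
    zero events in n miles is exp(-n / miles_per_event). Solving for the n that
    pushes this chance down to (1 - confidence) gives the result.
    """
    return -math.log(1.0 - confidence) * miles_per_event


SERIOUS_EVENT_MILES = 1.3e6    # figure quoted in the conversation
SERIOUS_INJURY_MILES = 28e6    # figure quoted in the conversation

print(prob_at_least_one_event(15, SERIOUS_EVENT_MILES))
print(miles_needed_with_zero_events(SERIOUS_EVENT_MILES, 0.95))
print(miles_needed_with_zero_events(SERIOUS_INJURY_MILES, 0.95))
```

Under this simplified model a short demo has only about a one-in-a-hundred-thousand chance of containing a serious event, and matching the human benchmark by driving alone would take millions of event-free miles, which is the motivation for the structured and simulated approaches described in the conversation.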
this is actually a harder problem in a
lot of ways though because you can think
of a space shuttle as um
getting to a fixed point and then you
kind of like or an airplane and you like
freeze the software and then you like
prove it and you're good to go here you
have to get to a driverless quality bar
but then continue to aggressively change
the software even while you're
driverless and so and also the full
range of environment that you there's
there's an external environment where
the shuttle is you're basically testing
the
like the systems the internal stuff yeah
uh and you have a lot of control on the
external stuff yeah and the hard part is
how do you know you didn't get worse in
something that you just changed yes
and so uh so in a lot of ways like
the turing test starts to fail pretty
quickly because you start to feel
driverless quality pretty early in that
curve
if you think about it right like in most
um
most uh kind of you know really good av
demos maybe you'll sit there for 30
minutes right yeah so you've driven you
know 15 miles or something like that um
to go driverless uh like what's the sort
of rate of issues that you need to have
you won't even encounter so let's try
something different then let's try
a different version of the turing test
which is like an iq test
so there's these
difficult questions of increasing
difficulty they're very they're they're
designed you don't know them ahead of
time
nobody knows the answer to them right
and so is it possible to in the future
orchestrate yeah basically really
obstacle course almost of like yeah that
maybe change every year
and that represent if you can pass these
it they don't necessarily represent the
full spectrum that's it yeah they won't
be conclusive but you can at least get a
really quick read and filter yeah like
you're able to yeah because you didn't
know them ahead of time like i don't
know
probably
like construction zones uh failures or
driving anywhere in russia yeah like
yeah
weather um cut-ins uh dense traffic kind
of merging lane closures
uh animals foreign objects on a road that
pop out on short notice mechanical
failures sensor breaking tire popped
weird behaviors by other vehicles like a
hard brake something reckless that
they've done fouling of sensors like
bugs or birds
you know poop or something so but yeah
like you have these like kind of like
extreme uh
conditions where like you have a nasty
construction zone where everything shuts
down and you have to like you know get
pulled to the other side of the freeway
with a temporary lane like that right
those are sort of conditions where we do
that to ourselves right we itemize
everything that could possibly happen to
give you a starting point to how to
think about
what you need to develop and at the end
of the day there's no substitute for
real miles like if you think of
traditional ml like you know how there's
like a validation set where you hold out
some data and uh like real-world driving
is the ultimate validation set that's
the in the end like the cleanest signal
but you can do a really good job on
creating an obstacle course and you're
absolutely right like at the end um
if there was such a thing as automating
uh and kind of a readiness
um it would be these extreme conditions
like a red light runner right a um
really reckless pedestrian that's
jaywalking a cyclist that you know makes
like a really awkward maneuver that's
actually what keeps you from going
driverless like in the end that is the
long tail
yeah and it's interesting to think about
the that to me is the turing test
turing test means a lot of things but to
me in driving the turing test
is exactly this validation set that is
handcrafted there's a i don't know if
you know
him there's a guy named francois chollet
he thinks about like how
to design a test for general
intelligence he designs these iq tests
for machines
and the validation set for him is
handcrafted yeah and that it requires
like human genius or ingenuity to create
a really good test yeah and you hold you
truly hold it out it's an interesting
perspective on the validation set which
is like
make that as hard as possible right not
a generic representation of the data but
this is the hardest the hardest stuff
yeah you know it's like go like you'll
never fully itemize like all the world
states that you'll you'll expand and so
you have to come up with different
approaches and this is where you start
hitting the struggles of ml where ml is
fantastic at optimizing the average case
it's a really unique craft to think
about how you deal with the worst case
which is what we care about in in av
space um when using an ml system on
something that that occurs like super
infrequently
so like you don't care about the worst
case really on ads because if you miss a
few it's not a big deal but you do care
about it on the driving side and so um
and so typically like you'll never fully
enumerate the world and so you have to
take a step back and abstract away what
are the signals that you care about and
the properties of a driver
that correlate to defensive driving and
avoiding nasty situations that um
even though you'll always be surprised
by things you'll encounter you feel good
about your ability to generalize from
what you've learned
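
The contrast drawn here between optimizing the average case and caring about the worst case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scenario names echo the examples mentioned in the conversation, but the error numbers and the `summarize` helper are hypothetical and do not describe any real evaluation system.

```python
from statistics import mean

def summarize(errors_by_scenario):
    """Compare average-case and worst-case error over a held-out set.

    errors_by_scenario maps a scenario name to a list of per-example errors
    (hypothetical numbers, higher is worse).
    """
    all_errors = [e for errs in errors_by_scenario.values() for e in errs]
    # Scenario whose single worst example is the worst overall.
    worst_scenario = max(errors_by_scenario,
                         key=lambda name: max(errors_by_scenario[name]))
    return {
        "average": mean(all_errors),     # what typical ML training optimizes
        "worst": max(all_errors),        # what a safety case has to bound
        "worst_scenario": worst_scenario,
    }

# Hypothetical handcrafted validation set: mostly routine driving,
# plus a few rare long-tail events.
errors = {
    "routine_highway": [0.01, 0.02, 0.01, 0.02],
    "red_light_runner": [0.40],
    "jaywalking_pedestrian": [0.25, 0.30],
}

report = summarize(errors)
print(report)
```

With numbers like these the average looks modest while the worst case is dominated by a single rare event, which is why a validation set built from the hardest scenarios says more about driverless readiness than a generic sample of the data.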
all right let me ask you a tricky
question
so to me
the two companies
that
are building at scale some of the most
incredible robots ever built
is waymo and tesla
so
there's very distinct approaches
technically philosophically in these two
systems
let me ask you to play sort of devil's
advocate
and then the devil's advocate to the
devil's advocate
it's it's a bit of a race of course
everyone can win
but
if waymo wins this race to level four
uh which
why would they win what aspect of the
approach do you think would be the
winning aspect and if tesla wins
why would they win and uh which aspect
of their approach would be the reason
just just building some intuition almost
not from a business perspective from any
of that just technically
yeah yeah and we could summarize
i think maybe you can correct me what
one of the more
distinct
aspects is
uh waymo has a richer suite of sensors
has lidar and vision
tesla now removed radar they do vision
only tesla has a larger fleet of
vehicles operated by humans so it's
already deployed on the field in its uh
larger what do you call it operational
domain
and then waymo is more focused on a
specific domain and growing it
with fewer vehicles so that's the both
are fascinating approaches both are i
think there's a lot of brilliant ideas
nobody knows the answer so i'd love to
get your comments on this lay of the
land yeah for sure so maybe i'll um i'll
start with waymo
and you're right like both incredible
companies and just a gigantic respect to
like everything tesla's accomplished and
uh how they push the field forward as
well so on the waymo side there is a
fundamental advantage in the fact that
it is focused and geared towards l4 from
the very beginning we've customized the
sensor suite for it the hardware the
compute the infrastructure the tech
stack and all of the investment inside
the company um
that's deceptively important because
there's like a giant spectrum of
problems you have to solve in order to
like really do this from
infrastructure to hardware to autonomy
stack to the safety framework and that's
an advantage because there's a reason
why it's the fifth generation hardware
and
why all of those learnings went into the
daimler program um it becomes such an
advantage because you learn a lot as you
drive and you optimize for the best
information you have but fundamentally
like there's a big big jump um uh
like every order of magnitude that you
drive uh in numbers of miles and what
you learn and the gap from really kind of
like decent progress for l2 and so forth
to what it takes to actually go l4
and at the end of the day um
there's a feeling that waymo has
uh there's a long way to go uh nobody's
won um but
there's a lot of advantages um in all of
these buckets where it's the only
company that has shipped a fully
driverless service we can go and you can
use it and it's at a decently like uh
you know sizeable scale um and those
learnings can feed forward and to solve
how to solve the more general problems
you see this process you've deployed in
chandler
you don't know the timeline exactly but
you could see the steps
they they seem almost incremental
the steps it's become more engineering
than totally blind r&d because it works
in one place and then you move yeah
another place and you grow it this way
and just to give you an example like we
fundamentally changed our hardware and
our software stack almost entirely from
what went driverless in phoenix to what
is the current generation of the system
on both sides
because the things that got us to
driverless even though it got to
driverless way like way beyond human
relative safety um it is
fundamentally not well set up to scale
in an exponential fashion without like
getting into like huge kind of scaling
pains and so those learnings you just
can't shortcut and so that's an
advantage and so uh there's a lot of
open challenges to kind of get through
technical organizational like how do you
solve problems that are increasingly
broad and complex like this work on
multiple products but
there's the feeling that okay like ball's
in our court
there's a head start there now we got to
go and solve it and i think that focus
on l4 it's a fundamentally different
problem if you think about it like um
let's say we were designing an l2 truck
that was meant to be safer and help a
human you could do that with far less
sensors
far less complexity and provide value
very quickly arguably with what we
already have today just packaged up in a
good product but
you would take a huge risk in having a
gap from even like compute and sensors
not not to mention the software to then
jump from that system to an l4 system so
it's a huge risk basically so i can let
me allow me to be the person that plays
the devil's advocate and let's argue for
the tesla approach so
that the what you just laid out makes
perfect sense and is exactly right there
are some open questions here which is
it's possible
that
investing more in faster data collection
which is essentially what tesla's doing
will get us there faster
if
the sensor suite doesn't matter
yeah as much and machine learning can do
a lot of the work this is the open
question is
how much is is the thing you mentioned
before how much of driving can be end to
end learned
that's the open question obviously
the waymo and the vision only machine
learning approach
will
solve driving eventually both yeah the
question is of timeline what's faster
that's right and what you mentioned like
if i were to make the opposite argument
like what what puts tesla uh in the
strongest position it's data that is
their like superpower where they have an
access to real-world
data effectively with like
a safety driver uh and uh you know like
they've they found a way to like um get
paid by safety drivers versus pay for
safety drivers
it's uh it's brilliant right yeah but
you know all joking aside like um one it
is incredible that they've built a
business that's incredibly successful
they can now be a foundation and
bootstrap kind of like really aggressive
investment in autonomy space uh if you
can do it that's always like an
incredible kind of advantage and then
the data aspect of it um it is a giant
amount of data if you can use it the
right way to then solve the problem but
the ability to collect um
and filter through the things that to
the things that matter at real-world
scale like a large distribution that is
a that is huge like it's a big advantage
um and so then the question becomes can
you use it in the right way and do you
have the right software systems and
hardware systems in order to solve the
problem and
you're right that in the long term
there's no reason to believe that pure
camera systems can't solve the problem
that humans obviously are solving with
you know with vision systems
but
the question is when it's a risk it's a
big so there's no argument that it's not
a risk right like and it's already such
a hard problem and so much of that
problem by the way is um
uh you know even beyond the perception
side some of the hardest elements of the
problem on behavioral side and decision
making and the long tail safety case
if you are adding risk and complexity
on the input side from perception you're
now making a really really hard problem
like
which is on its own is still like almost
insurmountably hard even harder and so
the question is just how much and this
is where like you can
easily get into a little bit of a
kind of a trap where similar to how you
how do you evaluate how good an av
company's product is like you go and you
do a
trial kind of a test run with them a
demo run which they've kind of optimized
like crazy and so forth and like and it
feels good do you do you put any weight
in that right you know that that gap is
kind of like you know pretty large still
um same thing on the like perception
case like the long tail of computer
vision is really really hard and there's
a lot of ways that that can come up and
even if it doesn't happen that often at
all when you think about the safety bar
and what it takes to actually go full
driverless not like incredible
assistance driverless but full
driverless
that bar gets crazy high and not only do
you have to solve it on the behavioral
side but now you have to
push computer vision beyond arguably
where it's ever been pushed and so you
now on top of the broader av challenge
you have a really hard perception
challenge as well so there's perception
there's planning there's human robot
interaction to me what's fascinating
about
what tesla is doing
is in this march towards level four
because it's in the hands of so many
humans
you get to see video you get to see
humans
i mean forget forget companies forget
businesses
it's fascinating for humans to be
interacting with robots that's
incredible and they're actually helping
kind of push it forward and yeah and
that is valuable by the way where even
for us a decent percentage of our data
is human driving yes um we intentionally
have humans drive higher percentage than
you might expect because that creates
some of the best signals to train the
autonomy and so
that is uh on its own value so together
we're kind of learning about this
problem in an applied sense just like
you had with cozmo like once when when
you're chasing an actual product that
people are going to use
robot based product that people are
going to use you have to contend with
the reality of
what it takes to build a robot that
successfully perceives the world and
operates in the world and what it takes
to have a robot that interacts with
other humans in the world and that
that's like to me one of the most
interesting problems humans have ever
undertaken because
you're
in trying to create an intelligent agent
that operates in a human world you're
also
understanding the nature of intelligence
itself
like
how hard is driving it's still not
answered to me yeah i still don't
understand
like all the subtle cues like even
little things like um
your interaction with a pedestrian where
you look at each other and just go okay
go right like that's hard to do without
a human driver right and you're missing
that dimension how do you communicate
that so there's like really really
interesting kind of like elements here
now here's what's beautiful can you
imagine that like when autonomous
driving is solved
how much of the technology foundation of
that like space can go and have like
tremendous just transformative impacts
on on other problem areas and other
other spaces that have
subsets of the these same problems like
it's just incredible
it's it's both a pro and a con is uh
with autonomous driving is so
um safety critical
it's so so once you solve it it's
beautiful because there's so many
applications that are a lot less safety
critical
but it's also the the con of that is
it's so safe it's so hard to solve and
the same journalists that you mentioned
and get excited for a demo are the ones
who
write long articles about the failure of
your company if there's
one accident
that's based on a robot and it's it's
it's just society's so tense and
waiting for failure of robots you're in
such a high stake environment failure
has such a high cost and it slows down
development it slows down development
yeah like the team like definitely
noticed that like once you go driverless
like we're driverless in phoenix and you
continue to iterate
your iteration pace slows down um
because your fear
of regression
forces
so much more rigor that you know
obviously you know you have to find a
compromise on like okay well how often
do we release driverless builds because
every time you release a driverless build
you have to go through this like
validation process which is very
expensive so far so um it is interesting
it's like it is just one of the hardest
things there's no other industry where
like uh you would not like you wouldn't
release products way way quicker when
you start to kind of provide
even portions of the value that you
provide healthcare maybe is the other
one
that's right but at the same time right
like we've gotten there where you think
of like surgery right like you have
surgery there's always a risk but like
it's really really bounded you know that
there's an accident rate when you go out
and drive your car today right like
and you know what the fatality rate in
the u.s is per year we're not banning
driving because there was a car accident
but the bar for us is way higher and we
hold ourselves very serious to it where
you have to not only be better than a
human but you probably have to like at
scale be far better than a human by a
big margin and you have to be able to
like really really thoughtfully explain
um all of the ways that we validate that
becomes very comfortable for humans to
understand because a bunch of jargon
that we use internally just doesn't
compute at the end of the day we have to
be able to explain to society how do we
quantify the risk
and acknowledge that there is some
non-zero risk but it's far above a human
you know relative safety here's the
thing
to push back a little bit
uh and bring cozmo back in the
conversation you said something quite
brilliant at the beginning of this
conversation that i think probably
applies for
autonomous driving which is
you know there's this desire to make
autonomous cars much safer than human
driven cars
but
if you create a product that's really
compelling and is able to explain both
the leadership and the engineers and the
product itself
can communicate intent
then i think people may be able to be
willing to put up with the thing that
might be even riskier than humans
because
they understand the value of taking
risks you mentioned the speed limit
humans understand the value of going
over the speed limit yeah humans
understand the value of like
going fast through a for through a
yellow light yeah to take in when you're
in manhattan streets pushing through uh
uh crossing pedestrians they understand
that i mean this is a much more tense
topic of discussion so this is just me
talking so with cozmo's case there
was something about the way
this particular robot communicated the
energy it brought the intent it was able
to communicate to the humans that you
understood
that of course he needs to have a camera
yeah of course he needs to have this
information and in that same way to me
of course a car needs to take risks of
course there's going to be accidents
that's what like that's
you know if you want a car that never
has an accident
have a car that just doesn't go anywhere
yeah
and so that
but that's tricky because that's not a
robotics problem
like are not even under like due to you
right obviously
so there's a big difference though um
yeah you are
that's not a personal decision you're
also impacting obviously kind of the
rest of the road um and we're
facilitating it right and so there's a
higher kind of you know kind of ethical
and moral bar which obviously then you
know translates into as a society and
from a regulatory standpoint kind of
like what what comes out of it where
it's hard for us to ever see this even
being a
debate in the sense that like
you have to be beyond reproach from a
safety standpoint because if you're
wrong about this you could set the
entire field back a decade right see i i
this is me speaking i think if we look
into the future
there will be
i personally believe
this is me speaking yeah
that there will be less and less focus
on safety still very very high yeah
meaning like after autonomy is very
common and accepted it's not not not so
common as everywhere but there has to be
a transition because
i think for innovation just like you
were saying to explore ideas you have to
take risks and i think if autonomy
in the near term is to become prevalent
in society
i think people need to be more willing
to understand the nature of risk
the value of risk
it's very difficult you're right of
course with driving
but that that's the fascinating nature
of it this
it's a
it's a life-and-death situation that
brings value to millions of people so
you have to figure out what what do we
value about this world how much do we
value
how deeply do we want to avoid
hurting other humans that's right and
there is a point where like
you can imagine a scenario where waymo
has a system
that is uh even when it's like uh kind
of beyond a you know human relative
safety um and
provably statistically
will save lives
there is a
thoughtful navigation of
you know the that fact versus just
kind of
society readiness and perception
and
education of um
society and regulators and everything
else where like
it's it's multi-dimensional um and it's
not a purely logical uh argument but um
ironically the logic can actually help
with the emotions
and just like any technology there's
early adopters and then there's kind of
like a curve that um happens after it
but eventually celebrities you get the
rock in a waymo vehicle and then
everybody just comes and everybody calms
down because the rock likes it yeah
if you post uh yeah and it's like it's
an open question on how this plays out i
mean maybe we're pleasantly surprised and
it just like people just realize that
this is such a enabler of life and like
efficiency and cost and everything that
um there's a pull like at some point i
should fully believe that this will go
from a
thoughtful kind of you know
you know movement and tiptoeing and like
kind of like a push to society realizes
how
wonderful of an enabler this could
become and it becomes more of a pull and
um hard to know exactly how that will
play out but at the end of the day like
both
the goods transportation and the people
transportation side of it has that
property where it's not easy there's a
lot of open questions and challenges to
navigate and there's obviously the
technical problems to solve uh as a you
know kind of prerequisite but um
they they have such an opportunity that
is um
on a scale that very few industries in
the last 20 30 years have even had a
chance to tackle that
i
maybe we're pleasantly surprised by how
much how much that tipping point like in
a really short amount of time actually
turns into a societal pull to kind of
embrace the benefits of this yeah i i
hope so it seems like in the recent few
decades there's been tipping points for
technologies where like overnight things
change it's uh like uh
from taxis to ride sharing services all
that that shift i mean there's just
shift after shift after shift that
requires digitization and technology i'm
i hope we're pleasantly surprised in
this so there's millions of long-haul
trucks now in the united states
do you see a future where
there's millions of waymo trucks and
maybe just broadly speaking waymo
vehicles just
like like ants running around the united
states uh
freeways and local roads yeah in other
countries too like uh
you look back decades from now and
it might be one of those things that
just feels so natural and then it
becomes almost like a kind of
interesting kind of oddity that we had
none of it like uh you know kind of
decades earlier
and
it'll take a long time to grow and scale
very different challenges appear at
every stage but
over time like
this is one of the most enabling
technologies that um that we have in the
world uh today um it'll feel like
you know how was the world before the
internet how's the world before mobile
phones like it's gonna have that sort of
a feeling to it on both sides it's hard
to predict the future but
do you sometimes
uh think about weird ways it might
change the world like surprising ways so
obviously
there's more direct ways where like
it increases efficiency it'll
enable a lot of kind of logistics
optimizations kind of things
it will change
our uh probably our roadways and all
that kind of stuff but it could also
change society in some kind of
interesting ways do you ever think about
how it might change cities how it might change
their lives all that kind of yeah
you can imagine city uh where people
live versus work becoming more
distributed because the pain of
commuting becomes different just easier
uh and i don't know there's a lot of
options that open up
the layout of cities themselves and how
you think about
car storage and parking obviously uh
just enables a completely different type
of uh uh
type of experience in urban environments
i i think there was like a statistic
that uh
something like
30 percent of the traffic uh in cities
during rush hour is caused by a pursuit
of parking uh or some like some really
high stats so
those obviously kind of open up a lot of
options um
flexibility on goods will enable
new industries and businesses that never
existed before because now the
efficiency becomes
more palatable good delivery timing
consistency and flexibility is going to
change
the way we distribute the logistics
network will change the way we then can
integrate with warehousing with
shipping ports you can start to think
about greater automation through the
whole kind of stack
and how that supply chain the ripples
become much more uh agile versus like
very
grindy the way they are
today where just the adaptation is like
very tough and there's a lot of
constraints that we have i think it'll
be great for the environment it'll be
great for safety where like probably
about 95 percent of accidents today um
statistically are due to just uh
attention or things that are preventable
with uh
with the strengths of automation
yeah and it'll be one of those things
where like industries will shift but the
net creation is going to be massively
positive and then we just have to be
thoughtful about the negative
implications that will happen in local
area places um and adjust for those but
i'm an optimist in general for the
technology where you could argue a
negative on any new technology but you
start to kind of
see that if there is a big demand for
something like this the in almost all
cases the like it's an enabling factor
that's gonna kind of propagate through
the um you know through society and
particularly as life expectancies get
longer and you know and so forth like
there's a just a lot more need for um a
greater percentage of the population to
kind of just be serviced with a high
level of efficiency
because otherwise we can have a really
hard time kind of scaling to what's
ahead in the next 50 years um
yeah and you're absolutely right every
technology has uh negative consequences
of positive consequences and we tend to
focus on the negative a little bit too
much
in fact autonomous trucks are
often brought up as an example of uh
artificial intelligence and robots in
general taking our jobs
and as we've talked about briefly here
we talk a lot with steve you know
that's
it is a concern that automation
will take away certain jobs it will
create other jobs so there's temporary
pain
uh hopefully temporary but pain is pain
and all people suffer and that human
suffering is really important to think
about
how uh but
trucking is
ver i mean there's a lot written on this
is i would say far from the the thing
that that would cause the most pain yeah
there's even more positive properties
about trucking where not only is there
just a you know huge shortage which is
going to increase the average age of
truck drivers is getting closer to 50
because the younger people aren't
wanting to come into it they're trying
to like incentivize lower the age limit
like all these sort of things um and the
demand is just going to increase and the
least favorable like it depends on the
person but in most cases the least
favorable types of routes are the
massive long-haul routes where you're on
the road away from your family 300 plus
days a year steve talked about the pain of
those kind of routes from a family
perspective you're
you're basically away from family it's
not just hours you work insane hours but
it's also just time away from family
right and just obesity rate is through
the roof because you're just sitting all
day like
it's really really tough and um
and that's also where like the biggest
kind of safety risk is because of
fatigue and um
and so when you think of the gradual
evolution of how trucking comes in first
of all it's not overnight it's gonna
take decades to kind of phase in all the
like there's just a long long long road
ahead
but
the the routes and the portions of
trucking that are going to require
humans the longest and benefit the most
from humans are the short-haul and most
complicated kind of more urban routes
which are also the more more pleasant
ones which are um you know less
continual driving time more um uh
more flexibility on like you know
geography and location and you get to
kind of sleep at home
at your own home and very importantly if
you optimize the logistics you're going
to use human
you're going to use humans much better
that's right and and thereby pay them
much better because like one one of the
biggest problems is truck drivers
currently are paid by like how much they
drive so you they really feel the pain
of inefficient logistics yeah because
like if they're just sitting around for
hours which they often do not driving
waiting yeah they're not getting paid
for that time that's right and that so
like logistics has a significant impact
on the quality of life of a truck driver
and high percentage of trucks are like
uh empty because of inefficiencies in
the system um yeah it's one of those
things where like um
and the other thing is when you increase
the efficiency of a system like this the
overall net like volume of the system
tends to increase right like the
the entire market cap of trucking is
going to go up um when the efficiency
improves uh and facilitates both growth
in industries and better utilization of
trucking um and so that on its own just
creates more and more demand which um
uh of all the places where ai comes in
and starts to really um
uh
kind of reshape an industry this is one
of those where like there's just a lot
of positives that for at least any time
in the foreseeable future seem really
lined up in a good way um to um
kind of come in and help with the
shortage and start to kind of optimize
for the routes that are most dangerous
and most painful
yeah so
this is true for trucking but if we zoom
out broader you know automation and ai
does
technology broadly i would say but you
know automation
is a thing that
has a potential in the next couple of
decades to shift the kind of jobs
available to humans yes and
so that results in
like i said human suffering because
people lose their jobs there's economic
pain there
and there's also a pain of meaning so
for a lot of people
work is a
source of uh meaning it's a source of
identity of
of pride
of you know
pride in getting good at the job pride
in craftsmanship and excellence which is
what truck drivers talk about yeah but
but that this is true for a lot of jobs
and is that something you think about as
a sort of a roboticist zooming out from
the trucking thing um
like where
do you think it would be harder
to find activity and work that's a
source of identity and source of meaning
in the future
like i do think about it because you
want to make sure that you you worry
about the entire system like not just
like the part autonomy plays in it but
what are the ripple effects of it down
the road and um
on enough of a time window there's a lot
of opportunity to put in the right
policies and the right opportunities to
kind of reshape and retrain and find
those openings and so just to give you a
few examples both trucking and cars
we have remote assistance facilities
that
are there to interface with customers
and monitor vehicles and
provide like very focused kind of
assistance on uh kind of areas where the
vehicle may want to request help uh in
understanding an environment so those
are jobs that kind of get created and
supported
i remember like taking a tour of one of
the amazon facilities where you've
probably seen the kiva systems robots uh
where you have these orange robots that
have automated um
the warehouse like kind of picking and
collecting of items in this like really
elegant and beautiful way um it's
actually one of my favorite applications
of robotics of all time um
uh you know like i think it kind of came
across a company like 2006 was just
amazing and
what was the warehouse or was the
transport little thing so basically
instead of a person going and walking
around and picking the seven items in
your order um
these robots go and pick up a shelf
and move it over in a row where like the
seven shelves that contain the seven
items are lined up and a
you know laser or whatever points to
what you need to get and you go and pick
it and you place it to fill the order
and so the people were fulfilling the
final orders what was interesting about
that is that when i was asking them
about like kind of the impact on labor
when they transitioned that warehouse
the throughput increased so much that
the jobs shifted towards the final
fulfillment
even though the robots took over
entirely the the search of the items
themselves and the labor
the job stayed like nobody like that was
actually the same amount of jobs uh
roughly they were necessary but the
throughput increased by i think over 2x
or some some amount right like so
um you have these situations that are
not zero-sum games in this really
interesting way and the optimist in me
thinks that there's these types of
solutions in almost any industry where
the growth that's enabled creates
opportunities that you can then leverage
but you got to be intentional about
finding those and really helping make
those links because
any even if you make the argument that
like there's a net positive
locally there's always tough hits that
you got to be very careful about that's
right you have to have an understanding
of that link because there's a short
period of time
whether training is required or just
mental transition or physical or
whatever is required
that's still going to be short-term pain
the uncertainty of it there's families
involved you know it
it's i mean it's exceptionally
it's difficult on a human level and you
have to really think about that even you
can't just look at economic metrics
always it's human beings that's right
and you can't even just uh take it as
like okay well we need to like subsidize
or whatever because like there is an
element of just personal pride where
right
majority of people like people don't
want to just be okay but like they want
to actually like have a craft like you
said and have a mission and
feel like they're having a really
positive impact and so um
my personal belief is that there's a lot
of transferability and skill set um
that is possible especially if you
create a bridge and an investment um to
enable it um and to some degree that's
our responsibility as well
this process
uh you mentioned kiva robots amazon
let me ask you about
the astro robot which is i don't know if
you've seen it it's amazon's announced
that
it's a home robot that they have a
screen
looks awfully a lot like
cozmo has
i think different vision probably
what are your thoughts about like home
robotics in this kind of space there's
been a
quite a bunch of home robots social
robots that very unfortunately have
closed their doors that um for various
reasons perhaps they were too expensive
there's manufacturing challenges all
that kind of stuff what are your
thoughts about amazon getting into this
space
yeah we had some signs that they were
getting into like long long long long
ago maybe they're a little
too interested in cozmo and uh yeah
during our conversations but they're
also very good partners actually for us
as we kind of integrated a lot of
shared technology but if i could also
get your thoughts on
you know you could think of
alexa as a robot as well yeah echo
do you see those as fundamentally
different just because you can move and
look around is that fundamentally
different than the thing that just sits
in place uh it opens up options um but
uh
you know my first reaction is i think
like
i have my doubts that this one's going
to hit the mark because i think for the
price point that it's at
and the like kind of functionality and
value propositions that they're
trying to put out it's uh uh it's still
searching for like the killer
application that like justifies i think
it was like a 1500 price point or kind
of somewhere around there that's a
really high bar so
there's enthusiasts and early adopters
will obviously kind of pursue it but you
have to like really really hit a high
mark at that price point which we always
tried to we were always very cautious
about jumping too quickly to the more
advanced systems that we really wanted
to make but would have
raised the bar so much you have to be
able to hit it
in today's cost structures and
technologies
the mobility is an angle that hasn't
been utilized
but
it has to be utilized in the right way
um and so that's going to be the biggest
challenge is like can you
meet the bar of what a con what the mass
market consumer like you know think like
you know our
uh our neighbors our friend parents like
would they find a deep deep value like
in you know fi in this at a mass scale
that you know that justifies the price
point i think that's in the end one of
the biggest challenges for robotics
especially consumer robotics
where you have to kind of meet that bar
uh it becomes very very hard um and
there's also the higher bar just like
you were saying with cosmo of
you know a thing that can
look one way and then turn around and
look at you
there's that's either a super desirable
quality or super undesirable quality
depending on how much you trust the
thing that's right and so there's uh
there's a problem of trust to solve
there there's a problem of personalities
the thing is the quote-unquote problem
that cosmos solved so well yeah is that
there you trust the thing yeah and that
has to do with the company with the
leadership with the intent that's
communicated by the device and the
company and everything together yeah
exactly right uh and so um and i think
they also have to retrace some of the
like learnings on the character side
where like as usual i think that's the
place where it's uh a lot of companies
are great at the hardware side of it and
can you know think about those elements
and then there's like you know the
thinking about the ai challenges
particularly the advantage of alexa is a
pretty huge boost for them um the
character side of it for technology
companies is pretty new novel territory
and so that will take some iterations
but um yeah i mean i hope
i hope there's continued progress in the
space and that thread doesn't kind of go
dormant for too long
and it's not you know it's going to take
a while to kind of
evolve into like the ideal applications
but you know
this is one of um amazon's i guess like
you could call it it's definitely like
part of their dna but in many cases it's
also strength where they're very willing
to like iterate uh kind of aggressively
and um and move quickly not take risks
and take risks you have deep pockets so
you can yeah and they'll maybe have more
misfires than an apple would um but uh
you know it's different styles and
different approaches and um you know
at the end of the day it's like there's
a few familiar uh kind of elements there
for sure which was uh you know kind of
you know homage
is one way to put it yeah uh so why is
it so hard
at a high level
um to build a robotics company a
robotics company that
lives for a long time so if you look at
so i thought cosmo for sure would live
for a very long time that to me was
exceptionally successful vision and idea
and implementation
irobot is an example of a company that
has
pivoted in all the right ways to survive
and
arguably thrive
by focusing on the
having like a
have a driver that constantly provides
profit which is the vacuum cleaner
and of course there's like amazon what
they're what they're doing is they're
almost like taking risks so they can
afford it because they have other
sources of revenue right but outside of
those examples
most robotics companies fail yeah
why why do they fail why is it so hard
to run a robotics company irobot's
impressive because they found a really
really great
fit of where the technology could
satisfy a really clear use case and need
and
they did it well and they didn't try to
overshoot from a cost-to-benefit
standpoint
robotics is hard because it like tends
to be more expensive it combines way
more technologies than a lot of other
types of companies do if i were to like
say one thing that is maybe the biggest
risk and like a robotics company failing
is that
it can be either a
technology in search of a application
or
they try to bite off a
kind of an offering that has
a mismatch and kind of price to function
um and uh
just the mass market appeal isn't there
and um consumer products are just hard
it's just i mean after all the years and
it like definitely kind of feel a lot of
the battle scars because you have um you
know you not only do you have to like
hit the function but you have to educate
and explain get awareness up deal with
different kinds of consumers like uh
you know there's um there's a reason why
a lot of technology sometimes start in
the enterprise space and then kind of
continue forward in the consumer space
even like you know you see ar like
starting to kind of make that shift with
hololens and so forth in some ways
consumers and price points that they're
willing to kind of uh be attracted in a
mass market way and i don't mean like
you know 10 000 enthusiasts bought it
but i mean like
you know 2 million 10 million 50 million
like mass market kind of interest uh you
know have bought it
that bar is very very high and typically
robotics is novel enough and
non-standardized enough to where pushes
on price points so much
you can easily get out of range where
the capabilities and today's technology
or just a function that was picked just
doesn't line up um and so that product
market fit is very important
so the space of killer apps or
a rather super compelling apps is much
smaller because it's easy to get outside
the price range yeah and most consumers
and it's not constant right like yeah
that's why like we picked off
entertainment because
the quality was just so low in physical
entertainment that we could we felt we
could leapfrog that and still create a
really compelling offering at a price
point that was defensible and and we
like that proved out to be true um
and
over time that same opportunity opens up
in healthcare in home applications and
you know commercial
applications and kind of broader more
generalized interface
but
there's missing pieces in order for that
to happen and all of those have to be
present um for it to line up and we see
these sort of trends in technology where
um you know kind of technologies that
start in one place
evolve and kind of grow to another
some things start in gaming some things
start in
uh
in space uh or aerospace and then kind
of move into the consumer market
and sometimes it's just a timing thing
right where how many
stabs at what became the iphone were
there over the 20 years before that just
weren't quite ready in the function um
relative to the kind of price point and
complexity and sometimes it's a small
detail of the implementation that makes
all the difference which is uh design uh
design is so important well something
yeah like the the you the new generation
ux right yeah it's um and uh and that's
uh
um it's tough and oftentimes all of them
have to be there and it has to be like a
perfect storm and um
but yeah history repeats itself in a lot
of ways uh in a lot of these trends
which is pretty fascinating well let me
ask you about the humanoid form what do
you think about the tesla bot and
humanoid robotics in general
so obviously to me autonomous driving
waymo and the other companies working in
the space
that seems to be a great place to invest
in potential revolutionary application
robotics application focused application
what's the role of humanoid robotics do
you think
teslabot is ridiculous do you think it's
super promising do you think it's
interesting full of mystery nobody knows
what do you think about this thing yeah
i think today humanoid form robotics is
research there's very few situations
where you actually need a humanoid form
to solve a problem uh if you think about
it right like wheels are more efficient
than legs there's
joints and degrees of freedom beyond a
certain point just add a lot of
complexity and cost right so if you're
doing a humanoid robot oftentimes it's
in the pursuit of a humanoid robot not
in a pursuit of an application for the
time being
um
especially when you have like kind of
the gaps and interface and you know kind
of ai that we kind of talk about today
so anything elon does i'm interested
in following so there's there's an
element of that world no matter how
crazy how crazy it is i just like you
know i'll pay attention i'm curious to
see what comes out of it so it's like
you can't you can't ever you know ignore
it but you know it's uh definitely far
afield from their kind of core business
um uh obviously and um what was
interesting to me is i've
i've disagreed with you know elon a lot
about this
is
to me the in the compelling aspect of
the humanoid form
and a lot of kind of robots cosmo for
example
is a human robot interaction
part
from
elon musk's perspective the tesla bot
has nothing to do with the human
it's a form
that's effective for the factory because
the factory is designed for humans
but to me the reason you might want to
argue for the humanoid form is because
you know at a party
yeah it's a nice way to fit into the
party the humanoid form has a compelling
notion to it in the same way that cosmo
is compelling
i you i would argue if we were
arguing about this
that it's cheaper to build a cosmo like
that form but if you wanted to make an
argument which i have with jim keller
about you know you could actually make a
humanoid robot for pretty cheap
it's possible
and then the question is all right if if
you're using an application where it can
be flawed
it could it can have a personality and
be flawed in the same way that cosmo is
that maybe it's interesting for
integration to human society
that's that's to me is an interesting
application of a humanoid form because
humans are drawn like i mentioned to you
legged robots we're drawn to legs and
limbs and body language and all that
kind of stuff
and even a face even if you don't have
the facial features which you might not
want to have for the
uh
to reduce the creepiness factor all that
kind of stuff but yeah that to me the
humanoid form is compelling but in terms
of
that being the right form for the
factory environment i'm not so sure yeah
for the factory environment like right
off the bat um what are you optimizing
for is it strength is it mobility is it
versatility right like that changes
completely the look and feel of the
robot that you create you know and uh
almost certainly the human form is over
designed for some dimensions and
constrained for some dimensions and so
like like what are you grasping is it
big is it little right so you would
customize it and make it um
customizable um for the different needs
if that was the optimization right and
then you know for the other one uh
i could totally be wrong you know i
still feel that the closer you try to
get to a human the more you're subject
to the um biases of what a human should
be and you lose flexibility to shift
away from your weaknesses uh and towards
your strengths
and that changes over time but there's
ways to make
really
approachable
and natural interfaces for
robotic kind of characters and
you know and uh
you know and kind of deployments in
these applications
that
do not at all look like a human directly
but that actually creates way more
flexibility and capability and role and
forgiveness and interface and everything
else yeah it's interesting but i'm still
confused by the magic i see in legged
robots yeah so there is a magic so i i'm
uh
absolutely amazed at it from a
technical curiosity standpoint and like
the
the magic that like the boston dynamics
team can do from uh you know like from
walking and jumping and so forth now
like there's been a long journey to try
to find an application for that sort of
um technology but wow that's incredible
technology right yes so
then you kind of go towards okay are you
working back from a goal of what you're
trying to solve or are you working
forward from a technology and then
looking for a solution and i think
that's where um it's a kind of a
bi-directional search oftentimes but you
gotta you the two have to meet and that
that's where
humanoid robots is kind of close to that
and that like it is a decision about a
form factor and a
technology that it forces um
that
doesn't have a clear justification on
why that's the killer app or you know
from the other end but i think the core
fascinating idea with the tesla bot is
the one that's carried by waymo as well
is when you're solving the general
robotics problem
of perception control where this there's
the very clear applications of driving
it's
as you get better and better at it when
you have like waymo driver yeah
the whole world starts to kind of start
to look like a robotics problem so it's
very interesting for now detection
classification
segmentation tracking
planning like it carries yeah so
there's no reason i mean i'm not i'm not
speaking for waymo here but
you know
um
moving goods
there's no reason
transformer like this thing couldn't you
know uh take the goods up an elevator
you know yeah like that like uh slowly
expand yeah what it means to move goods
and
expand more and more of the world uh
into a robotics problem well that's
right and you start to like think of it
as an end-to-end robotics problem from like
loading from you know from everything
yes and even like the truck itself um
you know today's generation is
integrating into
today's understanding of what a vehicle
is right the pacifica jaguar uh the
freightliners from daimler there's
nothing that stops us from like
down the road after like starting to get
to scale to like
expand these partnerships to really
rethink what would the next generation
of a truck look like um that is actually
optimized for autonomy not for today's
world
um
and maybe that means a very different
type of trailer maybe that like there's
a lot of things you could rethink on
that front which is on its own very very
exciting
let me ask you like i said you went to
the mecca of robotics which is cmu
carnegie mellon university you got a phd
there
so maybe by way of advice
and maybe by way of
story and memories what does it take to
get a phd
in robotics at cmu
and maybe
you can throw in there some advice for
people who are thinking about
doing work in artificial intelligence
and robotics and are thinking about
whether to get a phd it's like i
actually went i was at cmu for undergrad
as well and didn't know anything about
robotics coming in and was doing you
know electrical computer engineering
computer science and really got more and
more into kind of ai and then fell in
love with autonomous driving and at that
point like that was just by a big margin
like such a incredible like central spot
of
develop of investment in that area and
so what i would say is that like
robotics like for
all the progress that's happened is
still a really young field there's a
huge amount of opportunity now that
opportunity shifted where something like
autonomous driving has moved from being
very research and academics driven to
being commercial driven where you see
the investments happening
in commercial now there's other areas
that are much younger
and you see like kind of grasping and
manipulation making kind of the same
sort of journey that like autonomy made
and there's other areas as well what i
would say is the space moves very
quickly anything you do a phd in like it
is in most areas will evolve and change
as technology changes and constraints
change and hardware changes and the
world changes
um and so
the beautiful thing about robotics is
it's super broad it's not a narrow space
at all and it can be a million different
things in a million different industries
and so
uh it's a great opportunity to come in
and get a broad foundation on ai machine
learning computer vision systems
hardware sensors all these separate
things
you do need to like go deep and find
something that you're like really really
passionate
about obviously like just like any phd
this is like a
five six year kind of uh endeavor
and you have to
love it enough to go super deep to learn
all the things necessary to be super
deeply functioning in that area and then
contribute to it in a way that hasn't
been done before and in robotics that
probably means um more breadth because
robotics is rarely kind of like one
particular kind of narrow technology
and it means being able to collaborate
with teams where like one of the coolest
aspects of like
my the exp the experience that i kind of
cherish in our phd is that we actually
had a pretty large av project that for
that time was like a pretty serious
initiative where you got to like partner
with a larger team and you had the
experts in perception and the experts in
planning and the staff and the
mechanical challenge um so i was working
on a project called upi back then uh
which was basically the off-road version
of the darpa challenge it was a darpa
funded project for
basically like a large off-road vehicle
that you would like
drop and then give it a waypoint 10
kilometers away and it would have to
navigate a completely unstructured
off-road environment yeah so like forest
ditches rocks vegetation and so it was
like a really really interesting kind of
a hard problem where like wheels would
be up to my shoulders it's like gigantic
right yeah by the way av for people
stands for autonomous vehicles house
vehicles yeah sorry um and so what i
think is like the beauty of robotics but
also kind of like the expectation is
that um
there's um spaces in computer science
where you can be very very narrow and
deep
robotics one of the the necessity but
also the beauty of it is that it forces
you to be excited about that breadth and
that partnership across different
disciplines that enable it but that also
opens up so many more doors where you
can go and you can do robotics and
almost any category where robotics isn't
a in isn't really an industry it's like
it's like ai right it's like the
application of physical automation to uh
you know to all these other worlds and
so you can do robotic surgery you can do
vehicles you can do factory automation
you can do healthcare or you can do like
uh
leverage the ai around the sensing to
think about static sensors and scene
understanding so um so i think that's
got to be the expectation and the
excitement and it
breeds people that are probably a little
bit more collaborative and more excited
about um
working in teams uh if i could briefly
comment
on the fact that the robotics people
i've met in my life
from cmu and mit
they're really happy people yeah because
i think it's the collaborative thing
yeah i think i think you don't
you're not like a sitting in like the
fourth basement uh exactly
which when you're doing machine learning
purely software it's very tempting to
just disappear into your own hole yeah
and never collaborate and and there that
breeds
a little bit more of the silo mentality
of like
i have a problem it's almost like
negative to talk to somebody else or
something like that but robotics folks
are just very collaborative very
friendly just and there's also an energy
of like you get to confront the physics
of reality often which is
humbling
and also exciting so it's humbling when
it it fails and exciting when it finally
it's like the purity of the passion you
got to remember that like right now like
robotics and ai is like just all the rage
and autonomous vehicles and all this
like 15 years ago and 20 years ago
like
it wasn't that deeply lucrative people
went into robotics they did it because
they were like thought it was just the
coolest thing in the world to like make
physical things intelligent in the real
world and so there's like a raw passion
where they went into it for the right
reasons and so forth and so it's really
great space and that organizational
challenge by the way like um when you
think about the challenges in av we talk
a lot about the technical challenges the
organizational challenge is through the
roof where
um you think about the challenge the
what it takes to build an av system and
you have companies that are now
thousands of people
and um you know you look at other really
hard technical problems like an
operating system it's pretty well
established like you kind of know that
there's a file system there's virtual
memory there's this there's that there's
like
caching and like and there's like a
really reasonably well established
modularity and apis and so forth and so
you can kind of like scale it in an
efficient fashion that doesn't exist
anywhere near to that level of maturity
in autonomous driving right now
and tech stacks are being reinvented
organizational structures are being
reinvented you have problems like
pedestrians that are not isolated
problems they're part sensing part
behavior prediction part planning part
evaluation and
like one of the biggest challenges is
actually how do you solve these problems
where the mental capacity of a human is
starting to get strained on how do you
organize it and think about it where
you know you have this like
multi-dimensional matrix that needs to
all work together and so
that makes it kind of cool as well
because it's not like solved at all uh
from you know like what what is what
does it take to actually scale this
right and then you look at like other
gigantic challenges that have you know
that have been success successful and
are way more mature
there's a stability to it and like maybe
the autonomous vehicle space will get
there but right now just as many uh
technical challenges as there are there's
like organizational challenges and how
do you like solve these problems that
touch on so many different areas and
efficiently tackle them
while
like maintaining progress among all
these constraints um while scaling
by way of advice
what advice would you give to uh
somebody thinking about doing a robotics
startup you mentioned cosmo somebody
that wanted to carry the cosmo flag
forward the anki flag forward
looking back at your experience
looking forward to the future that will
obviously have such robots what advice
would you give to that person yeah it
was the greatest experience ever and
it's like there's something you there's
things you learn
navigating a startup that you'll never
like you you it was very hard to
encounter that in like a typical kind of
work environment and um
and it's just it's wonderful you got to
be ready for it it's not as good like
you know the the glamour of a startup
there's just like just brutal emotional
swings up and down and so um having
co-founders actually helps a ton like i
would not cannot imagine doing it solo
but having at least somebody where on
your darkest days you can kind of like
really openly just like have that
conversation and you know lean on to
somebody that's that's in the thick of
it with you helps a lot what i would say
what was the nature of darkest days and
the emotional swings is it worried about
the funding is it worried about whether
any of your ideas
are any good or ever were good is it
like the self-doubt uh is it like
facing new challenges that have nothing
to do with the technology like
organizational human resources that kind
of stuff what yeah you come from a world
in school where
you feel that uh you put in a lot of
effort and you'll get the right result
and input translates proportional to
output and
you know you need to solve the set or do
whatever and just kind of get it done
now phd tests that a little bit but at
the end of the day you put in the effort
you tend to like kind of come out with
your enough results to you kind of get a
phd
in the startup space like
you know like you could talk to 50
investors and they just don't see your
vision and it doesn't matter how hard
you kind of tried and pitched you could
uh work incredibly hard and you have a
manufacturing defect and if you don't
fix it you're gonna you're out of
business um you need to raise money by a
certain date and there's a you got to
have this milestone in order to like
have a good pitch and you do it you have
to have this talent and you just don't
have it inside the company or um
you know you have to get 200 people or
however many people kind of like along
with you and kind of buy in the journey
um you're like disagreeing with an
investor and they're your investors so
it's just like you know it's like you
there's no walking away from it right so
um and it tends to be like those things
where you just kind of get clobbered in
so many different ways that like things
end up being harder than you expect and
it's like such a gauntlet
but you learn so much in the process and
there's a lot of people that actually
end up rooting for you and helping you
like from the outside and you get good
great mentors and you like get find
fantastic people that step up in the
company and you have this like magical
period where everybody's like
it's life or death for the company but
like you're all fighting for the same
thing and it's the most satisfying kind
of journey ever um the things that make
it easier and that i would recommend is
like be really really thoughtful about
the
the application like there's a there's a
saying of like kind of you know team and
execution and market and like kind of
how important are each of those um and
oftentimes the market wins and you come
at it thinking that if you're smart
enough and you work hard enough and
you're like have the right talented team
and so forth like you'll always kind of
find a way through and um it's
surprising how much dynamics are driven
by the industry you're in and the timing
of you entering that industry um and so
just uh waymo is a great example of it
there is
i don't know if there'll ever be another
company or suite of companies that
has raised and continues to spend so
much money at such an early uh phase of
revenue generation and product and
productization um the you know from a
p&l standpoint uh like it's it's an
anomaly like by any measure of any
industry that's ever existed um except
for maybe the u.s space program uh like
right uh like but
it's like uh
multiple trillion dollar opportunities
which is so unusual to find that size of
a market that
just the progress that shows the
de-risking of it you could apply
whatever discounts you want off of that
trillion-dollar market and it still
justifies the investment that is
happening because like being successful
in that space makes all the investments
feel trivial now by the same consequence
like
the size of the market the size of the
target audience the ability to capture
that market share how hard that's going
to be who the incumbents are like that's
probably one of the lessons i appreciate
like more than anything else where like
those things really really do matter
and um oftentimes can dominate the
quality of the team or execution because
if you
miss the timing or you do it in the
wrong space you run into like the
institutional kind of headwinds of a
particular environment like let's say
you have the greatest idea in the world
but you barrel into healthcare but it
takes 10 years to innovate in healthcare
because of a lot of challenges right
like there's fundamental
uh laws of physics that you have to
think about and so um the combination of
like anki waymo kind of drives that
point home for me where you can do a ton
if you have the right market the right
opportunity the right way to explain it
and you show the progress in the right
sequence
it actually can really significantly
change the course of your
journey and startup how much of it is
understanding the market and how much of
it's creating a new market so how do you
think about
like space robotics is really
interesting you said exactly right the
space of applications is small
yeah
you know relative to the cost involved
so how much is like truly revolutionary
thinking
about like what is the application
and then
yeah but so creating something that
didn't exist it didn't really exist like
this is pretty obvious to me the whole
space of home robotics just every
everything that cosmo did i guess you
could talk to it as a toy and people
will understand it but cosmo is much more
than a toy yeah
and
i don't think people fully understand
the value of that you have to create it
and the product will communicate it like
just like the iphone
nobody understood the value
of of no keyboard and a thing that's
that can do web browsing i don't think
they understand the value of that until
you create it
yeah having a foot in the door and an
entry point still helps because at the
end of the day like an iphone replaced
your phone and so it had a fundamental
purpose and all these things that it did
better right sure and so then you could
do abc on top of it and uh and then like
you even remember the early commercials
where it's always like one application
of what it could do and then you get a
phone call right and so that was
intentionally sending a message
something familiar but then like yes you
can send a text message you can listen
to music you can surf the web right and
so
you know autonomous driving obviously
anchors on that as well you don't have
to explain to somebody the functionality
of an autonomous truck right like
there's nuances around it but the
functionality makes sense um
in the home
you have a fundamental advantage like we
always thought about this because it was
so painful to explain to people what our
products did and how like how to
communicate that super cleanly
especially when something was so
experiential and so you compare like
anki to nest
nest um
had some beautiful products
where they
started scaling and like actually find
like really great success and they had
like really clean and beautiful
marketing messaging because they
anchored on reinventing existing
categories where it was a smart
thermostat right and uh like and so you
you kind of are able to
um take what's familiar anchor that
understanding and then explain what's
what's better about it that's funny
you're right cosmo is like totally new
thing like what what is this thing
because we struggled we spent like a lot
of money on marketing we had a heart
like we fought we actually had far
greater efficiency on cosmo than um
anything else because we found a way to
capture the emotion in some little
shorts to kind of lean into the
personality in our marketing and it
became viral where like we had these
kind of videos that would like go and
get like hundreds of thousands of views
and like kind of like get spread and
sometimes millions of views and so um
but it was like really really hard um
and so finding a way to kind of like
anchor on something that's familiar but
then grow into something that's not um
is an advantage but then again like you
don't have like there's successes
otherwise like alexa never had a comp
right uh
you could argue that that's very novel
and very new and um
and there's a lot of other examples that
kind of created a
kind of a category out of like kiva
systems i mean they like came in and
they like
uh enterprise is a little easier because
if you can uh it's less susceptible to
this because if you can argue a clear
value proposition it's a more logical
conversation that you can have um with
customers it's not it's a little bit
less emotional and um kind of subjective
but yeah in the home you have to
yeah so like a home robot it's like what
does that mean yeah and so then you
really have to be crisp about the value
proposition and what like
really makes it worth it like and and we
by the way went through that same ordeal we
almost like
we almost hit a wall coming out of 2013
where
we were so big on explaining why our
stuff was so high-tech and all the kind
of like great technology in it and how
cool it is and so forth um to having to
make a super hard pivot on why is it fun
and why did like does the random kind of
family of
four need this right like so it's
learnings but
that's that's the challenge and i think
like robotics tends to sometimes fall
into the new category problem but then
you gotta be really crisp about why
it needs to exist well i think some of
robotics depending on the category
depending on the application
is a little bit of a marketing
this uh challenge and i don't i don't
mean
i mean
it's it's the kind of marketing that
waymo is doing that tesla is doing is
like
showing off incredible
engineering incredible technology
but convincing like you said a family of
four that this this will this is like
this is transformative for your life
this is this is this is fun this is you
don't care about tech isn't your thing
they don't they really don't like they
need to know why they want it so some of
that is just marketing yeah that's why
like roomba like um yes they didn't you
know like
go and you know have this like
you know huge huge con you know ramp
into like the entirety of like kind of a
robotics and so forth but like they
built a really great business and um uh
in a vacuum cleaner world and like
everybody understands what a vacuum
cleaner is um most people are annoyed by
doing it um and now you have one that
like kind of does it itself
uh yeah various degrees of quality but
that is so compelling that like it's
easier to understand and like uh and
they had a very kind of and i think they
have like 15 percent of the vacuum cleaner
market so it's like pretty successful
right
i think we need more of those um types
of thoughtful stepping stones in
robotics but the opportunities are
becoming bigger because
hardware's cheaper compute's cheaper
cloud's cheaper and ai's better so
there's a lot of opportunity if we zoom
out from specifically startups and
robotics what advice do you have
to uh high school students college
students about
career
and living a life that you can be proud
of you lived one heck of a life you're
very successful in several domains
um
if you can convert that into a
generalizable potion what advice would
you give yeah it's a very good question
so it's very hard to
go into a space that you're not
passionate about and push
like push hard enough to be you know to
like
maximize your potential uh in it and so
there's a um
there's always kind of like the saying
of like okay follow your passion great
try to find the overlap of where your
passion overlaps with like a growing
opportunity and need in the world where
it's not too different than the startup
kind of argument that we talked about
where um if you are where your passion
meets the market right you know i mean
like because it's like uh um it's a you
know that's a beautiful thing where like
you can do what you love but it's also
just opens up tons of opportunities
because the world's ready for it right
like and so um and so like if you're
interested in technology um that might
point to like go and study machine
learning because you don't have to
decide what career you're going to go
into but it's going to be such a
versatile space that's going to be at
the root of like everything that's going
to be in front of us that
you can have
eight different careers in different
industries
and be an absolute expert in this like
kind of tool set that you wield that can
go and be applied um and that by the way
that doesn't apply to just technology
right it's uh it could be the exact same
thing if you want to um you know the
same thought process applies to design
to marketing to um you know to sales to
anything but um that versatility where
you like um
when you're in a space that's gonna
continue to grow um it's just like what
company do you join one that just is
going to grow and the growth creates
opportunities where the surface area is
just going to increase and the problems
will never get stale and you can have
you know many like and so you go into a
career where you have that sort of
growth in the in the world that you're
in
you end up having
so much more opportunity that
organically just appears and you can
then have more shots on goal to find
like that killer overlap of timing and
passion and skill set and point in life
where you can like
you know just really be motivated and
fall in love with something um
and then at the same time like uh find a
balance like there's been times in my
life where i worked like a little bit
too obsessively and you know and crazy
and uh and i you know think we kind of
like tried to correct that you know kind
of the right opportunities but you know
i think i probably appreciate a lot more
now
friendships that go way back um you know
family and things like that and um
and i i'm kind of have the personality
where i could use like i have like so
much desire to really try to optimize
like you know what i'm working on that i
can easily go to kind of an extreme and
now i'm trying to like kind of find that
balance and make sure that i have
the friendships the family like
relationship with the kids everything
that like i don't
uh i push really really hard but i kind
of find a balance and
and i think
people can be happy on
actually many kind of extremes on that
spectrum but it's easy to kind of
inadvertently make a choice
by how how you approach it that then
becomes really hard to unwind um
and so being very thoughtful about
kind of all of those dimensions makes a
lot of sense and so
um to come those are all interrelated um
but at the end of the day oh love
passion and love yeah love towards you
said uh yeah family friends family and
hopefully
one day
if your work pans out
boris
is love towards robots
not the creepy kind the good kind that's a
good kind
just just friendship and yeah and fun
just yeah it's like another dimension to
just how we interface with the world
yeah
of course you're one of my favorite
human beings roboticists you've created
some incredible robots and i think
inspired countless people
and like i said
i hope cozmo i hope your work with anki
lives on and um i can't wait
to see what you do with waymo i mean
that's if we're talking about artificial
intelligence technology that has the
potential to revolutionize
so much of our world
that's it right there so thank you so
much for the work you've done and thank
you for spending your valuable time
talking with me thanks lex
thanks for listening to this
conversation with boris sofman to
support this podcast please check out
our sponsors in the description and now
let me leave you with some words from isaac
asimov
if you were to insist i was a robot
you might not consider me capable of
love
in some mystic human sense
thank you for listening and hope to see
you next time