As part of the MIT autonomous vehicle technology study, we're instrumenting cars with various degrees of automation. So let's take a look at one of those cars, a Tesla Model S, and look at our instrumentation.

Inside the car we have three cameras. One is looking at the driver's face, and that's capturing things like where the driver is looking, the drowsiness of the driver, the emotional state, and also cognitive load. We have a camera looking at the driver's body, a fisheye-lens camera that's capturing the entire body of the driver, including the hands. That's giving you information about whether the hands are off the wheel, whether the body is aligned, and further supplementary information about the state of the driver beyond what the face camera provides. And finally, there's a forward-facing camera attached to the windshield that's looking at the forward roadway, and it's capturing everything in the external environment, such as the vehicles, the lanes, and other characteristics of the road.

Having these three cameras in the car allows us to study driver behavior and interaction with automation. The camera looking at the driver's face, the camera looking at the body, and the camera looking at the outside environment allow us to understand, over hundreds of thousands of miles of real-world driving, how people interact with these technologies, and how we can have artificial intelligence systems play an important role in keeping us safe and providing an enjoyable driving experience.

To date, we have collected 275,000 miles of real-world driving and interaction with autonomous systems in Tesla Model S vehicles, Range Rover Evoque vehicles, and a Volvo S90. But most importantly, once that data is collected, it's just raw pixels: 3.5 billion video frames of raw pixels. We're using computer vision and deep learning methods to convert those pixels into knowledge, into an understanding of what the drivers are actually doing with these systems. That comes from the face camera, the body camera, and the forward-facing camera. Understanding comes from actually being able to touch every single one of those frames and convert them into the behavior of human beings as they interact with these artificial intelligence systems.
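To make that pixels-to-knowledge step concrete, here is a minimal sketch of frame-by-frame inference over one of those video streams, written in Python with PyTorch and OpenCV. It is an illustration under stated assumptions, not the study's actual pipeline: the toy model, the glance-region labels, and the file name `face_camera.mp4` are all hypothetical placeholders, and a real system would use a trained network rather than the randomly initialized one shown here.

```python
# Sketch: turn raw video pixels into per-frame driver-state labels.
# Hypothetical placeholders throughout; not the MIT-AVT study's code.
import cv2
import torch
import torch.nn as nn

# Hypothetical glance-region classes for the face-camera stream.
GLANCE_CLASSES = ["road", "instrument_cluster", "center_stack",
                  "rearview_mirror", "left", "right"]

class GlanceNet(nn.Module):
    """Toy CNN standing in for a trained glance classifier."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),       # (N, 32, 1, 1)
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def label_frames(video_path: str, model: nn.Module):
    """Yield (frame_index, predicted_label) for every frame in the video."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    with torch.no_grad():
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # BGR uint8 HxWx3 -> normalized float 1x3x224x224 tensor
            rgb = cv2.cvtColor(cv2.resize(frame, (224, 224)),
                               cv2.COLOR_BGR2RGB)
            x = (torch.from_numpy(rgb).permute(2, 0, 1)
                 .float().div(255).unsqueeze(0))
            pred = model(x).argmax(dim=1).item()
            yield idx, GLANCE_CLASSES[pred]
            idx += 1
    cap.release()

if __name__ == "__main__":
    model = GlanceNet(len(GLANCE_CLASSES)).eval()
    for idx, label in label_frames("face_camera.mp4", model):
        print(idx, label)
```

The same pattern, one model pass per decoded frame, would apply to the body camera (e.g., hands on or off the wheel) and the forward-facing camera (e.g., lane and vehicle detection), which is what makes it feasible to touch every one of the billions of frames.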