As part of the MIT autonomous vehicle technology study, we're instrumenting cars with various degrees of automation. So let's take a look at one of those cars, a Tesla Model S, and look at our instrumentation.

Inside the car we have three cameras. One is looking at the driver's face, and that's capturing things like where the driver is looking, the drowsiness of the driver, the emotional state, and also cognitive load. We have a camera looking at the driver's body, a fisheye-lens camera that's capturing the entire body of the driver, including the hands. That's giving you information about whether the hands are off the wheel, whether the body is aligned, and further supplementary information about the state of the driver beyond what the face camera provides. And finally, there's a forward-facing camera attached to the windshield that's looking at the forward roadway, and it's capturing everything in the external environment, such as the vehicles, the lanes, and other characteristics of the road.

Having these three cameras in the car allows us to study driver behavior and interaction with automation. The camera looking at the driver's face, the camera looking at the body, and the camera looking at the outside environment allow us to understand, over hundreds of thousands of miles of real-world driving, how people interact with these technologies, and how we can have artificial intelligence systems play an important role in keeping us safe and providing an enjoyable driving experience.

To date, we have collected 275,000 miles of real-world driving and interaction with autonomous systems in Tesla Model S vehicles, Range Rover Evoque vehicles, and a Volvo S90. But most importantly, once that data is collected, it's just raw pixels: 3.5 billion video frames of raw pixels. We're using computer vision and deep learning methods to convert those pixels into knowledge, into an understanding of what the drivers are actually doing with these systems. That comes from the face camera, the body camera, and the forward-facing camera. Understanding comes from actually being able to touch every single one of those frames and convert them into the behavior of human beings as they interact with these artificial intelligence systems.
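To make that pixels-to-knowledge step concrete, here is a minimal sketch of frame-by-frame inference over one of those video streams, written in Python with PyTorch and OpenCV. It is an illustration under stated assumptions, not the study's actual pipeline: the toy model, the glance-region labels, and the file name `face_camera.mp4` are all hypothetical placeholders, and a real system would use a trained network rather than the randomly initialized one shown here.

```python
# Sketch: turn raw video pixels into per-frame driver-state labels.
# Hypothetical placeholders throughout; not the MIT-AVT study's code.
import cv2
import torch
import torch.nn as nn

# Hypothetical glance-region classes for the face-camera stream.
GLANCE_CLASSES = ["road", "instrument_cluster", "center_stack",
                  "rearview_mirror", "left", "right"]

class GlanceNet(nn.Module):
    """Toy CNN standing in for a trained glance classifier."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),       # (N, 32, 1, 1)
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def label_frames(video_path: str, model: nn.Module):
    """Yield (frame_index, predicted_label) for every frame in the video."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    with torch.no_grad():
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # BGR uint8 HxWx3 -> normalized float 1x3x224x224 tensor
            rgb = cv2.cvtColor(cv2.resize(frame, (224, 224)),
                               cv2.COLOR_BGR2RGB)
            x = (torch.from_numpy(rgb).permute(2, 0, 1)
                 .float().div(255).unsqueeze(0))
            pred = model(x).argmax(dim=1).item()
            yield idx, GLANCE_CLASSES[pred]
            idx += 1
    cap.release()

if __name__ == "__main__":
    model = GlanceNet(len(GLANCE_CLASSES)).eval()
    for idx, label in label_frames("face_camera.mp4", model):
        print(idx, label)
```

The same pattern, one model pass per decoded frame, would apply to the body camera (e.g., hands on or off the wheel) and the forward-facing camera (e.g., lane and vehicle detection), which is what makes it feasible to touch every one of the billions of frames.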