To cope with the inertia problem without an explicit mapping of potential causes or on-policy interventions, we jointly train a sensorimotor controller with a network that predicts the ego vehicle’s speed. Both networks share the same representation through our ResNet perception backbone. Intuitively, this joint optimization forces the perception module to encode speed-related features in the learned representation. This reduces the dependency on input speed as the only source of information about scene dynamics, leveraging instead visual cues that are predictive of the car’s velocity (e.g., free space, curves, traffic light states).
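
The joint training described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The single linear layer standing in for the ResNet backbone, the L1 loss, the additive weighting, and the names `shared_features`, `linear_head`, `joint_loss`, and `speed_weight` are all assumptions made for illustration. The point it shows is that the control head and the speed head read the same features, so the speed error also shapes the shared representation.

```python
from typing import List, Sequence


def l1(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def shared_features(x: Sequence[float], w_shared: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical stand-in for the perception backbone: linear layer + ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_shared]


def linear_head(features: Sequence[float], w_head: Sequence[Sequence[float]]) -> List[float]:
    """A linear output head; used here for both the controller and the speed predictor."""
    return [sum(w * f for w, f in zip(row, features)) for row in w_head]


def joint_loss(
    pred_action: Sequence[float],
    target_action: Sequence[float],
    pred_speed: Sequence[float],
    target_speed: Sequence[float],
    speed_weight: float = 0.1,  # assumed value, not taken from the source
) -> float:
    """Control loss plus a weighted speed-prediction loss (assumed additive form)."""
    return l1(pred_action, target_action) + speed_weight * l1(pred_speed, target_speed)


# Both heads consume the same representation z.
z = shared_features([1.0, 2.0], [[1.0, 0.0], [0.0, -1.0]])
action = linear_head(z, [[0.5, 1.0], [2.0, 0.0]])  # e.g. steering and throttle
speed = linear_head(z, [[3.0, 1.0]])               # predicted ego speed
loss = joint_loss(action, [0.0, 1.0], speed, [5.0])
```

In a real training loop, the gradient of this single scalar would flow through both heads into the shared backbone weights. That is the mechanism by which speed-related visual cues end up in the learned representation.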
