A traffic light detector must deliver reliable performance in real time and work for both small (e.g., 3×9 pixels) and large objects, with low false positive and low false negative rates while maintaining high detection accuracy. For example, a falsely detected red traffic light will cause the autonomous vehicle to stop abruptly while driving, whereas a missed red light will cause the vehicle to drive through an intersection against a red signal. In this coarse-grained traffic light detection step, we focus on reducing the false negative (FN) rate, i.e., on collecting as many true traffic lights as possible. We utilize the Single-Shot multi-box Detector (SSD) [5], which has been shown to be an effective tool for object detection. Note that we use the SSD architecture, which has shown improved detection accuracy on other benchmarks, rather than the YOLO network architecture used in the existing work by Behrendt et al. [1]. A more modern architecture, such as Mask R-CNN [6], may provide better detection accuracy, but we leave this comparison for future work.

The SSD model is based on a convolutional network: it takes the whole image as input and predicts a fixed-size collection of bounding boxes and corresponding confidence scores for the presence of object instances in those boxes. The final detections are then produced by a non-maximum suppression step: all detection boxes are sorted by their predicted scores, the detection with the maximum score is selected, and other detections that overlap it significantly are suppressed. As described in Figure 2, we use a standard VGG-16 network architecture [7] as the base convolutional network, pre-trained on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset.
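The non-maximum suppression step described above can be sketched as a greedy procedure over score-sorted boxes. This is a minimal illustration, not the exact implementation used here; in particular, the overlap threshold of 0.5 is an assumed value, as the text does not specify one.

```python
def iou(a, b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, overlap_thresh=0.5):
    # Greedy non-maximum suppression: repeatedly keep the highest-scoring
    # remaining box and suppress all boxes overlapping it significantly.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= overlap_thresh]
    return keep
```

For example, given two heavily overlapping candidate boxes and one distant box, only the higher-scoring box of the overlapping pair survives alongside the distant one.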
