Modelling Autonomous Driving and Obstacle Avoidance Using Multi-Modal Fusion Transformer Framework | International Journal of Innovative Science and Research Technology

Modelling Autonomous Driving and Obstacle Avoidance Using Multi-Modal Fusion Transformer Framework

Authors : Abhinav Singh; Nishant Prakash; Kanishk Bhaskar; Krishnan Rangarajan; Aditya Raj

Volume/Issue : Volume 8 - 2023, Issue 2 - February

Google Scholar : https://bit.ly/3IIfn9N

DOI : https://doi.org/10.5281/zenodo.7664562

Abstract : The papers surveyed here present a range of methods for autonomous driving. One of the novel representations supports reasoning for imitation learning in a given scene: cameras are used to highlight locations that correspond to waypoints and semantics. In this method the camera follows the car and shows the waypoints at a certain distance ahead of it at all times while the car is moving. The papers use attention fields to compress two-dimensional images into the features best suited to reasoning about discrete elements of the scene, in other words obstacles that may appear in front of the car. The other model, a Multi-Modal Fusion Transformer, uses an attention mechanism to combine two separate data sources: image data from cameras and topography data from distance sensors. The distance sensor maps the surfaces of the surroundings in which the car is being driven.

Keywords : End-to-End Autonomous Driving, Transformer, 2D Imaging, Self-Attention Model, Imitation Learning.
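
The attention-based fusion described in the abstract can be illustrated with a minimal sketch. It is not the implementation from the surveyed papers: the function name fuse_tokens, the toy feature vectors, and the use of identity query/key/value projections (a real transformer learns these) are all assumptions made for illustration. The sketch concatenates camera and distance-sensor feature tokens into one sequence and applies scaled dot-product self-attention, so every output token is a weighted mix of both modalities.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_tokens(image_tokens, lidar_tokens):
    """Hypothetical helper: joint self-attention over two token sets.

    Assumption: identity projections are used for queries, keys and
    values; a trained model would use learned weight matrices instead.
    """
    tokens = image_tokens + lidar_tokens  # one joint sequence
    dim = len(tokens[0])
    fused, weights = [], []
    for query in tokens:
        # Scaled dot-product score of this token against every token.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Attention output: weighted sum of all token vectors.
        fused.append(
            [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        )
    n_img = len(image_tokens)
    return fused[:n_img], fused[n_img:], weights


# Toy feature tokens (made-up values, not real sensor data).
image_tokens = [[1.0, 0.0, 0.5, 0.2], [0.1, 0.9, 0.3, 0.4]]
lidar_tokens = [[0.8, 0.1, 0.4, 0.0], [0.0, 1.0, 0.2, 0.6], [0.5, 0.5, 0.5, 0.5]]

image_out, lidar_out, weights = fuse_tokens(image_tokens, lidar_tokens)
```

Each row of weights sums to one and spans both the image and the distance-sensor tokens, which is what lets a camera token draw on geometric context and vice versa.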
