Pei, Dashuai and He, Jianhua and Liu, Kezhong and Wu, Yiwen and Lei, Yaxiong and Xiao, Xuedou (2026) Enhancement of Large Language Models Driving Knowledge for Practical Autonomous Driving Decision Making. IEEE Transactions on Vehicular Technology. pp. 1-17. DOI https://doi.org/10.1109/tvt.2026.3678726
Abstract
Theoretical driving knowledge and hazard perception capability are essential for large language models (LLMs) to be qualified to drive or assist autonomous vehicles. Our prior study found that mainstream LLMs small enough for onboard or edge deployment failed driving theory tests. This raised two important research questions on the application of LLMs to safety-critical autonomous driving tasks: how to enhance the theoretical driving knowledge of LLMs, and how to assess its impact on driving decision-making and safety in realistic scenarios. In this paper, we first propose a novel question-guided multimodal graph-based retrieval-augmented generation (QGM-GRAG) approach to address the first question. Structured textual rules and visual traffic sign semantics are integrated into a unified graph-based representation. LLMs generate a joint representation that fuses paragraph-level embeddings with question-aware signals for accurate and interpretable retrieval. We then propose a general scenario-based evaluation framework to address the second question by investigating the connection between driving theory knowledge and practical driving decision-making performance. In the framework, theory-test questions are enhanced and converted into realistic driving scenarios, and prompts are designed for LLMs to generate driving decisions. The decisions are evaluated automatically by a GPT-4o-based agent across multiple dimensions (such as situation awareness and driving safety). Experiments demonstrate that the proposed RAG approach achieves more than 10% improvement on driving theory tests. In addition, results from the evaluation framework show that driving knowledge has a significant impact on driving decisions and that the QGM-GRAG approach improves driving performance by more than 13%.
Furthermore, experiments on the real-world LingoQA dataset show an average improvement of 44.87%, confirming the practical applicability of our pipeline. These results validate the effectiveness and robustness of the proposed RAG approach and evaluation framework.
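As a rough illustration of the retrieval idea in the abstract, the sketch below scores knowledge nodes by fusing a paragraph-level embedding with a question-aware vector and ranking by cosine similarity to the query. It is not the paper's implementation. The weighted-sum fusion, the `alpha` parameter, the node fields, and the toy three-dimensional vectors are all assumptions made for this example, and the graph structure and visual traffic sign features of QGM-GRAG are left out.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuse(paragraph_vec, question_vec, alpha=0.5):
    """Hypothetical fusion: weighted sum of paragraph and question-aware vectors."""
    return [alpha * p + (1 - alpha) * q for p, q in zip(paragraph_vec, question_vec)]


def retrieve(query_vec, nodes, alpha=0.5, top_k=1):
    """Rank nodes by similarity between the query and each node's fused vector."""
    scored = [
        (cosine(query_vec, fuse(n["paragraph"], n["question"], alpha)), n["id"])
        for n in nodes
    ]
    scored.sort(reverse=True)
    return [node_id for _, node_id in scored[:top_k]]


# Toy knowledge nodes with made-up embeddings (assumed for illustration only).
nodes = [
    {"id": "speed_limit_rule", "paragraph": [1.0, 0.0, 0.0], "question": [0.8, 0.2, 0.0]},
    {"id": "give_way_sign", "paragraph": [0.0, 1.0, 0.0], "question": [0.0, 0.9, 0.1]},
]
query = [0.1, 0.9, 0.0]

print(retrieve(query, nodes))
```

With these toy vectors the query is closest to the `give_way_sign` node, so that node is returned first. In a real pipeline the vectors would come from an embedding model rather than being written by hand.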
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | LLMs; multimodal RAG; driving theory tests; theory-to-behavior transfer |
| Subjects: | Z Bibliography. Library Science. Information Resources > ZR Rights Retention |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 26 Jun 2026 14:49 |
| Last Modified: | 26 Jun 2026 15:08 |
| URI: | http://repository.essex.ac.uk/id/eprint/43477 |
Available files
Filename: Enhancement_of_Large_Language_Models_Driving_Knowledge_for_Practical_Autonomous_Driving_Decision_Making.pdf
Licence: Creative Commons: Attribution 4.0