Singh, Satyam, Aditya Sharma, and Smita Tiwari. 2026. “Empirical Benchmarking of Vision-Language Transformer Combinations for Visual Question Answering Tasks.” DMPedia Lecture Notes in Computer Science & Engineering, no. IMPACT26 (March): 199–209. https://digitalmanuscriptpedia.com/conferences/index.php/DMP-LNCSE/article/view/144.