Singh S, Sharma A, Tiwari S. Empirical Benchmarking of Vision-Language Transformer Combinations for Visual Question Answering Tasks. DMP-LNCSE [Internet]. 2026 Mar. 13 [cited 2026 Jun. 30];(IMPACT26):199-20. Available from: https://digitalmanuscriptpedia.com/conferences/index.php/DMP-LNCSE/article/view/144