Thinh, Nguyen Van, Tran Van Lang, and Van The Thanh. “RGTranCNet: Effective Image Captioning Model Using Cross-Attention and Semantic Knowledge.” Vietnam Journal of Science and Technology 64, no. 1 (July 15, 2025): 123–138. Accessed March 19, 2026. https://cip.vast.vn/jst/article/view/22381.