Hi! I’m Yang Tian (Chinese name: 田扬), a first-year Master’s student at the MINT Lab, Shanghai Jiao Tong University, where I’m fortunate to be supervised by Prof. Bo Zhao.

My research focuses on Multimodal Large Language Models and AI Agents, with an interest in building intelligent systems that can understand and reason across text and visual information.

I’m always open to discussions and collaborations. Please feel free to reach out via email if our interests align!

🔥 News

2025.06: 🎉🎉 One paper was accepted to ICCV 2025.

📝 Publications


[arXiv] TimeScope: Towards Task-Oriented Temporal Grounding In Long Videos. Xiangrui Liu, Minghao Qin, Yan Shu, Zhengyang Liang, Yang Tian, Chen Jason Zhang, Bo Zhao, Zheng Liu. Paper

[ICCV 2025] MMCR: Benchmarking Cross-Source Reasoning in Scientific Papers. Yang Tian, Zheng Lu, Mingqi Gao, Zheng Liu, Bo Zhao. Paper

[TGRS] FPNFormer: Rethink the Method of Processing the Rotation-Invariance and Rotation-Equivariance on Arbitrary-Oriented Object Detection. Yang Tian, Mengmeng Zhang, Jinyu Li, Yangfan Li, Hong Yang, Wei Li. Code

[IGARSS 2023] Aligned Feature for Vector-Based Rotated Object Detection. Yang Tian, Jinyu Li, Mengmeng Zhang. Paper

[arXiv] Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding. Xiangrui Liu, Yan Shu, Zheng Liu, Ao Li, Yang Tian, Bo Zhao. Paper

📖 Education

2025.09 - now, School of Artificial Intelligence, Shanghai Jiao Tong University.
2021.09 - 2025.06, School of Mechanical Engineering, Beijing Institute of Technology.

💻 Internships

2026.03 - now, MiLM Plus, Xiaomi Inc.
2025.05 - 2025.10, Lark AI, ByteDance.
