Hi,
I’m Jinming Wu, a Master’s student at Beijing University of Posts and Telecommunications (BUPT), supervised by Prof. Jingyu Wang.
My research focuses on Multimodal Understanding and Reasoning. Recently, I have been working on building strong reasoning multimodal agents through end-to-end Reinforcement Learning.
📧 Email ([email protected]) 🔗 Google Scholar / Github / Twitter
📄 MMSearch-R1: Incentivizing LMMs to Search (Work in Progress) ✍🏻 Jinming Wu*, Zihao Deng*, Wei Li, Yiding Liu, Bo You, Bo Li, Zejun Ma, Ziwei Liu 🔗 blog / code 🌟 We introduce MMSearch-R1, an initial effort to equip LMMs with active image search capabilities through an end-to-end RL framework.
📄 Video Instruction Tuning With Synthetic Data ✍🏻 Yuanhan Zhang, Jinming Wu, Wei Li, Bo Li, Zejun Ma, Ziwei Liu, Chunyuan Li 🔗 ArXiv 2024 / paper / code / project 🌟 We release LLaVA-Video-178K, a high-quality synthetic dataset containing 178K video captions and 1.15M video QAs. We also explore the best strategies for training VideoLLMs. Our LLaVA-Video-7B/72B models demonstrate strong performance across 10+ video understanding tasks.
📄 MDR: Model-Specific Demonstration Retrieval at Inference Time for In-Context Learning ✍🏻 Huazheng Wang*, Jinming Wu*, Haifeng Sun, Zixuan Xia, Daixuan Cheng, Jingyu Wang, Qi Qi, Jianxin Liao 🔗 NAACL Oral 2024 / paper / code 🌟 We propose a simple and effective metric to measure the preference of different LLMs for demonstrations and leverage this metric to improve existing demonstration retrieval frameworks at the inference stage.
🏢 TikTok-AIIC, ByteDance, 03/2024~Present. Research Intern, supervised by Wei Li and Zejun Ma.
🌸 National Scholarship, Ministry of Education of the People’s Republic of China, Oct. 2023