Hi,
I’m Jinming Wu, a Master’s student at Beijing University of Posts and Telecommunications (BUPT), supervised by Prof. Jingyu Wang.
My research focuses on Multimodal Understanding and Reasoning. Recently, I have been working on building strong reasoning multimodal agents through end-to-end Reinforcement Learning.
📧 Email ([email protected]) 🔗 Google Scholar / Github / Twitter
📄 MMSearch-R1: Incentivizing LMMs to Search (Work in Progress) ✍🏻 Jinming Wu*, Zihao Deng*, Wei Li, Yiding Liu, Bo You, Bo Li, Zejun Ma, Ziwei Liu 🔗 blog / code 🌟 We introduce MMSearch-R1, an initial effort to equip LMMs with active image search capabilities through an end-to-end RL framework.
📄 Video Instruction Tuning With Synthetic Data ✍🏻 Yuanhan Zhang, Jinming Wu, Wei Li, Bo Li, Zejun Ma, Ziwei Liu, Chunyuan Li 🔗 ArXiv 2024 / paper / code / project 🌟 We release LLaVA-Video-178K, a high-quality synthetic dataset containing 178K video captions and 1.15M video QAs. We also explore the best strategies for training VideoLLMs. Our LLaVA-Video-7B/72B models demonstrate strong performance across 10+ video understanding tasks.
📄 MDR: Model-Specific Demonstration Retrieval at Inference Time for In-Context Learning ✍🏻 Huazheng Wang*, Jinming Wu*, Haifeng Sun, Zixuan Xia, Daixuan Cheng, Jingyu Wang, Qi Qi, Jianxin Liao 🔗 NAACL Oral 2024 / paper / code 🌟 We propose a simple and effective metric to measure the preference of different LLMs for demonstrations and leverage this metric to improve existing demonstration retrieval frameworks at the inference stage.
🏢 TikTok-AIIC, ByteDance, 03/2024~Present. Research Intern, supervised by Wei Li and Zejun Ma.
🌸 National Scholarship, Ministry of Education of the People’s Republic of China, Oct. 2023