I’m a PhD student at the University of Tokyo, supervised by Prof. Yoichi Sato. My research focuses on computer vision and human activity understanding, in particular video and multi-view understanding, vision-language multimodal models, and human body perception.

🎓 Education

  • Ph.D. in Information Science @ The University of Tokyo (2026.3 expected)
  • M.Sc. in Information Science @ The University of Tokyo (2023.3)
  • B.Sc. in Computer Science @ Nanjing University (2020.7)

🔬 Research Experience

  • Intern at CyberAgent AI Lab, Activity Understanding Team, 2024
  • Intern at Shanghai AI Laboratory, OpenGVLab, 2023
  • Intern at Microsoft Research Asia, Media Computing Group, 2022
  • Intern at PCL Shenzhen, Virtual Reality Lab, 2021

🎖️ Services and Awards

  • JSPS Research Fellowship for Young Scientists DC2
  • Reviewer for CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, ACMMM, BMVC
  • UTokyo-IIS Research Collaboration Initiative Award
  • “Stars of Tomorrow” award from Microsoft Research Asia
  • Contract photographer for Visual China Group

📄 Publications

arXiv 2025

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant

Yifei Huang, Jilan Xu, Baoqi Pei, Yuping He, Guo Chen, Mingfang Zhang, Lijin Yang, ..., Limin Wang
arXiv preprint, 2025
Paper and Code


ICLR 2025

SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training

Nie Lin, Takehiko Ohkawa, Yifei Huang, Mingfang Zhang, Minjie Cai, Ming Li, Ryosuke Furuta, Yoichi Sato
International Conference on Learning Representations (ICLR), 2025
Paper and Code


ECCV 2024

Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition

Mingfang Zhang, Yifei Huang, Ruicong Liu, Yoichi Sato
European Conference on Computer Vision (ECCV), 2024
Paper and Code


CVPR 2024

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

(*co-first author) Yifei Huang*, Guo Chen*, Jilan Xu*, Mingfang Zhang*, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, Yu Qiao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Paper and Code


CVPR 2024

Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation

Ruicong Liu, Takehiko Ohkawa, Mingfang Zhang, Yoichi Sato
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Paper and Code


CVPR 2023

Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction

Mingfang Zhang, Jinglu Wang, Xiao Li, Yifei Huang, Yoichi Sato, Yan Lu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Paper and Code


CVPR 2022

GazeOnce: Real-Time Multi-Person Gaze Estimation

Mingfang Zhang, Yunfei Liu, Feng Lu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Paper and Code


TPAMI 2021

Optical Flow in the Dark

Mingfang Zhang, Yinqiang Zheng, Feng Lu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Paper and Code


CVPR 2020

Optical Flow in the Dark

(*co-first author) Yinqiang Zheng*, Mingfang Zhang*, Feng Lu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
Paper and Code