I’m a PhD student at the University of Tokyo, supervised by Prof. Yoichi Sato. My research focuses on computer vision and human activity understanding, in particular video and multi-view understanding, vision-language multimodal models, and human body perception.
🎓 Education
- Ph.D. in Information Science @ The University of Tokyo (2026.3 expected)
- M.Sc. in Information Science @ The University of Tokyo (2023.3)
- B.Sc. in Computer Science @ Nanjing University (2020.7)
🔬 Research Experience
- Intern at CyberAgent AI Lab, Activity Understanding Team, 2024
- Intern at Shanghai AI Laboratory, OpenGVLab, 2023
- Intern at Microsoft Research Asia, Media Computing Group, 2022
- Intern at PCL Shenzhen, Virtual Reality Lab, 2021
🎖️ Services and Awards
- JSPS Research Fellowship for Young Scientists DC2
- Reviewer for CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, ACMMM, BMVC
- UTokyo-IIS Research Collaboration Initiative Award
- “Stars of Tomorrow” award by Microsoft Research Asia
- Contract photographer for Visual China Group
📄 Publications
An Egocentric Vision-Language Model based Portable Real-time Smart Assistant
Yifei Huang, Jilan Xu, Baoqi Pei, Yuping He, Guo Chen, Mingfang Zhang, Lijin Yang, ..., Limin Wang
arXiv preprint, 2025
Paper and Code
SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training
Nie Lin, Takehiko Ohkawa, Yifei Huang, Mingfang Zhang, Minjie Cai, Ming Li, Ryosuke Furuta, Yoichi Sato
International Conference on Learning Representations (ICLR), 2025
Paper and Code
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang, Yifei Huang, Ruicong Liu, Yoichi Sato
European Conference on Computer Vision (ECCV), 2024
Paper and Code
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
(*co-first author) Yifei Huang*, Guo Chen*, Jilan Xu*, Mingfang Zhang*, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, Yu Qiao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Paper and Code
Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation
Ruicong Liu, Takehiko Ohkawa, Mingfang Zhang, Yoichi Sato
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Paper and Code
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
Mingfang Zhang, Jinglu Wang, Xiao Li, Yifei Huang, Yoichi Sato, Yan Lu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Paper and Code
GazeOnce: Real-Time Multi-Person Gaze Estimation
Mingfang Zhang, Yunfei Liu, Feng Lu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Paper and Code
Optical Flow in the Dark
Mingfang Zhang, Yinqiang Zheng, Feng Lu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Paper and Code
Optical Flow in the Dark
(*co-first author) Yinqiang Zheng*, Mingfang Zhang*, Feng Lu
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
Paper and Code