I'm a PhD student at the University of Tokyo, supervised by Prof. Yoichi Sato. My research focuses on computer vision and human activity understanding, in particular video and multi-view understanding, vision-language multimodal models, and human body perception.
Education

Ph.D. in Information Science The University of Tokyo

M.Sc. in Information Science The University of Tokyo

B.Sc. in Computer Science Nanjing University
Research Experience

Intern at CyberAgent AI Lab Activity Understanding Team

Intern at Shanghai AI Laboratory OpenGVLab

Intern at Microsoft Research Asia Media Computing Group
Services and Awards
Reviewer for CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, ACMMM, BMVC
JSPS Research Fellowship for Young Scientists DC2 2025
UTokyo-IIS Research Collaboration Initiative Award 2024
"Stars of Tomorrow" award by Microsoft Research Asia 2023
Contract photographer for Visual China Group
Publications

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant
arXiv 2025 [b3]

Egocentric Inertial Localization with Vision-Language Informed Action Cues
arXiv 2025 [b2]


SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training
ICLR 2025 [c3]

Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
ECCV 2024 [a5]

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
CVPR 2024 [a4]

Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation
CVPR 2024 [c2]

Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
CVPR 2023 [a3]

GazeOnce: Real-Time Multi-Person Gaze Estimation
CVPR 2022 [c1]

Optical Flow in the Dark
TPAMI 2021 [a2]

Optical Flow in the Dark
CVPR 2020 [a1]