Yixuan Li (李奕萱)

I am currently a third-year Ph.D. student at MMLab in the Department of Information Engineering, CUHK, advised by Prof. Dahua Lin. Before that, I received my Master's degree from Nanjing University in 2022, supervised by Prof. Limin Wang, and my Bachelor's degree from Nanjing University in 2019. My research area is 3D vision, especially 3D scene reconstruction and generation.

Email / Google Scholar / Twitter / GitHub

News

• 09/2024 Two papers accepted by NeurIPS 2024.
• 02/2024 One paper accepted by CVPR 2024.
• 02/2024 One paper accepted by TIP.
• 06/2023 One paper accepted by ICCV 2023.
• 10/2021 I received the National Scholarship.
• 07/2021 One paper accepted by ICCV 2021.
• 06/2021 We won first place in the HC-STVG track of the CVPR 2021 workshop Person in Context.
• 04/2021 I was a student co-organizer of the ICCV 2021 Workshop DeeperAction.
• 10/2020 I received the National Scholarship.
• 06/2020 One paper accepted by ECCV 2020.

Research

HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
Zhenzhi Wang, Yixuan Li, Yanhong Zeng, Youqing Fang, Yuwei Guo, Wenran Liu, Jing Tan, Kai Chen, Tianfan Xue, Bo Dai, Dahua Lin.
NeurIPS (Datasets and Benchmarks Track), 2024
arXiv / homepage / code

We propose the task of camera-controllable human image animation, which aims to generate video clips that resemble real movie clips. To this end, we collect a dataset named HumanVid and build a baseline model that combines Animate Anyone and CameraCtrl. We show that this simple baseline, trained on our dataset without any tricks, can generate movie-level video clips.

LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
Shuai Yang*, Jing Tan*, Mengchen Zhang, Tong Wu, Yixuan Li, Gordon Wetzstein, Ziwei Liu, Dahua Lin
arXiv, 2024
project page / video / arXiv

LayerPano3D generates a full-view, explorable panoramic 3D scene from a single text prompt.

InterControl: Generate Human Motion Interactions by Controlling Every Joint
Zhenzhi Wang, Jingbo Wang, Yixuan Li, Dahua Lin, Bo Dai.
NeurIPS, 2024
arXiv / code

We generate human motion interactions with a spatially controllable MDM that is trained only on single-person data.

PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios
Jingbo Wang, Zhengyi Luo, Ye Yuan, Yixuan Li, Bo Dai.
CVPR, 2024
arXiv / code

MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond
Yixuan Li*, Lihan Jiang*, Linning Xu, Yuanbo Xiangli, Zhenzhi Wang, Dahua Lin, Bo Dai.
ICCV, 2023
paper / project page / code

A large-scale synthetic dataset built with Unreal Engine 5 for city-scale NeRF rendering.

Sparse Action Tube Detection
Yixuan Li*, Zhenzhi Wang*, Zhifeng Li, Limin Wang.
TIP, 2024
paper

We present a simple end-to-end action tube detection method that reduces the number of dense hand-crafted anchors, captures longer temporal information, and explicitly predicts the action boundary.

MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
Yixuan Li, Lei Chen, Runyu He, Zhenzhi Wang, Gangshan Wu, Limin Wang.
ICCV, 2021
One track of the DeeperAction Workshop at ICCV 2021 and ECCV 2022.
paper / code

A fine-grained, large-scale spatio-temporal action detection dataset covering 4 different sports and 66 action categories.

Actions as Moving Points
Yixuan Li*, Zixu Wang*, Limin Wang, Gangshan Wu.
ECCV, 2020
paper / code

A conceptually simple, computationally efficient, and more precise anchor-free action tubelet detector.

Professional Services

• Conference reviewer for CVPR, ICCV, ECCV, NeurIPS.
• Journal reviewer for IJCV, Pattern Recognition, TCSVT, Neurocomputing.
• Co-organizer of DeeperAction Workshop at ICCV 2021 and ECCV 2022.




Thanks to Jon Barron for sharing the source code of this website template.