Recent News

Jan 2026

One paper on fixed-length dense fingerprint representation was accepted by T-IFS.

Nov 2025

One paper on under-screen fingerprint pose estimation was accepted by T-IFS, and one paper on finger pose HCI was accepted by T-MC.

Jul 2025

Achieved 2nd place at MER25 @ ACM MM, and had one paper on finger photo pose estimation accepted by IJCB 2025 as Oral.

Jun 2025

Achieved 2nd place in CVRR and EgoSchema Challenge at CVPR 2025, with related work accepted by EgoVis @ CVPR 2025.

Education

2021 - 2026

Department of Automation, Tsinghua University

Doctor of Philosophy (Ph.D.)

Focusing on enhancing performance and efficiency in domain-specific tasks via data and representation normalization, domain-informed modeling, and workflow-level optimization; supervised by Prof. Jianjiang Feng and Prof. Jie Zhou.

Resulting in 11 publications in international journals and conferences (6 first-author).

Image Retrieval Image Registration Pose Estimation HCI Multi-Modal Fusion Multi-Task Collaboration Multi-Semantic Optimization Multi-Granularity Representation
2018 - 2021

Academy of Arts & Design, Tsinghua University

Minor in Digital Media Art

Coursework focusing on digital entertainment content and interactive media, with an emphasis on integrating creative design and interdisciplinary digital technologies.

Creative Design Aesthetic Sense Product Thinking User Experience Interactive Media
2017 - 2021

Department of Automation, Tsinghua University

Bachelor of Engineering (B.Eng.)

Undergraduate studies in automation, covering control theory, artificial intelligence and data analysis, with exposure to system design and engineering applications.

Resulting in 1 publication in an international conference (first-author).

System Modeling Computational Thinking Control Theory Machine Learning Signal & Image Processing Pattern Recognition

Work Experience

2024.12 - 2026.01

PC Intelligence & Ecosystems Lab, Lenovo Research

Research Intern

Focusing on multimodal perception & representation, intent understanding, and affective computing, improving deployment efficiency and performance through prompt engineering, post-training, agent collaboration, and pipeline optimization for multimodal large models.

Resulting in 4 technical reports (2 first/corresponding-author), and 4 corresponding awards in international competitions (ACM MM, CVPR).

Video Understanding & Question-Answering Affective Computing MLLM Post-training Context Learning Chain of Thought Agentic Workflow Harness Engineering
2023.06 - 2023.08

Algorithm Research Department, GRG Intek, GRG Banking

Algorithm Intern

Focusing on contactless palmprint recognition for large-scale populations, responsible for framework design, technical investigation, and early-stage engineering prototyping. The corresponding project was piloted for palm-based payment on the Beijing Daxing Airport Express line.

Resulting in a Silver Award in Tsinghua University Doctoral Social Practice.

Contactless Palmprint Recognition System Keypoint Detection Pose Estimation ROI Segmentation Image Enhancement Efficient Representation
2022.01 - 2023.06

Hisign Technology

Algorithm Intern (University-Industry Collaboration)

Focusing on image retrieval in edge computing scenarios (partial images / low quality), improving storage and computational efficiency through image registration and mosaicking technologies.

Resulting in 1 publication in an international journal (first-author).

Image Retrieval Fine-Grained Visual Alignment Image Enhancement Image Registration Image Mosaicking

Honors

Silver Award for Social Practice of Ph.D. Students, Tsinghua University, 2023
Outstanding Graduates, Department of Automation, Tsinghua University, 2021
Academic Excellence Award, Tsinghua University, 2019, 2020

Competitions

ACM MM 2025 2nd Place
MER25 Track 2: Multimodal Emotion Recognition with Fine-Grained Categories
CVPR 2025 4th Place
CVRR: Complex Video Reasoning and Robustness Evaluation
CRCSPC 2018 1st Prize
35th China Regional College Students Physics Competition

Selected Publications

MLLMs & Video Understanding

Works on multi-modal large models, post-training, agentic workflow, and harness engineering.

ACM MM 2025 2nd Place

More is Better: A MoE-based Emotion Recognition Framework with Human Preference Alignment

Jun Xie*, Yingjian Zhu*, Feng Chen, Zhenghao Zhang, Xiaohui Fan, Hongzhu Yi, Xinming Wang, Chen Yu, Yue Bi, Zhaoran Zhao, Xiongjun Guan (Corresponding Author), Zhepeng Wang

Multi-modal emotion recognition with leveraged signals, samples and deliberation.

Affective Computing Multi-Modal Learning Semi-Supervised Learning Mixture of Experts Human Preference Alignment
ACM MM 2025 2nd Place

ZeroES: Zero-Shot Ensemble for Open-Vocabulary Video Emotion Recognition with Large Multimodal Models

Jun Xie*, Xiaohui Fan*, Zhenghao Zhang, Feng Chen, Hongzhu Yi, Yingjian Zhu, Xiongjun Guan, Xinming Wang, Yue Bi, Tao Zhang, Zhepeng Wang

Fine-grained multi-modal emotion recognition via multiple MLLMs and model ensembling.

Affective Computing Open-Vocabulary Recognition Zero-Shot Learning Context Engineering Multi-Model Ensemble
CVPR 2025 2nd Place

Four Eyes Are Better Than Two: Harnessing the Collaborative Potential of Large Models via Differentiated Thinking and Complementary Ensembles

Jun Xie*, Xiongjun Guan*, Yingjian Zhu, Zhaoran Zhao, Xinming Wang, Hongzhu Yi, Feng Chen, Zhepeng Wang

Long-form video understanding and robustness evaluation via Chain of Thought and harness engineering.

Long Video Understanding Video Question-Answering Chain of Thought Harness Engineering Ensemble Learning Agentic Workflow
CVPR 2025 4th Place

Team of One: Cracking Complex Video QA with Model Synergy

Jun Xie*, Zhaoran Zhao*, Xiongjun Guan, Yingjian Zhu, Hongzhu Yi, Xinming Wang, Feng Chen, Zhepeng Wang

Open-ended video question answering with collaborative model reasoning.

Video Question-Answering Model Collaboration Harness Engineering Agentic Workflow

Finger-based HCI

Works on finger-based human-computer interaction, pose estimation, and sensing.

T-MC 2025 CCF-A

BiFingerPose: Bimodal Finger Pose Estimation for Touch Devices

Xiongjun Guan, Zhiyu Pan, Jianjiang Feng, Jie Zhou

Multi-modal 2D finger pose estimation with efficient 2D-to-3D mapping for mobile device interaction.

Mobile Interaction Pose Estimation Multi-Sensor Fusion Representation Learning Geometric Mapping
IJCB 2025 CCF-C Oral

Contactless Fingerprint Recognition Guided by 3D Finger Pose

Haoxiang Pei, Zhiyu Pan, Xiongjun Guan, Jianjiang Feng, Jie Zhou

Leveraging 3D finger pose to improve contactless fingerprint recognition through acquisition guidance and pose constraints.

Contactless Fingerprint Pose Estimation Geometry-aware Recognition Acquisition Guidance
CCBR 2021 Oral

Pose-Specific 3D Fingerprint Unfolding

Xiongjun Guan, Jianjiang Feng, Jie Zhou

Unfolding and visualization method for 3D fingerprints to improve compatibility with 2D images.

3D Fingerprint Geometric Unfolding Point Cloud Projection

Image Retrieval

Works on large-scale image retrieval with efficient representation and geometric normalization.

T-IFS 2026 CCF-A

Fixed-Length Dense Fingerprint Representation with Alignment and Robust Enhancement

Zhiyu Pan, Xiongjun Guan, Yongjie Duan, Jianjiang Feng, Jie Zhou

Fixed-length dense representation and matching framework that couples local discriminability with pose-aware alignment and robustness enhancement.

Efficient Image Embedding Dense Passage Retrieval Representation Learning Local Discriminability Recognition with Spatial Priors
Preprint 2025

Minutiae-Anchored Local Dense Representation for Fingerprint Matching

Zhiyu Pan, Xiongjun Guan, Yongjie Duan, Jianjiang Feng, Jie Zhou

Local representation that fuses geometric anchors and contextual textures.

Image Segment Retrieval Representation Learning Anchor-based Attention Geometry Constraint Matching
T-IFS 2025 CCF-A

Finger Pose Estimation for Under-screen Fingerprint Sensors

Xiongjun Guan, Zhiyu Pan, Jianjiang Feng, Jie Zhou

Rigid pose estimation via multi-modal fusion strategies and its application to retrieval.

Pose Estimation Multi-Modal Fusion Mixture of Experts Knowledge Transfer Decoupled Distribution Representation Geometry Constraint Matching
T-IFS 2024 CCF-A

Joint Identity Verification and Pose Alignment for Partial Fingerprints

Xiongjun Guan, Zhiyu Pan, Jianjiang Feng, Jie Zhou

CNN-ViT hybrid network for joint pose estimation and identity recognition in partial image scenarios.

Location & Verification Multi-Task Collaboration Image Segment Retrieval Representation-Regularized Pre-Training
IJCB 2024 CCF-C

Latent Fingerprint Matching via Dense Minutia Descriptor

Zhiyu Pan, Yongjie Duan, Xiongjun Guan, Jianjiang Feng, Jie Zhou

Dense anchor descriptors that enable stronger local correspondence modeling and more reliable matching.

Patch Embedding Representation Learning Anchor-based Attention Geometry Constraint Matching
T-IFS 2024 CCF-A

Phase-Aggregated Dual-Branch Network for Efficient Fingerprint Dense Registration

Xiongjun Guan, Jianjiang Feng, Jie Zhou

Dual-branch dense registration paradigm that integrates precise geometric priors (phase) with robust deep representation.

Image Registration Phase Unwrapping Multi-Granularity Representation Geometry Constraint Matching
T-IFS 2023 CCF-A

Regression of Dense Distortion Field from a Single Fingerprint Image

Xiongjun Guan, Yongjie Duan, Jianjiang Feng, Jie Zhou

End-to-end dense distortion field regression replacing previous low-dimensional assumptions with grid-level geometric correction.

Distortion Rectification Multi-Semantic Optimization Geometry Constraint Matching Principal Component Analysis
IJCB 2022 CCF-C Oral

Direct Regression of Distortion Field from a Single Fingerprint Image

Xiongjun Guan, Yongjie Duan, Jianjiang Feng, Jie Zhou

End-to-end dense distortion field regression replacing previous low-dimensional assumptions with grid-level geometric correction.

Distortion Rectification Multi-Semantic Optimization Geometry Constraint Matching Principal Component Analysis

Teaching

Programming Fundamentals
Teaching Assistant, Spring 2024, Department of Automation, Tsinghua University
Interdisciplinary Research and Practice: Image Processing
Teaching Assistant, Fall 2023, Department of Automation, Tsinghua University
Basics of Information Theory
Teaching Assistant, Spring 2023, Department of Automation, Tsinghua University
Digital Image Processing
Teaching Assistant, Fall 2022, Department of Automation, Tsinghua University
...and possibly more.