Biography

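I received my Ph.D. in Computer Science and Technology from the University of Science and Technology of China (USTC) in June 2023, advised by Prof. Enhong Chen, and am currently a postdoctoral researcher at USTC. I serve as Secretary-General of the Affective Computing Technical Committee of the Anhui Provincial Artificial Intelligence Society; I am a member of the China Computer Federation (CCF), a member of both the Affective Computing Technical Committee and the Youth Working Committee of the Chinese Information Processing Society of China (CIPS), and a member of the Technical Committee on Affective Computing and Understanding of the China Society of Image and Graphics (CSIG). My research focuses on affective computing, multimodal understanding, and human-computer interaction. In recent years I have published more than 30 papers in leading journals (IEEE TAFFC, Neural Networks, ACM TOMM, etc.) and conferences (ACM SIGKDD, ACM MM, ICME, CIKM, ICIP, etc.) in affective computing, multimodal understanding, and deep learning, 16 of them as first or corresponding author. I have filed more than 10 Chinese invention patents, 8 of which have been granted, and have received 8 paper and competition awards in China and internationally. I am the principal investigator of one National Natural Science Foundation of China (NSFC) Young Scientists Fund project and one Sichuan Natural Science Foundation Young Scientists Fund project, and have participated in several other projects, including an NSFC Major Scientific Research Instrument Development project and a Ministry of Science and Technology Key R&D Program project. I have served as Guest Editor of the journal Frontiers in Big Data, as Special Session Chair of the international conference PRAI 2022/2023/2024, and as a reviewer for several journals, including IEEE TAFFC and Neural Networks.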


Education and Work Experience

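  • 2024.07–present University of Science and Technology of China, Postdoctoral Researcher
  • 2023.07–2024.07 School of Computer Science and Technology, Southwest University of Science and Technology, Faculty Member
  • 2019.09–2023.06 School of Computer Science and Technology, University of Science and Technology of China, Computer Application Technology, Ph.D. in Engineering
  • 2017.07–2019.07 MediaTek Inc. (Chengdu), Senior Software R&D Engineer
  • 2014.09–2017.06 School of Computer Science and Technology, Southwest University of Science and Technology, Computer Science and Technology, Master of Engineering
  • 2010.09–2014.06 School of Information Engineering, Southwest University of Science and Technology, Automation, Bachelor of Engineering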

Awards in the Past Three Years

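  • 2024   ACM SIGKDD 2024 (top international data mining conference), Best Student Paper
  • 2024   PRAI 2024 (International Conference on Pattern Recognition and Artificial Intelligence), Outstanding Paper Award
  • 2024   Micro-Expression Challenge at ACM MM 2024 (flagship international multimedia conference), detection track, runner-up
  • 2023   Long Video Understanding Challenge at CVPR 2023 (flagship international computer vision conference), Track 3, runner-up
  • 2023   "Tianma Cup" National College Science and Technology Innovation Competition, 2D/3D Digital Human Generation, First/Second Prize
  • 2022   Micro-Expression Challenge at ACM MM 2022 (flagship international multimedia conference), generation track, runner-up
  • 2022   Micro-Expression Challenge at ACM MM 2022 (flagship international multimedia conference), detection track, third place
  • 2021   Micro-Expression Challenge at ACM MM 2021 (flagship international multimedia conference), generation track, third place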

Publications

2024

[SCIS] Shukang Yin#, Chaoyou Fu#*, Sirui Zhao#*, Tong Xu, Hao Wang, Dianbo Sui, Enhong Chen*. "Woodpecker: Hallucination Correction for Multimodal Large Language Models", SCIENCE CHINA Information Sciences(SCIS), 2024, Accepted.
[National Science Review] Shukang Yin#, Chaoyou Fu#*, Sirui Zhao#*, Ke Li, Xing Sun, Tong Xu, Enhong Chen*. "A Survey on Multimodal Large Language Models", National Science Review, 2024, Accepted.
[arXiv] Chaoyou Fu, Yi-Fan Zhang, Shukang Yin, Bo Li, Xinyu Fang, Sirui Zhao, Haodong Duan, Xing Sun, Ziwei Liu, Liang Wang, Caifeng Shan, Ran He. "MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs", arXiv preprint arXiv:2411.15296, 2024.
[arXiv] Chaoyou Fu, Yuhan Dai, Yongdong Luo, Lei Li, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun. "Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis", arXiv preprint arXiv:2405.21075, 2024.
[ACM MM'24] Zhengye Zhang#, Sirui Zhao#, Xinglong Mao, Shifeng Liu, Hao Wang, Tong Xu, Enhong Chen*. "A Multi-scale Feature Learning Network with Optical Flow Correction for Micro- and Macro-expression Spotting", In Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM'24), Melbourne, Australia, 2024, Accepted.
[ICME'24] Shifeng Liu, Xinglong Mao, Sirui Zhao*, Chaoyou Fu, Ying Yu, Tong Xu, Enhong Chen*. "TGMAE: Self-supervised Micro-Expression Recognition with Temporal Gaussian Masked Autoencoder", In Proceedings of the 2024 IEEE International Conference on Multimedia and Expo (ICME'24), Niagara Falls, Canada, 2024, Accepted.
[ACM TOMM] Shukang Yin, Sirui Zhao*, Hao Wang, Tong Xu, Enhong Chen*. "Exploiting Instance-level Relationships in Weakly Supervised Text-to-Video Retrieval", ACM Transactions on Multimedia Computing Communications and Applications, 2024, Accepted.
[PRCV'24] Xinglong Mao, Shifeng Liu, Sirui Zhao*, Hao Wang, Tong Xu, Enhong Chen*. "H2LMER: A Cross Frame-Rate Representation Alignment Framework for Micro-Expression Recognition", Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2024.
[ICMR'24] Chenxiao Liu, Zheyong Xie, Sirui Zhao, Jin Zhou, Tong Xu*, Minglei Li, Enhong Chen, "Speak From Heart: An Emotion-Guided LLM-Based Multimodal Method for Emotional Dialogue Generation", In Proceedings of the 14th International Conference on Multimedia Retrieval (ICMR'24), Dusit Thani Laguna Phuket, Thailand, 2024, Accepted.
[ACM SIGKDD'24] Mingjia Yin, Hao Wang*, Wei Guo, Yong Liu, Suojuan Zhang, Sirui Zhao, Defu Lian, Enhong Chen, "Dataset Regeneration for Sequential Recommendation", The 30th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD'2024), Accepted.
[TOIS] Hao Wang, Mingjia Yin, Luankang Zhang, Sirui Zhao, Enhong Chen, "MF-GSLAE: A Multi-Factor User Representation Pre-training Framework for Dual-Target Cross-Domain Recommendation", ACM Transactions on Information Systems, Accepted.

2023

[TAFFC] Sirui Zhao, Huaying Tang, Xinglong Mao, Shifeng Liu, Hao Wang, Tong Xu, Enhong Chen*, "DFME: A New Benchmark for Dynamic Facial Micro-expression Recognition", IEEE Transactions on Affective Computing, doi: 10.1109/TAFFC.2023.3341918, 2023.
[ACM TOMM] Sirui Zhao, Hongyu Jiang, Hanqing Tao, Rui Zha, Kun Zhang, Tong Xu, Enhong Chen. "PEDM: A Multi-task Learning Model for Persona-aware Emoji-embedded Dialogue Generation", ACM Transactions on Multimedia Computing, Communications and Applications, 2023, 19(3s): 1-21.
[ICME'23] Shukang Yin, Shiwei Wu, Tong Xu, Sirui Zhao*, Enhong Chen*. "AU-aware graph convolutional network for Macro- and Micro-expression spotting", 2023 IEEE International Conference on Multimedia and Expo (ICME), IEEE, 2023: 228-233.
[ICME'23] Yiming Zhang, Hao Wang, Yifan Xu, Xinglong Mao, Tong Xu, Sirui Zhao*, Enhong Chen*. "Adaptive Graph Attention Network with Temporal Fusion for Micro-Expressions Recognition", 2023 IEEE International Conference on Multimedia and Expo (ICME), IEEE, 2023: 1391-1396.
[PRAI'23] Huaying Tang, Xiaorong Zhang, Xinglong Mao, Shifeng Liu, Sirui Zhao*, Enhong Chen*. "Global and Local Mixer for Micro-Expression Recognition", 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Haikou, China, 2023, pp. 509-517.
[IWMCAS'23] Liu Minghao, Liu Haiyi, Zhao Sirui*, Ma Fei, Li Minglei, Dai Zonghong, Wang Hao, Xu Tong, Chen Enhong*. "STAN: Spatial-Temporal Awareness Network for Temporal Action Detection", Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports, 2023: 161-165.
[CIKM'23] Mingjia Yin, Hao Wang*, Xiang Xu, Likang Wu, Sirui Zhao, Wei Guo, Yong Liu, Ruiming Tang, Defu Lian, Enhong Chen, "APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation", Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM'2023), Accepted.
[FCS] Mingdi HU, Long BAI, Jiulun FAN, Sirui ZHAO, Enhong CHEN, "Vehicle Color Recognition Based on Smooth Modulation Neural Network with Multi-Scale Feature Fusion", Frontiers of Computer Science, 2023, 17(3): 173321.

2022

[Neural Networks] Sirui Zhao, Huaying Tang, Shifeng Liu, Yangsong Zhang, Hao Wang, Tong Xu, Enhong Chen*. "ME-PLAN: A Deep Prototypical Learning with Local Attention Network For Dynamic Micro-Expression Recognition", Neural Networks, 2022, 153: 427-443.
[ACM MM'22] Sirui Zhao, Shukang Yin, Huaying Tang, Rijin Jin, Yifan Xu, Tong Xu, Enhong Chen*, "Fine-grained Micro-Expression Generation based on Thin-Plate Spline and Relative AU Constraint", Proceedings of the 30th ACM International Conference on Multimedia, 2022: 7150-7154.
[ACM MM'22] Wenhao Leng, Sirui Zhao#, Yiming Zhang, Shifeng Liu, Xinglong Mao, Hao Wang, Tong Xu, Enhong Chen*. "ABPN: Apex and Boundary Perception Network for Micro- and Macro-Expression Spotting", Proceedings of the 30th ACM International Conference on Multimedia, 2022: 7160-7164.
[ICIP'22] Rijin Jin, Sirui Zhao, Zhongkai Hao, Yifan Xu, Tong Xu*, Enhong Chen, "AVT: Au-Assisted Visual Transformer for Facial Expression Recognition", 2022 IEEE International Conference on Image Processing (ICIP), IEEE, 2022: 2661-2665.
[PRAI'22] Hongyi Li, Sirui Zhao, Yadong Wu, Shiwei Wu, Tong Xu and Enhong Chen*, "Supervised Contrastive Attentive Learning for Facial Expression Recognition in the wild", 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), IEEE, 2022: 293-301.

2021

[Neurocomputing] Sirui Zhao, Hanqing Tao, Yangsong Zhang, Tong Xu, Kun Zhang, Zhongkai Hao, Enhong Chen*. "A Two-stage 3D CNN based Learning Method for Spontaneous Micro-Expression Recognition", Neurocomputing, 2021, 448: 276-289.
[Neural Networks] Yangsong Zhang, Huan Cai, Li Nie, Peng Xu, Sirui Zhao, Cuntai Guan. "An end-to-end 3D convolutional neural network for decoding attentive mental state", Neural Networks, 2021, 144: 129-137.
[ACM MM'21] Yifan Xu, Sirui Zhao, Huaying Tang, Xinglong Mao, Tong Xu*, Enhong Chen, "FAMGAN: Fine-grained AUs Modulation based Generative Adversarial Network for Micro-Expression Generation", In Proceedings of the 29th ACM International Conference on Multimedia (ACM MM'21), Chengdu, China, 2021, 4813-4817.
[Vis] Liang Fan, Cheng Chen, Sirui Zhao, Xiaorong Zhang, Yadong Wu, Fang Wang, et al., "Multi-threaded parallel projection tetrahedral algorithm for unstructured volume rendering", Journal of Visualization, 2021, 24(2): 261-274.

Patent Applications in the Past Three Years

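  • A phoneme-aware speech emotion recognition method and apparatus, 2024-12-08, China, ZL202411505238.X (granted)
  • Micro-action recognition method and apparatus based on wavelet-transform mixed-augmentation contrastive learning, 2024-07-15, China, ZL202410938994.5 (granted)
  • Cross-frame-rate micro-expression recognition method and apparatus, 2024-07-16, China, ZL202410592967.7 (granted)
  • Training method and recognition method for a micro-expression recognition model, and related equipment, 2024-07-26, China, ZL202410649574.5 (granted)
  • Video retrieval method, system, device, and storage medium, 2023-10-16, China, ZL202311331941.9 (granted)
  • A spontaneous micro-expression recognition method, 2022-09-30, China, ZL202011559343.3 (granted)
  • Facial expression recognition method, system, device, and storage medium for natural scenes, 2022-09-06, China, ZL202210546946.2 (granted)
  • Automatic micro-expression apex detection method, system, device, and storage medium, 2022-04-14, China, ZL202210387781.9 (granted)
  • Facial expression recognition method, system, device, and storage medium, 2022-04-28, China, CN202210459722.8
  • Micro-expression detection method, system, device, and storage medium, 2023-04-03, China, CN202310345351.5
  • Method, system, device, and storage medium for identifying emotion causes in text, application date: 2022-08-26, China, CN202211032385.0
  • A keyframe extraction method for 3D human motion based on visual interaction, application date: 2022-11-23, China, CN202211476480.X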

Projects Led and Participated In

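  • Principal investigator, National Natural Science Foundation of China (NSFC) Young Scientists Fund project, 2025/01–2028/01.
  • Principal investigator, Sichuan Natural Science Foundation Young Scientists Fund project, 2023/01–2024/01.
  • Core technical member, NSFC Major Scientific Research Instrument Development project, 2018/01–2022/12.
  • Technical lead, Huawei project, 2022/12–2023/12.
  • Technical service provider, Huadong Photoelectric (华东光电) project, 2023/12–2024/12.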

There is always a way as long as you maintain a good state of mind!