[SCIS]
Shukang Yin#, Chaoyou Fu#*, Sirui Zhao#*, Tong Xu, Hao Wang, Dianbo Sui, Enhong Chen*.
"Woodpecker: Hallucination Correction for Multimodal Large Language Models",
SCIENCE CHINA Information Sciences (SCIS), 2024, Accepted.
[National Science Review]
Shukang Yin#, Chaoyou Fu#*, Sirui Zhao#*, Ke Li, Xing Sun, Tong Xu, Enhong Chen*.
"A Survey on Multimodal Large Language Models",
National Science Review, 2024, Accepted.
[arXiv]
Chaoyou Fu, Yi-Fan Zhang, Shukang Yin, Bo Li, Xinyu Fang, Sirui Zhao, Haodong Duan, Xing Sun, Ziwei Liu, Liang Wang, Caifeng Shan, Ran He.
"MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs",
arXiv preprint arXiv:2411.15296, 2024.
[arXiv]
Chaoyou Fu, Yuhan Dai, Yongdong Luo, Lei Li, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun.
"Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis",
arXiv preprint arXiv:2405.21075, 2024.
[ACM MM'24]
Zhengye Zhang#, Sirui Zhao#, Xinglong Mao, Shifeng Liu, Hao Wang, Tong Xu, Enhong Chen*.
"A Multi-scale Feature Learning Network with Optical Flow Correction for Micro- and Macro-expression Spotting",
In Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM'24), Melbourne, Australia, 2024, Accepted.
[ICME'24]
Shifeng Liu, Xinglong Mao, Sirui Zhao*, Chaoyou Fu, Ying Yu, Tong Xu, Enhong Chen*.
"TGMAE: Self-supervised Micro-Expression Recognition with Temporal Gaussian Masked Autoencoder",
In Proceedings of the 2024 IEEE International Conference on Multimedia and Expo (ICME'24), Niagara Falls, Canada, 2024, Accepted.
[ACM TOMM]
Shukang Yin, Sirui Zhao*, Hao Wang, Tong Xu, Enhong Chen*.
"Exploiting Instance-level Relationships in Weakly Supervised Text-to-Video Retrieval",
ACM Transactions on Multimedia Computing, Communications, and Applications, 2024, Accepted.
[PRCV'24]
Xinglong Mao, Shifeng Liu, Sirui Zhao*, Hao Wang, Tong Xu, Enhong Chen*.
"H2LMER: A Cross Frame-Rate Representation Alignment Framework for Micro-Expression Recognition",
Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2024.
[ICMR'24]
Chenxiao Liu, Zheyong Xie, Sirui Zhao, Jin Zhou, Tong Xu*, Minglei Li, Enhong Chen,
"Speak From Heart: An Emotion-Guided LLM-Based Multimodal Method for Emotional Dialogue Generation",
In Proceedings of the 14th International Conference on Multimedia Retrieval (ICMR'24),
Dusit Thani Laguna Phuket, Thailand, 2024, Accepted.
[ACM SIGKDD'24]
Mingjia Yin, Hao Wang*, Wei Guo, Yong Liu, Suojuan Zhang, Sirui Zhao, Defu Lian, Enhong Chen,
"Dataset Regeneration for Sequential Recommendation",
The 30th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD'2024),
Accepted.
[TOIS]
Hao Wang, Mingjia Yin, Luankang Zhang, Sirui Zhao, Enhong Chen,
"MF-GSLAE: A Multi-Factor User Representation Pre-training Framework for Dual-Target Cross-Domain Recommendation",
ACM Transactions on Information Systems,
Accepted.