News
- 05/2026: I am very happy to release SlimQwen. Please take a look if you would like to know more about Qwen3.5 pretraining. Paper Link
- 02/2026: Very excited to release the Qwen3.5 & Qwen3.6 series models!! Please give them a try if you are interested! Blog Link
- 01/2026: One paper was accepted by CVPR 2026! Congratulations to Jiacheng!
- 01/2026: One paper was accepted by DMLR! Congratulations to the team!
- 10/2025: Bi-Mamba was accepted by TMLR! Thanks to all collaborators!
- 10/2025: One co-first-author paper was accepted by NeurIPS 2025. Congratulations to Cong!
- 06/2025: One paper was accepted by ICCV 2025. Congratulations to Bowei!
- 06/2025: Started a research internship with the Qwen pretraining team!
- 04/2025: One paper was accepted by the 2nd Re-Align Workshop at ICLR 2025. Congratulations to Xuanjie and Cong!
- 02/2025: Happy to release the code and pretrained weights of Bi-Mamba; please check here.
- 08/2024: Started my PhD at MBZUAI.
- 05/2023: Invited to serve as a reviewer for the International Workshop on Resource-Efficient Learning for Knowledge Discovery at KDD 2023.
- 05/2023: Invited to give a talk at ć°éšćæ on June 8, 2023. Everyone is welcome!
- 02/2023: My first paper, on accelerating inference of vision-language models, was accepted by CVPR 2023. Super excited :). Thanks to all co-authors for their support.
- 09/2022: I joined the Intelligent Automotive Group (IAG) at SenseTime as a system developer. I will build systems for various perception modules for self-driving.
Research
My research focuses on building efficient, reliable, and deployable AI systems. I am interested in improving the full pipeline of modern foundation models, from architecture design and training to inference, data, and evaluation.
Specifically, my research spans four directions:
- Inference Efficiency. I develop methods that reduce the computational and memory cost of large models during deployment, including structured pruning, quantization, adaptive computation, and token pruning.
- Training Efficiency. I study resource-efficient training methods that improve model capability under limited computational budgets, including efficient optimization, data-efficient learning, and scalable training strategies.
- Novel Model Architectures. I design compact and scalable model architectures for efficient intelligence, including work such as SlimQwen and other architecture-level innovations.
- Data-Centric AI and Trustworthy Evaluation. I study data quality, efficient data usage, benchmark construction, and trustworthy evaluation of foundation models.
I am always open to research collaborations. Please feel free to contact me if you are interested in efficient AI systems, foundation models, or related topics.
Publications
SlimQwen: Exploring the Pruning and Distillation in Large MoE Model Pre-training
Shengkun Tang*, Zekun Wang*, Bo Zheng*, Liangyu Wang, Rui Men, Siqi Zhang, Xiulong Yuan, Zihan Qiu, Zhiqiang Shen, Dayiheng Liu
Paper
Qwen3.5: Towards Native Multimodal Agents
Core Contributor
Blog / code / model collection
BiGain: Unified Token Compression for Joint Generation and Classification
Jiacheng Liu*, Shengkun Tang*, Jiacheng Cui, Dongkuan Xu, Zhiqiang Shen
[CVPR 2026] The IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026
Paper / code
Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark
Sondos Mahmoud Bsharat, Mukul Ranjan, Aidar Myrzakhan, Jiacheng Liu, Bowei Guo, Shengkun Tang, Zhuang Liu, Yuanzhi Li, Zhiqiang Shen
[DMLR] Data-centric Machine Learning Research, 2026
Paper / code / website / leaderboard
Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang, Liqun Ma, Haonan Li, Mingjie Sun, Zhiqiang Shen
[TMLR] Transactions on Machine Learning Research, 2025
Paper / code
Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection
Cong Zeng*, Shengkun Tang*, Yuanzhou Chen, Zhiqiang Shen, Wenchao Yu, Xujiang Zhao, Haifeng Chen, Wei Cheng, Zhiqiang Xu
[NeurIPS 2025] The Thirty-Ninth Annual Conference on Neural Information Processing Systems
Paper / code
MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics
Bowei Guo, Shengkun Tang, Cong Zeng, Zhiqiang Shen
[ICCV 2025] International Conference on Computer Vision, 2025
Paper
Do Large Language Models Perceive Orderly Number Concepts as Human?
Xuanjie Liu, Cong Zeng, Shengkun Tang, Ziyu Wang, Gus Xia
[Re-Align Workshop, ICLR 2025] 2nd Workshop on Representational Alignment, ICLR 2025
Paper
DALD: Improving Logits-based Detector without Logits from Black-box LLM
Cong Zeng*, Shengkun Tang*, Xianjun Yang, Yuanzhou Chen, Yiyou Sun, Yao Li, Haifeng Chen, Wei Cheng, Dongkuan Xu
[NeurIPS 2024] The Thirty-Eighth Annual Conference on Neural Information Processing Systems
arXiv / code
AdaDiff: Accelerating Diffusion Models through Step-wise Adaptive Computation
Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu
[ECCV 2024] European Conference on Computer Vision
arXiv / code
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu
[CVPR 2023] The IEEE/CVF Conference on Computer Vision and Pattern Recognition
arXiv / code
DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range
Puyuan Yi*, Shengkun Tang*, Jian Yao
Preprint, 2021
arXiv / code
Scale-Robust Deep-Supervision Network for Mapping Building Footprints from High-Resolution Remote Sensing Images
Haonan Guo, Xin Su, Shengkun Tang, Bo Du, Liangpei Zhang
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021
PDF
Industrial Experience
Qwen Team, Alibaba, 06/2025 - Now
Research Intern
Mentors: Bo Zheng and Dayiheng Liu
SenseTime, Engineering & Intelligent Automotive Group (IAG), 06/2021 - 10/2021 & 05/2022 - 07/2023
Vision Algorithm Intern; System Developer
Project: SenseRobot Chess Robot, working with Ruodai Li
Project: Large-Scale Self-Driving System Development
Program Committee Member:
- NeurIPS 2024, 2025
- ICCV 2025
- ICML 2025
- ICLR 2025
- AISTATS 2025
- KDD 2023, 2024
- AAAI 2023
This website template comes from source code; thanks for the fantastic design.