Shengkun (Bryson) Tang

Welcome to my website~ My name is Shengkun Tang; you can call me Bryson for short. Currently, I am a first-year PhD student in Machine Learning at MBZUAI, under the supervision of Prof. Zhiqiang Shen. During my gap year, I had a wonderful time as a research assistant in DASLab at ISTA, working with Prof. Dan Alistarh. In addition, I collaborated closely with Prof. Dongkuan Xu (NCSU) and Dr. Yaqing Wang (Google DeepMind) on efficient multi-modal models. I received my B.E. in Remote Sensing from Wuhan University, under the supervision of Prof. Jian Yao and Prof. Xin Su.

Email  /  Google Scholar  /  GitHub

Last updated: Feb. 4th 2025

profile photo
News

02/2025: Happy to release the code and pretrained weights of Bi-Mamba; please check them out here.

08/2024: Started my PhD at MBZUAI.

05/2023: Invited to serve as a reviewer for the International Workshop on Resource-Efficient Learning for Knowledge Discovery at KDD 2023.

05/2023: Invited to give a talk at TechBeat (将门创投) on June 8, 2023. Welcome!

02/2023: My first paper, on accelerating inference of vision-language models, was accepted by CVPR 2023. This is my first work before starting the PhD program. Super excited :). Thanks to all co-authors for their support.

09/2022: I joined the Intelligent Automotive Group (IAG) at SenseTime as a system developer. I will build systems for various perception modules of self-driving.



Research

My research interests lie in landable (deployable) Artificial Intelligence, focusing on the resource efficiency and trustworthiness of AI systems. My research covers the whole pipeline of an AI system, providing full-stack solutions ranging from theoretical optimization methods and data-centric strategies to the development of efficient, interpretable, and reliable deep learning techniques and the co-design of algorithms and hardware.

  • Resource-Efficient Training & Inference Algorithms

  • Data Optimization to Improve Data Quality & Efficiency

  • Scalable Methods for AI Systems with Theoretical Guarantees

  • Algorithm-Hardware Co-design for Acceleration

  • Application Scenario: Multi-Modal (Vision-Language), Uni-Modal (NLP, Computer Vision)

If you are interested in my research and seeking collaboration, feel free to contact me. All kinds of collaboration are welcome.

Publications



DALD: Improving Logits-based Detector without Logits from Black-box LLM
Cong Zeng*, Shengkun Tang*, Xianjun Yang, Yuanzhou Chen, Yiyou Sun, Yao Li, Haifeng Chen, Wei Cheng, Dongkuan Xu
[NeurIPS 2024] The Thirty-eighth Annual Conference on Neural Information Processing Systems
arXiv / code

We propose a simple but quite effective method to improve the performance of black-box LLM detection. DALD collects a small amount of data from the target model and trains a surrogate model to align the surrogate's distribution with that of the target model.
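A minimal sketch of this alignment step, assuming a Hugging Face-style API; the surrogate name and corpus below are placeholders, and this illustrates the idea rather than the released implementation:

import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

surrogate_name = "gpt2"  # assumption: any open-weight surrogate LM
target_samples = ["text sampled from the black-box target model", "..."]  # placeholder corpus

tok = AutoTokenizer.from_pretrained(surrogate_name)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(surrogate_name)
optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

def collate(batch):
    enc = tok(batch, return_tensors="pt", padding=True, truncation=True, max_length=512)
    enc["labels"] = enc["input_ids"].clone()  # standard causal-LM objective
    return enc

model.train()
for batch in DataLoader(target_samples, batch_size=4, collate_fn=collate):
    loss = model(**batch).loss  # fine-tuning on target-generated text nudges the surrogate's logits toward the target's
    loss.backward()
    optim.step()
    optim.zero_grad()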



AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation
Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu
[ECCV 2024] European Conference on Computer Vision
arXiv / code

We propose an uncertainty estimation module (UEM) to decide the exit point at each timestep during diffusion model inference. Moreover, we propose an uncertainty-aware layer-wise loss to recover the performance of the early-exited model.
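A toy sketch of the step-wise early-exit idea; module names, shapes, and the threshold are illustrative assumptions, not the paper's implementation. A small uncertainty head after each backbone block decides whether to stop computation at the current denoising step.

import torch
import torch.nn as nn

class UncertaintyHead(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim // 4), nn.ReLU(), nn.Linear(dim // 4, 1))

    def forward(self, h):
        # one scalar uncertainty per sample, averaged over tokens
        return torch.sigmoid(self.mlp(h)).mean(dim=tuple(range(1, h.dim())))

class EarlyExitBackbone(nn.Module):
    def __init__(self, dim=256, depth=8, threshold=0.1):
        super().__init__()
        self.blocks = nn.ModuleList(nn.TransformerEncoderLayer(dim, 4, batch_first=True) for _ in range(depth))
        self.heads = nn.ModuleList(UncertaintyHead(dim) for _ in range(depth))
        self.out = nn.Linear(dim, dim)
        self.threshold = threshold

    def forward(self, x):
        for block, head in zip(self.blocks, self.heads):
            x = block(x)
            if head(x).max() < self.threshold:  # confident enough -> exit early at this timestep
                break
        return self.out(x)

noise_pred = EarlyExitBackbone()(torch.randn(2, 64, 256))  # one denoising step on dummy features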



You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu
[CVPR 2023] The IEEE/CVF Conference on Computer Vision and Pattern Recognition
arXiv / code

We propose a novel early exiting strategy based on cascading input similarity, built on validated assumptions about saturation states in vision-language models. This is a pioneering exploration of extending early exiting to both the encoders and decoders of sequence-to-sequence architectures.
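A minimal illustration of similarity-based exiting (an assumption-laden sketch, not the released code): when the hidden states of consecutive layers become nearly identical, the representation is treated as saturated and the remaining layers are skipped.

import torch.nn.functional as F

def run_with_early_exit(layers, hidden, threshold=0.99):
    """layers: iterable of nn.Modules mapping hidden -> hidden (threshold is an assumed value)."""
    for layer in layers:
        new_hidden = layer(hidden)
        # cosine similarity between consecutive layer outputs, averaged over tokens
        sim = F.cosine_similarity(new_hidden, hidden, dim=-1).mean()
        hidden = new_hidden
        if sim > threshold:  # representation has saturated -> skip the remaining layers
            break
    return hidden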



DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range
Puyuan Yi*, Shengkun Tang*, Jian Yao
Preprint, 2021
arXiv / code

We propose a Dynamic Depth Range Network (DDR-Net) that dynamically determines the depth range hypotheses by applying a range estimation module (REM) to learn the uncertainties of the range hypotheses from earlier stages.
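A small hypothetical sketch of the dynamic-range idea (the function and tensor shapes are assumptions for illustration): take the expected depth and its spread from a coarse stage's probability volume and use them to narrow the hypotheses for the next, finer stage.

import torch

def next_stage_depth_range(prob_volume, depth_hyps, k=3.0):
    """prob_volume: (B, D, H, W) softmax over D depth hypotheses.
       depth_hyps:  (D,) depth values of the current stage."""
    d = depth_hyps.view(1, -1, 1, 1)
    mean = (prob_volume * d).sum(dim=1)                            # expected depth, (B, H, W)
    var = (prob_volume * (d - mean.unsqueeze(1)) ** 2).sum(dim=1)  # per-pixel uncertainty
    std = var.clamp_min(1e-8).sqrt()
    return mean - k * std, mean + k * std                          # narrowed per-pixel range for the next stage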



Scale-Robust Deep-Supervision Network for Mapping Building Footprints From High-Resolution Remote Sensing Images
Haonan Guo, Xin Su, Shengkun Tang, Bo Du, Liangpei Zhang
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021
PDF

We propose a novel deep-supervision convolutional neural network (denoted as DS-Net) for extracting building footprints from high-resolution remote sensing images.

Intern Experience



SenseTime Engineering, 06/2021 - 10/2021

Vision Algorithm Intern Researcher
Project: SenseRobot chess robot, working with Ruodai Li
Work Experience



SenseTime, Intelligent Automotive Group (IAG), 05/2022 - Now

System Developer
Project: Large-Scale Self-Driving System Development
Contest

Baidu Astar Developer Competition, 05/2020 - 10/2020

Ranking: 7/2305 (teams)

The task of Baidu Astar 2020 was detecting and matching traffic signs and surveillance cameras. I was in charge of the detection task. I addressed the data imbalance problem with my own data augmentation strategy, which made surveillance-camera detection more accurate. We reached the final and ranked 7th out of 2305 teams.
Professional Services
  • Program Committee Member:
    • ICML 2025
    • ICLR 2025
    • AISTATS 2025
    • NeurIPS 2024
    • KDD 2023, 2024
    • AAAI 2023

  • This template comes from this source code; thanks to the author for the fantastic website template.