LIPLAB’s New Progress in SATD Identification and Classification


In recent years, Self-Admitted Technical Debt (SATD) has emerged as a focal point in the field of software maintenance and evolution. SATD appears in diverse forms—from code comments and pull requests to commit messages and GitHub issues—and spans a wide range of categories, including architectural, build, code, defect, design, documentation, requirement, and test debt. However, existing approaches typically target a single source or a limited set of categories and suffer from imbalanced data distributions that hinder performance.


To address these challenges, Qingyuan Li (Ph.D. student), Zhixin Yin (Master's student), and Yaopeng Yang (Ph.D. student) at the Language Intelligent Processing Laboratory (LIPLAB), Nanjing University, present IMPACT, a novel framework for SATD identification and classification. The key contributions of the study are as follows:


(1) ChatGPT-driven Data Augmentation

By leveraging multi-turn dialogue prompts with ChatGPT, the framework generates semantically equivalent variants of SATD instances while preserving key annotations. This approach significantly expands the dataset and alleviates data imbalance.
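As an illustration only (not the authors' released code), such a multi-turn augmentation step might look like the Python sketch below. The model name, prompt wording, and helper names are assumptions; the news above only states that ChatGPT is queried in a multi-turn dialogue.

from openai import OpenAI  # OpenAI Python SDK (>= 1.0)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def augment_satd(comment: str, n_variants: int = 3) -> list[str]:
    """Ask ChatGPT for semantically equivalent rewrites of one SATD
    comment over a multi-turn dialogue, keeping its label unchanged."""
    messages = [
        {"role": "system",
         "content": "Paraphrase developer comments without changing "
                    "their meaning or their technical-debt type."},
        {"role": "user", "content": f"Paraphrase: {comment}"},
    ]
    variants = []
    for _ in range(n_variants):
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",  # assumption: the post says only "ChatGPT"
            messages=messages,
        )
        variant = resp.choices[0].message.content
        variants.append(variant)
        # Multi-turn: feed the previous answer back and request a
        # different paraphrase, so the variants do not collapse.
        messages.append({"role": "assistant", "content": variant})
        messages.append({"role": "user",
                         "content": "Give a different paraphrase."})
    return variants

Each generated variant inherits the original instance's SATD label, which is how the minority categories can grow without manual re-annotation.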


(2) Two-Stage Pipeline Separating Identification from Classification

The IMPACT framework adopts a two-stage design: In the first stage, MT-MoE-BERT (Multi-task Mixture-of-Experts BERT) performs binary SATD identification across multiple sources, offering strong performance with reduced computational overhead. In the second stage, GLM-4-9B-Chat utilizes few-shot in-context learning to classify identified SATD instances into eight categories, significantly improving classification accuracy.
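For a concrete picture of this flow, the sketch below stubs out stage 1 with a simple keyword check (IMPACT uses MT-MoE-BERT there) and represents stage 2 by the few-shot prompt a chat LLM such as GLM-4-9B-Chat would complete. All interfaces, category strings, and the keyword list are illustrative assumptions, not the paper's implementation.

SATD_CATEGORIES = [
    "architectural", "build", "code", "defect",
    "design", "documentation", "requirement", "test",
]

def identify_satd(comment: str) -> bool:
    """Stage 1 stand-in. IMPACT fine-tunes MT-MoE-BERT for this binary
    decision across sources; a keyword check merely marks where that
    model plugs in."""
    markers = ("todo", "fixme", "hack", "workaround", "temporary")
    return any(m in comment.lower() for m in markers)

def build_few_shot_prompt(examples, candidate: str) -> str:
    """Stage 2: assemble a few-shot in-context-learning prompt that the
    chat LLM completes with one of the eight categories."""
    lines = [
        "Classify the technical-debt type of the final comment.",
        "Choose one of: " + ", ".join(SATD_CATEGORIES) + ".",
        "",
    ]
    for text, label in examples:  # a handful of labeled demonstrations
        lines += [f"Comment: {text}", f"Type: {label}", ""]
    lines += [f"Comment: {candidate}", "Type:"]
    return "\n".join(lines)

Only comments that stage 1 flags as SATD are forwarded to stage 2, which keeps the expensive LLM call away from the much larger non-debt majority.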


(3) Comprehensive Evaluation across Languages, Sources, and Projects

The study conducts systematic evaluations using mainstream metrics and compares IMPACT against a variety of state-of-the-art baselines. It also validates cross-project generalization on unseen repositories such as Gradle, Maven, and Spring.
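For reference, F1 on a held-out cross-project repository can be computed with standard tooling; the sketch below assumes scikit-learn and uses invented labels purely for illustration (the post does not specify the averaging scheme).

from sklearn.metrics import f1_score

# Hypothetical predictions on an unseen repository such as Gradle;
# these labels are made up for demonstration only.
y_true = ["design", "code", "defect", "design", "test"]
y_pred = ["design", "code", "design", "design", "test"]

print(f1_score(y_true, y_pred, average="macro"))  # average over categories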


(4) Significant Performance Improvements

IMPACT achieves an average F1 score of 0.876 on the original test set and an impressive 0.697 on the most challenging pull request source. It outperforms existing state-of-the-art methods and their base models by a large margin, and maintains strong generalization on cross-project test sets (F1 = 0.813).


Fig. Overview of the IMPACT framework


This work has been accepted by ACM Transactions on Software Engineering and Methodology (TOSEM, CCF A). Moving forward, the research team plans to further explore SATD identification and classification in diverse scenarios, as well as strategies for technical debt repayment. Researchers interested in SATD are welcome to follow LIPLAB's latest progress at http://liplab.site.