Jiangchao Yao


Research Focus



Label-noise and Adversarially Robust Learning

Perturbation is ubiquitous in real-world data, and in a proper dose it can actually robustify the training of machine learning algorithms, as in common training practices such as label smoothing, dropout, and randomized data augmentation. However, when the perturbation is excessive or deliberate, for example noisy labels or adversarial examples, training methods need dedicated designs to reduce its negative impact. Motivated by this belief, we have developed a range of methods as references along this direction.
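As a minimal illustration of perturbation in a "proper dose" (a sketch under our own assumptions, not the method of any particular paper), the snippet below softens one-hot targets with label smoothing and evaluates a cross-entropy against them; the smoothing rate eps, the toy data, and all names are illustrative.

import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    # Soften hard labels: (1 - eps) on the true class, eps spread uniformly over all classes.
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - eps) * one_hot + eps / num_classes

def cross_entropy(probs, targets):
    # Average cross-entropy between predicted probabilities and (soft) targets.
    return float(-np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1)))

# Toy usage: three samples, four classes, a uniform (uninformed) predictor.
labels = np.array([0, 2, 3])
probs = np.full((3, 4), 0.25)
soft = smooth_labels(labels, num_classes=4, eps=0.1)
print(cross_entropy(probs, soft))

Mildly softened or randomized targets of this kind act as a regularizer; excessive or adversarial corruption of the labels, in contrast, calls for the dedicated robust designs studied in this line of work.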


Class-, Subpopulation-, and Domain-Imbalanced Learning

Imbalanced learning is an old topic in machine learning that still lacks a solid foundation from theory to algorithms, although it has been studied for (at least) two decades. The reason we revisit this problem is that generalization in a broad sense has recently drawn more attention, especially in the context of the pretraining paradigm. The evaluation there, which measures performance holistically over each class, each task, and each domain, coincides with the fine-grained measures used in imbalanced learning. This motivates us to use imbalanced learning to help typical paradigms such as self-supervised learning, weakly-supervised learning, and generative modeling enhance their generalization. The following taxonomy is organized by the aspect that each research work considers, but in practice these imbalance types may mix.
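To make the "fine-grained measure" concrete, the sketch below computes per-class accuracy and its balanced mean, the kind of holistic per-class metric referred to above; the toy data and function names are illustrative assumptions, not drawn from any specific paper.

import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes):
    # Accuracy computed separately for each class, then averaged (balanced accuracy).
    per_class = np.array([
        np.mean(y_pred[y_true == c] == c) if np.any(y_true == c) else np.nan
        for c in range(num_classes)
    ])
    return per_class, np.nanmean(per_class)

# Toy usage: class 1 is rare; always predicting the majority class looks good on
# overall accuracy (0.9) but poor under the balanced, per-class measure (0.5).
y_true = np.array([0] * 9 + [1])
y_pred = np.zeros(10, dtype=int)
per_class, balanced = per_class_accuracy(y_true, y_pred, num_classes=2)
print(per_class, balanced)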


Universal Pretraining Methods for Medical Imaging Diagnosis

  • Chest X-Ray Pretraining series: UniChest-1 [TMI'24]