Hyeon-Su Jeong

Author Statistics

Papers

Citation

H-Index

i-10 index

Co-Authors

Seong-Jin Mun

Jae-Yun Jeon

Hanyang University

Sang-Hun An

Dae-Seok Han

Dongguk University

Jae-Uk Jeong

Dong-Hyeong Lee

Pusan National University Yangsan Hospital

Seong-Min Park

Seoul National University

Hye Won Chung

University of Ulsan

Se-Ui Yun

Han-Jak Ryu

Yonsei University

Cooperative Institutions

Seoul National University

Yonsei University

Hanyang University

Anyang University

Dongguk University

Pusan National University Yangsan Hospital

Rural Development Administration

National Institute of Animal Science

GS Caltex (South Korea)

Kunsan National University

Research Field

Understanding Self-Distillation and Partial Label Learning in Multi-Class Classification with Label Noise

arXiv (Cornell University) (2024)

Hyeon-Su Jeong, Hye Won Chung

Self-distillation (SD) is the process of training a student model using the outputs of a teacher model, with both models sharing the same architecture. Our study theoretically examines SD in multi-class classification with cross-entropy loss, exploring both multi-round SD and SD with refined teacher outputs, inspired by partial label learning (PLL). By deriving a closed-form solution for the student model's outputs, we discover that SD essentially functions as label averaging among instances with high feature correlations. Initially beneficial, this averaging helps the model focus on feature clusters correlated with a given instance for predicting the label. However, it leads to diminishing performance with increasing distillation rounds. Additionally, we demonstrate SD's effectiveness in label noise scenarios and identify the label corruption condition and minimum number of distillation rounds needed to achieve 100% classification accuracy. Our study also reveals that one-step distillation with refined teacher outputs surpasses the efficacy of multi-step SD using the teacher's direct output in high noise rate regimes.
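The label-averaging view described in the abstract can be illustrated with a toy sketch. This is not the paper's closed-form solution, which is not reproduced on this page. It assumes a hypothetical feature-similarity matrix (`similarity`) with two clusters, treats each distillation round as a similarity-weighted average of the current soft labels, and uses made-up one-hot labels with one corrupted entry. All function and variable names are illustrative.

```python
def normalize_rows(matrix):
    """Scale each row so its entries sum to 1."""
    return [[v / sum(row) for v in row] for row in matrix]


def average_labels(weights, labels):
    """One toy distillation round: each instance's new soft label is the
    similarity-weighted average of all current labels."""
    num_classes = len(labels[0])
    return [
        [sum(w * lab[c] for w, lab in zip(w_row, labels)) for c in range(num_classes)]
        for w_row in weights
    ]


def predict(labels):
    """Arg-max class for each soft label."""
    return [max(range(len(lab)), key=lab.__getitem__) for lab in labels]


# Hypothetical similarity matrix: instances 0-2 form one cluster, 3-5 another.
similarity = [
    [1.0, 0.8, 0.8, 0.1, 0.1, 0.1],
    [0.8, 1.0, 0.8, 0.1, 0.1, 0.1],
    [0.8, 0.8, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.8, 0.8],
    [0.1, 0.1, 0.1, 0.8, 1.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.8, 1.0],
]
weights = normalize_rows(similarity)

# Made-up data: instance 2 belongs to class 0 but carries the noisy label 1.
true_classes = [0, 0, 0, 1, 1, 1]
noisy_labels = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [0, 1]]

# A single averaging round lets the clean neighbours outvote the noisy label.
one_round = average_labels(weights, noisy_labels)
corrected = predict(one_round)

# Many rounds keep averaging until every soft label is nearly the same mixture.
many_rounds = noisy_labels
for _ in range(50):
    many_rounds = average_labels(weights, many_rounds)
collapsed = predict(many_rounds)

print(corrected)
print(collapsed)
```

In this toy setting one round recovers the clean labels, while repeated averaging collapses all instances onto the majority class, which mirrors the abstract's point that the benefit diminishes as distillation rounds increase.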

Multi-label classification

10.48550/arxiv.2402.10482

Citations (0)

Analysis of Runoff Characteristics in a Small Stream Watershed (소하천 유역에서 유출특성 분석)

Journal of Korea Water Resources Association (2000)

Jae-Uk Jeong, Hyeon-Su Jeong, Seong-Min Park, Se-Ui Yun

Citations (0)

A case of acute renal failure associated with acute fulminant hepatitis A

Kidney Research and Clinical Practice (2006)

Dong-Hyeong Lee, Ja-Gyeong Kim, Hyeon-Su Jeong, Han-Jak Ryu, Seong-Jin Mun

Fulminant hepatitis

Acute tubular necrosis

Hepatic Encephalopathy

Prothrombin time

Citations (2)

Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing

arXiv (Cornell University) (2023)

Hyeon-Su Jeong, Hye Won Chung

Crowdsourcing has emerged as an effective platform for labeling large amounts of data in a cost- and time-efficient manner. Most previous work has focused on designing an efficient algorithm to recover only the ground-truth labels of the data. In this paper, we consider multi-choice crowdsourcing tasks with the goal of recovering not only the ground truth, but also the most confusing answer and the confusion probability. The most confusing answer provides useful information about the task by revealing the most plausible answer other than the ground truth and how plausible it is. To theoretically analyze such scenarios, we propose a model in which there are the top two plausible answers for each task, distinguished from the rest of the choices. Task difficulty is quantified by the probability of confusion between the top two, and worker reliability is quantified by the probability of giving an answer among the top two. Under this model, we propose a two-stage inference algorithm to infer both the top two answers and the confusion probability. We show that our algorithm achieves the minimax optimal convergence rate. We conduct both synthetic and real data experiments and demonstrate that our algorithm outperforms other recent algorithms. We also show the applicability of our algorithms in inferring the difficulty of tasks and in training neural networks with top-two soft labels.
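The top-two model in the abstract can be made concrete with a small simulation. The sketch below is not the paper's two-stage inference algorithm. It assumes a single task, one reliability value shared by all workers, uniform answers outside the top two, and a naive plug-in estimate based on vote counts. The function names and parameter values are hypothetical.

```python
import random
from collections import Counter


def simulate_answers(num_workers, num_choices, truth, confusing,
                     reliability, confusion, rng):
    """Draw one answer per worker under the toy top-two model."""
    others = [c for c in range(num_choices) if c not in (truth, confusing)]
    answers = []
    for _ in range(num_workers):
        if rng.random() < reliability:
            # Answer falls in the top two; it is the confusing one
            # with probability `confusion`, otherwise the ground truth.
            answers.append(confusing if rng.random() < confusion else truth)
        else:
            # Assumed behaviour outside the top two: uniform over the rest.
            answers.append(rng.choice(others))
    return answers


def estimate_top_two(answers):
    """Naive plug-in estimate: the two most frequent answers and the share
    of top-two votes that went to the runner-up."""
    (first, n_first), (second, n_second) = Counter(answers).most_common(2)
    return first, second, n_second / (n_first + n_second)


rng = random.Random(0)  # fixed seed so the run is repeatable
answers = simulate_answers(
    num_workers=5000, num_choices=5, truth=2, confusing=4,
    reliability=0.8, confusion=0.3, rng=rng,
)
top, runner_up, confusion_hat = estimate_top_two(answers)
print(top, runner_up, round(confusion_hat, 3))
```

With many workers, the vote split between the two most frequent answers approaches the assumed confusion probability. Handling per-worker reliabilities and many tasks jointly is where an algorithm such as the one the abstract describes would be needed.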

Crowdsourcing

Ground truth

Confusion

10.48550/arxiv.2301.00006

Citations (0)