Language-Interfaced Tabular Oversampling Via Progressive Imputation And Self-Authentication
Language-Interfaced Tabular Oversampling Via
Progressive Imputation And Self-Authentication
June Yong Yang*, Geondo Park*, Joowon Kim, Hyeongwon Jang,
Eunho Yang
ICLR 2024
Graduate School of AI, KAIST
Machine Learning & Intelligence Laboratory
Introduction
● Tabular data is ubiquitous across a myriad of industries, such as health care, marketing, and finance.
● Tabular data in the wild are often riddled with class imbalance.
● Given a training dataset $D$, the number of samples for each class is skewed:
$N_1 \ge N_2 \ge \cdots \ge N_C$
where $N_c$ is the number of samples belonging to class $c$.
Introduction
● Our research goal is to utilize a Tabular Language Model (TLM) to synthesize tabular samples belonging to the minority class, balancing the class distribution.
●Recent advances in deep generative models have bestowed the means to generate
high-quality synthetic tabular data.
● We propose Language-Interfaced Tabular Oversampling (LITO), an oversampling framework for tabular data that comprehensively leverages the power of language-interfaced tabular learning.
Language-Interfaced Tabular Generation
● Tabular data can be readily formatted into text, and thus can be processed by generative language models without external adapters or representation alignment.
● Given a tabular dataset, the $n$-th row of the table can be represented as follows:
$t_{n,m} = \text{``} h_m \text{ is } v_{n,m} \text{''}, \qquad t_n = t_{n,1}, t_{n,2}, \cdots, t_{n,M}$
where $v_{n,m}$ is the $(n,m)$-th value of the table and $h_m$ is the name of the $m$-th column.
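As a concrete illustration, here is a minimal Python sketch of this serialization; the column names and row values are hypothetical examples, not taken from the paper's datasets.

```python
# Minimal sketch of the "h_m is v_{n,m}" serialization described above.
def serialize_row(headers, values):
    """Render one table row as 'h_1 is v_1, h_2 is v_2, ...'."""
    return ", ".join(f"{h} is {v}" for h, v in zip(headers, values))

# Hypothetical columns and row values, for illustration only.
headers = ["age", "income", "label"]
row = [42, 55000, "default"]
print(serialize_row(headers, row))
# -> "age is 42, income is 55000, label is default"
```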
Language-Interfaced Tabular Oversampling
● Minor-Conditioned Sampling With Importance-Aware Imputation
● Simple class-conditioned generation:
$t_n \sim p_\theta(\,\cdot \mid t_{\text{label}}), \qquad t_{\text{label}} = [\,\text{``label''},\ \text{``is''},\ c\,]$
● Convert the sample to the targeted minority class by conditional imputation:
$t_n \sim p_\theta(\,\cdot \mid t_{\text{label}}, t_{n,1}, t_{n,2}, \cdots, t_{n,M-k})$
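A minimal sketch of the two conditioning modes above, reusing the serialization from the previous slide; the helper names, prompt format, and puncturing of the last $k$ columns are illustrative assumptions, and the TLM sampler itself is left abstract.

```python
# Sketch of the two sampling modes: plain class-conditioned generation vs.
# conditional imputation of a punctured sample under the minority label.
def class_conditioned_prompt(minority_class):
    # t_label = ["label", "is", c]: condition generation on the label clause only.
    return f"label is {minority_class}, "

def imputation_prompt(headers, values, k, minority_class):
    # Keep the first M-k feature clauses of an existing sample (k >= 1) and
    # let the model impute the punctured remainder under the minority label.
    keep = len(headers) - k
    kept = ", ".join(f"{h} is {v}" for h, v in zip(headers[:keep], values[:keep]))
    return f"label is {minority_class}, {kept}, "
```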
Language-Interfaced Tabular Oversampling
● Minor-Conditioned Sampling With Importance-Aware Imputation
● Considering the heterogeneity of columns, puncture and impute columns guided by a feature-importance criterion.
● Self-attention scores of the TLM (last-layer attention scores) are used to attribute the importance of column features.
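A sketch of how such attention-based column importance could be computed, assuming a HuggingFace-style causal LM that supports `output_attentions=True`; the token-to-column span mapping is an assumption here, not the paper's exact procedure.

```python
import torch

def column_importance(model, tokenizer, text, column_spans):
    """Attribute importance to columns via last-layer attention received.

    column_spans: dict mapping column index -> (start, end) token positions;
    building this mapping is tokenizer-specific and assumed given here.
    """
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_attentions=True)
    # attentions[-1] has shape (batch, heads, query_len, key_len). Averaging
    # over heads and query positions gives how much attention each token
    # *receives* in the last layer.
    received = out.attentions[-1].mean(dim=1).mean(dim=1).squeeze(0)
    return {m: received[s:e].sum().item() for m, (s, e) in column_spans.items()}
```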
Language-Interfaced Tabular Oversampling
● Rejection Sampling via Self-Authentication
● To filter out ill-generated synthetic samples.
● The generative language model is capable of imputing the label of a given sample; a synthetic sample is accepted only if this self-predicted label matches the targeted minority class.
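A minimal sketch of rejection sampling via self-authentication; `tlm_impute_label`, which asks the TLM to fill in a masked label clause, is a hypothetical stand-in for the actual model call.

```python
# Sketch: keep a synthetic sample only if the TLM itself assigns it the
# targeted minority class when asked to impute the (masked) label.
def self_authenticate(tlm_impute_label, features_text, target_class):
    predicted = tlm_impute_label(f"{features_text}, label is")
    return predicted.strip() == str(target_class)

def filter_synthetic(samples, tlm_impute_label, target_class):
    # Rejection sampling: discard ill-generated samples that fail the check.
    return [s for s in samples
            if self_authenticate(tlm_impute_label, s, target_class)]
```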
Language-Interfaced Tabular Oversampling
● Adaptive Oversampling With Progressive Imputation
● The number of column imputations required for successful conversion may vary from one sample to another; samples that fail self-authentication are progressively re-imputed with more columns punctured (see the sketch below).
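A sketch of the adaptive loop this implies, reusing the hypothetical helpers from the previous sketches; the linear schedule over the number of punctured columns $k$ is an assumption.

```python
# Sketch: progressively puncture and re-impute more columns until the
# converted sample passes self-authentication (or give up).
def progressive_convert(sample, target_class, puncture_and_impute,
                        authenticate, num_columns):
    for k in range(1, num_columns + 1):
        # Puncture k columns (e.g., least important first, per the
        # importance criterion) and impute them under the minority label.
        candidate = puncture_and_impute(sample, target_class, k)
        if authenticate(candidate, target_class):
            return candidate  # converted with as few imputations as needed
    return None  # reject samples that never pass self-authentication
```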
Experiment: Binary Classification Tasks
●LITO consistently outperforms baselines on four binary classification tasks, excelling
in both extreme and mild imbalance scenarios.
Experiment: Multi-label Classification Tasks
● LITO brings better imbalance-handling performance than other baselines in most cases.
● In the extreme imbalance setting, LITO clearly outperforms all baselines by large margins.
Experiment: In-Context LITO
●A proof of-concept experiment to demonstrate the performance of in-context LITO
using OpenAIGPT-3.5-turbo API.
●Oversampling minority class samples through in-context learning is indeed effective.