
Hello Potato World

[Potato Paper Review] GAN Augmentation: Augmenting Training Data using Generative Adversarial Networks

Heosuab 2021. 3. 27. 05:47

⋆ ｡ ˚ ☁︎ ˚ ｡ ⋆ ｡ ˚ ☽ ˚ ｡ ⋆

[Object Detection paper review]

 

Of the experimental papers I have read recently, this is the one whose systematic design and workflow were easiest to follow, so I summarized it and gave a presentation on it. The paper runs data augmentation experiments on medical data using GANs (Generative Adversarial Networks).

 

 

 


GAN Augmentation


Over the last few years there have been quite a few attempts and studies applying GANs to data augmentation. So many GAN models already exist that other papers have used a variety of them, but this paper trains on top of Progressive Growing of GANs (PGGAN).

 

Let's start with why the paper says data augmentation is needed. When working with image data, some characteristics of an image are needed and some are not. In medical images, for example, the position, shape, and size of the relevant organs or tissue are important information, whereas overall brightness differences or the acquisition angle seem unrelated to what we want to extract. The useful characteristics are called pertinent variance and the useless ones non-pertinent variance. Keeping too much of the useless information can get in the way of picking up the pertinent variance and can raise the risk of overfitting, so there are two ways to deal with non-pertinent variance in the dataset being used. The first is to simplify the data distribution (with methods such as intensity normalization, cropping, and registration to a standard space), and the second is data augmentation. The authors wanted to use GANs for augmentation because GANs reduce the need for hand-crafted (manually tuned) features.

 

[Figure 1: In each image pair, the top image is the one produced by the GAN and the bottom image is the original]

 

 

First, here is how the paper augments the data with PGGAN and trains the CNN (segmentation) model:

  1. Extract a total of 80k images from the dataset to be used and train PGGAN on them.
  2. Use the trained PGGAN to generate synthetic data for 80k images. (Since identical images must not be generated, the authors reportedly modified the PGGAN model to add Gaussian noise.)
  3. Randomly sample part of the generated synthetic data and merge it into the original dataset.
  4. Train the CNN (segmentation network) on the resulting training data.
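Steps 3 and 4 come down to drawing a random subset of the synthetic pool and mixing it into the real data before training. Below is a minimal sketch of that mixing step only. It is not the authors' code: the function name `build_training_set`, its parameters, and the string stand-ins for (image, label) pairs are all made up for illustration.

```python
import random

def build_training_set(real, synthetic_pool, n_synthetic, seed=0):
    """Merge the real samples with a random subset of the synthetic pool.

    Hypothetical helper for illustration; the paper does not publish this API.
    """
    if n_synthetic > len(synthetic_pool):
        raise ValueError("not enough synthetic samples in the pool")
    rng = random.Random(seed)
    # Sample without replacement so no synthetic item is added twice.
    extra = rng.sample(synthetic_pool, n_synthetic)
    combined = list(real) + extra
    rng.shuffle(combined)  # mix real and synthetic samples
    return combined

# Toy stand-ins: strings instead of (image, label) pairs.
real = [f"real_{i}" for i in range(8)]
synthetic_pool = [f"synthetic_{i}" for i in range(8)]
train = build_training_set(real, synthetic_pool, n_synthetic=4)
print(len(train))  # 12
```

The combined list would then be fed to whichever segmentation network is being trained.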

 

[Figure 2: The overall augmentation and training process]

 

The CNN is evaluated with the Dice Similarity Coefficient (DSC), a statistical measure of how similar two results are (here, the predicted segmentation and the ground truth). The training process above has the following five adjustable variables.

  • Amount of available real data: the training data is made up of part of the real data plus part of the synthetic data; this is the amount of real data used
  • Amount of additional synthetic data: the amount of synthetic data used in the training data
  • Dataset: the dataset used as the real data
  • Segmentation network: the model used as the CNN
  • Augmentation: the augmentation method (commonly used methods include cropping, rotation, and noising; here a GAN is used)

The paper varies these five variables and compares the results to see how each one affects performance. Let's go through the five questions the authors ask in turn and summarize how the variables were adjusted and what results came out.
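As a side note on the evaluation metric: for two binary masks A and B, DSC is 2|A ∩ B| / (|A| + |B|), so 1 means perfect overlap and 0 means none. Here is a small sketch using plain Python lists as flattened masks. The function name and the rule for two empty masks are my own assumptions, not something taken from the paper.

```python
def dice_coefficient(pred, target):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count positions that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / total

pred   = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]
print(dice_coefficient(pred, target))  # 2*2 / (3+3), about 0.667
```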

 

 


Experiments & Discussion


  • Does the choice of segmentation network affect the performance gain?

The authors experimented with three segmentation networks: UNet and Residual UNet (UResNet), which suit the CT dataset, and DeepMedic, which is widely used in medical segmentation.

Figure 3 below shows the results for the two networks other than DeepMedic. The size and direction of the change caused by augmentation are similar for UNet and UResNet, so the effect of GAN augmentation does not seem to change much when a different segmentation network is used.

∴ GAN augmentation helps improve performance regardless of which of the tested segmentation networks is used.

[Figure 3: Segmentation Network]

 

 

  • Does the augmentation method affect the performance gain?

As mentioned above, data augmentation has traditionally been done by applying geometric, brightness, or color transformations to the images, and the authors wanted to find out whether using a GAN has a better effect than these traditional methods. The paper uses rotation augmentation as the representative traditional method. Four settings were compared: first, no augmentation at all; second, GAN augmentation; third, rotation augmentation; and fourth, GAN and rotation augmentation together.

The results in Figure 4 below show that applying any kind of augmentation improves performance over applying none, and that using the two methods together improves performance more than applying either one alone. From this we can conclude that the traditional method and the GAN are independent of each other: each method works in a different way, and they can produce a synergy effect when used together.

∴ Because the two methods are independent, they can produce a synergy effect.

[Figure 4: Augmentation]

 

 

 

  • Does the amount of available real data affect the performance gain?

Data augmentation is most effective when training data is scarce, so the authors recreated that situation by varying the amount of real data used. From the 80k real samples, 10% to 90% were randomly selected and used. (That is what the text says, but the table goes all the way up to 100%, so I reread it several times... it is still unclear to me.)

The columns of Figure 5-(1) show the amount of real data used. Looking at UNet alone, the score is 76.9% at 10% (the first row) and 88.9% at 100%, a very large gap of 12 points. This confirms that the amount of real data used had the largest impact on performance.

However, Figure 5-(2), which shows the experiment on the other dataset, shows that when 100% of the data is used, performance actually gets worse as the proportion of GAN augmentation increases. So when data is not scarce, adding artificially transformed data can do more harm than good.

∴ The amount of training data is the most important factor in determining performance.

 

[Figure 5-(1), 5-(2): Amount of available real data and amount of synthetic data]

 

 

 

  • Does the amount of synthetic data affect the performance gain?

To investigate how the amount of synthetic data affects the segmentation network, the authors ran several experiments varying the amount of synthetic data merged into the training data. The amount was varied between 0% and 100%, where the percentage is expressed relative to the original real data, not relative to the synthetic data. For example, using 50% additional data means that, since the original real data has 80k samples, 40k samples are added, giving 120k training samples in total.
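The arithmetic behind that percentage convention is simple enough to write down. The snippet below is only an illustration: the function name `augmented_size` is made up, and apart from the 50% case described above, the percentages in the loop are arbitrary example values rather than the paper's actual grid.

```python
def augmented_size(n_real, synthetic_pct):
    """Return (n_synthetic, n_total) when synthetic_pct percent of the real
    data size is added as synthetic samples."""
    n_synthetic = n_real * synthetic_pct // 100
    return n_synthetic, n_real + n_synthetic

N_REAL = 80_000  # size of the real data stated in this review

# Example percentages only; 50 is the case worked through in the text.
for pct in (0, 10, 50, 100):
    n_synthetic, n_total = augmented_size(N_REAL, pct)
    print(f"{pct:>3}% extra -> {n_synthetic:>6} synthetic, {n_total:>6} total")
```

For 50% this gives 40,000 synthetic samples and 120,000 training samples in total, matching the example in the text.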

The results can be seen in Figure 5, the same figure as above. Because the synthesis process also generates additional characteristics that differ from the given ones and introduces several new factors, the effect is largest at around 50%.

 

 

 

  • Does the type of dataset affect the performance gain?

To test different datasets, the authors used two: a CT image dataset and a FLAIR image dataset. The one worth noting is the CT dataset, whose labels are divided into three classes: cortical CSF, brain stem CSF, and ventricular CSF. The classes occur in a ratio of 4.35:1:1.35, in that order, and as Figure 6 shows, brain stem CSF, the class with the fewest images, shows the largest performance gain (the blue curve). The paper therefore suggests that the effect will be more noticeable on imbalanced datasets with a strong class imbalance than on balanced datasets where all classes are evenly distributed.

∴ The performance gain is larger on imbalanced datasets.

 

[Figure 6: Dataset]
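To get a feel for how imbalanced the 4.35:1:1.35 ratio is, it can be converted into per-class shares. This is just a quick calculation on the ratio quoted above. The helper name `class_shares` is made up, and the ratio is all the input there is, so actual sample counts are not known here.

```python
def class_shares(ratio):
    """Convert a class ratio into fractions that sum to 1."""
    total = sum(ratio.values())
    return {name: value / total for name, value in ratio.items()}

# Class ratio as quoted in this review (order: cortical, brain stem, ventricular).
ct_ratio = {"cortical CSF": 4.35, "brain stem CSF": 1.0, "ventricular CSF": 1.35}

shares = class_shares(ct_ratio)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
# cortical CSF: 64.9%
# brain stem CSF: 14.9%
# ventricular CSF: 20.1%
```

Brain stem CSF, the class with the largest reported gain, accounts for only about 15% of the total.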

 

 


Conclusion


Across the various experimental conditions, the evaluation metric (DSC) improved by 1 to 5%, with the largest improvement in the data-scarce setting where only 10% of the data was used.

 

The experiments were designed and run very systematically, and the results are all laid out clearly, which made the paper easy to read. The most interesting part for me was that traditional augmentation methods and GANs work in completely different ways. I think digging a little deeper into the math behind this might lead to some new ideas.

 

 

 


References


[1] Christopher Bowles et al., GAN Augmentation: Augmenting Training Data using Generative Adversarial Networks, 2018

 

 

 

This is a review that a potato-like undergrad wrote after reading the paper alone, as a personal record. Please let me know if anything needs correcting 🥔

 

 

 
