
[Paper review] B-CNN: Branch Convolutional Neural Network for Hierarchical Classification

by yejining99 2022. 2. 16.

🔍 Hierarchical classification?

Convolutional Neural Networks (CNNs) perform very well as image classifiers.

However, a typical CNN assumes that the target classes are all treated equally and are mutually exclusive, and it simply predicts one class out of many.

In practice, the classes in an image classification task often have hierarchical relationships.

For example, compare telling a cat from a dog with telling a dog from an airplane.

Cats and dogs both fall under the single category of animals, but a dog and an airplane...? Completely different.

So what if we built the hierarchical relationships between image classes into the algorithm?

B-CNN is the algorithm that came out of this idea.

 

 

<Link to the original paper!>

https://arxiv.org/abs/1709.09890

 


<Code for the algorithm used in the paper>

https://github.com/zhuxinqimac/B-CNN

 

GitHub - zhuxinqimac/B-CNN: Sample code of B-CNN paper (https://arxiv.org/abs/1709.09890) written in Python3+.


 

This algorithm has two main ideas.

The first is the B-CNN (Branch Convolutional Neural Network) architecture itself, and the second is the BT-strategy (Branch Training strategy) applied when training it!

Let's go through them one by one ... !

 

💫 Branch Convolutional Neural Network (B-CNN)

B-CNN์˜ ๋ชจ์Šต

์šฐ์„  B-CNN์„ ํ•™์Šต์‹œํ‚ค๊ธฐ ์œ„ํ•ด์„œ๋Š” hierarchical label์„ ์•Œ๊ณ  ์žˆ์–ด์•ผ ํ•œ๋‹ค.

์˜ˆ๋ฅผ ๋“ค์–ด ์šฐ๋ฆฌ๋Š” hierarchcal level์„ 3์ด๋ผ๊ณ  ๋‘”๋‹ค๊ณ  ๊ฐ€์ •ํ•ด๋ณด์ž.

๊ฐœ๋ผ๊ณ  ํ•˜๋ฉด label์€ [์ƒ๋ฌผ, ์• ์™„๋™๋ฌผ, ๊ฐœ] ์ด๋Ÿฐ์‹์˜ label์ด ๋  ๊ฒƒ์ด๋‹ค.

์˜์ž๋ผ๊ณ  ํ•˜๋ฏ„ label์€ [๋ฌด์ƒ๋ฌผ, ๊ฐ€๊ตฌ, ์˜์ž] ์ด๋Ÿฐ์‹์˜ label.
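To make the label format concrete, here is a minimal sketch of how such three-level labels could be turned into one class index per level. The `HIERARCHY` table, the extra "cat" entry, and the helper names are hypothetical and only for illustration. They are not taken from the paper or its repository.

```python
# Hypothetical 3-level hierarchy (coarsest level first), following the
# dog / chair examples in the text. Not the paper's actual label tree.
HIERARCHY = {
    "dog":   ["living thing", "pet", "dog"],
    "cat":   ["living thing", "pet", "cat"],
    "chair": ["non-living thing", "furniture", "chair"],
}


def level_vocab(hierarchy, level):
    """Sorted unique class names that appear at one level (0 = coarsest)."""
    return sorted({path[level] for path in hierarchy.values()})


def encode(fine_name, hierarchy):
    """Return one integer class index per level for a fine-grained class."""
    path = hierarchy[fine_name]
    return [level_vocab(hierarchy, lvl).index(name)
            for lvl, name in enumerate(path)]


print(encode("dog", HIERARCHY))    # [0, 1, 2]
print(encode("chair", HIERARCHY))  # [1, 0, 1]
```

Each training sample then carries three targets instead of one, which is what lets every level be supervised separately.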

 

This tells us something: a B-CNN produces as many outputs as there are hierarchical levels!

That means it also needs as many prediction layers (the branches in B-CNN) as there are levels, since each level has to be predicted separately.

You can see in the figure above that three of them are used.

According to the paper, each of them uses a ConvNet.

 

It follows that the loss function needed to fit the model also has as many terms as there are levels.

The paper uses cross-entropy loss here.

Equation: the B-CNN loss function
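Written out, a weighted sum of per-level softmax cross-entropy losses matching this description can be expressed as below. Treat it as a reconstruction under assumed notation: K is the number of levels, f_j^k is the j-th logit of the k-th branch, and y_i^k is the true class of sample i at level k. Only the loss weight A_k is named in the text.

```latex
L_i = \sum_{k=1}^{K} -A_k \,
      \log\!\left( \frac{e^{f^{k}_{y_i^k}}}{\sum_{j} e^{f^{k}_{j}}} \right)
```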

The thing to focus on in the equation above is A_k.

This is the loss weight for the loss at each level, and through it the hierarchical relationship can be reflected in the model updates.

The paper calls this idea the BT-strategy.
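As a rough illustration of how the loss weights enter the computation, the sketch below evaluates a weighted sum of per-level cross-entropy losses for a single sample in plain Python. The function names and the example logits are made up for illustration. The paper's own code may organize this differently.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of one sample, given raw logits and the true class index."""
    return -math.log(softmax(logits)[target])


def bcnn_loss(branch_logits, targets, loss_weights):
    """Weighted sum of the per-level cross-entropy losses (one term per level)."""
    return sum(w * cross_entropy(z, t)
               for z, t, w in zip(branch_logits, targets, loss_weights))


# Hypothetical outputs of three branches for one image (2, 2 and 3 classes).
branch_logits = [[2.0, 0.0], [0.0, 1.0], [0.1, 0.2, 3.0]]
targets = [0, 1, 2]              # true class index at each level
loss_weights = [0.2, 0.3, 0.5]   # one loss weight A_k per level
print(bcnn_loss(branch_logits, targets, loss_weights))
```

Setting a weight to 0 removes that level's term from the total entirely, so the weights decide which levels drive the gradient.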

 

 

💫 Branch Training Strategy (BT-strategy)

The BT-strategy trains the B-CNN model while varying the loss weights.

A level with a large loss weight has a large influence on training at that point.

The paper refers to this as the 'focus'.

For example, loss weights of [0.2, 0.3, 0.5] mean that the focus is on the last level.

As training progresses, this focus shifts from the high level (e.g. living thing) to the low level (e.g. dog).

In this process, the model first learns the high-level features well and can then put them to use for parameter tuning at the lower levels.
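The shifting focus can be pictured as an epoch-based schedule of loss weights. The sketch below is only an assumption about how such a schedule might be written: the switch epochs and weight values in `SCHEDULE` are illustrative placeholders, not values quoted from the paper.

```python
# Illustrative schedule: (first epoch it applies to, [coarse 1, coarse 2, fine]).
# The epochs and weights are placeholder assumptions, not the paper's settings.
SCHEDULE = [
    (0,  [0.98, 0.01, 0.01]),   # start: focus on the coarsest level
    (10, [0.10, 0.80, 0.10]),   # then: focus on the middle level
    (20, [0.10, 0.20, 0.70]),   # then: focus mostly on the fine level
    (30, [0.00, 0.00, 1.00]),   # finally: train on the fine level only
]


def loss_weights_at(epoch, schedule=SCHEDULE):
    """Return the loss weights in effect at a given (0-based) epoch."""
    current = schedule[0][1]
    for start, weights in schedule:
        if epoch >= start:
            current = weights
    return current


def focus_level(weights):
    """Index of the level with the largest loss weight (the 'focus')."""
    return max(range(len(weights)), key=lambda k: weights[k])


for epoch in (0, 10, 20, 30):
    w = loss_weights_at(epoch)
    print(epoch, w, "focus on level", focus_level(w))
```

With a schedule like this, the focus index moves 0, 1, 2 as training proceeds, which is the coarse-to-fine shift described above.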

 

 

🔬 Experiments

์ฒ˜์Œ์œผ๋กœ ์•Œ์•„๋ณธ ๊ฒƒ์€ ๊ฐ level์˜ layer์˜ ์ •ํ™•๋„!

ํ™•์‹คํžˆ ๋†’์€ level(=coarse 1)์˜ ์ •ํ™•๋„๊ฐ€ ๋‚ฎ์€ level(=fine)๋ณด๋‹ค ๋†’๋‹ค.

์‚ฌ์‹ค ์ง๊ด€์ ์œผ๋กœ ์ƒ๊ฐํ•ด๋ด๋„ ๋‹น์—ฐํ•œ ๊ฒฐ๊ณผ๋‹ค.

๊ณ ์–‘์ด๋ž‘ ๊ฐ•์•„์ง€๋ฅผ ๊ตฌ๋ถ„ํ•˜๋Š” ๊ฒƒ๋ณด๋‹ค, ๊ฐ•์•„์ง€๋ž‘ ์˜์ž๋ฅผ ๊ตฌ๋ถ„ํ•˜๋Š”๊ฒŒ ๋” ์‰ฌ์šธํ…Œ๋‹ˆ ใ…‹ใ…‹

๋‘๋ฒˆ์งธ๋Š” 3๊ฐ€์ง€ ๋ฐ์ดํ„ฐ์…‹์— ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ ์šฉํ•˜์—ฌ ์ผ๋ฐ˜์ ์ธ CNN๊ณผ B-CNN์˜ ์„ฑ๋Šฅ์„ ๋น„๊ตํ•˜์˜€๋‹ค.

B-CNN์ด ๋” ์ข‹์€ ์„ฑ๋Šฅ์„ ๊ฐ€์กŒ๋‹ค. (๊ทธ๋Ÿฌ๋‹Œ๊น ๋…ผ๋ฌธ์„ ์“ธ ์ˆ˜ ์žˆ์—ˆ๊ฒ ์ง€)

 

✅ Conclusion

B-CNN์€ hierarchical label์„ ๊ฐ€์ด๋“œ์ฒ˜๋Ÿผ ์‚ฌ์šฉํ•ด์„œ ํผํฌ๋จผ์Šค๋ฅผ ๋†’์˜€๋‹ค.

์ผ๋ฐ˜ CNN๊ณผ ๋น„๊ตํ•ด๋„ ๋” ์ข‹์€ ์„ฑ๋Šฅ์„ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค๋Š” ๊ฒƒ์„ ์•Œ ์ˆ˜ ์žˆ๋‹ค.

 

๋‹ค๋งŒ ์ด ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ์•„์‰ฌ์šด ์ ์€

1, hierarchical label์„ ๋ฏธ๋ฆฌ ์•Œ๊ณ  ์žˆ์–ด์•ผํ•œ๋‹ค๋Š” ์ 

2, ๊ฐ€์žฅ ์ข‹์€ ์„ฑ๋Šฅ์„ ๊ฐ€์ง„ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ์•„๋‹Œ, ๊ธฐ๋ณธ์ ์ธ CNN๊ณผ ๋น„๊ตํ–ˆ๋‹ค๋Š” ์ 

3, training epochs๋ฅผ 60์œผ๋กœ ์ œํ•œํ•ด์„œ ์ œ์ผ ์ •ํ™•ํ•œ ์ƒํƒœ๊ฐ€ ์•„๋‹ˆ๋ผ๋Š” ์ (๋ฌผ๋ก  ์ด์ ์ด ๊ฐ€์ง€๊ณ  ์˜ค๋Š” ์ด์ ๋„ ์žˆ์ง€๋งŒ)

 

๊ฐ„๋‹จํ•œ ์•„์ด๋””์–ด๋กœ ๋” ์ข‹์€ ์„ฑ๋Šฅ์„ ๊ฐ€์ง„ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๋งŒ๋“ค์–ด๋‚ธ ๊ฒƒ์ด ์‹ ๊ธฐํ–ˆ๋‹ค!