
Big Data Intro

- Up to 2003, humanity had produced about 5 exabytes of information in total

- Now we produce 5 exabytes every two days -> big data

 

Big Data์˜ 4V

- Volume

- Variety

- Velocity

- Veracity

 

Science Paradigms

- Thousands of years ago: science was empirical -> describing natural phenomena

- The last few hundred years: a theoretical branch -> theory, modeling

- The last few decades: computational -> simulation

- Today: data exploration

- The future: Data-driven Science -> Data-driven Hypothesis Generation

 

Big Data Challenges

- ๋น…๋ฐ์ดํ„ฐ

* ํฌ๊ณ  ๋ณต์žกํ•œ ๋ฐ์ดํ„ฐ

* ์˜ˆ: social, web, financial transaction data, academic articles, genomic data...

- 2๊ฐ€์ง€์˜ ๊ณ ๋‚œ

* ํšจ์œจ์  ์ €์žฅ ๋ฐ ์ ‘๊ทผ

* Data Analytics -> ๊ฐ€์น˜ ์žˆ๋Š” ์ •๋ณด ์บ๋‚ด๊ธฐ

 

Data Science and Big Data

- The term big data refers to the analysis of massive datasets

- e.g. all of Twitter/Facebook, the web logs of major sites, the genome sequences of thousands of people, every image on Flickr...

- The larger data gets, the harder it is to work with

 

Big data as Bad data

- Unrepresentative participation (bias): Instagram - used mostly by the young, The New York Times - liberal, Fox News - conservative, The Wall Street Journal - read mostly by the wealthy

- Spam and machine-generated content: bots, fake news, etc.

- Power-laws: imply redundancy -> in a building-detection task, landmark data is so plentiful that the model may recognize landmarks well yet fail on ordinary buildings

- Susceptibility to temporal bias (e.g. Google Flu Trends): Google's autocomplete feature changed the distribution of search queries

 

Large-Scale Machine Learning

- The algorithms we have studied so far do not scale well to very large datasets.

- On very large datasets, models with few parameters cannot exploit all the data.

- The algorithm's time complexity should be linear -> O(n).

- Big matrices over big data can be sparse.

* e.g. a user-rating matrix over every movie for 100 million Netflix customers has huge dimensions but is sparse

-> because nobody watches every movie

 

Filtering Data -> domain-specific criteria

- An advantage of big data is that a large fraction of it can be thrown away to make the analysis cleaner.

- e.g. only 34% of all tweets on Twitter come from English-language accounts -> the rest can be discarded

- Filtering out irrelevant or hard-to-interpret data requires application-specific knowledge.

 

Subsampling Data

- why subsample...?

* It gives a clean separation of training/validation/testing data.

* Simple, robust models have few parameters -> feeding them all of a big dataset is overkill

cf) hyperparameter fitting and model selection -> require training many times

* A spreadsheet-sized dataset is easy and fast to analyze.

 

Using truncation

- Taking only the first n records is easy, but invites several biases:

* temporal biases (you may end up analyzing only the oldest data)

* lexicographic biases (you may analyze only data near the front of the alphabet -> Arabic makes the cut but Korean does not)

* numerical biases (ID numbers may be meaningful rather than randomly assigned)

 

Stream Sampling

- n์„ ๋ชจ๋ฅผ ๋•Œ size k ์งœ๋ฆฌ ๊ท ์ผ ์ƒ˜ํ”Œ์„ ์ถ”๊ตฌํ•  ๋•Œ๊ฐ€ ์žˆ๋‹ค.

- ํ•ด๊ฒฐ์ฑ…: k๊ฐœ์˜ active element์— ๋Œ€ํ•œ ๋ฐฐ์—ด์„ ์œ ์ง€ํ•˜๊ณ , n๋ฒˆ์จฐ ์š”์†Œ๋ฅผ k/n์˜ ํ™•๋ฅ ๋กœ ๋Œ€์ฒด -> ๋“ค์–ด์™”๋˜ ๊ฒƒ๋„ ์ซ“๊ฒจ๋‚  ์ˆ˜๋„ ์žˆ์Œ

- cut์„ ๋งŒ๋“ค ๋•Œ ์ƒˆ๋กœ์šด ์š”์†Œ์˜ ์œ„์น˜๋ฅผ ๋žœ๋ค์œผ๋กœ ์„ ์ •ํ•  ์ˆ˜ ์žˆ์Œ.

 

Distributed vs. Parallel Processing

- Parallel processing: runs on one machine (via threads, OS processes, etc.)

- Distributed processing: runs on multiple machines, communicating over the network

 

Data Parallelism

- One way to exploit parallelism is to split big data across multiple machines and train models on each partition independently.

- As with k-means, merging the per-machine results can be tricky.

* Parallelized K-means (see the sketch after the steps)

For example, in a setting with 100 seeds:

1. One master machine (handling job assignment, etc.) picks k random seeds and broadcasts them to the other machines.

2. Locally, each machine computes which cluster each of its own data points belongs to.

3. From that data, each machine computes the new centroid statistics and point counts and sends them back (sending sum vectors and counts avoids transmitting every point individually).

4. The master machine collects this information, performs the final centroid update, and broadcasts the new centroids.

5. Repeat.
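A toy single-process sketch of steps 1-5, with the "machines" simulated as array chunks (all names are illustrative, not a real cluster API):

import numpy as np

def local_stats(chunk, centroids):
    # worker step: assign local points to the nearest centroid and
    # return per-cluster (sum vector, count) instead of raw points
    sums = np.zeros_like(centroids)
    counts = np.zeros(len(centroids), dtype=int)
    for x in chunk:
        c = np.argmin(((centroids - x) ** 2).sum(axis=1))
        sums[c] += x
        counts[c] += 1
    return sums, counts

def parallel_kmeans(chunks, k, iters=10):
    rng = np.random.default_rng(0)
    centroids = chunks[0][rng.choice(len(chunks[0]), k, replace=False)]  # master picks seeds
    for _ in range(iters):
        stats = [local_stats(c, centroids) for c in chunks]   # would run on the machines
        sums = sum(s for s, _ in stats)
        counts = sum(c for _, c in stats)
        centroids = sums / np.maximum(counts, 1)[:, None]     # master's final update
    return centroids

chunks = [np.random.default_rng(i).random((50, 2)) for i in range(4)]
print(parallel_kmeans(chunks, k=3))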

 

Grid Search

- Definition: finding the right hyperparameters

- e.g. finding the right k in k-means clustering

- The runs are independent, so they can be executed in parallel to find the best value. (See the sketch below.)
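A minimal grid-search sketch over k for k-means, assuming scikit-learn is available (since each fit shares nothing with the others, they could be farmed out to separate machines):

import numpy as np
from sklearn.cluster import KMeans

X = np.random.default_rng(0).random((500, 2))

# evaluate every candidate k; these fits are independent, so they parallelize trivially
inertia = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
           for k in range(2, 11)}
for k, score in inertia.items():
    print(k, score)   # inertia always falls as k grows; pick the "elbow", not the minimum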

 

Distributed processing์˜ ๋ณต์žก๋„

- machine ์ˆ˜์— ๋”ฐ๋ผ ๊ธ‰๊ฒฉํ•˜๊ฒŒ ์ฆ๊ฐ€ํ•œ๋‹ค

* 1๊ฐœ: ๋‚˜์˜ box์˜ ์ค‘์‹ฌ๋ถ€๋ฅผ ๋ฐ”์˜๊ฒŒ ํ•จ

* 2๊ฐœ: ๋ช‡ ๊ฐœ์˜ box์— ๋Œ€ํ•ด ํ”„๋กœ๊ทธ๋žจ์„ ์‹คํ–‰์‹œํ‚จ๋‹ค.

* ์—ฌ๋Ÿฌ๊ฐœ: MapReduce ๋“ฑ์˜ ์‹œ์Šคํ…œ์„ ์ด์šฉํ•˜์—ฌ ํšจ์œจ์ ์œผ๋กœ ๊ด€๋ฆฌํ•  ์ˆ˜ ์žˆ๋‹ค.

 

MapReduce / Hadoop

- Google's MapReduce paradigm for distributed computing is implemented by open-source systems such as Hadoop and Spark

- a simple parallel programming model

- straightforward scaling to hundreds or thousands of machines

- fault tolerance through redundancy

 

Divide and Conquer

1. Partition: split one job into several pieces

2. Work on each piece separately

3. Combine: merge the partial results into one

 

Typical Big Data Problem

- ์•„์ฃผ ํฐ ์ˆซ์ž์˜ ๊ธฐ๋ก์„ ๋ฐ˜๋ณต

- ๊ฐ๊ฐ์—์„œ ๊ด€์‹ฌ์žˆ๋Š” ๊ฒƒ์„ ๋ฝ‘์•„๋‚ด๊ธฐ

- ์ค‘๊ฐ„ ๊ฒฐ๊ณผ๋ฅผ shuffle, sort

- ์ค‘๊ฐ„ ๊ฒฐ๊ณผ๋ฅผ ์ง‘ํ•ฉ

- ์ตœ์ข… ๊ฒฐ๊ณผ๋ฅผ ๋„์ถœ

- word counting, k-means clustering์„ ์ƒ๊ฐํ•ด๋ณด๊ธฐ

 

MapReduce์˜ ์•„์ด๋””์–ด

- Scale Out: ์—ฌ๋Ÿฌ ๊ฐœ์˜ ๊ธฐ๊ณ„๋ฅผ ๋ถ™์ด๊ธฐ (up-> ๊ธฐ๊ณ„ ์„ฑ๋Šฅ ์žฅ์ฒด๋ฅผ ์˜ฌ๋ฆฌ๋Š” ๊ฒƒ->X)

* ์Šค์ผ€์ผ ์—…์€ ๋ฉ”๋ชจ๋ฆฌ bottleneck ํ˜„์ƒ ๋“ฑ์ด ๋ฐœ์ƒํ•  ์œ„ํ—˜์ด ์žˆ๋‹ค.

- Move processing to the data: ํด๋Ÿฌ์Šคํ„ฐ๋“ค์ด ์ œํ•œ๋œ bandwidth๋ฅผ ๊ฐ€์ง„๋‹ค

- ์ˆœ์ฐจ์ ์œผ๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ณ , ๋žœ๋ค ์ ‘๊ทผ์„ ํ”ผํ•˜๋ผ: seek์€ ๋น„์šฉ์ด ๋งŽ์ด ๋“ค์ง€๋งŒ, disk throughput์€ ํ•ฉ๋ฆฌ์ 

* HDD ๋“ฑ์—์„œ ๋ฐ์ดํ„ฐ๋ฅผ ์ฝ์„ ๋•Œ head๋ฅผ ์ด๋™ํ•ด๊ฐ€๋ฉฐ ์ฝ๋Š”๋‹ค. ์ด๊ฒƒ์„ ์–ด๋–ป๊ฒŒ ์ตœ์ ํ™”ํ•˜๋Š๋ƒ?

 

Components of Hadoop

1. Hadoop/MapReduce

- ๋ถ„์‚ฐ๋œ ๋น…๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ ์ธํ”„๋ผ๊ตฌ์กฐ

- abstract/paradigm

- fault-tolerant: ๋ฐ์ดํ„ฐ๋ฅผ 100๊ฐœ๋กœ ๋‚˜๋ˆ„์–ด์ฃผ๋Š”๋ฐ, ๊ทธ์ค‘ 1๊ฐœ์˜ ์ฒ˜๋ฆฌ๊ธฐ๊ฐ€ ๊ณ ์žฅ๋‚˜๋ฉด ์ผ๋‹จ ์žฌ์‹œ์ž‘๋ถ€ํ„ฐ ํ•จ -> ๊ทธ๋ž˜๋„ ๋™์ž‘ํ•˜์ง€ ์•Š์„ ๊ฒฝ์šฐ ๋‹ค๋ฅธ ์ฒ˜๋ฆฌ๊ธฐ์—๊ฒŒ ํ• ๋‹น

- schedule: job assignment ๋“ฑ์„ ์กฐ์œจ

- execution

 

2. HDFS (Hadoop Distributed File System)

- fault-tolerant: ํ•˜๋“œ๋””์Šคํฌ๊ฐ€ ๋ง๊ฐ€์ ธ๋„ ๊ดœ์ฐฎ์Œ

- high-bandwidth

- high availability distributed storage

- ์ตœ์†Œ 3๊ฐœ์˜ ๋ณต์‚ฌ๋ณธ์€ ์œ ์ง€

 

MapReduce Word Count

- Map and Reduce functions

map(k,v) -> [(k', v')]
reduce(k', [v']) -> [(k',v')]

- Word Count

Map(String docid, String text):
    for each word w in text:
        Emit(w, 1);
Reduce(String term, Iterator<int> values):
    int sum = 0;
    for each v in values:
        sum += v;
    Emit(term, sum);
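A runnable Python version of the same pipeline, with the shuffle step written out explicitly (the docs dictionary is a toy example):

from collections import defaultdict

def map_phase(docid, text):
    return [(w, 1) for w in text.split()]          # Emit(w, 1)

def shuffle(pairs):
    groups = defaultdict(list)                     # group values by key,
    for k, v in pairs:                             # as the framework would
        groups[k].append(v)
    return groups

def reduce_phase(term, values):
    return term, sum(values)                       # Emit(term, sum)

docs = {"d1": "to be or not to be", "d2": "to do is to be"}
pairs = [p for docid, text in docs.items() for p in map_phase(docid, text)]
counts = dict(reduce_phase(t, vs) for t, vs in shuffle(pairs).items())
print(counts)                                      # {'to': 4, 'be': 3, ...}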

 

The MapReduce Runtime

- handles scheduling: assigns map and reduce tasks to workers

- handles data distribution: moves processes to the data

- handles synchronization: gathers, sorts, and merges the intermediate data

- handles errors and faults: detects worker failures and restarts tasks

 

Hadoop Distributed File System (HDFS)

- ํด๋Ÿฌ์Šคํ„ฐ ๋‚ด๋ถ€์˜ ๋…ธ๋“œ์˜ local disk์— ๋ฐ์ดํ„ฐ ์ €์žฅ -> ๋ฉ”๋ชจ๋ฆฌ์˜ ๋ชจ๋“  ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ–๊ณ  ์žˆ์„ RAM์ด ์ถฉ๋ถ„ํ•˜์ง€ ์•Š์Œ

- Disk access๋Š” ๋Š๋ฆฌ์ง€๋งŒ disk throughput์€ ํ•ฉ๋ฆฌ์  -> file์„ ํ†ตํ•œ linear scan์€ ๊ดœ์ฐฎ์Œ

- ๋ชจ๋“  ๋ฐ์ดํ„ฐ๋ฅผ 3๋ฒˆ ๋ณต์ œ -> reliability on commodity hardware

 

Cloud Computing Services

- Amazon ๋“ฑ์˜ ํ”Œ๋žซํผ์€ ๋‹จ๊ธฐ ์ž‘์—…์— ๋Œ€ํ•œ ์—ฌ๋Ÿฌ machine ์ œ๊ณต ๊ฐ€๋Šฅ

 

Feel Free to Experiment: micro instances are worth trying


๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์˜ ๋น„๊ต

- Power, expressibility: ์–ผ๋งˆ๋‚˜ ๋ณต์žกํ•œ ์ž‘์—…์„ ํ•  ์ˆ˜ ์žˆ๋Š๋ƒ

- Interpretability

- Ease of Use

- Training speed

- Prediction speed

                      Linear Regression  Nearest Neighbor  Deep Learning
Power/Expressibility  L                  L                 H
Interpretability      H                  H                 L
Ease of Use           H                  H                 L
Training speed        H                  H                 L
Prediction speed      H                  L                 H

cf) ๋”ฅ๋Ÿฌ๋‹์€ Foward Fast๋ฅผ ์ด์šฉํ•œ๋‹ค. Nearest Neighbor๋Š” ๊ฑฐ๋ฆฌ ๊ณ„์‚ฐ ๋•Œ๋ฌธ์— ์—ฐ์‚ฐ ์‹œ๊ฐ„์ด ๊ธธ๋‹ค.

 

XOR & Linear Classifier

- A linear classifier cannot fit even a simple non-linear function like XOR.

- Alternatives: decision trees, random forests, Support Vector Machines, deep learning

 

Decision Tree Classifier

- A model that classifies an example by following a root-to-leaf path

- The tree decomposes the training examples into relatively homogeneous groups

(Example: modeling Titanic survival)

- Constructed in a top-down manner

- At each step, split along one feature/dimension so as to refine the information about the m classes

* pure split: produces single-class nodes

* balanced split: separates the items so that the group sizes are roughly equal

Information-Theoretic Entropy

- entropy: measures the amount of class confusion, H(D) = -sum_i p_i log2 p_i

Split Criteria

- information gain

gain(D, A_i) = entropy(D) - entropy_{A_i}(D), where entropy_{A_i}(D) is the weighted entropy of the partitions induced by attribute A_i

- Usage: compute the gain for each data column and choose the one with the largest value to get the best split. (See the sketch below.)

 

Stopping Criteria

- information gain์ด 0์ด ๋  ๋•Œ๊ฐ€ ์•„๋‹ˆ๋ผ, ์ž…์‹ค๋ก ๋ณด๋‹ค ์ž‘๋‹ค๋ฉด ๋ฉˆ์ถฐ๋„ ๋œ๋‹ค. -> ์ด์ •๋„๋ฉด ์ถฉ๋ถ„ํ•˜๋‹ค๋Š” ๋œป

- alternate strategy: full tree๋ฅผ ๋งŒ๋“ค์–ด์„œ low value node๋ฅผ ๊ฐ€์ง€์น˜๊ธฐ ํ•˜๊ธฐ

-> subtree์ค‘ ์˜๋ฏธ๊ฐ€ ๊ฑฐ์˜ ์—†๋Š” ๋ถ€๋ถ„์„ leaf๋กœ ํ†ต์ผํ•œ ํ›„, ์›๋ž˜ ํŠธ๋ฆฌ์™€ ์„ฑ๋Šฅ์„ ๋น„๊ตํ•˜์—ฌ ์ฑ„ํƒ

 

Decision Tree์˜ ์žฅ์ 

- ๋น„์„ ํ˜•์„ฑ

- categorical variable์„ ์ž˜ ์ ์šฉ

- ์„ค๋ช… ๊ฐ€๋Šฅ์„ฑ ๋†’์Œ

- robustness: ๋‹ค๋ฅธ ํŠธ๋ฆฌ๋“ค๊ณผ์˜ ์•™์ƒ๋ธ”์„ ์ง„ํ–‰ํ•ด์„œ ๋” ๋‚˜์€ ๊ฒƒ์„ voteํ•  ์ˆ˜ ์žˆ์Œ

 

Ensemble Methods

1. Bagging

training

- k๊ฐœ์˜ bootstrap sample ์ƒ์„ฑ

- ๊ฐ๊ฐ์˜ S[i] ์ƒ˜ํ”Œ์— ๋Œ€ํ•ด classifier๋ฅผ ์ƒ์„ฑํ•ด์„œ k๊ฐœ์˜ classifier๋ฅผ ๋งŒ๋“ฆ (๊ฐ™์€ ์•Œ๊ณ ๋ฆฌ์ฆ˜ ์ด์šฉ)

testing

- k๊ฐœ์˜ classifier๋ฅผ ๋™์ผํ•œ ๊ฐ€์ค‘์น˜๋กœ ํˆฌํ‘œํ•ด์„œ ์ƒˆ๋กœ์šด ์‚ฌ๋ก€๋“ค์„ ๋ถ„๋ฅ˜ํ•ด๋ณด๊ธฐ

 

2. Boosting

training

- classifier์˜ sequence๋ฅผ ์ƒ์„ฑ (๊ฐ™์€ base learner ์ด์šฉ)

- ๊ฐ๊ฐ์˜ classifier๋Š” ์ด์ „ classifier์— ์˜์กด์ ์ด๋ฉฐ ๊ทธ๊ฒƒ์˜ ์—๋Ÿฌ๋ฅผ ์ฐพ๋Š” ๋ฐ ์ง‘์ค‘

- ์ด์ „ classifier์—์„œ ์ž˜๋ชป ์˜ˆ์ธก๋œ ์‚ฌ๋ก€๋“ค์€ ๋” ๋†’์€ ๊ฐ€์ค‘์น˜๋ฅผ ๋ถ€์—ฌ

testing

- classifier๋“ค์˜ ์—ฐ์†์œผ๋กœ ํŒ๋‹จ๋œ ๊ฒฐ๊ณผ๋ฅผ ๊ฒฐํ•ฉํ•˜์—ฌ test case์˜ ์ตœ์ข… ํด๋ž˜์Šค๋ฅผ ๋ถ€์—ฌ

 

Random Forest

- Bagging with decision trees + split-attribute selection on a random subspace

-> a tree variant that, at each split, chooses among a randomly drawn subset of candidate features -> random subset of features

* Step 1: create a bootstrapped dataset

* Step 2: grow a decision tree on the bootstrapped dataset -> at every step, consider only a random subset of the features

This diversifies the structure of the trees / the subset size is usually sqrt(number of features)

- At each new node, candidates are again drawn at random just as at the root (e.g. out of 3 columns in total, 1 is ignored)

* Step 3: repeat - keep generating new trees

* Step 4: inference

- check which option received the most votes

* Measuring accuracy

- typically about 1/3 of the original data does not appear in a bootstrapped dataset

-> validate on this data (the Out-Of-Bag sample)

 

Support Vector Machines

- an important way to build non-linear classifiers

- seeks the maximum-margin linear separator between two classes

 

SVM vs Logistic Regression

- ๊ณตํ†ต์ : seperating plane

- ์ฐจ์ด์ : LR๋Š” ๋ชจ๋“  ๊ฐ’์— ๋Œ€ํ•ด์„œ ํ‰๊ฐ€ํ•˜์ง€๋งŒ, SVM์€ ๊ฒฝ๊ณ„์— ์žˆ๋Š” ์ ๋งŒ ํ™•์ธํ•จ

- SVM: ๊ธฐ๋ณธ์ ์œผ๋กœ ์„ ํ˜•์ ์ด์ง€๋งŒ ๋” ๊ณ ์ฐจ์›์—๋„ ์ ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

 

๊ณ ์ฐจ์›์œผ๋กœ์˜ projection

- ์ฐจ์› ์ˆ˜๋ฅผ ๋Š˜๋ฆฌ๋ฉด ๋ชจ๋“  ๊ฒƒ์„ linearly separableํ•˜๊ฒŒ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค.

Kernels and non-linear functions

Feature Engineering

- domain-dependent data cleaning matters

* Z-scores, normalization

* creating bell-shaped distributions

* imputing missing values

* dimensionality reduction (SVD) -> can crush small but important signals that predict y together with the noise, hurting performance

* non-linear combination์˜ explicit incorporation (ex. products, ratios...)

 

Neural Networks

 


Nearest Neighbor Classification

- ์–ด๋–ค training example์ด target์— ๊ฐ€์žฅ ๊ฐ€๊นŒ์šด์ง€ ํ™•์ธํ•ด์„œ class label์„ ๋ถ™์ด๋Š” ๊ฒƒ

- distance function์„ ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ์ •์˜ํ•˜๋Š” ๊ฒƒ์ด ์ค‘์š”

- ์žฅ์ : simplicity, interpretability, non-linearity

- k-nearest neighbor

 

Distance Metrics

๋‹ค์Œ๊ณผ ๊ฐ™์€ ์กฐ๊ฑด์„ ๋งŒ์กฑํ•˜๋Š” ๊ฒฝ์šฐ

- d(x, y) >= 0 for all x, y (positivity)

- d(x, y) = 0 iff x = y (identity)

- d(x, y) = d(y, x) (symmetry)

- d(x, y) <= d(x, z) + d(z, y) (triangle inequality)

 

Not a Metric

๋‹ค์Œ๊ณผ ๊ฐ™์€ ์ž์—ฐ ์œ ์‚ฌ์„ฑ ์ธก์ • ๊ฐ’๋“ค์€ distance metric์ด ์•„๋‹ˆ๋‹ค.

- correlation coefficient (-1~1)

- cosine similarity, dot product

- mutual information measures

- cheapest airfare 

 

Euclidean Distance Metric

- c_i: weights in a weighted sum over the dimensions

- At a minimum, normalize so that the dimensions become comparable.

 

L_k Distance Norms

- ์œ ํด๋ฆฌ๋“œ ๊ฑฐ๋ฆฌ๋ฅผ ์ผ๋ฐ˜ํ™”ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์‹ ์‚ฌ์šฉ

- k=1 ์ผ ๋•Œ, Manhattan distance metric

- k=∞ ์ผ ๋•Œ, maximum component 

- k๋Š” largest and total dimensional difference ์‚ฌ์ด์˜ tradeoff๋ฅผ ์กฐ์ ˆ

p1(2, 0), p2(1.5, 1.5) ์ค‘ (0, 0)์—์„œ ๋” ๋จผ ์ ์€?

- k=1์ผ ๋•Œ, k=2์ผ ๋•Œ, k=∞์ผ ๋•Œ ๋จผ ์ ์ด ๋‹ค๋ฅด๋‹ค.

- distance metric์€ ์–ด๋–ค ์ ์ด ๋” ๊ฐ€๊นŒ์šด์ง€ ์„ค์ •

 

Circles for different k

- L_k circle์˜ ๋ชจ์–‘์€ ์›์ ์— ๋Œ€ํ•ด ์–ด๋– ํ•œ ์ ์ด ๋™๋“ฑํ•œ ๊ฒƒ๋“ค์ธ์ง€ ๋‚˜ํƒ€๋‚ธ ๊ฒƒ

 

Projections from higher dimensions

- Projection methods (e.g. SVD) reduce dimensionality to lower the representational complexity

- nearest neighbors found there tend to be more robust than in the original space

 

Regression / Interpolation by NN

- NN์ด๋ผ๋Š” ์ปจ์…‰์„ function interpolation์— ์ด์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

- k๊ฐœ์˜ ๊ทผ์ ‘ ์ ์˜ ๊ฐ’์˜ ํ‰๊ท ์„ ๋‚ด๋Š” ๋ฐฉ์‹

- weighted average scheme๋Š” (1) distance rank, (2) actual distances์— ์˜ํ•ด ๋”ฐ๋ผ ์ ๋“ค์„ ๋‹ค๋ฅด๊ฒŒ ๊ฐ’์„ ๋งค๊ธธ ์ˆ˜ ์žˆ๋‹ค.

- ๋ถ„๋ฅ˜์—๋„ ๋น„์Šทํ•œ ๋ฐฉ์‹์„ ์ด์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

 

Gender Classification by Height/Weight

- Larger k yields smoother decision boundaries

 

Seeking good analogies

- ๋งŽ์€ ์ง€์‹์˜ ๋ถ„์•ผ๋Š” analogy(๋น„์œ )์— ๊ธฐ์ดˆํ•ด ์žˆ๋‹ค.

* Law: ์–ด๋–ค ๋ฒ•์  ํŒ๋ก€๊ฐ€ ์ด ์‚ฌ๋ก€์™€ ๋น„์Šทํ•œ๊ฐ€?

* Medicine: ๋น„์Šทํ•œ ์ฆ์ƒ์„ ๊ฐ€์กŒ๋˜ ํ™˜์ž๋ฅผ ๊ณผ๊ฑฐ์— ์–ด๋–ป๊ฒŒ ์น˜๋ฃŒํ–ˆ๊ณ , ๊ทธ ํ™˜์ž๊ฐ€ ์‚ด์•„๋‚จ์•˜๋Š”๊ฐ€?

* Real Estate: ์ด์›ƒ ์ง€์—ญ์— ๋น„๊ต ๊ฐ€๋Šฅํ•œ ์ž์‚ฐ์ด ์–ด๋Š์ •๋„ ๊ฐ€๊ฒฉ์— ํŒ”๋ ธ๋Š”๊ฐ€?

 

Finding Nearest Neighbors

- n๊ฐœ์˜ ์ ์ด ์ฃผ์–ด์ง„ d์ฐจ์›์—์„œ NN์„ ๊ตฌํ•˜๋Š” ๊ฒƒ์€ O(nd)์˜ ์‹œ๊ฐ„๋ณต์žก๋„๋ฅผ ๊ฐ€์ง„๋‹ค.

- training set์ด ์ปค์ง€๊ฑฐ๋‚˜ ์ฐจ์›์ด ์ปค์ง€๋ฉด ์‹œ๊ฐ„์ด ๊ต‰์žฅํžˆ ๋งŽ์ด ๋“ ๋‹ค.

-> grid indices, kd-trees, Voronoi diagrams, locality sensitive hashing ๋“ฑ์„ ์ด์šฉ

 

Voronoi Diagrams / Kd-trees

- Voronoi diagrams: partition space into regions that share the same nearest neighbor

- kd-trees and related structures are efficient algorithms for nearest-neighbor search in low dimensions.

- For sufficiently high-dimensional data, exact NN search degenerates into linear search.

 

Grid files

- Bucketing points into a uniformly spaced grid is one way to group points by similarity

- As the dimension grows, the index becomes expensive -> all neighboring cells must also be searched, so the cost grows exponentially

(Left: Voronoi diagram / Right: grid file)

Locality Sensitive Hashing

- Hashing์€ ์ด์›ƒํ•˜๋Š” ์ ์ด ๊ฐ™์€ bucket์œผ๋กœ hash ๋˜์—ˆ์„ ๋•Œ NN search๋ฅผ ๋” ๋น ๋ฅด๊ฒŒ ์ง„ํ–‰๋  ์ˆ˜ ์žˆ๋„๋ก ํ•จ

* hashing: ๊ฒ€์ƒ‰์„ ๋น ๋ฅด๊ฒŒ ํ•˜๊ธฐ ์œ„ํ•ด hash key ๊ฐ’์„ ๊ฐ€์ ธ์™€์„œ ๋น ๋ฅด๊ฒŒ ์ฐพ๋„๋ก ํ•จ

- normal hashing์€ distance bucket์œผ๋กœ ์œ ์‚ฌํ•œ ์ ๋“ค์„ ํผ๋œจ๋ฆผ

- Locality Sensitive Hashing (LSH): ์ ์ด๋‚˜ ๋ฒกํ„ฐ a, b๋ฅผ ๊ฐ€์ ธ์™€์„œ a๊ฐ€ b์™€ ์ธ์ ‘ํ•˜๋ฉด h(a)=h(b)์ผ ๊ฒƒ์ด๋ผ๊ณ  ์ƒ๊ฐํ•จ.

 

LSH for points on a sphere

1) Pick a random plane through the origin

2) If two points are neighbors, they will (very likely) lie on the same side of that plane (left or right).

3) The L/R pattern over d random planes yields a d-bit LSH hash code (see the sketch below)

Walking through an example for the blue point:

When line 1 is drawn, the point is on its right, so the bit is 0

When line 2 is drawn, the point is on its left, so the bit is 1

Line 3: it is on the right, so 0

Line 4: it is on the left, so 1 --> the final LSH code is 0101

 

Network data

                           vertices                          edges
social network             people                            friendships
WWW                        pages                             hyperlinks
product/customer networks  products, customers (bipartite)   sales
genetic networks           genes                             interactions

 

Point Sets and Graphs

- A point set defines a graph -> add an edge (x, y) whenever points x and y are neighbors / connect pairs whose distance is under a threshold

- A graph defines a point set -> take the SVD of the adjacency matrix

e.g. run a 2-dimensional SVD and extract principal components 1 and 2

 

Classical Graph Algorithms

- ๋‘ vertices ์‚ฌ์ด์˜ ์ตœ์†Œ ๊ฑฐ๋ฆฌ๋Š” ๊ณง ๊ทธ๋ž˜ํ”„์—์„œ์˜ ๊ฑฐ๋ฆฌ๊ฐ€ ๋œ๋‹ค.

- ์ตœ์†Œ ๊ฑฐ๋ฆฌ๋ฅผ ์ฐพ๋Š” ํด๋ž˜์‹ํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜, connected components, spanning trees, cuts, flows, matchings, topological sorting ๋“ฑ์ด ์ ์šฉ๋  ์ˆ˜ ์žˆ์Œ

 

PageRank

- PageRank(v, G): the probability that a random walk on G stops at vertex v

- defined recursively

- converges quickly in practice

- e.g. even when an important neighbor splits its vote thinly, repeated updates eventually stop changing the values (convergence)

- used as an important measure of centrality and importance

- e.g. Wikipedia pages

- Design decisions in PageRank (see the sketch below):

* editing out seemingly irrelevant vertices/edges

* handling vertices whose outdegree is 0

* allowing random jumps: the damping factor
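A power-iteration sketch that folds in the two decisions above, outdegree-0 vertices and the damping factor (the toy 4-page graph is mine):

import numpy as np

def pagerank(adj, damping=0.85, iters=100):
    n = len(adj)
    A = np.asarray(adj, dtype=float)
    A[A.sum(axis=1) == 0] = 1.0 / n        # outdegree-0 vertices: jump anywhere
    P = A / A.sum(axis=1, keepdims=True)   # row-normalize into a walk matrix
    r = np.full(n, 1.0 / n)
    for _ in range(iters):                 # converges quickly in practice
        r = (1 - damping) / n + damping * (r @ P)
    return r

adj = [[0, 1, 1, 0],                       # 0 -> 1, 0 -> 2
       [0, 0, 1, 0],                       # 1 -> 2
       [1, 0, 0, 0],                       # 2 -> 0
       [0, 0, 1, 0]]                       # 3 -> 2
print(pagerank(adj))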

 

Clustering

- ์ •์˜: ์œ ์‚ฌ๋„๋ฅผ ๊ธฐ์ค€์œผ๋กœ ์ ์„ ๊ทธ๋ฃนํ™”ํ•˜๋Š” ๋ฌธ์ œ

- ์š”์†Œ๋“ค์ด ์ ์€ ์ˆ˜์˜ source๋กœ๋ถ€ํ„ฐ ์œ ๋ž˜ํ•œ ์ƒํ™ฉ์— ์ด๋Ÿฌํ•œ origin์„ ๋‚˜ํƒ€๋‚ด๊ธฐ ์œ„ํ•ด ํด๋Ÿฌ์Šคํ„ฐ๋ง์„ ์ด์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

- ill-defined problem: ๋ณด๋Š” ์‚ฌ๋žŒ์— ๋”ฐ๋ผ ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋‚˜๋ˆ„๋Š” ๊ธฐ์ค€์ด ๋‹ฌ๋ผ์งˆ ์ˆ˜ ์žˆ์Œ. ์ฃผ๊ด€์ ์ธ ๋ถ€๋ถ„

- ์˜ˆ์‹œ: ์œ ์ „์ž ๋ฐ์ดํ„ฐ ๊ทธ๋ฃนํ™”

- ํด๋Ÿฌ์Šคํ„ฐ๋ง์„ ์‚ฌ์šฉํ•˜๋Š” ์ด์œ 

* hypothesis development: ๋‚ด ๋ฐ์ดํ„ฐ์— ๊ตฌ๋ณ„๋˜๋Š” ๋ฐ์ดํ„ฐ๊ฐ€ ์–ผ๋งˆ๋‚˜ ์žˆ๋Š”๊ฐ€?

* Modeling over smaller groups: ๊ฐ๊ฐ์˜ ํด๋Ÿฌ์Šคํ„ฐ์— ๋Œ€ํ•ด ๊ตฌ๋ถ„๋˜๋Š” ์˜ˆ์ธก ๋ชจ๋ธ์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Œ

* Data reduction: ๊ฐ ํด๋Ÿฌ์Šคํ„ฐ์˜ centroid๋ฅผ ์ด์šฉํ•˜์—ฌ ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋Œ€์ฒด/๋Œ€ํ‘œ

* Outlier detection: ์–ด๋–ค ํ•ญ๋ชฉ์ด ํด๋Ÿฌ์Šคํ„ฐ์˜ ์ค‘์‹ฌ์œผ๋กœ๋ถ€ํ„ฐ ๋ฉ€๋ฆฌ ๋–จ์–ด์ ธ์žˆ๊ฑฐ๋‚˜ ์•„์ฃผ ์ž‘์€ ํด๋Ÿฌ์Šคํ„ฐ์— ๊ฐ‡ํ˜€์žˆ๋Š”๊ฐ€?

 

K-means Clustering

- ์ •์˜: k๊ฐœ์˜ ์ ์„ ์ค‘์‹ฌ์œผ๋กœ ์ •์˜ -> ๋ชจ๋“  ํ•ญ๋ชฉ๋“ค์„ ๊ฐ€์žฅ ๊ฐ€๊นŒ์šด ์ค‘์‹ฌ์— ํ• ๋‹น -> ์ค‘์‹ฌ์„ ๋‹ค์‹œ ๊ณ„์‚ฐ -> ์ด ๊ณผ์ •์„ ๊ณ„์† ๋ฐ˜๋ณตํ•˜์—ฌ ์ถฉ๋ถ„ํžˆ ์•ˆ์ •๋˜๋„๋ก ํ•จ

- ๋ฌธ์ œ์ : local optima์— ๊ฐ‡ํž ์œ„ํ—˜์ด ์žˆ์Œ

 

Centroids or Center Points?

- Centroid: not well defined when grouping on attributes that are not numeric, such as color or gender.

* such data is usually one-hot encoded

- Using the input example nearest the middle as the center means k-means can be applied whenever we have a meaningful distance function.

 

How Many Clusters?

- ์˜ฌ๋ฐ”๋ฅธ ํด๋Ÿฌ์Šคํ„ฐ์˜ ๊ฐœ์ˆ˜๋Š” ํด๋Ÿฌ์Šคํ„ฐ๋ง์„ ์ง„ํ–‰ํ•˜๊ธฐ ์ „์—๋Š” ์•Œ ์ˆ˜ ์—†๋‹ค.

- ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ๋”ํ•ด๊ฐˆ ๋•Œ, ์ ๊ณผ ์ค‘์‹ฌ ์‚ฌ์ด์˜ MSE ๊ฐ’์€ ์ ์ง„์ ์œผ๋กœ ๊ฐ์†Œํ•ด์•ผ ํ•œ๋‹ค.

- ์˜ฌ๋ฐ”๋ฅธ ํด๋Ÿฌ์Šคํ„ฐ์˜ ๊ฐœ์ˆ˜๋ฅผ ์ดˆ๊ณผํ•˜๊ฒŒ ๋˜๋ฉด MSE ๊ฐ’์ด ๊ฐ์†Œํ•˜๋Š” ์†๋„๊ฐ€ ๋Š๋ ค์ง„๋‹ค.

elbow method๋กœ ์ตœ์  ํด๋Ÿฌ์Šคํ„ฐ ๊ฐœ์ˆ˜ ๊ตฌํ•˜๊ธฐ

K-means์˜ ํ•œ๊ณ„

- nested cluster, long thin cluster์— ๋Œ€ํ•ด์„œ ์ข‹์ง€ ์•Š์€ ์„ฑ๋Šฅ์„ ๋ณด์ž„

- ๋ฐ˜๋ณต์ ์œผ๋กœ ํด๋Ÿฌ์Šคํ„ฐ๋ง์„ ์ง„ํ–‰ํ•˜๋Š” ๊ฒƒ์€ local optima๋ฅผ ํ”ผํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•จ (random seed ๊ฐ’์„ ๋‹ฌ๋ฆฌ ํ•จ)

 

Expectation Maximization (EM)

- EM ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ๋Œ€ํ‘œ์ ์ธ ์˜ˆ์‹œ๊ฐ€ K-means

- E-step: ์ถ”์ •๋˜๋Š” ํด๋Ÿฌ์Šคํ„ฐ์— ์ ์„ ํ• ๋‹น

- M-step: ํ• ๋‹น์„ ์ด์šฉํ•˜์—ฌ parameter ์ถ”์ •์„ ํ–ฅ์ƒ์‹œํ‚ด

 

Agglomerative Clustering

- This bottom-up approach repeatedly merges the two nearest clusters.

 

๊ฐ€์žฅ ๊ฐ€๊นŒ์šด ํด๋Ÿฌ์Šคํ„ฐ๊ฐ€ ์˜๋ฏธํ•˜๋Š” ๊ฒƒ

- ๋ณดํ†ต Average, Centroid ๋ฐฉ์‹์„ ์ฑ„ํƒ

 

Linkage Criteria

Advantages of Cluster Hierarchies

- organizes clusters and subclusters

- visualizes the clustering process

- lets new items be clustered efficiently

 

์–ด๋–ค ํด๋Ÿฌ์Šคํ„ฐ๋ง ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์‚ฌ์šฉํ• ๊นŒ?

- ์˜ฌ๋ฐ”๋ฅธ ๊ฑฐ๋ฆฌ ํ•จ์ˆ˜ ์‚ฌ์šฉ

- ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ๋ณ€์ˆ˜๋ฅผ ์ •๊ทœํ™”

- ์ตœ์ข… ํด๋Ÿฌ์Šคํ„ฐ๋ฅผ ์‹œ๊ฐํ™”ํ•ด์„œ ์ œ๋Œ€๋กœ ๋˜์—ˆ๋Š”์ง€ ํ™•์ธํ•˜๊ธฐ


Linear Regression

- n๊ฐœ์˜ ์ ์ด ์ฃผ์–ด์กŒ์„ ๋•Œ, ๊ฐ€์žฅ ๊ทผ์‚ฌ๋ฅผ ์ž˜ ํ•˜๊ฑฐ๋‚˜ ์ž˜ ๋งž๋Š” ์ง์„ ์„ ์ฐพ๋Š” ๊ฒƒ

 

Error in Linear Regression

- residual error: the difference between the predicted and actual values

- Least squares regression minimizes the sum of the squared residuals over all points.

-> chosen because it has a nice closed form and squaring ignores the sign

 

Contour plots - gradient descent

 

Linear function์„ ์‚ฌ์šฉํ•˜๋Š” ์ด์œ 

- ์ดํ•ดํ•˜๊ธฐ ์‰ฌ์›€

- default model์— ์ ํ•ฉ

* ์ผํ•œ ์‹œ๊ฐ„์— ๋”ฐ๋ผ ๊ธ‰์—ฌ๊ฐ€ ์ฆ๊ฐ€ / ์ง€์—ญ์ด ์ปค์ง์œผ๋กœ์จ ์„ ํ˜•์ ์œผ๋กœ ์ง‘๊ฐ’์ด ์ƒ์Šน / ๋จน์€ ์Œ์‹์— ๋”ฐ๋ผ ๋ชธ๋ฌด๊ฒŒ๊ฐ€ ์„ ํ˜•์ ์œผ๋กœ ์ฆ๊ฐ€

 

With multiple variables

- Arrange the x variables and the y values as matrices and solve for the theta vector: theta = (X^T X)^(-1) X^T y (the normal equation). See the sketch below.

 

Better Regression Models

1. Remove outliers

- Because residuals are weighted quadratically, outliers can distort the regression fit.

- Removing such points yields a better-fitting model.

 

2. nonlinear function fitting

- ๊ธฐ๋ณธ์ ์œผ๋กœ Linear regression์€ ์ง์„ ์ด์ง€๋งŒ, x^2, sqrt(x) ๋“ฑ์„ ์ด์šฉํ•˜๋ฉด ๊ณก์„ ์„ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค.

- ์ž„์˜๋กœ polynomial, exponential, logarithm ๋“ฑ์„ ์ ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค.

cf) ๋”ฅ๋Ÿฌ๋‹์€ raw feature์—์„œ ์Šค์Šค๋กœ ์›ํ•˜๋Š” ๊ฒƒ์„ ๋ฝ‘์•„๋‚ผ ์ˆ˜ ์žˆ์–ด feature engineering์— ๋Œ€ํ•œ ์ˆ˜์š”๊ฐ€ ์ ๋‹ค. ์ตœ๊ทผ์—๋Š” prompt engineering์„ ์ค‘์š”ํ•˜๊ฒŒ ์—ฌ๊ธด๋‹ค.

 

3. feature/target scaling

- ๋„“์€ ๋ฒ”์œ„์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค๋ฃจ๊ฒŒ ๋˜๋ฉด coefficient๊ฐ€ ์ง€๋‚˜์น˜๊ฒŒ ์ปค์งˆ ์ˆ˜ ์žˆ๋‹ค.

- Z-score ๋“ฑ์œผ๋กœ ์Šค์ผ€์ผ์„ ์กฐ์ •ํ•  ํ•„์š”๊ฐ€ ์žˆ๋‹ค.

- power law์ด ์ ์šฉ๋˜๋Š” ์ˆ˜์ž… ๋“ฑ์˜ ๋ฐ์ดํ„ฐ์—์„œ๋Š” ํŠนํžˆ ์ค‘์š”ํ•˜๋‹ค.

- x๊ฐ’์„ log(x), sqrt(x) ๋“ฑ์œผ๋กœ ๋Œ€์ฒดํ•  ์ˆ˜ ์žˆ๋‹ค.

- feature๊ฐ€ ์ •๊ทœ๋ถ„ํฌ ํ˜•ํƒœ๋ผ๋ฉด, power law distribution์„ ๊ฐ–๋Š” ๋ฐ์ดํ„ฐ๋Š” linearํ•œ ์กฐํ•ฉ์œผ๋กœ ๋‚˜ํƒ€๋‚ด๊ธฐ ์–ด๋ ต๋‹ค.

- Z normalization์œผ๋กœ ๋ณ€ํ˜•๋œ ๋ฐ์ดํ„ฐ๋ฅผ ํ•™์Šตํ•œ ํ›„, ๊ฒฐ๊ณผ๋Š” ์›๋ž˜ ์ƒํƒœ๋กœ ๋Œ๋ ค๋‘” ํ›„ ๋‚˜ํƒ€๋‚ด๋ฉด ๋œ๋‹ค.

 

4. Remove highly correlated variables

- If two variables are highly correlated, the second gives us no new information; it only adds confusion.

--> so it can be removed.

- Build a covariance matrix to find the features to remove.

 

Closed form solution์˜ ๋ฌธ์ œ

- ์„ธํƒ€ ๊ฐ’์„ ๊ตฌํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ํฐ ๋ฐ์ดํ„ฐ์—์„œ๋Š” ์—ฐ์‚ฐ ์†๋„๊ฐ€ ์—„์ฒญ ๋Š๋ ค์ง„๋‹ค. - O(n^3)

- linear algebra๋Š” ๋‹ค๋ฅธ ๊ณต์‹์— ์ ์šฉํ•˜๊ธฐ ์–ด๋ ต๋‹ค.

- gradient descent ๋ฐฉ์‹์„ ์„ ํƒํ•˜๊ฒŒ ๋งŒ๋“ ๋‹ค.

 

Lines in Parameter Space

- The error function J is convex.

 

Gradient Descent Search

- convex: 1๊ฐœ์˜ local/global minima ๋ฅผ ๊ฐ–๋Š” ๊ณต๊ฐ„

- convexํ•œ ๊ณต๊ฐ„์—์„œ๋Š” minima๋ฅผ ์ฐพ๊ธฐ ์‰ฝ๋‹ค. -> ๊ทธ๋ƒฅ ๊ฒฝ์‚ฌ๋ฅผ ๋”ฐ๋ผ์„œ ๋‚ด๋ ค๊ฐ€๊ธฐ๋งŒ ํ•˜๋ฉด ์ฐพ์„ ์ˆ˜ ์žˆ๋‹ค.

- ์–ด๋–ค ์ ์—์„œ ๋‚ด๋ ค๊ฐ€๋Š” ๋ฐฉํ–ฅ์„ ์ฐพ๋Š” ๋ฐฉ๋ฒ•์€, ๋ฏธ๋ถ„์„ ํ•ด์„œ tangent line์„ ๋”ฐ๋ผ ๊ฐ€๋ฉด ๋จ

--> (x+dx, f(x+dx))์ ์„ ์ฐพ์€ ํ›„, (x, f(x)) ์ ์— fit
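A bare-bones descent loop on a convex J (function and constants chosen for illustration):

import numpy as np

def gradient_descent(grad, theta0, lr=0.1, steps=100):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta -= lr * grad(theta)       # step against the local slope
    return theta

# J(theta) = (theta - 3)^2 is convex, so walking downhill finds its one minimum
print(gradient_descent(lambda t: 2 * (t - 3), theta0=[0.0]))   # -> ~3.0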

 

Batch Gradient Descent

- Batch: ๊ฐ๊ฐ์˜ ๊ฒฝ์‚ฌํ•˜๊ฐ•์—์„œ ๋ชจ๋“  training sample์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ

- ํ†ต์ƒ์ ์œผ๋กœ๋Š” batch size๋ฅผ ์ค„์—ฌ๊ฐ€๋ฉฐ ๊ฒฝ์‚ฌํ•˜๊ฐ•

 

Local Optima

- If J is not convex, following gradient descent can leave us stuck in a local optimum.

 

Effect of Learning Rate / Step Size

- ๋„ˆ๋ฌด ์ž‘์€ ์Šคํ…์œผ๋กœ ์›€์ง์ด๋ฉด optima์— convergenceํ•˜๋Š” ์†๋„๊ฐ€ ๋Šฆ๋‹ค.

- ๋„ˆ๋ฌด ํฐ ์Šคํ…์œผ๋กœ ์›€์ง์ด๋ฉด ๋ชฉํ‘œ์— ๋„๋‹ฌํ•˜์ง€ ๋ชปํ•  ์ˆ˜ ์žˆ๋‹ค.

- ์ ์ ˆํ•œ step size๋ฅผ ๊ตฌํ•˜๋ ค๋ฉด?

-> step size๊ฐ€ ์ ์ ˆํ•œ์ง€ ํŒ๋‹จํ•˜๊ณ , ๋„ˆ๋ฌด ๋Šฆ๋‹ค๋ฉด multiplicative factor (3์˜ ์ง€์ˆ˜๋ฐฐ ๋“ฑ๋“ฑ) ๋ฅผ ์ด์šฉํ•˜์—ฌ ๋Š˜๋ ค๋ณด๊ธฐ

-> ๋„ˆ๋ฌด ํฌ๋‹ค๋ฉด (1/3์˜ ์ง€์ˆ˜๋ฐฐ ๋“ฑ) ์ค„์—ฌ๋ณด๊ธฐ

 

Stochastic Gradient Descent

- batch size๋„ hyperparameter์ด๋‹ค.

- ๋ชจ๋“  example์ด ์•„๋‹Œ ์ผ๋ถ€๋งŒ ์ด์šฉํ•˜์—ฌ derivative๋ฅผ ๊ณ„์‚ฐํ•˜๋Š” ๊ฒƒ๋„ ๋ฐฉ๋ฒ•

 

Regularization

- Add a lambda term to the J function that keeps the coefficients small

- As lambda approaches 0 the error shrinks; as lambda approaches infinity only theta_0 remains in the model.

- Fitting the data as closely as possible reduces the error, but the penalty term of the formula grows.

 

Interpreting/Penalizing Coefficients

- Squared coefficient์˜ ํ•ฉ์„ Penalizing ํ•˜๋Š” ๊ฒƒ์€ ridge regression or Tikhonov regularization

- coefficient์˜ ์ ˆ๋Œ“๊ฐ’์„ penalizingํ•˜๋Š” ๊ฒƒ์€ LASSO regression์ด๋‹ค.

* L1 metric

* L2: ๊ฐ ์ฐจ์›์— ๋Œ€ํ•œ ์ œ๊ณฑ์˜ ํ•ฉ -> ์œ ํด๋ฆฌ๋“œ ๊ฑฐ๋ฆฌ 

 

LASSO (Least Absolute Shrinkage and Selection Operator)

์ฐจ์ด์ ์€ ์ ˆ๋Œ“๊ฐ’!

- sparse solution์„ ์„ ํƒํ•˜๋Š” ๊ฒฝํ–ฅ

- ๋ณ€์ˆ˜ ์„ ํƒ ๋ฐ regularization ๊ธฐ๋Šฅ

- interpretability๋ฅผ ํ–ฅ์ƒ
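A scikit-learn sketch contrasting the two penalties (the data and alpha values are illustrative):

import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.standard_normal(200)  # only 2 real features

ridge = Ridge(alpha=1.0).fit(X, y)   # penalizes the sum of squared coefficients
lasso = Lasso(alpha=0.1).fit(X, y)   # penalizes the sum of absolute values
print(np.round(ridge.coef_, 2))      # all 10 shrunk, but nonzero
print(np.round(lasso.coef_, 2))      # most driven exactly to zero -> sparse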

 

What is the right lambda?

- A larger lambda pushes toward small parameters -> e.g. sets them all to zero

- A smaller lambda frees all the parameters to reduce training error

- Keep the balance between overfitting and underfitting

 

Normal Form with Regularization

- The normal-form equation can be generalized to handle regularization.

- Alternatively, gradient descent can be used.

 

Classification

- ๋ถ„๋ฅ˜๋Š” ๋‚จ์ž/์—ฌ์ž, ์ŠคํŒธ/์ผ๋ฐ˜๋ฉ”์ผ, ์•…์„ฑ/์–‘์„ฑ ์ข…์–‘ ๋“ฑ์˜ ๊ตฌ๋ถ„์— ์ด์šฉ

- input record์— ๋ผ๋ฒจ์„ ๋ถ€์—ฌ

 

Regression for Classification

- linear regression์„ ์ด์šฉํ•˜์—ฌ ๋ถ„๋ฅ˜ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ๋‹ค.

- ์ด๋•Œ ๊ฐ๊ฐ์˜ ๋ถ„๋ฅ˜์— ๋Œ€ํ•ด 0/1์˜ ์ด์ง„ ๋ถ„๋ฅ˜๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค.

- positive = 1, negative = 0

- regression ์„ ์€ ์ด๋Ÿฌํ•œ ๋ถ„๋ฅ˜๋ฅผ ๋‚˜๋ˆŒ ๊ฒƒ์ด๋‹ค.

- ๊ทน๋‹จ์ ์ธ +, - ์‚ฌ๋ก€๋ฅผ ์ถ”๊ฐ€ํ•  ๊ฒฝ์šฐ ์„ ์ด ๋ฐ”๋€๋‹ค.

 

Decision Boundaries

- Feature space์—์„œ ์„ ์„ ํ†ตํ•ด ํด๋ž˜์Šค๋ฅผ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ๋‹ค.

- Logistic Regression: ๊ฐ€์žฅ ์ ํ•ฉํ•œ ๋ถ„๋ฅ˜ ์„ ์„ ์ฐพ๊ธฐ ์œ„ํ•œ ๋ฐฉ๋ฒ•

sigmoid, logistic

 

Cost for Positive/Negative Cases

- ์„ธํƒ€ ๊ฐ’์„ ์ค„์ด๋Š” ๊ฒƒ์ด ๋ชฉํ‘œ์ž„

- ์ƒˆ๋กœ์šด x์— ๋Œ€ํ•œ ์˜ˆ์ธก

 

Logistic Regression via Gradient Descent

- loss function์ด convexํ•˜๋ฏ€๋กœ, ๊ฒฝ์‚ฌ ํ•˜๊ฐ•์„ ํ†ตํ•ด ๊ฐ€์žฅ ์ ํ•ฉํ•œ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ฐพ์„ ์ˆ˜ ์žˆ๋‹ค.

-> ๋”ฐ๋ผ์„œ ๋‘ ํด๋ž˜์Šค์— ๋Œ€ํ•œ linear seperator๋ฅผ ์ฐพ์„ ์ˆ˜ ์žˆ๋‹ค.

 

Logistic Gender Classification

Red region: 229 women / 63 men, Blue region: 223 men / 65 women

Problems in Classification

1. Balanced Training Classes

- With 1 positively labeled example and 100,000 negatives, no sensible result can come out.

- ๊ฐ๊ฐ์˜ ๋ผ๋ฒจ ๋ฐ์ดํ„ฐ ์ˆ˜๋ฅผ ๋งž์ถ”์ž.

* minority class์— ํ•ด๋‹นํ•˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์ฐพ๊ธฐ ์œ„ํ•ด ๋” ๋…ธ๋ ฅํ•˜๊ธฐ

* ๋” ํฐ class์˜ ์š”์†Œ๋ฅผ ๋ฒ„๋ฆฌ๊ธฐ

* minority class์— ๊ฐ€์ค‘์น˜ ๋ถ€์—ฌ -> overfitting ์กฐ์‹ฌํ•˜๊ธฐ

* small class์— ๋Œ€ํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ๋ณต์ œํ•˜๊ธฐ -> random perturbation (๋ณต์›์ถ”์ถœ๋กœ ์—ฌ๋Ÿฌ๊ฐœ ๋ฝ‘์•„์„œ ์•™์ƒ๋ธ” ์ง„ํ–‰)

 

2. Multi-Class Classifications

- ๋ชจ๋“  ๋ถ„๋ฅ˜๊ฐ€ ์ด์ง„์ ์ด์ง€๋Š” ์•Š์Œ.

- ordering ๊ด€๊ณ„๊ฐ€ ์—†๋Š” ๋ถ„๋ฅ˜์— ๋Œ€ํ•ด์„œ๋Š” ๋‹จ์ˆœํžˆ ์ˆซ์ž๋กœ ํ‘œํ˜„ํ•˜์—ฌ ๋ถ„๋ฅ˜๋ฅผ ์ง„ํ–‰ํ•˜๋ฉด ์•ˆ ๋œ๋‹ค.

- ordinal data์— ๋Œ€ํ•ด์„œ๋งŒ ์ˆซ์ž๋กœ ๋ผ๋ฒจ๋ง ๊ฐ€๋Šฅ. ์•„๋‹Œ ๊ฒฝ์šฐ ์› ํ•ซ ์ธ์ฝ”๋”ฉ ์ด์šฉ.

 

cf) One Versus All Classifiers

- ๋‹ค์ค‘ ๋…๋ฆฝ ์ด์ง„๋ถ„๋ฅ˜๊ธฐ๋ฅผ ์ด์šฉํ•˜์—ฌ multiclass classifier๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค.

- ๊ฐ ๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ์˜ˆ์ธกํ•œ ๊ฐ€๋Šฅ์„ฑ ์ค‘ ๊ฐ€์žฅ ํฐ ๊ฒƒ์„ ์ฑ„ํƒ.

 

3. Hierarchical Classification

- ์œ ์‚ฌ์„ฑ์„ ์ด์šฉํ•ด ๊ทธ๋ฃน์œผ๋กœ ๋‚˜๋ˆ„๊ณ  taxonomy๋ฅผ ๋งŒ๋“œ๋Š” ๊ฒƒ์€ ํšจ์œจ์ ์ธ class ๊ฐœ์ˆ˜๋ฅผ ์ค„์ผ ์ˆ˜ ์žˆ๊ฒŒ ํ•œ๋‹ค.

- top-down tree๋ฅผ ์ด์šฉํ•ด ๋ถ„๋ฅ˜


Linear Algebra

- the mathematics of matrices

- plays an important role in data science

 

What an n * m matrix can represent

                      rows       columns
Data                  objects    features
Geometric point sets  points     dimensions
Systems of equations  equations  coefficients of each variable

- Graphs/Networks: M[i, j] = the number of edges from vertex i to vertex j

- Vectors: any row, column, or d*1 matrix

 

Vector ์‚ฌ์ด์˜ ๊ฐ 

- ๋ฒกํ„ฐ A์™€ B ์‚ฌ์ด์˜ ๊ฐ๋„

- cos(0) = 1 ---> perfect similarity = 0

- cos(pi/2) = 0 ---> ๊ด€๋ จ์ด ์—†๋‹ค

- cos(pi) = -1 ---> perfect anticorrelation

==> cos = correlation of mean zero variables

- unit vector์— ๋Œ€ํ•ด์„œ ๊ทธ ๋ฒกํ„ฐ์˜ ํฌ๊ธฐ๋Š” 1์ด๋ฏ€๋กœ, dot product๋กœ ์ •์˜๋œ๋‹ค.
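The three cases above, checked with a small helper (a sketch; cosine is my function name):

import numpy as np

def cosine(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine([1, 0], [1, 0]))    #  1.0 -> identical direction
print(cosine([1, 0], [0, 1]))    #  0.0 -> orthogonal, unrelated
print(cosine([1, 0], [-1, 0]))   # -1.0 -> perfect anticorrelation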

 

Transpose

- Definition: converting an a*b matrix into a b*a matrix

- addition and transposition

-> e.g. how to add A and B under the assumption B = A^T

- the a values can be adjusted

 

Matrix multiplication & Dot Products

- A*B๋Š” ๊ฐ™์€ ๋‚ด๋ถ€ ์ฐจ์›์„ ๊ณต์œ ํ•ด์•ผ๋งŒ ๊ณ„์‚ฐ ๊ฐ€๋Šฅํ•˜๋‹ค.

- ๊ฒฐ๊ณผํ–‰๋ ฌ์˜ ๊ฐ ์š”์†Œ๋Š” row/column ๋ฒกํ„ฐ์˜ dot product 

- dot product๋Š” ๋‘ ๊ฐœ์˜ ๋ฒกํ„ฐ๊ฐ€ ์–ผ๋งˆ๋‚˜ ์œ ์‚ฌํ•œ์ง€ ์ธก์ •ํ•˜๋Š” ๋ฐฉ๋ฒ•์ด๋‹ค.

- ํ–‰๋ ฌ์˜ ๊ณฑ์…ˆ์€ ๊ฒฐํ•ฉ๋ฒ•์น™์€ ์„ฑ๋ฆฝํ•˜๋‚˜ ๊ตํ™˜๋ฒ•์น™์€ ์„ฑ๋ฆฝํ•˜์ง€ ์•Š๋Š”๋‹ค.

 

Multiplying Feature Matrices

- Suppose matrix A is an n*d data matrix.

- n: documents, d: terms

- The matrix C = AA^T is an n*n matrix of dot products - the similarities between the points (documents).

- The matrix D = A^T A is a d*d matrix of dot products - the similarities between the features (terms).

- These can be interpreted as covariance matrices.

Example: car - automobile: after computing the D matrix, the cell relating these two terms has a higher value than the other cells - they are highly similar.

 

Interpreting Matrix Multiplication

- Multiplying 0/1 adjacency matrices yields a matrix that counts the paths between pairs of vertices.

- Multiplication by permutation matrices reorders the rows/columns.

 

Matrix Rank

- ์ •์˜: ์„ ํ˜• ๋…๋ฆฝ์ ์ธ row์˜ ์ˆ˜๋ฅผ ์ธก์ •ํ•˜๋Š” ๊ฒƒ

- n*n matrix๋Š” ์ˆœ์œ„๊ฐ€ n์ด ๋˜์–ด์•ผ ํ•œ๋‹ค.

Rank 1 ์ธ Matrix - ๋‘ ์‹์ด ์‚ฌ์‹ค์ƒ ๊ฐ™์€ ์˜๋ฏธ์ด๋ฏ€๋กœ ํ’€ ์ˆ˜ ์—†๋‹ค.
Rank 2์ธ matrix

* ๋žญํฌ๋ž€ ํ–‰๋ ฌ์˜ ์—ด๋“ค๋กœ ์ƒ์„ฑ๋  ์ˆ˜ ์žˆ๋Š” ๋ฒกํ„ฐ ๊ณต๊ฐ„์˜ ์ฐจ์›.

์ฐธ๊ณ ๋ฌธํ—Œ: https://blog.naver.com/sw4r/221416614473

 


 

2=1 ์ฆ๋ช…์œผ๋กœ๋ถ€ํ„ฐ ์•Œ ์ˆ˜ ์žˆ๋Š” ๊ฒƒ

- ์ฆ๋ช…์— ์˜ค๋ฅ˜๊ฐ€ ์ƒ๊ธฐ๋Š” ์›์ธ: 0์œผ๋กœ ๋‚˜๋ˆ„๋Š” ๊ฒƒ์ด ๋ถˆ๊ฐ€๋Šฅํ•˜๋‹ค๋Š” ๊ฒƒ์„ ๊ฐ„๊ณผํ•จ

- ์„ ํ˜• ๋Œ€์ˆ˜์—์„œ singular matrix๋„ ํฌํ•จ๋œ๋‹ค.

 

Dividing by a Matrix

- inverse operation: one that takes x back down to the identity element

- the inverse of multiplication is division

- the inverse of addition is subtraction

 

Matrix Inversion

- A^-1: the matrix with A * A^-1 = I (I = identity matrix)

- If matrix A has an inverse, it can be computed using Gaussian elimination.

* Gaussian elimination:

https://m.blog.naver.com/siri0831/222033492473

 


 

Matrix Inversion and Linear Systems

- Ax=b ์‹์— A์˜ ์—ญํ–‰๋ ฌ A^-1์„ ๊ณฑํ•˜๋ฉด ๋‹ค์Œ์˜ ์‹์ด ๋‚˜ํƒ€๋‚œ๋‹ค.

- ์„ ํ˜•์‹์˜ ํ•ด๋ฅผ ๊ตฌํ•˜๋Š” ๊ฒƒ์€ matrix์˜ ์—ญํ–‰๋ ฌ์„ ๊ณฑํ•˜๋Š” ๊ฒƒ๊ณผ ๊ฐ™๋‹ค.

Principle Component Analysis (PCA)

- ๋ฐ์ดํ„ฐ๋ฅผ 2๊ฐœ์˜ ์ถ•์œผ๋กœ projectionํ•˜๋Š” ๊ฒƒ์ด๋‹ค.

- ์–ด๋–ค ๊ธฐ์ค€์œผ๋กœ ์ถ•์„ ๊ณจ๋ผ์•ผ ํ•  ๊ฒƒ์ธ๊ฐ€?

1) 1์ฐจ์› projection์„ ํ–ˆ์„ ๋•Œ ๋ฐ์ดํ„ฐ ๋ถ„์‚ฐ์ด ๊ฐ€์žฅ ํฐ ๊ฒƒ์„ ์„ ํƒํ•œ๋‹ค.

* ์ด์œ : ์ •๋ณด ์œ ์‹ค์„ ์ตœ์†Œํ™”ํ•ด์•ผํ•˜๋ฏ€๋กœ

2) ์ฒซ๋ฒˆ์งธ ์ถ•๊ณผ ์ง๊ตํ•˜๋ฉด์„œ ๋ถ„์‚ฐ์ด ๊ฐ€์žฅ ํฐ ์ถ•์„ ์„ ํƒํ•œ๋‹ค. --> ์ƒˆ๋กœ์šด ๊ณต๊ฐ„์„ ๋งŒ๋“ค์–ด๋‚ธ๋‹ค.
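A numpy sketch of both criteria via the SVD (the toy covariance makes the third axis nearly pure noise):

import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0, 0],
                            [[3, 2, 0], [2, 2, 0], [0, 0, 0.1]], size=200)

Xc = X - X.mean(axis=0)                  # PCA works on centered data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T                       # rows of Vt: orthogonal axes, by variance
print(S ** 2 / (len(X) - 1))             # variance captured along each axis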

 

Singular Value Decomposition (SVD)

- used in image processing and elsewhere

- V*, U* can be thought of as the inverses of V and U (for orthogonal matrices, the transpose is the inverse)

- https://darkpgmr.tistory.com/106

 


 

Reconstructing Lincoln
