
Probability
- experiment: a procedure that yields one of a set of possible outcomes
- sample space S: the set of all possible outcomes of an experiment
- event E: a specified subset of the outcomes of an experiment
- p(s): probability of an outcome s
0 <= p(s) <= 1 and the sum of p(s) over all s in S = 1; numbers that satisfy both conditions are probabilities
- probability of an event E: the sum of the probabilities of the outcomes in E
P(E) = sum of p(s) over s in E = 1 - P(E^C)
- random variable V: a numerical function on the outcomes of a probability space
- expected value of a random variable V
E[V] = sum of p(s) * V(s) over all s in S
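
The definitions above can be checked on a tiny sample space. The sketch below assumes a single fair six-sided die (an illustrative choice, not from the notes); the helper names `prob` and `expected_value` are hypothetical.

```python
from fractions import Fraction

# Assumed example: one fair six-sided die, so every outcome has p(s) = 1/6.
sample_space = [1, 2, 3, 4, 5, 6]
p = {s: Fraction(1, 6) for s in sample_space}


def prob(event):
    """P(E): sum of p(s) over the outcomes s in the event E."""
    return sum(p[s] for s in event)


def expected_value(V):
    """E[V]: sum of p(s) * V(s) over the whole sample space."""
    return sum(p[s] * V(s) for s in sample_space)


even = {2, 4, 6}
complement = set(sample_space) - even

total = sum(p.values())                    # must be 1 for a valid p
p_even = prob(even)                        # direct sum over E
p_even_via_complement = 1 - prob(complement)  # 1 - P(E^C)
mean_face = expected_value(lambda s: s)    # V(s) = face value
print(total, p_even, p_even_via_complement, mean_face)
```

Exact fractions are used so that the two ways of computing P(E) can be compared without rounding.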

Probability vs. Statistics
- Probability deals with predicting the likelihood of future events.
- Statistics analyzes the frequency of past events and is used to make sense of observations of the real world.

Compound Events and Independence
- independent: two events A and B are independent if and only if they satisfy
P(A ∩ B) = P(A) * P(B)


Conditional Probability P(A|B)

- Definition: the probability that A occurs given that B has occurred, P(A|B) = P(A ∩ B) / P(B)
- When A and B are independent, P(A|B) = P(A)
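
As a rough illustration of independence and conditional probability, the sketch below enumerates two fair dice (an assumed example). The event functions are hypothetical names chosen for this sketch.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two fair dice, 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(E) as the fraction of outcomes for which event(outcome) is true."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B)."""
    return prob(lambda o: a(o) and b(o)) / prob(b)


def first_is_six(o):
    return o[0] == 6


def second_is_even(o):
    return o[1] % 2 == 0


def sum_at_least_10(o):
    return o[0] + o[1] >= 10


# Independent pair: conditioning on B leaves P(A) unchanged.
p_a = prob(first_is_six)
p_a_given_b = cond_prob(first_is_six, second_is_even)

# Dependent pair: knowing the first die is a six changes the odds.
p_c = prob(sum_at_least_10)
p_c_given_a = cond_prob(sum_at_least_10, first_is_six)
print(p_a, p_a_given_b, p_c, p_c_given_a)
```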

Bayes Theorem
- Used to reverse the direction of a dependence: P(B|A) = P(A|B) * P(B) / P(A)
- Lets you compute a posterior probability from prior probabilities

Ex) B: the event that an email is spam / A: the event of the email that was received
Used to find the probability that a received email is spam
- P(A) and P(B) are the prior probabilities of the respective events.
- Computing the posterior exactly is hard, so an approximation such as naive Bayes is used.
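
A minimal sketch of the spam calculation with Bayes' theorem. All numbers are made up for illustration, and A is assumed here to mean "the email contains a particular word", which the notes do not specify.

```python
def bayes_posterior(prior_b, likelihood_a_given_b, evidence_a):
    """P(B|A) = P(A|B) * P(B) / P(A)."""
    return likelihood_a_given_b * prior_b / evidence_a


# Hypothetical numbers, not from the notes.
p_spam = 0.2               # P(B): prior probability that an email is spam
p_word_given_spam = 0.5    # P(A|B): the word appears in a spam email
p_word_given_ham = 0.05    # P(A|not B): the word appears in a non-spam email

# P(A) by the law of total probability.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = bayes_posterior(p_spam, p_word_given_spam, p_word)
print(p_word, p_spam_given_word)
```

With a single piece of evidence the posterior is easy to compute; the approximation becomes necessary when A stands for many words at once.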

Distributions of Random Variables
- Random variables: numerical functions whose values occur with associated probabilities
- Probability density functions (pdfs): represent an RV, for example as a histogram
- Cumulative distribution functions (cdfs): running sum of the pdf
- The pdf and the cdf carry the same information.

- A cdf can give a misleading picture of growth rate: the curve appears to climb very fast, yet the pdf may show that the growth in each individual year is not that high.
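
The pdf/cdf relationship can be sketched on a discrete histogram. The yearly counts below are hypothetical; the point is only that the cdf is a running sum and that differencing it recovers the pdf.

```python
from itertools import accumulate

# Hypothetical counts per year (roughly flat year to year).
counts_per_year = [10, 12, 11, 13, 12]

total = sum(counts_per_year)
pdf = [c / total for c in counts_per_year]  # share of the total in each year
cdf = list(accumulate(pdf))                 # running sum of the pdf

# The cdf holds the same information: differencing it gives the pdf back.
recovered_pdf = [cdf[0]] + [b - a for a, b in zip(cdf, cdf[1:])]

print(pdf)
print(cdf)
```

The cdf rises steadily toward 1 even though the per-year values barely change, which is the misleading impression described above.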

 

Descriptive Statistics

- Central tendency measures: describe the center around which the data is distributed

- Variation or variability measures: describe how spread out the data is

 

Centrality Measure

(Arithmetic) Mean

- Strength: meaningful for symmetric distributions without outliers (e.g., normally distributed quantities such as height and weight)

 

median: middle value

- suited to skewed distributions

- suited to data with outliers

- e.g., wealth and income

 

mode: the most frequently occurring element

- It may not be close to the center.

 

geometric mean: nth root of the product of n values

- The geometric mean is always less than or equal to the arithmetic mean.

- It is more sensitive to values near zero.

- Used to average ratios
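
The four centrality measures can be compared with Python's standard `statistics` module. The income figures and ratios below are hypothetical and chosen so that one outlier pulls the mean away from the median.

```python
import statistics

# Hypothetical incomes with a single large outlier.
incomes = [30, 32, 35, 35, 40, 1000]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle value, robust to it
mode_income = statistics.mode(incomes)      # most frequent element

# Hypothetical ratios: a doubling followed by a halving.
ratios = [2.0, 0.5]
arithmetic = statistics.mean(ratios)        # suggests net growth
geometric = statistics.geometric_mean(ratios)  # reflects no net change

print(mean_income, median_income, mode_income)
print(arithmetic, geometric)
```

`statistics.geometric_mean` requires Python 3.8 or later.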

 

Aggregation as Data Reduction

- Reduces the number of data points, not the number of features

- If you are not careful when splitting into train and test sets, the result can be biased.

 

Variance Metric: Standard Deviation

- Variance: the square of the standard deviation sigma

- population SD: divide by n

- sample SD: divide by n-1

- When n is very large, n ~ (n-1), so the difference matters little

- hat: denotes a sample estimate

- A distribution can be characterized by its mean and standard deviation.
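
A small sketch of the n versus n-1 distinction, written out by hand and cross-checked against `statistics.pstdev` and `statistics.stdev`. The data set is an arbitrary illustrative one.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
n = len(data)
mu = sum(data) / n
squared_dev = sum((x - mu) ** 2 for x in data)

population_sd = math.sqrt(squared_dev / n)    # divide by n
sample_sd = math.sqrt(squared_dev / (n - 1))  # divide by n - 1

# The standard library exposes both variants.
lib_population_sd = statistics.pstdev(data)
lib_sample_sd = statistics.stdev(data)

print(population_sd, sample_sd)
```

The sample SD is always the larger of the two, and the gap shrinks as n grows.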

 

Parameterizing Distributions

- Regardless of how the data is distributed, at least a 1 - 1/k^2 fraction of the points must lie within k sigma of the mean.

- At least 75% lie within 2 sigma of the mean.

- For a power law this carries little meaning. (skewed data)

- Measuring the signal to noise ratio is hard: sampling error and measurement error make the observed variance inexact.
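
The 1 - 1/k^2 bound (Chebyshev's inequality) can be checked empirically. The sketch below uses a deliberately skewed, made-up sample and the population SD; `fraction_within` is a hypothetical helper name.

```python
import statistics

# Hypothetical skewed sample with one extreme value.
data = [1, 2, 2, 3, 3, 3, 4, 4, 50]

mu = statistics.mean(data)
sigma = statistics.pstdev(data)  # population SD (divide by n)


def fraction_within(k):
    """Fraction of points lying within k standard deviations of the mean."""
    return sum(1 for x in data if abs(x - mu) <= k * sigma) / len(data)


for k in (2, 3):
    bound = 1 - 1 / k**2
    print(f"k={k}: observed {fraction_within(k):.3f} >= bound {bound:.3f}")
```

The bound holds for any data set, which is also why it is loose: real data usually sits well above it.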

 

Batting Average - Interpreting Variance

- Even a true .300 hitter has about a 10% chance of batting .275 or lower, and about a 10% chance of batting .325 or higher.
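
The two tail probabilities can be approximated with a binomial model. The notes do not say how many at-bats are involved; the sketch below assumes 500 independent at-bats, under which both tails come out on the order of 10%.

```python
import math

AT_BATS = 500     # assumed number of at-bats; not stated in the notes
TRUE_AVG = 0.300  # the hitter's underlying success probability


def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent at-bats."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_avg_at_most(avg):
    """P(observed average <= avg) under the binomial model."""
    max_hits = math.floor(avg * AT_BATS + 1e-9)
    return sum(binom_pmf(k, AT_BATS, TRUE_AVG) for k in range(max_hits + 1))


def prob_avg_at_least(avg):
    """P(observed average >= avg) under the binomial model."""
    min_hits = math.ceil(avg * AT_BATS - 1e-9)
    return sum(
        binom_pmf(k, AT_BATS, TRUE_AVG) for k in range(min_hits, AT_BATS + 1)
    )


low_tail = prob_avg_at_most(0.275)
high_tail = prob_avg_at_least(0.325)
print(f"P(avg <= .275) = {low_tail:.3f}, P(avg >= .325) = {high_tail:.3f}")
```

Treating at-bats as independent with a fixed success probability is itself a simplification.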

 

Correlation Analysis

- correlation coefficient r(X, Y): measures the degree to which Y is a function of X.

- Takes values between -1 and 1.

-1: anti-correlated

1: fully-correlated

0: uncorrelated

 

Pearson Correlation Coefficient

- numerator = covariance of X and Y (the denominator is the product of their standard deviations)
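
A minimal sketch of the Pearson coefficient as covariance divided by the product of the standard deviations. The function name `pearson_r` and the data are illustrative; the common 1/n factors cancel, so plain sums are used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance over the product of the SDs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])     # perfectly increasing line
r_down = pearson_r(xs, [10, 8, 6, 4, 2])   # perfectly decreasing line
r_none = pearson_r(xs, [1, -1, 0, -1, 1])  # no linear trend
print(r_up, r_down, r_none)
```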

 

r^2

- The proportion of the variance in Y that is explained by X

 

Variance Reduction & r^2

- Given a good linear fit f(x), the residual d = y - f(x) should have lower variance than y.

- 1-r^2 = V(d)/V(y)

- ex. when r = 0.94, the fit explains 88.4% of V(y).
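
The identity 1 - r^2 = V(d)/V(y) can be verified numerically. The sketch assumes f(x) is the ordinary least-squares line, for which the identity holds exactly, and uses made-up data.

```python
import math

# Hypothetical data lying close to a straight line.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


mean_x, mean_y = mean(xs), mean(ys)
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

# Least-squares line f(x) = slope * x + intercept (assumed fit).
slope = cov / variance(xs)
intercept = mean_y - slope * mean_x
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

r = cov / math.sqrt(variance(xs) * variance(ys))
unexplained = variance(residuals) / variance(ys)  # V(d) / V(y)
print(1 - r**2, unexplained)

# The worked figure from the notes: r = 0.94 explains r^2 of V(y).
explained_at_094 = 0.94**2
```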

 

Significance

- How significant is the correlation?

- Both the sample size and r matter

- p value < 0.05 (the probability of seeing it by chance is under 5%)

- Even a small correlation can become significant when the sample size is large enough

- permutation test: keep X fixed, shuffle Y, and recompute r -> after 10,000 repetitions, check what top percentage of the shuffled values the observed r(X, Y) falls in
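
A rough sketch of the permutation test described above. The function names, the fixed seed, and the two-sided comparison on |r| are choices made for this illustration, and the data are made up.

```python
import math
import random


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def permutation_p_value(xs, ys, trials=10_000, seed=0):
    """Fraction of shuffles whose |r| is at least the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # X stays fixed, only Y is permuted
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials


xs = list(range(12))
ys_trend = [0.1, 1.3, 1.8, 3.2, 4.1, 4.9, 6.2, 6.8, 8.1, 9.3, 9.9, 11.2]
ys_flat = [5, 1, 4, 2, 6, 3, 5, 1, 4, 2, 6, 3]

p_trend = permutation_p_value(xs, ys_trend)
p_flat = permutation_p_value(xs, ys_flat)
print(p_trend, p_flat)
```

A small p value means that almost no shuffled arrangement matched the observed correlation.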

 

Spearman Rank Correlation

- Based on counting the number of disordered pairs

- It does not check how well the data fit a straight line

=> robust to non-linear relationships and outliers

- How to compute it: take the Pearson correlation of the ranks, so its range is -1 to 1
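
Spearman's coefficient as "Pearson on the ranks" can be sketched as follows. The ranking helper assumes there are no tied values, which keeps the example short; the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(values):
    """Rank 1 for the smallest value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(xs, ys):
    """Pearson correlation computed on the ranks of xs and ys."""
    return pearson_r(ranks(xs), ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x**3 for x in xs]  # monotone but clearly non-linear

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

For a monotone but curved relationship the rank correlation is perfect while the Pearson value falls below 1.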

 

Correlation vs. Causation

- Correlation does not imply causation.

- causation: directional information, from cause to effect

 

Autocorrelation and Periodicity

- Time-series data often exhibit cycles

- Computing the lag-k autocorrelation for one lag is O(n), so all lags cost O(n^2), but with the Fast Fourier Transform (FFT) all lags can be computed in O(n log n)

- Shifting the series and correlating it with itself reveals periodicity
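
The shifting idea can be sketched directly: correlate the series with a copy of itself moved by k steps. This is the plain O(n)-per-lag approach, not the FFT one, and the period-4 series is synthetic.

```python
import math


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def autocorrelation(series, lag):
    """Correlation between the series and itself shifted by `lag` steps."""
    return pearson_r(series[:-lag], series[lag:])


# Synthetic series that repeats every 4 steps.
series = [0, 1, 0, -1] * 6

by_lag = {lag: autocorrelation(series, lag) for lag in range(1, 9)}
for lag, value in by_lag.items():
    print(lag, round(value, 3))
```

Peaks at lags 4 and 8 point to a cycle of length 4, and the strongly negative value at lag 2 marks the half-period.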

 

Logarithms

- Definition: the inverse of the exponential function

- Because of floating-point limits (e.g., underflow when multiplying many small probabilities), working with logarithms is more effective on a computer.

- Comparing raw ratios can show hugely unequal differences, but comparing the logarithms of the ratios shows equal displacement (2 and 1/2 become +log 2 and -log 2).

- This is why power-law data is compared after taking logarithms.
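
Both points about logarithms can be seen in a few lines. The probabilities and ratios below are arbitrary illustrative values.

```python
import math

# 1) Multiplying many small probabilities underflows in floating point,
#    while summing their logarithms stays well within range.
probs = [1e-5] * 100
naive_product = math.prod(probs)              # true value is 1e-500
log_sum = sum(math.log(p) for p in probs)     # log of the same product

# 2) A doubling and a halving are not symmetric around 1 as raw ratios,
#    but their logarithms are symmetric around 0.
up, down = 2.0, 0.5
raw_gap_up, raw_gap_down = up - 1, 1 - down
log_up, log_down = math.log(up), math.log(down)

print(naive_product, log_sum)
print(raw_gap_up, raw_gap_down, log_up, log_down)
```

`math.prod` requires Python 3.8 or later.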

 

Normalizing Skewed Distributions

- Use logarithms: applying them to power-law data, ratios, etc. can make the distribution closer to normal
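
A small sketch of the effect, using made-up values that span several orders of magnitude. The gap between mean and median serves here as a crude indicator of skew.

```python
import math
import statistics

# Hypothetical values spread over orders of magnitude.
values = [1, 10, 100, 1000, 10000]
log_values = [math.log10(v) for v in values]

raw_mean = statistics.mean(values)
raw_median = statistics.median(values)
log_mean = statistics.mean(log_values)
log_median = statistics.median(log_values)

print(raw_mean, raw_median)  # mean far above the median: strongly skewed
print(log_mean, log_median)  # mean and median agree after the log transform
```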

 
