Sunday, November 23, 2014

Three Learning Principles

1. Occam’s Razor

2. Sampling Bias

3. Data Snooping

 

Occam’s Razor: trimming down unnecessary explanation

The simplest model that fits the data is also the most plausible.
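As a rough illustration of "the simplest model that fits", the sketch below finds the lowest polynomial degree that passes exactly through a set of points. It assumes equally spaced inputs (x = 0, 1, 2, ...) and noise-free outputs so that finite differences can be used; the function name `simplest_polynomial_degree` is made up for this example.

```python
def simplest_polynomial_degree(ys, tol=1e-9):
    """Smallest degree d such that some degree-d polynomial passes through
    (0, ys[0]), (1, ys[1]), ...  Assumes equally spaced inputs."""
    diffs = list(ys)
    degree = 0
    # A degree-d polynomial has all-zero (d+1)-th finite differences,
    # so keep differencing until everything vanishes.
    while len(diffs) > 1:
        next_diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        if all(abs(v) <= tol for v in next_diffs):
            return degree
        diffs = next_diffs
        degree += 1
    return degree


line_points = [1, 3, 5, 7, 9, 11]      # generated by y = 2x + 1
parabola_points = [0, 1, 4, 9, 16]     # generated by y = x^2

print(simplest_polynomial_degree(line_points))
print(simplest_polynomial_degree(parabola_points))
```

For the six points from y = 2x + 1 the function returns 1. A degree-5 polynomial would also pass through all six points, but the line is the simpler explanation, which is the one Occam's razor prefers.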

 

Sampling Bias: If the data is sampled in a biased way, learning will produce a similarly biased outcome.

Training data and test data should both be drawn i.i.d. from the same distribution P.

Validation used randomly sampled records, while the test set consisted of the last (most recent) records. That mismatch is why the contest was still lost.

Consider using the same distribution (sampling scheme) in the training, validation, and testing phases:

1. Weight the training examples to emphasize the test scenario, if needed

2. Match the validation setup to the test scenario as closely as possible
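The two suggestions above can be sketched as follows, assuming the records are already sorted by time and the test set is made of the most recent ones. Both helper names and the `growth` parameter are hypothetical, chosen only for this illustration.

```python
def chronological_split(records, n_val, n_test):
    """Split time-ordered records so that validation mimics the test scenario:
    test = the last n_test records, validation = the n_val records before them."""
    if n_val < 0 or n_test < 0 or n_val + n_test > len(records):
        raise ValueError("n_val and n_test must be non-negative and fit in records")
    n_train = len(records) - n_val - n_test
    train = records[:n_train]
    val = records[n_train:n_train + n_val]
    test = records[n_train + n_val:]
    return train, val, test


def recency_weights(n, growth=1.1):
    """Hypothetical weighting scheme: later training examples get larger
    weights; the weights are normalized to sum to 1."""
    raw = [growth ** i for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


records = list(range(10))              # stand-in for 10 time-ordered records
train, val, test = chronological_split(records, n_val=2, n_test=3)
weights = recency_weights(len(train))  # emphasize the later training examples
print(train, val, test)
```

A random validation split would instead mix early and late records, which is the mismatch with a "last records" test set that the contest story describes.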

 

Data Snooping

Red: the entire 8 years of data is used for training; the performance looks good, but only because of snooping.

Blue: 6 years are used for training and 2 years for testing; the result is even negative.
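One common way the test period leaks into training is through preprocessing statistics. The notes do not say how the snooping happened in this example, so the sketch below is only an assumed illustration: it compares a scaler fitted on the 6 training years with one fitted on all 8 years. The series values and helper names are invented.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Return (mean, std) computed from the given values only."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        sigma = 1.0  # avoid division by zero for constant data
    return mu, sigma


def transform(values, scaler):
    """Standardize values with a previously fitted (mean, std) pair."""
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 10.0, 12.0]  # invented: one value per "year"
train, test = series[:6], series[6:]

honest_scaler = fit_scaler(train)    # sees the 6 training years only
snooped_scaler = fit_scaler(series)  # also peeks at the 2 test years

honest_test = transform(test, honest_scaler)
snooped_test = transform(test, snooped_scaler)
print(honest_scaler, snooped_scaler)
```

The two scalers differ because the second one has already absorbed information about the test years, so any result computed with it is no longer a clean out-of-sample estimate.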

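Interrogate a prisoner under duress long enough and anyone will confess!!! Data that is "tortured" long enough will confess in the same way.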

1. Avoid choosing the model after peeking at the data

2. Stay suspicious at all times
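To see why choosing a model after looking at the data calls for suspicion, here is a small sketch on pure noise. It tries many random "models" on the same labels, keeps the best one, and then scores it on fresh labels. Everything here (function names, the number of models, the seed) is an assumption made for illustration.

```python
import random


def accuracy(predictions, labels):
    """Fraction of positions where the prediction equals the label."""
    hits = sum(p == y for p, y in zip(predictions, labels))
    return hits / len(labels)


def snooping_demo(n_models=200, n_points=50, seed=0):
    """Pick the best of many random 'models' on pure-noise labels,
    then score that same model on fresh noise."""
    rng = random.Random(seed)
    seen_labels = [rng.randint(0, 1) for _ in range(n_points)]
    fresh_labels = [rng.randint(0, 1) for _ in range(n_points)]
    # Each "model" is a fixed random guess per position: there is no signal.
    models = [[rng.randint(0, 1) for _ in range(n_points)]
              for _ in range(n_models)]
    best = max(models, key=lambda m: accuracy(m, seen_labels))
    return accuracy(best, seen_labels), accuracy(best, fresh_labels)


seen_acc, fresh_acc = snooping_demo()
print("accuracy on the data used for selection:", seen_acc)
print("accuracy on fresh data:", fresh_acc)
```

Because the winner is selected on the very data it is scored on, its first accuracy is inflated by the selection itself, while on fresh noise no model can be expected to do better than about 50%.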
