Lixiaoxu
Membre des de 6 jul 2009
→Online calculator for critical values, cumulative probabilities, and critical noncentral parameters
| (Hi ha 4 revisions intermèdies del mateix usuari que no es mostren) | |||
| Línia 165: | Línia 165: | ||
cat('\nraw data') | cat('\nraw data') | ||
if (rawdata) (x); | if (rawdata) (x); | ||
</pre> | |||
==方差分析教学课件== | |||
理解ANOVA是在作回归 | |||
===总体与抽样次数设定=== | |||
ANOVA在R里头和回归是同一个东西,几句话讲完。麻烦的是模拟数据: | |||
<Rform name="data"> | |||
所有男生(所谓总体)用中文卷的智商均值是:<br/> | |||
<input name="xmc" value="105" size="8"/><br/> | |||
所有女生(所谓总体)用中文卷的智商均值是:<br/> | |||
<input name="xfc" value="101" size="8"/><br/> | |||
所有男生(所谓总体)用英文卷的智商均值是:<br/> | |||
<input name="xme" value="80" size="8"/><br/> | |||
所有女生(所谓总体)用英文卷的智商均值是:<br/> | |||
<input name="xfe" value="90" size="8"/><br/> | |||
抽样分成男中、女中、男英、女英四组,每组内抽的人数是:<br/> | |||
<input name="n" value="5" size="8"/><br/><br/> | |||
<input type="submit" name="submit" value="点击按键重复抽样,更新如下数据和结果"/><br/> | |||
智商的抽样误差(不一定是误出来的,但一定是抽出来的)满足正态分布,还假定已知各组内的总体标准差相同。这些抽样误差彼此独立,每次实验都重新变化。 | |||
组内总体标准差为:<br/> | |||
<input name="sigma" value="10" size="8"/><br/><br/> | |||
</Rform> | |||
===可观测的数据及其影响的统计结论=== | |||
每个观测到的智商成绩都包含随机的成分,所以实际观测到的数据不是各组总体的均值。 | |||
<R name="data" iframe="height:1800px;"> | |||
s_group = as.factor(c('Male','Male','Female','Female')); | |||
q_group = as.factor(c('Chinese ',' English','Chinese ',' English')); | |||
xmc = ifelse(exists("xmc"),as.numeric(xmc),105); | |||
xme = ifelse(exists("xme"),as.numeric(xme),80); | |||
xfc = ifelse(exists("xfc"),as.numeric(xfc),101); | |||
xfe = ifelse(exists("xfe"),as.numeric(xfe),90); | |||
sigma = ifelse(exists("sigma"),as.numeric(sigma),10); | |||
x_true_group = c(xmc,xme,xfc,xfe); | |||
##每组找n(=10)个被试; | |||
n = ifelse(exists("n"),as.integer(n),10); | |||
s = rep(s_group,each=n); q =rep(q_group,each=n);x_true = rep(x_true_group,each=n); | |||
## 但是每个被试都有随机抽样带来的偏差,标准差是sigma | |||
error = rnorm(n*4,mean=0,sd=sigma); | |||
x_observe = x_true + error; | |||
## 所以最终看到的是如下 | |||
cat('<font color="blue">') | |||
data.frame(s,q,x_observe,x_true) | |||
cat('</font>') | |||
cat('\n最后一列总体值看不到\n想研究智商成绩在性别、语言上的区别\n') | |||
cat('也就是性别、语言对智商能不能部分地预测\n') | |||
cat('\n回归的表述:成绩=常数截距+性别预测增值+语言预测增值+二者交互增值+预测残差\n') | |||
cat('\n先理解一下真实的参数值,四组的总体\n') | |||
(lm(x_true ~ 0 + s:q)) | |||
cat('\n先理解一下真实的参数值,四组的总体差异被拆解成三个差异分项\n') | |||
(lm(x_true ~ 1 + s + q + s:q)) | |||
cat('\n基于观测值的预测会受到随机误差扰动,\n') | |||
cat('\n四组组内均值的观测\n') | |||
lm(x_observe ~ 0 + s:q) | |||
cat('\n四组三个差异分项的观测\n') | |||
(model = lm(x_observe ~ 1 + s + q + s:q)) | |||
cat('\n考虑这种扰动之后,各个预测的95%把握置信区间上下界\n') | |||
cat('如果这个区间不包括0,表示有大于95%的把握确信这个系数有预测作用,无论大小。\n但是,交互项的存在使得解读变得困难。比如,男生-女生的效应包括sMale + (1/2) sMale:qChinese\n') | |||
confint(model) | |||
cat('\n<font color="red">ANOVA的表述:成绩的波动=性别预测增值的波动+语言预测增值的波动+残差波动\n') | |||
cat('“波动”的操作化定义是Sum of Squares(SS),一列数与其均值的差距平方和\n') | |||
cat('Df表示每个波动项吸收抽样误差波动的理论比例\n') | |||
cat('Sum Sq表示每个波动项解释波动的观测比例,\n如果和Df的理论比例极端不匹配(F比1大很多),就支持其中包含由不同总体带入的非随机波动\n') | |||
cat('最后一列Pr(>F)是一般的假设检验中反映F极端程度的p值\n\n') | |||
anova(model) | |||
cat('\n</font>') | |||
</R> | |||
===R代码=== | |||
把下面的代码copy paste到R中,或者将语句贴到[http://pbil.univ-lyon1.fr/Rweb/ Rweb](中文UTF-8码),结合#号后的注解理解方差分析 | |||
<pre> | |||
## 数据:智商抽样,error是个体随机偏差(学名抽样误差,不一定是误出来的,但总是抽出来的) | |||
##4组被试依次性别如下,考试卷子语言如下 | |||
(s_group = as.factor(c('Male','Male','Female','Female'))); | |||
(q_group = as.factor(c('Chinese','English','Chinese','English'))); | |||
## 男生总体用中文卷智商为105,女生总体用中文卷智商为101,男生总体用E文卷子智商为80,女生总体用E文卷智商为90 | |||
x_true_group = c(105,80,101,90); | |||
## 组内总体标准差 | |||
sigma = 10; | |||
##每组找n(=10)个被试; | |||
n = 10; s = rep(s_group,each=n); q =rep(q_group,each=n);x_true = rep(x_true_group,each=n); data.frame(s,q,x_true) | |||
## 但是每个被试都有随机抽样带来的偏差,标准差是15 | |||
error = rnorm(4*n,mean=0,sd=sigma); | |||
x_observe = x_true + error; | |||
## 所以最终看到的是如下 | |||
data.frame(s,q,x_observe) | |||
boxplot(x_observe~s*q) | |||
##以上都是数据模拟部分,如果已经有数据,直接完成下面的步骤 | |||
##linear model模型,就是回归了; | |||
model = lm(x_observe ~ 1 + s + q + s:q ) ## ~符号后面的1表示截距,s:q表示交互作用项 | |||
anova(model) | |||
plot(model,which=c(1,2)) | |||
</pre> | </pre> | ||
| Línia 352: | Línia 453: | ||
If ''df'' on this display is set to <math>\infty</math> (''Inf'' in R) and noncentral parameter set to 0, a standard normal distribution will be plotted and critical ''z'' score calculated. | If ''df'' on this display is set to <math>\infty</math> (''Inf'' in R) and noncentral parameter set to 0, a standard normal distribution will be plotted and critical ''z'' score calculated. | ||
===Noncentral | ===Noncentral ''F''=== | ||
The noncentral parameter of ''F'' is only defined on its numerator. The noncentral ''F'' distributed | The noncentral parameter of ''F'' is only defined on its numerator. The noncentral ''F'' distributed | ||
| Línia 489: | Línia 590: | ||
} | } | ||
pdf(rpdf, | pdf(rpdf, width=8, height=4) | ||
| Línia 530: | Línia 631: | ||
For two-related-group case, the difference scores between each pair of samples can apply one-group mean test interface. Usually <math>\mu_{null}</math> is set to zero. | For two-related-group case, the difference scores between each pair of samples can apply one-group mean test interface. Usually <math>\mu_{null}</math> is set to zero. | ||
===Power of | ===Power of ''F'' test for a given Cohen's ''f''=== | ||
Let's use <math>\tilde{f}^2</math> denote the population of Cohen's <math>f^2</math>, specially | Let's use <math>\tilde{f}^2</math> denote the population of Cohen's <math>f^2</math>, specially | ||
:<math>SS(\mu_1,\mu_2,\cdots,\mu_K) \over {K\times \sigma^2}</math> | :<math>SS(\mu_1,\mu_2,\cdots,\mu_K) \over {K\times \sigma^2}</math> | ||
| Línia 548: | Línia 649: | ||
==External links== | ==External links== | ||
* [http://en.wikipedia.org/wiki/Noncentral_t-distribution Noncentral ''t''-distribution] on Wikipedia | * [http://en.wikipedia.org/wiki/Noncentral_t-distribution Noncentral ''t''-distribution] on Wikipedia | ||
* [http://en.wikipedia.org/wiki/Noncentral_chi-square_distribution Noncentral | * [http://en.wikipedia.org/wiki/Noncentral_chi-square_distribution Noncentral chi-square distribution] on Wikipedia | ||
* [http://en.wikipedia.org/wiki/Noncentral_F-distribution Noncentral ''F''-distribution] on Wikipedia | * [http://en.wikipedia.org/wiki/Noncentral_F-distribution Noncentral ''F''-distribution] on Wikipedia | ||
* [http://en.wikipedia.org/wiki/Effect_size#Confidence_interval_and_relation_to_noncentral_parameters Confidence interval of Effect Size] on Wikipedia | * [http://en.wikipedia.org/wiki/Effect_size#Confidence_interval_and_relation_to_noncentral_parameters Confidence interval of Effect Size] on Wikipedia | ||