User:Lixiaoxu: difference between revisions

From wikiTraba
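== Regression analysis courseware ==
=== Input parameters ===
<Rform name="Tri">
The angles between the vector <math>\vec{Y}</math> and the vectors <math>\vec{X}_1</math> and <math>\vec{X}_2</math> (90 degrees is a right angle) are, respectively,
:<Input name="cy1" value="89" size="5"/> degrees
:<Input name="cy2" value="89" size="5"/> degrees.
The angle between the vectors <math>\vec{X}_1</math> and <math>\vec{X}_2</math> is
:<Input name="c12" value="177.9" size="5"/> degrees.
The sum of any two of these three angles must be greater than the third.
The <math>N</math> components of each vector represent <math>N</math> standardized observations.
Set the sample size to
:<Input name="N" value="100" size="5"/>, output the simulated data <Input name="rawdata" type="checkbox"/>
<input type="submit" />
</Rform>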
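The three angles determine the correlation matrix of the simulated data: because the vectors are standardized, each correlation is the cosine of the corresponding angle. As a sketch (the symbols <math>\theta</math> and <math>r</math> are notation introduced here, not names used by the form):
:<math>r_{Y1}=\cos\theta_{Y1},\qquad r_{Y2}=\cos\theta_{Y2},\qquad r_{12}=\cos\theta_{12}</math>
:<math>S=\begin{pmatrix}1&r_{Y1}&r_{Y2}\\ r_{Y1}&1&r_{12}\\ r_{Y2}&r_{12}&1\end{pmatrix},\qquad \det S=1+2r_{Y1}r_{Y2}r_{12}-r_{Y1}^2-r_{Y2}^2-r_{12}^2</math>
The script asks the user to check the input when <math>\det S\le 0</math>. For the default angles 89, 89 and 177.9 the determinant is small but positive (about 0.00012), so the defaults are just barely admissible.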
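=== Exercises ===
Observe how <math>R^2</math> changes between the regression equations
:<math>\vec{Y}=\beta_1\vec{X}_1+\vec{\epsilon}</math>
:<math>\vec{Y}=\beta_1\vec{X}_1+\beta_2\vec{X}_2+\vec{\epsilon}</math>
that is, before and after <math>\vec{X}_2</math> is added.
==== Two IVs that are almost uncorrelated with the DV can still predict the DV extremely well ====
Set the three angles to 89, 89 and 177.9 respectively.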
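A quick hand check of this case, using the standard two-predictor formula for standardized variables (the <math>r</math> symbols are shorthand introduced here for the cosines of the three angles):
:<math>R^2_{12}=\frac{r_{Y1}^2+r_{Y2}^2-2r_{Y1}r_{Y2}r_{12}}{1-r_{12}^2}</math>
:<math>r_{Y1}=r_{Y2}=\cos 89^\circ\approx 0.01745,\qquad r_{12}=\cos 177.9^\circ\approx -0.99933</math>
:<math>R^2_1=R^2_2\approx 0.0003,\qquad R^2_{12}=\frac{2r_{Y1}^2}{1+r_{12}}\approx\frac{0.000609}{0.000672}\approx 0.907</math>
Each predictor alone explains almost nothing, yet together they explain roughly 91% of the variance, because the two nearly opposite predictors span a plane that lies much closer to <math>\vec{Y}</math> than either predictor does.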
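==== Two IVs that are highly positively correlated with the DV can still yield a negative regression coefficient ====
Set the three angles to 5, 2.6 and 2.6 respectively.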
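For this case the standardized coefficients of the two-predictor model can be worked out by hand. This is only a sketch; <math>r_{Y1}</math>, <math>r_{Y2}</math> and <math>r_{12}</math> denote the cosines of the three angles, a notation introduced here:
:<math>\beta_1=\frac{r_{Y1}-r_{Y2}r_{12}}{1-r_{12}^2},\qquad \beta_2=\frac{r_{Y2}-r_{Y1}r_{12}}{1-r_{12}^2}</math>
:<math>r_{Y1}=\cos 5^\circ\approx 0.99619,\qquad r_{Y2}=r_{12}=\cos 2.6^\circ\approx 0.99897</math>
:<math>\beta_1\approx\frac{0.99619-0.99794}{0.00206}\approx -0.85,\qquad \beta_2\approx\frac{0.99897-0.99517}{0.00206}\approx 1.85</math>
So <math>\vec{X}_1</math> correlates about 0.996 with <math>\vec{Y}</math> and still receives a negative weight once <math>\vec{X}_2</math> is in the model.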
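==== The predictive power (<math>R^2</math>) of two uncorrelated IVs for the DV is additive ====
Set the third angle to 90.
==== Cases where (<math>R^2_1+R^2_2-R^2_{12}</math>) starts at 0, grows, then shrinks and even becomes negative ====
:Zero: the three angles are 60, 45 and 90.
:Positive: the three angles are 60, 45 and 45.
:Negative: the three angles are 60, 45 and 15.1.
Relation to redundancy: [http://books.google.com/books?id=fuq94a8C0ioC&pg=PA76&lpg=PA76&dq=Redundancy+regression+Cohen&source=bl&ots=9ZvmtpyAdY&sig=FmCV5PlLJ0NsqHUj4Y328AHB45A&hl=en&sa=X&oi=book_result&resnum=1&ct=result Cohen & Cohen (2003, p. 76)]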
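The three settings can be checked numerically. With <math>r_{Y1}=\cos 60^\circ=0.5</math> and <math>r_{Y2}=\cos 45^\circ\approx 0.7071</math>, the single-predictor values are <math>R^2_1=0.25</math> and <math>R^2_2=0.5</math>. The joint value follows from <math>R^2_{12}=(r_{Y1}^2+r_{Y2}^2-2r_{Y1}r_{Y2}r_{12})/(1-r_{12}^2)</math>, where the <math>r</math> and <math>\theta</math> symbols are shorthand introduced here for the cosines of the angles and the third angle:
:<math>\theta_{12}=90^\circ:\quad r_{12}=0,\quad R^2_{12}=0.75,\quad R^2_1+R^2_2-R^2_{12}=0</math>
:<math>\theta_{12}=45^\circ:\quad r_{12}\approx 0.7071,\quad R^2_{12}=0.5,\quad R^2_1+R^2_2-R^2_{12}=0.25</math>
:<math>\theta_{12}=15.1^\circ:\quad r_{12}\approx 0.9655,\quad R^2_{12}\approx 0.9918,\quad R^2_1+R^2_2-R^2_{12}\approx -0.2418</math>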
=== Results ===
<R output="html" name="Tri" iframe="height:400px;">
# Read the form inputs, falling back to the defaults when a field is missing.
cy1 <- ifelse(exists("cy1"),as.numeric(cy1),89);
cy2 <- ifelse(exists("cy2"),as.numeric(cy2),89);
c12 <- ifelse(exists("c12"),as.numeric(c12),177.9);
N <- ifelse(exists("N"),as.integer(N),100);
# The original coerced N here, which is TRUE for any N >= 1; the flag is simply whether rawdata was submitted.
rawdata <- exists("rawdata");
# Correlation matrix of (Y, X_1, X_2): each correlation is the cosine of an angle.
S <- matrix(1,3,3);
S[1,2]<-S[2,1]<-cos(cy1/180*pi);
S[1,3]<-S[3,1]<-cos(cy2/180*pi);
S[2,3]<-S[3,2]<-cos(c12/180*pi);
# A valid correlation matrix needs det(S) > 0; otherwise ask the user to check the input.
if ((det(S)<= 0 )||(N<1)) outHTML(rhtml,NA,title='Please check your input!\n Sum of any two angles should be larger than the third one.');
library(MASS);
# empirical=TRUE makes the sample means and correlations match mu and S exactly.
x<-mvrnorm(n=N,mu=c(0,0,0),Sigma=S,empirical= TRUE);
Y<-x[,1];X_1<-x[,2];X_2<-x[,3];
colnames(x)<-colnames(S)<-rownames(S)<-c('Y','X_1','X_2');
# Regressions through the origin (the simulated data are already centered).
lm1 <- lm(Y~0+ X_1);
lm2 <- lm(Y~0+ X_2);
lm12 <- lm(Y~0+ X_1+X_2);
# Collect the three R^2 values; the column name holds R^2_1 + R^2_2 - R^2_12.
R2<-matrix(rep(NA,3),nrow=3);
rownames(R2)<-c('Y ~ 0+ X_1','Y ~ 0+ X_2','Y ~ 0+ X_1 + X_2');
R2[,1] <- c( summary(lm1)$r.squared, summary(lm2)$r.squared, summary(lm12)$r.squared);
colnames(R2)[1]<-round( summary(lm1)$r.squared + summary(lm2)$r.squared - summary(lm12)$r.squared,4);
outHTML(rhtml, t(R2), title="R^2_1+R^2_2-R^2_12", format="f", digits=4);
outHTML(rhtml, summary(lm1)$coefficients, title=rownames(R2)[1], format="f", digits=4);
outHTML(rhtml, summary(lm2)$coefficients, title=rownames(R2)[2], format="f", digits=4);
outHTML(rhtml, summary(lm12)$coefficients, title=rownames(R2)[3], format="f", digits=4);
outHTML(rhtml, S, title="correlation\n", format="f", digits=4);
if (rawdata) outHTML(rhtml, x, title="Raw data\n", format="f", digits=4);
</R>

Revision as of 15:08, 6 July 2009

szpku.lixiaoxu@gmail.com


== Regression of two columns of input data ==

<Rform name="owndata"> Enter your own data for a scatterplot:
You can use <a href="https://spreadsheets.google.com/ccc?key=0Aic4pmEZm32xclhJZm9hNWFyZlZOV1RSV19xWXRlbmc&hl=en">the free online spreadsheet</a> of Google Docs to edit your data before pasting. Just click the link; no login is needed.
<textarea name="mydata" rows="8">
1.262954285 3.8739569
-0.326233361 1.0400041
1.329799263 2.0161824
1.272429321 2.8284819
0.414641434 2.1324980
-1.539950042 0.4565291
-0.928567035 1.6093698
-0.294720447 0.9723025
-0.005767173 2.5310696
2.404653389 2.7861843</textarea> <input type="submit" value=" Submit "> </Rform>

<R output="display" name="owndata" iframe="height:500px;">
if (exists("mydata")) {
  # Use the data pasted into the form.
  main <- "Data from user"
  x <- readdataSK(mydata, format="txt")
} else {
  # No input: simulate 10 points with V2 = 2.1 + 0.8*V1 + noise.
  main <- "Default data"
  set.seed(0);
  x<-matrix(rnorm(20),10,2);
  x[,2] <- 2.1+x[,1]*.8+x[,2];
  colnames(x)<-c('V1','V2');
}
pdf(rpdf, width=6, height=6)
# Regress the second column on the first and put the fitted equation in the title.
lm.m<-lm(x[,2]~x[,1]);
main<-paste(main,'\nV2 =',round(lm.m$coefficients[1],3),'+',round(lm.m$coefficients[2],3),'*V1 + ',round(summary(lm.m)$sigma,3),'*e')
plot(x, cex=2, main=main)
abline(lm.m)
</R>
