In this example, I use PROC IML, the SAS matrix language, to compute regression results from a small data set. I then use PROC REG to show that the matrix results and the REG results agree. The aim is to help you link the matrix algebra to concrete numbers and to the SAS output that you will use.
Our input data:

     X           y
1  4  5  6       5
1  2  3  4       2
1  4  5  4       5
1  2  3  4       3
1  4  3  2       4
1  4  4  5       4
1  2  3  4       3
1  4  5  4       5
1  7  8  9       8
1  5  4  3       5
Note that the design matrix X has a leading column of 1s, which carries the intercept. The other three columns of X are raw scores on the predictors. y is a single column (shown last).
To find the b weights, we need
$b = (X'X)^{-1} X'y$
so
X'X
10 38 43 45
38 166 182 186
43 182 207 216
45 186 216 235
and
$(X'X)^{-1}$
0.9505447 -0.053377 -0.114379 -0.034641
-0.053377 0.2309368 -0.29085 0.0947712
-0.114379 -0.29085 0.5196078 -0.22549
-0.034641 0.0947712 -0.22549 0.1431373
and
X'y
44
189
211
217
Therefore, when we multiply, we get
b
0.0847495
0.4945534
0.7026144
-0.130065
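The same multiplication can be checked outside SAS. What follows is a minimal sketch in plain Python, not the IML program used for these notes; the helper name `solve` is my own, and it solves the normal equations (X'X)b = X'y by Gauss-Jordan elimination rather than forming the inverse explicitly.

```python
# Raw data from the example: X has a leading column of 1s for the intercept.
X = [
    [1, 4, 5, 6], [1, 2, 3, 4], [1, 4, 5, 4], [1, 2, 3, 4], [1, 4, 3, 2],
    [1, 4, 4, 5], [1, 2, 3, 4], [1, 4, 5, 4], [1, 7, 8, 9], [1, 5, 4, 3],
]
y = [5, 2, 5, 3, 4, 4, 3, 5, 8, 5]


def solve(A, rhs):
    """Solve A x = rhs by Gauss-Jordan elimination with partial pivoting.

    Hypothetical helper written for this sketch.
    """
    n = len(A)
    # Augmented matrix [A | rhs] in floating point.
    M = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot equals 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


k = len(X[0])
# X'X: sums of squares and cross-products of the columns of X.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
# X'y: cross-products of each column of X with y.
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
# b satisfies (X'X) b = X'y, which is the same as b = (X'X)^{-1} X'y.
b = solve(XtX, Xty)

print(XtX)
print(Xty)
print([round(v, 7) for v in b])
```

Solving the system directly is a design choice for the sketch; the inverse itself is still useful later because its diagonal feeds the standard errors.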
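To find the beta weights (the standardized regression weights), we need
$\beta = R^{-1} r$
where R is the matrix of correlations among the predictors and r is the column of correlations of each predictor with y.
R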
1 0.8513143 0.5661385
0.8513143 1 0.8395464
0.5661385 0.8395464 1
$R^{-1}$
4.9882353 -6.354649 2.5109908
-6.354649 11.483333 -6.043179
2.5109908 -6.043179 4.6519608
r
0.9495869
0.9387835
0.6747098
beta
0.4653129
0.6686799
-0.15011
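The beta weights can be reproduced the same way. This is a hedged sketch in plain Python, assuming Pearson correlations computed from the raw columns; the helpers `corr` and `solve` are names I made up for the illustration, and the system R beta = r is solved directly instead of inverting R.

```python
from math import sqrt

# Predictor columns and y from the example (the column of 1s is not needed here).
x_cols = [
    [4, 2, 4, 2, 4, 4, 2, 4, 7, 5],  # X1
    [5, 3, 5, 3, 3, 4, 3, 5, 8, 4],  # X2
    [6, 4, 4, 4, 2, 5, 4, 4, 9, 3],  # X3
]
y = [5, 2, 5, 3, 4, 4, 3, 5, 8, 5]


def corr(u, v):
    """Pearson correlation between two equal-length lists (hypothetical helper)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (c - mv) for a, c in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((c - mv) ** 2 for c in v)
    return suv / sqrt(suu * svv)


def solve(A, rhs):
    """Solve A x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


# R: correlations among the predictors; r: correlations of each predictor with y.
R = [[corr(a, c) for c in x_cols] for a in x_cols]
r = [corr(c, y) for c in x_cols]
# beta satisfies R beta = r, which is the same as beta = R^{-1} r.
beta = solve(R, r)

print([round(v, 7) for v in r])
print([round(v, 7) for v in beta])
```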
The predicted values, Y' = Xb (here Y' means predicted Y, not a transpose):
4.7956427
2.6614379
5.0557734
2.6614379
3.9106754
4.2230937
2.6614379
5.0557734
7.9969499
4.9777778
Squared correlation between Y and Y':
RSQ
0.9683203
The residuals, resid = Y - Y':
0.2043573
-0.661438
-0.055773
0.3385621
0.0893246
-0.223094
0.3385621
-0.055773
0.0030501
0.0222222
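The predicted values, the residuals and RSQ follow from b with a few lines of arithmetic. The sketch below is plain Python and takes the b weights as printed in the text, so its results agree with the listing only up to rounding in about the fifth decimal place; `corr` is a helper name I chose for the illustration.

```python
from math import sqrt

X = [
    [1, 4, 5, 6], [1, 2, 3, 4], [1, 4, 5, 4], [1, 2, 3, 4], [1, 4, 3, 2],
    [1, 4, 4, 5], [1, 2, 3, 4], [1, 4, 5, 4], [1, 7, 8, 9], [1, 5, 4, 3],
]
y = [5, 2, 5, 3, 4, 4, 3, 5, 8, 5]
# b weights copied from the text (rounded), not re-estimated here.
b = [0.0847495, 0.4945534, 0.7026144, -0.130065]

# Predicted values Y' = Xb: one dot product per row of X.
y_hat = [sum(xi * bi for xi, bi in zip(row, b)) for row in X]
# Residuals Y - Y'.
resid = [yi - fi for yi, fi in zip(y, y_hat)]


def corr(u, v):
    """Pearson correlation between two equal-length lists (hypothetical helper)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (c - mv) for a, c in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((c - mv) ** 2 for c in v)
    return suv / sqrt(suu * svv)


# RSQ is the squared correlation between observed and predicted Y.
rsq = corr(y, y_hat) ** 2

for yi, fi, ei in zip(y, y_hat, resid):
    print(yi, round(fi, 5), round(ei, 5))
print("RSQ", round(rsq, 4))
```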
SAS PROC REG output for the same problem:

Model: MODEL1
Dependent Variable: Y

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model       3         23.62702       7.87567    61.132   0.0001
Error       6          0.77298       0.12883
C Total     9         24.40000

Root MSE    0.35893     R-square   0.9683
Dep Mean    4.40000     Adj R-sq   0.9525
C.V.        8.15750
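The entries in the analysis of variance table can be rebuilt from Y and Y' alone. The following is a rough sketch in plain Python, using the predicted values as printed earlier and the usual textbook formulas (adjusted R-square as 1 - (1 - R-square)(n - 1)/(n - p - 1), C.V. as 100 times Root MSE over the mean of Y); the variable names are mine, and Prob>F is left out because it needs the F distribution.

```python
from math import sqrt

y = [5, 2, 5, 3, 4, 4, 3, 5, 8, 5]
# Predicted values Y' copied from the text.
y_hat = [4.7956427, 2.6614379, 5.0557734, 2.6614379, 3.9106754,
         4.2230937, 2.6614379, 5.0557734, 7.9969499, 4.9777778]

n = len(y)   # number of observations
p = 3        # number of predictors (X1-X3), excluding the intercept
mean_y = sum(y) / n

# Sums of squares: corrected total, error (residual), and model.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ss_model = ss_total - ss_error

# Mean squares divide each sum of squares by its degrees of freedom.
ms_model = ss_model / p
ms_error = ss_error / (n - p - 1)
f_value = ms_model / ms_error

root_mse = sqrt(ms_error)
r_square = ss_model / ss_total
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - p - 1)
cv = 100 * root_mse / mean_y

print("Model  ", p, round(ss_model, 5), round(ms_model, 5), round(f_value, 3))
print("Error  ", n - p - 1, round(ss_error, 5), round(ms_error, 5))
print("C Total", n - 1, round(ss_total, 5))
print("Root MSE", round(root_mse, 5), "R-square", round(r_square, 4))
print("Dep Mean", round(mean_y, 5), "Adj R-sq", round(adj_r_square, 4))
print("C.V.", round(cv, 4))
```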
Parameter Estimates

Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob > |T|   Standardized Estimate
INTERCEP    1             0.084749       0.34994203                   0.242       0.8167              0.00000000
X1          1             0.494553       0.17248702                   2.867       0.0285              0.46531294
X2          1             0.702614       0.25873053                   2.716       0.0348              0.66867988
X3          1            -0.130065       0.13579575                  -0.958       0.3751             -0.15010958
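The standard errors and t values also tie back to the matrix results: under the usual least-squares formula, each standard error is the square root of the mean square error times the matching diagonal element of the inverse of X'X. The sketch below, in plain Python, plugs in the rounded numbers printed earlier, so it matches the SAS columns only to about four decimals; the variable names are my own, and Prob > |T| is not computed because it needs the t distribution with 6 degrees of freedom.

```python
from math import sqrt

names = ["INTERCEP", "X1", "X2", "X3"]
# b weights as printed in the text.
b = [0.0847495, 0.4945534, 0.7026144, -0.130065]
# Diagonal of the inverse of X'X, copied from the matrix printed in the text.
inv_diag = [0.9505447, 0.2309368, 0.5196078, 0.1431373]
# Mean square error from the analysis of variance table.
ms_error = 0.12883

# Standard error of each weight: sqrt(MSE * diagonal element).
std_err = [sqrt(ms_error * c) for c in inv_diag]
# t statistic for H0: parameter = 0 is the estimate over its standard error.
t_value = [bi / se for bi, se in zip(b, std_err)]

for name, bi, se, t in zip(names, b, std_err, t_value):
    print(name, round(bi, 6), round(se, 5), round(t, 3))
```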
The data below are the observation number, X1-X3, Y, Y' and the residuals.

Obs  X1  X2  X3  Y       Y'     Resid
  1   4   5   6  5  4.79564   0.20436
  2   2   3   4  2  2.66144  -0.66144
  3   4   5   4  5  5.05577  -0.05577
  4   2   3   4  3  2.66144   0.33856
  5   4   3   2  4  3.91068   0.08932
  6   4   4   5  4  4.22309  -0.22309
  7   2   3   4  3  2.66144   0.33856
  8   4   5   4  5  5.05577  -0.05577
  9   7   8   9  8  7.99695   0.00305
 10   5   4   3  5  4.97778   0.02222