Regression with a SAS Matrix Program


In this example, I use PROC IML, the SAS matrix language, to compute regression results from a small data set. I then run PROC REG on the same data to show how the matrix results correspond to the REG output. The aim is to help you link the matrix algebra to concrete numbers and to the SAS output that you will use.

Our input data:

X	y
1 4 5 6	5
1 2 3 4	2
1 4 5 4	5
1 2 3 4	3
1 4 3 2	4
1 4 4 5	4
1 2 3 4	3
1 4 5 4	5
1 7 8 9	8
1 5 4 3	5

Note that the design matrix X has a leading column of 1s (for the intercept), followed by the raw scores on X1, X2, and X3. The vector y is the single column on the right.

To find the b weights, we need

b = (X'X)^-1 X'y

X'X

10 38 43 45
38 166 182 186
43 182 207 216
45 186 216 235

and

(X'X)^-1

0.9505447 -0.053377 -0.114379 -0.034641
-0.053377 0.2309368 -0.29085 0.0947712
-0.114379 -0.29085 0.5196078 -0.22549
-0.034641 0.0947712 -0.22549 0.1431373

and

X'y

44
189
211
217
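The cross products above can be checked outside SAS. The following is a minimal Python sketch, not the PROC IML program used for this page (which is not shown); the names `XtX` and `Xty` are illustrative. It forms X'X and X'y directly from the raw data.

```python
# Raw data from the example: each row of X is (1, X1, X2, X3); y is the criterion.
X = [
    [1, 4, 5, 6], [1, 2, 3, 4], [1, 4, 5, 4], [1, 2, 3, 4], [1, 4, 3, 2],
    [1, 4, 4, 5], [1, 2, 3, 4], [1, 4, 5, 4], [1, 7, 8, 9], [1, 5, 4, 3],
]
y = [5, 2, 5, 3, 4, 4, 3, 5, 8, 5]

k = len(X[0])  # number of columns in the design matrix, including the 1s

# X'X: element (i, j) is the sum over observations of column i times column j.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]

# X'y: element i is the sum over observations of column i times y.
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]

for row in XtX:
    print(row)
print(Xty)
```

Because the first column of X is all 1s, the first row of X'X holds N and the column sums, and the first element of X'y is the sum of y.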

Therefore, when we multiply (X'X)^-1 by X'y, we get b:

0.0847495
0.4945534
0.7026144
-0.130065
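To see the multiplication carried out, here is a small Python sketch under my own assumptions: it inverts X'X by Gauss-Jordan elimination (a hypothetical helper named `invert`, standing in for the matrix inverse that IML provides) and then multiplies the inverse by X'y.

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment the matrix with the identity: [a | I].
    m = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in m]

XtX = [[10, 38, 43, 45], [38, 166, 182, 186], [43, 182, 207, 216], [45, 186, 216, 235]]
Xty = [44, 189, 211, 217]

XtX_inv = invert(XtX)
# b = (X'X)^-1 X'y
b = [sum(c * v for c, v in zip(row, Xty)) for row in XtX_inv]

print([round(v, 7) for v in b])
```

The first element of b is the intercept; the remaining three are the raw-score weights for X1 to X3.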

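To find the beta weights (the standardized regression weights), we need

beta = R^-1 r

where R is the matrix of correlations among the predictors X1 to X3 and r is the vector of correlations of each predictor with y.

R

1 0.8513143 0.5661385
0.8513143 1 0.8395464
0.5661385 0.8395464 1

R^-1

4.9882353 -6.354649 2.5109908
-6.354649 11.483333 -6.043179
2.5109908 -6.043179 4.6519608

r

.9495869
.9387835
.6747098

beta

0.4653129
0.6686799
-0.15011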
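The standardized weights can be reproduced the same way. This Python sketch is an illustration rather than the original IML code: `corr` and `solve` are hypothetical helpers, and instead of forming R^-1 explicitly it solves R beta = r by Gaussian elimination, which gives the same beta.

```python
from math import sqrt

X1 = [4, 2, 4, 2, 4, 4, 2, 4, 7, 5]
X2 = [5, 3, 5, 3, 3, 4, 3, 5, 8, 4]
X3 = [6, 4, 4, 4, 2, 5, 4, 4, 9, 3]
y = [5, 2, 5, 3, 4, 4, 3, 5, 8, 5]

def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (c - mv) for a, c in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((c - mv) ** 2 for c in v)
    return suv / sqrt(suu * svv)

def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [[float(v) for v in row] + [float(t)] for row, t in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

predictors = [X1, X2, X3]
R = [[corr(u, v) for v in predictors] for u in predictors]  # predictor intercorrelations
r = [corr(u, y) for u in predictors]                        # predictor-criterion correlations
beta = solve(R, r)                                          # beta = R^-1 r

print([round(v, 7) for v in beta])
```

There is no intercept term here because standardizing the variables fixes the intercept at zero, which matches the 0.00000000 that PROC REG reports as the standardized estimate for INTERCEP.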

The predicted values, Y' = Xb:

4.7956427
2.6614379
5.0557734
2.6614379
3.9106754
4.2230937
2.6614379
5.0557734
7.9969499
4.9777778

Squared correlation between Y and Y':

RSQ = 0.9683203

The residuals, Y - Y':

0.2043573
-0.661438
-0.055773
0.3385621
0.0893246
-0.223094
0.3385621
-0.055773
0.0030501
0.0222222
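The predicted values, residuals, and R-square follow directly from X, y, and b. The Python sketch below is illustrative only; it starts from the b weights as printed on this page, which are rounded, so its results agree with the listings to about five decimal places rather than exactly.

```python
from math import sqrt

X = [
    [1, 4, 5, 6], [1, 2, 3, 4], [1, 4, 5, 4], [1, 2, 3, 4], [1, 4, 3, 2],
    [1, 4, 4, 5], [1, 2, 3, 4], [1, 4, 5, 4], [1, 7, 8, 9], [1, 5, 4, 3],
]
y = [5, 2, 5, 3, 4, 4, 3, 5, 8, 5]

# b weights copied from the page (rounded values, an assumption of this sketch).
b = [0.0847495, 0.4945534, 0.7026144, -0.130065]

# Predicted values: Y' = Xb, one dot product per observation.
pred = [sum(x * w for x, w in zip(row, b)) for row in X]

# Residuals: Y - Y'.
resid = [yi - pi for yi, pi in zip(y, pred)]

def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (c - mv) for a, c in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((c - mv) ** 2 for c in v)
    return suv / sqrt(suu * svv)

# R-square as the squared correlation between Y and Y'.
rsq = corr(y, pred) ** 2

for yi, pi, ei in zip(y, pred, resid):
    print(yi, round(pi, 5), round(ei, 5))
print(round(rsq, 5))
```

Observations 2, 4, and 7 share the same X values, so they share the same predicted value even though their observed Y scores differ.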

SAS PROC REG output for the same problem:

Model: MODEL1

Dependent Variable: Y

Analysis of Variance

Source	DF	Sum of Squares	Mean Square	F Value	Prob>F
Model	3	23.62702	7.87567	61.132	0.0001
Error	6	0.77298	0.12883
C Total	9	24.40000

Root MSE	0.35893	R-square	0.9683
Dep Mean	4.40000	Adj R-sq	0.9525
C.V.	8.15750

Parameter Estimates

Variable	DF	Parameter Estimate	Standard Error	T for H0: Parameter=0	Prob > |T|	Standardized Estimate
INTERCEP	1	0.084749	0.34994203	0.242	0.8167	0.00000000
X1	1	0.494553	0.17248702	2.867	0.0285	0.46531294
X2	1	0.702614	0.25873053	2.716	0.0348	0.66867988
X3	1	-0.130065	0.13579575	-0.958	0.3751	-0.15010958
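Most of the remaining numbers in the REG output can be rebuilt from the sums of squares and the diagonal of (X'X)^-1. The Python sketch below is my illustration of those standard formulas, not SAS code; it takes the rounded values printed on this page as inputs, so it matches the output only to the precision shown. Variable names are illustrative.

```python
from math import sqrt

n, k = 10, 4  # observations; columns of X including the intercept

# Sums of squares from the Analysis of Variance table.
ss_model, ss_error, ss_total = 23.62702, 0.77298, 24.40000
y_mean = 4.4  # Dep Mean

# Diagonal of (X'X)^-1 and the b weights, as printed on this page.
xtx_inv_diag = [0.9505447, 0.2309368, 0.5196078, 0.1431373]
b = [0.084749, 0.494553, 0.702614, -0.130065]

df_model, df_error = k - 1, n - k
ms_model = ss_model / df_model          # Mean Square for Model
mse = ss_error / df_error               # Mean Square for Error
f_value = ms_model / mse                # F Value
root_mse = sqrt(mse)                    # Root MSE
r_square = ss_model / ss_total          # R-square
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_error  # Adj R-sq
cv = 100 * root_mse / y_mean            # C.V.

# Standard error of each b weight: sqrt(MSE * diagonal element of (X'X)^-1).
se = [sqrt(mse * d) for d in xtx_inv_diag]
# t for H0: parameter = 0.
t = [bj / sj for bj, sj in zip(b, se)]

print(round(f_value, 3), round(root_mse, 5), round(r_square, 4), round(adj_r_square, 4))
print([round(v, 5) for v in se])
print([round(v, 3) for v in t])
```

This is where the matrix results and the REG output meet: the same (X'X)^-1 that produced b also supplies the standard errors once it is scaled by the error mean square.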

The data below are the observation number, X1-X3, Y, Y' and the residuals.

Obs	X1	X2	X3	Y	Y'	Residual
1	4	5	6	5	4.79564	0.20436
2	2	3	4	2	2.66144	-0.66144
3	4	5	4	5	5.05577	-0.05577
4	2	3	4	3	2.66144	0.33856
5	4	3	2	4	3.91068	0.08932
6	4	4	5	4	4.22309	-0.22309
7	2	3	4	3	2.66144	0.33856
8	4	5	4	5	5.05577	-0.05577
9	7	8	9	8	7.99695	0.00305
10	5	4	3	5	4.97778	0.02222