*** JACKKNIFE STATISTICS FOR MULTIPLE LINEAR REGRESSION ***

* For the whole set of equations used, see Set #1, #2 & #5 at http://www.anthony-vba.kefra.com/vba/vba9.htm *.

* Uses SPSS 14/Newer *.

DEFINE JACKMLR(y=!TOKENS(1) / x=!CMDEND).
* This step uses SPSS 14/Newer *.
DATASET NAME OriginalData.
PRESERVE.
SET MXLOOPS=10000. /* Should be at least equal to sample size *.
MATRIX.
PRINT /TITLE='************** JACKKNIFE MLR STATISTICS **************'.
* Input data (DV goes first) with listwise deletion *.
GET data /VAR=!y !x /MISSING=OMIT /NAME=vnames.
* Full sample statistics *.
COMPUTE n=NROW(data).
COMPUTE p=NCOL(data). /* Nr. of parameters (with constant) *.
COMPUTE x={MAKE(n,1,1),data(:,(2:p))}. /* Split data into "y" and "x" matrices *.
COMPUTE y=data(:,1).
COMPUTE meany=CSUM(y)/n.
COMPUTE b=GINV(x)*y. /* Note: GINV(x)=INV[T(x)*x]*T(x) *.
COMPUTE ESS=T(b)*T(x)*y-n*meany&**2. /* Effect Sum of Squares *.
COMPUTE TSS=T(Y)*y-n*meany&**2. /* Total Sum of Squares *.
COMPUTE sigma2=(TSS-ESS)/(n-p). /* Residual variance *.
COMPUTE SEb=SQRT(DIAG(sigma2*INV(T(x)*x))). /* SE of coefficients *.
COMPUTE tvalue=b/SEb.
COMPUTE tsig=2*(1-TCDF(ABS(tvalue),n-p)).
COMPUTE Rsquare=ESS/TSS.
COMPUTE Psquare=1-(1-rsquare)*(n-1)/(n-p).
COMPUTE Fvalue=(rsquare/(p-1))/((1-rsquare)/(n-p)).
COMPUTE Fsig=1-FCDF(Fvalue,(p-1),(n-p)).
* Reports *.
COMPUTE vnames={'Constant',vnames(2:p)}.
PRINT /TITLE='Full sample statistics'.
PRINT {b,SEb,tvalue,tsig}
 /FORMAT='F8.3'
 /RNAMES=vnames
 /CLABEL='Coeff.','SE','T','Sig.'
 /TITLE='Unstandardized coefficients'.
PRINT {rsquare,psquare,SQRT(sigma2)}
 /FORMAT='F8.3'
 /CLABEL='RSq.','AdjRSq.','Res(SD)'
 /TITLE='Model Summary'.
PRINT {Fvalue,Fsig}
 /FORMAT='F8.4'
 /CLABEL='F value','Sig.'
 /TITLE='R-square test (F value & Significance)'.
* JACKKNIFE STATISTICS *.
COMPUTE nj=n-1. /* New sample size *.
* Compute empty matrix to store JK statistics *.
COMPUTE JackStat=MAKE(n,(p+2),0).
* Cycle thru all values *.
LOOP i=1 TO n.
. DO IF (i EQ 1).
                  /* Extract JK sample for first case *.
.   COMPUTE sample=data(2:n,:).
. ELSE IF (i GT 1) AND (i LT n). /* JK samples for cases from 2 to n-1 *.
.   COMPUTE sample={data(1:(i-1),:); data((i+1):n,:)}.
. ELSE IF (i EQ n). /* Extract JK sample for last case *.
.   COMPUTE sample=data(1:(n-1),:).
. END IF.
. * Statistics for every JK sample *.
. COMPUTE x={MAKE(nj,1,1),sample(:,(2:p))}.
. COMPUTE y=sample(:,1).
. COMPUTE meany=CSUM(y)/nj.
. COMPUTE b=GINV(x)*y.
. COMPUTE ESS=T(b)*T(x)*y-nj*meany&**2.
. COMPUTE TSS=T(Y)*y-nj*meany&**2.
. COMPUTE Rsquare=ESS/TSS.
. COMPUTE Psquare=1-(1-rsquare)*(nj-1)/(nj-p).
. * Store all statistics in JackStat(i) *.
. COMPUTE JackStat(i,1) =Rsquare.
. COMPUTE JackStat(i,2) =Psquare.
. COMPUTE Jackstat(i,3:(p+2))=T(b).
END LOOP.
* Report first 10 values *.
COMPUTE AllNames={'R2','P2',vnames}.
COMPUTE CasesID={' 1',' 2',' 3',' 4',' 5',' 6',' 7',' 8',' 9','10'}.
PRINT {JackStat(1:10,:)}
 /FORMAT='F8.3'
 /CNAMES=AllNames
 /RNAMES=CasesID
 /TITLE='Jackknife statistics for rows 1-10'.
* Export JK statistics to an external file (reopened below as dataset Coefficients) *.
SAVE JackStat /OUTFILE='C:\Temp\Coefficients.sav' /NAMES=AllNames.
PRINT /TITLE='Jackknife statistics exported to Coefficients Dataset'.
END MATRIX.
RESTORE.
* This part uses SPSS 14/Newer *.
GET FILE='C:\Temp\Coefficients.sav'.
DATASET NAME Coefficients.
FORMAT ALL (F8.3).
VAR LABEL R2 'R Square' /P2 'Adj.R Square'.
FREQUENCIES VAR=ALL
 /FORMAT=NOTABLE
 /PERCENTILES= 2.5 25 50 75 97.5
 /STATISTICS=STDDEV MINIMUM MAXIMUM MEAN.
DATASET ACTIVATE OriginalData.
DATASET CLOSE Coefficients.
!ENDDEFINE.

* Sample dataset (50 random cases from Rosner's dataset FEV.sav) *.
DATA LIST FREE/fev(F8.3) age(F8.0) hgt(F8.1) gender smoke (2 F8.0).
BEGIN DATA
1.415  6 56.0 0 0
2.646 10 60.0 1 0
3.519 19 66.0 0 1
3.000  9 65.5 1 0
3.428 14 64.0 0 1
1.694 11 60.0 1 1
3.957 14 72.0 1 1
1.962  8 57.0 1 0
2.384 12 63.5 0 1
2.679 15 66.0 0 1
2.387 10 66.0 0 1
1.794  8 54.5 1 0
2.646 13 61.5 0 0
2.198 15 62.0 0 1
3.345 19 65.5 0 1
2.599 13 62.5 0 1
3.082 17 67.0 1 1
2.903 16 63.0 0 1
3.004 15 64.0 0 1
1.603  7 51.0 0 0
1.196  5 46.5 0 0
1.697  8 59.0 0 0
2.813 10 61.5 0 0
3.985 15 71.0 1 0
4.309 14 69.0 1 1
1.947  9 56.5 0 0
3.169 11 62.5 0 1
3.406 17 69.0 1 1
2.358 10 59.0 0 0
1.933  9 58.0 0 0
3.297 13 65.0 0 1
3.680 14 67.0 1 0
1.953  9 58.0 1 1
3.247 11 65.5 1 0
4.086 18 67.0 1 1
3.585 14 70.0 1 0
3.498 10 68.0 1 1
2.953 11 67.0 0 1
3.127 10 62.0 1 0
1.338  6 51.5 0 0
2.569 12 63.0 0 0
3.320 11 65.5 1 0
3.780 14 70.0 1 0
4.404 18 70.5 1 1
4.637 11 72.0 1 1
3.727 15 68.0 1 1
4.203 12 71.0 1 0
2.564  7 58.0 0 0
3.152 13 62.0 0 1
2.391 10 59.5 1 0
END DATA.

VAR LABEL fev 'FEV (liters)' /age 'Age (years)' /hgt 'Height (inches)'.
VALUE LABEL gender 0 'Female' 1 'Male' /smoke 0 'Non Smoker' 1 'Smoker'.
VAR LEV gender smoke (NOMINAL).

* Adding an interaction term to the model *.
COMPUTE smokehgt=smoke*hgt.
FORMAT smokehgt (F8).

REGRESSION
 /STATISTICS COEFF OUTS R ANOVA
 /DEPENDENT fev
 /METHOD=ENTER age hgt gender smoke smokehgt.

JACKMLR y=fev x=age TO smokehgt.
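
* Usage sketch (an illustration, not part of the original example): the same macro on a main-effects model *.
* Assumes the sample data above are still the active dataset and that C:\Temp exists and is writable *.
* The REGRESSION run is included only so the full-sample output of the macro can be checked against it *.
REGRESSION
 /STATISTICS COEFF OUTS R ANOVA
 /DEPENDENT fev
 /METHOD=ENTER age hgt gender smoke.

JACKMLR y=fev x=age hgt gender smoke.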