SPSS Data Statistical Analysis and Practice
Lecturer: Zhou Tao (周涛), Associate Professor, College of Resources, Beijing Normal University
2007-10-23
Course website: http://www.ires.cn/Courses/SPSS

Chapter 7: One-Way Analysis of Variance
Contents:
I. Theoretical basis of one-way ANOVA
II. SPSS one-way ANOVA examples

I. Theoretical Basis of One-Way ANOVA

One-Factor Analysis of Variance
Evaluates the difference among the means of 2 or more populations, e.g., several types of tires, or oven temperature settings.
Assumptions:
• Samples are randomly and independently drawn. (This condition must be met.)
• Populations are normally distributed. (The F test is robust to moderate departures from normality.)
• Populations have equal variances.

One-Factor ANOVA Test Hypothesis
H0: µ1 = µ2 = µ3 = ... = µc
• All population means are equal.
• No treatment effect (no variation in means among groups).
H1: not all the µk are equal
• At least one population mean is different (others may be the same).
• There is a treatment effect.
• H1 does NOT mean that all the means are different (µ1 ≠ µ2 ≠ ... ≠ µc).

One-Factor ANOVA: No Treatment Effect
H0: µ1 = µ2 = µ3 = ... = µc
H1: not all the µk are equal
The null hypothesis is true: µ1 = µ2 = µ3.

One-Factor ANOVA: Treatment Effect Present
H0: µ1 = µ2 = µ3 = ... = µc
H1: not all the µk are equal
The null hypothesis is NOT true, for example µ1 = µ2 ≠ µ3, or µ1 ≠ µ2 ≠ µ3.

One-Factor ANOVA: Partitions of Total Variation
Total Variation (SST) = Variation Due to Treatment (SSA) + Variation Due to Random Sampling (SSW)
• SSA is commonly referred to as Sum of Squares Among, Sum of Squares Between, Sum of Squares Model, or Among Groups Variation.
• SSW is commonly referred to as Sum of Squares Within, Sum of Squares Error, or Within Groups Variation.

Total Variation
$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$
$\bar{\bar{X}} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij}}{n}$ (the overall mean)
$X_{ij}$ = the ith observation in group j
$n_j$ = the number of observations in group j
$n$ = the total number of observations in all groups
$c$ = the number of groups

Among-Group Variation
$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$
$MSA = \frac{SSA}{c - 1}$
$n_j$ = the number of observations in group j
$c$ = the number of groups
$\bar{X}_j$ = the sample mean of group j
$\bar{\bar{X}}$ = the overall mean or grand mean
This is the variation due to differences among groups.

Within-Group Variation
$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$
$MSW = \frac{SSW}{n - c}$
$X_{ij}$ = the ith observation in group j
$\bar{X}_j$ = the sample mean of group j
SSW sums the variation within each group and then adds over all groups.

$MSW = \frac{SSW}{n - c} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2 + \cdots + (n_c - 1)S_c^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_c - 1)}$
• If there are more than 2 groups, use the F test.
• For 2 groups, use the t test; the F test is more limited.
• For c = 2, MSW is the pooled variance used in the t test.

One-Way ANOVA Summary Table

Source of Variation   Degrees of Freedom   Sum of Squares     Mean Square (Variance)   F Test Statistic
Among (Factor)        c - 1                SSA                MSA = SSA/(c - 1)        F = MSA/MSW
Within (Error)        n - c                SSW                MSW = SSW/(n - c)
Total                 n - 1                SST = SSA + SSW

One-Factor ANOVA F Test Example
As production manager, you want to see if 3 filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, 5 per machine, to the machines.
At the 0.05 level, is there a difference in mean filling times?

Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40

One-Factor ANOVA Example: Scatter Diagram
[Figure: scatter diagram of filling time in seconds by machine, marking the group means $\bar{X}_1 = 24.93$, $\bar{X}_2 = 22.61$, $\bar{X}_3 = 20.59$ and the grand mean $\bar{\bar{X}} = 22.71$.]

One-Factor ANOVA Example Computations
$\bar{X}_1 = 24.93$, $\bar{X}_2 = 22.61$, $\bar{X}_3 = 20.59$, $\bar{\bar{X}} = 22.71$
$n_j = 5$, $c = 3$, $n = 15$
$SSA = 5[(24.93 - 22.71)^2 + (22.61 - 22.71)^2 + (20.59 - 22.71)^2] = 47.164$
$SSW = 4.2592 + 3.112 + 3.682 = 11.0532$
$MSA = SSA/(c - 1) = 47.164/2 = 23.5820$
$MSW = SSW/(n - c) = 11.0532/12 = 0.9211$

Summary Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square (Variance)   F = MSA/MSW
Among (Machines)      3 - 1 = 2            47.1640          23.5820                  25.60
Within (Error)        15 - 3 = 12          11.0532          0.9211
Total                 15 - 1 = 14          58.2172

One-Factor ANOVA Example Solution
H0: µ1 = µ2 = µ3
H1: not all equal
α = 0.05, df1 = 2, df2 = 12
Critical value: 3.89
Test statistic: $F = MSA/MSW = 23.5820/0.9211 = 25.6$
Decision: Reject H0 at α = 0.05, because 25.6 > 3.89.
Conclusion: There is evidence that at least one µi differs from the rest.

Multiple Comparisons Procedure
When the computed value of the F statistic in single-factor ANOVA is not significant, the analysis is terminated because no differences among the µi's have been identified. But when H0 is rejected, the investigator will usually want to know which of the µi's are different from one another. There are a number of such procedures in the statistics literature.
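The filling-machine computation above can be retraced in a few lines. The following is a minimal sketch in plain Python, not SPSS output; the function name `one_way_anova` and the variable names are illustrative assumptions, and the data are the 15 filling times listed in the example.

```python
# Filling times in seconds from the example, one list per machine.
groups = {
    "Machine1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Machine2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Machine3": [20.00, 22.20, 19.75, 20.60, 20.40],
}


def one_way_anova(samples):
    """Return (SSA, SSW, MSA, MSW, F) for a list of samples (hypothetical helper)."""
    c = len(samples)                         # number of groups
    n = sum(len(s) for s in samples)         # total number of observations
    grand_mean = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    # Among-group variation: group size times squared distance to the grand mean.
    ssa = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-group variation: squared distance of each value to its group mean.
    ssw = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return ssa, ssw, msa, msw, msa / msw


ssa, ssw, msa, msw, f_stat = one_way_anova(list(groups.values()))
print(f"SSA={ssa:.4f}  SSW={ssw:.4f}  MSA={msa:.4f}  MSW={msw:.4f}  F={f_stat:.2f}")
```

The printed values should agree with the summary table up to rounding. The critical value 3.89 still has to be looked up in an F table or obtained from a statistics library, since the sketch does not compute F quantiles.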
Here we present one such procedure, which many statisticians recommend, for deciding for each i and j whether it is plausible that µi = µj.

The Tukey-Kramer Procedure
• Tells which population means are significantly different, e.g., µ1 = µ2 ≠ µ3.
• Post hoc (a posteriori) procedure: done after rejection of equal means in ANOVA.
• Allows pairwise comparisons: compare the absolute mean differences with the "critical range".
[Figure: density curves f(X) contrasting the case µ1 = µ2 = µ3 with two groups whose means may be significantly different.]

The Tukey-Kramer Procedure: Example
Using the filling times of Machine1, Machine2 and Machine3 (5 observations each):
1. Compute the absolute mean differences:
   $|\bar{X}_1 - \bar{X}_2| = |24.93 - 22.61| = 2.32$
   $|\bar{X}_1 - \bar{X}_3| = |24.93 - 20.59| = 4.34$
   $|\bar{X}_2 - \bar{X}_3| = |22.61 - 20.59| = 2.02$
2. Compute the critical range:
   $\text{Critical Range} = Q_{U(c,\,n-c)} \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 1.618$
   The value of $Q_{U(c,\,n-c)}$ is given in a table of the Studentized range distribution.
3. Each absolute mean difference is greater than the critical range, so there is a significant difference between each pair of means.

II. SPSS One-Way ANOVA Examples

SPSS One-Way ANOVA procedure
You can use the One-Way ANOVA procedure to test the hypothesis that the means of two or more groups are not significantly different. One-Way ANOVA also offers:
• Group-level statistics for the dependent variable
• A test of variance equality
• A plot of group means
• Range tests, pairwise multiple comparisons, and contrasts, to describe the nature of the group differences

SPSS Example
In response to customer requests, an electronics firm is developing a new DVD player. Using a prototype, the marketing team has collected focus group data. ANOVA is being used to discover if consumers of various ages rated the design differently.

Step 1: Visualize the differences between groups (Method 1: means plot)
1. Select Total DVD assessment as the dependent variable.
2. Select Age Group as the factor variable.
3. Click Options.
4. Click Means Plot.
[Figure: means plot of Total DVD assessment by Age Group.]

Step 1: Visualize the differences between groups (Method 2: error bar)
To create an error bar chart, from the menus choose: Graphs > Error Bar...
1. Select Total DVD assessment as the analysis variable.
2. Select Age Group as the category variable.
3. Select Standard error of mean from the Bars Represent drop-down list.
4. Set Multiplier = 1.
5. Click OK.
[Figure: error bar chart of Total DVD assessment by Age Group.]

Step 2: Test of homogeneity of variances
To test the equality of variance assumption, from the menus choose: Analyze > Compare Means > One-Way ANOVA...
1. Select Total DVD assessment as the dependent variable.
2. Select Age Group as the factor variable.
3. Click Options.
4. Select Descriptive and Homogeneity of variance test.
5. Click Continue.
6. Click OK in the One-Way ANOVA dialog box.

Test of Homogeneity of Variances (Total DVD assessment)
Levene Statistic   df1   df2   Sig.
.574               5     62    .720

The Levene test does not reject the null hypothesis that the group variances are equal (Sig. = .720).

Step 3: ANOVA

ANOVA (Total DVD assessment)
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   733.274          5    146.655       4.601   .001
Within Groups    1976.417         62   31.878
Total            2709.691         67

The significance value of the F test in the ANOVA table is 0.001. Thus, you must reject the hypothesis that average assessment scores are equal across age groups. Now that you know the groups differ in some way, you need to learn more about the structure of the differences.

Step 3: ANOVA, structure of the differences
The means plot helps you to check that the proper weights were given to the groups.
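As a quick arithmetic check of the ANOVA table for Total DVD assessment, the mean squares and the F ratio can be recomputed from the reported sums of squares and degrees of freedom. This is a minimal sketch; the helper name `f_ratio` is an illustrative assumption and the inputs are the rounded values printed in the SPSS output.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from sums of squares (hypothetical helper)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values copied from the SPSS ANOVA table for Total DVD assessment.
ms_b, ms_w, f_stat = f_ratio(733.274, 5, 1976.417, 62)
print(f"MS between={ms_b:.3f}  MS within={ms_w:.3f}  F={f_stat:.3f}")
```

The result should match the table's Mean Square and F columns up to rounding. The significance value (.001) comes from the F distribution with 5 and 62 degrees of freedom, which this sketch does not evaluate.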
Contrast Coefficients
           Age Group
Contrast   18-24   25-31   32-38   39-45   46-52   53-59
1          0       0       -1      1       0       0
2          .5      .5      0       0       -.5     -.5

Similarly, if the mean assessments of the under-32 age groups and over-45 age groups are equal, you would expect the sum of the first two groups to be equal to the sum of the last two groups, and the difference of these sums to be near 0.

Step 4: Contrasts

Contrast Tests (Total DVD assessment)
                                  Contrast   Value of Contrast   Std. Error   t       df       Sig. (2-tailed)
Assume equal variances            1          2.20                2.525        .871    62       .387
                                  2          2.61                1.633        1.596   62       .116
Does not assume equal variances   1          2.20                2.678        .821    17.209   .423
                                  2          2.61                1.594        1.635   42.280   .110

Note that the results are displayed in two panels: the first assumes that the variances of the groups are equal, and the second assumes that they are unequal. In this case, the variances of the groups are assumed equal, so we focus on the first panel.

The significance values for the tests of the first contrast are both larger than 0.10. This indicates that the age 39-45 group is not significantly more favorable toward the DVD player than the age 32-38 group. Likewise, the significance values for the tests of the second contrast are larger than 0.10, so the assessment scores of participants under 32 and over 45 do not differ significantly.

Step 5: Pairwise multiple comparisons
Contrasts are an efficient, powerful method for comparing exactly the groups that you want to compare, using whatever contrast weights you require. However, there are times when you do not have, or do not need, such specific comparisons.
The One-Way ANOVA procedure allows you to compare every group mean against every other, a method known as pairwise multiple comparisons.

Step 5: Pairwise multiple comparisons (Post Hoc)

Multiple Comparisons (Dependent Variable: Total DVD assessment, Tukey HSD)
(I) Age Group   (J) Age Group   Mean Difference (I-J)   Std. Error   Sig.    95% CI Lower Bound   95% CI Upper Bound
18-24           25-31           .840                    2.260        .999    -5.81                7.49
18-24           32-38           -3.877                  2.375        .581    -10.86               3.11
18-24           39-45           -6.077                  2.375        .123    -13.06               .91
18-24           46-52           2.673                   2.260        .843    -3.97                9.32
18-24           53-59           3.378                   2.313        .690    -3.42                10.18
25-31           18-24           -.840                   2.260        .999    -7.49                5.81
25-31           32-38           -4.717                  2.417        .382    -11.83               2.39
25-31           39-45           -6.917                  2.417        .061    -14.03               .19
25-31           46-52           1.833                   2.305        .967    -4.94                8.61
25-31           53-59           2.538                   2.357        .889    -4.39                9.47
32-38           18-24           3.877                   2.375        .581    -3.11                10.86
32-38           25-31           4.717                   2.417        .382    -2.39                11.83
32-38           39-45           -2.200                  2.525        .952    -9.63                5.23
32-38           46-52           6.550                   2.417        .088    -.56                 13.66
32-38           53-59           7.255*                  2.467        .050    .00                  14.51
39-45           18-24           6.077                   2.375        .123    -.91                 13.06
39-45           25-31           6.917                   2.417        .061    -.19                 14.03
39-45           32-38           2.200                   2.525        .952    -5.23                9.63
39-45           46-52           8.750*                  2.417        .008    1.64                 15.86
39-45           53-59           9.455*                  2.467        .004    2.20                 16.71
46-52           18-24           -2.673                  2.260        .843    -9.32                3.97
46-52           25-31           -1.833                  2.305        .967    -8.61                4.94
46-52           32-38           -6.550                  2.417        .088    -13.66               .56
46-52           39-45           -8.750*                 2.417        .008    -15.86               -1.64
46-52           53-59           .705                    2.357        1.000   -6.23                7.64
53-59           18-24           -3.378                  2.313        .690    -10.18               3.42
53-59           25-31           -2.538                  2.357        .889    -9.47                4.39
53-59           32-38           -7.255*                 2.467        .050    -14.51               .00
53-59           39-45           -9.455*                 2.467        .004    -16.71               -2.20
53-59           46-52           -.705                   2.357        1.000   -7.64                6.23
*. The mean difference is significant at the .05 level.

• This table lists the pairwise comparisons of the group means for all selected post hoc procedures.
• Mean Difference lists the differences between the sample means.
• Sig. lists the p-value of the test that the population mean difference is zero.
• A 95% confidence interval is constructed for each difference. If this interval contains zero, the two groups do not differ significantly at the .05 level.

Two Supplementary Notes on SPSS One-Way ANOVA

Supplementary note (1)
The ANOVA procedure works very well when the group variances are equal. It is also robust to violations of this assumption when the groups are of equal or nearly equal size (balanced sample sizes). So even if the Levene test rejects the null hypothesis that the group variances are equal, you can still use ANOVA provided the groups are of equal or nearly equal size.

SPSS One-Way ANOVA Example 2
A sales manager wishes to determine the optimal number of product training days needed for new employees. He has performance scores for three groups: employees with one, two, or three days of training. Each group has 20 cases.

Descriptive Statistics (Score on training exam)
Sales training group   N    Minimum   Maximum   Mean      Std. Deviation
1                      20   32.68     86.66     63.5798   13.50858
2                      20   47.56     89.65     73.5677   10.60901
3                      20   71.77     89.69     79.2792   4.40754

Test of Homogeneity of Variances (Score on training exam)
Levene Statistic   df1   df2   Sig.
4.637              2     57    .014

The Levene test rejects the null hypothesis that the group variances are equal. ANOVA is robust to this violation when the groups are of equal or nearly equal size.

Because the Levene test has already established that the variances across training groups are significantly different, we select a post hoc method from the list of tests that do not assume equal variances (here, Tamhane).

Multiple Comparisons (Dependent Variable: Score on training exam, Tamhane)
(I) Sales training group   (J) Sales training group   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
1                          2                          -9.98789*               3.84079      .040   -19.6053             -.3705
1                          3                          -15.69947*              3.17733      .000   -23.8792             -7.5198
2                          1                          9.98789*                3.84079      .040   .3705                19.6053
2                          3                          -5.71158                2.56883      .102   -12.2771             .8539
3                          1                          15.69947*               3.17733      .000   7.5198               23.8792
3                          2                          5.71158                 2.56883      .102   -.8539               12.2771
*. The mean difference is significant at the .05 level.

The group with one training day performed significantly lower than the other groups. Trainees with two and three days do not differ significantly in average performance.

Descriptives (Score on training exam)
Group   N    Mean      Std. Deviation   Std. Error   95% CI for Mean Lower Bound   95% CI for Mean Upper Bound   Minimum   Maximum
1       20   63.5798   13.50858         3.02061      57.2576                       69.9020                       32.68     86.66
2       20   73.5677   10.60901         2.37225      68.6025                       78.5328                       47.56     89.65
3       20   79.2792   4.40754          .98556       77.2165                       81.3420                       71.77     89.69
Total   60   72.1422   12.00312         1.54960      69.0415                       75.2430                       32.68     89.69

Although the means of group 2 and group 3 do not differ significantly, the manager may still consider the added benefit of the third training day, given the large decrease in variability.

Supplementary note (2)
Post hoc results are valid to the extent that the standard F statistic is robust to violations of assumptions. As mentioned before, the F statistic is robust to unequal variances when sample sizes are equal or nearly equal. However, when both the variances and the sample sizes differ, the standard F statistic lacks power and is prone to give incorrect results. This section discusses two analysis of variance methods available in the One-Way ANOVA procedure that provide an alternative in these circumstances.

SPSS One-Way ANOVA Example 3
The management of a local bank has received complaints about the amount of time that customers spend waiting in line at one of their facilities. In response, analysts have recorded wait times at that facility and two other area banks for comparison purposes.

Descriptive Statistics (Wait time in minutes)
Bank branch   N     Minimum   Maximum   Mean     Std. Deviation
Branch A      125   2.36      9.90      5.5218   1.36990
Branch B      90    3.12      7.38      5.2034   .99963
Branch C      100   2.67      7.43      5.1891   .99162

There is an unequal number of observations per branch. In this case, the analysts can use the One-Way ANOVA procedure to obtain robust F statistics. SPSS offers two types of robust analysis of variance:
1. Brown-Forsythe test statistic
2. Welch statistic
The Welch statistic is more powerful than the standard F or Brown-Forsythe statistics when sample sizes and variances are unequal.

Test of Homogeneity of Variances (Wait time in minutes)
Levene Statistic   df1   df2   Sig.
4.062              2     312   .018

The Levene test confirms the suspicion that the variances of the groups are different. So, in this example, both the variances and the sample sizes differ, which means that the standard F statistic lacks power and is prone to give incorrect results.

ANOVA (Wait time in minutes)
                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   8.017            2     4.009         2.985   .052
Within Groups    418.983          312   1.343
Total            427.000          314

The p value associated with the standard ANOVA F statistic is very close to .05. However, because the variances and the group sizes are unequal, we are uncertain whether to trust these results.

Robust Tests of Equality of Means (Wait time in minutes)
                 Statistic(a)   df1   df2       Sig.
Welch            2.622          2     206.327   .075
Brown-Forsythe   3.185          2     307.396   .043
a. Asymptotically F distributed.

Contrary to the finding for the standard F statistic, the Brown-Forsythe test statistic is significant below .05. As with the standard F statistic, the Welch statistic is not significant below .05.
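The Welch statistic reported for the bank example can be approximately reproduced from the descriptive statistics alone. The sketch below assumes the usual textbook form of Welch's one-way test (weights $w_j = n_j / s_j^2$); it is not SPSS's own code, the function name `welch_anova` is an illustrative assumption, and small differences from the SPSS output are expected because the inputs are rounded.

```python
from math import fsum


def welch_anova(ns, means, sds):
    """Welch's robust one-way test from summary statistics (hypothetical helper).

    Returns (F, df1, df2) using weights w_j = n_j / s_j**2.
    """
    k = len(ns)
    weights = [n / sd ** 2 for n, sd in zip(ns, sds)]
    w_total = fsum(weights)
    # Variance-weighted grand mean.
    weighted_mean = fsum(w * m for w, m in zip(weights, means)) / w_total
    numerator = fsum(w * (m - weighted_mean) ** 2
                     for w, m in zip(weights, means)) / (k - 1)
    # Correction term shared by the denominator and the second df.
    lam = fsum((1 - w / w_total) ** 2 / (n - 1) for w, n in zip(weights, ns))
    f_stat = numerator / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Branch A, B, C: N, mean and standard deviation of wait time in minutes.
f_stat, df1, df2 = welch_anova(
    ns=[125, 90, 100],
    means=[5.5218, 5.2034, 5.1891],
    sds=[1.36990, 0.99963, 0.99162],
)
print(f"Welch F={f_stat:.3f}  df1={df1}  df2={df2:.3f}")
```

The result should be close to the Welch row of the Robust Tests of Equality of Means table (statistic about 2.62 with df2 about 206.3). Because each group is weighted by the inverse of its own variance estimate, the test does not pool the variances the way the standard F statistic does.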
The Welch statistic is more powerful than the standard F or Brown-Forsythe statistics when sample sizes and variances are unequal.

SPSS One-Way ANOVA Example 3
The biasing effect of the outliers can be assessed by removing them and rerunning the ANOVA.

SPSS One-Way ANOVA Example 3: extreme values removed

Test of Homogeneity of Variances (Wait time in minutes)
Levene Statistic   df1   df2   Sig.
2.466              2     309   .087

ANOVA (Wait time in minutes)
                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   6.236            2     3.118         2.541   .080
Within Groups    379.212          309   1.227
Total            385.448          311

Robust Tests of Equality of Means (Wait time in minutes)
                 Statistic(a)   df1   df2       Sig.
Welch            2.290          2     203.951   .104
Brown-Forsythe   2.653          2     307.161   .072
a. Asymptotically F distributed.

Multiple Comparisons (Dependent Variable: Wait time in minutes)
Method      (I) Bank branch   (J) Bank branch   Mean Difference (I-J)   Std. Error   Sig.    95% CI Lower Bound   95% CI Upper Bound
Tukey HSD   Branch A          Branch B          .28197                  .15393       .161    -.0805               .6445
Tukey HSD   Branch A          Branch C          .29625                  .14944       .118    -.0557               .6482
Tukey HSD   Branch B          Branch A          -.28197                 .15393       .161    -.6445               .0805
Tukey HSD   Branch B          Branch C          .01427                  .16096       .996    -.3648               .3933
Tukey HSD   Branch C          Branch A          -.29625                 .14944       .118    -.6482               .0557
Tukey HSD   Branch C          Branch B          -.01427                 .16096       .996    -.3933               .3648
Tamhane     Branch A          Branch B          .28197                  .15547       .199    -.0923               .6562
Tamhane     Branch A          Branch C          .29625                  .15134       .147    -.0679               .6604
Tamhane     Branch B          Branch A          -.28197                 .15547       .199    -.6562               .0923
Tamhane     Branch B          Branch C          .01427                  .14469       1.000   -.3344               .3629
Tamhane     Branch C          Branch A          -.29625                 .15134       .147    -.6604               .0679
Tamhane     Branch C          Branch B          -.01427                 .14469       1.000   -.3629               .3344

Homogeneous Subsets (Wait time in minutes)
Method           Bank branch   N     Subset for alpha = .05: 1
Tukey HSD(a,b)   Branch C      100   5.1891
                 Branch B      90    5.2034
                 Branch A      122   5.4854
                 Sig.                .137
Tukey B(a,b)     Branch C      100   5.1891
                 Branch B      90    5.2034
                 Branch A      122   5.4854
Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 102.362.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.

Common Multiple Comparison Methods and a Strategy for Choosing Among Them

Common multiple comparison methods
• LSD: essentially a variant of the t test that uses the information from the whole sample when computing the variability and the degrees of freedom. It therefore still inflates the Type I error rate.
• Scheffe: the safer choice when the group sizes are unequal or when complex comparisons are wanted, but it is relatively conservative.
• S-N-K: the most widely used pairwise comparison method. It uses the Studentized range distribution to make all pairwise comparisons among the group means. When H0 is actually true, it keeps the overall α level equal to the nominal value, that is, it controls the Type I error.

Strategy for choosing a method
In general, the following guidelines can be used:
• If there is a clearly defined control group and the study is confirmatory, that is, the comparisons between particular groups (and the control group) were planned in advance, the Bonferroni (LSD) method is appropriate.
• If pairwise comparisons among several means are needed (an exploratory study) and the group sizes are equal, the Tukey method is appropriate.
• In other cases, the Scheffe and S-N-K methods are appropriate.
SPSS数据统计分析与实践主讲:周涛 副教授 北京师范大学资源学院 2007 - 10 - 23教学网站:http://www.ires.cn/Courses/SPSS1第七章 单因素方差分析 本章内容:一、单因素方差分析理论基础 二、SPSS单因素方差分析实例2一、单因素方差分析理论基础3One-Factor Analysis of VarianceEvaluate the Difference Among the Means of 2 or More Populationse.g., Several Types of Tires, Oven Temperature SettingsAssumptions:Samples are Randomly and Independently Drawn (This condition must be met.) Populations are Normally Distributed (F test is Robust to moderate departures from normality.) Populations have Equal Variances4One-Factor ANOVA Test HypothesisH0: µ1 = µ2 = µ3 = ... = µc•All population means are equal •No treatment effect (NO variation in means among groups)H1: not all the µk are equal•At least ONE population mean is different (Others may be the same!) •There is treatment effect Does NOT mean that all the means are different: µ1 ≠ µ2 ≠ ... ≠ µc5One-Factor ANOVA: No Treatment EffectH0: µ1 = µ2 = µ3 = ... = µc H1: not all the µk are equalThe Null Hypothesis is Trueµ1 = µ 2 = µ 36One Factor ANOVA: Treatment Effect PresentH0: µ1 = µ2 = µ3 = ... 
= µc H1: not all the µk are equalThe Null Hypothesis is NOT Trueµ1 = µ 2 ≠ µ3µ1 ≠ µ2 ≠ µ37One-Factor ANOVA Partitions of Total VariationTotal Variation SST=Variation Due to Treatment SSA+Variation Due to Random Sampling SSWCommonly referred to as: Sum of Squares Within, or Sum of Squares Error, or Within Groups Variation8Commonly referred to as: Sum of Squares Among, or Sum of Squares Between, or Sum of Squares Model, or Among Groups VariationTotal Variation (总变异)SST = ∑ ∑ ( X ij − X )j = 1i = 1Xij = the ith observation in group j nj = the number of observations in group j n = the total number of observations in all groups c = the number of groupsc njc nj2X =j = 1i = 1∑ ∑ X nijthe overall mean(总平均数)9Among-Group Variation(组间变异)SSA = ∑ n j ( X j − X )j =1c2SSA MSA = c −1nj = the number of observations in group j c = the number of groups _ Xj the sample mean of group j _ _ X the overall mean or grand mean(总平均数)µι µjVariation Due to Differences Among Groups.10Within-Group Variation (组内变异)SSW = ∑ ∑ ( X ij − X j )j = 1i = 1 c nj 2SSW MSW = n−cX ij =the ith observation in group j the sample mean of group jXj =Summing the variation within each group and then adding over all groups.µj11Within-Group Variation (组内变异)SSW MSW = n−c 2 2 2 ( n1 − 1 )S1 + ( n2 − 1 )S2 + • • • + ( nc − 1 )Sc = ( n1 − 1 ) + ( n2 − 1 ) + • • • + ( nc − 1 )•If more than 2 groups, use F Test. •For 2 groups, use t-Test. F Test more limited. For c = 2, this is the pooledvariance (合并方差) in the t-Test.µj12One-Way ANOVA(单因素方差分析) Summary TableSource of Degrees Sum of of Squares Variation Freedom Among (Factor) Within (Error) Total c-1 n-c n-1 SSA SSW SST = SSA+SSW13Mean F Test Square Statistic (Variance) MSA = MSA = MSW SSA/(c - 1) MSW = SSW/(n - c)One-Factor ANOVA F Test ExampleAs production manager, you want to see if 3 filling machines have different mean filling times. You assign 15 similarly trained & experienced workers, 5 per machine, to the machines. 
At the 0.05 level, is there a difference in mean filling times?Machine1 Machine2 Machine325.40 26.31 24.10 23.74 25.1023.40 21.80 23.50 22.75 21.6020.00 22.20 19.75 20.60 20.4014One-Factor ANOVA Example: Scatter DiagramMachine1 Machine2 Machine325.40 26.31 24.10 23.74 25.10 _ _23.40 21.80 23.50 22.75 21.6020.00 22.20 19.75 20.60 20.4027 26 25 24 23 22Time in Seconds• • _ • • •_XX = 24.93 X = 22.61 X = 20.59• • _ _ • X • •• _ _ x • • • •_ _ x21 20 19X = 22.7115One-Factor ANOVA Example ComputationsMachine1 Machine2 Machine3_25.40 26.31 24.10 23.74 25.1023.40 21.80 23.50 22.75 21.6020.00 22.20 19.75 20.60 20.40X1 = 24.93_nj =5 c=3 n = 15X2 = 22.61_X3 = 20.59_ _X = 22.71SSA = 5 [(24.93 - 22.71) 2+ (22.61 - 22.71)2 + (20.59 - 22.71) 2] = 47.164 SSW = 4.2592+3.112 +3.682 = 11.0532 MSA = SSA/(c-1) = 47.16/2 = 23.582016MSW = SSW/(n-c) = 11.0532/12 = .9211Summary TableSource of Degrees of Sum of Variation Freedom Squares Among (Machines) Within (Error) Total 3-1=2 47.1640 Mean Square (Variance) 23.5820 .9211 F=MSA MSW15 - 3 = 12 11.0532 15 - 1 = 14 58.2172= 25.6017One-Factor ANOVA Example SolutionH0 : µ 1 = µ 2 = µ 3 H1: Not All Equalα = .05 df1= 2 df2 = 12 Critical Value(s):Test Statistic:23.5820 = 25.6 F= = MSW .9211 MSADecision: Reject at α = 0.05 Conclusion: There is evidence that at least one µi differs from the rest.α = 0.0503.89F18Multiple comparisons(多重比较) procedureWhen the computed value of F statistic in single-factor ANOVA is not significant, the analysis is terminated because no differences among the µ i′s have been identified. But when H0 is rejected, the investigator will usually want to know which of the µ i′s are different from one another. There are a number of such procedures in the statistics literature. 
Here we present one that many statisticians recommend for deciding for each i and j where it is plausible that µ i = µ j19The Tukey-Kramer ProcedureTells Which Population Means Are Significantly Differente.g., µ1 = µ2 ≠ µ3f(X)Post Hoc (a posteriori) ProcedureDone after rejection of equal means in ANOVAµ1= µ2 = µ3XAbility for Pairwise Comparisons:Compare absolute mean differences with ‘critical range’2 groups whose means may be significantly different.20The Tukey-Kramer Procedure: ExampleMachine1 Machine2 Machine3 25.40 23.40 20.00 26.31 21.80 22.20 24.10 23.50 19.75 23.74 22.75 20.60 25.10 21.60 20.40 2. Compute Critical Range:1. Compute absolute mean differences:X 1 − X 2 = 24.93 − 22.61 = 2.32 X 1 − X 3 = 24.93 − 20.59 = 4.34 X 2 − X 3 = 22.61 − 20.59 = 2.02⎛ 1 QU(c,n-c) 1 ⎞ ⎜ ⎟ Critical Range = QU ( c ,n − c ) + = 1.618 are given in Table ⎜ nj n ' ⎟ j ⎠ ⎝ 3. Each of the absolute mean difference is greater. There is a significance difference between each pair of means. 21 MSW 2The value of二、SPSS单因素方差分析实例22SPSS One-Way ANOVA procedureYou can use the One-Way ANOVA procedure to test the hypothesis that the means of two or more groups are not significantly different. One-Way ANOVA also offers:Group-level statistics for the dependent variable A test of variance equality A plot of group means Range tests, pairwise multiple comparisons, and contrasts, to describe the nature of the group differences23SPSS实例In response to customer requests, an electronics firm is developing a new DVD player. Using a prototype, the marketing team has collected focus group data. ANOVA is being used to discover if consumers of various ages rated the design differently.24步骤一: 直观展示分组数据差异 (Method 1: means plot)1. Select Total DVD assessment as the dependent variable. 2. Select Age Group as the factor variable. 3. Click Options. 4. 
Click Means Plot25步骤一: 直观展示分组数据差异 (Method 1: means plot)26步骤一:直观展示分组数据差异 (Method 2: Error Bar)To create an error bar chart, from the menus choose: Graphs Error Bar...27步骤一:直观展示分组数据差异 (Method 2: Error Bar)1.2.3.4. 5.Select Total DVD assessment as the analysis variable. Select Age Group as the category variable. Select Standard error of mean from the Bars Represent drop-down list Set Multiplier = 1. Click OK.28步骤一:直观展示分组数据差异 (Method 2: Error Bar)29步骤二:方差齐性检验To test the equality of variance assumption, from the menus choose:Analyze Compare Means One-Way ANOVA...30步骤二:方差齐性检验Select Total DVD assessment as the dependent variable. Select Age Group as the factor variable. Click Options.Click here!31步骤二:方差齐性检验Select Descriptive and Homogeneity of variance test. Click Continue Click OK in the OneWay ANOVA dialog box.32步骤二:方差齐性检验Test of Homogeneity of Variances Total DVD assessment Levene Statistic .574 df1 5 df2 62 Sig. .720Oh! I like it!The Levene statistic does not reject the null hypothesis that the group variances are equal.33步骤三:ANOVA分析ANOVA Total DVD assessment Sum of Squares 733.274 1976.417 2709.691 df 5 62 67 Mean Square 146.655 31.878 F 4.601 Sig. .001Between Groups Within Groups TotalThe significance value of the F test in the ANOVA table is 0.001. Thus, you must reject the hypothesis that average assessment scores are equal across age groups. Now that you know the groups differ in some way, you need to learn more about the structure of the differences.34步骤三:ANOVA分析, structure of the differencesThe means plot helps you to to check that the proper weights were given to the groups. 
Contrast CoefficientsContrast 1 2 18-24 0 .5 25-31 0 .5 Age Group 32-38 39-45 -1 1 0 0 46-52 0 -.5 53-59 0 -.5Similarly, if the mean assessments of the under-32 age groups and over-45 age groups are equal, you would expect the sum of the first two groups to be equal to the sum of the last two groups, and the difference of these sums to be near 0.42步骤四:对比分析(Contrasts)Contrast Tests Contrast 1 2 1 2 Value of Contrast 2.20 2.61 2.20 2.61 Std. Error 2.525 1.633 2.678 1.594 t .871 1.596 .821 1.635 df 62 62 17.209 42.280 Sig. (2-tailed) .387 .116 .423 .110 Total DVD assessment Assume equal variances Does not assume equal variancesNote that the results are displayed in two panels: the first assumes that the variances of the groups are equal, and the second assumes that they are unequal. In this case, the variances of the groups are assumed equal, so we focus on the first panel.43步骤四:对比分析(Contrasts)Contrast Tests Contrast 1 2 1 2 Value of Contrast 2.20 2.61 2.20 2.61 Std. Error 2.525 1.633 2.678 1.594 t .871 1.596 .821 1.635 df 62 62 17.209 42.280 Sig. (2-tailed) .387 .116 .423 .110 Total DVD assessment Assume equal variances Does not assume equal variancesThe significance values for the tests of the first contrast are both larger than 0.10. This indicates that the age 39-45 group is not significantly more favorable toward the DVD player than the age 32-38 group. Likewise, the significance values for the tests of the second contrast are larger than 0.10. Participants under 32 and over 45 have statistically equivalent assessment scores. 44步骤五:多重比较 (pairwise multiple comparisons)Contrasts are an efficient, powerful method for comparing exactly the groups that you want to compare, using whatever contrast weights that you require. However, there are times when you do not have, or do not need, such specific comparisons. 
The One-Way ANOVA procedure allows you to compare every group mean against every other, a method known as pairwise multiple comparisons.

Step 5: Multiple comparisons (Post Hoc), pairwise multiple comparisons

Multiple Comparisons
Dependent Variable: Total DVD assessment
Tukey HSD

| (I) Age Group | (J) Age Group | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| 18-24 | 25-31 | .840 | 2.260 | .999 | -5.81 | 7.49 |
| 18-24 | 32-38 | -3.877 | 2.375 | .581 | -10.86 | 3.11 |
| 18-24 | 39-45 | -6.077 | 2.375 | .123 | -13.06 | .91 |
| 18-24 | 46-52 | 2.673 | 2.260 | .843 | -3.97 | 9.32 |
| 18-24 | 53-59 | 3.378 | 2.313 | .690 | -3.42 | 10.18 |
| 25-31 | 18-24 | -.840 | 2.260 | .999 | -7.49 | 5.81 |
| 25-31 | 32-38 | -4.717 | 2.417 | .382 | -11.83 | 2.39 |
| 25-31 | 39-45 | -6.917 | 2.417 | .061 | -14.03 | .19 |
| 25-31 | 46-52 | 1.833 | 2.305 | .967 | -4.94 | 8.61 |
| 25-31 | 53-59 | 2.538 | 2.357 | .889 | -4.39 | 9.47 |
| 32-38 | 18-24 | 3.877 | 2.375 | .581 | -3.11 | 10.86 |
| 32-38 | 25-31 | 4.717 | 2.417 | .382 | -2.39 | 11.83 |
| 32-38 | 39-45 | -2.200 | 2.525 | .952 | -9.63 | 5.23 |
| 32-38 | 46-52 | 6.550 | 2.417 | .088 | -.56 | 13.66 |
| 32-38 | 53-59 | 7.255* | 2.467 | .050 | .00 | 14.51 |
| 39-45 | 18-24 | 6.077 | 2.375 | .123 | -.91 | 13.06 |
| 39-45 | 25-31 | 6.917 | 2.417 | .061 | -.19 | 14.03 |
| 39-45 | 32-38 | 2.200 | 2.525 | .952 | -5.23 | 9.63 |
| 39-45 | 46-52 | 8.750* | 2.417 | .008 | 1.64 | 15.86 |
| 39-45 | 53-59 | 9.455* | 2.467 | .004 | 2.20 | 16.71 |
| 46-52 | 18-24 | -2.673 | 2.260 | .843 | -9.32 | 3.97 |
| 46-52 | 25-31 | -1.833 | 2.305 | .967 | -8.61 | 4.94 |
| 46-52 | 32-38 | -6.550 | 2.417 | .088 | -13.66 | .56 |
| 46-52 | 39-45 | -8.750* | 2.417 | .008 | -15.86 | -1.64 |
| 46-52 | 53-59 | .705 | 2.357 | 1.000 | -6.23 | 7.64 |
| 53-59 | 18-24 | -3.378 | 2.313 | .690 | -10.18 | 3.42 |
| 53-59 | 25-31 | -2.538 | 2.357 | .889 | -9.47 | 4.39 |
| 53-59 | 32-38 | -7.255* | 2.467 | .050 | -14.51 | .00 |
| 53-59 | 39-45 | -9.455* | 2.467 | .004 | -16.71 | -2.20 |
| 53-59 | 46-52 | -.705 | 2.357 | 1.000 | -7.64 | 6.23 |

*. The mean difference is significant at the .05 level.

• This table lists the pairwise comparisons of the group means for all selected post hoc procedures.
• Mean Difference lists the differences between the sample means.
• Sig. lists the p value for the test of the null hypothesis that the population mean difference is zero.
• A 95% confidence interval is constructed for each difference. If this interval contains zero, the two groups do not differ significantly at the .05 level.

Two supplementary notes on one-way ANOVA in SPSS

Supplementary note (1) on one-way ANOVA in SPSS

The ANOVA procedure works best when the group variances are equal. However, ANOVA is robust to violations of this assumption when the groups are of equal or nearly equal size (balanced sample sizes). So even if the Levene statistic rejects the null hypothesis that the group variances are equal, you can still use ANOVA when the groups are of equal or nearly equal size.

SPSS one-way ANOVA: Example 2

A sales manager wishes to determine the optimal number of product training days needed for new employees. He has performance scores for three groups: employees with one, two, or three days of training. Each group has 20 cases.

Descriptive Statistics
Score on training exam (Valid N (listwise) = 20 in each group)

| Sales training group | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| 1 | 20 | 32.68 | 86.66 | 63.5798 | 13.50858 |
| 2 | 20 | 47.56 | 89.65 | 73.5677 | 10.60901 |
| 3 | 20 | 71.77 | 89.69 | 79.2792 | 4.40754 |

Test of Homogeneity of Variances
Score on training exam

| Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|
| 4.637 | 2 | 57 | .014 |

The Levene statistic rejects the null hypothesis that the group variances are equal. ANOVA is robust to this violation when the groups are of equal or nearly equal size, as they are here.

Because the Levene test has already established that the variances across training groups are significantly different, we select a post hoc test from the Equal Variances Not Assumed list in the Post Hoc dialog (Tamhane is used here).

Multiple Comparisons
Dependent Variable: Score on training exam
Tamhane

| (I) Sales training group | (J) Sales training group | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| 1 | 2 | -9.98789* | 3.84079 | .040 | -19.6053 | -.3705 |
| 1 | 3 | -15.69947* | 3.17733 | .000 | -23.8792 | -7.5198 |
| 2 | 1 | 9.98789* | 3.84079 | .040 | .3705 | 19.6053 |
| 2 | 3 | -5.71158 | 2.56883 | .102 | -12.2771 | .8539 |
| 3 | 1 | 15.69947* | 3.17733 | .000 | 7.5198 | 23.8792 |
| 3 | 2 | 5.71158 | 2.56883 | .102 | -.8539 | 12.2771 |

*. The mean difference is significant at the .05 level.

• The group with one training day scored significantly lower than the other two groups.
• Trainees with two and three days of training do not differ significantly in average performance.

Descriptives
Score on training exam

| Group | N | Mean | Std. Deviation | Std. Error | 95% CI for Mean Lower Bound | 95% CI for Mean Upper Bound | Minimum | Maximum |
|---|---|---|---|---|---|---|---|---|
| 1 | 20 | 63.5798 | 13.50858 | 3.02061 | 57.2576 | 69.9020 | 32.68 | 86.66 |
| 2 | 20 | 73.5677 | 10.60901 | 2.37225 | 68.6025 | 78.5328 | 47.56 | 89.65 |
| 3 | 20 | 79.2792 | 4.40754 | .98556 | 77.2165 | 81.3420 | 71.77 | 89.69 |
| Total | 60 | 72.1422 | 12.00312 | 1.54960 | 69.0415 | 75.2430 | 32.68 | 89.69 |

Although the means of group 2 and group 3 do not differ significantly, the manager may still consider the added benefit of the third training day, given the large decrease in variability (the standard deviation falls from 10.61 to 4.41).

Supplementary note (2) on one-way ANOVA in SPSS

Post hoc results are valid to the extent that the standard F statistic is robust to violations of assumptions. As mentioned before, the F statistic is robust to unequal variances when sample sizes are equal or nearly equal. However, when both the variances and the sample sizes differ, the standard F statistic lacks power and is prone to give incorrect results. This section discusses two analysis of variance methods available in the One-Way ANOVA procedure that provide an alternative in these circumstances.

SPSS one-way ANOVA: Example 3

The management of a local bank has received complaints about the amount of time that customers spend waiting in line at one of their facilities. In response, analysts have recorded wait times at that facility and two other area banks for comparison purposes.

Descriptive Statistics
Wait time in minutes (Valid N (listwise) equals N in each branch)

| Bank branch | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| Branch A | 125 | 2.36 | 9.90 | 5.5218 | 1.36990 |
| Branch B | 90 | 3.12 | 7.38 | 5.2034 | .99963 |
| Branch C | 100 | 2.67 | 7.43 | 5.1891 | .99162 |

The number of observations differs across branches. In this case, the analysts can use the One-Way ANOVA procedure to obtain robust F statistics. SPSS offers two types of robust analysis of variance:
1. Brown-Forsythe test statistic
2. Welch statistic
The Welch statistic is more powerful than the standard F or Brown-Forsythe statistics when sample sizes and variances are unequal.

Test of Homogeneity of Variances
Wait time in minutes

| Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|
| 4.062 | 2 | 312 | .018 |

Robust Tests of Equality of Means
Wait time in minutes

| Test | Statistic (a) | df1 | df2 | Sig. |
|---|---|---|---|---|
| Welch | 2.622 | 2 | 206.327 | .075 |
| Brown-Forsythe | 3.185 | 2 | 307.396 | .043 |

a. Asymptotically F distributed.

The Levene test confirms the suspicion that the variances of the groups are different. So, in this example, both the variances and the sample sizes differ, which means that the standard F statistic lacks power and is prone to give incorrect results.

ANOVA
Wait time in minutes

| Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 8.017 | 2 | 4.009 | 2.985 | .052 |
| Within Groups | 418.983 | 312 | 1.343 | | |
| Total | 427.000 | 314 | | | |

The p value associated with the standard ANOVA F statistic is very close to .05. However, because the variances and the group sizes are unequal, we are uncertain whether to trust this result.

Contrary to the finding for the standard F statistic, the Brown-Forsythe test statistic is significant at the .05 level (Sig. = .043). As with the standard F statistic, the Welch statistic is not significant at the .05 level (Sig. = .075).
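The three F statistics reported for the bank wait-time example can be checked by hand from the descriptive statistics alone. The following is a minimal sketch in plain Python, not a description of how SPSS computes them internally: it assumes the textbook formulas for the standard, Welch, and Brown-Forsythe statistics, and all variable names are illustrative. It stops at the statistics and the denominator degrees of freedom; the Sig. values would additionally require the F distribution.

```python
# Summary statistics copied from the Descriptive Statistics table
# for Branch A, Branch B and Branch C (wait time in minutes).
n = [125, 90, 100]
mean = [5.5218, 5.2034, 5.1891]
sd = [1.36990, 0.99963, 0.99162]

k = len(n)                      # number of groups
N = sum(n)                      # total number of observations
var = [s * s for s in sd]       # group variances
grand_mean = sum(nj * m for nj, m in zip(n, mean)) / N

# Standard F: among-group mean square over pooled within-group mean square.
ssa = sum(nj * (m - grand_mean) ** 2 for nj, m in zip(n, mean))
ssw = sum((nj - 1) * v for nj, v in zip(n, var))
f_standard = (ssa / (k - 1)) / (ssw / (N - k))

# Welch: each group is weighted by n_j / s_j^2, so noisy groups count less.
w = [nj / v for nj, v in zip(n, var)]
w_total = sum(w)
mean_w = sum(wj * m for wj, m in zip(w, mean)) / w_total
lam = sum((1 - wj / w_total) ** 2 / (nj - 1) for wj, nj in zip(w, n))
welch_num = sum(wj * (m - mean_w) ** 2 for wj, m in zip(w, mean)) / (k - 1)
f_welch = welch_num / (1 + 2 * (k - 2) / (k * k - 1) * lam)
df2_welch = (k * k - 1) / (3 * lam)

# Brown-Forsythe: same numerator as SSA, but the denominator weights
# each group variance by (1 - n_j / N) instead of pooling.
bf_terms = [(1 - nj / N) * v for nj, v in zip(n, var)]
bf_denom = sum(bf_terms)
f_bf = ssa / bf_denom
c = [t / bf_denom for t in bf_terms]
df2_bf = 1 / sum(cj ** 2 / (nj - 1) for cj, nj in zip(c, n))

# SPSS reports F = 2.985, Welch = 2.622 (df2 = 206.327)
# and Brown-Forsythe = 3.185 (df2 = 307.396) for these data.
print(f"standard F     = {f_standard:.3f}")
print(f"Welch          = {f_welch:.3f}, df2 = {df2_welch:.3f}")
print(f"Brown-Forsythe = {f_bf:.3f}, df2 = {df2_bf:.3f}")
```

Because the inputs are rounded table values, the results should agree with the SPSS output only up to rounding in the last digit.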
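The Welch statistic is more powerful than the standard F or Brown-Forsythe statistics when sample sizes and variances are unequal.

The biasing effect of the outliers can be assessed by removing them and rerunning the ANOVA.

SPSS one-way ANOVA: Example 3, with outliers removed

Test of Homogeneity of Variances
Wait time in minutes

| Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|
| 2.466 | 2 | 309 | .087 |

ANOVA
Wait time in minutes

| Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 6.236 | 2 | 3.118 | 2.541 | .080 |
| Within Groups | 379.212 | 309 | 1.227 | | |
| Total | 385.448 | 311 | | | |

Robust Tests of Equality of Means
Wait time in minutes

| Test | Statistic (a) | df1 | df2 | Sig. |
|---|---|---|---|---|
| Welch | 2.290 | 2 | 203.951 | .104 |
| Brown-Forsythe | 2.653 | 2 | 307.161 | .072 |

a. Asymptotically F distributed.

Multiple Comparisons
Dependent Variable: Wait time in minutes

| Method | (I) Bank branch | (J) Bank branch | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| Tukey HSD | Branch A | Branch B | .28197 | .15393 | .161 | -.0805 | .6445 |
| Tukey HSD | Branch A | Branch C | .29625 | .14944 | .118 | -.0557 | .6482 |
| Tukey HSD | Branch B | Branch A | -.28197 | .15393 | .161 | -.6445 | .0805 |
| Tukey HSD | Branch B | Branch C | .01427 | .16096 | .996 | -.3648 | .3933 |
| Tukey HSD | Branch C | Branch A | -.29625 | .14944 | .118 | -.6482 | .0557 |
| Tukey HSD | Branch C | Branch B | -.01427 | .16096 | .996 | -.3933 | .3648 |
| Tamhane | Branch A | Branch B | .28197 | .15547 | .199 | -.0923 | .6562 |
| Tamhane | Branch A | Branch C | .29625 | .15134 | .147 | -.0679 | .6604 |
| Tamhane | Branch B | Branch A | -.28197 | .15547 | .199 | -.6562 | .0923 |
| Tamhane | Branch B | Branch C | .01427 | .14469 | 1.000 | -.3344 | .3629 |
| Tamhane | Branch C | Branch A | -.29625 | .15134 | .147 | -.6604 | .0679 |
| Tamhane | Branch C | Branch B | -.01427 | .14469 | 1.000 | -.3629 | .3344 |

Homogeneous Subsets
Wait time in minutes

| Method | Bank branch | N | Subset for alpha = .05: 1 |
|---|---|---|---|
| Tukey HSD (a,b) | Branch C | 100 | 5.1891 |
| Tukey HSD (a,b) | Branch B | 90 | 5.2034 |
| Tukey HSD (a,b) | Branch A | 122 | 5.4854 |
| Tukey HSD (a,b) | Sig. | | .137 |
| Tukey B (a,b) | Branch C | 100 | 5.1891 |
| Tukey B (a,b) | Branch B | 90 | 5.2034 |
| Tukey B (a,b) | Branch A | 122 | 5.4854 |

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 102.362.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.

Common multiple comparison methods and a strategy for choosing among them

Common multiple comparison methods

• LSD: essentially a variant of the t test. It differs only in using information from the whole sample to compute the variability and the degrees of freedom, so it still inflates the Type I error rate.
• Scheffe: the safer choice when the numbers of cases per level are unequal or when complex comparisons are wanted. It is, however, relatively conservative.
• S-N-K: the most widely used pairwise comparison method. It uses the Studentized range distribution to make pairwise comparisons among all group means. It ensures that, when H0 is actually true, the overall α level equals the nominal value, that is, it controls the Type I error.

Strategy for choosing a method

The following criteria can generally be used:
• If there is a clearly defined control group and the study is confirmatory, that is, it consists of planned comparisons between two or a few specific groups (and the control group), the Bonferroni (LSD) method is appropriate.
• If pairwise comparisons among many means are needed (exploratory research) and the group sizes are equal, the Tukey method is appropriate.
• In other cases, the Scheffe and S-N-K methods are appropriate.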
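To see why unadjusted pairwise tests such as LSD inflate the Type I error, it helps to count the comparisons. The sketch below is illustrative only: the function name is hypothetical, and it makes the simplifying assumption that the pairwise tests are independent, which is not strictly true when comparisons share groups, so the familywise rate it returns is an approximation. It also shows the per-comparison level that a Bonferroni adjustment would use.

```python
from math import comb


def familywise_error_rate(groups: int, alpha: float = 0.05):
    """Return (number of pairwise comparisons, approximate familywise
    Type I error rate, Bonferroni-adjusted per-comparison alpha).

    Assumes every comparison is tested at the same unadjusted alpha and
    that the comparisons are independent (a simplification).
    """
    m = comb(groups, 2)                  # c(c - 1) / 2 pairs of groups
    fwer = 1 - (1 - alpha) ** m          # P(at least one false rejection)
    return m, fwer, alpha / m            # Bonferroni: split alpha over m tests


# 3 groups as in the training and bank examples, 6 groups as in the
# age-group example.
for c in (3, 6):
    m, fwer, alpha_b = familywise_error_rate(c)
    print(f"{c} groups: {m} comparisons, "
          f"familywise rate ~ {fwer:.3f}, Bonferroni alpha = {alpha_b:.4f}")
```

With six groups there are 15 pairwise comparisons, so under these assumptions the chance of at least one false rejection is far above the nominal .05, which is the problem that Tukey, Scheffe, S-N-K and Bonferroni address in different ways.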