Linguistics Workshop 2

Presentation Transcript

Outline: 

1. The sampling distribution of the mean
2. Decision rules – α
3. Hypothesis tests
4. Tests of hypothesis about a single mean
5. Tests of hypothesis about the difference between two means
   a. Large samples, independent groups
   b. Small samples, independent groups
   c. Large samples, dependent pairs
   d. Small samples, dependent pairs

Outline: 

6. Tests of hypothesis about more than two means – Anova
   a. One variable, between groups
   b. One variable, within groups
   c. Factorial designs
7. Tests of hypotheses about distributions
8. Tests of hypothesis about a correlation
9. Linear regression
   a. The regression line
   b. Tests of hypothesis about the slope

1. The sampling distribution of the mean: 

The standard normal distribution tells us the probabilities relating to individual cases: for example, the probability of getting an SAT score higher than 700, or an IQ score lower than 85. Sometimes we want to ask a question about what is true of a set of cases, that is, a sample. We can use the standard normal distribution for this, too.

1. The sampling distribution of the mean: 

Suppose we want to know whether the number of trick-or-treaters seen on Halloween at the average home has changed. Suppose, too, that historically the average number has been 25, with a standard deviation of 5.

μ = 25, σ = 5 (these are population values)

1. The sampling distribution of the mean: 

Next Tuesday, we poll 40 homes selected at random and ask how many trick-or-treaters they saw on Monday night. The average number per home is 24.

$\bar{X} = 24$ (this is a sample value)

Our question is, has the population value changed?

1. The sampling distribution of the mean: 

There are two possibilities:

1. The population value for the average number of trick-or-treaters seen per home has actually decreased for some reason.
2. The population value has not changed. We randomly selected some homes that got fewer trick-or-treaters than is typical.

Either one could be true. Which one is true?

Slide7: 

Suppose nothing has changed. What is the probability that when we draw a sample of 40 houses we find a sample mean of 24 or fewer kids seen on Halloween? If that probability is very low, the hypothesis that “nothing has changed” doesn’t look so good.

Slide8: 

We answer this question by computing the Z score for a sample mean of 24:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{24 - 25}{5 / \sqrt{40}} = -1.265$$

$$P(\bar{X} \le 24) = .50 - .3971 = .1029$$
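The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function names are illustrative and not part of the workshop materials. Z comes out negative because the sample mean lies below μ, and the lower-tail probability is about .103, which agrees with the table value up to rounding.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_for_sample_mean(sample_mean, mu, sigma, n):
    """Z score of a sample mean under the sampling distribution of the mean."""
    standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)
    return (sample_mean - mu) / standard_error

# Halloween example: mu = 25, sigma = 5, n = 40, sample mean = 24
z = z_for_sample_mean(sample_mean=24, mu=25, sigma=5, n=40)
p_lower = normal_cdf(z)  # P(sample mean <= 24) if mu is still 25
print(f"Z = {z:.3f}, P = {p_lower:.4f}")
```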

1. The sampling distribution of the mean: 

This distribution tells us how likely we are to find a sample mean in a given range (when the sample was randomly selected) if the population mean is what we think it is. If μ = 25 is true, there is only a .1029 probability that one random sample of size 40 would have a mean of $\bar{X} = 24$ or less; 89.71% of such sample means will be > 24. So, is μ = 25 true?

Decision Rules – α: 

Though the probability of getting $\bar{X} \le 24$ is small, it is not zero. We might just have been unlucky in selecting our sample, so we might be making a mistake if we reject H0. We therefore set a standard: before rejecting H0, we insist that the sample mean be so different from μ that the probability of making a mistake is < α (which we specify before we begin our study).

Decision Rules – α: 

Our sample mean is some distance L from the population mean specified in the null hypothesis. P is the probability of finding a sample mean that far away or further if H0 is true. If P < α, we reject the null hypothesis and conclude that the population mean has changed from its historical value.

Slide12: 

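[Figure: sampling distribution of $\bar{X}$ centered at 0, with the distance L marked. Shaded tail area (corresponding to α): we would be surprised to find $\bar{X}$ in this range when we draw just one sample. Unshaded area: we wouldn’t be surprised to find $\bar{X}$ in this range.]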

2. Decision Rules – α: 

A decision rule tells us how unlikely a sample mean has to be (under the null hypothesis) before we reject the null hypothesis. α is the probability of rejecting H0 when we should not. Computer packages now allow us to specify an exact probability (e.g., P < .003). In such cases, one significant digit should suffice (that is, one non-zero digit after the string of zeros).

2. Decision Rules – α: 

Usually, we set α for a study and then report whether the P value for our statistical test is less than or greater than α.

Conventional levels: P < .05 or P < .01

If we make α << .05, we inflate the probability, β, of not rejecting H0 when we should.

3. Hypothesis tests: 

A hypothesis test always tests the null hypothesis, the hypothesis of no effect (or no difference). For example:

- The number of children seen at the average house on Halloween this year is not different from the historical value.
- The number of verbs in the subjunctive mood used by L2 speakers in ten minutes of conversation is not different from the number used by L1 speakers.

3. Hypothesis tests: 

The null hypothesis is always assessed against an alternative hypothesis, HA, which can be two-tailed or one-tailed.

- Two-tailed: states that the population mean is no longer the value μ0 specified in H0, but does not say whether it is larger or smaller.
- One-tailed: specifies that the population mean is now larger (or smaller) than μ0.

3. Hypothesis tests: 

With a one-tailed HA, it is easier to reject H0, because the critical value of Z or t is smaller: the sample mean does not have to be as far away from μ0 for us to reject H0. If you want to use a one-tailed HA, you have to have a good theoretical reason, e.g., L2 speakers make more pronunciation errors than L1 speakers.

Slide19: 

For α = .05 in a two-tailed test, we put 5% of the distribution in the two “tails,” 2.5% in each. If H0 is true, 95% of random samples should fall in the central area. So if we take just one sample, it really should be there if H0 is true.

[Figure: normal curve with 2.5% in each tail and 47.5% on either side of the mean.]

Slide20: 

For α = .05 in a one-tailed test, we put 5% of the distribution in one “tail.” If H0 is true, only 5% of random samples should fall below the dotted line. So if we take just one sample, it really should be in the 95% above that tail if H0 is true.

[Figure: normal curve with 5% in the lower tail below a dotted line, 45% between the line and the mean, and 50% above the mean.]

Slide21: 

The same logic works for the upper tail. Whether you make a lower-tail or an upper-tail prediction is a theoretical issue: what makes sense given the problem you’re working on?

[Figure: normal curve with 5% in the upper tail, 45% between the mean and the cutoff, and 50% below the mean.]

3. Hypothesis Tests – Formal Statement: 

H0: μ = μ0 (this is the historical population mean)

One-tailed test: HA: μ < μ0 or HA: μ > μ0
Two-tailed test: HA: μ ≠ μ0

Test statistic:

$$Z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$$

where $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ is the standard error of the mean.

3. Hypothesis Tests – Formal Statement: 

Rejection region:

One-tailed test: Zobt > Zα (or Zobt < -Zα)
Two-tailed test: |Zobt| > Zα/2

The decision should be reported explicitly. (Did you reject H0 or not reject H0?) You never “accept” H0; you only fail to reject it.
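The rejection regions above can be expressed as a small decision function. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the function name and the `tail` labels are hypothetical and chosen only for illustration. It looks up the critical value instead of reading it from a table, then applies the one-tailed or two-tailed rule.

```python
from statistics import NormalDist

def z_test_decision(z_obt, alpha=0.05, tail="two"):
    """Return (critical value, reject?) for a Z test.

    tail is "two", "upper", or "lower" (illustrative labels).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)  # Z at alpha/2
        return z_crit, abs(z_obt) > z_crit
    z_crit = std_normal.inv_cdf(1 - alpha)  # Z at alpha
    if tail == "upper":
        return z_crit, z_obt > z_crit
    if tail == "lower":
        return z_crit, z_obt < -z_crit
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

z_obt = -1.265  # the Halloween example from earlier in the workshop
for tail in ("two", "lower"):
    z_crit, reject = z_test_decision(z_obt, alpha=0.05, tail=tail)
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"{tail}-tailed: critical Z = {z_crit:.3f}, {decision}")
```

With α = .05 the Halloween sample does not fall in the rejection region under either alternative, so the explicit decision is "fail to reject H0".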

We never “accept” the null hypothesis: 

When we fail to reject the null hypothesis, what that means is that we “have no evidence that the population mean has changed.” There might be lots of reasons why we have no evidence, other than that the population mean has in fact not changed. We might be incompetent! Or life may be more complicated than we thought.

Slide26: 

Here, it looks like there is no effect of arousal on performance, because we have the same performance levels for both low- and high-arousal subjects. So do we accept H0?

Slide27: 

The patterns are different when we also manipulate task difficulty. If we don’t manipulate task difficulty, we don’t reject H0, but being good skeptics, we don’t accept it either.

4. Test of hypothesis about a single mean: 

When we test a hypothesis about a population mean using a single sample mean, the choice of test statistic depends on the sample size and on whether σ is known:

- If n ≥ 30, use Z, and take the critical value of Z from the standard normal table.
- If n < 30 but you know σ, use Z.
- If n < 30 and σ is unknown, use t, and take the critical value of t from the table of t values.

4. Test of hypothesis about a single mean: 

When n < 30 and σ is unknown, you evaluate tobt against the critical value of t from the t table. The critical value varies with the degrees of freedom for the test. For the t-test, d.f. = (n - 1), the number of observations minus 1. See the t table provided.
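A small-sample, one-mean test might be sketched as follows. The scores are hypothetical data invented for illustration, and the critical value 2.262 is the standard t-table entry for a two-tailed test at α = .05 with 9 d.f.; the function name is likewise an assumption.

```python
import math
import statistics

def one_sample_t(scores, mu0):
    """Return (t_obt, degrees of freedom) for a one-sample t test."""
    n = len(scores)
    sample_mean = statistics.mean(scores)
    s = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)
    t_obt = (sample_mean - mu0) / (s / math.sqrt(n))
    return t_obt, n - 1

# Hypothetical data: subjunctive verbs used by 10 speakers in ten minutes
scores = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
t_obt, df = one_sample_t(scores, mu0=12)

# Critical value read from a t table: two-tailed, alpha = .05, d.f. = 9
T_CRIT = 2.262
print(f"t({df}) = {t_obt:.3f}, reject H0: {abs(t_obt) > T_CRIT}")
```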

5. Test of hypothesis about the difference between 2 means: 

Here, we are asking whether two population means are different from each other. We draw a sample from each population and compute $\bar{X}$ for each. We then do our statistical test on the difference between the two sample means. But how we do the test depends upon two things…

5. Test of hypothesis about the difference between 2 means: 

1. Sample size: As before, we use Z for n ≥ 30 or when we know σ. Use t otherwise.
2. Independence: A second issue is whether the two samples of observations are independent. Independence here means there is no connection between cases in the two samples.

5. Test of hypothesis about the difference between 2 means: 

- If you randomly assign people to your two samples, the samples are independent.
- If you test “within subjects,” the two samples are dependent.
- If you use different subjects in the two groups, but match them in pairs on some variable, the two samples are dependent.

5a. Large samples, independent groups: 

Our null hypothesis is typically that there is no difference between population means. Sometimes we’ll hypothesize that a historical, non-zero difference still holds. Either way, we use the sampling distribution of the difference $(\bar{X}_1 - \bar{X}_2)$.

5a. Large samples, independent groups: 

The mean of the sampling distribution of the difference is $(\mu_1 - \mu_2)$. Because the samples are independent,

$$\sigma_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

5a. Large samples, independent groups: 

One-tailed test:
H0: μ1 - μ2 = D0
HA: μ1 - μ2 > D0 (or HA: μ1 - μ2 < D0)

Two-tailed test:
H0: μ1 - μ2 = D0
HA: μ1 - μ2 ≠ D0

Test statistic:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sigma_{(\bar{X}_1 - \bar{X}_2)}}$$

5a. Large samples, independent groups: 

Rejection region:

One-tailed: Z > Zα (or Z < -Zα)
Two-tailed: |Z| > Zα/2
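The large-sample test for independent groups can be sketched from summary statistics. The means, variances, and sample sizes below are hypothetical values made up for illustration, as is the function name. With large samples the sample variances stand in for the population variances, which is an approximation.

```python
import math

def two_sample_z(mean1, mean2, var1, var2, n1, n2, d0=0.0):
    """Z for the difference between two independent sample means."""
    se_diff = math.sqrt(var1 / n1 + var2 / n2)  # standard error of the difference
    return ((mean1 - mean2) - d0) / se_diff

# Hypothetical summary values: pronunciation errors, L2 vs. L1 speakers
z = two_sample_z(mean1=14.0, mean2=12.0, var1=16.0, var2=9.0, n1=64, n2=36)

# One-tailed (upper) test at alpha = .05: critical Z is 1.645
print(f"Z = {z:.3f}, reject H0: {z > 1.645}")
```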

5b. Small samples, independent groups: 

There are two ways to do this, depending upon whether the two population variances are equal or different. In order to know which method we should use, we have to test the hypothesis

$$H_0: \sigma_1^2 = \sigma_2^2$$

So for small samples and independent groups, there are always two steps: test the variances, then test the means. But we’ll skip the first step today and assume equal variances.

5b. Small samples, independent groups: 

One-tailed test:
H0: μ1 - μ2 = D0
HA: μ1 - μ2 > D0 (or HA: μ1 - μ2 < D0)

Two-tailed test:
H0: μ1 - μ2 = D0
HA: μ1 - μ2 ≠ D0

Test statistic:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{S_P^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

5b. Small samples, independent groups: 

NOTE:

$$S_P^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$

5b. Small samples, independent groups: 

Rejection region:

One-tailed: t < -tα (or t > tα)
Two-tailed: |t| > tα/2
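Under the equal-variance assumption adopted here, the pooled-variance t test might look like the sketch below. The two groups are hypothetical data, the function name is illustrative, and the test uses the standard n1 + n2 - 2 degrees of freedom. The critical value 2.306 is the t-table entry for a two-tailed test at α = .05 with 8 d.f.

```python
import math
import statistics

def pooled_t(sample1, sample2, d0=0.0):
    """Return (t_obt, pooled variance, d.f.) assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)  # S1 squared (divides by n1 - 1)
    var2 = statistics.variance(sample2)  # S2 squared (divides by n2 - 1)
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    mean_diff = statistics.mean(sample1) - statistics.mean(sample2)
    return (mean_diff - d0) / se_diff, sp2, n1 + n2 - 2

# Hypothetical scores for two small independent groups
group1 = [8, 10, 12, 9, 11]
group2 = [5, 7, 6, 8, 4]
t_obt, sp2, df = pooled_t(group1, group2)

T_CRIT = 2.306  # t table: two-tailed, alpha = .05, d.f. = 8
print(f"Sp2 = {sp2:.2f}, t({df}) = {t_obt:.3f}, reject H0: {abs(t_obt) > T_CRIT}")
```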

5c. Large samples, dependent groups: 

When we match pairs, or use a within-subjects design, we violate the assumption that our two samples are independent, which is important for our use of the sampling distribution of the difference. We deal with this by treating a test of the difference between dependent pairs as a one-sample test on the pair differences.

5c. Large samples, dependent groups: 

One-tailed test:
H0: μD = D0
HA: μD < D0 (or HA: μD > D0)

Two-tailed test:
H0: μD = D0
HA: μD ≠ D0

Test statistic:

$$Z = \frac{\bar{X}_D - D_0}{\sigma_D / \sqrt{n_D}} \approx \frac{\bar{X}_D - D_0}{s_D / \sqrt{n_D}}$$

where $\sigma_D$ is the population standard deviation of the differences and $s_D$ is the sample standard deviation of the differences.

5c. Large samples, dependent groups: 

Rejection region:

One-tailed: Z > Zα (or Z < -Zα)
Two-tailed: |Z| > Zα/2

5d. Small samples, dependent groups: 

One-tailed test:
H0: μD = D0
HA: μD < D0 (or HA: μD > D0)

Two-tailed test:
H0: μD = D0
HA: μD ≠ D0

Test statistic:

$$t = \frac{\bar{X}_D - D_0}{s_D / \sqrt{n_D}}$$

where $s_D$ is the sample standard deviation of the differences.

5d. Small samples, dependent groups: 

Rejection region:

One-tailed: t > tα (or t < -tα)
Two-tailed: |t| > tα/2

where tα and tα/2 are based on (nD - 1) d.f.
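Because a dependent-pairs test reduces to a one-sample test on the differences, the same few lines serve both the large-sample (Z) and small-sample (t) cases; only the table used for the critical value changes. The sketch below uses hypothetical before/after scores for nine matched subjects and an illustrative function name.

```python
import math
import statistics

def paired_statistic(first, second, d0=0.0):
    """Return (test statistic, d.f.) computed on the pair differences."""
    differences = [b - a for a, b in zip(first, second)]
    n_d = len(differences)
    mean_d = statistics.mean(differences)
    s_d = statistics.stdev(differences)  # sample SD of the differences
    return (mean_d - d0) / (s_d / math.sqrt(n_d)), n_d - 1

# Hypothetical within-subjects data: the same 9 learners tested twice
before = [10, 12, 9, 14, 11, 13, 10, 12, 11]
after = [12, 13, 12, 15, 12, 16, 11, 14, 12]
t_obt, df = paired_statistic(before, after)

T_CRIT = 2.306  # t table: two-tailed, alpha = .05, d.f. = 8
print(f"t({df}) = {t_obt:.3f}, reject H0: {abs(t_obt) > T_CRIT}")
```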

6. Tests of hypothesis about > 2 means: 

When we ask questions about more than two populations (e.g., L2 learners at 4 different stages of development) we could simply run all the pairs of comparisons using t or Z:

- Level 1 vs. Level 2
- Level 1 vs. Level 3
- Level 1 vs. Level 4
- …

But we have a short-cut called Anova…
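To see how quickly the pairwise approach grows, the comparisons for four levels can be enumerated with the standard library. This is only an illustration of the counting; the level labels follow the example above.

```python
from itertools import combinations

levels = ["Level 1", "Level 2", "Level 3", "Level 4"]

# Every unordered pair of levels is one separate t or Z test
pairs = list(combinations(levels, 2))
for first, second in pairs:
    print(f"{first} vs. {second}")
print(f"{len(pairs)} pairwise tests in total")
```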

Basic Anova design: 

[Figure: basic Anova design. Labels: draw a sample; make an inference.]

6. Tests of hypothesis about > 2 means: 

The approach is to compare the differences among the treatment means ($\bar{X}_1, \bar{X}_2, \bar{X}_3, \ldots, \bar{X}_P$) to the amount of error variability. Our question is, are the differences among the treatment means so large that they could not plausibly be due to sampling error?

6. Tests of hypothesis about > 2 means: 

In order to answer our question, we need numerical measures of two things:

- Differences among the treatment means: how different from each other are the means?
- Sampling variability within each treatment: how different could they be just on the basis of chance?

6. Tests of hypothesis about > 2 means: 

$$SST = \sum_i n_i (\bar{X}_i - \bar{X})^2 \qquad \text{(Eqn. 1)}$$

where the $\bar{X}_i$ are the individual sample means and $\bar{X}$ is the grand mean.

6. Tests of hypothesis about > 2 means: 

In Eqn. 1, the sum (Σ) measures the variability among the sample means. SST is the sum of squared deviations of the treatment means from the grand mean, each weighted by its sample size. The more different the treatment means are from each other, the bigger SST will be.

6. Tests of hypothesis about > 2 means: 

$$SSE = \sum_j (X_{1j} - \bar{X}_1)^2 + \sum_j (X_{2j} - \bar{X}_2)^2 + \cdots + \sum_j (X_{Pj} - \bar{X}_P)^2$$

SSE is the sum of squares for error. It measures the total variability of individual scores around their respective sample means. If the treatment were the only influence, people who get the same treatment would all have the same score. Any deviation from that state reflects sampling error.

6. Tests of hypothesis about > 2 means: 

Like Z and t, Analysis of Variance computes a ratio.

- Numerator: how much do sample means differ from each other, on average?
- Denominator: how much do observations differ within a sample, on average?

6. Tests of hypothesis about > 2 means: 

How can we compare SST to SSE? To make SST and SSE commensurable, we divide each by its degrees of freedom.

$$MST = \frac{SST}{P - 1} \quad \text{(Mean Square Treatment)}$$

$$MSE = \frac{SSE}{N - P} \quad \text{(Mean Square Error)}$$

The Analysis of Variance – F-test: 

When there is a treatment effect, MST will be much larger than MSE. Therefore, the ratio of MST to MSE will be much larger than 1.0.

$$F = \frac{MST}{MSE}$$

How much larger than 1 must F be for us to reject H0? Check the F table for α and d.f.

The Analysis of Variance – F-test: 

d.f. numerator = P - 1
d.f. denominator = N - P

Important note: for Anova, the F test is always one-tailed. You’re asking “is the treatment variance larger than the error variance?”
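The whole one-variable, between-groups computation (SST, SSE, the two mean squares, and F) fits in one short function. The sketch below uses three small hypothetical groups and an illustrative function name; the critical value 5.14 is the F-table entry for α = .05 with 2 and 6 d.f.

```python
import statistics

def one_way_anova(groups):
    """Return (F, d.f. numerator, d.f. denominator) for a list of samples."""
    p = len(groups)                         # P, the number of treatments
    n_total = sum(len(g) for g in groups)   # N, the total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [statistics.mean(g) for g in groups]

    # SST: weighted squared deviations of treatment means from the grand mean
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSE: squared deviations of each score from its own sample mean
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    mst = sst / (p - 1)
    mse = sse / (n_total - p)
    return mst / mse, p - 1, n_total - p

# Hypothetical scores for three treatment groups
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_obt, df_num, df_den = one_way_anova(groups)

F_CRIT = 5.14  # F table: alpha = .05, d.f. = (2, 6)
print(f"F({df_num}, {df_den}) = {f_obt:.2f}, reject H0: {f_obt > F_CRIT}")
```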