Feb 2
=====

Hay/etal:2006a
--------------

Can an experiment based on just four participants really produce significant results? Especially given that all participants took part in multiple conditions, which could lead to priming from previous conditions?

"All participants received a chocolate fish in thanks for their participation" How much detail is too much detail in a description of an experimental setup?

I find the description of possible interaction effects with the experimenters particularly interesting. To me, it shows how fragile the 'square'/'near' diphthong distinction really is, as interacting with an individual can have measurable effects on speech production.

Might it be that, with the New Zealander present, the participants felt more comfortable speaking their dialect freely than with an American, who might have trouble understanding them clearly? I'd guess code-switching to accommodate the majority dialect, i.e. American/Californian English, is pretty common for speakers of dialects with a smaller speaker base.

The repeated mentions in Hay et al.'s experiments of the influence of the researcher conducting the study really make me wonder why, instead of limiting themselves to one experimenter, they still include the experimenter as an independent variable (while the experimenters also tried to alter their speech and avoid square/near vowels). I agree with one comment from last week: I think results would be easier to interpret without knowingly adding even more factors that can cause interaction effects. Given the sentence "Minimally, we should aim to ensure that the same researcher runs all of the participants in a particular experiment", I can only guess that both experiments were carried out within the same time frame and that they learned from this going forward?
Additionally, the mention of testing for the log lexical frequency of the target word and the log lexical frequency of its competitor intrigued me, because when I started reading the paper I also wondered which words are affected by the merger first, and what the factors are (only frequency, maybe also word category ...?).

The experiment mentioned in the paper was led by a speaker of NZE and a speaker of US English. The authors conclude that the error rates of the merged participants increase after talking to the US researcher because the realization of "near" and "square" sounds is quite different in US English compared to NZE. However, why should merged participants be negatively influenced by a US speaker if they do not recognize a difference between "near" and "square" sounds?

If we learn more about how social factors influence speech, can we use recordings and phonetic analysis to assess social stereotypes and discrimination?

If the authors' guess(?) about broader experience being used in cases of low frequency (i.e. few NZE exemplars, few lower class exemplars) is correct, this tells us much about how exemplars are processed, namely that perception demands a certain "basic" amount of exemplars whenever possible.

(re: p.474, Fig 6) The authors attribute different levels of phonetic sensitivity to the listeners. From my own experience, I think it's more conscious than is implied in the paper. Namely, listeners are at least partially aware of the speaker's sensitivity and adapt how ambiguously they treat the words accordingly.

As already mentioned in the paper, the distribution of the participants is uneven, particularly for gender (more female participants than male), but also in the younger age range of group 3. I'm wondering whether it would have been better to aim for a more evenly distributed sample?

I believe that nowadays the stereotypical appearance is fading away, e.g. fewer and fewer companies require formal wear.
Therefore, it may be more difficult to find a correlation between social status, way of speaking and appearance. Is there any recent research covering this topic?

This seems to suggest that there are phonemic factors that layer upon experience/exemplars. In reference to last week's readings, does it lend more support to the hybrid modeling approach described in Pierrehumbert?

It seems like the researchers found a lot of surprising results that they hypothesized came from the complexity of different experiences the participants might have had, e.g. exposure to different social classes vs. age, activation of exemplars of other dialects. How do we control for these different experiences? What could we do to answer some of their questions more definitively?

My question addresses experiment design. Could phonetic accommodation get in the way when participants were asked to read the words themselves after they had listened to the non-merged readings? Wouldn't it be better to have this phase first in the experiment, to capture the participant's own variant more accurately?

I wonder about the best way to study speakers' productions of certain sounds to see if they have a merger. Anecdotally, whenever I'm asked how I pronounce a given sound, I have no idea how I would pronounce it myself. I imagine this would be even more the case given that they recorded themselves after having heard many speakers say those sounds.

I don't understand the explanation the authors provide of participants relying on exemplars of other varieties of English (Australian, American, etc.). Especially in the group with the New Zealand experimenter: where is the evidence for this?

What I noticed quickly here is that participants around 30 years old were classified as "young speakers". For me personally, 30 is not that young anymore. Maybe the results would differ if they excluded participants between the ages of 28 and 30. Moreover, they counted 45-year-old participants as "older speakers".
Here again, in my opinion 45 is not old, so I would consider excluding participants between the ages of 45 and 50, or including more participants in the age ranges of 20 to 28 (or younger) and 50 to 60 (or older).

It is really impressive to see how a language and its pronunciation change over the years, as in the long-term study in which Maclagan and Gordon analysed and compared the merger of the diphthongs /iə/ and /eə/.

Comment: The paper was very clear on what it was trying to do and presented it in a way that made it very hard to ask questions that weren't already answered. One comment that I have is that this paper and many before it seem to have quite a lot of issues with the presenter/tester affecting the results (this one mentioned an American tester skewing the error rate enough to be noticeable). This is one of the first papers to expand on the reason why this led to a higher error rate, which was interesting.
Question: Do you know if there has been an update to this paper showing the merger getting stronger?