Keiser/etal:1997
================

Have there also been observations of both speakers within the same
word/phrase?

Could the idea of classical stations reflecting higher SES and country
stations reflecting lower SES be due to the stereotype that those are
the classes that listen to the respective music? Could that, in turn,
lead the speaker at either station to speak with a more stereotypical
accent? In general, what is the role of stereotypes for radio stations,
and in particular for studies such as this one?

To measure vowel height, the authors used auditory judgments on a
five-point scale. This sounded really complicated to me. Apparently,
in 1997, formant measurements were much more difficult to retrieve
than nowadays. Still, they used them for the mono-/diphthong
distinction, but not for vowel height. Why?

By now we have often seen that these sociophonetic studies have a very
low number of test subjects/participants. Given that this study is
perhaps more fundamental, I still think the results are valuable (and
the authors do find significant differences in effects), but this
gives very little information about who drives the shift: (over 35,
affluent, well-educated) and (younger, poorer, less educated) are very
rough groupings. The authors back up their choice (announcers as
reflections of targeted audiences), but I also wonder how the
announcers’ personal SES interacts with the assumed SES of the
audiences.

- Is it really feasible to analyse only the speech of radio speakers
  when trying to examine to what extent the Northern Cities Shift
  might have taken place in Ohio? The authors did state that the
  speakers spoke the targeted dialect and that other research had
  benefited from the use of radio recordings. However, is the general
  style of people talking on the radio not very different from
  day-to-day speech? I would think radio speakers make sure to be
  highly comprehensible, which might result in them speaking
  differently from "regular speakers".
 
- Would it not have been better to gather data from many different
  speakers from Ohio, even at the cost of less data per speaker?
  The socioeconomic class of the speakers was only inferred from the
  hypothesized audience of the radio program, and I am not convinced
  this is enough to correctly identify this factor.

The authors only collected speech material of two radio announcers in
order to investigate the Northern Cities Shift in the area around
Columbus. Are two speakers enough to produce representative results
for such a big region or would it maybe be better to increase the
number of speakers?

They mention that lexical frequency effects could be present in the
data and in the sound change in general. I could imagine that lexical
frequency may also be reflected in the SES that the two radio shows
represent. That is, the same speaker could pronounce words associated
with a higher SES (older, more educated, more formal) differently from
more modern and more colloquial expressions. If that's the case, the
differences may directly reflect the ongoing sound change (see
exemplar theory: more modern and colloquial speech is associated with
younger and lower-class speakers, who are further along in the sound
change).

I still doubt that radio announcers are a good resource for examining
local variation (especially when they are on air). Maybe it was
different in the US in the 1990s, but I think radio announcers are
generally expected to speak something close to the standard dialect.
Especially back then, announcers had to compensate for noise in the
transmission with clearer speech. (There is even the term 'BBC
English', indicating that broadcasting speech is different from other,
especially local, varieties.) On the other hand, broadcasting staff
may lean more towards urban speech varieties such as the NCS, which
was the object of the study.

Do the announcers really represent the targeted audience? I wonder
whether Bell's theory that "speakers take most account of hearers in
designing their talk" is applicable to radio announcers. Speakers
adapt their speech style depending on the listeners, maybe because
they constantly hear how their interlocutors speak. Also, a radio
announcer may rather attempt to speak with a "standard accent" and not
with the most widely spoken accent of the particular area. For
example, an announcer at a radio station in Saarland doesn't speak
"saarländisch", and a newscaster on a German news broadcast doesn't
speak with the most widely spoken accent either, but with the standard
accent. Of course, the radio announcers in the study may have some
regional accent, but then one could have taken speakers of any other
profession from the same investigated area, too.

1. Irrespective of target audience, the radio format is considered a
rather formal setting. Hence, the speakers were more likely to
confine themselves to SAE. The lack of sufficient evidence for either
of the shift patterns was probably due to style and not to the
shift's infancy.

2. Could it be, though, that this region is participating in neither
NVS nor SVS and is establishing its own vowel shift patterns because
of its borderline location between the two?

3. We have talked a lot about the biased nature of speech
perception. The study relies on the auditory judgments of
researchers, who are also human beings, and the study design didn't
provide any prototypical "low"-"mid"-"high" vowels to refer to when
making their judgments. How plausible is it nowadays to employ
introspective judgments in similar studies of speech variation?

I thought the discussion about how they verified the judgments of the
individual authors was really interesting. I noticed that the
correlations between perceived vowel height and the F1 and F2
measurements weren’t necessarily that high, and was wondering if there
are systematic differences in the way we tend to judge these speech
sounds and how they might appear on a spectrogram. Is this something
we’ve been able to quantify? Or is it more unpredictable how
perception is related to the acoustic signal?

I was also interested in the use of diphthongization as a measurement
of vowel tensing, since it seemed like a rather indirect way of
measuring vowel tension. I was wondering why they chose to represent
it that way, and whether vowel shifts towards more tense vowels ever
occur without added diphthongization as well.

Would the results be the same with a larger data set?
Are there similar vowel height variations observed in any other
language(s)?

Context: In the section on speaker selection and data collection,
they mention the two "style" levels, monologue and dialogue, and how
both are generally oriented toward the listener when it comes to radio
hosts. But I was surprised that they didn't mention
focus-group/marketing tactics where hosts typically play into the
mannerisms/speech of the local area or vice versa. For instance, I
have listened to radio hosts talk about how they had to "shed" their
accent to sound more like the people the ad was addressing, or
sometimes even act like they had that accent or exaggerate an
accent. I would think that treating both as the same type of speech,
as Keiser does in the paper, is disingenuous because one is how the
person truly speaks and the other may be the "best version" of the ad
released, and not truly how they speak.

Question:
Keiser mentions that both approaches to the monologue/dialogue
discussion are legitimate, but which do you think would be the better
one for this paper after doing your presentation and research on the
topic? (the approaches being that monologue and dialogue can be
treated the same when referring to radio hosts or that they are
different types of speech)

- It makes sense that choosing to study a radio host would generate
  many tokens, but wouldn't speakers still be changing their speech to
  a certain extent (whether in monologue or dialogue) simply by virtue
  of public speaking? Would it not have been better to collect data in
  a more natural setting, or from more people?
 
- If the vowel shift was known to be in progress at the time of the
  study, should age differences in the speakers have been taken into
  account?
 
- What is the role of perception ratings of the researchers when more
  objective means of measurement, such as spectrograms, are readily
  available?
 
1. Is a sample of two participants enough to study the vowel system
changes that are referred to as the Northern Cities Shift?

2. Is there any known reason why Ohio had been largely ignored in
dialect studies?

3. Why are only raising of /ae/ and fronting of /a/ considered to
provide a clear indication of the presence or absence of NCS-like
shifting in the speech of central Ohioans?

4. Should we still include the background music formants in the
analysis?