(Online Talk) January 26, 2022: Jennifer Cole & Jeremy Steffman

Nuclear Tunes lost and found: Modeling intonational tunes in American English with labeled vs. unlabeled data

We examine how intonational tunes in American English are represented by speakers, as assessed in an imitative speech paradigm in which speakers reproduce tunes from model utterances. We test 8 distinct nuclear tunes defined in the inventory of American English intonational phonology. We present “bottom-up” cluster analyses of unlabeled data, and “top-down” analyses of data labeled with the 8 tune shapes that are predicted to be distinct. We find that some predicted distinctions are lost in the unlabeled clustering analysis, though they are detectable as small variations in f0 scaling and alignment in the labeled data, and are further reflected in neural net classification of the labeled data. Distinctions lost in production are also reflected in poorer discrimination in an AX perceptual discrimination task. Together, the results suggest a hierarchy of distinctions among tune shapes that is not directly predictable from the tonal inventory. We discuss implications for discrete intonational categories and continuous variation in intonational phonology.
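
To make the “bottom-up” side concrete, here is a minimal sketch of what clustering unlabeled tune data could look like: k-means over time-normalized, speaker-normalized f0 contours, scanning candidate cluster counts. The synthetic contours and every parameter below are illustrative assumptions, not the authors’ actual pipeline.

    # Hypothetical sketch: bottom-up clustering of time-normalized f0 contours.
    # The contours here are synthetic stand-ins, not the talk's data.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    rng = np.random.default_rng(42)

    # Simulate 200 nuclear tunes, each sampled at 30 time-normalized points:
    # half rising, half falling, plus per-token noise.
    t = np.linspace(0, 1, 30)
    rises = 180 + 60 * t + rng.normal(0, 8, (100, 30))
    falls = 240 - 60 * t + rng.normal(0, 8, (100, 30))
    contours = np.vstack([rises, falls])

    # Z-score each contour so clustering reflects tune shape rather than a
    # speaker's overall pitch level or range.
    shapes = (contours - contours.mean(axis=1, keepdims=True)) / \
             contours.std(axis=1, keepdims=True)

    # Ask how many tune shapes the unlabeled data support: if two predicted
    # tunes differ only in subtle scaling/alignment, they may merge here.
    for k in range(2, 9):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(shapes)
        print(f"k={k}: silhouette={silhouette_score(shapes, labels):.3f}")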

(Online Talk) December 1, 2021: Amelia Stecker & Jaime Benheim

Listeners’ interpretations of Mock Southern U.S. English in parody

Speakers construct linguistic styles by packaging together socially meaningful forms to project a public image, enregistering a style with a certain persona (Agha 2005; Eckert 2012; D’Onofrio 2020). Speakers build on the enregisterment of linguistic styles to produce performative stylizations (Coupland 2001) and to use mock language (Hill 2001; Chun 2004; Rosa 2018), in which speakers ‘borrow’ voices from other varieties and semiotically frame them in negative stereotypes (e.g., Hill 2001; Shankar 2008; Rosa 2018). Listeners’ sociolinguistic backgrounds structure their interpretations of language, and the cultural circulation of mock varieties relies on these perceptual processes. However, the effects of a listener’s background and social context on perceptions of parody remain to be empirically tested. In particular, Southern US varieties have been indexically linked in perception and parody with sounding uneducated, lazy, and friendly (e.g., Campbell-Kibler 2007, 2008). Building on these ideological connections, non-Southern speakers may ‘borrow’ this variety in stylizations to produce a caricatured image of Southern speakers (e.g., Love-Nichols 2019).

To investigate factors in listeners’ interpretations of mock Southern US English, 120 participants listened to a non-Southern speaker using a mock Southern accent to imitate a politician (labeled Democrat, Republican, or No Political Info, depending on condition). Though all listeners recognized that mock Southern casts its target as uneducated/unintelligent, listener political affiliation, listener region, condition, and interactions among these factors were significant predictors of interpretations of mock Southern (all p < 0.05). This suggests that complex contextual factors shape the indexical work performed by using mock Southern.
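
For illustration only, a regression of this general shape (a binary interpretation outcome, three between-listener factors, and their two-way interactions) could be fit as below. The variable names, outcome coding, and simulated data are all hypothetical assumptions, not the authors’ analysis.

    # Hypothetical sketch of the kind of model that could test these
    # predictors; not the authors' actual analysis.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n = 120
    df = pd.DataFrame({
        "listener_party": rng.choice(["Dem", "Rep"], n),
        "listener_region": rng.choice(["South", "NonSouth"], n),
        "condition": rng.choice(["Democrat", "Republican", "NoInfo"], n),
        # 1 = listener read the parody as casting the target negatively
        "negative_reading": rng.integers(0, 2, n),
    })

    # Binary interpretation outcome with all two-way interactions among
    # listener politics, listener region, and the labeling condition.
    model = smf.logit(
        "negative_reading ~ (listener_party + listener_region + condition) ** 2",
        data=df,
    ).fit()
    print(model.summary())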

(Online Talk) October 27, 2021: Marisha Speights Atkins

Computer-Assisted Syllable Analysis of Continuous Speech as a Measure of Child Speech Disorder

Longer, spontaneous productions are widely acknowledged as the best basis for characterizing child speech; however, few clinical tools are available for analyzing continuous child speech. The purpose of this study was to test a computer-assisted approach, Syllabic Cluster Analysis (SCA), as a means of fast analysis of continuous speech, comparing children with speech disorder (SD) and typically developing children (TD). Method: 33 sentences, recorded by 60 children (TD = 45, SD = 15) ages 3-5 years (M = 4.28, SD = .063), were processed using SCA. Data were analyzed using a negative binomial GLMM with a random intercept for subject plus a random effect for each stimulus item. Syllabic clusters per utterance (SC/Utt) was the outcome measure, with fixed effects of speech group, age group (3-3.49, 3.5-3.99, 4.0-4.49, 4.5-4.99, and 5.0 years, in years and fractional months), and dialect (Midland-Ohio, Southern-Alabama), plus an interaction term between speech group and age. Results: Speech group was the only significant main effect (p < .001). Planned deviation contrasts revealed mean differences among the five age groups, and there was a significant interaction between speech group and age. Results indicate that SC/Utt is useful for automated analysis of continuous speech samples. Further research is warranted with more participants in each age group and a wider range of dialects.
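
The abstract does not spell out the SCA algorithm itself, so the sketch below instead follows the classic intensity-envelope peak-picking approach to counting syllabic nuclei (in the spirit of de Jong & Wempe, 2009). The thresholds and file name are hypothetical placeholders.

    # Hypothetical sketch of envelope-based syllable-nucleus counting; the
    # actual SCA algorithm may differ from this classic approach.
    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import butter, filtfilt, find_peaks, hilbert

    def count_syllabic_peaks(wav_path, min_sep=0.12, rel_height=0.3):
        sr, x = wavfile.read(wav_path)            # assumes a mono WAV file
        x = x.astype(float) / np.max(np.abs(x))
        # Amplitude envelope: magnitude of the analytic signal,
        # low-pass filtered at 10 Hz to smooth out pitch-rate ripple.
        env = np.abs(hilbert(x))
        b, a = butter(2, 10 / (sr / 2), btype="low")
        env = filtfilt(b, a, env)
        # Count peaks separated by at least min_sep seconds and above a
        # relative amplitude floor: one peak per syllabic nucleus.
        peaks, _ = find_peaks(env,
                              distance=int(min_sep * sr),
                              height=rel_height * env.max())
        return len(peaks)

    # e.g., syllabic clusters per utterance for one recorded sentence:
    # print(count_syllabic_peaks("child_utterance_01.wav"))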

(Online Talk) May 12, 2021: Autumn Bryant & Julia Moore

Autumn Bryant (speech-language pathologist in English Language Programs) has been working with the rest of the ELP staff and Chun Chan to develop an app (Attune) that helps adult nonnative speakers of English learn to perceive sound contrasts that may not be present in their native language (e.g. /l/ vs /r/ for Japanese speakers). We would love to present a very casual demo of Attune to the Phonatics group to get feedback on the design of the app as we prepare to build the next version.

(Online Talk) April 28, 2021: Jeremy Steffman

Integrating prosodic prominence in speech perception
In this talk I will discuss some of my dissertation work examining how listeners are influenced by context in speech perception – in particular, prosodic context conveying prominence. We know that listeners comprehend (1) segmental/phonemic contrasts and (2) prosodic features in spoken language, but one fairly open question is whether and how these two processes interact. The interaction we will focus on is how contextual prosodic information impacts perception of segmental material, in light of the way segments and prosody jointly structure acoustic information in the speech signal. Taking vowel contrasts as a test case, we’ll explore how different types of prominence-lending contexts shape perception of vowel formants. With visual-world eyetracking data we’ll also consider how listeners integrate prominence cues with formants online, and explore what sort of processing is implicated, testing previous claims that prosody shows a relatively delayed influence in speech comprehension.

(Online Talk) April 7, 2021: Alayo Tripp

Social Inference and Early Lexical Learning
Infants represent differences between groups of talkers and use this information to guide their social preferences. Nevertheless, models of word learning often abstract away from social information. This omission makes such models unable to capture the way children select which linguistic data to learn from based on their social and cultural experiences. In this talk I review two previous infant word learning experiments and show that their results can be better predicted by a model that incorporates infants’ beliefs about the relative value of informants.
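
As a toy illustration of the general idea (my construction, not the model from the talk): a learner updating beliefs about a novel word’s referent can weight each labeling event by the informant’s presumed reliability, so that labels from devalued informants move the posterior less.

    # Toy sketch: Bayesian referent learning weighted by informant value.
    # All numbers are invented for illustration.
    import numpy as np

    # Two candidate referents for a novel word; uniform prior.
    log_post = np.log(np.array([0.5, 0.5]))

    # (referent_labeled, informant_reliability) pairs: a trusted informant's
    # label moves beliefs more than a distrusted informant's.
    events = [(0, 0.9), (0, 0.9), (1, 0.4)]

    for referent, reliability in events:
        # Likelihood of the observed label under each referent hypothesis:
        # the informant names the true referent with prob = reliability.
        like = np.where(np.arange(2) == referent, reliability, 1 - reliability)
        log_post += np.log(like)

    post = np.exp(log_post - np.logaddexp.reduce(log_post))
    print(dict(zip(["referent A", "referent B"], post.round(3))))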

(Online Talk) March 31, 2021: John Alderete

Using the SFUSED
The Simon Fraser University Speech Error Database (SFUSED) is a multi-purpose database designed to support both language production research and linguistic research. This talk reviews some recent research results from English and Cantonese (http://www.sfu.ca/people/alderete/sfused) as a way of explaining the logic of the database and how it can be used in new projects. It also identifies some current limitations of the SFUSED and sketches how planned development of lexical and morpho-syntactic structure can address these limitations.

(Online Talk) January 20, 2021: Dan Turner

On January 20 at 4pm, Dan Turner will be presenting on joint work with Jennifer Cole and Timo Roettger.

Is intonation processed incrementally or holistically?
We used a visual world eyetracking experiment to determine whether listeners process intonational information as soon as it becomes available (incremental processing) or whether they wait until they have access to the entire intonation contour with all its tonal events (holistic processing). To do so, we observed how and when American English listeners integrate a sequence of two pitch accents relative to the discourse status of different referents. Analyses of listeners’ fixation patterns suggest that listeners process pitch accents incrementally, as soon as they appear in the signal. We found evidence that listeners use this information to reduce uncertainty about the referents of both local and downstream expressions. Listeners also process early and late pitch accents in relation to one another, such that early cues in the utterance can restrict later inferences and late cues can resolve uncertainty associated with earlier cues. These findings have implications for models of intonational processing, for which neither an incremental processing strategy nor a holistic view alone is sufficient. Effective comprehension of intonational events requires maintaining perceptual information long enough to integrate it with downstream intonational events.

(Online Talk) December 16, 2020: Xin Xie

Our meetings this quarter will be held on Zoom. Please sign up for the listserv to receive the Zoom link (instructions in sidebar). On December 16, Dr. Xin Xie will be joining us to present:

Navigating speech variability via distributional learning: what is there to learn?

One of the central unresolved questions in speech perception is how listeners overcome talker-to-talker variability in the meaning-to-sound mapping. In addition to low-level, domain-general normalization processes, speech-specific normalization (e.g., McMurray & Jongman, 2011), storage (e.g., Goldinger, 1996; Johnson, 1997), and distributional learning (e.g., Clayards et al., 2008; Kleinschmidt & Jaeger, 2015) have been proposed as mechanisms for navigating this problem. However, these views have typically been investigated separately from one another, with different phonetic/phonological contrasts. As a result, existing evidence is often compatible with multiple accounts.

In this talk, I present a step toward a stronger test of these competing accounts, evaluated jointly against the same dataset. The approach combines: 1) production experiments to estimate within- and across-talker variability in acoustic cue distributions, 2) computational modeling to quantify the expected amount of information listeners can gain from learning talker-level versus group-level distributions, and 3) perception experiments to probe whether distributional learning indeed predicts changes in listeners’ categorization judgments. To demonstrate, I focus on a study built around a database of prosodic productions (65 talkers, ~3000 tokens), with which we directly tested whether learning of phonetic cue distributions (normalized or not) can in principle afford listeners the means to navigate variability in prosodic perception.
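
As an illustration of step 2, the expected benefit of talker-level learning can be made concrete with an ideal-observer toy model: categorizing a prosodic cue under group-level versus talker-level Gaussian cue distributions. All numbers below are invented for illustration and are not from the study’s database.

    # Illustrative sketch (not the talk's model): an ideal observer
    # categorizes a prosodic cue (e.g., an f0 peak) under group-level vs
    # talker-level Gaussian cue distributions.
    import numpy as np
    from scipy.stats import norm

    cue = np.linspace(80, 220, 500)   # hypothetical f0 values in Hz

    def p_category_A(cue, mu_A, mu_B, sd):
        # Bayes' rule with equal priors and Gaussian likelihoods
        like_A, like_B = norm.pdf(cue, mu_A, sd), norm.pdf(cue, mu_B, sd)
        return like_A / (like_A + like_B)

    group = p_category_A(cue, mu_A=170, mu_B=120, sd=25)    # pooled talkers
    talker = p_category_A(cue, mu_A=150, mu_B=105, sd=15)   # one low-pitched talker

    # Category boundary = cue value where p(A) crosses 0.5; learning the
    # talker-specific distributions shifts it and sharpens the function.
    for name, p in [("group-level", group), ("talker-level", talker)]:
        print(f"{name} boundary = {cue[np.argmin(np.abs(p - 0.5))]:.1f} Hz")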