December 10: Research presentation, Seung-Eun Kim

Exploring the perceptual similarity space for human and TTS speech
Recent studies by Chernyak et al. (2024) and Kim et al. (2025) introduced a novel approach to estimating holistic acoustic similarity between speech samples using a pre-trained self-supervised machine learning model. These studies demonstrated that an L2 English talker's perceptual distance from L1 English talkers (i.e., how similar or dissimilar the L2 talker's speech is to that of L1 talkers) accounts for variation in L2 talker intelligibility.
In this talk, I present two sets of follow-up analyses that extend this line of work (presented as time allows). The first examines the internal structure of the model's perceptual similarity space by evaluating each L2 talker's distance from other L2 talkers and asking how these distances relate to intelligibility. The second expands the framework to include synthetic voices generated by text-to-speech (TTS) systems, allowing us to test whether TTS-generated speech can serve as a viable tool for modeling intelligibility and how the acoustic properties of TTS voices compare with those of human talkers.
Together, these analyses provide a deeper understanding of what the model's similarity space captures and how it can be used to investigate both human and machine speech.
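
For orientation, below is a minimal sketch of how a talker's distance in an embedding space derived from a pre-trained self-supervised model can be computed. The encoder (a wav2vec 2.0 bundle from torchaudio), the mean-pooling, and the cosine distance are illustrative assumptions, not necessarily the choices made by Chernyak et al. (2024) or Kim et al. (2025).

    # Illustrative sketch only: encoder, pooling, and distance metric are
    # assumptions, not necessarily those of the cited studies.
    import torch
    import torchaudio

    bundle = torchaudio.pipelines.WAV2VEC2_BASE  # assumed stand-in encoder
    model = bundle.get_model().eval()

    def talker_embedding(wav_paths):
        """Mean-pooled frame embeddings, averaged over a talker's utterances."""
        utt_vecs = []
        for path in wav_paths:
            wav, sr = torchaudio.load(path)
            wav = wav.mean(dim=0, keepdim=True)  # downmix to mono
            wav = torchaudio.functional.resample(wav, sr, bundle.sample_rate)
            with torch.inference_mode():
                feats, _ = model.extract_features(wav)
            utt_vecs.append(feats[-1].mean(dim=1).squeeze(0))
        return torch.stack(utt_vecs).mean(dim=0)

    def perceptual_distance(emb_a, emb_b):
        """Cosine distance as one plausible (dis)similarity measure."""
        return 1 - torch.nn.functional.cosine_similarity(emb_a, emb_b, dim=0)

    # An L2 talker's distance from the L1 group, here via the L1 centroid
    # (averaging pairwise distances is another reasonable convention):
    # l1_centroid = torch.stack([talker_embedding(p) for p in l1_talkers]).mean(0)
    # d = perceptual_distance(talker_embedding(l2_talker_paths), l1_centroid)

In a setup of this general shape, a TTS voice can be embedded exactly like a human talker, which is what makes the extension in the second set of analyses natural.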

November 12: Practice presentations (Psychonomics)

Three Phonatics members will practice their presentations for the upcoming Psychonomics conference.

Seung-Eun Kim, Ann Bradlow [presenter], Matt Goldrick: Generalization of perceptual adaptation across second-language (L2) English talkers: The (limited) roles of talker intelligibility and training-test talker similarity.
Seung-Eun Kim [presenter], Ann Bradlow, Matt Goldrick: Exposure Conditions Facilitating Perceptual Adaptation to a Second-Language Talker in Noise. 
Jennifer Dibbern [presenter], Tamar Gollan, Dalia Garcia, Jessie Quinn, Matt Goldrick: Automated Analysis of Code-Mixed Speech: Investigating Costs of Language Mixing in Fully Connected Speech. 

October 29: Practice presentations (NWAV)

Four members of our Phonatics community will practice their presentations for the upcoming NWAV (New Ways of Analyzing Variation) conference.

Jenn Dibbern (Ling): Community Ideologies and the Perception of Speech: Localized Social Meaning in Bilingual Language Processing
Ke Lin (Ling): Bilinguals Differ in Weighing Social and Acoustic Cues During L2 Speech Processing: Evidence from Eye-Tracking
Raef Khan (Ling): How “Extreme” is Extreme Pain?: Patient Identity and Intensifier Usage Affect Physicians’ Perceptions of Pain
Emma Wilkinson (Ling): The social and linguistic landscape of Chicago-area Japanese Americans

February 12: Research presentation, Abhijit Roy (CSD)

The Statistical Implications of Auditory Spectrum
Auditory spectra are a core element of speech, hearing, and language research, underpinning representations of hearing ability, frequency characteristics of stimuli, and microphone responses, among other measurements. However, comparing these spectra presents unique statistical challenges due to the distinct properties of the human auditory system and of acoustic spectra. In particular, non-linear frequency resolution and unequal bandwidths across center frequencies complicate straightforward bin-by-bin measurements. Further, non-linear loudness and power-law relationships (e.g., Stevens’ power law, Fletcher–Munson curves) mean that spectra appearing numerically similar can still sound perceptually different. Correlations among neighboring frequency bins, often introduced by harmonic signals and formant structures, add another layer of complexity, yielding high-dimensional, correlated distributions that cannot be treated as independent and identically distributed. This study explores the statistical characteristics of auditory spectra with two main objectives: (1) evaluating differences in hearing ability among individuals, and (2) comparing distinct auditory spectra. We review standard statistical methods commonly used in hearing and speech sciences and propose enhancements that streamline spectral comparisons, ultimately increasing validity and enabling more detailed interpretation of auditory data.
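
To make the banding and correlation issues concrete, here is a minimal sketch of one common style of auditory-scale spectral comparison: band powers on an ERB-spaced frequency grid, compared with a covariance-aware distance. The ERB banding and the Mahalanobis comparison are illustrative choices, not necessarily the methods proposed in this talk.

    # Illustrative sketch: ERB-spaced band powers plus a covariance-aware
    # distance; not necessarily the methods proposed in the talk.
    import numpy as np
    from scipy.signal import welch

    def erb_band_edges(f_lo=50.0, f_hi=8000.0, n_bands=32):
        """Band edges equally spaced on the ERB-number scale (Glasberg & Moore)."""
        erb = lambda f: 21.4 * np.log10(1 + 0.00437 * np.asarray(f))
        inv = lambda e: (10 ** (e / 21.4) - 1) / 0.00437
        return inv(np.linspace(erb(f_lo), erb(f_hi), n_bands + 1))

    def auditory_spectrum(x, fs, n_bands=32):
        """Band powers (dB) in ERB-spaced bands: wider bands at high frequencies."""
        f, pxx = welch(x, fs=fs, nperseg=2048)
        edges = erb_band_edges(f_hi=0.45 * fs, n_bands=n_bands)
        bands = [pxx[(f >= lo) & (f < hi)].mean()
                 for lo, hi in zip(edges[:-1], edges[1:])]
        return 10 * np.log10(np.asarray(bands) + 1e-12)

    def spectral_distance(spec_a, spec_b, cov):
        """Mahalanobis distance: respects correlations among neighboring bands
        that a naive bin-by-bin (Euclidean) comparison ignores."""
        d = spec_a - spec_b
        return float(np.sqrt(d @ np.linalg.solve(cov, d)))

Here the covariance matrix would be estimated from repeated measurements; its role is precisely to handle the harmonic- and formant-induced correlations the abstract describes.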

February 5: Research presentation, Zhe-chen Guo

Extended high-frequency information improves phoneme recognition: Evidence from automatic speech recognition
Speech information in the extended high-frequency (EHF) range (>8 kHz) is often overlooked in hearing assessments and automatic speech recognition (ASR). However, accumulating evidence demonstrates that EHFs contribute to speech perception, although it remains unclear whether this benefit arises from improved phoneme recognition. We addressed this question by testing how higher-frequency content affects ASR performance in simulated spatial listening situations. English speech from the VCTK corpus was resynthesized with head-related transfer functions to create spatial audio in which target speech was masked by a competing talker separated by 20°, 45°, 80°, or 120° azimuth at target-to-masker ratios (TMRs) from +3 to −12 dB. A CNN-BiLSTM phoneme decoder was trained on cochleagram representations of the resynthesized speech, which was either broadband or low-pass filtered at 6 or 8 kHz. In quiet, phoneme recognition was no more accurate for broadband speech than for low-pass filtered speech. In the presence of a masker, however, higher-frequency energy improved recognition across degrees of spatial separation, particularly at TMRs ≤ −9 dB. Furthermore, removing EHFs disproportionately increased errors for consonants over vowels. These findings demonstrate EHFs’ role in phoneme recognition under adverse conditions, highlighting the importance of EHFs in audiometric evaluations and ASR development.
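
A minimal sketch of the band-limiting and mixing manipulations described above, for orientation only: the filter type and order and the RMS-based mixing convention are assumptions, not the study's exact settings.

    # Illustrative sketch: filter design and mixing convention are assumed,
    # not taken from the study.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt

    def lowpass(x, fs, cutoff_hz, order=8):
        """Zero-phase Butterworth low-pass, removing energy above the cutoff."""
        sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
        return sosfiltfilt(sos, x)

    def mix_at_tmr(target, masker, tmr_db):
        """Scale the masker so the target-to-masker ratio (RMS) equals tmr_db."""
        rms = lambda s: np.sqrt(np.mean(np.square(s)))
        gain = rms(target) / (rms(masker) * 10 ** (tmr_db / 20))
        n = min(len(target), len(masker))
        return target[:n] + gain * masker[:n]

    # e.g., the 8 kHz low-pass condition at TMR = -9 dB (requires fs > 16 kHz):
    # y = mix_at_tmr(lowpass(target, fs, 8000), masker, -9)

(The spatialization step, convolving each source with head-related transfer functions before mixing, is omitted here.)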

January 15, 2025: Research presentation, Jennifer Cole

Categories and gradience in intonation
Differences in intonation among languages and dialects are readily noticed but less easily described. What is the ‘shape’ of phrasal pitch contours, analyzed in terms of their component phonological features or in acoustic F0 measurements? How does intonation function to mark the structure of phrases and larger discourse units, or distinctions in semantic or pragmatic meaning? The goal of a linguistic theory of intonation is to establish a framework in which the form and functions of intonation can be analyzed and compared across languages and speakers. This is a surprisingly difficult task. Analyzing the function of intonational expressions calls for preliminary decisions about segmentation, measurement, and encoding: which interval(s) of a continuous pitch signal are associated with a particular meaning or structure, which aspects of the dynamic F0 signal encode that function, and what the features of that encoding are. Even for English, arguably one of the most studied intonation systems, there is ongoing debate over these very questions, resulting in a knowledge bottleneck that stymies scientific progress on intonation and its communicative function.
In this talk I report on my recent work addressing this central challenge for American English: What are the characteristics of phrasal pitch patterns that are reliably perceived and produced as distinct, and interpreted differently, by speakers of the language? I present work (with Jeremy Steffman, U Edinburgh) from a series of studies that examine intonational form through imitations of 16 intonational “tunes” of English, under varying task conditions that tap memory representations of model tunes presented auditorily. Analyses of dynamic F0 patterns from five experiments converge on a primary dichotomy between high-rising and low-falling tunes, with secondary distinctions in meaning corresponding to F0 shape variation within the two primary tune classes. Time allowing, I will briefly discuss related findings from parallel streams of research in my lab investigating intonational form and its pragmatic function, in relation to interpretations of asking/telling and scalar ratings of speaker surprise (work with Thomas Sostarics and Rebekah Stanhope). I close by discussing the implications of the joint findings from these studies for a theory of categorical and gradient associations between intonational form and function.
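
For orientation, analyses of dynamic F0 patterns like those described above typically start from a time-normalized pitch contour per utterance. The sketch below uses librosa's pyin tracker with illustrative settings; it is not the authors' analysis pipeline.

    # Illustrative sketch: tracker settings and normalization choices are
    # assumptions, not the authors' pipeline.
    import numpy as np
    import librosa

    def f0_contour(path, n_points=30):
        """F0 in semitones (re: utterance median), resampled to n_points."""
        y, sr = librosa.load(path, sr=16000)
        f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                     fmax=librosa.note_to_hz("C6"), sr=sr)
        f0 = f0[voiced & ~np.isnan(f0)]            # keep voiced frames only
        st = 12 * np.log2(f0 / np.median(f0))      # semitones re: median
        # Linear time-normalization so contours of different lengths align.
        return np.interp(np.linspace(0, 1, n_points),
                         np.linspace(0, 1, len(st)), st)

Contours in this form can then be compared or clustered across imitations; this is the kind of representation in which a high-rising versus low-falling dichotomy would surface.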