I am a speech-language pathologist driven to develop and validate clinical speech technologies that enhance the assessment and treatment of individuals with speech and language disorders. I am currently an NIH NIDCD T32 post-doctoral fellow, mentored by Dr. Carol Espy-Wilson in the Speech Communication Lab at the University of Maryland A. James Clark School of Engineering. I completed my doctoral training, focusing on clinical trials, with Dr. Jonathan Preston in the Syracuse University Speech Production Lab. My ongoing projects draw on my interdisciplinary expertise in speech signal processing, machine learning, evidence-based clinical practice for speech sound disorders grounded in a neurocomputational framework, and clinical trial experimental design. They can be summarized through the following research themes:

Developing clinical artificial intelligence in order to improve client access to clinical best practice.

Clinical artificial intelligence involves predicting clinician perceptual judgment and the clinical decision making that follows from it. Perceptual judgment can be automated using acoustic speech analysis. However, there is a general lack of publicly available, manually transcribed child language training data with which to build tools specialized for automated child acoustic analysis and clinical speech analysis. Error rates are generally higher when analyzing child speech, both because most tools are not trained on child speech and because child speech and language differ inherently from the adult speech and language these systems are trained on. I believe that training speech recognition systems on child speech data that includes information about age, phonetic errors, and morpho-phonological structure will improve the accuracy of these systems. This advancement could improve evidence-based practice by increasing the amount of data that can be collected and analyzed in speech and language research, and could lead to speech technologies that increase access to services.
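As a concrete illustration, the sketch below shows one way child speech training examples could bundle this metadata so that recognition systems can be trained with it; the field names, annotations, and dataset layout are hypothetical illustrations, not an existing corpus format.

```python
# Minimal sketch: a training example structure bundling the child-specific
# metadata (age, phonetic errors, morpho-phonological annotations) that I
# argue should accompany child speech in ASR training corpora.
# All field names and file paths here are hypothetical illustrations.
from dataclasses import dataclass, field

import torchaudio
from torch.utils.data import Dataset


@dataclass
class ChildUtterance:
    audio_path: str          # path to the recorded utterance
    transcript: str          # orthographic target, e.g., "the red rabbit"
    age_months: int          # speaker age, for age-aware modeling
    phone_errors: list[str] = field(default_factory=list)   # e.g., ["r -> w"]
    morphemes: list[str] = field(default_factory=list)      # e.g., ["rabbit", "-s"]


class ChildSpeechDataset(Dataset):
    """Yields (waveform, metadata) pairs for ASR training or fine-tuning."""

    def __init__(self, utterances: list[ChildUtterance], sample_rate: int = 16000):
        self.utterances = utterances
        self.sample_rate = sample_rate

    def __len__(self) -> int:
        return len(self.utterances)

    def __getitem__(self, idx: int):
        utt = self.utterances[idx]
        waveform, sr = torchaudio.load(utt.audio_path)
        if sr != self.sample_rate:  # resample so the acoustic model sees one rate
            waveform = torchaudio.functional.resample(waveform, sr, self.sample_rate)
        return waveform, utt
```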

For example, the work of the Speech Production Lab demonstrates that residual speech sound disorders (RSSD) can improve following motor-based intervention, even for people whose speech did not improve after a previous course of traditional treatment. However, access to complex, adaptive motor-based intervention is hindered by clinician shortages internationally. These shortages could be mitigated by computerized therapy with automatic speech analysis, but no existing computerized therapy system is accurate enough for clinical use. Three fundamental issues affect available systems: the lack of examples of RSSD speech for speech recognition training, low recognition accuracy for sounds produced incorrectly, and a lack of clinical trials demonstrating therapeutic benefit.

The three studies of my dissertation confronted each of these three barriers. Study 1 offset the lack of publicly available speech recognition training datasets by leveraging data science and speech processing methods to curate over 170,000 utterances originally collected during our clinical trials. Study 2 leveraged computer science experimental design to use the curated dataset to train an AI architecture (PERCEPT) to predict a clinician’s perceptual judgment of correct or incorrect “r”. Study 3 leveraged clinical trial experimental design to prospectively evaluate the efficacy of RSSD treatment that uses an artificial intelligence version of Dr. Preston’s motor-based intervention software to automate clinician decision making: whether the practice attempt sounded correct, how much detail to provide during feedback, the dose at which to administer clinical feedback, and how and when to make practice harder or easier based on the learner’s accuracy. This work was supported by Syracuse University (CUSE Innovative & Interdisciplinary Research Grant II-14-2021; J. Preston, PI) and the NIH Office of Data Science Strategy (NIDCD R01DC017476-S2; T. McAllister, PI).
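For readers outside speech technology, the stand-in below illustrates the shape of the Study 2 task: a network mapping an utterance’s acoustics to a probability that a clinician would judge the “r” correct. It is not the published PERCEPT architecture; the mel-spectrogram front end, layer sizes, and pooling are my illustrative assumptions.

```python
# Minimal stand-in for the Study 2 task: predict a clinician's binary
# correct/incorrect judgment of "r" from the acoustics of one utterance.
# This is NOT the published PERCEPT architecture; the front end, layer
# sizes, and pooling below are illustrative assumptions.
import torch
import torch.nn as nn
import torchaudio


class RJudgmentClassifier(nn.Module):
    def __init__(self, n_mels: int = 80, hidden: int = 128):
        super().__init__()
        # 80-band mel-spectrogram as a generic acoustic front end
        self.frontend = torchaudio.transforms.MelSpectrogram(
            sample_rate=16000, n_mels=n_mels
        )
        # bidirectional GRU summarizes the variable-length utterance
        self.encoder = nn.GRU(
            input_size=n_mels, hidden_size=hidden,
            batch_first=True, bidirectional=True,
        )
        self.head = nn.Linear(2 * hidden, 1)  # logit for "clinician says correct"

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples) -> mel: (batch, n_mels, frames)
        mel = self.frontend(waveform).clamp(min=1e-6).log()
        mel = mel.transpose(1, 2)              # (batch, frames, n_mels)
        _, h = self.encoder(mel)               # h: (2, batch, hidden)
        summary = torch.cat([h[0], h[1]], dim=-1)
        return self.head(summary).squeeze(-1)  # train with BCEWithLogitsLoss


model = RJudgmentClassifier()
logits = model(torch.randn(4, 16000))  # four 1-second dummy utterances
```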

I co-wrote, and am a Co-Investigator on, a $2.5M NIH NIDCD R01 grant awarded to Dr. Jonathan Preston. Aims 2 and 3 of this grant arose directly from my dissertation work and will inform the continued improvement of the PERCEPT and ChainingAI technologies that Dr. Preston and I have developed.

My post-doctoral research focuses on optimizing and validating Dr. Espy-Wilson’s acoustic-to-articulatory speech inversion system to predict vocal tract configuration from the speech of children with speech sound disorders. This line of research will greatly inform the long-term development of clinical speech technologies for children with speech disorders.
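At its core, speech inversion is a frame-level regression from acoustics to articulatory parameters, often expressed as vocal tract variables (e.g., lip aperture, tongue-tip constriction degree). The sketch below shows the shape of that mapping with a generic recurrent regressor; it is not Dr. Espy-Wilson’s implementation, and the MFCC front end, six-variable output, and layer sizes are assumptions.

```python
# Sketch of acoustic-to-articulatory speech inversion as frame-level
# regression: acoustic frames in, vocal tract variable trajectories out.
# A generic stand-in, not Dr. Espy-Wilson's system; the MFCC front end,
# six output variables, and layer sizes are assumptions.
import torch
import torch.nn as nn
import torchaudio

N_TRACT_VARIABLES = 6  # e.g., lip aperture, tongue-tip constriction degree, ...


class SpeechInversionNet(nn.Module):
    def __init__(self, n_mfcc: int = 13, hidden: int = 256):
        super().__init__()
        self.frontend = torchaudio.transforms.MFCC(sample_rate=16000, n_mfcc=n_mfcc)
        self.encoder = nn.LSTM(n_mfcc, hidden, batch_first=True, bidirectional=True)
        self.regressor = nn.Linear(2 * hidden, N_TRACT_VARIABLES)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        feats = self.frontend(waveform).transpose(1, 2)  # (batch, frames, n_mfcc)
        encoded, _ = self.encoder(feats)                 # (batch, frames, 2*hidden)
        return self.regressor(encoded)                   # one TV vector per frame


model = SpeechInversionNet()
tract_vars = model(torch.randn(2, 16000))  # (2, frames, 6); train with MSE loss
# against ground-truth articulatory data (e.g., electromagnetic articulography)
```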

Enhancing automated language sample analysis through natural language processing in order to improve clinical access to validated language sample analysis methods.

Language sample analysis is an important but underutilized clinical assessment tool. Advances in natural language processing over the past decade can be leveraged to offset the resource barriers that keep language sampling out of the routine practice of speech-language pathologists (SLPs), including in telepractice. Previous attempts to automate the identification of child language disorder, however, have not demonstrated clinically acceptable accuracy. I am working with the developers of SUGAR to automate their validated language sample analysis process. I believe that automated language sample analysis can achieve performance similar to manual analysis when automated tools follow the best practices already validated within speech-language pathology, and that this approach will yield the clinically acceptable diagnostic accuracy that deep learning techniques have so far failed to achieve.
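As a small illustration of what this kind of automation looks like, the sketch below computes two common language sample measures, total number of words and mean length of utterance in words, from a transcript using Stanza. SUGAR’s validated protocol involves considerably more than this (e.g., morpheme- and clause-level analysis); the function and metric set here are my simplifications.

```python
# Minimal sketch of automating two language sample measures with
# off-the-shelf NLP tooling. Validated protocols such as SUGAR's involve
# more than this (e.g., morpheme-level rules and clause identification);
# this only shows reproducible word counts per utterance from a transcript.
import stanza

# one-time model download: stanza.download("en")
nlp = stanza.Pipeline("en", processors="tokenize")


def language_sample_metrics(utterances: list[str]) -> dict[str, float]:
    """Total number of words (TNW) and mean length of utterance in words."""
    words_per_utterance = []
    for utterance in utterances:
        doc = nlp(utterance)
        n_words = sum(
            1 for sent in doc.sentences for word in sent.words
            if word.text.isalpha()  # skip punctuation tokens
        )
        words_per_utterance.append(n_words)
    tnw = sum(words_per_utterance)
    return {"TNW": tnw, "MLU_words": tnw / max(len(words_per_utterance), 1)}


sample = ["the dog is running", "he wants two cookies"]
print(language_sample_metrics(sample))  # {'TNW': 8, 'MLU_words': 4.0}
```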

Identifying predictors of biofeedback treatment response in order to reduce residual speech errors in adulthood.

Childhood speech sound disorder, which affects an estimated 16% of children at the age of 3 years, continues to affect approximately 1-2% of the population in adulthood. Individuals with these residual speech errors may experience a lifetime of detrimental educational, occupational, and social outcomes. Evidence shows that therapy including biofeedback enhances outcomes for some children, adolescents, and adults with speech sound disorder, and it may be possible to further improve biofeedback treatment outcomes by matching client characteristics to specific biofeedback modalities.

Some children, however, do not respond even to high-quality speech sound therapy, and the factors influencing treatment non-response remain unknown. I believe that identifying these factors, including how learners cognitively process clinical feedback, can improve best practices in intervention and reduce the number of individuals who carry residual speech errors into adulthood.

Strengthening diagnostic feature sets for childhood apraxia of speech in order to improve treatment outcomes.

Clinicians and researchers must make diagnostic judgments regarding the presence or absence of childhood apraxia of speech, a neurologically based speech sound disorder, in order to design appropriate treatment plans and answer related research questions. There is a lack of consensus, however, regarding the checklists, features, and clinical tests that best describe childhood apraxia of speech. I believe that conceptualizing childhood apraxia of speech as a spectrum of difficulty that presents differently over the course of development is important for reconciling the seemingly conflicting descriptions of apraxic speech features in the literature. Moving away from a small set of “canonical features” will increase the emphasis on treatment planning that addresses each child’s unique presentation.

The tools I use to address these research themes include: Amazon Web Services (AWS), Google Speech API, HTCondor, Linux, MATLAB, Montreal Forced Aligner/Kaldi, Mplus, NLTK, Phon, Praat, Python, PyTorch, R, regular expressions, SAS, SPSS, and Stanza/CoreNLP.