focal points in learning.
Linguistics Research
I have been conducting research through the linguistics department at UC Berkeley for two years. Here are some of the major highlights of my experience:
Nasality in Panara & Kawaiwete
With Myriam Lapierre
A data-analysis-driven project centered on parsing audio files in Praat, using recordings made in the Brazilian Amazon by Myriam Lapierre. Panara (ISO: kre) is a language of the Je family and Kawaiwete (ISO: kyz) is a language of the Tupi-Guarani family; both are South American indigenous languages with complex nasal systems, which were the main focus of this project.
I first learned how to use Praat on this project, which gave me a greater understanding of the acoustic properties of speech and let me analyze data in a more hands-on manner.
I was heavily involved in creating a workflow from scratch for parsing Kawaiwete data collected in the summer of 2019, accounting for waveform, oral and nasal airflow data, and text transcriptions of each word.
Speech Perception in Fluent English Speakers
With Myriam Lapierre and Professor Susan Lin
This project focused heavily on data collection in the PhonLab in Dwinelle Hall on the UC Berkeley campus, using soundproof booths to run experiments with speakers fluent (but not necessarily native) in English.
I developed my communication and organizational skills markedly over the course of these experimental trials: I marketed the experiment to several classes on campus to cultivate interest in participating, scheduled meeting times with each subject, and organized and handled documentation and funds for the project (under the supervision of Professor Susan Lin).
Creation of a Nukuoro Text Corpus
With Emily Drummond
A project focused more on syntax, semantics, language documentation, and revitalization, using ELAN and FLEx to create a usable text corpus from recordings of native Nukuoro speakers and develop a functional talking dictionary. Nukuoro (ISO: nkr) is a Polynesian outlier language originating in Micronesia, with populations of speakers on the Nukuoro Atoll, Pohnpei, and in the US.
I organized field notes into individual entries in a database, demarcating Nukuoro phrases, their English glosses, and commentary from native Nukuoro speakers. I also aided in documenting sources and tagging individual entries for specific syntactic constructions and grammaticality, increasing ease of searchability and organization.
I used ELAN to parse recorded interviews and match the native speakers’ transcriptions to the audio, as well as convert files into usable formats for syntactic parsing and translation in FLEx.
In FLEx, I worked primarily on processing interviews with native speakers of Nukuoro by glossing transcriptions. This supported the talking dictionary by creating new entries for words and finding their usage in context, while also developing the corpus. Each word is documented with a breakdown of morphemes (where relevant); lexical entries, glosses, and grammatical usage information; and individual word glosses and categories. Part of this process also included adding free translations reconstructed from the field notes accompanying the targeted interviews and from documentation of Emily Drummond’s current study on Nukuoro grammar.
Automatic Recognition of Semantic Framing: From Lexical to Political and Social Framing
With Dr. Collin Baker
Sponsored by Professor Terry Regier
This project was conducted in association with the International Computer Science Institute at UC Berkeley and centered on bridging the gap between the lexical semantics perspective used by the FrameNet corpus and political framing discourse (we studied United States political framing in particular, though there was also some discussion of and insight into Canadian documents). Frame semantics is centered on the idea that human understanding of word meanings and expressions is contextually discerned by relation to semantic frames, which are collections of concepts (frame elements) that intrinsically evoke the frame in the mind.
I wrote scripts in Python to collect political texts such as United States congressional records, news articles, 2020 presidential election debate transcripts, and 2020 CA Proposition documentation from web pages and PDFs. These scripts were utilized to add to the FrameNet text database for further annotation.
I also wrote and edited scripts in Python to calculate frequencies of single- and multi-word expressions in the cleaned text versions. We utilized the frequencies to determine which frame elements were particularly prevalent in political discussion and documented frames that were necessary for political discourse related to topics like climate change.
I performed detailed annotations of texts utilizing an annotation program developed by FrameNet Brazil, tagging the usage of overall frames and the specific frame elements that evoked them.
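As a rough illustration of the frequency counting mentioned above, here is a minimal sketch of how single- and multi-word expression counts can be computed in Python. It is not the project's actual script: the function name ngram_frequencies, the simple regex tokenizer, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def ngram_frequencies(text, max_n=3):
    """Count 1- to max_n-word expressions in a cleaned text.

    Hypothetical helper for illustration only. Tokenization is a
    simple lowercase regex split, so n-grams may span sentence
    boundaries.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = {}
    for n in range(1, max_n + 1):
        # Slide a window of length n over the token list.
        grams = zip(*(tokens[i:] for i in range(n)))
        counts[n] = Counter(" ".join(gram) for gram in grams)
    return counts


# Made-up sample text, not taken from the collected documents.
sample = "Climate change is real. Climate change policy affects everyone."
freqs = ngram_frequencies(sample, max_n=2)
print(freqs[1].most_common(3))
print(freqs[2]["climate change"])
```

Counts like these can then be sorted to see which expressions are most frequent in a given set of texts.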
Personal Projects
I have written a number of unpublished papers in linguistics and would love to speak more about them (if you’re interested). The most notable are on the topics of:
a phonological description of Tagalog, including an analysis of the pattern of nasal spreading over words
a syntactic grammar for Tagalog
acoustic properties of Tagalog speakers (with comparison between a native speaker and a bilingual English speaker)
I am also interested in related topics in predictive health, cognitive development, and language acquisition!