Python Data Processing Basics for Acoustic Analysis

November 12, 2024


Interested in learning how to merge data and metadata from multiple sources into a streamlined dataset? Working with annotated audio data and want to get more efficient? Starting your first acoustics research project, but don’t know where to start when it comes to data processing?

This blog post will walk through some key domain-specific Python-based tools you should be aware of to take your audio data, annotations, and speaker metadata and come away with a tabular dataset containing acoustic measures that can be visualized and used for statistical analysis. I designed this firstly as a resource for future students starting their graduate studies in Linguistics without a strong data science background, but the core concepts can be adapted to a range of projects involving repeated measures data from multiple sources and/or direct work with audio files.

I will cover how to use file path information to load in TextGrid annotations and get speaker metadata, and how to access audio files in order to extract a series of acoustic measures. This tutorial will be done using a toy dataset focused on sibilant consonants, namely the ‘s’ and ‘sh’ sounds, but the code can be generalized to any speech sounds of interest.

You can follow along in the accompanying tutorial Jupyter Notebook housed at https://github.com/acgalvano/Py-Data-Acoustics. The tutorial assumes coding experience at the level of D-Lab’s Python Fundamentals.

Setup and Installation

To implement this tutorial, you will need to have conda, a package and environment manager, installed on your machine. I also recommend creating a new conda environment where you maintain the versions of software you need specifically for your acoustics projects.

First download Anaconda (which installs Python and many built-in packages), select the appropriate installer for your operating system, and follow the installation steps you are prompted with.

Once Anaconda is installed, you will need to open up your Terminal, which should be listed in your Applications directory (for my Mac it is inside the “Other” folder by default). Terminal allows us to submit direct commands to our operating system to install programs, run scripts, and more. Type “conda init” and hit return to ensure conda is initialized. Close and reopen Terminal.

To create a new environment, run the following lines one at a time in Terminal to create and activate the environment, using a name of your choice.

conda create --name [YOUR_ENV_NAME] 

conda activate [YOUR_ENV_NAME]

Next, you’ll need to install jupyterlab in order to run the tutorial notebook:

conda install jupyterlab

Then, we need to install a series of libraries so that we have access to all the tools we need:

  • pandas will allow you to work with formatted tabular data in Python. 

  • parselmouth imports the functionality of the acoustic analysis software Praat into Python. 

  • pathlib allows you to work efficiently with file paths. 

  • Ronald Sprouse’s audiolabel allows you to read and write label files, while phonlab gives you access to a range of utilities for phonetics work, including the dir2df function, which allows you to use informative file names in a data frame format.

For pandas, run the following in Terminal:

conda install pandas

For phonlab, go to https://github.com/rsprouse/phonlab and download the repository’s zip file (big green Code button → Download Zip). Unzip (double-click) the zip file and move the resulting folder to your Desktop. In Terminal, run the following two lines to install:

cd Desktop/phonlab-master


pip install .

Similarly, to install audiolabel, go to https://github.com/rsprouse/audiolabel and download the repository’s zip file. Unzip the zip file and move the resulting folder to your Desktop. In Terminal, run the following:

cd ~/Desktop/audiolabel-master


python setup.py install

To install parselmouth and pathlib, run the following using Python’s pip package manager. (Note that the parselmouth package is published on PyPI as praat-parselmouth; pathlib has shipped with the Python standard library since version 3.4, so the second command is optional.)

pip install praat-parselmouth


pip install pathlib

Next, you’ll need to download the tutorial notebook and toy dataset. First, go to this GitHub repository and download the repository’s zip file. Unzip the zip file and move the resulting folder to your Desktop.

Because our toy data itself contains sizable WAV files, we’ll need to retrieve them from Google Drive. Download the zip file from the Google Drive link provided, unzip it, then move the resulting data folder inside Py-Data-Acoustics-main.

Finally, return to Terminal and run the following:

cd Desktop/Py-Data-Acoustics-main


jupyter lab

The program will open your browser and set you up inside the tutorial materials folder. Double-click on processing-basics-acoustics.ipynb, and start following along below!

A Note on Data Organization

This tutorial is designed to work with data in two parts – a Praat TextGrid paired with a WAV file – that correspond to speech from individual speakers of interest. The toy dataset (taken from the author’s research with permission from the participants) is already set up to work well with this code. 

To most easily adopt the code into your personal workflow in the future, you should have one data folder, inside which there is one folder per speaker (e.g., “S01”, “S02,” etc.). Ensure that the files inside each speaker folder also have matching names (aside from the extension) that reflect the nature of the speaking task used to elicit the data (e.g., “S01_interview.wav” and “S01_interview.TextGrid”). Below is a screenshot showing how this file organization might look on a Mac:
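Equivalently, here is a plain-text sketch of that layout (the file names are hypothetical, following the naming pattern above; your copy of the toy data may use slightly different names):

data/
├── S01/
│   ├── S01_interview.TextGrid
│   └── S01_interview.wav
├── S02/
│   ├── S02_interview.TextGrid
│   └── S02_interview.wav
└── S03/
    ├── S03_interview.TextGrid
    └── S03_interview.wav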

Single Speaker Test

To start, we’ll test out the process of retrieving and merging data for a single speaker, step-by-step, before combining the steps into larger code chunks; then we’ll put the process into full practice by looping over all three test speakers.

Step 0: Load in libraries

First, we need to import the relevant tools from all the libraries we just installed. If the installation was successful, the code below should run with no output.

# be sure to follow setup and installation steps above first
import pandas as pd
from pathlib import Path
from phonlab.utils import dir2df
from audiolabel import read_label
import parselmouth as ps
from parselmouth.praat import call as pcall

Step 1: Save path to data directory, identify speaker directories

Next, we’ll need to retrieve the path to our toy dataset and store it as a variable. The data folder is stored within the same parent folder as the tutorial notebook, so ./data will suffice. Then, we can use this variable as an argument to dir2df, which generates a data frame with one column containing the name of each speaker-specific folder (relpath), and another column with the list of files inside these folders, specifically those ending in .wav (fname).

# get the path to larger folder containing your data
datadir = Path('./data').absolute()

# create df with by-speaker subfolders containing wav and TextGrid data for one speaker
# fnpat specifies unique wav files so that spkrdf contains each speaker name only once
spkrdf = dir2df(datadir, fnpat=r'\.wav$')
spkrdf

spkrdf should look like the following for our toy data:

Processing TextGrid Files

Step 2: Extract phones and words tiers from TextGrid

Since spkrdf contains three speakers, but we want to test out a workflow on just one, we can use head() to select only the first row, for S01’s data. We’re setting this process up as a for-loop now so that it’s easier to adapt to multiple speakers later. 

We can then establish a new variable, spkrfile, which uses the info we stored in datadir and spkrdf to get the specific path to S01’s TextGrid file. We can use spkrfile as an argument to read_label (a function from audiolabel) in order to store each tier in the TextGrid as its own data frame – in this case, the tiers labeled ‘phones’ and ‘words,’ which are stored in phdf and wrdf, respectively.

for row in spkrdf.head(1).itertuples():

    spkrfile = Path(datadir, row.relpath, row.fname).with_suffix('.TextGrid')

    [phdf, wrdf] = read_label([spkrfile], ftype='praat', tiers=['phones', 'words'])


phdf

phdf should look like this, including some empty phones segments where we didn’t include labels in Praat:

Similarly, wrdf looks like the following; note that, for now, phdf and wrdf have different row numbers (as we have multiple phone segments per word):

Step 3: Subsetting the phones data frame

Since phones are the finer-grained variable, we’ll want to subset phdf for the segments of interest before merging it into a single data frame with wrdf. Because we ultimately want to take measurements from only the sounds of interest, we can first eliminate all of the empty segments corresponding to intervals we ignored during annotation. We use copy() to ensure the new subset is treated as a unique object, avoiding pesky error messages.

Since read_label conveniently stored the start (t1) and end (t2) times for each TextGrid segment inside phdf, we can simply subtract the t1 column from the t2 column to get the duration of each segment. Moreover, since these TextGrids were force-aligned, and thus contain every phone of every word transcribed, we can use shift() on the phones column to populate two new columns, prev and nxt, with the previous and following sounds at each row.

Now we no longer need our non-target phones values, so we can use isin() to retain only the segments labeled ‘S’ and ‘SH’ (or whatever the target sounds for your analysis are), and from there, only retain the segments that are at least 0.05 sec long. This latter step is because Praat (and therefore parselmouth) only samples the speech signal about every 10 ms, and we want at least a few samples per measurement to ensure they are reliable.

# remove empty segments
phdf = phdf[phdf['phones'] != ''].copy()

# add phone duration column
phdf['phone_dur'] = phdf['t2'] - phdf['t1']

# add col for previous phone
phdf['prev'] = phdf['phones'].shift()

# add col for following phone
phdf['nxt'] = phdf['phones'].shift(-1)

# keep only relevant phones, remove short tokens
phdf = phdf[phdf['phones'].isin(['S', 'SH']) & (phdf['phone_dur'] >= 0.05)]

# check updated df - should be no empty phone segments or segments <0.05s
phdf

Now, phdf should look a bit neater and is much reduced in row number:

Step 4: Merge phones and words dfs

Now that we’ve cleaned up phdf to include only the data points of interest, we’re ready to merge in wrdf so that each phone segment has its accompanying word label. merge_asof() allows us to specify a matching column between the two data frames on which to merge, in this case, t1. Each phone segment gets matched to the word segment with the nearest start time. We can also specify a list of columns for each data frame, indicating the columns we want to retain from each; this way we don’t end up with duplicate t2 columns, and we can eliminate any columns with extraneous information (like fname above).

The suffixes argument is a safeguard, so if something goes wrong and there are duplicate columns, they get a suffix added to their names indicating where they came from.

# merge matching on closest start times between phone and word annotations
tg = pd.merge_asof(
        phdf[['t1', 't2', 'phones', 'phone_dur', 'prev', 'nxt']],
        wrdf[['t1', 'words']],
        on='t1',
        suffixes=['_ph', '_wd'] # in case there are duplicates
    )

# check merged df is same length and has only specified columns
tg

Our merged data frame is saved as tg, which should look like the following, with the same number of rows as phdf:

Step 5: Add in metadata

Finally, we can add two more columns containing the speaker ID and name of the recording that was annotated, again making use of the information stored in the relpath and fname columns of spkrdf (recall that we are working just with the first row for now). speaker gets added at index 0 (first column) and recording at index 1.

# add speaker column at the front of the df
tg.insert(0, 'speaker', row.relpath)

# add column for name of current recording as second column
tg.insert(1, 'recording', row.fname)

tg.head()

The first 5 rows of our completed TextGrid data should now look like this:

Putting it all together: Loading in your TextGrid data

Now that you understand how our TextGrid data are processed and have successfully generated tg piece by piece, go ahead and run the next cell in the notebook containing the complete for-loop. It should have the same output as above.

Processing Audio Files

Next, we have to prep the audio file for our single speaker to extract our desired acoustic measures and add them to our ever-growing data frame.

Step 6: Use WAV path to create sound object

Since we already stored the directory information about our WAV files inside datadir, we can reconstruct the path using the relpath and fname values once again and save it as wav. Here is where we will start to make use of parselmouth functions; if we provide the path to our WAV file as a string to the parselmouth method Sound(), it will save the WAV file as a ‘sound object’ in our notebook. This allows us to start manipulating the file as we would in Praat.

# get path to wav files
wav = datadir / row.relpath / row.fname

# use path name to create sound object
sound = ps.Sound(str(wav))

Step 7: Filter pitch out

In this demo, we will extract spectral moments from our target sounds, which center on high-frequency energy. We therefore need to isolate these upper frequencies to avoid the influence of things like pitch, which exist in a lower-frequency range. In parselmouth, as in Praat, we can do this using a stop-band filter, for which we need to specify a range of frequencies to filter (in this case 0-300 Hz, with a 100 Hz bandwidth).

# Set low frequency threshold

voicing_hz_filter = 300


# create voicing-filtered sound object (this step could take some time)

sound = pcall(sound, 'Filter (stop Hann band)...', 0, voicing_hz_filter, 100)

Step 8: Use lambda function to extract spectral moments

In order to extract our measurements for each phone, we need a way to apply parselmouth’s “get” methods iteratively to each individual phone segment. However, the information about start and end times for each segment is stored in tg, which is a separate object from the sound object we just created.

To get around this, we can use the apply() method on tg to iterate through each row, get t1 and t2 for that row, and apply those time values to the sound object using extract_part(). Because we want to extract spectral measures, we then create a spectrum object from just the current phone segment, using to_spectrum(). Finally, we can get all four spectral moments using their respective parselmouth “get” commands, e.g., get_center_of_gravity(). In this case, we are getting an average value for the given segment. See the parselmouth documentation for the list of possible spectrum commands.

Each of the extracted measures gets added to a new column, e.g., COG; once the lambda function finishes, the columns will be populated with values for each individual phone segment represented in tg.

# convert sound object to spectrum
# use lambda function to get spectral moments between t1 and t2 for each token
tg['COG'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
                    x.t2 - 0.01).to_spectrum().get_center_of_gravity(), axis=1)

tg['SD'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
                    x.t2 - 0.01).to_spectrum().get_standard_deviation(), axis=1)

tg['skew'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
                    x.t2 - 0.01).to_spectrum().get_skewness(), axis=1)

tg['kurtosis'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
                    x.t2 - 0.01).to_spectrum().get_kurtosis(), axis=1)

tg.head()

Now tg will have four new columns, for each spectral moment, and the first five rows should look like the following:

Looping Through Speakers

In most use cases, whether we are interested in phonetic variation, social variation, or both, we will want to compare our measurements across multiple speakers. So long as your by-speaker data are stored as described above, we only need to add a few lines to our code to make this happen. Namely:

  • Initialize an empty list, tglist, to temporarily store the data frame for each speaker.

  • Remove the head() method from our call to spkrdf so that we loop through the entirety of spkrdf.

  • Add print() statements to let us know which speaker and which step we're on.

  • Use concat() as a last step to combine each df in tglist into a single fulldf.

This code could take some time to execute.

spkrdf = dir2df(datadir, fnpat=r'\.wav$')
tglist = []
voicing_hz_filter = 300

for row in spkrdf.itertuples():
    print(row.relpath)

    # TextGrid portion
    spkrfile = Path(datadir, row.relpath, row.fname)
    [phdf, wrdf] = read_label(spkrfile.with_suffix('.TextGrid'), ftype='praat',
        tiers=['phones', 'words'])
    phdf = phdf[phdf['phones'] != ''].copy()
    phdf['phone_dur'] = phdf['t2'] - phdf['t1']
    phdf['prev'] = phdf['phones'].shift()
    phdf['nxt'] = phdf['phones'].shift(-1)
    phdf = phdf[phdf['phones'].isin(['S', 'SH']) & (phdf['phone_dur'] >= 0.05)]
    tg = pd.merge_asof(
        phdf[['t1', 't2', 'phones', 'phone_dur', 'prev', 'nxt']],
        wrdf[['t1', 'words']],
        on='t1',
        suffixes=['_ph', '_wd']
    )
    tg.insert(0, 'speaker', row.relpath)
    tg.insert(1, 'recording', row.fname)
    print('Done with TG')

    # wav portion
    wav = datadir / row.relpath / row.fname
    sound = ps.Sound(str(wav))
    sound = pcall(sound, 'Filter (stop Hann band)...', 0, voicing_hz_filter, 100)

    tg['COG'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
        x.t2 - 0.01).to_spectrum().get_center_of_gravity(), axis=1)

    tg['SD'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
        x.t2 - 0.01).to_spectrum().get_standard_deviation(), axis=1)

    tg['skew'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
        x.t2 - 0.01).to_spectrum().get_skewness(), axis=1)

    tg['kurtosis'] = tg.apply(lambda x: sound.extract_part(x.t1 + 0.01,
        x.t2 - 0.01).to_spectrum().get_kurtosis(), axis=1)

    print('Done with wav')

    tglist.append(tg.reset_index(drop=True))

fulldf = pd.concat(tglist, ignore_index=True)
print('Done')
fulldf

Here is what the output should look like:

Check that your row number has increased and ensure that measurements were stored for each speaker. We can see S01 and S03 look good from the preview above, and we can check S02 by subsetting for just that speaker:

fulldf[fulldf['speaker'] == 'S02']

If everything looks good, congrats – you have successfully compiled your core data into one succinct tabular dataset! Now let’s walk through how to add in additional metadata, using our current dataset as well as incorporating another external file with speaker demographic information.

Adding in Additional Metadata

Adding coarser-grained phonetic categorizations

Step 12: Add columns for previous and following vowel attributes

Let’s begin with how we can take our prev and nxt columns, which contain the previous and following sounds with respect to the target, and generate columns containing coarser-grained grouping information for those sounds. For example, if we later submit our data to statistical analysis, we may only want to explore the effects of vowel height and backness on the spectral moments results, rather than each individual vowel quality.

To do this, first we need to create two new columns containing the previous and following sounds, but omitting the stress markings (e.g., the ‘1’ in ‘AY1’) that got added by default during the forced alignment process (see the documentation for the Montreal Forced Aligner to learn more). We’re inserting these columns at indices 8 and 9 so that all the phone-related columns are next to each other. Now, each row of the new column prev_short contains only the first two characters from the corresponding cell in prev; likewise for nxt_short and nxt.

Note that we don't want to overwrite the original prev and nxt columns, though, in case we need those stress markings later. If your forced-alignment process does not generate stress markings, you can skip these steps and replace prev_short and nxt_short with prev and nxt below.

# isolate vowel qualities without stress marking
fulldf.insert(8, 'prev_short', fulldf['prev'].str[:2])
fulldf.insert(9, 'nxt_short', fulldf['nxt'].str[:2])

If we know all of the unique previous and following vowel qualities, which we can check by running, for example, fulldf['prev'].unique(), we can use them as values in a dictionary, where each key is a vowel grouping. In the code below, I’ve gone ahead and included every ARPAbet vowel as a safeguard. ARPAbet (Shoup 1980) is a commonly-used phonetic alphabet that uses ASCII symbols, rather than the International Phonetic Alphabet.
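If you’d like to run that check yourself, a quick cell along these lines (my own addition, not part of the original notebook) will list every label your dictionary needs to cover:

# list the phone labels that actually occur before and after the target sounds
print(fulldf['prev'].unique())
print(fulldf['nxt'].unique())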

In the case that you only want to work with one attribute for one column, like height in previous vowels only, you can use a simpler dictionary as shown in Example 1 in the notebook. This dictionary can then be referenced inside a short function that loops over every value in a column and checks if that value matches a value in the dictionary; if yes, the corresponding height grouping gets returned, and if not, “consonant” gets returned. 
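Example 1 itself isn’t reproduced in this post, but a minimal sketch of that simpler, single-attribute approach might look like the following; the vowel groupings and the column name prev_height_simple are illustrative (the name is chosen so it won’t clash with the prev_height column created below), and the notebook’s actual Example 1 may differ:

# hypothetical single-attribute version: height groupings for previous vowels only
prev_heights = {
    'high': ['IY', 'IH', 'UW', 'UH', 'IX', 'UX'],
    'mid': ['EH', 'AO', 'AX', 'AH', 'ER', 'AXR'],
    'low': ['AE', 'AA']
}

def get_prev_height(vowel):
    # return the height grouping if the vowel is listed, otherwise 'consonant'
    for height, vowels in prev_heights.items():
        if vowel in vowels:
            return height
    return 'consonant'

fulldf['prev_height_simple'] = fulldf['prev_short'].apply(get_prev_height)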

However, in the case that you want to code for multiple attributes (like height and backness), for both previous and following elements, we can construct a more complex dictionary, reference that in a function, and apply it to each relevant column using a lambda.

# save nested dictionary as vowel_attributes
vowel_attributes = {
    # height dictionaries
    'height': {
        # height groupings for previous vowels
        'prev_heights': {
            'high': ['IY', 'UW', 'IH', 'UH', 'IX', 'UX', 'EY', 'OW', 'EY', 'OY',
                     'AY', 'AW'],
            'mid': ['EH', 'AO', 'AX', 'AH', 'ER', 'AXR'],
            'low': ['AE', 'AA']
        },
        # height groupings for next vowels
        'nxt_heights': {
            'high': ['IY', 'IH', 'UH', 'IX', 'UX', 'EY'],
            'mid': ['EH', 'AO', 'AX', 'AH', 'ER', 'AXR', 'UW', 'EY', 'OW', 'OY'],
            'low': ['AE', 'AA', 'AY', 'AW']
        }
    },
    # backness dictionaries
    'backness': {
        # backness groupings for previous vowels
        'prev_backs': {
            'front': ['AE', 'AY', 'EH', 'ER', 'EY', 'IH', 'IY', 'OY'],
            'central': ['AX', 'AXR', 'IX', 'UX'],
            'back': ['AA', 'AH', 'AO', 'AW', 'OW', 'UH', 'UW']
        },
        # backness groupings for next vowels
        'nxt_backs': {
            'front': ['AE', 'EH', 'ER', 'EY', 'IH', 'IY'],
            'central': ['AW', 'AY', 'AX', 'AXR', 'IX', 'UX'],
            'back': ['AA', 'AH', 'AO', 'OW', 'UH', 'UW', 'OY']
        }
    }
}

We can apply get_vowel_attribute() to prev_short and nxt_short and save all the returned values to prev_height, nxt_height, prev_back, and nxt_back, respectively:

# function that asks for a vowel, what attribute we want, and which grouping we want
def get_vowel_attribute(vowel, attribute_type, grouping):
    attribute_dict = vowel_attributes[attribute_type][grouping] # pick out the relevant subdictionary
    for attribute, vowels in attribute_dict.items():
        if vowel in vowels: # if our vowel is in the subdictionary
            return attribute # return its grouping for the attribute we want
    return 'consonant' # if not in the subdict, return "consonant"


# use lambda to loop through each vowel in the given column and apply the function
# use insert to specify the indices at which to insert the new columns
fulldf.insert(10, 'prev_height', fulldf['prev_short'].apply(lambda vowel:
                                    get_vowel_attribute(vowel, 'height', 'prev_heights')))
fulldf.insert(11, 'nxt_height', fulldf['nxt_short'].apply(lambda vowel:
                                    get_vowel_attribute(vowel, 'height', 'nxt_heights')))
fulldf.insert(12, 'prev_back', fulldf['prev_short'].apply(lambda vowel:
                                    get_vowel_attribute(vowel, 'backness', 'prev_backs')))
fulldf.insert(13, 'nxt_back', fulldf['nxt_short'].apply(lambda vowel:
                                    get_vowel_attribute(vowel, 'backness', 'nxt_backs')))

fulldf

fulldf is getting pretty wide now, but the first 15 columns should look like the following:

Use this as a template for any phonetic grouping variables (e.g., roundness, tenseness, voice quality, etc.). Of course, this approach could be useful for grouping any kind of data, especially repeated measures data.
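As one concrete illustration (my own hedged sketch, not taken from the notebook), you could extend vowel_attributes with a rough rounding grouping for previous vowels and reuse get_vowel_attribute() unchanged; the class membership below is approximate, and diphthongs could reasonably be grouped either way:

# hypothetical extension: rounding groupings for previous vowels,
# added to the existing vowel_attributes dictionary
vowel_attributes['rounding'] = {
    'prev_round': {
        'rounded': ['UW', 'UH', 'OW', 'AO', 'OY', 'AW'],
        'unrounded': ['IY', 'IH', 'IX', 'UX', 'EY', 'EH', 'AE', 'AA',
                      'AH', 'AX', 'AXR', 'ER', 'AY']
    }
}

# apply it the same way as the height and backness groupings above
fulldf['prev_round'] = fulldf['prev_short'].apply(
    lambda vowel: get_vowel_attribute(vowel, 'rounding', 'prev_round'))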

Step 13: Add speaker demographic information

Next, let’s add in our speaker demographic information. spkr_demog.csv, which you downloaded as part of the tutorial, contains each speaker’s information in a single row:

Load in the speaker metadata. When working with your own data, make sure speaker is the leftmost column by “popping” it out and inserting it back at index 0:

# load in your speaker demographic data, make 'speaker' the leftmost column
spkr_demog = pd.read_csv('./spkr_demog.csv')
spkr_col = spkr_demog.pop('speaker')
spkr_demog.insert(0, 'speaker', spkr_col)

In fulldf, move speaker to be the rightmost column:

spk = fulldf.pop('speaker')
fulldf['speaker'] = spk

Now that the speaker columns are on opposite sides of the two data frames, we can use merge() to combine them. We take fulldf and merge in spkr_demog, lining them up on speaker. Every row in fulldf now gets populated with the demographic information for the appropriate speaker; if a row contains ‘S01’, it will be matched with the ‘S01’ row in spkr_demog, and S01’s information will be added in. The result is saved as a new data frame, merged_df.

# Complete left merge on speaker column

merged_df = fulldf.merge(spkr_demog, on='speaker', how='left')

We can then move the speaker column back to index 0 so it’s easy to see which speaker each row pertains to.

# Move speaker column back to index 0

spk1 = merged_df.pop('speaker')
merged_df.insert(0, 'speaker', spk1)

Check that merged_df has the same number of rows as fulldf and exactly seven additional columns:

# Check number of rows matches fulldf

merged_df.shape
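If you prefer an explicit pass/fail check rather than eyeballing the shape, a quick assertion (my own addition, assuming the demographic file contributes exactly seven new columns as described above) could be:

# optional sanity check: same row count, exactly seven added columns
assert merged_df.shape[0] == fulldf.shape[0]
assert merged_df.shape[1] == fulldf.shape[1] + 7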

Check that the columns from fulldf were all retained and that there are no duplicates:

# Check for column duplicates

merged_df.columns
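For a direct yes/no answer on duplicates (again, my own addition rather than part of the notebook):

# True would mean at least one duplicate column name slipped through
print(merged_df.columns.duplicated().any())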

Step 14: Save your completed dataset

Finally, we can save our fully updated dataset to a CSV file for later, using to_csv(). It will save to the current tutorial folder unless you specify a different path. I recommend adding today’s date to the file name so you can keep track of evolving versions.

# Save w/ original filename plus tag for metadata and date for reference

merged_df.to_csv('./spectral_moments_w-demog_[DATE].csv')
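If you’d rather not type the date by hand, one option (an optional addition of mine, not part of the original notebook) is to generate it with the standard library:

# optional: stamp today's date into the filename automatically
from datetime import date
merged_df.to_csv(f'./spectral_moments_w-demog_{date.today().isoformat()}.csv')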

Congratulations! Now our data is in the appropriate format to start exploring visualization and statistical analysis. 

———

References

  1. Anaconda Software Distribution. (2020). Anaconda documentation. Anaconda Inc. Retrieved from https://docs.anaconda.com/

  2. Boersma, P., & Weenink, D. (2021). Praat: doing phonetics by computer [Computer program]. Version 6.1.38, retrieved January 2, 2021, from http://www.praat.org/

  3. Jadoul, Y., Thompson, B., & de Boer, B. (2018). Introducing Parselmouth: A Python interface to Praat. Journal of Phonetics, 71, 1–15. https://doi.org/10.1016/j.wocn.2018.07.001

  4. McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., & Sonderegger, M. (2017). Montreal Forced Aligner: Trainable text-speech alignment using Kaldi. In Proceedings of the 18th Annual Conference of the International Speech Communication Association (Interspeech 2017).

  5. McKinney, W. (2010). Data structures for statistical computing in Python. In S. van der Walt & J. Millman (Eds.), Proceedings of the 9th Python in Science Conference (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a

  6. Python Software Foundation. (2024). pathlib — Object-oriented filesystem paths. In Python documentation (3.x version). Retrieved from https://docs.python.org/3/library/pathlib.html

  7. Shoup, J. E. (1980). Phonological aspects of speech recognition. In W. A. Lea (Ed.), Trends in speech recognition (pp. 125–138). Prentice Hall.

  8. Sprouse, R. (2024a). audiolabel: Python library for reading and writing label files for phonetic analysis (Praat, ESPS, Wavesurfer). GitHub. https://github.com/rsprouse/audiolabel

  9. Sprouse, R. (2024b). phonlab: UC Berkeley Phonlab utilities. GitHub. https://github.com/rsprouse/phonlab

  10. The pandas development team. (2020, February). pandas-dev/pandas: Pandas (latest version). Zenodo. https://doi.org/10.5281/zenodo.3509134

  11. Van Nuenen, T., Sachdeva, P., & Culich, A. (2024). D-Lab Python Fundamentals Workshop: D-Lab's 6-part, 12-hour introduction to Python. GitHub. https://github.com/dlab-berkeley/Python-Fundamentals