Measuring Vowels Without Relying on Sex-Based Assumptions
This blog post walks through a Python-based process for taking your audio data, annotations, and speaker metadata and producing a tabular dataset of fine-grained acoustic vowel measures that can be visualized and used for statistical analysis. The main incentive behind this project is to show that it is possible to get accurate vowel measurements – perhaps more so than with typical methods – without asking speakers for their assigned sex at birth or assuming it about them as the researcher. I designed this primarily as a resource for students in Linguistics programs with an interest in phonetics (and at least an introductory data science background), but the core concepts can be adapted to a range of projects involving repeated measures data from multiple sources and/or direct work with audio files.
I will first cover how to use file path information to load in TextGrid annotations and get speaker metadata, and how to access audio files in order to extract a series of acoustic measures. The tutorial uses a toy dataset focused on English vowels, but much of the code can be generalized to any speech sounds of interest, especially where the researcher wants accurate measurements without presuming speaker sex.
You can follow along in the accompanying tutorial Jupyter Notebook on GitHub. The tutorial assumes coding experience at the level of D-Lab’s Python Fundamentals workshop.
Setup and Installation
For setup and installation instructions, please visit my previous blog post on Python Data Processing Basics for Acoustic Analysis and follow until the section titled “Single Speaker Test.” Note that you will need to download this GitHub repo instead of the one listed there. Then pick back up here.
In addition to the steps outlined in the previous post, you will need to install a few additional libraries:
- numpy allows us to work efficiently with data in array format
- pyarrow allows us to save large tabular datasets in a space-efficient, binary “feather” format (instead of csv)
- matplotlib is a basic but powerful plotting library
- seaborn gives us access to more tools for creating “beautified” plots
All of these can be installed in Terminal using pip by running each of the following lines:
pip install numpy
pip install pyarrow
pip install matplotlib
pip install seaborn
You may need an earlier version of parselmouth for the FormantPath calls below to work without error. If you are seeing unexpected errors in steps 7-10, try recreating your environment using the custom yaml file here (included in the repository materials you downloaded) by running the following in Terminal:
conda env create -f formants.yaml
From here on out, we will be working with three test speakers’ data, plus one test utterance, found here.
Single Speaker Test
To start, we’ll test out the process of retrieving and merging data for a single utterance, step-by-step, before combining the steps into larger code chunks. We’ll use one utterance rather than a whole recording because, as you will see, we will be generating many data points per vowel token, and we just want to do a quick check that our code is working as intended. Then, we’ll put the process into full practice by looping over all three test speakers.
Step 0: Load in libraries
First, we need to import the relevant tools from all the libraries we just installed. If the installation was successful, the code below should run with no output.
# be sure to follow setup and installation steps above first
import numpy as np
np.Inf = np.inf  # this gets rid of a pesky error in newer versions of numpy
import pandas as pd
from pathlib import Path
from phonlab.utils import dir2df
from audiolabel import read_label
import parselmouth as ps
from parselmouth.praat import call as pcall
from parselmouth import Sound
import pyarrow
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
Step 1: Save path to data directory, identify speaker directories
Next, we’ll need to retrieve the path to our toy data and store it as a variable. The data folder is stored within the same parent folder as the tutorial notebook, so ./data will suffice. Then, we can use this variable as an argument to dir2df, which generates a data frame with one column containing the name of each speaker-specific folder (relpath), and another column listing the files inside these folders, specifically those ending in .wav (fname).
# get the path to larger folder containing your data
datadir = Path('./data').absolute()

# create df with by-speaker subfolders containing wav and TextGrid data for one speaker
# fnpat specifies unique wav files so that spkrdf contains each speaker name only once
spkrdf = dir2df(datadir, fnpat=r'\.wav$')
spkrdf
spkrdf should look like the following for our toy data:
Processing TextGrid Files
Step 2: Extract phones and words tiers from TextGrid
Note that while these steps are quite similar to those outlined in my previous posts, there are some differences in the details, so we will walk through everything again here.
First, because we are only interested in vowels, we need to create a list of target sounds so that later we can use it to filter our TextGrid data and save ourselves time. We can do this by creating a list with all the possible vowel sounds, in this case using ARPAbet (Shoup 1980), a commonly-used phonetic alphabet that uses ASCII symbols, rather than the International Phonetic Alphabet. The number at the end indicates whether the vowel had primary stress (1), secondary stress (2), or was unstressed (0), and we also include bare annotations as a catch-all.
vowels = ['IY', 'IY0', 'IY1', 'IY2', 'IH', 'IH0', 'IH1', 'IH2',
          'EY', 'EY0', 'EY1', 'EY2', 'EH', 'EH0', 'EH1', 'EH2',
          'AH', 'AH0', 'AH1', 'AE', 'AE0', 'AE1', 'AE2',
          'ER', 'ER0', 'ER1', 'ER2', 'UW', 'UW0', 'UW1', 'UW2',
          'UH', 'UH0', 'UH1', 'UH2', 'OW', 'OW0', 'OW1', 'OW2',
          'AA', 'AA0', 'AA1', 'AA2', 'AO', 'AO0', 'AO1', 'AO2']
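If you would rather not type the list out by hand, a list comprehension can build it from the base vowel symbols and the stress digits. This is just an optional alternative, not part of the original workflow; note that it also produces 'AH2', which the hand-typed list above happens to leave out, so adjust it to match your own inventory.

# base ARPAbet vowel symbols used above
base_vowels = ['IY', 'IH', 'EY', 'EH', 'AH', 'AE', 'ER', 'UW', 'UH', 'OW', 'AA', 'AO']

# bare symbol plus stress digits: 0 (unstressed), 1 (primary), 2 (secondary)
vowels_generated = [v + s for v in base_vowels for s in ['', '0', '1', '2']]

# compare against the hand-typed list; only 'AH2' should differ
print(sorted(set(vowels_generated) - set(vowels)))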
Since spkrdf contains four speakers, but we want to test out our workflow on just one, we can use head() to select only the first row, for S00’s data. We’re setting this process up as a for-loop now so that it’s easier to adapt to multiple speakers later.
We can then establish a new variable, spkrfile, which uses the info we stored in datadir and spkrdf to get the specific path to S00’s TextGrid file. We can use spkrfile as an argument to read_label (a function from audiolabel) in order to store each tier in the TextGrid as its own data frame – in this case, the tiers labeled ‘phones’ and ‘words,’ which are stored in phdf and wrdf, respectively.
for row in spkrdf.head(1).itertuples():
    print(f"Processing speaker: {row.relpath}")
    spkrfile = Path(datadir, row.relpath, row.fname).with_suffix('.TextGrid')
    phdf, wrdf = read_label(spkrfile, ftype='praat', tiers=['phones', 'words'])

phdf
phdf should look like this, including some empty phones segments where we didn’t include labels in Praat:
Similarly, wrdf looks like the following; note that, for now, phdf and wrdf have different row numbers (as we have multiple phone segments per word):
Step 3: Subsetting the phones data frame
Since phones are the finer-grained variable, we’ll want to subset phdf for the segments of interest before merging it into a single data frame with wrdf. Because we ultimately want to take measurements from only the sounds of interest, we can first eliminate all of the empty segments corresponding to intervals we ignored during annotation. We use copy() to ensure the new subset is treated as a unique object, avoiding pesky error messages.
Since read_label conveniently stored the start (t1) and end (t2) times for each TextGrid segment inside phdf, we can simply subtract all of t1 from all of t2 to get the duration of each segment. Moreover, since these TextGrids were force aligned, and thus contain every phone of every word transcribed, we can use shift() on the phones column to populate two new columns, prev and nxt, with the previous and following sounds at each row.
Now we no longer need our non-target phones values, so we can use isin() to retain only the vowels we included in the vowels list, and from there, only retain the segments that are at least 0.05 sec. This latter step is because Praat (and therefore parselmouth) only samples the speech signal at about every 10 ms, and we want at least a few samples per measurement to ensure they are reliable.
# remove empty segments
phdf = phdf[phdf['phones'] != ''].copy()

# add phone duration tier
phdf['phone_dur'] = phdf['t2'] - phdf['t1']

# add col for previous phone
phdf['prev'] = phdf['phones'].shift()

# add col for following phone
phdf['nxt'] = phdf['phones'].shift(-1)

# keep only vowels, remove short tokens
phdf = phdf[phdf['phones'].isin(vowels) & (phdf['phone_dur'] >= 0.05)].copy()

# check updated df - should be no empty phone segments or segments <0.05s
phdf
Now, phdf should look a bit neater and is reduced in row number:
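As an optional sanity check (not part of the original post, but it only uses columns created above), you can confirm that nothing except the target vowels survived the filtering and that every remaining token clears the 0.05 s duration threshold:

# all remaining labels should appear in the vowels list
print(sorted(phdf['phones'].unique()))

# shortest remaining token should be at least 0.05 s
print(phdf['phone_dur'].min())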
Step 4: Merge phones and words dfs
Now that we’ve cleaned up phdf to include only the data points of interest, we’re ready to merge in wrdf so that each phone segment has its accompanying word label. merge_asof() allows us to specify a matching column between the two data frames on which to merge, in this case, t1. Each phone segment gets matched to the word segment with the nearest start time. We can also specify a list of columns for each data frame, indicating the columns we want to retain from each; this way we don’t end up with duplicate t2 columns, and we can eliminate any columns with extraneous information (like fname above).
The suffixes argument is a safeguard, so if something goes wrong and there are duplicate columns, they get a suffix added to their names indicating where they came from.
# merge matching on closest start times between phone and word annotations
tg = pd.merge_asof(
    phdf[['t1', 't2', 'phones', 'phone_dur', 'prev', 'nxt']],
    wrdf[['t1', 'words']],
    on='t1',
    suffixes=['_ph', '_wd']  # in case there are duplicates
)

# check merged df is same length and has only specified columns
tg
Our merged data frame is saved as tg, which should look like the following, with the same number of rows as phdf:
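If you would rather have that length check be explicit instead of eyeballed (a small optional addition), an assert will make the notebook fail loudly whenever the merge changes the row count:

# merge_asof should return exactly one row per vowel token in phdf
assert len(tg) == len(phdf), "Merged data frame has a different number of rows than phdf"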
Setting the FormantPath parameters
Step 5: Create parameter dictionaries
The parselmouth library works by importing “calls” from Praat in Python, so essentially anything we would see in a prompt window from Praat when asking it to do something needs to be addressed in our code. To avoid making our parselmouth calls in the main loop excessively long, we can set the parameters we want in advance by storing them in a dictionary that we reference later. The nice thing about using “To FormantPath…” is that it helps us choose the best speaker-by-speaker, moment-by-moment formant analysis, so we can use the same parameters for all our speakers.
This is one of the main advantages of this approach over more typical ones, where each speaker gets their own parameter values, particularly for the analysis “ceiling,” based on their actual or presumed sex. In brief, ceilings are important here because if a speaker’s vocal tract resonances are relatively low but the ceiling is set too high, for example, the analysis may mistakenly identify higher formants (resonances – read more here) as lower ones, leading to incorrect measurements. Normally, this issue is addressed by leaning into the tendency for men to have longer vocal tracts and thus lower resonances than women, and setting two ceilings accordingly, but this ignores in-group variation and assumes we have the correct information about speaker sex. Oftentimes we don’t ask for this information properly if at all, and our incorrect assumptions can be both harmful and lead to inaccurate data. It’s time we address this!
Here, in our fpathparams dictionary, we’re giving Praat a starting point with mid_formant_ceiling that is intermediate between the typical ceilings for male and female speakers (5250 Hz) and telling it to choose the best ceiling, for each analysis window, within a range including 5 values above and below the mid value (at steps 0.05 * 5250 = 262.5 Hz in size). Ultimately Praat will fit a polynomial “path” over the sound file based on the settings chosen at each interval. We’ll extract 5 formants total, per max_num_formants. Read more about FormantPath objects here.
# parameters for the FormantPath analysis
fpathparams = {
    'time_step(s)': 0.005,
    'max_num_formants': 5.0,
    'mid_formant_ceiling': 5250,
    'window_len': 0.02,
    'pre_emph_from(Hz)': 50,
    'LPC_model': 'Robust',
    'ceiling_step_size': 0.05,
    'num_steps_up_down': 5,
    'tolerance_1': 1e-6,
    'tolerance_2': 1e-6,
    'num_std_dev': 1.5,
    'max_num_iterations': 5,
    'tolerance': 0.000001,
    'get_source_as_multichan_sound': 'no'
}
In addition, to convert the FormantPath object into tabular format, we’ll set the parameters for Praat’s “Down to table (optimal interval)...” call in the downtotableparams dictionary. The coeff_by_track parameter tells Praat the shape of the function we are fitting with our path. Other parameters like inc_num_formants and inc_bw tell Praat what columns to include in the table, in this case the number of formants successfully extracted and the formant bandwidth. We also have a final additional dictionary, downtotabledtype, that specifies the data type of each column in our output table, and we store the name of our desired columns as a list to use later. The resulting Table object can then be written to CSV or another delimited text format of your choice.
# parameters for the Table object
downtotableparams = {
    'coeff_by_track': '3 3 3 3 3',
    'power': 1.25,
    'inc_frame_num': 'no',
    'inc_time': 'yes',
    'num_time_decimal': 6,
    'inc_intensity': 'yes',
    'num_intensity_decimal': 3,
    'inc_num_formants': 'yes',
    'num_freq_decimal': 3,
    'inc_bw': 'yes',
    'inc_optimal_ceil': 'yes',
    'inc_min_stress': 'yes'
}

# dtypes for the Table
downtotabledtype = {
    'time(s)': np.float32,
    'intensity': np.float32,
    'nformants': np.int16,
    'F1(Hz)': np.float32,
    'B1(Hz)': np.float32,
    'F2(Hz)': np.float32,
    'B2(Hz)': np.float32,
    'F3(Hz)': np.float32,
    'B3(Hz)': np.float32,
    'F4(Hz)': np.float32,
    'B4(Hz)': np.float32,
    'F5(Hz)': np.float32,
    'B5(Hz)': np.float32,
    'Ceiling(Hz)': np.float32,
    'Stress': np.float32
}

# list of column names
downtotablecols = list(downtotabledtype.keys())
Working with the Audio File
Next, we have to prep the audio file for our single test utterance to extract our desired acoustic measures and add them to our ever-growing data frame.
Step 6: Use WAV path to create sound object
Since we already stored the directory information about our WAV files inside datadir, we can reconstruct the path using the relpath and fname columns once again and save it as wav. For now, we just want the first audio file, which we can select with .iloc[0].
row = spkrdf.iloc[0]
row
Then, if we provide the path to our WAV file as a string to the parselmouth method Sound(), it will save the WAV file as a ‘sound object’ in our notebook. However, since our audio comes from interview recordings, we also have to extract the channel with the participant’s audio to make sure it’s processed without noise from the other channel, which we do by selecting the second channel here. This allows us to start manipulating the file as we would in Praat.
# get path to wav files
wav = datadir / row.relpath / row.fname

# use path name to create sound object
snd_stereo = Sound(str(wav))

# extract channel for participant audio
snd = snd_stereo.extract_channel(2)
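As a quick optional check (not in the original post), you can query the sound objects through the same pcall interface to confirm that the channel extraction behaved as expected; ‘Get number of channels’ and ‘Get total duration’ are standard Praat queries for Sound objects:

# stereo original should report 2 channels, the extracted participant track should report 1
print(pcall(snd_stereo, 'Get number of channels'))
print(pcall(snd, 'Get number of channels'))

# the extracted channel should keep the full duration (in seconds)
print(pcall(snd, 'Get total duration'))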
Step 7: Create the FormantPath
Now we can generate our FormantPath. We can refer back to the values of our fpathparams dictionary using the “splat” operator (the initial asterisk). It’s also good practice to save the resulting object in case you want to check the current speaker’s output later for debugging purposes, which can be done by saving it to a binary file with pcall.
fp = pcall(snd, 'To FormantPath...', *fpathparams.values())
pcall(fp, "Save as binary file...", "dollar_store.FormantPath")  # save for later
# ds_fp = pcall("Read from file...", "dollar_store.FormantPath")  # re-import later using this line
Step 8: Down to Table, convert to DataFrame
Next, we convert our path to a Table object, referring back to the downtotableparams dictionary. The second ‘Down to Matrix’ step converts the Table into an indexable format, and the third step converts this matrix into a pandas DataFrame fmtdf, which we can work with directly in Python.
for phone_row in tg.itertuples(index=False):
    opttable = pcall(fp, 'Down to Table (optimal interval)...',
                     phone_row.t1, phone_row.t2,
                     *downtotableparams.values())
    optmatrix = pcall(opttable, 'Down to Matrix')

    # Create DataFrame from the extracted formant matrix
    fmtdf = pd.DataFrame({
        c: pd.Series(optmatrix.values[:, i], dtype=downtotabledtype[c])
        for i, c in enumerate(downtotablecols)
    })

fmtdf.head()
The first few columns of the head of fmtdf should look like the following, with the column names we specified in downtotablecols:
Step 9: Add metadata
Finally, we want to add all the additional metadata we want each row to be tagged with. In this case, we’ll include vowel start (t1) and end times (t2), speaker ID (speaker), recording name (recording), vowel labels (phones), vowel duration (phone_dur), previous (prev) and following vowels (nxt), and the word the vowel came from (words).
fmtdf['t1'] = phone_row.t1
fmtdf['t2'] = phone_row.t2
fmtdf['speaker'] = row.relpath
fmtdf['recording'] = row.fname
fmtdf['phones'] = phone_row.phones
fmtdf['phone_dur'] = phone_row.t2 - phone_row.t1
fmtdf['prev'] = phone_row.prev
fmtdf['nxt'] = phone_row.nxt
fmtdf['words'] = phone_row.words
fmtdf.head()
Our final dataset for the test utterance now includes these columns:
Looping Through Speakers
In most use cases, whether we are interested in phonetic variation, social variation, or both, we will want to compare our measurements across multiple speakers. So long as your by-speaker data are stored as described above, we only need to add a few lines to our code to make this happen. Namely:
- Initialize a fmtdf_list to temporarily store our data for each speaker.
- Remove the head() method from our call to spkrdf so that we loop through the entirety of spkrdf.
- Add print() statements to let us know which speaker we're on and which step we're on for that speaker.
- Use concat() as a last step to append each df in fmtdf_list into a single final_fmtdf.
This code could take some time to execute.
fmtdf_list = []

for row in spkrdf.itertuples(index=False):
    print(f"Processing speaker: {row.relpath}")
    spkrfile = Path(datadir, row.relpath, row.fname)
    phdf, wrdf = read_label(spkrfile.with_suffix('.TextGrid'), ftype='praat', tiers=['phones', 'words'])

    phdf = phdf[phdf['phones'] != ''].copy()
    phdf['phone_dur'] = phdf['t2'] - phdf['t1']
    phdf['prev'] = phdf['phones'].shift()
    phdf['nxt'] = phdf['phones'].shift(-1)
    phdf = phdf[phdf['phones'].isin(vowels) & (phdf['phone_dur'] >= 0.05)].copy()

    tg = pd.merge_asof(
        phdf[['t1', 't2', 'phones', 'phone_dur', 'prev', 'nxt']],
        wrdf[['t1', 'words']],
        on='t1',
        suffixes=['_ph', '_wd']
    )

    wav = datadir / row.relpath / row.fname
    snd_stereo = Sound(str(wav))
    snd = snd_stereo.extract_channel(2)

    fp = pcall(snd, 'To FormantPath...', *fpathparams.values())
    pcall(fp, "Save as binary file...", "dollar_store.FormantPath")

    for phone_row in tg.itertuples(index=False):
        opttable = pcall(fp, 'Down to Table (optimal interval)...',
                         phone_row.t1, phone_row.t2,
                         *downtotableparams.values())
        optmatrix = pcall(opttable, 'Down to Matrix')
        fmtdf = pd.DataFrame({
            c: pd.Series(optmatrix.values[:, i], dtype=downtotabledtype[c])
            for i, c in enumerate(downtotablecols)
        })
        fmtdf['t1'] = phone_row.t1
        fmtdf['t2'] = phone_row.t2
        fmtdf['speaker'] = row.relpath
        fmtdf['recording'] = row.fname
        fmtdf['phones'] = phone_row.phones
        fmtdf['phone_dur'] = phone_row.t2 - phone_row.t1
        fmtdf['prev'] = phone_row.prev
        fmtdf['nxt'] = phone_row.nxt
        fmtdf['words'] = phone_row.words

        # Append each token’s data to the accumulator list
        fmtdf_list.append(fmtdf)

# Combine results from all speakers into final DataFrame
final_fmtdf = pd.concat(fmtdf_list, ignore_index=True)
print("Final formant data compiled.")
final_fmtdf
Step 10: Save your completed dataset
# Save w/ original filename plus tag for metadata and date for reference
final_fmtdf.to_feather('./formant-paths_04-01-25.ft')
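To pick the analysis back up in a later session without rerunning the formant extraction (a small optional addition), the feather file can be read straight back into pandas:

# reload the saved dataset in a new session
final_fmtdf = pd.read_feather('./formant-paths_04-01-25.ft')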
Visualize the Vowel Space, Check for Outliers
Once we have our data for a few speakers, it's good practice to visualize the vowel formant data to check that it has the expected shape overall. We can do this using matplotlib and seaborn.
Step 11: Subset for stressed vowels
To get a general sense of the vowel space in our visuals, we’ll want to keep only the stressed vowels; that way, our data are tidier and aren’t impacted by vowel reduction. ARPAbet marks stress with a final number, as you might recall, so we can use a boolean mask to filter for the vowels marked with ‘1,’ and then save all the vowel labels without stress numbers in a new column, phones_short.
final_fmtdf = final_fmtdf[final_fmtdf['phones'].astype(str).str[-1] == '1'].copy()
final_fmtdf['phones_short'] = final_fmtdf['phones'].str[:-1]
final_fmtdf['phones_short'].unique()
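As an optional check on the result (not in the original post), you can see how many measurement rows each stress-stripped vowel category contributes:

# number of measurement rows per vowel category after the stress filter
print(final_fmtdf['phones_short'].value_counts())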
Step 12: Group using by-token medians
Because “To Formant Path…” gives us far more measurements per token than we need (as you can see above, roughly 20 minutes of speech from each of three speakers yielded 138,201 measurements), we’ll group together measurements taken between the same t1 and t2 for the same recording, summarizing each token by its median. This keeps large outliers from biasing the summary, as might happen with the mean. We can do this by chaining the DataFrame methods groupby() and agg(). Let’s also get rid of the S00 test data.
final_fmtdf = final_fmtdf[final_fmtdf['speaker'] != 'S00']

medians = final_fmtdf.groupby(['speaker', 'recording', 't1', 'phones_short']).agg(
    {'F1(Hz)': 'median', 'F2(Hz)': 'median'}
).reset_index()
medians.head()
Step 13: Filter for outliers
There are many ways we can filter for outliers, but one common method is to use descriptive statistics, i.e., mean and standard deviation. For each speaker, let’s filter our vowels to retain only the medians that are within 1.75 standard deviations of the mean for each vowel. We’ll first have to get the mean and standard deviation for each relevant grouping, then merge them into our medians data frame before filtering.
stats = medians.groupby(['speaker', 'phones_short'])[['F1(Hz)', 'F2(Hz)']].agg(['mean', 'std'])

# flatten the MultiIndex columns
stats.columns = ['_'.join(col).strip() for col in stats.columns.values]

# merge the stats with the medians df
medians = medians.merge(stats, on=['speaker', 'phones_short'], how='left')

threshold = 1.75

filtered_meds = medians[
    (medians['F1(Hz)'] >= medians['F1(Hz)_mean'] - threshold * medians['F1(Hz)_std']) &
    (medians['F1(Hz)'] <= medians['F1(Hz)_mean'] + threshold * medians['F1(Hz)_std']) &
    (medians['F2(Hz)'] >= medians['F2(Hz)_mean'] - threshold * medians['F2(Hz)_std']) &
    (medians['F2(Hz)'] <= medians['F2(Hz)_mean'] + threshold * medians['F2(Hz)_std'])
]
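To see how much data the filter removed (an optional check, not in the original post), you can compare row counts before and after, by speaker:

# proportion of token medians retained per speaker after outlier filtering
retained = filtered_meds.groupby('speaker').size() / medians.groupby('speaker').size()
print(retained.round(3))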
Step 14: Plot filtered medians for each speaker
Now we can save a list of all our speakers and loop over them to plot their filtered by-token medians. Note that we are setting axis limits for all three, so adjustments may be needed to accommodate additional data, and that we must invert the axes to get the canonical plot with F1 increasing downward and F2 increasing leftward. In the final step, we use groupby once more to get the mean of the medians in order to plot a label within each vowel’s distribution.
unique_speakers = filtered_meds['speaker'].unique()

for speaker in unique_speakers:
    speaker_data = filtered_meds[filtered_meds['speaker'] == speaker]

    # Scatter plot
    plt.figure(figsize=(8, 6))
    scatter_plot = sns.scatterplot(
        x='F2(Hz)',
        y='F1(Hz)',
        hue='phones_short',
        style='speaker',
        data=speaker_data,
        s=50,
        alpha=0.7,
        palette='muted'
    )
    plt.xlabel('F2 (Hz)')
    plt.ylabel('F1 (Hz)')
    plt.xlim(100, 3000)
    plt.ylim(200, 1200)

    # invert axes for vowel plotting
    plt.gca().invert_yaxis()
    plt.gca().invert_xaxis()

    # Label means of the medians
    means = speaker_data.groupby('phones_short')[['F1(Hz)', 'F2(Hz)']].mean().reset_index()
    for _, row in means.iterrows():
        plt.text(
            row['F2(Hz)'], row['F1(Hz)'], row['phones_short'],
            fontsize=12, ha='right', va='bottom', fontweight='bold'
        )

    plt.legend()
    plt.title(f'Vowel space for {speaker}')
    plt.show()
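If you would also like to keep a copy of each figure on disk (an optional addition; the filename pattern here is just an illustration), you can save it from inside the loop:

# add this line inside the loop above, just before plt.show()
plt.savefig(f'vowel-space_{speaker}.png', dpi=300, bbox_inches='tight')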
Congratulations! Now you can make custom visuals of speakers' vowel spaces in a corpus, based on fine-grained formant measurements taken without relying on sex-based assumptions.
As you can see, there may be more outliers to look into for these speakers, so this tutorial should serve as a starting point that can be used to assess other steps in the pipeline, such as the forced alignment that produced the annotations, before moving on to statistical analysis.
References
- Boersma, P., & Weenink, D. (2021). Praat: doing phonetics by computer [Computer program]. Version 6.1.38, retrieved January 2, 2021, from http://www.praat.org/
- Jadoul, Y., Thompson, B., & de Boer, B. (2018). Introducing Parselmouth: A Python interface to Praat. Journal of Phonetics, 71, 1–15. https://doi.org/10.1016/j.wocn.2018.07.001
- McKinney, W. (2010). Data Structures for Statistical Computing in Python. In S. van der Walt & J. Millman (Eds.), Proceedings of the 9th Python in Science Conference (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a
- Python Software Foundation. (2024). pathlib — Object-oriented filesystem paths. In Python documentation (3.x version). Retrieved from https://docs.python.org/3/library/pathlib.html
- Shoup, J. E. (1980). Phonological aspects of speech recognition. In W. A. Lea (Ed.), Trends in speech recognition (pp. 125–138). Prentice Hall.
- Sprouse, R. (2024a). audiolabel: Python library for reading and writing label files for phonetic analysis (Praat, ESPS, Wavesurfer). GitHub. https://github.com/rsprouse/audiolabel
- Sprouse, R. (2024b). phonlab: UC Berkeley Phonlab utilities. GitHub. https://github.com/rsprouse/phonlab
- The pandas development team. (2020, February). pandas-dev/pandas: Pandas (latest version). Zenodo. https://doi.org/10.5281/zenodo.3509134
- Van Nuenen, T., Sachdeva, P., & Culich, A. (2024). D-Lab Python Fundamentals Workshop: D-Lab's 6-part, 12-hour introduction to Python. GitHub. https://github.com/dlab-berkeley/Python-Fundamentals