When & Where
Tue, January 10, 2017 - 2:00 PM to 4:00 PM
Barrows 356: D-Lab Convening Room

This hands-on workshop walks through the common “preprocessing recipe” that serves as the foundation for a variety of other applications, along with some basic natural language processing techniques. These include: a) digitization (UTF-8), b) removal of stopwords, numbers, and punctuation, c) tokenization, d) calculation of word frequencies and proportions, e) part-of-speech tagging, and f) concordances. The workshop uses the NLTK Python package, so basic familiarity with Python is required if you wish to follow along with the tutorial.
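For a rough sense of what the recipe involves, the steps might be sketched as follows. This is an illustration using only the Python standard library, not the workshop's own material: the sample text, the tiny stopword set, and the helper names (`preprocess`, `word_proportions`, `concordance`) are all assumptions made up for this example. NLTK offers its own tools for these steps (for example `nltk.word_tokenize`, `nltk.FreqDist`, and `nltk.Text.concordance`), and part-of-speech tagging (`nltk.pos_tag`) is left out here because it needs a trained tagger model.

```python
import re
import string
from collections import Counter

# Hypothetical sample input and stopword list, for illustration only.
RAW_BYTES = "The cat sat on the mat. The dog, aged 3, sat on the cat!".encode("utf-8")
STOPWORDS = {"the", "on", "a", "an", "and", "of"}


def preprocess(raw_bytes):
    """Decode UTF-8, lowercase, strip numbers and punctuation, tokenize, drop stopwords."""
    text = raw_bytes.decode("utf-8").lower()
    # Replace runs of digits with a space.
    text = re.sub(r"\d+", " ", text)
    # Replace every ASCII punctuation character with a space.
    text = text.translate(str.maketrans(string.punctuation, " " * len(string.punctuation)))
    # Whitespace tokenization (a simplification of what NLTK tokenizers do).
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS]


def word_proportions(tokens):
    """Map each word to its share of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def concordance(tokens, target, width=2):
    """Return each occurrence of `target` with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [tok] + right))
    return lines


tokens = preprocess(RAW_BYTES)
counts = Counter(tokens)
proportions = word_proportions(tokens)
```

Calling `concordance(tokens, "sat", width=1)` on this sample lists each occurrence of "sat" with one neighboring token on either side. Note that the context is drawn from the already-filtered token list, so removed stopwords do not appear in it.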

This workshop is one part of a four-part series that will prepare participants to move forward with text analysis research, with a special focus on humanities and social science applications. Please register for each workshop separately. The other workshops in the series are listed below:

Please note: The instructor will be available from 4:00-5:00pm on Monday, January 9th and from 1:00-2:00pm on Tuesday, January 10th to assist with software installation and troubleshooting in preparation for parts 2, 3, and 4 of this series.

Training Host: D-Lab
Facilitator: Jon Stiles
Format Detail: Interactive, hands-on