When & Where
Tue, September 19, 2017 - 1:00 PM to 4:00 PM
Barrows 356: Convening Room

Students will learn the basics of cleaning, transforming, and formatting text data. They will extract specific elements from text strings and compute simple metrics from text data, such as word counts, syntax quantification via part-of-speech (POS) tagging, and sentiment polarity. Students will also be introduced to topic modeling and word2vec methods. The libraries used are NLTK, TextBlob, and gensim. This is an interactive, hands-on workshop in which students complete challenges related to each text analysis task.
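To give a rough sense of the simplest of these tasks, here is a minimal sketch of cleaning a string and counting words. It uses only the Python standard library rather than the workshop's libraries, and the function names (`clean_and_tokenize`, `word_counts`) and the sample sentence are illustrative assumptions, not workshop material.

```python
import re
from collections import Counter


def clean_and_tokenize(text):
    """Lowercase the text and return its word tokens, dropping punctuation.

    Hypothetical helper for illustration; an optional internal apostrophe
    is kept so that a word like "don't" stays one token.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_counts(text):
    """Count how often each cleaned token occurs in the text."""
    return Counter(clean_and_tokenize(text))


# Made-up sample sentence, used only to show the two helpers working together.
sample = "The cat sat on the mat. The mat was flat!"
counts = word_counts(sample)
print(counts.most_common(2))  # → [('the', 3), ('mat', 2)]
```

Libraries such as NLTK and TextBlob provide more robust tokenizers than this regular expression, along with the POS tagging and sentiment polarity mentioned above.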

Prior knowledge: Completion of D-Lab's Python for Everything Series.

Technology Requirements: Laptop required; please install the Anaconda distribution of Python 3 or its equivalent. The workshop will use Jupyter Notebook, but IDEs are also acceptable.

Please install the Python packages “gensim”, “nltk”, and “textblob”:

  • pip install gensim
  • pip install nltk
  • pip install textblob

Or, if you have Anaconda:

  • conda install gensim
  • conda install nltk
  • conda install textblob
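After installing, one way to confirm that the three packages are visible to your Python environment is a short check like the sketch below. It only looks the modules up without importing them; the helper name `missing_packages` is a hypothetical one chosen for this example.

```python
import importlib.util

# Import names of the three packages listed above.
REQUIRED = ["gensim", "nltk", "textblob"]


def missing_packages(names):
    """Return the names that cannot be located as importable modules.

    Hypothetical helper for illustration; find_spec returns None when a
    top-level module is not installed in the current environment.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All workshop packages are installed.")
```

Run it with the same Python interpreter (or inside the same Jupyter kernel) that you plan to use during the workshop, since packages installed into a different environment will not be found.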

Links: Install the Anaconda distribution of Python 3; Jupyter Notebook


Training Host: 
D-Lab Facilitator: Susan Grand
Format Detail: Interactive, hands-on
Participant Technology Requirement: Laptop required; show up early if you need help with installation