Session

Lecture: Process text, including tokenization and representing sentences as vectors

Dec 10, 2020, 10:00 AM
Online

Description

We introduce the core steps of text preprocessing: tokenization (splitting sentences into words), removing unnecessary punctuation and tags, removing stop words, stemming, and lemmatization. We treat the words in a text as discrete, categorical features, and we discuss techniques such as Bag of Words for representing words and sentences numerically.
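
As a rough illustration of the pipeline described above, here is a minimal sketch in plain Python that tokenizes two sentences, strips tags and punctuation, removes stop words, and builds Bag of Words count vectors. It is not taken from the lecture: the function names, the tiny stop-word list, and the example sentences are all assumptions chosen for illustration, and stemming and lemmatization are left out because they normally rely on a dedicated NLP library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "on", "and", "of", "to", "in"}


def tokenize(sentence):
    """Lowercase, drop HTML-like tags, and split into alphanumeric tokens."""
    no_tags = re.sub(r"<[^>]+>", " ", sentence.lower())
    # Keeping only runs of letters/digits also discards punctuation.
    return re.findall(r"[a-z0-9]+", no_tags)


def preprocess(sentence):
    """Tokenize a sentence and remove stop words."""
    return [tok for tok in tokenize(sentence) if tok not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Map each distinct token to a column index (alphabetical order)."""
    words = sorted({tok for tokens in token_lists for tok in tokens})
    return {word: index for index, word in enumerate(words)}


def bag_of_words(tokens, vocabulary):
    """Return a count vector with one entry per vocabulary word."""
    counts = Counter(tokens)
    vector = [0] * len(vocabulary)
    for word, index in vocabulary.items():
        vector[index] = counts[word]
    return vector


# Assumed example sentences, not from the lecture.
sentences = [
    "The cat sat on the mat.",
    "The dog sat on the <b>log</b>, and the cat ran!",
]
token_lists = [preprocess(s) for s in sentences]
vocabulary = build_vocabulary(token_lists)
vectors = [bag_of_words(tokens, vocabulary) for tokens in token_lists]

print(vocabulary)
for sentence, vector in zip(sentences, vectors):
    print(vector, "<-", sentence)
```

Each sentence becomes a fixed-length vector of word counts, so word order is lost; that loss is the defining trade-off of the Bag of Words representation.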

Presentation materials

There are no materials yet.