Description
We introduce the core concepts of text data processing: tokenization (splitting sentences into words), removal of unnecessary punctuation and tags, stop-word removal, stemming, and lemmatization. We treat the words in a text as discrete, categorical features, and we discuss techniques such as Bag of Words for representing them numerically.
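The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than a reference implementation. The stop-word list is a tiny hypothetical sample. The `crude_stem` helper is a naive suffix stripper standing in for a real stemmer such as Porter's. Lemmatization is left out because it needs a dictionary or morphological analyzer. All function names are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "is", "are", "on", "and", "of", "to", "in"}

def strip_tags(text):
    # Replace anything that looks like an HTML/XML tag with a space.
    return re.sub(r"<[^>]+>", " ", text)

def tokenize(text):
    # Lowercase, then keep runs of letters; punctuation and digits are dropped.
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(word):
    # Naive suffix stripping: an assumed stand-in for a real stemming algorithm.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    # Tags -> tokens -> stop-word removal -> stemming.
    tokens = tokenize(strip_tags(text))
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def bag_of_words(docs):
    # Build a sorted vocabulary, then one count vector per document.
    processed = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in processed for t in doc})
    vectors = []
    for doc in processed:
        counts = Counter(doc)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors

docs = [
    "<p>The cats are sleeping on the mat.</p>",
    "A cat sleeps, and the dog barked!",
]
vocab, vectors = bag_of_words(docs)
print(vocab)    # ['bark', 'cat', 'dog', 'mat', 'sleep']
print(vectors)  # [[0, 1, 0, 1, 1], [1, 1, 1, 0, 1]]
```

Each document becomes a vector of word counts over a shared vocabulary. Word order is discarded, which is the defining simplification of the Bag of Words representation.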