Natural Language Processing (NLP)

WORKING WITH TEXT

Training description

This custom training covers a broad range of topics in Natural Language Processing (NLP). Because the topic is complex, the exact scope is tailored to each client's needs. Topics range from an introduction to working with text in Python to applying state-of-the-art deep learning methods to various NLP tasks.

Duration: 3-7 days (depending on the exact scope)

Training agenda

Part one: Introduction

  • Loading text data (pandas, Python file API)
  • Basic string operations
  • Regular expressions
  • Crawling basics (Selenium)

Part two: Preprocessing

  • Data cleanup (BeautifulSoup)
  • Normalization
    • Stemming
    • Lemmatization
    • Stop words removal
  • Segmentation
  • Tokenization
    • Using basic string operations
    • With NLTK & spaCy
    • SentencePiece
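
As a taste of the "using basic string operations" end of this part, here is a minimal sketch of tokenization followed by stop-word removal using only the Python standard library. The function names and the tiny stop-word list are illustrative assumptions, not course material; NLTK and spaCy ship far more complete tokenizers and stop-word lists.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and extract runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


tokens = remove_stop_words(tokenize("The cat sat in the hat, and the hat is red."))
print(tokens)  # ['cat', 'sat', 'hat', 'hat', 'red']
```

A regex this simple ignores apostrophes, hyphens, and non-ASCII letters, which is one reason library or subword tokenizers are usually preferred in practice.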

Part three: Vectorization

  • Bag of words
    • Simple custom implementation
    • With scikit-learn
  • TF-IDF
    • Custom implementation
    • With scikit-learn
  • Dense word representations
    • word2vec
    • doc2vec
    • fastText
  • Introduction to contextual word representations
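
The "custom implementation" items for bag of words and TF-IDF can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: documents arrive already tokenized, and idf is computed as log(N / df), which is only one of several common variants. scikit-learn's `TfidfVectorizer`, for instance, applies a smoothed idf and L2 normalization by default, so its numbers will differ. The function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[tok] for tok in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight raw counts by idf = log(N / df), an unsmoothed variant."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = [[row[j] * idf[j] for j in range(len(vocab))] for row in counts]
    return vocab, weighted


docs = [["cat", "sat", "mat"], ["dog", "sat", "log"], ["cat", "chased", "dog"]]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
print(vocab)      # ['cat', 'chased', 'dog', 'log', 'mat', 'sat']
print(counts[0])  # [1, 0, 0, 0, 1, 1]
```

Terms that occur in a single document ("mat") receive a higher weight than terms shared across documents ("cat", "sat"), which is the effect TF-IDF is meant to capture.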

Part four: Text-based models

  • Similarity-based models
    • Anomaly detection via clustering
    • Category/tag assignment via k-NN
  • Introduction to deep learning on classification tasks
    • MLP + TF-IDF
    • LSTM/GRU + vector sequences
    • CNN + vector sequences
  • Sentiment analysis
  • Part-of-speech tagging
    • Using out-of-the-box models
    • Fine-tuning POS models
  • BERT
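
To illustrate the similarity-based item above, here is a minimal sketch of tag assignment via k-NN over cosine similarity. The toy vectors, the tags, and the function names are assumptions made for this example; in practice the vectors would come from one of the vectorization methods in Part three, and a library implementation such as scikit-learn's `KNeighborsClassifier` would typically be used instead.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_tag(query, vectors, tags, k=3):
    """Return the majority tag among the k training vectors most similar to query."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(query, vectors[i]),
        reverse=True,
    )
    votes = Counter(tags[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy count vectors over the hypothetical vocabulary
# ["goal", "match", "election", "vote"].
train_vectors = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
train_tags = ["sports", "sports", "politics", "politics"]

print(knn_tag([1, 1, 0, 0], train_vectors, train_tags))  # sports
print(knn_tag([0, 0, 1, 1], train_vectors, train_tags))  # politics
```

Sorting every training vector per query is fine for a toy set; larger collections usually call for an approximate nearest-neighbour index.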

Technologies used in the training:

  • Primary: Python, NLTK/spaCy, PyTorch/Keras + TensorFlow
  • Secondary: Selenium, BeautifulSoup
  • Optional: Gensim, Flair, BERT, Polyglot, fastText

Contact us about closed training