Building and analyzing word-based and character-based LSTM models using Python, Keras and the NLTK library

If you’re not familiar with the NLTK library and data preprocessing, take a look at this article. If you’re interested in language models and how to build them, read this article. If you’re familiar with NLP and language models, continue reading!

What are LSTMs?

Long short-term memory models, or LSTMs, address the problem of short-term memory by using gates that regulate the flow of information. These gates decide whether to keep or discard each piece of information, which lets the model retain what is important over a long sequence. …
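
To make the gating idea concrete, here is a minimal sketch of a single LSTM time step written in plain Python rather than Keras. It uses scalar values instead of vectors, and the weights in `weights` are hypothetical numbers chosen for illustration, not values from a trained model.

```python
import math


def sigmoid(x):
    """Squash x into (0, 1) so it can act as a soft gate."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """Run one time step of a scalar LSTM cell.

    w maps each gate name to an assumed (input weight, recurrent weight, bias) triple.
    """
    def pre(name):
        wx, wh, b = w[name]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # the new value that could be written
    c = f * c_prev + i * g            # updated long-term cell state
    h = o * math.tanh(c)              # updated hidden state (the output)
    return h, c


# Illustrative weights only: forget gate near 1, input gate near 0.
weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0  # start with a cell state worth remembering
for x in [0.5, -0.3, 0.8, 0.1]:
    h, c = lstm_step(x, h, c, weights)

print(h, c)
```

With the forget gate almost fully open and the input gate almost closed, the cell state stays close to its starting value across all four steps, which is the "retain important information" behaviour described above.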

Experimenting with POS tagging, a standard sequence labeling task using Conditional Random Fields, Python, and the NLTK library.

For an introduction to NLP and basic text preprocessing, refer to this article. For an introduction to language models and how to build them, take a look at this article. If you’re familiar with NLP and its tools, continue reading!

What is POS Tagging?

POS or part-of-speech tagging is the technique of assigning a label to each token in a text to indicate its part of speech, and often other grammatical properties as well. These labels can later be used in text analysis algorithms. For example, take the sentence:

She is reading a book.

‘She’ would get the POS tag of pronoun, ‘is’ would get an auxiliary tag, ‘reading’ a verb tag, ‘a’ an article tag, and ‘book’ a noun tag. We can then search for all verbs, which would pull up the word ‘reading’, and also use these tags in other algorithms. …
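
The tag-then-search idea can be sketched with a toy lookup table. This is not the Conditional Random Fields tagger the article goes on to build: `TOY_LEXICON` and its tag names are assumptions written by hand for this one sentence, whereas a real tagger predicts tags from context.

```python
# Hand-written tags for the example sentence; a real tagger would predict these.
TOY_LEXICON = {
    "she": "pronoun",
    "is": "auxiliary",
    "reading": "verb",
    "a": "article",
    "book": "noun",
}


def tag(tokens):
    """Pair each token with its tag, or 'unknown' if it is not in the toy lexicon."""
    return [(tok, TOY_LEXICON.get(tok.lower(), "unknown")) for tok in tokens]


def words_with_tag(tagged, wanted):
    """Return the tokens whose tag equals `wanted`."""
    return [tok for tok, t in tagged if t == wanted]


tokens = "She is reading a book".split()
tagged = tag(tokens)
verbs = words_with_tag(tagged, "verb")

print(tagged)
print(verbs)
```

Once every token carries a tag, searching for a part of speech is a simple filter over the tagged list.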

Building and comparing the accuracy of NB and LSTM models on a given dataset using Python, Keras and the NLTK library.

If you’re not familiar with the NLTK library and data pre-processing, take a look at this article. If you’re interested in language models and how to build them, read this article. If you’re familiar with NLP and language models, continue reading!

What are Language Based Classifier Models?

A statistical language model is a probability distribution over sequences of words which can be used to predict the next word for text generation and many other applications. Classifiers such as Naive Bayes make use of a language model to assign class labels to some instances, based on a set of features which can be numerically represented using statistical techniques. …
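
As a rough illustration of how a Naive Bayes classifier assigns class labels from word counts, here is a minimal sketch in plain Python. It assumes bag-of-words features and add-one (Laplace) smoothing, and the four training sentences are made up for the example. The names `train_nb` and `predict_nb` are hypothetical helpers, not NLTK functions.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log prior plus summed log likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior of the class
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data for illustration only.
training = [
    ("great movie loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring plot awful acting", "neg"),
]
model = train_nb(training)

print(predict_nb(model, "great acting"))
print(predict_nb(model, "awful movie"))
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer texts.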

Building and studying statistical language models from a corpus dataset using Python and the NLTK library.

To get an introduction to NLP, NLTK, and basic preprocessing tasks, refer to this article. If you’re already acquainted with NLTK, continue reading!

What are Language Models?

A language model learns to predict the probability of a sequence of words. But why do we need to learn the probability of words?

In machine translation, we take in a bunch of words from a language and convert these words into another language. Now, there can be many potential translations that a system might give us and we will want to compute the probability of each of these translations to understand which one is the most accurate. …
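
Scoring candidate translations can be sketched with a tiny bigram model. The snippet below is a minimal illustration in plain Python, assuming add-one smoothing and a three-sentence toy corpus invented for the example. The `<s>` and `</s>` markers are assumed sentence-boundary tokens.

```python
from collections import Counter

# Toy corpus for illustration only.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

context_counts = Counter()  # how often each word appears as the first of a pair
bigram_counts = Counter()   # how often each (previous, current) pair appears
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    context_counts.update(tokens[:-1])
    bigram_counts.update(zip(tokens, tokens[1:]))

# Words that can be predicted: every corpus word plus the end marker.
vocab_size = len({w for s in corpus for w in s.split()} | {"</s>"})


def sentence_probability(sentence):
    """Multiply smoothed bigram probabilities P(word | previous word)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)
    return prob


candidates = ["the cat sat on the rug", "cat the rug on sat the"]
best = max(candidates, key=sentence_probability)

print(best)
```

Both candidates use the same words, but only the first follows word orders seen in the corpus, so the model gives it the higher probability.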

A detailed walkthrough of preprocessing a sample corpus with the NLTK library using stemming and lemmatization.

What is Natural Language Processing?

Natural Language Processing or NLP is a branch of artificial intelligence that deals with the interaction between computers and humans using natural language. The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a manner that is valuable. To this end, many different models, libraries, and methods have been used to train machines to process text, understand it, make predictions based on it, and even generate new text. The first step in training a model is to obtain and preprocess the data. …

My journey in the Phase I of the Machine Learning for Microsoft Azure Scholarship Program by Udacity.


  1. Introduction
  2. The Application Process
  3. Program Roadmap
  4. The Foundations Course
  5. The Slack Workspace
  6. Student Leaders
  7. Study Groups
  8. The 50DaysOfUdacity Challenge
  9. The Project Showcase Challenge
  10. Study Jams
  11. Live AMA with Microsoft!
  12. Other Challenges
  13. Conclusion
  14. Phase II
  15. Resources


This article is about the Microsoft Azure Scholarship Program by Udacity and my journey through the 2.5 months of Phase I of the scholarship. It covers the application and selection process, the various components of the program, and the community initiatives. In the Resources section, I have also added a comprehensive Airtable of all the resources my peers collected over the course of the program, including course notes, ML and Python resources, SDE interview questions, GitHub repositories, and more. It proved extremely helpful to me in learning and preparing for interviews as well as networking with peers. …

National Aerospace Conceptual Design Competition III


  1. Introduction
  2. Overview
  3. The Problem Statement
  4. Mission Requirements
  5. Evaluation Criteria
  6. Stages of the Concept Design
  7. A Learning Experience
  8. References


Four teammates and I participated in the third edition of NACDeC as a team of five, representing IIT Kharagpur, from October 2019 to around September 2020. We reached Stage II, but unfortunately could not qualify for Stage III, the finals. In this article, I will talk about the overall journey through the various rounds and reviews, and give a brief overview of the technical aspects involved, such as the software used, the iterations done, and the results obtained.


The Design Division of the Aeronautical Society of India was set up in 2017 to act as a torch bearer for aerospace design professionals, and to help them scale professional heights by offering a platform for inter-organisational exchange of ideas, for reporting professional contributions, and for augmenting professional knowledge. With this in mind, the Design Division decided to conduct a yearly National Aerospace Conceptual Design Competition (NACDeC) for students of Aeronautical or Aerospace Engineering departments at CFTIs or AICTE-approved institutions in India. …

How we designed the autonomous robotics event, Tesseract, as part of Asia’s largest tech fest, Kshitij 2020, at IIT Kharagpur.


  1. Introduction
  2. Overview
  3. Unique Selling Point
  4. Initial Parameters
  5. Important Factors
  6. Prototyping
  7. Additional Support
  8. Swag
  9. Conclusion
  10. References


This article walks through the iterative process of designing and prototyping a technical event at the national level. Apart from listing some of the crucial parameters to consider while designing, I will also briefly go through the event we designed.


Technology Robotix Society is an official society under the Technology Students’ Gymkhana, IIT Kharagpur, and is a focal point for activities and projects related to robotics on campus. With its reach expanding steadily each year, it has also cemented its position as one of the nerve centres of amateur robotics in India. …

The project done by my team and me in the Mars Colonisation Program, as part of Microsoft Engage 2020.

If you haven’t read my article on the Mars Colonisation Program With Microsoft, go read that first!

Table of Contents

  1. Introduction
  2. Problem Statement
  3. Evaluation Criteria
  4. Brainstorming Sessions
  5. The Example Code
  6. Our Project
  7. Low Level Usage
  8. Online Demo of the Final Project
  9. Submission
  10. Future Possibilities
  11. Important Links


This article covers the project we built in the Mars Colonisation Program, part of Microsoft Engage 2020, which was held in June and July. It describes our overall implementation, the software we used, the algorithms we deployed, and the functionalities we added.

Problem Statement

We could choose either of two projects, each of which came with example code…

My journey and experience with Microsoft in Engage 2020, in the Mars Colonization Program, in association with Ace Hacker.

Table of Contents

  1. What is Microsoft Engage
  2. How Did I Qualify
  3. Overview of the Program
  4. A Learning Experience
  5. Results
  6. Important Links

What is Microsoft Engage?

Microsoft Engage is a program created by Microsoft engineers, in association with Ace Hacker, in which students work on projects with live interaction and help from engineers and mentors at Microsoft. Students can join the program by qualifying through Codess, Codefundo, or the Engage qualification test itself. This year, the projects were based on the theme of Mars Colonization.

The Mars Colonization Program

How Did I Qualify?

Codess is a community for female coders initiated by Microsoft. It was established to explore ways to promote gender diversity in the engineering field. I learned about Codess through seniors in my college, IIT Kharagpur, who told me that it was a great way to get into Microsoft, and contribute to eradicating gender disparity in STEM, especially for second and third year undergraduate female students. …


Ruthu S Sanketh

Majoring in Aerospace at IIT Kharagpur. Passionate about robotics, AI, and tech that is going to shape the future. Spend most of my time reading :)
