Project Three - Natural Disaster Tweet Classification
Twitter can be a powerful tool for finding urgent information about situations around the globe. However, it is also used to share trivial content such as sales promotions and petty gossip.
Hashtags can be useful for sifting through the clutter, but they are often misused, for example by adding #Hurricane to a sale ad to increase views. This misuse makes it difficult for individuals and organizations to identify relevant information. The problem has become more pressing as social media has grown into an overwhelming store of information.
In the past decade, it has become possible to train computers to analyze the complex structure of the English language. The purpose of this research paper is to apply Natural Language Processing (NLP), combined with machine learning classifiers, to a collection of tweets. The goal is to predict whether or not a tweet is reporting on a natural disaster or crisis.
Using the collection of Twitter posts, we trained Random Forest and Multinomial Naive Bayes classifiers. With these models, we were able to predict with an accuracy of 80.6 percent whether or not a tweet was describing a natural disaster. The findings are promising and could help the general population analyze information shared during times of crisis.
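To make the classification step concrete, here is a minimal sketch of a Multinomial Naive Bayes text classifier written with only the Python standard library. It is not the project's actual code: the class name TinyMultinomialNB, the tokenizer, the smoothing value, and the handful of example tweets are all assumptions invented for illustration. The project itself may well have used a library implementation and a different feature representation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and return its word tokens ('#' and punctuation are dropped)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyMultinomialNB:
    """Hypothetical bag-of-words Multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)  # number of tweets per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is every token seen in any training tweet.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each class for one tweet."""
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / n_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((self.word_counts[c][tok] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def accuracy(model, texts, labels):
    """Fraction of tweets whose predicted label matches the true label."""
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Invented toy data: 1 = reports a disaster, 0 = does not.
train_texts = [
    "Wildfire spreading fast near the highway, evacuate now",
    "Earthquake shook the city, buildings damaged",
    "Flood waters rising, residents evacuate downtown",
    "Huge #Hurricane sale this weekend, 50% off shoes",
    "This new album is fire, love it",
    "Traffic was a disaster but the party was great",
]
train_labels = [1, 1, 1, 0, 0, 0]

model = TinyMultinomialNB().fit(train_texts, train_labels)

test_texts = [
    "Residents evacuate as flood waters damage buildings",
    "Big sale on shoes this weekend",
]
test_labels = [1, 0]
print(accuracy(model, test_texts, test_labels))
```

The toy set is far too small to say anything about real performance. It only shows the mechanics: count words per class, score a new tweet under each class, and report the share of correct predictions, which is the accuracy metric quoted above.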
Final Report
Presentation