Service Quality Monitoring in Confined Spaces Through Mining Twitter Data (Detecting Events of Interest)

This repository implements Frequency and Sentiment-based Event Detection (FSED) in the context of public transport service quality (SQ). It compares the results of the proposed method with the following state-of-the-art time-series-based event detection approaches:

- Bivariate Outlier Detection
- Seasonal Hybrid ESD (S-H-ESD)
- TSOutliers
- Anomalize
- Long Short-Term Memory (LSTM)

In this project, two major transport hubs are considered as case studies because of their current importance in moving large numbers of people. First, a Twitter dataset comprising more than 32 million tweets is obtained from the Australian Urban Research Infrastructure Network (AURIN). Keywords and spatial proximity to the hubs are then used to identify relevant tweets.
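The keyword and proximity filter described above could look roughly like the following sketch. The hub coordinates, the radius, the keyword list, and the tweet field names (`text`, `lat`, `lon`) are all placeholders chosen for illustration, not values from this project. It also assumes a tweet is kept when either criterion matches, which may differ from the rule actually used.

```python
import math

# Placeholder values for illustration only; not taken from this project.
HUB = (-37.8, 145.0)          # (latitude, longitude) of a hypothetical hub
RADIUS_M = 500.0              # hypothetical search radius in metres
KEYWORDS = ("train", "station", "platform")  # hypothetical keyword list


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_relevant(tweet, hub=HUB, radius_m=RADIUS_M, keywords=KEYWORDS):
    """Keep a tweet if it mentions a keyword OR was posted near the hub.

    The OR rule is an assumption. Keyword matching is a plain
    case-insensitive substring test, so "train" also matches "training".
    """
    text = tweet["text"].lower()
    has_keyword = any(k in text for k in keywords)
    near_hub = haversine_m(tweet["lat"], tweet["lon"], hub[0], hub[1]) <= radius_m
    return has_keyword or near_hub
```

Applied to a list of tweet dictionaries, `[t for t in tweets if is_relevant(t)]` would give the candidate set that goes on to manual labelling.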

Next, tweets are manually labelled and mapped to different aspects of SQ (Safety, View, Information, Service Reliability, Comfort, Personnel, and Additional Services). Tweets that do not fall into any of these aspects are considered irrelevant to the SQ of public transport and are therefore discarded (Class -1).
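As a small illustration of this labelling scheme, the sketch below maps class ids to aspect names and drops Class -1. Only the aspect names and the -1 class come from the description above; the non-negative ids and the `(text, label)` tuple layout are assumptions made for the example.

```python
# Aspect names come from the text above; the integer ids 0..6 are assumed.
ASPECTS = {
    0: "Safety",
    1: "View",
    2: "Information",
    3: "Service Reliability",
    4: "Comfort",
    5: "Personnel",
    6: "Additional Services",
}
IRRELEVANT = -1  # Class -1: not related to SQ, discarded


def keep_relevant(labelled):
    """Drop Class -1 tweets and replace each class id with its aspect name.

    `labelled` is an iterable of (text, label) pairs (assumed layout).
    """
    return [(text, ASPECTS[label]) for text, label in labelled if label != IRRELEVANT]
```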

Finally, a list of events that happened inside SCS is manually labelled for the study period. The resulting events are then verified against a list of events provided by the SCS authorities, which serves as ground truth.
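One way to check a set of detected events against such a ground-truth list is to compare event dates and report precision and recall. The sketch below assumes that events are represented by calendar dates and that a match is allowed within a tolerance of a few days. Both the representation and the `tolerance_days` parameter are illustrative choices, not necessarily what this repository does.

```python
from datetime import date


def evaluate(detected, ground_truth, tolerance_days=0):
    """Return (precision, recall) for detected event dates.

    A detected date counts as correct if some ground-truth date lies
    within `tolerance_days` of it, and vice versa for recall.
    """
    def matches(d, pool):
        return any(abs((d - other).days) <= tolerance_days for other in pool)

    correct = sum(1 for d in detected if matches(d, ground_truth))
    found = sum(1 for g in ground_truth if matches(g, detected))
    precision = correct / len(detected) if detected else 0.0
    recall = found / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

For example, with made-up dates, `evaluate([date(2020, 1, 1), date(2020, 1, 5)], [date(2020, 1, 1), date(2020, 1, 9)])` gives a precision and recall of 0.5 each, since only one detected date coincides with a ground-truth date.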