- Repository URL:
- crowd, event, clustering, detection, discovery, social media, twitter
thesis / dissertation description
In this thesis, we investigate new methods for crowd-oriented event detection in social media, specifically Twitter. We describe, evaluate, and propose content-based methods for extracting features that characterize events occurring in social media streams. Content-based methods examine the appearance of event-describing keywords (topical words) in a stream of tweets. Using these aggregated features, tweets are clustered with a parallelized version of canopy and k-means clustering to find groups of “similar” tweets that represent events. Events are tracked through time by evaluating the similarity of events in consecutive time periods. The effectiveness of the feature extraction stage is measured by the relevance of tweets to one another in event summaries. Our experiments aim to find the optimal parameters for feature extraction and event clustering.
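As a rough illustration of the pipeline described above, the following minimal, single-process sketch extracts keyword-count features from tweets, uses a simplified canopy pass to choose initial centers, and then refines them with k-means. It is not the thesis implementation: the function names, the `vocabulary` of topical words, the cosine distance, the threshold value, and the sample tweets are all assumptions made for the example, and the parallelization mentioned in the abstract is omitted.

```python
import math
import re
from collections import Counter


def extract_features(tweet, vocabulary):
    """Count occurrences of each topical word (hypothetical vocabulary) in a tweet."""
    counts = Counter(re.findall(r"\w+", tweet.lower()))
    return [counts[word] for word in vocabulary]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; all-zero vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def canopy_centers(points, tight_threshold):
    """Simplified canopy pass: pick a point as a center, then drop every
    remaining point within the tight threshold of it. Only the tight
    threshold is used here because the canopies serve solely as k-means seeds."""
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        remaining = [p for p in remaining
                     if cosine_distance(center, p) > tight_threshold]
    return centers


def kmeans(points, initial_centers, iterations=10):
    """Plain k-means loop (assignment step, then mean update) using cosine distance."""
    centers = [list(c) for c in initial_centers]
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest center.
        assignment = [
            min(range(len(centers)), key=lambda i: cosine_distance(p, centers[i]))
            for p in points
        ]
        # Move each center to the mean of its members; empty clusters stay put.
        for i in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Illustrative data only: the vocabulary and tweets are made up for this sketch.
vocabulary = ["earthquake", "tokyo", "concert", "stadium"]
tweets = [
    "Earthquake felt in Tokyo just now",
    "Huge earthquake shaking Tokyo",
    "Concert at the stadium tonight",
    "Stadium concert was amazing",
]

vectors = [extract_features(t, vocabulary) for t in tweets]
seeds = canopy_centers(vectors, tight_threshold=0.5)
labels = kmeans(vectors, seeds)
print(labels)
```

In this toy run, the two earthquake tweets and the two concert tweets end up in separate clusters, each standing in for one candidate event. Seeding k-means from canopies means the number of clusters does not have to be fixed in advance; it follows from the chosen threshold, which is one of the parameters an experiment would need to tune.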