- Business, Management and Accounting; Computer Science; Decision Sciences
- Most Recent Tweet View All Tweets
One of the most relevant and widely studied structural properties of networks is their community structure. Detecting communities is of great importance in social networks where systems are often represented as graphs. With the advent of web-based social networks like Twitter, Facebook and LinkedIn. community detection became even more difficult due to the massive network size, which can reach up to hundreds of millions of vertices and edges. This large graph structured data cannot be processed without using distributed algorithms due to memory constraints of one machine and also the need to achieve high performance. In this paper, we present a novel hybrid (shared + distributed memory) parallel algorithm to efficiently detect high quality communities in massive social networks. For our simulations, we use synthetic graphs ranging from 100K to 16M vertices to show the scalability and quality performance of our algorithm. We also use two massive real world networks: (a) section of Twitter-2010 network having ≈41M vertices and ≈1.4B edges (b) UK-2007 (.uk web domain) having ≈105M vertices and ≈3.3B edges. Simulation results on MPI setup with 8 compute nodes having 16 cores each show that, upto ≈6X speedup is achieved for synthetic graphs in detecting communities without compromising the quality of the results.