Citation
Tan, Wai Hong (2019) Predicting the popularity of tweets using the theory of point processes. Doctoral thesis, University of New South Wales.
Abstract
This thesis focuses on the problem of predicting tweet popularity, that is, the number of retweets stemming from an original tweet. We propose several prediction methodologies using the theory of point processes, where the prediction of the future popularity of a tweet is based on observing the retweet time sequence up to a certain censoring time, and the prediction performance is evaluated on a large Twitter data set. We first propose a marked point process model, termed the Marked Self-Exciting Process with Time-Dependent Excitation Function, or MaSEPTiDE for short. The intensity process of the model is interpretable as a cluster Poisson process, which implies that the model can be simulated using a cascading algorithm similar to that used for the efficient simulation of Hawkes processes, and that the prediction can be done properly by exploiting the probabilistic properties of the model. The MaSEPTiDE approach gives highly accurate tweet popularity predictions compared to state-of-the-art approaches, especially at shorter censoring times. We further propose an inhomogeneous Poisson process model and an estimation method that utilizes both internal knowledge (the times of retweets observed up to the censoring time) and external knowledge (the complete retweet sequences in the training data set). The two are combined using a novel empirical Bayes (EB) type approach, where the prior distribution for the model parameter is constructed from the external knowledge, and the likelihood is calculated from the internal knowledge. The mode of the posterior distribution is used as the estimator of the finite-dimensional parameter, and suitable functionals of the predictive distribution for the number of retweets implied by the estimated model are used to predict the tweet popularity. The model, termed the EB Poisson model, is found to be both efficient and accurate, with the additional advantage of being able to predict without observing any retweets.
The proposed EB approach to inference is applicable to other point process models, such as the MaSEPTiDE model, to improve the prediction performance and computational efficiency. We demonstrate this by applying the EB approach to the MaSEPTiDE model and reporting further improvements in the prediction accuracy.
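The cascading simulation that the abstract mentions for Hawkes processes can be illustrated with a minimal Python sketch. It is not the MaSEPTiDE model and is not taken from the thesis: it assumes a plain unmarked Hawkes process with a constant baseline rate `mu` and an exponential excitation kernel, and the names `poisson_draw`, `simulate_hawkes_cascade`, `mu`, `alpha`, `beta` and `horizon` are all hypothetical. Immigrant events arrive as a homogeneous Poisson process, and every event then independently produces its own Poisson number of offspring, generation by generation.

```python
import random


def poisson_draw(rng, mean):
    """Draw a Poisson(mean) variate by counting unit-rate arrivals in [0, mean)."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_hawkes_cascade(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon] via the cluster representation.

    Assumed (illustrative) intensity:
        mu + sum over past events t_i of alpha * beta * exp(-beta * (t - t_i)),
    so each event has Poisson(alpha) offspring, displaced by Exp(beta) delays.
    """
    rng = random.Random(seed)
    # Generation 0: immigrants from a homogeneous Poisson process with rate mu.
    n_immigrants = poisson_draw(rng, mu * horizon)
    current = [rng.uniform(0.0, horizon) for _ in range(n_immigrants)]
    events = list(current)
    # Later generations: each event in the current generation spawns offspring.
    while current:
        children = []
        for parent in current:
            for _ in range(poisson_draw(rng, alpha)):
                child = parent + rng.expovariate(beta)
                # Offspring past the horizon are dropped; their descendants
                # would also fall past the horizon, so nothing is lost.
                if child <= horizon:
                    children.append(child)
        events.extend(children)
        current = children
    return sorted(events)


if __name__ == "__main__":
    times = simulate_hawkes_cascade(mu=1.0, alpha=0.5, beta=2.0, horizon=50.0, seed=1)
    print(len(times), "events simulated")
```

With `alpha` below 1, each immigrant starts a cluster of finite expected size 1 / (1 - alpha), so the expected number of events is roughly `mu * horizon / (1 - alpha)`, slightly less in practice because offspring beyond the horizon are discarded.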
Additional Metadata
Item Type: | UMK Etheses |
---|---|
Collection Type: | Thesis |
Date: | 2019 |
Subject Heading: | Social networks |
Subject Heading: | Tweet popularity |
Subject Heading: | Prediction methodologies |
Subject Heading: | Point processes theory |
Number of Pages: | 137 |
Call Number: | HM741 .T36 2019 tes |
Supervisor: | Dr. Feng Chen |
Programme: | Doctor of Philosophy |
Institution: | University of New South Wales |
Faculty/Centre/Office: | Faculty of Entrepreneurship and Business |
URI: | http://discol.umk.edu.my/id/eprint/10738 |