Digital Special Collection Portal

Predicting the popularity of tweets using the theory of point processes.


Citation

Tan, Wai Hong (2019) Predicting the popularity of tweets using the theory of point processes. Doctoral thesis, University of New South Wales.

Abstract

This thesis focuses on the problem of predicting tweet popularity, that is, the number of retweets stemming from an original tweet. We propose several prediction methodologies using the theory of point processes, where the future popularity of a tweet is predicted from the retweet time sequence observed up to a certain censoring time, and the prediction performance is evaluated on a large Twitter data set.

We first propose a marked point process model, termed the Marked Self-Exciting Process with Time-Dependent Excitation Function, or the MaSEPTiDE for short. The intensity process of the model is interpretable as a cluster Poisson process, which implies that the model can be simulated using a cascading algorithm similar to that used for the efficient simulation of Hawkes processes, and that prediction can be done properly by exploiting the probabilistic properties of the model. The MaSEPTiDE approach yields highly accurate tweet popularity predictions compared to state-of-the-art approaches, especially at shorter censoring times.
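The cascading (cluster-based) simulation idea mentioned above can be illustrated with a minimal sketch. It is not the MaSEPTiDE itself: it assumes a plain, unmarked self-exciting cascade with an exponential excitation kernel, started by a single original tweet at time 0. The function names and the parameters branching_ratio, decay_rate and horizon are hypothetical and chosen only for illustration.

```python
import random


def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count by counting unit-rate exponential arrivals in [0, lam]."""
    count = 0
    total = rng.expovariate(1.0)
    while total <= lam:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_cascade(branching_ratio, decay_rate, horizon, seed=None):
    """Simulate retweet times in [0, horizon] triggered by one original tweet at time 0.

    Assumption (not from the thesis): every event independently produces a
    Poisson(branching_ratio) number of direct offspring, each delayed by an
    Exponential(decay_rate) waiting time. A branching_ratio below 1 keeps
    the expected cascade size finite.
    """
    rng = random.Random(seed)
    retweet_times = []
    current_generation = [0.0]  # generation 0 is the original tweet
    while current_generation:
        next_generation = []
        for parent_time in current_generation:
            for _ in range(sample_poisson(rng, branching_ratio)):
                child_time = parent_time + rng.expovariate(decay_rate)
                # Offspring beyond the horizon are dropped; their descendants
                # would be later still, so nothing inside the window is lost.
                if child_time <= horizon:
                    next_generation.append(child_time)
        retweet_times.extend(next_generation)
        current_generation = next_generation
    return sorted(retweet_times)
```

Calling simulate_cascade(0.8, 1.0, 24.0, seed=1) returns a sorted list of simulated retweet times. Generating events generation by generation avoids evaluating the conditional intensity at every step, which is the reason cluster-based simulation is efficient for Hawkes-type processes.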

We further propose an inhomogeneous Poisson process model and an estimation method which utilizes both internal knowledge, drawn from the times of the retweets observed up to the censoring time, and external knowledge, drawn from the complete retweet sequences in the training data set. The two sources of knowledge are combined using a novel empirical Bayes type approach, where the prior distribution for the model parameter is constructed from the external knowledge, and the likelihood is calculated from the internal knowledge. The mode of the posterior distribution is used as the estimator of the finite-dimensional parameter, and suitable functionals of the predictive distribution for the number of retweets implied by the estimated model are used to predict the tweet popularity. The model, termed the EB Poisson model, is found to be both efficient and accurate, with the additional advantage of being able to predict without observing any retweets.
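The empirical Bayes idea described above can be illustrated with a deliberately simplified sketch. It is not the EB Poisson model of the thesis: it assumes an intensity of the form theta * decay_rate * exp(-decay_rate * t) with a known decay_rate, a conjugate Gamma prior on theta fitted by the method of moments to the final retweet counts of training cascades, and the posterior mode plugged in as the estimate. All function and parameter names are hypothetical.

```python
import math
from statistics import mean, variance


def fit_gamma_prior(training_totals):
    """External knowledge: fit a Gamma(shape, rate) prior to final retweet counts.

    Assumes each count is Poisson(theta) with theta ~ Gamma(shape, rate), so
    the marginal mean is shape / rate and the marginal variance is
    shape / rate + shape / rate**2 (method-of-moments matching).
    """
    m = mean(training_totals)
    v = variance(training_totals)
    if v <= m:
        raise ValueError("method of moments needs variance greater than mean")
    rate = m / (v - m)
    shape = m * rate
    return shape, rate


def predict_final_count(observed_times, censoring_time, shape, rate, decay_rate):
    """Internal knowledge: update the prior with retweets seen before censoring_time.

    With a known intensity shape, the likelihood in theta depends on the data
    only through the number of observed retweets, so the posterior is
    Gamma(shape + n, rate + exposure).
    """
    n = len(observed_times)
    # Fraction of the total expected retweets that falls before censoring_time.
    exposure = 1.0 - math.exp(-decay_rate * censoring_time)
    # Posterior mode of theta, truncated at zero when the posterior shape is below 1.
    theta_mode = max(shape + n - 1.0, 0.0) / (rate + exposure)
    # Observed retweets plus the expected number still to come.
    return n + theta_mode * (1.0 - exposure)
```

Because the prior carries information from the training cascades, predict_final_count([], 0.0, shape, rate, decay_rate) still returns a usable prediction before any retweet has been observed, mirroring the property noted above.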

The proposed EB approach to inference is applicable to other point process models, such as the MaSEPTiDE model, to improve the prediction performance and computational efficiency. We demonstrate this by applying the EB approach to the MaSEPTiDE model and reporting further improvements in the prediction accuracy.

Download File / URL

Tan Wai Hong.pdf - Submitted Version
Restricted to Registered users only

Additional Metadata

Item Type: UMK Etheses
Collection Type: Thesis
Date: 2019
Subject Heading: Social networks
Subject Heading: Tweet popularity
Subject Heading: Prediction methodologies
Subject Heading: Point processes theory
Number of Pages: 137
Call Number: HM741 .T36 2019 tes
Supervisor: Dr. Feng Chen
Programme: Doctor of Philosophy
Institution: University of New South Wales
Faculty/Centre/Office: Faculty of Entrepreneurship and Business
URI: http://discol.umk.edu.my/id/eprint/10738

The Office of Library and Knowledge Management, Universiti Malaysia Kelantan, 16300 Bachok, Kelantan.
Digital Special Collection (UMK Repository) supports OAI 2.0 with a base URL of http://discol.umk.edu.my/cgi/oai2