Hanjia Lyu

I will join the Computer Science Department at the University of Rochester (UR) as a Ph.D. student in fall 2021, where I will be advised by Prof. Jiebo Luo. Previously, I completed my master’s in Data Science at UR and my bachelor’s at Fudan University. My general research areas are data mining and computational social science. I am also interested in machine learning and health informatics.

Email: hlyu5 -at- ur.rochester.edu

Google Scholar

Photo credit: Maggie Zhou

What’s new



  • Yipeng Zhang, Hanjia Lyu*, Yubao Liu*, Xiyang Zhang, Yu Wang, and Jiebo Luo. “Monitoring Depression Trends on Twitter During the COVID-19 Pandemic: Observational Study.” JMIR Infodemiology (2021).


  • Siqing Cao, Hanjia Lyu, and Xian Xu. “InsurTech development: Evidence from Chinese media reports.” Technological Forecasting and Social Change 161 (2020): 120277.
  • Hanjia Lyu, Long Chen, Yu Wang, and Jiebo Luo. “Sense and sensibility: Characterizing social media users regarding the use of controversial terms for COVID-19.” IEEE Transactions on Big Data (2020).


(in reverse chronological order)


State-level Racially Motivated Hate Crimes Contrast Public Opinion on the #StopAsianHate and #StopAAPIHate Movement

Hanjia Lyu, Yangxin Fan, Ziyu Xiong, Mayya Komisarchik, Jiebo Luo

arXiv, 2021

We conduct a social media study of public opinion on the #StopAsianHate and #StopAAPIHate movement, based on 46,058 Twitter users across 30 states in the United States and covering March 18 to April 11, 2021.

Monitoring Depression Trends on Twitter During the COVID-19 Pandemic: Observational Study

Yipeng Zhang, Hanjia Lyu*, Yubao Liu*, Xiyang Zhang, Yu Wang, Jiebo Luo

JMIR Infodemiology, 2021

We create a fusion classifier that combines deep learning model scores with psychological text features and users’ demographic information, and we investigate how these features relate to depression signals in the context of COVID-19.

From Static to Dynamic Prediction: Wildfire Risk Assessment Based on Multiple Environmental Factors

Tanqiu Jiang, Sidhant K. Bendre, Hanjia Lyu, Jiebo Luo

arXiv, 2021

We propose static and dynamic prediction models to analyze and assess the areas with high wildfire risks in California by utilizing a multitude of environmental data including population density, Normalized Difference Vegetation Index (NDVI), Palmer Drought Severity Index (PDSI), tree mortality area, tree mortality number, and altitude.

Understanding Patterns of Users Who Repost Censored Posts on Weibo

Yichi Qian, Feng Yuan, Hanjia Lyu, Jiebo Luo

arXiv, 2021

We focus on understanding the patterns of users whose reposted content is later censored on Weibo, the Chinese social media counterpart of Twitter.

From Gen Z, Millennials, to Babyboomers: Portraits of Working from Home during the COVID-19 Pandemic

Ziyu Xiong, Pin Li, Hanjia Lyu, Jiebo Luo

arXiv, 2021

We conduct a large-scale social media study using Twitter data to portray the different groups who hold positive or negative opinions about working from home (WFH).

Characterizing Discourse about COVID-19 Vaccines: A Reddit Version of the Pandemic Story

Wei Wu, Hanjia Lyu, Jiebo Luo

arXiv, 2021

This study aims to offer a clear understanding of the underlying concerns of different population groups when they talk about COVID-19 vaccines, particularly those active on Reddit.


Social Media Study of Public Opinions on Potential COVID-19 Vaccines: Informing Dissent, Disparities, and Dissemination

Hanjia Lyu, Junda Wang, Wei Wu, Viet Duong, Xiyang Zhang, Timothy D. Dye, Jiebo Luo

arXiv, 2020

We adopt a human-guided machine learning framework (using more than 40,000 rigorously selected tweets from more than 20,000 distinct Twitter users) to capture public opinions on the potential vaccines for SARS-CoV-2, classifying them into three groups: pro-vaccine, vaccine-hesitant, and anti-vaccine.

InsurTech development: Evidence from Chinese media reports

Siqing Cao, Hanjia Lyu, Xian Xu

Technological Forecasting and Social Change, 2020

This paper uses text mining in Python to analyze the word frequency and term frequency-inverse document frequency (TF-IDF) of 25,662 InsurTech-related news reports published in China from 2015 to 2019.

Understanding the Hoarding Behaviors during the COVID-19 Pandemic using Large Scale Social Media Data

Xupin Zhang, Hanjia Lyu, Jiebo Luo

arXiv, 2020

To investigate hoarding behaviors in response to the pandemic, we propose a novel computational framework using large-scale social media data.

How Political is the Spread of COVID-19 in the United States? An Analysis using Transportation and Weather Data

Karan Vombatkere, Hanjia Lyu, Jiebo Luo

arXiv, 2020

We investigate the difference in the spread of COVID-19 between the states won by Donald Trump (Red) and the states won by Hillary Clinton (Blue) in the 2016 presidential election, by mining transportation patterns of US residents from March 2020 to July 2020.

Sense and Sensibility: Characterizing Social Media Users Regarding the Use of Controversial Terms for COVID-19

Hanjia Lyu, Long Chen, Yu Wang, Jiebo Luo

IEEE Transactions on Big Data, 2020

We characterize the Twitter users who use controversial terms and those who use non-controversial terms for COVID-19. We find significant differences between these two groups of Twitter users across their demographics, user-level features like the number of followers, political following status, as well as geo-locations.

The Influence of COVID-19 on Well-Being

Xiyang Zhang, Yu Wang, Hanjia Lyu, Yipeng Zhang, Yubao Liu, Jiebo Luo

PsyArXiv, 2020

We found that pandemic severity influenced working adults’ negative affect rather than positive affect. However, the relationship between pandemic severity and the negative affect was moderated by personality (i.e., openness and conscientiousness) and family connectedness.

In the Eyes of the Beholder: Analyzing Social Media Use of Neutral and Controversial Terms for COVID-19

Long Chen, Hanjia Lyu, Tongyu Yang, Yu Wang, Jiebo Luo

arXiv, 2020

To model the substantive differences between tweets with controversial terms and those with non-controversial terms for COVID-19, we apply topic modeling and LIWC-based sentiment analysis.

What Contributes to a Crowdfunding Campaign’s Success? Evidence and Analyses from GoFundMe Data

Xupin Zhang, Hanjia Lyu, Jiebo Luo

arXiv, 2020

We focus on the performance of crowdfunding campaigns on GoFundMe across a wide variety of funding categories. We analyze the attributes available at the launch of a campaign and identify which attributes are important for each campaign category.

Reviewing and Service

  • Journal Reviewer: Maternal and Child Health Journal, Telematics and Informatics


  • TA CS240/440 – Fall 2020, Data Mining