Published in The Journal of Quantitative Description, 2022

This paper introduces and presents a first analysis of a uniquely curated dataset of misinformation, disinformation, and rumors spreading on Twitter about the 2020 U.S. Election. Previous research on misinformation — an umbrella term for false and misleading content — has largely focused either on broad categories, using a finite set of keywords to cover a complex topic, or on a few, focused case studies, with increased precision but limited scope. Our approach, by comparison, leverages real-time reports collected from September through November 2020 to develop a comprehensive dataset of tweets connected to 456 distinct misinformation stories from the 2020 US Presidential election (our ElectionMisinfo2020 dataset), 307 of which sowed doubt in the legitimacy of the election. By relying on real-time incidents and streaming data, we generate a curated dataset that not only provides more granularity than a large collection based on a finite number of search terms, but also an improved opportunity for generalization compared to a small set of case studies. Though the emphasis is on misleading content, not all of the tweets linked to a misinformation story are false: some are questions, opinions, corrections, or factual content that nonetheless contributes to misperceptions. Along with a detailed description of the data, this paper provides an analysis of a critical subset of election-delegitimizing misinformation in terms of size, content, temporal diffusion, and partisanship. We label key ideological clusters of accounts within interaction networks, describe common misinformation narratives, and identify those accounts which repeatedly spread misinformation. We document the asymmetry of misinformation spread: accounts associated with support for President Biden shared stories in ElectionMisinfo2020 far less than accounts supporting his opponent. 
That asymmetry remained among the accounts who were repeatedly influential in the spread of misleading content that sowed doubt in the election: all but two of the top 100 ‘repeat spreader’ accounts were supporters of then-President Trump. These findings support the implementation and enforcement of ‘strike rules’ on social media platforms, directly addressing the outsized role of repeat spreaders.

Kennedy, Ian, Morgan Wack, Andrew Beers, Joseph Schafer, Isabella Garcia-Camargo, Emma Spiro, and Kate Starbird. 2022. “Repeat Spreaders and Election Delegitimization Narratives: A Comprehensive Dataset of Misinformation Tweets from the 2020 U.S. Election.” The Journal of Quantitative Description.

Published in Social Forces, 2020

Racial discrimination has been a central driver of residential segregation for many decades, in the Seattle area as well as in the United States as a whole. In addition to redlining and restrictive housing covenants, housing advertisements included explicit racial language until 1968. Since then, housing patterns have remained racialized, despite overt forms of racial language and discrimination becoming less prevalent. In this paper, we use Structural Topic Models (STM) and qualitative analysis to investigate how contemporary rental listings from the Seattle-Tacoma Craigslist page differ in their description based on neighborhood racial composition. Results show that listings from White neighborhoods emphasize trust and connections to neighborhood history and culture, while listings from non-White neighborhoods offer more incentives and focus on transportation and development features, sundering these units from their surroundings. Without explicitly mentioning race, these listings display racialized neighborhood discourse that might impact neighborhood decision-making in ways that contribute to the perpetuation of housing segregation.

Kennedy, Ian, Chris Hess, Amandalynne Paullada, and Sarah Chasins. 2021. “Racialized Discourse in Seattle Rental Ad Texts.” Social Forces 99, no. 4: 1432-1456.