Tom (Hyeon Seok) Yu

Working Papers

Best-Arm Identification in Survey Experiments: Multi-Armed Bandit Algorithms
(Last Update: November 2023) Colab Notebook
Abstract

Experimenters often face the problem of finding an intervention that maximizes a desired outcome or yields the largest treatment effect relative to a baseline control group. Adaptive experimental designs, which dynamically update treatment assignment probabilities based on observed outcomes, have garnered attention as a promising means to (1) hasten the discovery of better treatments and (2) improve the precision with which treatment effects are estimated. In an application to a recent intervention tournament that collected and tested 25 interventions for strengthening democratic attitudes, I address whether adaptive algorithms, in particular variants of multi-armed bandits (“MAB”), can be leveraged to identify the most effective treatment efficiently, by proposing and testing a data-driven method of experimental design selection.
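To illustrate what "dynamically updating treatment assignment probabilities" can look like, here is a minimal sketch of Beta-Bernoulli Thompson sampling, one common multi-armed bandit variant. It is not necessarily one of the algorithms tested in the paper. The function name `thompson_sampling`, the binary outcome, and the simulated response rates are all assumptions made for illustration.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling (illustrative sketch only).

    true_rates: hypothetical success probability of each treatment arm.
    Returns (pulls per arm, posterior mean per arm, index of best arm).
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k

    for _ in range(n_rounds):
        # Draw one value from each arm's Beta posterior (Beta(1, 1) prior).
        draws = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(k)
        ]
        # Assign the next respondent to the arm with the highest draw.
        arm = max(range(k), key=lambda a: draws[a])
        # Simulate a binary outcome for that respondent.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    pulls = [successes[a] + failures[a] for a in range(k)]
    posterior_means = [
        (successes[a] + 1) / (pulls[a] + 2) for a in range(k)
    ]
    best = max(range(k), key=lambda a: posterior_means[a])
    return pulls, posterior_means, best


if __name__ == "__main__":
    # Three hypothetical arms; the third is the most effective.
    pulls, means, best = thompson_sampling([0.1, 0.3, 0.8], n_rounds=2000)
    print("pulls per arm:", pulls)
    print("posterior means:", [round(m, 3) for m in means])
    print("estimated best arm:", best)
```

Arms that look better receive more respondents over time, which is why adaptive designs can find the best arm faster. The same reallocation leaves weaker arms with fewer observations and therefore less precise estimates.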
Information and Motivated Reasoning: A Model of Selective Exposure
(Last Update: March 2022) PDF
Abstract

Previous research has documented the prevalence of selective exposure, the tendency to prefer and consume information that reinforces preexisting beliefs. Modeling individuals as motivated reasoners who face a tradeoff between accuracy (“getting it right”) and directional (“reaching desired conclusions”) motives, this paper develops a game-theoretic model that makes sense of seemingly inconsistent empirical findings by formally identifying conditions under which individuals, as receivers, engage in selective exposure. First, when the quality of information is uniform across individuals, selective exposure remains pervasive even in situations where the accuracy motive is high. Second, introducing uncertainty to the sender’s directional motive increases the likelihood of information avoidance. Finally, the size of the gap in the perceived quality of information between the sender and the receiver, rather than the high credibility of the sender, largely determines the possibility of exposure. These results on exposure decisions yield direct implications for persuasion and polarization.
Predicting Roll Call Votes using Machine Learning Methods
with Floyd Jiuyun Zhang (Last Update: December 2021) PDF Colab Notebook
Abstract

We present an approach for predicting roll-call votes in the U.S. Congress, using word embeddings of bill text as well as bill and Congress member characteristics as inputs. Various prediction models are implemented, tested, and finally combined using ensemble stacking. Our methods yield higher accuracy than existing methods, especially for newly elected members of Congress.
Electoral Insecurity & Federal Spending Procurement: Causal Inference with Panel Matching and Synthetic Control Methods
(Last Update: December 2021) PDF R Codes
Abstract

How do legislators respond, if at all, to changes in their electoral prospects? Most existing studies adopt a difference-in-differences design that exploits redistricting as an exogenous shock to estimate the causal effect of electoral insecurity on legislators’ federal spending procurement for their districts. This project employs matching and synthetic control methods that produce more comparable counterfactuals to derive the causal estimate of interest. Nearly all matching and SC methods yield improved covariate balance. In addition, these methods return mostly null results while the conventional difference-in-differences method returns statistically significant results, which suggests the importance of ensuring comparability of treatment and control groups. Finally, a negative outcome analysis is conducted to compare the performance of different synthetic control methods.
Moral Values They Believe In: A Model of Electoral Competition
(Last Update: June 2019) PDF
Abstract

Understanding voter and candidate behavior in elections remains a fundamental question in political economy. This paper develops an electoral competition model with heterogeneity in individuals' party and moral identity. In addition to the formalization of moral values, notable features of the model include (a) the ex-ante correlation between moral and partisan identification and (b) the presence of cheap talkers. The analysis reveals that candidates who can lie have a significant advantage in elections, but the presence of other types of candidates and the voter's endogenous preference for honest candidates constrain the former's pandering behavior. More interestingly, extending the model with the two features produces a similar result through different mechanisms: morally aligned but extremely partisan candidates have a significant chance of winning.
Analyzing the U.S. Federal Legislation Texts: A Deep Learning Approach
with Hyoungju Seo (Last Update: May 2019) PDF
Abstract

This paper analyzes U.S. federal bill preamble texts from 1973 to 2018 using various embedding and supervised classification methods to gauge the degree of partisanship among bills. In addition to nine different baseline methods from the literature, we develop and implement a CNN-LSTM architecture with a character-based word embedding model. We find that word-based embedding methods outperform character-based ones and that a single-layer LSTM outperforms all other architectures tested. Comparing the prediction accuracy over time reveals a (small) positive correlation with individual legislators’ ideological data, suggesting a comparatively lower degree of partisan divide in bill preamble language. Finally, applying the trained LSTM model to a separate political ideology dataset shows a moderate degree of transferability.
