RCTs on Politics and Governance — Field at a Glance (2010–2025)

Grey = non-HIC countries with no EGAP-style studies found.

Inclusion criteria: (1) randomized controlled trials; (2) governance or political institutions as the primary intervention or outcome; (3) at least one EGAP member as PI, or published in AER, AJPS, APSR, BJPS, CPS, IO, JDE, JOP, QJE, WD, or WP.
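
As a minimal sketch, the three criteria above could be expressed as a simple filter. The `Study` record and its field names are hypothetical, introduced only for illustration. They are not OpenAlex fields, and this is not necessarily how the corpus was actually assembled.

```python
from dataclasses import dataclass
from typing import Optional

# Journal abbreviations exactly as listed in the inclusion criteria.
JOURNALS = {"AER", "AJPS", "APSR", "BJPS", "CPS", "IO", "JDE", "JOP", "QJE", "WD", "WP"}


@dataclass
class Study:
    # Hypothetical fields, for illustration only.
    is_rct: bool                   # criterion (1)
    governance_primary: bool       # criterion (2)
    has_egap_pi: bool              # criterion (3), first branch
    journal: Optional[str] = None  # criterion (3), second branch


def meets_inclusion_criteria(study: Study) -> bool:
    """Return True only if all three criteria hold."""
    # Criterion (3) is an either/or: an EGAP member as PI, or a listed journal.
    venue_ok = study.has_egap_pi or study.journal in JOURNALS
    return study.is_rct and study.governance_primary and venue_ok
```

Note that criterion (3) is a disjunction, so a study with no EGAP-affiliated PI still qualifies if it appeared in one of the listed journals, and vice versa.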

Topics coded from publicly available abstracts using an LLM classifier.

Data source: OpenAlex. I apologize if any studies were missed — please reach out if you know of one.

Dot size = number of studies.

Lessons from RCTs in Southeast Asia

Overview

The chart on the previous tab shows where EGAP-style governance research has concentrated by subregion and topic. Southeast Asia accounts for twenty-four studies, behind South Asia and Latin America and dwarfed by sub-Saharan Africa. But those twenty-four RCTs have produced some of the field's most consequential findings on:

  1. Authoritarian accountability
  2. Clientelism
  3. Frontline bureaucrats
  4. Community outreach in conflict zones

Below, I summarize the core lessons the SEA RCTs offer and ask how those lessons travel. My first instinct was to evaluate external validity using only the kind of cross-country covariates shown in Tab 1: GDP, state capacity, regime type. But as I went through these papers, it became clear that authors typically flag much narrower contextual features as the most important for external validity.

Each card below lists the common external validity considerations as stated by the authors themselves, including regime sub-type, party structure, and conflict architecture.

Card 1

Information and Participation in Autocracies

This cluster covers six field experiments on what we typically think of as democratic engagement tools, tested here in non-democratic regimes. The experiments examine the effects of transparency, citizen-preference information, democracy promotion, and firm participation in regulatory drafting.

The point

Standard accountability tools can shift behavior in authoritarian assemblies and increase compliance and civic engagement, but what actually drives these effects isn't the threat of losing an election. It's officials looking to get promoted, the cover of moving with the crowd, and the legitimacy that comes from being consulted by the state.

Scope conditions:
  • Regime Type -- authoritarian, not multiparty democracy
  • Regime Subtype -- single-party or electoral-authoritarian, not personalist dictatorship or military junta
  • Policy Issue -- low-stakes or uncertain to the party, not signature ideological topics

Studies by country: Vietnam 5, Cambodia 1

Card 2

Programmatic Politics in a Clientelist Democracy

This cluster covers five field experiments on the standard tools of programmatic political campaigns (voter information, candidate policy disclosure, and deliberative engagement) tested in a setting where vote-buying is the dominant electoral practice (the Philippines).

The point

The big takeaway is that voters in clientelist settings really do respond to policy information and programmatic platforms, but candidates rarely supply that information themselves. Vote-buying remains cheaper at the margin, and incumbents who observe an information-based intervention often counter it by intensifying vote-buying.

Scope conditions:
  • Clientelism -- widespread vote-buying and patronage, not programmatic competition
  • Party Structure -- weak national parties and family-based local machines, not strong-party machine clientelism (Argentina-style)

Studies by country: Philippines 5

Card 3

Selecting and Reforming Frontline Bureaucrats

The third cluster covers five field experiments across Indonesia, the Philippines, Vietnam, and Myanmar, a more varied set of sites than the previous two clusters. These studies put bureaucrats at the center of the design, testing selection rules, individual personnel traits, contact with citizens, and supply-side training programs. Among this group are also the only EGAP studies to treat bureaucrats' attitudes as the core outcome.

The point

A recurring pattern is that bureaucratic behavior responds to upstream structural levers like selection rules, decision authority, and training regimes more reliably than to citizen pressure or to attempts to change individuals one encounter at a time.

Scope conditions:
  • Bureaucratic Capacity -- moderate-to-high (Indonesia, Philippines, Vietnam), with Myanmar as the lower-capacity exception
  • Identity-Based Mechanisms -- not operative across the cluster, in contrast to much SSA bureaucracy work where ethnicity, religion, or race drive selection and service delivery
  • Bureaucrat Type -- varies: civil service aspirants (Indonesia), street-level police (Philippines), firms-and-inspectors (Vietnam), subnational officials (Myanmar)

Studies by country: Indonesia 1, Philippines 2, Vietnam 1, Myanmar 1

Card 4

Conflict Disrupts Even Well-Designed Programs

This card is more self-serving than the others. It leans heavily on my own work in the Philippines.

The point

A consistent lesson across these studies is that governance programs that work elsewhere tend to be disrupted or redirected, or to backfire outright, in the presence of conflict.

Studies by country: Philippines 5

Card 5

Other Governance Experiments in Southeast Asia

These six field experiments fall within the SEA governance corpus but address mechanisms or topics that don't sit cleanly inside the four clusters above.

Studies by country: Indonesia 4, Cambodia 1, Myanmar 1

What This Means for the Field

Overview

Looking across this corpus surfaced a few things that I think are worth flagging. The first two are critical: the challenges that the corpus, viewed this way, raises for the external validity of our claims. The last three are constructive: where this exercise points us, and what richer external-validity claims might look like in practice.

The Bad News

Critical (1 of 2)

Card 1 — Even at the Macro Level, the Regions Barely Overlap

If "top-level" indicators like GDP and regime type meaningfully shape external validity, then it is difficult to draw cross-regional lessons from existing studies. To make matters worse:

  • The studies systematically come from the extremes of each region's development × regime space.
  • The lack of overlap between comparable countries becomes starker when you also account for the topics of the studies themselves.

Critical (2 of 2)

Card 2 — Sub-National Blind Spots

The country-level covariates I've been pointing at in the plot (GDP, regime type, urbanization) are exactly that: country-level. The studies themselves are usually sited in much smaller sub-units (a province, a district cluster, a set of villages), and those sub-units are systematically different from the country averages we're using to compare cases across regions. Two sub-national blind spots stand out:

  • Urban areas — we systematically study rural sites, even in highly urbanized and rapidly urbanizing countries. Country-level GDP and capacity statistics draw heavily from the urban areas we don't study.
  • Conflict-affected zones — even in conflict-affected countries, most experimental work isn't sited in the conflict-affected regions themselves.

The Good News

Constructive (1 of 3)

Card 3 — Authors Are Doing More of This Than the Plots Suggest

Reading these papers carefully, the authors are usually pretty clear about where they think their findings travel. They most commonly name specific contextual features (e.g. regime sub-types, party structure, patronage architecture) rather than country-level GDP or democracy scores, and they identify specific countries or regions where the same conditions hold. If we take their identifications seriously, the cross-regional gap looks smaller than the macro plots in the critical cards make it look. It's not a silver bullet — the gaps in the corpus shape what counts as a "similar" country in the first place — but the comparisons authors are actually drawing are tighter and more credible than a reader scanning Tab 1 might assume.

Constructive (2 of 3)

Card 4 — RCTs as Tools for Revealing Theoretical Mechanisms

Looking at the gaps in the corpus, it's hard to see how RCTs alone (outside of Metaketa-style coordinated efforts) can validate findings against each other across contexts. We just can't expect that many RCTs, and to me that's probably a fine balance. The RCTs we have should be more clearly situated as tests of theoretically grounded mechanisms that inspire observational work, rather than as standalone policy evaluations, and I think we've already been moving in that direction.

Constructive (3 of 3)

Card 5 — RCTs as Sites for Inductive Theory Building

A lot of what we've actually learned about how politics works has come from the process of running these RCTs and working with our policy partners, not just from the published findings. Quantitative researchers used to feel pretty disconnected from this kind of texture, and the EGAP enterprise has been quietly reconnecting us to it. The generalizable policy impact of our work lives largely in the process of the work, not solely in the average treatment effects.