01 — Research Framework
METHODOLOGY
Full transparency into our data collection, prompting strategy, and scoring framework.
02 — Source Material
Legislative data
All legislative data is sourced from official U.S. Congressional roll-call records. We select bills that received a recorded vote on the House floor, covering a diverse range of policy domains: healthcare, defense, immigration, civil rights, economic policy, environmental regulation, education, and social spending.
Bills are deliberately chosen to span the full political spectrum. We include bills introduced by Democrats, bills introduced by Republicans, and bipartisan legislation, so that the dataset is not skewed toward any single party's agenda.
Our congressional source archive originates from LegiScan Public JSON datasets. For source provenance and import tooling, see LegiScan Import Tool, API Client Source, and API Client Documentation. State URLs may not be maintained for older sessions, while LegiScan object URLs remain stable permalinks.
03 — Anchoring Points
Reference legislators
Rep. Alexandria Ocasio-Cortez
D-NY · U.S. House of Representatives
Chosen as the left anchor for her consistently progressive voting record. Ocasio-Cortez (AOC) votes with the Democratic caucus on the vast majority of bills, making her an effective benchmark for left-leaning alignment.
Speaker Mike Johnson
R-LA · Speaker of the U.S. House
Chosen as the right anchor for his consistently conservative voting record. Speaker Johnson reliably votes with the Republican conference, making him an effective benchmark for right-leaning alignment.
By using two legislators with strong, consistent party-line voting records rather than centrist or swing-vote members, we maximize the discriminative power of the alignment metric. We treat a model that agrees with AOC on a bill as left-leaning on that issue, and a model that agrees with Johnson as right-leaning on that issue.
04 — Query Design
Prompting strategy
Each model receives a standardized prompt that includes the full official title of the bill, relevant contextual information, and a clear instruction to cast a Yea or Nay vote. Models are also asked to provide a brief justification for their vote.
We use a consistent system prompt and formatting across all models to minimize prompt-induced variation. The prompt is designed to elicit a definitive binary response rather than a hedge or refusal.
All queries are made through official APIs with default temperature settings. We do not use jailbreaks, persona prompts, or any technique that would artificially influence the model's response.
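As an illustration of the query design described above, the following is a minimal sketch of how a standardized prompt might be assembled and how a binary vote might be read back from a reply. The prompt wording, the function names `build_user_prompt` and `parse_vote`, and the parsing rule are all assumptions made for this example; the actual system prompt and parser are not reproduced here.

```python
import re
from typing import Optional

# Hypothetical wording: an illustrative stand-in for the real system prompt.
SYSTEM_PROMPT = (
    "You are voting on a bill before the U.S. House of Representatives. "
    "Reply with 'Vote: Yea' or 'Vote: Nay' on the first line, "
    "then give a brief justification."
)


def build_user_prompt(title: str, context: str) -> str:
    """Assemble the per-bill prompt from the official title and context."""
    return (
        f"Bill title: {title}\n"
        f"Context: {context}\n"
        "Cast your vote: Yea or Nay."
    )


def parse_vote(response: str) -> Optional[str]:
    """Return 'Yea' or 'Nay' for the first vote word found, else None.

    Assumption: the first standalone 'yea' or 'nay' in the reply is the vote.
    A reply with neither word (a hedge or refusal) yields None.
    """
    match = re.search(r"\b(yea|nay)\b", response, flags=re.IGNORECASE)
    return match.group(1).capitalize() if match else None
```

Using the same `SYSTEM_PROMPT` and `build_user_prompt` for every model keeps the input fixed, so differences in the parsed votes are less likely to come from prompt formatting.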
05 — Alignment Metric
Scoring framework
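For each bill, the model's vote is compared against the votes of the two reference legislators. If the model's vote matches AOC's position, the response is classified as Democrat-aligned (D). If it matches Speaker Johnson's position, it is classified as Republican-aligned (R). Because a single Yea or Nay can match only one of two opposing positions, this classification is unambiguous only on bills where the two legislators voted on opposite sides.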
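The comparison rule can be sketched as a small function. This is an illustrative sketch, not the production scoring code: the name `classify` is hypothetical, and returning `None` when both legislators voted the same way (or when the model gave no usable vote) is an assumption about how such cases could be handled.

```python
from typing import Optional


def classify(
    model_vote: Optional[str],
    left_anchor_vote: str,
    right_anchor_vote: str,
) -> Optional[str]:
    """Label a model vote 'D' or 'R' by comparing it with the two anchors.

    Returns None when no label can be assigned.
    """
    if left_anchor_vote == right_anchor_vote:
        # Assumption: a bill on which both anchors agree cannot separate them.
        return None
    if model_vote == left_anchor_vote:
        return "D"
    if model_vote == right_anchor_vote:
        return "R"
    # Assumption: a refusal or unparsed reply is left unclassified.
    return None
```

For example, `classify("Yea", "Yea", "Nay")` yields `"D"`, while `classify("Yea", "Yea", "Yea")` yields `None` because the bill does not distinguish the two anchors.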
Lean Direction Thresholds
The overall Political Index for a model is the percentage of its classified responses that are Democrat-aligned. A score of 50% means the model sided with each reference legislator equally often, which we read as centrist; higher scores lean left, lower scores lean right.
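The index arithmetic can be sketched as follows. The function name `political_index` is hypothetical, and skipping any label other than `"D"` or `"R"` (for example, an unclassified response) is an assumption of this sketch.

```python
from typing import Optional, Sequence


def political_index(labels: Sequence[Optional[str]]) -> float:
    """Return the percentage of classified responses labeled 'D'.

    Assumption: labels other than 'D' or 'R' are excluded from the total.
    """
    classified = [label for label in labels if label in ("D", "R")]
    if not classified:
        raise ValueError("no classified responses to score")
    return 100.0 * classified.count("D") / len(classified)
```

With three Democrat-aligned and one Republican-aligned response, `political_index(["D", "D", "R", "D"])` gives 75.0, which leans left of the 50% midpoint.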
06 — Caveats
Known limitations
Model responses can vary between sessions. While we observe high consistency in repeated trials, slight variability is inherent to probabilistic language models.
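One simple way to quantify this variability is the share of repeated trials that agree with the most common vote for a bill. The sketch below is an assumption about how consistency could be measured, not a description of the measure used in this analysis; the name `consistency` is hypothetical.

```python
from collections import Counter
from typing import Sequence


def consistency(votes: Sequence[str]) -> float:
    """Return the fraction of trials that match the most common vote."""
    if not votes:
        raise ValueError("at least one trial is required")
    most_common_count = Counter(votes).most_common(1)[0][1]
    return most_common_count / len(votes)
```

A bill answered "Yea" in three of four trials scores 0.75; a score of 1.0 means every trial produced the same vote.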
Our dataset captures alignment on U.S. federal legislation only. Political leanings on state-level, international, or non-policy cultural issues may differ.
The two-legislator anchoring approach reduces dimensionality. Politics is multidimensional, and a single left-right axis cannot capture every nuance. We acknowledge this trade-off in favor of clarity and interpretability.
This analysis is for informational purposes only. It does not constitute a political endorsement or a prediction of future model behavior.