DRAFT — This page is under review and has not been fact-checked for publication.
The algorithm never asked for her race. It didn't have to.
In the 1930s, the federal Home Owners' Loan Corporation drew maps of American cities and color-coded neighborhoods by financial risk. Grade D — red — went almost exclusively to neighborhoods with significant Black populations. The maps determined who could get a mortgage. They determined who built generational wealth and who didn't. They determined the tax base for schools. They determined grocery store proximity, tree canopy, and air quality. By the time ZIP codes were standardized in 1963, the demographic boundaries those maps created were already baked into the physical landscape of every major American city. When insurance companies began using ZIP codes to set premiums in the 1970s, they were not designing a system of discrimination. They were inheriting one. When machine learning models trained on that data were deployed in the 2000s and 2010s to determine insurance pricing, healthcare risk scores, credit eligibility, and housing ad delivery — they learned the maps. The algorithms did not ask for race. The ZIP code answered.
This is not a story about biased programmers or bad intentions. It is a story about what happens when systems trained on historical data inherit historical injustice — and then make decisions at a scale the original architects of that injustice never could have managed.
70%
The number behind this guide
higher auto insurance premiums in some predominantly Black zip codes.
ProPublica analyzed 100M+ policies across four states. After controlling for driving records and claim rates, the disparity remained, because the algorithm was reading the zip code, and the zip code was reading 1937.
Interactive Tool
The Proxy Reveal — What Your Zip Code Says About You
Enter any U.S. zip code to see its 1930s HOLC redlining grade, current racial composition, and the AI domains where that zip code is routinely used as an input — insurance pricing, healthcare risk scoring, ad delivery, credit eligibility, and hiring screening.
Open the Proxy Reveal →
What the tool shows
- Historical HOLC redlining grade (A–D) from the Mapping Inequality database
- Current racial composition from U.S. Census data
- Estimated insurance premium differential vs. comparable white-majority zip codes
- Which AI decision domains use zip code as an input
- The proxy chain: what zip code encodes beyond location
- Where to file a complaint if you believe you've been harmed
A map drawn in 1937 is still running your data.
The correlation between ZIP codes and race was not an accident of history. It was the goal of policy.
Beginning in 1935, the federal Home Owners' Loan Corporation (HOLC) dispatched surveyors to assess the mortgage risk of every neighborhood in 239 American cities. Their maps graded areas A through D: green (best), blue (still desirable), yellow (declining), and red (hazardous). Grade D — red — was applied to neighborhoods with significant Black populations, regardless of income, employment, or actual default rates. The maps determined federal mortgage guarantees, which determined who could build equity through homeownership, which determined generational wealth. Black families were systematically excluded.
The wealth gap this created
Black homeownership rates today remain roughly where they were in 1968, the year the Fair Housing Act passed. The Brookings Institution found that homes in majority-Black neighborhoods are undervalued by an average of $48,000 per home — a collective loss of $156 billion in home equity across the country. The HOLC maps are 90 years old. Their financial consequences are current.
239 American cities surveyed and mapped by the HOLC between 1935 and 1940. More than 150 of those maps survive and are publicly accessible through the Mapping Inequality project.
$156 billion: estimated collective loss in home equity from the systematic undervaluation of homes in majority-Black neighborhoods today, a direct descendant of HOLC-era disinvestment.
Mapping Inequality — The Digital Reckoning
University of Richmond · 2016–present
Historians at the University of Richmond digitized and georeferenced all surviving HOLC city maps and overlaid them on modern demographic and health data. The correlation between a neighborhood's 1937 HOLC grade and its 2020 racial composition is statistically significant in every city studied. The maps predict life expectancy, asthma rates, tree canopy cover, broadband access, and credit scores — all inputs to modern AI systems.
What zip codes encode today
When an AI model receives a zip code as input, it can infer all of the following — each of which correlates with race due to the history above:
- Median household income: Formerly redlined neighborhoods have persistently lower wealth
- Homeownership rate: FHA/VA exclusion → less intergenerational equity
- School quality ratings: Property-tax-funded schools reflect redlining boundaries
- Air quality & health outcomes: Industrial siting followed redlined boundaries
- Historical claim rates: Including claims filed because of discrimination in prior eras
- Healthcare access & utilization: Hospital proximity, insurance coverage rates
The algorithm set the price.
The neighborhood set the algorithm.
Insurance pricing algorithms use zip code as a primary input. Because zip codes are correlated with race — by design — the outputs are racialized even when race never appears in the model.
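A toy example makes the mechanism concrete. The Python sketch below is illustrative only: the ZIP labels, premiums, and records are made up, and the "model" is nothing more than an average historical premium per ZIP code, not any insurer's actual method. Race is kept in the data solely so the outputs can be audited afterward.

```python
from statistics import mean

# Hypothetical records: (zip_label, race, past_premium).
# The pricing model below never reads the race field.
records = [
    ("ZIP_A", "Black", 1300), ("ZIP_A", "Black", 1320),
    ("ZIP_A", "Black", 1280), ("ZIP_A", "white", 1300),
    ("ZIP_B", "white", 1000), ("ZIP_B", "white", 1020),
    ("ZIP_B", "white", 980), ("ZIP_B", "Black", 1000),
]

def fit_zip_model(rows):
    """'Train' a race-blind model: average historical premium per ZIP."""
    by_zip = {}
    for zip_label, _race, premium in rows:
        by_zip.setdefault(zip_label, []).append(premium)
    return {z: mean(p) for z, p in by_zip.items()}

def audit_by_race(rows, model):
    """Average quote per racial group when ZIP is the model's only input."""
    quotes = {}
    for zip_label, race, _premium in rows:
        quotes.setdefault(race, []).append(model[zip_label])
    return {race: mean(q) for race, q in quotes.items()}

model = fit_zip_model(records)
quotes_by_race = audit_by_race(records, model)
print(model)
print(quotes_by_race)
```

In this made-up data the model never sees race, yet the average quote for Black customers comes out higher, because most of them live in the ZIP code with the higher historical premium.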
Higher auto insurance premiums in predominantly Black zip codes compared to comparable white zip codes, after controlling for driving records, claim rates, and traffic. Documented across four states by ProPublica, 2017.
ProPublica, 'Minority Neighborhoods Pay Higher Car Insurance Premiums,' 2017
Higher homeowners insurance premiums in formerly redlined zip codes compared to white-majority communities, documented by the Consumer Federation of America across multiple metro areas.
of Native American homeowners are completely uninsured. 14% of Hispanic homeowners, 11% of Black homeowners — versus 6% of white homeowners. Insurance deserts follow redlined boundaries.
Consumer Federation of America
Average annual extra premium charged to homeowners with lower credit scores — themselves a racialized variable produced by the same cycle of exclusion and discrimination.
Consumer Federation of America
The feedback loop
Higher premiums in Black neighborhoods → lower insurance coverage rates → more uninsured motorists and homeowners → higher perceived risk → higher premiums. The algorithm trains on this data and confirms it. The loop is structural, not statistical noise.
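The loop can be sketched as a small simulation. Everything numeric here is an assumption chosen for illustration: the baseline premium, how strongly residents drop coverage as prices rise, and how strongly a hypothetical insurer reprices in response. The point is only the shape of the dynamic, not its real-world magnitude.

```python
def uninsured_share(premium, baseline_premium=1000.0):
    """Assumed response: each extra $100 of premium pushes one more
    percentage point of residents out of coverage (capped at 90%)."""
    extra = max(0.0, premium - baseline_premium)
    return min(0.9, 0.10 + 0.0001 * extra)

def reprice(premium, share, baseline_share=0.10):
    """Assumed insurer rule: raise the premium in proportion to the
    uninsured share above the baseline."""
    return premium * (1.0 + 0.5 * (share - baseline_share))

def simulate(start_premium, years):
    """Return the premium path over `years` rounds of the loop."""
    path = [start_premium]
    for _ in range(years):
        share = uninsured_share(path[-1])
        path.append(reprice(path[-1], share))
    return path

baseline_path = simulate(1000.0, 10)   # neighborhood starting at baseline
redlined_path = simulate(1300.0, 10)   # neighborhood starting with a markup

print([round(p) for p in baseline_path])
print([round(p) for p in redlined_path])
```

Under these assumptions the neighborhood that starts at the baseline stays there, while the one that starts with a markup drifts further away every year. A model trained on the later years would see the gap as confirmed by the data.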
ProPublica: Four States, One Pattern
2017 Investigation
ProPublica and Consumer Reports analyzed premiums from the four largest auto insurers in Illinois, Missouri, Texas, and California. In predominantly minority neighborhoods, premiums were 30% higher on average than in comparable white neighborhoods, after controlling for claim rates, traffic density, and driving record. In some zip codes in Illinois, the disparity exceeded 70%. No state insurance commissioner had audited for this pattern.
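One simple way to express "after controlling for claim rates" is to compare premiums relative to the losses insurers actually pay out in each area. The sketch below uses invented ZIP-level figures, picked so the gap lands at 30%, and a plain premium-to-loss ratio. It is a simplified stand-in for this kind of comparison, not ProPublica's published methodology.

```python
from statistics import mean

# Hypothetical ZIP-level averages (dollars per policy per year).
zip_stats = [
    {"zip": "ZIP_A", "majority_minority": True,  "premium": 1300.0, "loss": 500.0},
    {"zip": "ZIP_B", "majority_minority": True,  "premium": 1560.0, "loss": 600.0},
    {"zip": "ZIP_C", "majority_minority": False, "premium": 1000.0, "loss": 500.0},
    {"zip": "ZIP_D", "majority_minority": False, "premium": 1200.0, "loss": 600.0},
]

def premium_to_loss(row):
    """Premium charged per dollar of average insurer payout."""
    return row["premium"] / row["loss"]

def group_ratio(rows, majority_minority):
    """Mean premium-to-loss ratio across ZIPs in one group."""
    return mean(
        premium_to_loss(r) for r in rows
        if r["majority_minority"] == majority_minority
    )

minority_ratio = group_ratio(zip_stats, True)
white_ratio = group_ratio(zip_stats, False)

# Relative disparity at equal risk: 0.30 means 30% more premium
# per dollar of expected loss.
disparity = minority_ratio / white_ratio - 1.0
print(f"{disparity:.0%}")
```

Because each pair of ZIPs here has identical average losses, any difference in the ratio reflects pricing rather than risk.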
Bluelining: The Coverage Desert
Greenlining Institute · 2024
After wildfires drove insurance companies to stop writing policies in high-risk areas (predominantly white, wealthy California hillside communities), researchers found a parallel phenomenon in urban areas: insurers refusing to write policies in formerly redlined neighborhoods not because of fire risk but because of algorithmic risk scores that use zip code as a primary input. Greenlining Institute named this 'bluelining' — drawing a new line that, like the original red line, excludes Black and Brown communities from coverage.
The algorithm decides who sees the house.
Documented cases across housing ad delivery, mortgage lending, real estate valuation, and test prep pricing — all tracing back to the same mechanism.
United States v. Meta Platforms (2022)
U.S. Department of Justice · June 2022
The Justice Department filed and simultaneously settled a complaint against Meta for violating the Fair Housing Act. Meta's ad delivery algorithm — not just advertiser targeting choices — skewed housing ad delivery by race, national origin, sex, disability, and religion. ZIP codes were among the targeting variables that enabled this discrimination. The settlement was the first consent decree targeting a machine learning ad delivery algorithm under the Fair Housing Act.
Credit and mortgage lending
- Urban Institute finding: Black and Brown borrowers are more than twice as likely to be denied a mortgage as white applicants with equivalent credit profiles. Machine learning models that include zip code among their inputs inherit the redlining correlation.
- LLM-based mortgage AI: A 2023 study testing an LLM-based mortgage approval system found white applicants were 8.5% more likely to be approved with identical financial profiles. The model was trained on historical lending data from the period when explicit redlining was practiced.
- Credit score geography: 2004 research found high-Black ZIP code areas have significantly worse average credit scores than comparable non-Black ZIP code areas, even after controlling for actual financial behavior. Algorithms trained on credit scores as a target variable absorb this disparity.
Redfin — Fair Housing Lawsuit
National Fair Housing Alliance et al. · 2022
A coalition of fair housing organizations filed a federal lawsuit against Redfin alleging the company refused to offer real estate services in minority communities unless homes were listed above $400,000 — while offering the same services in majority-white DuPage County with a $275,000 minimum. The pricing threshold functioned as a proxy for zip code, which functioned as a proxy for race.
The Princeton Review SAT Pricing
ProPublica · 2015
ProPublica found that The Princeton Review charged different prices for the same online SAT prep courses based on the customer's zip code — ranging from $6,600 to $8,400. Analysis of the pricing boundary revealed that Asian Americans were nearly twice as likely to be charged the higher price. High-income white zip codes and low-income Asian zip codes (such as parts of Queens, NY) received the same premium pricing because the algorithm couldn't distinguish between them.
Home valuation disparities
Research using Zillow listing data found that in Rochester, NY, homes in neighborhoods that are 50% Black were listed at 65% less per square foot than comparable properties, after controlling for housing and neighborhood quality. The Brookings Institution documented that homes in Black neighborhoods are undervalued by an average of $48,000, a figure that compounds over time as refinancing, estate planning, and credit access all use home equity as an input.
The algorithm decided she was healthy enough.
AI-driven health risk scoring systems are one of the most consequential places where zip code discrimination operates — and one of the least visible.
Patients per year processed by the class of healthcare algorithm that Obermeyer's team found systematically undertreated Black patients. After correcting the proxy variable, racial bias dropped 84%.
The Obermeyer study is the clearest documented case of zip code-adjacent proxy discrimination in healthcare AI. The algorithm used healthcare costs as a proxy for healthcare needs. Black patients, who historically receive less care due to systemic barriers — higher insurance costs (driven partly by zip code), lower coverage rates, fewer nearby facilities — had lower costs not because they were healthier, but because they were receiving less. The algorithm read lower cost as lower need and scheduled fewer interventions. The zip code was not in the model. Its effects were.
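The effect of choosing cost rather than need as the prediction target can be shown with a few lines of Python. This is a minimal sketch under stated assumptions, not the study's data or model: both groups are given identical need, Black patients are assumed to receive 70% of the care they need (the `ACCESS` factor is invented), and a program with four slots enrolls whoever scores highest.

```python
# Hypothetical patients: the same distribution of true need in both groups.
NEEDS = [2, 4, 6, 8, 10]

# Assumed share of needed care each group actually receives.
ACCESS = {"white": 1.0, "Black": 0.7}

patients = [
    {"group": group, "need": need, "cost": need * ACCESS[group]}
    for group in ("white", "Black")
    for need in NEEDS
]

def enroll(rows, label, slots):
    """Enroll the `slots` patients with the highest value of `label`."""
    return sorted(rows, key=lambda p: p[label], reverse=True)[:slots]

def black_share(rows):
    """Fraction of the given patients who are Black."""
    return sum(p["group"] == "Black" for p in rows) / len(rows)

black_share_by_cost = black_share(enroll(patients, "cost", slots=4))
black_share_by_need = black_share(enroll(patients, "need", slots=4))

print(black_share_by_cost)
print(black_share_by_need)
```

Ranking by need fills half the slots with Black patients, matching their share of high-need cases. Ranking by cost fills only a quarter, even though nothing about the patients' health differs between the two runs. Only the label changed.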
The Social Determinants Irony
Healthcare AI increasingly incorporates "Social Determinants of Health" (SDOH) scoring — measuring food access, air quality, walkability, and neighborhood stress. These factors are legitimate health predictors. They are also products of redlining. When a model scores SDOH using zip code as a proxy, it is encoding the history of where grocery stores, parks, and hospitals were not built. The intent is equity-correcting. The mechanism can be discriminatory.
EHR Data Discontinuity
AHRQ Systematic Review · 2023
A systematic review published by the Agency for Healthcare Research and Quality found that machine learning models using electronic health record (EHR) data are subject to 'EHR data-discontinuity' — when a patient receives care at multiple systems or outside the primary system, that data may not appear in the training set. Black patients, who are more likely to receive fragmented care due to insurance gaps and geographic barriers (both connected to zip code discrimination), appear in EHR systems with artificially lower medical complexity. Models trained on this data predict them as lower-risk.
Where zip code enters healthcare AI
- Insurance eligibility algorithms: Zip code used to predict claim risk and set network coverage — feeding back into care access.
- Hospital readmission risk scores: ZIP-level poverty indices and social support indicators used to predict discharge planning needs.
- Maternal risk scoring: Historically redlined zip codes show measurably elevated preterm birth and maternal mortality risk in research; models that use zip code as input may over- or under-correct.
- COVID-19 resource allocation: Predictive models deployed during COVID-19 used zip code to identify high-risk communities. The data was accurate, but the interventions that followed were often insufficient.
Removing 'race' from the model does not remove racial outcomes.
The chain from redlining to zip code to algorithmic discrimination has no gap. Every link is documented.
Algorithmic redlining is documented across mortgage lending, housing ad delivery, insurance pricing, healthcare risk scoring, and credit underwriting. In each domain, the outcome is the same: algorithms trained on historical data reach racially discriminatory conclusions without ever receiving a racial classification as input. The proxy variable does the work. ZIP code is the most powerful and most common of these proxies because it encodes so much history.
The proxy chain
Legal and regulatory frameworks (as of 2026)
- Algorithmic Accountability Act of 2025 (H.R.5511): proposed federal bill that would require pre-deployment bias testing across protected characteristics. Explicitly names zip codes as a proxy variable that must be tested.
- NYC Local Law 144: annual independent bias audits of automated employment decision tools, with public disclosure and candidate notice requirements. Proxy variable testing included.
- Colorado AI Act (SB 24-205, originally effective February 1, 2026; a 2025 amendment delayed the effective date to June 30, 2026): 'reasonable care' requirement across high-risk AI domains. Consumers have the right to appeal algorithmic decisions and correct inaccurate data.
- California Insurance Code §11628.7: prohibits setting insurance premiums based solely on zip code without actuarial justification, a partial limit that stops short of full proxy-discrimination prohibition.
- EU AI Act: credit scoring and insurance risk assessment designated 'high risk'; documentation, accuracy standards, and human oversight required before deployment.
The audit distortion problem
Recent research (arxiv.org/pdf/2603.17106) found that using zip code as a proxy for race in fairness audits systematically understates true racial disparities — because the misclassification errors in the proxy are not random. They vary with zip-code racial composition in ways that shrink estimated disparities. Auditors who use zip code to approximate race are using the same mechanism that causes the discrimination to evaluate the discrimination. The result is an undercount of the problem.
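A worked example shows one way a ZIP-based proxy can shrink a measured disparity. This is a deliberately simplified illustration, not the cited paper's method: it assumes two hypothetical ZIP codes with made-up populations, and it assumes the outcome being audited differs by individual race within each ZIP. The auditor does not observe race and instead labels every resident with the ZIP's majority group.

```python
# Hypothetical population counts per ZIP code.
zips = {
    "ZIP_A": {"Black": 80, "white": 20},
    "ZIP_B": {"Black": 20, "white": 80},
}

# Hypothetical average outcome (for example, an annual charge) by
# each individual's actual race.
outcome = {"Black": 130.0, "white": 100.0}

def zip_mean(counts):
    """Average outcome across everyone living in one ZIP."""
    total = sum(counts.values())
    return sum(outcome[race] * n for race, n in counts.items()) / total

def weighted_mean(pairs):
    """Population-weighted mean of (mean, population) pairs."""
    return sum(m * n for m, n in pairs) / sum(n for _, n in pairs)

def proxy_gap():
    """Label each resident with the ZIP's majority race, then compare."""
    labelled = {"Black": [], "white": []}
    for counts in zips.values():
        majority = max(counts, key=counts.get)
        labelled[majority].append((zip_mean(counts), sum(counts.values())))
    return weighted_mean(labelled["Black"]) - weighted_mean(labelled["white"])

true_gap = outcome["Black"] - outcome["white"]
estimated_gap = proxy_gap()
print(true_gap, estimated_gap)
```

Under these assumptions the true gap is 30 but the proxy audit reports 18, because each proxy group is diluted by the residents it mislabels. The more mixed a ZIP code is, the more the estimate shrinks.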
You cannot change your zip code's history. You can act on it.
Learn: what's already in your data (1 hour).
- Use the Proxy Reveal tool at cp-ai.org/education/zip-codes-and-race/proxy — enter your zip code and see its HOLC redlining grade, racial composition, and the domains where that zip code is used as an AI input.
- Look up your neighborhood's original HOLC grade at the Mapping Inequality project: dsl.richmond.edu/panorama/redlining — the maps are still accurate and the grades are still predictive.
- Request an insurance rate comparison: ask your insurer what factors drive your premium. In most states, insurers are required to disclose the factors used in rate-setting.
Act: use the complaint systems (1 day).
- If you believe your insurance is priced discriminatorily by zip code: file with your state insurance commissioner (find yours at naic.org) and the CFPB at consumerfinance.gov/complaint.
- For housing ad discrimination (being shown different homes or ads based on where you live): HUD complaint at hud.gov/program_offices/fair_housing_equal_opp/online-complaint.
- For credit denial where zip code may have been a factor: you are entitled to an adverse action notice explaining the decision. CFPB complaint: consumerfinance.gov/complaint.
Engage: bring it to your state capital (1 month).
- Ask your state insurance commissioner to conduct a zip code premium disparity analysis — several states (California, New York, New Jersey) have already moved to restrict zip code-based insurance pricing.
- Support state and federal algorithmic accountability legislation. The Algorithmic Accountability Act of 2025 (H.R.5511) explicitly names proxy variables including zip codes.
- Contact your city or county government: ask whether any AI systems used in public services (benefits eligibility, health risk scoring, housing assistance) use zip code as an input, and whether disparate impact audits have been conducted.
Organize: connect with the movement.
- National Fair Housing Alliance (nationalfairhousing.org) leads algorithmic accountability work in housing and insurance — they conduct audit testing and litigation.
- Greenlining Institute (greenlining.org) researches and litigates insurance 'bluelining' — the modern version of zip code discrimination in coverage access.
- NAACP Legal Defense Fund (naacpldf.org) runs civil rights litigation in AI-adjacent domains. The Lawyers' Committee for Civil Rights Under Law is advancing an AI Civil Rights Act.
- Document and share: if you receive significantly different insurance quotes, credit terms, or housing ad delivery than comparable neighbors, submit the documentation to the National Fair Housing Alliance for pattern evidence.
For Educators
Teaching zip code discrimination and the history of redlining
The Proxy Reveal tool makes this topic visceral for students — their own zip codes reveal their neighborhood's history. The educator guide covers how to handle that discovery sensitively, how to connect HOLC maps to local history in any city, and a classroom exercise mapping current insurance premiums, life expectancy, and school ratings across zip codes in your metro area.
Go to the Educator Guide →
Related guides on this site
Where to learn more.
How we sourced this page
Every statistic on this page comes from: ProPublica's 2017 insurance pricing investigation, the Mapping Inequality project (University of Richmond), Obermeyer et al. (Science, 2019), the DOJ/Meta consent decree (2022), the Consumer Federation of America's insurance research, Brookings Institution / Freddie Mac home valuation data, AHRQ's systematic review of healthcare AI disparities (2023), the Greenlining Institute's bluelining research, and peer-reviewed literature on proxy variable fairness auditing (arxiv.org/pdf/2603.17106).
Want CPAI education resources for your community?
We partner with districts, libraries, and nonprofits to bring this research into classrooms and community spaces in Durham and beyond.