AmericaByNumbers.com

Our Methodology

Transparency is central to our mission. This page explains exactly how we collect, process, and present salary data on AmericaByNumbers.com.

In Brief

We take raw data from official U.S. government sources — the Bureau of Labor Statistics (BLS) for salary data and the U.S. Census Bureau for city demographics — process it into clear, comparable profiles, and present it in user-friendly formats. We do not conduct our own surveys or modify the underlying data.

Data Collection Process

Our data pipeline follows a structured, repeatable process:

1 Source Data Download

We download the OEWS flat data file (oe.data.0.Current) directly from the BLS public data repository. This file contains the complete set of current occupational employment and wage estimates published by the BLS.
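A minimal sketch of this download step, using only the Python standard library. The base URL is assumed to be the public BLS time-series directory for OEWS, and the function names and the `contact` parameter are hypothetical; the identifying User-Agent header is included on the assumption that the server may reject anonymous automated requests.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumption: public BLS time-series directory for the OEWS survey.
BLS_OE_BASE = "https://download.bls.gov/pub/time.series/oe/"


def oews_file_url(filename: str) -> str:
    """Build the URL of one file in the OEWS time-series directory."""
    return BLS_OE_BASE + filename


def download_oews_file(filename: str, dest_dir: Path, contact: str) -> Path:
    """Download one OEWS file into dest_dir and return the local path.

    `contact` (hypothetical parameter) is an email address placed in the
    User-Agent header so the request identifies its sender.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / filename
    request = urllib.request.Request(
        oews_file_url(filename),
        headers={"User-Agent": f"oews-downloader ({contact})"},
    )
    # Stream the response to disk instead of holding the whole file in memory.
    with urllib.request.urlopen(request, timeout=60) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Calling `download_oews_file("oe.data.0.Current", Path("raw"), "you@example.com")` would save the flat file under `raw/`; the same helper could fetch `oe.occupation`.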

2 Occupation Discovery

We parse the BLS occupation reference file (oe.occupation) to identify all detailed occupation codes (SOC codes) with available data. As of 2024, this yields 831 distinct occupations with state-level wage data.
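The occupation discovery step might look roughly like the sketch below. It assumes the reference file is tab-delimited with a header row containing `occupation_code` and `occupation_name` columns, and it uses a simple heuristic (a six-digit code that does not end in 0) to separate detailed occupations from summary groups. That heuristic is an assumption for illustration, not necessarily the rule used to arrive at the 831 occupations.

```python
import csv
import io


def is_detailed(code: str) -> bool:
    """Heuristic (assumption): detailed SOC codes are six digits, not ending in 0."""
    return len(code) == 6 and code.isdigit() and not code.endswith("0")


def parse_occupations(text: str) -> dict[str, str]:
    """Map detailed occupation codes to names from the reference file text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    occupations = {}
    for row in reader:
        # BLS flat files often pad fields with spaces, so strip keys and values.
        clean = {k.strip(): (v or "").strip() for k, v in row.items() if k}
        code = clean.get("occupation_code", "")
        if is_detailed(code):
            occupations[code] = clean.get("occupation_name", "")
    return occupations
```

Summary rows such as the all-occupations total and major groups are dropped by the heuristic, leaving only the codes that can have their own page.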

3 Data Extraction

For each occupation and each of the 50 U.S. states, we extract seven key data points from the BLS flat file:

Metric | BLS Code | Description
Employment | 01 | Total number of workers in this occupation in the state
Mean Annual Wage | 04 | Average (mean) annual salary across all workers
10th Percentile | 11 | Salary below which 10% of workers fall (entry-level benchmark)
25th Percentile | 12 | Salary below which 25% of workers fall
Median (50th) | 13 | Middle salary: half earn more, half earn less
75th Percentile | 14 | Salary below which 75% of workers fall (the top 25% earn more)
90th Percentile | 15 | Salary below which 90% of workers fall (experienced professionals)
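The extraction can be sketched as a filter over the flat file's rows. The sketch assumes the file is tab-delimited with `series_id`, `year`, `period`, `value` columns, and that an OEWS series ID is 25 characters laid out as: survey prefix (2), seasonal code (1), area type (1), area code (7), industry code (6), occupation code (6), datatype code (2). It further assumes state-level series use area type `S`, an area code beginning with the two-digit state FIPS, and industry `000000` for the cross-industry total. All names below are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple, Optional

# The seven datatype codes listed in the table above.
METRICS = {
    "01": "employment",
    "04": "mean_annual_wage",
    "11": "pct10",
    "12": "pct25",
    "13": "median",
    "14": "pct75",
    "15": "pct90",
}


class Observation(NamedTuple):
    state_fips: str
    soc_code: str
    metric: str
    year: int
    value: Optional[float]  # None when the value is suppressed or non-numeric


def parse_series_id(series_id: str) -> Optional[dict]:
    """Split a 25-character OEWS series ID into its fields (assumed layout)."""
    if len(series_id) != 25 or not series_id.startswith("OE"):
        return None
    return {
        "areatype": series_id[3],
        "area": series_id[4:11],
        "industry": series_id[11:17],
        "occupation": series_id[17:23],
        "datatype": series_id[23:25],
    }


def iter_state_observations(lines: Iterable[str]) -> Iterator[Observation]:
    """Yield the seven tracked metrics for state-level, cross-industry series."""
    for line in lines:
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 4:
            continue
        parts = parse_series_id(fields[0])
        if parts is None:  # header row or malformed ID
            continue
        if parts["areatype"] != "S" or parts["industry"] != "000000":
            continue
        metric = METRICS.get(parts["datatype"])
        if metric is None:  # e.g. hourly wages, which are not tracked here
            continue
        try:
            value = float(fields[3])
        except ValueError:
            value = None
        yield Observation(parts["area"][:2], parts["occupation"], metric, int(fields[1]), value)
```

Keeping non-numeric values as `None` rather than guessing a number is consistent with the policy of never estimating missing figures.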

4 Database Storage

Extracted data is stored in a structured SQLite database with indexed lookups by occupation code (SOC) and state (FIPS). This ensures fast, accurate retrieval during page generation.
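As an illustration of indexed storage, here is one possible SQLite layout built with Python's `sqlite3` module. The table name, column names, and index are assumptions made for this sketch; the actual schema is not described on this page.

```python
import sqlite3

# Hypothetical schema: one row per occupation, state, metric, and year.
SCHEMA = """
CREATE TABLE IF NOT EXISTS wages (
    soc_code   TEXT NOT NULL,
    state_fips TEXT NOT NULL,
    metric     TEXT NOT NULL,
    year       INTEGER NOT NULL,
    value      REAL,
    PRIMARY KEY (soc_code, state_fips, metric, year)
);
-- Secondary index for lookups that start from the state.
CREATE INDEX IF NOT EXISTS idx_wages_state ON wages (state_fips, soc_code);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store(conn: sqlite3.Connection, rows) -> None:
    """Insert (soc_code, state_fips, metric, year, value) tuples in one transaction."""
    with conn:
        conn.executemany("INSERT OR REPLACE INTO wages VALUES (?, ?, ?, ?, ?)", rows)


def lookup(conn: sqlite3.Connection, soc_code: str, state_fips: str, year: int) -> dict:
    """Return {metric: value} for one occupation in one state and year."""
    cur = conn.execute(
        "SELECT metric, value FROM wages WHERE soc_code = ? AND state_fips = ? AND year = ?",
        (soc_code, state_fips, year),
    )
    return dict(cur.fetchall())
```

The composite primary key serves lookups by occupation and state, while the extra index covers queries that start from a state. An empty result from `lookup` signals that no page should be generated for that combination.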

5 Page Generation

Pages are generated programmatically using Python and Jinja2 templates. Each page pulls its data directly from the database. Before publishing, an automated link audit verifies that every internal link points to a valid page.

6 Quality Verification

After every build, we run a comprehensive link audit that checks all internal links across all 34,900+ pages. Any broken link causes the build to flag an error. We also spot-check random pages against the original BLS data to verify accuracy.
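A link audit of this kind can be sketched with the standard library alone. The version below assumes the build output is a directory of static `.html` files, treats any link with a scheme or host as external and skips it, and resolves directory links to an `index.html` inside them. The function names are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def resolve_internal(href: str, page: Path, site_root: Path) -> Optional[Path]:
    """Return the file an internal link points to, or None for links to skip."""
    parsed = urlparse(href)
    # Skip external links, mailto: links, and same-page anchors like "#top".
    if parsed.scheme or parsed.netloc or not parsed.path:
        return None
    if parsed.path.startswith("/"):
        target = site_root / parsed.path.lstrip("/")
    else:
        target = page.parent / parsed.path
    return target / "index.html" if target.is_dir() else target


def audit_links(site_root: Path) -> list:
    """Return (page, href) pairs for every internal link with no target file."""
    broken = []
    for page in sorted(site_root.rglob("*.html")):
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="utf-8"))
        for href in collector.hrefs:
            target = resolve_internal(href, page, site_root)
            if target is not None and not target.is_file():
                broken.append((page.relative_to(site_root).as_posix(), href))
    return broken
```

A build script could call `audit_links(Path("site"))` and exit with a non-zero status whenever the returned list is not empty.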

What We Calculate

We perform a small number of straightforward calculations to make the data more useful:

Important: We do not adjust, estimate, or extrapolate salary figures. If the BLS does not publish data for a particular occupation-state combination, we do not show a page for it. This means some occupations may not have data in all 50 states.

What We Do NOT Do

Data Freshness

The BLS publishes OEWS data annually, typically in the spring for the prior year's estimates. Our current data reflects May 2024 estimates. We update our database each time the BLS releases new OEWS data.

Limitations

Like any data source, BLS OEWS data has certain limitations that users should be aware of:

Open Source & Verification

Our data sources are entirely public. Anyone can verify our numbers by accessing the same BLS data files we use:

If you find a discrepancy between our data and the BLS source, please let us know and we will investigate and correct it.

City & Town Profile Data Pipeline

In addition to salary data, AmericaByNumbers.com provides demographic, economic, and housing profiles for over 28,000 U.S. cities and towns. This data follows a similarly rigorous pipeline:

1 Census API Data Download

We query the U.S. Census Bureau's American Community Survey (ACS) 5-Year Estimates API for all incorporated places and census-designated places (CDPs) across all 50 states and the District of Columbia. The ACS 5-Year dataset provides the most reliable estimates for small geographies.
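The sketch below shows how one such query could be built and its response decoded. It assumes the public Census Data API endpoint pattern `https://api.census.gov/data/<year>/acs/acs5`, a JSON response whose first row is the header, and that negative values are sentinel codes for unavailable estimates. The survey year is left as a parameter because this page does not state which ACS vintage is used; function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def build_acs_url(year: int, variables: list, state_fips: str) -> str:
    """Build a query for all places in one state (assumed endpoint pattern)."""
    params = {
        "get": ",".join(variables),
        "for": "place:*",
        "in": f"state:{state_fips}",
    }
    # Keep ':', '*' and ',' unescaped so the URL stays readable.
    return f"https://api.census.gov/data/{year}/acs/acs5?" + urlencode(params, safe=":*,")


def rows_to_records(payload: list) -> list:
    """Turn the header-plus-rows JSON array into a list of dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def to_number(raw) -> Optional[float]:
    """Convert an API value to a float; treat nulls and negative sentinels as missing."""
    if raw is None:
        return None
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None
    # Assumption: counts and medians in these tables are never negative,
    # so a negative number marks an unavailable estimate.
    return None if value < 0 else value
```

For example, `build_acs_url(year, ["NAME", "B01003_001E"], "06")` requests the name and total population of every place in the state with FIPS code 06.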

2 Multi-Table Data Extraction

For each place, we extract data from multiple Census tables:

Category | Census Table | Key Variables
Demographics | B01003, B02001, B01002 | Population, race/ethnicity, median age
Income | B19013, B19301, B17001 | Median household income, per capita income, poverty
Education | B15003 | HS diploma, bachelor's, graduate degree attainment
Housing | B25077, B25064, B25003 | Median home value, median rent, ownership rate
Employment | B23025 | Civilian labor force, unemployment rate

3 Data Cleaning & Normalization

Census place names are cleaned (removing designations like "city", "town", "CDP"), duplicate city names are disambiguated using FIPS codes, and all percentages are calculated from raw population counts. Places missing data in some key fields are still included, with "Data not available" shown in place of the missing values.
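These cleaning rules can be illustrated with a few small helpers. The sketch assumes Census names arrive in the form "Springfield city, Illinois", strips only the three designations named above (the real list is likely longer), and builds a unique key from the state and place FIPS codes. All function names are hypothetical.

```python
import re
from typing import Optional

# Case-sensitive on purpose: the lowercase designation "city" is stripped,
# but a capitalized "City" that is part of the name is kept.
_DESIGNATION = re.compile(r"\s+(city|town|CDP)$")


def clean_place_name(census_name: str) -> tuple:
    """Split 'Springfield city, Illinois' into ('Springfield', 'Illinois')."""
    place, sep, state = census_name.rpartition(", ")
    if not sep:  # no state part present
        place, state = census_name, ""
    return _DESIGNATION.sub("", place), state


def place_key(state_fips: str, place_fips: str) -> str:
    """Unique key that tells apart places sharing the same name."""
    return state_fips + place_fips


def percent(part: Optional[float], total: Optional[float]) -> Optional[float]:
    """Percentage from raw counts, or None when it cannot be computed."""
    if part is None or not total:
        return None
    return round(100 * part / total, 1)


def format_percent(value: Optional[float]) -> str:
    """Render a percentage, falling back to the 'Data not available' label."""
    return "Data not available" if value is None else f"{value:.1f}%"
```

Returning `None` instead of 0 for an uncomputable percentage keeps a missing value from being displayed as a real statistic.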

4 Database Storage & Page Generation

Data is stored in a SQLite database (census.db) with 6 normalized tables. City profile pages are generated using Jinja2 templates with CSS-only visualizations (no JavaScript dependencies for charts). A link audit verifies all internal links after every build.

Coverage note: Of 28,456 Census places, 25,758 (90.5%) have complete data across all five key categories (demographics, income, education, housing, employment). Places with partial data are still included with available information displayed.
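The coverage figure is a simple ratio, and a completeness check along these lines could produce it. The category keys and the shape of the `profile` dictionary are assumptions made for this sketch.

```python
# The five key categories named in the coverage note.
KEY_CATEGORIES = ("demographics", "income", "education", "housing", "employment")


def is_complete(profile: dict) -> bool:
    """True when a place has data in every key category."""
    return all(profile.get(category) is not None for category in KEY_CATEGORIES)


def coverage_percent(complete: int, total: int) -> float:
    """Share of places with complete data, rounded to one decimal place."""
    return round(100 * complete / total, 1)


print(coverage_percent(25758, 28456))  # 90.5
```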
