Data Sources

Every data point on MySchoolScout comes from publicly available federal and state government sources. Here's exactly where our data comes from, how current it is, and how often we update.

Data on this page was last reviewed April 8, 2026. Current rankings were computed March 2026 from the NCES CCD 2022-23 release, state DOE releases (2023-24 where available), and U.S. Census ACS 5-year estimates (2019-2023).

See our changelog for recent updates to data, methodology, and features.

School Directory & Enrollment

SY 2022-23

School names, addresses, enrollment counts, grade ranges, student-teacher ratios, and demographic breakdowns come from the NCES Common Core of Data (CCD), the U.S. Department of Education's comprehensive annual directory of all public elementary and secondary schools and school districts.

Source: NCES Common Core of Data (CCD)
Data Year: School Year 2022-23
Coverage: ~96,000 public schools nationwide
Update Frequency: Annual (typically released ~12 months after school year ends)
Fields Used: Name, address, phone, grades, enrollment, demographics, type, locale

Test Scores & Academic Performance

Varies by state (2024-25)

Test score data comes from two sources: state Departments of Education (higher quality, most recent year) and federal EdFacts data from NCES (multi-year historical data for growth calculations).

State Assessment Data (Primary Source)

We download school-level proficiency data directly from each state's Department of Education. Each state administers its own standardized assessment (e.g., CAASPP in California, STAAR in Texas, TCAP in Tennessee). We normalize these to a common format: percent of students scoring at or above proficient in Math and ELA.

States Covered: CA, TX, FL, NY, PA, IL, OH, GA, NC, MI, VA, WI, TN (and growing)
Data Year: Most recent available (typically 2024 or 2025)
Subjects: Math, ELA (some states include Science, Social Studies)
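As a rough illustration of that normalization, the sketch below converts per-level student counts into a single percent-proficient figure. The function name, the level labels, and the choice of which levels count as proficient are assumptions for illustration; each state defines its own performance levels, and this is not necessarily how our pipeline is written.

```python
def percent_proficient(counts_by_level, proficient_levels):
    """Return the percent of tested students at or above proficient.

    counts_by_level: dict mapping a state's performance-level label to a
        student count (labels here are hypothetical).
    proficient_levels: set of labels treated as proficient or better.
    Returns None when no students were tested.
    """
    total = sum(counts_by_level.values())
    if total == 0:
        return None
    at_or_above = sum(
        count
        for level, count in counts_by_level.items()
        if level in proficient_levels
    )
    return round(100.0 * at_or_above / total, 1)


# Hypothetical four-level assessment where Levels 3 and 4 count as proficient.
math_counts = {"Level 1": 20, "Level 2": 30, "Level 3": 35, "Level 4": 15}
print(percent_proficient(math_counts, {"Level 3", "Level 4"}))  # 50.0
```

Reducing every state's results to the same "at or above proficient" percentage is what lets Math and ELA figures be shown in one format, even though the underlying tests differ.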

EdFacts Historical Data (Growth Calculations)

For multi-year growth trends, we use EdFacts achievement data from 2019, 2021, and 2022. This federal data covers all states but has a longer publication lag.

Source: NCES EdFacts / ED Data Express
Years Available: 2019, 2021, 2022 (2020 excluded due to COVID)

Neighborhood Data

ACS 2019-2023 (5-Year Estimates)

Neighborhood statistics (median income, home values, education levels, commute times, poverty rates) come from the U.S. Census Bureau's American Community Survey (ACS), matched to each school's ZIP code.

Source: Census ACS 5-Year Estimates
Data Period: 2019-2023 (5-year average)
Geography: ZIP Code Tabulation Areas (ZCTA)
Fields Used: Income, home values, rent, education, poverty, commute, age

This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
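For readers who want to look up the same kind of figures themselves, the sketch below builds a Census Data API request URL for one ZCTA from the 2019-2023 ACS 5-year dataset. The helper name and the two example variables (B19013_001E for median household income, B25077_001E for median home value) are illustrative assumptions, not a list of the exact fields we request. The code only constructs the URL and makes no network call.

```python
from urllib.parse import quote, urlencode

# ACS 5-year estimates for the 2019-2023 period are published as vintage 2023.
ACS5_BASE_URL = "https://api.census.gov/data/2023/acs/acs5"


def acs_zcta_url(zcta, variables=("B19013_001E", "B25077_001E")):
    """Build a Census Data API URL for a single ZIP Code Tabulation Area.

    The default variables are examples chosen for illustration only.
    """
    params = {
        "get": ",".join(("NAME",) + tuple(variables)),
        "for": f"zip code tabulation area:{zcta}",
    }
    # Encode spaces as %20 and leave commas and colons readable.
    return ACS5_BASE_URL + "?" + urlencode(params, quote_via=quote, safe=",:")


print(acs_zcta_url("90210"))
```

Frequent use of the Census Data API generally calls for a free API key passed as an additional `key` parameter, which is omitted here.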

Walkability & County Safety Index

We surface Walk Score, Transit Score, and Bike Score for individual school addresses, plus a county-level safety index derived from FBI UCR Part-1 offense counts normalized to per-100K-resident rates. Both signals appear on individual school pages in the Neighborhood Context section when data is available.

FERPA note: our crime index is reported strictly at the county level. We never store, query, or display per-school or per-student crime data.

Walkability Source: Walk Score API (walkscore.com)
Walk Score Coverage: Top ~10K schools by enrollment (long-tail rolling out)
Crime Source: FBI Crime Data Explorer, Offenses Known to Law Enforcement by County
Crime Granularity: County-level only (5-digit FIPS)
Update Frequency: Annual (FBI typically publishes ~9-12 months after year-end)
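The per-100K normalization mentioned above is simple arithmetic, sketched below. The function name and the sample county figures are made up for illustration, and the sketch stops at the rate itself; it does not cover how the rate is turned into the safety index shown on school pages.

```python
def rate_per_100k(offense_count, population):
    """Convert a raw county offense count into a rate per 100,000 residents.

    Returns None when the population figure is missing or not positive,
    since no meaningful rate can be computed in that case.
    """
    if population is None or population <= 0:
        return None
    return offense_count * 100_000 / population


# Hypothetical county: 1,250 Part-1 offenses among 500,000 residents.
print(rate_per_100k(1250, 500_000))  # 250.0
```

Expressing counts as rates is what makes a county of 50,000 residents comparable with a county of 5 million.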

Programs & Offerings

CRDC 2020-21

School program data (AP/IB courses, athletics, advanced STEM courses, gifted and talented programs) comes from the Civil Rights Data Collection (CRDC), a biennial survey of all public schools conducted by the U.S. Department of Education's Office for Civil Rights. We access this data through the Urban Institute Education Data Portal.

Source: CRDC via Urban Institute Education Data Portal
Data Year: 2020-21 (most recent CRDC collection)
Coverage: ~97,000 public schools (public schools only)
Update Frequency: Biennial (every 2 years)
Fields Used: AP/IB courses, interscholastic sports counts, advanced course sections, gifted/talented indicator

Programs data is shown on school pages as supplemental information. It is not factored into our composite scores or rankings. Private schools are not covered by CRDC, so athletics and advanced course data is only available for public schools.

Ratings & Rankings

Computed March 2026

Our 1-10 composite ratings are calculated by MySchoolScout using the data sources above. The formula weights Academic Performance (50%), Growth (20%), Equity (15%), and Environment (15%). Rankings are computed within each state to ensure fair comparison across different state assessments.

For full details on how ratings are calculated, see our Methodology page.
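As a simplified sketch of the weighting described above, the code below combines four component scores using the published weights and then ranks schools within each state. It assumes each component is already on the 1-10 scale and that all four are present; the function names, the sample schools, and the tie handling are illustrative assumptions, and the Methodology page remains the authoritative description.

```python
from collections import defaultdict

# Weights as published in this section.
WEIGHTS = {
    "academic": 0.50,
    "growth": 0.20,
    "equity": 0.15,
    "environment": 0.15,
}


def composite_score(components):
    """Weighted average of four component scores (assumed 1-10 each)."""
    return round(sum(WEIGHTS[name] * components[name] for name in WEIGHTS), 1)


def rank_within_state(schools):
    """Rank schools against others in the same state only.

    schools: list of (school_id, state, score) tuples.
    Returns {school_id: rank}, where rank 1 is the highest score in the
    state. Tied scores keep their input order (an arbitrary choice here).
    """
    by_state = defaultdict(list)
    for school_id, state, score in schools:
        by_state[state].append((score, school_id))

    ranks = {}
    for entries in by_state.values():
        entries.sort(key=lambda entry: -entry[0])
        for position, (_, school_id) in enumerate(entries, start=1):
            ranks[school_id] = position
    return ranks


score = composite_score(
    {"academic": 8, "growth": 6, "equity": 7, "environment": 5}
)
print(score)  # 7.0

print(rank_within_state([("a", "CA", 7.0), ("b", "CA", 8.5), ("c", "TX", 6.0)]))
```

Ranking inside each state avoids comparing a CAASPP-based score directly against a STAAR-based one.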

AI School Descriptions

Generated locally

A short narrative description on each school page is generated by a local large-language model (Qwen 2.5) from the same public data sources listed above. Descriptions summarize what the school offers — size, grade range, programs, test scores in context — without adding any information that isn't in the source data.

Source: MySchoolScout AI generation (local Qwen 2.5)
Input Data: CCD directory, state test scores, ACS neighborhood, CRDC programs
Update Frequency: Regenerated whenever source data refreshes

For schools with sparse data (fewer than 2 data components), we show a non-AI fallback description instead to avoid referencing misleading scores.

Public-Record Synthesis

Multi-source, claim-grounded

Some school pages include a "What's said publicly" section synthesizing what a school says about itself and what independent sources record. We pull from the school's own website, its Wayback Machine snapshot (so we capture the page as a parent would have seen it last year), and Wikipedia when an article exists. Every claim is grounded in a cited source — we never include claims that don't appear in the underlying text.

Sources: School's own website, Wayback Machine archive, Wikipedia (CC BY-SA)
Synthesis Model: Local Qwen 3.6 27B (no third-party API calls; runs on MySchoolScout's own hardware)
Trust Gating: Only rows with ≥80% of claims grounded in a cited source are rendered; the rest are not shown
Coverage: Live on schools that opted into review-signal synthesis (~459 production schools at SS-P5-147; expanding via SS-P5-197)

Wikipedia content is licensed CC BY-SA 4.0; attribution is preserved via the inline source URL chip. Claims that fail the grounding check are pruned before render. Schools without a confident Wikipedia match are still synthesized using the remaining sources.
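The gating and pruning rules can be pictured with the short sketch below. It assumes each claim has already been checked and carries a boolean `grounded` flag; the field names, function names, and sample claims are hypothetical, and the grounding check itself is outside the scope of this example.

```python
GROUNDING_THRESHOLD = 0.80  # at least 80% of claims must be grounded


def should_render(claims, threshold=GROUNDING_THRESHOLD):
    """Return True if enough of a row's claims are grounded in a source."""
    if not claims:
        return False
    grounded = sum(1 for claim in claims if claim["grounded"])
    return grounded / len(claims) >= threshold


def renderable_claims(claims, threshold=GROUNDING_THRESHOLD):
    """Apply the row-level gate, then prune any ungrounded claims."""
    if not should_render(claims, threshold):
        return []
    return [claim for claim in claims if claim["grounded"]]


claims = [
    {"text": "Offers a robotics club", "grounded": True},
    {"text": "Opened a new library wing", "grounded": True},
    {"text": "Serves grades 6-8", "grounded": True},
    {"text": "Hosts an annual science fair", "grounded": True},
    {"text": "Won a national award", "grounded": False},
]
print(should_render(claims))           # True (4 of 5 grounded)
print(len(renderable_claims(claims)))  # 4
```

Gating at the row level and then pruning at the claim level means a single ungrounded claim never reaches the page, while a mostly ungrounded row is withheld entirely.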

Data Limitations & Freshness

Government education data has inherent publication lags. The NCES CCD directory is typically released 12-18 months after the school year ends. State test score data is usually available within 3-6 months. Census ACS 5-year estimates are released about 12 months after the survey period.

This means some data on the site may be 1-2 years old. We show the data vintage on each school page so you always know exactly which school year the data represents. We update our data as soon as new releases become available.

Some data is suppressed to protect student privacy. When fewer than 10 students are in a group, state agencies suppress the data. This means small schools may have incomplete test score records.
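In downloaded state files, suppressed results usually appear as a placeholder rather than a number. The sketch below shows one way such values could be treated as missing instead of zero. The marker strings and the function name are hypothetical; each state uses its own conventions.

```python
# Hypothetical placeholders a state file might use for suppressed cells.
SUPPRESSION_MARKERS = {"", "*", "**", "<10", "n<10", "NA"}


def parse_proficiency(raw):
    """Parse a percent-proficient cell from a state data file.

    Returns a float for real values and None for suppressed or blank
    cells, so that suppression is never mistaken for a score of 0.
    """
    value = raw.strip()
    if value in SUPPRESSION_MARKERS:
        return None
    return float(value.rstrip("%"))


print(parse_proficiency("47.5%"))  # 47.5
print(parse_proficiency("*"))      # None
```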

Questions?

If you notice an error in our data or have questions about our sources, please contact us. We take data accuracy seriously and will investigate any reported discrepancies.