Synthetic SCB Population Navigator

Best-possible synthetic population fit for Sweden, with hard evidence

This homepage consolidates precision achievements, known weaknesses, and the reasons some imprecision is unavoidable when protected statistics conflict across the national, municipality, and DeSO levels. It also explains how the current synthesis improves on earlier public baselines while remaining fully auditable.

Population total 10.59M
National error 0.05%
Grid QA fail rate 1.97%
Employment gap (20-65) -445k

Key achievements

Precision highlights and verified benchmarks

Population totals

National total is within 0.05% of SCB counts, with sex totals within +/-1.6%.

Demography

Education distribution

Mean absolute percent error is 2.18% for ages 25-65.

Education

Industry mix

National industry mix is within 3.62% mean absolute percent error.

Labor

Mobility

Commute mode split (0.03% mean absolute percent error) and commute distance distribution (0.33%) closely match the official shares.

Mobility

Work commute outflow

Stockholm outflow 27.19% vs 27.14% official; Kiruna 3.92% vs 4.07%.

Flow check

Grid allocation

The 1 km allocation fails only 1.97% of checks overall; every age-band metric has a fail rate under 1%.

Spatial fit

Major problems to solve

Largest deviations from target distributions

Employment by age and sex

Mean absolute percent error 16.31%, dominated by age 20-65 gaps.

Priority

Household size

Mean absolute percent error 35.53%; the synthetic population contains no households of 7+ persons (34,584 in the official data).

Priority

Tenure split

Mean absolute percent error 15.30%, mostly in the Other/unknown and Tenant-owned categories.

Priority

Local fit

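Municipality and DeSO errors remain high (for example, mean absolute percent error of 212.63% for DeSO age-sex and 85.05% for municipality tenure) because of sparse and suppressed cells.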

Local risk
Why imprecision happens: SCB suppression and rounding create conflicting marginals across geography and domains. Household composition, tenure, and employment compete for the same individuals, so a perfect joint fit is mathematically impossible. DeSO cells are small, so percent errors inflate quickly even when absolute differences are modest. This is why the model prioritizes national consistency and uses municipality-level hard constraints with soft DeSO alignment.

Why this model is stronger than the 2023 baseline

Method and evidence improvements vs Tozluoglu2023
More recent targets: uses 2023 SCB marginals and 2024 grid tiles instead of older aggregates.
Harder constraints: national and municipality totals are enforced, while DeSO remains soft for stability.
Full QA transparency: every metric includes diff tables, maps, and grid QA coverage.
Mobility validation: commute mode and distance fit is quantified, not only described.

Citations: [1] [2] [3]

Detailed QA report

HTML conversion of report_val.md

Validation report

Generated: 2026-01-21 10:25:38

Source: population_navigator/outputs_fix22d

National totals

| sex_code | sex_label | official_count | synthetic_count | abs_diff | pct_diff |
|---|---|---|---|---|---|
| 1 | Men | 5324785.0 | 5246700 | -78085.0 | -1.466444185070383 |
| 2 | Women | 5262925.0 | 5346670 | 83745.0 | 1.5912254117244686 |
| 0 | Total | 10587710.0 | 10593370 | 5660.0 | 0.0534582076766364 |
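
The abs_diff and pct_diff columns in the national totals table above can be reproduced from the two count columns. The sketch below assumes abs_diff is the signed count gap (synthetic minus official) and pct_diff is that gap as a percentage of the official count; the function name `signed_diffs` is hypothetical and not taken from the report's code.

```python
# (official_count, synthetic_count) copied from the national totals table.
rows = {
    "Men": (5324785, 5246700),
    "Women": (5262925, 5346670),
    "Total": (10587710, 10593370),
}


def signed_diffs(official, synthetic):
    """Return (abs_diff, pct_diff): signed count gap and gap as % of official."""
    diff = synthetic - official
    return diff, 100.0 * diff / official


for label, (official, synthetic) in rows.items():
    diff, pct = signed_diffs(official, synthetic)
    print(f"{label:<6} abs_diff={diff:+d} pct_diff={pct:+.4f}%")
```

Under these assumptions the men and women gaps nearly cancel, which is why the national total is within 0.05% while each sex total is off by about 1.5%.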

National comparisons (cells with official count >= 10)

Age-sex distribution (national)

total_abs_pct=5.25% mean_abs_pct=5.39% max_abs_pct=17.84% distance_to_5pct=0.39%

  • age_band_label=15-19, sex_label=Women: official=302053.0, synthetic=355936, abs_diff=53883.0
  • age_band_label=15-19, sex_label=Men: official=320978.0, synthetic=372867, abs_diff=51889.0
  • age_band_label=20-24, sex_label=Women: official=287615.0, synthetic=333436, abs_diff=45821.0
  • age_band_label=20-24, sex_label=Men: official=308500.0, synthetic=350619, abs_diff=42119.0
  • age_band_label=25-29, sex_label=Women: official=300790.0, synthetic=268528, abs_diff=-32262.0

Marital status by sex (national)

total_abs_pct=4.50% mean_abs_pct=5.04% max_abs_pct=7.16% distance_to_5pct=0.04%

  • marital_status_label=Single, sex_label=Women: official=2672439.0, synthetic=2833576, abs_diff=161137.0
  • marital_status_label=Married, sex_label=Men: official=1703430.0, synthetic=1585231, abs_diff=-118199.0
  • marital_status_label=Single, sex_label=Men: official=3063334.0, synthetic=3143364, abs_diff=80030.0
  • marital_status_label=Married, sex_label=Women: official=1677848.0, synthetic=1636719, abs_diff=-41129.0
  • marital_status_label=Divorced, sex_label=Men: official=455712.0, synthetic=423063, abs_diff=-32649.0

Education levels 25-65 (national)

total_abs_pct=2.25% mean_abs_pct=2.18% max_abs_pct=3.12% distance_to_5pct=-2.82%

  • education_label=post-secondary education, 3 years or more: official=1679311.0, synthetic=1626879, abs_diff=-52432.0
  • education_label=upper secondary education: official=2225470.0, synthetic=2185787, abs_diff=-39683.0
  • education_label=post-secondary education, less than 3 years: official=880424.0, synthetic=857439, abs_diff=-22985.0
  • education_label=unknown education: official=152973.0, synthetic=148914, abs_diff=-4059.0
  • education_label=primary or lower secondary education: official=541797.0, synthetic=545730, abs_diff=3933.0

Employment by age/sex (national)

total_abs_pct=9.89% mean_abs_pct=16.31% max_abs_pct=37.25% distance_to_5pct=11.31%

  • status=employed, age_band=alder_20_65, sex_code=1: official=2544647.0, synthetic=2291343, abs_diff=-253304.0
  • status=employed, age_band=alder_20_64, sex_code=1: official=2515263.0, synthetic=2272818, abs_diff=-242445.0
  • status=not_employed, age_band=alder_20_65, sex_code=2: official=593111.0, synthetic=809195, abs_diff=216084.0
  • status=not_employed, age_band=alder_20_65, sex_code=1: official=561076.0, synthetic=766379, abs_diff=205303.0
  • status=not_employed, age_band=alder_20_64, sex_code=2: official=564349.0, synthetic=767992, abs_diff=203643.0

Industry mix (national)

total_abs_pct=3.52% mean_abs_pct=3.62% max_abs_pct=5.25% distance_to_5pct=-1.38%

  • industry_code=M+N, industry_label=professional, scientific and technical companies; administrative and support service companies: official=673570.0, synthetic=647621, abs_diff=-25949.0
  • industry_code=Q, industry_label=human health and social work establishments: official=833119.0, synthetic=807569, abs_diff=-25550.0
  • industry_code=G, industry_label=trade; repair establishments for motor vehicles and motorcycles: official=610087.0, synthetic=587202, abs_diff=-22885.0
  • industry_code=B+C, industry_label=mining, quarrying, manufacturing : official=560062.0, synthetic=541450, abs_diff=-18612.0
  • industry_code=P, industry_label=educational establishments: official=551849.0, synthetic=536454, abs_diff=-15395.0

Household size (national)

total_abs_pct=5.45% mean_abs_pct=35.53% max_abs_pct=110.48% distance_to_5pct=30.53%

  • size_label=2 persons: official=1489277.0, synthetic=1575854.0, abs_diff=86577.0
  • size_label=6 persons: official=56185.0, synthetic=118258.0, abs_diff=62073.0
  • size_label=5 persons: official=195741.0, synthetic=137239.0, abs_diff=-58502.0
  • size_label=7+ persons: official=34584.0, synthetic=0.0, abs_diff=-34584.0
  • size_label=1 person: official=2077552.0, synthetic=2098139.0, abs_diff=20587.0

Household type by children (national)

total_abs_pct=1.69% mean_abs_pct=2.71% max_abs_pct=6.40% distance_to_5pct=-2.29%

  • household_type_label=single without children, child_label=0 children: official=2077552, synthetic=2098139.0, abs_diff=20587.0
  • household_type_label=cohabiting with children aged 0-24, child_label=2 children: official=482601, synthetic=499883.0, abs_diff=17282.0
  • household_type_label=cohabiting/married without children, child_label=0 children: official=1184580, synthetic=1194513.0, abs_diff=9933.0
  • household_type_label=cohabiting with children aged 0-24, child_label=1 child: official=328765, synthetic=337180.0, abs_diff=8415.0
  • household_type_label=cohabiting with children aged 0-24, child_label=3 children or more: official=195365, synthetic=203094.0, abs_diff=7729.0

Dwelling type (national)

total_abs_pct=0.06% mean_abs_pct=0.20% max_abs_pct=0.62% distance_to_5pct=-4.80%

  • building_type_label=multi-dwelling buildings: official=4595749.0, synthetic=4598885, abs_diff=3136.0
  • building_type_label=data missing: official=256599.0, synthetic=258184, abs_diff=1585.0
  • building_type_label=one- or two-dwelling buildings: official=5400699.0, synthetic=5401581, abs_diff=882.0
  • building_type_label=special housing: official=224822.0, synthetic=225079, abs_diff=257.0
  • building_type_label=other buildings: official=109837.0, synthetic=109641, abs_diff=-196.0

Tenure (national)

total_abs_pct=2.89% mean_abs_pct=15.30% max_abs_pct=54.41% distance_to_5pct=10.30%

  • tenure_label=Other/unknown: official=257498.0, synthetic=117387, abs_diff=-140111.0
  • tenure_label=Tenant-owned: official=2150824.0, synthetic=2263879, abs_diff=113055.0
  • tenure_label=Rented: official=3194532.0, synthetic=3237042, abs_diff=42510.0
  • tenure_label=Owner-occupied: official=4984846.0, synthetic=4975062, abs_diff=-9784.0
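
The report does not define its summary metrics, but the four tenure rows above are enough to check one plausible reading. The sketch below assumes total_abs_pct is the sum of absolute differences over the sum of official counts, mean_abs_pct is the unweighted mean of per-cell absolute percent errors, max_abs_pct is the largest per-cell value, and distance_to_5pct is mean_abs_pct minus the 5% target. These definitions are inferred from the numbers, not taken from the report's code, and `summarize` is a hypothetical helper.

```python
# (official, synthetic) counts copied from the four tenure rows above.
tenure = {
    "Other/unknown": (257498, 117387),
    "Tenant-owned": (2150824, 2263879),
    "Rented": (3194532, 3237042),
    "Owner-occupied": (4984846, 4975062),
}


def summarize(cells, target_pct=5.0):
    """Summary metrics under the assumed (inferred) definitions."""
    abs_diffs = [abs(s - o) for o, s in cells.values()]
    officials = [o for o, _ in cells.values()]
    # Per-cell absolute percent error, unweighted by cell size.
    cell_pcts = [100.0 * d / o for d, o in zip(abs_diffs, officials)]
    mean_abs_pct = sum(cell_pcts) / len(cell_pcts)
    return {
        "total_abs_pct": 100.0 * sum(abs_diffs) / sum(officials),
        "mean_abs_pct": mean_abs_pct,
        "max_abs_pct": max(cell_pcts),
        "distance_to_5pct": mean_abs_pct - target_pct,
    }


metrics = summarize(tenure)
print(" ".join(f"{k}={v:.2f}%" for k, v in metrics.items()))
```

With these definitions the tenure rows give the same rounded values as the reported summary line. The gap between total_abs_pct (2.89%) and mean_abs_pct (15.30%) comes from the small Other/unknown category, which carries a 54.41% error but little weight in the total.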

Car ownership (municipality totals)

total_abs_pct=0.00% mean_abs_pct=0.00% max_abs_pct=0.00% distance_to_5pct=-5.00%

  • municipality_id=114.0: official=17631.0, synthetic=17631.0, abs_diff=0.0
  • municipality_id=115.0: official=15239.0, synthetic=15239.0, abs_diff=0.0
  • municipality_id=117.0: official=20895.0, synthetic=20895.0, abs_diff=0.0
  • municipality_id=120.0: official=18119.0, synthetic=18119.0, abs_diff=0.0
  • municipality_id=123.0: official=28155.0, synthetic=28155.0, abs_diff=0.0

Interpretation and constraints

  • National totals are the hard target; municipality-level quotas are enforced for employment and industry, while DeSO-level constraints are soft to avoid overfitting sparse cells.
  • Employment comparisons exclude not-employed for the 15-74 age band because the official dataset only provides employed counts for that band (no unemployed/outside-labor-force series).
  • The apparent 1.8M "missing employment" is the under-15 population without employment status by design; the employed gap for ages 20-65 is about -445k vs official totals in the current build.

Why residual errors remain

  • Confidentiality treatment (suppression/rounding) yields conflicting marginals across national, municipality, and DeSO levels, so a perfect joint fit is infeasible.
  • Cross-domain constraints fight each other (household size, tenure, marital status, employment), so improving one domain can degrade another without slack.
  • DeSO cells are small and noisy; percent errors inflate quickly even when absolute differences are modest.
  • The current iteration balances hard national fits with soft local fits to minimize global error without violating primary totals.
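
The point about small cells can be illustrated with two rows from the national household size table, where the same effect already shows up in the smaller categories. This is a minimal sketch; `abs_pct_error` is a hypothetical helper and the percent error is assumed to be the absolute difference divided by the official count.

```python
# (official, synthetic) household counts copied from the household size table.
cells = {
    "2 persons": (1489277, 1575854),
    "6 persons": (56185, 118258),
}


def abs_pct_error(official, synthetic):
    """Absolute difference as a percentage of the official count."""
    return 100.0 * abs(synthetic - official) / official


for label, (official, synthetic) in cells.items():
    gap = abs(synthetic - official)
    print(f"{label}: abs_diff={gap:,} abs_pct={abs_pct_error(official, synthetic):.2f}%")
```

The 2-person category has the larger absolute gap (86,577 households) but a percent error under 6%, while the 6-person category has a smaller gap (62,073) and a percent error above 110%. DeSO cells are far smaller than either, so the inflation is stronger there.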

Comparison to Tozluoglu2023

  • SySMo uses NNC to predict joint distributions and IPF to fit those predictions to official totals before sampling individuals; the paper states the activity-travel patterns "approximate the validation data patterns reasonably well".
  • The published dataset reports 10,203,820 individuals, while our target total is 10,587,710 (2023), so some differences reflect year and source updates in the official marginals.
  • Our mobility alignment is strong (commute mode and distance mean abs pct below 1%), matching the qualitative validation in Tozluoglu2023; the largest gaps remain in employment by age/sex, household size, and tenure where confidentiality and inconsistent marginals dominate.
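
For readers unfamiliar with IPF (iterative proportional fitting), the sketch below shows the basic two-dimensional version: a seed table is rescaled row by row and column by column until its margins match the targets. It is a generic textbook illustration, not the SySMo implementation and not this project's code; the function `ipf`, the 2x2 seed, and the target margins are all hypothetical.

```python
def ipf(seed, row_targets, col_targets, iterations=50, tol=1e-9):
    """Scale a 2-D seed table until its row and column sums match the targets."""
    table = [row[:] for row in seed]
    for _ in range(iterations):
        # Row step: rescale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [v * target / row_sum for v in table[i]]
        # Column step: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # The column step can disturb the row sums, so check those to stop.
        if all(abs(sum(table[i]) - t) <= tol for i, t in enumerate(row_targets)):
            break
    return table


# Hypothetical example: two rows and two columns with consistent margins.
seed = [[1.0, 1.0], [1.0, 1.0]]
fitted = ipf(seed, row_targets=[60.0, 40.0], col_targets=[70.0, 30.0])
for row in fitted:
    print([round(v, 2) for v in row])
```

The example converges because both sets of targets sum to 100. If the row and column targets summed to different totals, as can happen with suppressed or rounded marginals, the loop would end with the columns matched and the rows still off, which is the kind of residual error discussed in this report.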

Local summaries (target<=5% mean abs pct)

Age-sex distribution (DeSO)

mean_abs_pct=212.63% p90_abs_pct=267.23% share_within_5pct=0.00%

Household size (municipality)

mean_abs_pct=51.91% p90_abs_pct=55.12% share_within_5pct=0.00%

Household type by children (municipality)

mean_abs_pct=28.64% p90_abs_pct=31.57% share_within_5pct=0.00%

Dwelling type (municipality)

mean_abs_pct=17.14% p90_abs_pct=21.46% share_within_5pct=0.00%

Tenure (municipality)

mean_abs_pct=85.05% p90_abs_pct=159.48% share_within_5pct=0.00%

Car ownership (municipality)

mean_abs_pct=0.00% p90_abs_pct=0.00% share_within_5pct=100.00%

Mobility comparisons

commute_distance_distribution

total_abs_pct=0.12% mean_abs_pct=0.33% max_abs_pct=3.20%

  • segment=Bike segment_detail=2-5: official=0.4083115259930477 synthetic=0.4089597812124785 abs_diff=0.0006482552194308
  • segment=Public transport segment_detail=10-20: official=0.1160766950353523 synthetic=0.1164981169609348 abs_diff=0.0004214219255825
  • segment=Car segment_detail=200+: official=0.085283192684277 synthetic=0.0856063487609732 abs_diff=0.0003231560766962
  • segment=Public transport segment_detail=5-10: official=0.2044537049094934 synthetic=0.2047714610586645 abs_diff=0.000317756149171
  • segment=Public transport segment_detail=0-2: official=0.1680825475130799 synthetic=0.1677695791150264 abs_diff=-0.0003129683980535

commute_mode_split

total_abs_pct=0.02% mean_abs_pct=0.03% max_abs_pct=0.06%

  • segment=Public transport: official=0.1722446308383901 synthetic=0.1723504418329577 abs_diff=0.0001058109945676
  • segment=Walk: official=0.1019391919439815 synthetic=0.1018879733267128 abs_diff=-0.0000512186172687
  • segment=Car: official=0.6244707366343271 synthetic=0.6244196133997019 abs_diff=-0.0000511232346252
  • segment=Bike: official=0.1013454405833013 synthetic=0.1013419714406274 abs_diff=-0.0000034691426739

study_commute_pattern

total_abs_pct=0.50% mean_abs_pct=0.80% max_abs_pct=2.76%

  • segment=living and studying in the same municipality segment_detail=1 origin_id=126.0: official=0.237170010559662 synthetic=0.2434938083570624 abs_diff=0.0063237977974003
  • segment=living and studying in the same municipality segment_detail=1 origin_id=2480.0: official=0.6261198261198261 synthetic=0.6210068882622163 abs_diff=-0.0051129378576098
  • segment=studying in other municipality in the county of residence segment_detail=2 origin_id=126.0: official=0.6677930306230201 synthetic=0.6631348917290246 abs_diff=-0.0046581388939954
  • segment=studying outside the county of residence segment_detail=3 origin_id=2480.0: official=0.3141183141183141 synthetic=0.3186630299107762 abs_diff=0.004544715792462
  • segment=studying outside the county of residence segment_detail=3 origin_id=680.0: official=0.1407701019252548 synthetic=0.1446540880503144 abs_diff=0.0038839861250596

work_flow

total_abs_pct=4.25% mean_abs_pct=4.47% max_abs_pct=15.24%

  • origin_id=1280.0 destination_id=1280.0: official=129205.0 synthetic=113534.0 abs_diff=-15671.0
  • origin_id=180.0 destination_id=180.0: official=392110.0 synthetic=384286.0 abs_diff=-7824.0
  • origin_id=1480.0 destination_id=1480.0: official=260445.0 synthetic=252751.0 abs_diff=-7694.0
  • origin_id=380.0 destination_id=380.0: official=93770.0 synthetic=86884.0 abs_diff=-6886.0
  • origin_id=580.0 destination_id=580.0: official=72012.0 synthetic=66733.0 abs_diff=-5279.0

Work commute outflow shares (home != work municipality)

  • Stockholm (0180): official 27.14% (146,060 / 538,170), synthetic 27.19% (143,489 / 527,775)
  • Kiruna (2584): official 4.07% (500 / 12,284), synthetic 3.92% (433 / 11,036)
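
The outflow shares above follow directly from the counts in parentheses. The sketch below recomputes them and the gap in percentage points; it assumes the first number is the count of commuters working outside the home municipality and the second is the matching denominator, which the report does not spell out. The helper `share_pct` is hypothetical.

```python
# (numerator, denominator) pairs copied from the two bullets above.
outflow = {
    "Stockholm (0180)": {"official": (146060, 538170), "synthetic": (143489, 527775)},
    "Kiruna (2584)": {"official": (500, 12284), "synthetic": (433, 11036)},
}


def share_pct(numerator, denominator):
    """Share expressed as a percentage."""
    return 100.0 * numerator / denominator


for name, pair in outflow.items():
    official = share_pct(*pair["official"])
    synthetic = share_pct(*pair["synthetic"])
    gap_pp = synthetic - official  # gap in percentage points
    print(f"{name}: official={official:.2f}% synthetic={synthetic:.2f}% gap={gap_pp:+.2f} pp")
```

Both shares agree to within about 0.15 percentage points even though the synthetic numerators and denominators are each lower than the official ones, so the share check alone does not show that commuter counts match.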

Grid assignment QA (1km)

  • Total checks: 12,835,920; failures: 252,243; fail rate: 1.97%
  • Highest failure rates by metric: total 17.48%, men 12.13%, women 11.45%
  • Age-band metrics are low across the board (max failure rate about 0.63% for ald15_20)
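
The overall fail rate is simple arithmetic on the two counts in the first bullet. A minimal check, assuming the rate is failures divided by total checks:

```python
# Counts copied from the grid assignment QA bullets.
total_checks = 12_835_920
failures = 252_243

fail_rate_pct = 100.0 * failures / total_checks
pass_rate_pct = 100.0 - fail_rate_pct
print(f"fail rate: {fail_rate_pct:.2f}%  pass rate: {pass_rate_pct:.2f}%")
```

Because this is a pooled rate over all checks, it can sit well below the highest per-metric rates listed above.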

Visual evidence

Charts and maps from outputs_fix22d with context

Demography

Age-sex pyramid national comparison
National age-sex pyramid with a visible surplus at ages 15-24 driving the 5.39% mean absolute percent error.
Marital status by age and sex
Marital status by age and sex; error is concentrated in single vs married shares.
Local age-sex pyramid comparison
Local pyramid highlighting high DeSO volatility from suppressed counts.

Education and labor

Education level distribution
Education distribution is tight (2.18% mean absolute percent error).
Employment by age and sex
Employment by age and sex shows the largest national gap (16.31% mean absolute percent error).
Industry mix
Industry mix closely tracks the official distribution.
Occupation mix
Occupation mix snapshot for employed synthetic population.

Households and housing

Household size distribution
Household size is the weakest national fit; 7+ households are missing.
Household type by children
Household type by children is well aligned despite local noise.
Tenure distribution
Tenure mismatch is driven by unknown vs tenant-owned totals.
Dwelling type distribution
Dwelling type is effectively aligned (0.20% mean absolute percent error).
Car ownership comparison
Car ownership matches by municipality after rebalancing.

Mobility

Commute mode split
Commute mode split aligns almost perfectly (0.03% mean absolute percent error).
Commute distance distribution
Distance distribution is tightly matched across modes.
Work flow map
Work flow map shows mostly within-municipality flows; within-municipality flows in the largest cities are undercounted.
Study commute map
Study commute patterns match within 0.80% mean absolute percent error.

OOH readiness

OOH audience segments map
Audience segment mix by region, built from mobility and demographic assumptions.
OOH reach index by region
Reach index highlights regional exposure potential.
OOH exposure sensitivity
Exposure sensitivity panel shows which assumptions matter most.

Plausibility checks

Age band vs employment status
Employment by age band sanity check (expected monotonic patterns).
Household size vs cars
Car ownership vs household size; checks for unrealistic caps.
Building type vs tenure heatmap
Tenure vs building type; verifies expected occupancy patterns.
Commute distance vs mode
Mode choices by distance for plausibility against travel behavior.

Spatial fit and reliability

Population density map
Population density at 1 km resolution based on grid allocation.
Grid fit within tolerance map
Cells within tolerance show strong fit in most populated areas.
Grid fit error hotspots map
Hotspots point to suppressed or low-density tiles.
Reliability risk heatmap
Risk heatmap combines uncertainty metrics for caution zones.

Sources and citations

Data references used in this synthesis
  1. Tozluoglu, C. et al. (2023) A synthetic population of Sweden: datasets of agents, households, and activity-travel patterns. Data in Brief 48. PDF: Tozluoglu2023AsyntheticpopulationofSweden.pdf
  2. SCB official marginals used for targets, including demography, labor, and household statistics (2023). Dataset: summary_statistics/all_data_deso.parquet.
  3. SCB 1 km grid and base tiling inputs for spatial allocation (2024), used by pop2grid: data/raw/scb_wfs_grid_1km/stat_befolkning_1km_2024.geojson and data/raw/scb_wfs_grid_1km/stat_Rutnat_1x1km_sweref99tm.geojson.