Merged
@@ -167,13 +167,13 @@ global r1a_if_condition "dcpst == 2 & dag >= ${age_can_retire} & flag_deceased !
global r1b_if_condition "ssscp != 1 & dcpst == 1 & dag >= ${age_can_retire} & flag_deceased != 1"

* Wages
global W1fa_if_condition "dgn == 0 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & flag_deceased != 1"
global W1fa_if_condition "dgn == 0 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & deh_c4 != 0 & flag_deceased != 1"

global W1ma_if_condition "dgn == 1 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & flag_deceased != 1"
global W1ma_if_condition "dgn == 1 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & deh_c4 != 0 & flag_deceased != 1"

global W1fb_if_condition "dgn == 0 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & previouslyWorking == 1 & flag_deceased != 1"
global W1fb_if_condition "dgn == 0 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & deh_c4 != 0 & previouslyWorking == 1 & flag_deceased != 1"

global W1mb_if_condition "dgn == 1 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & previouslyWorking == 1 & flag_deceased != 1"
global W1mb_if_condition "dgn == 1 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & deh_c4 != 0 & previouslyWorking == 1 & flag_deceased != 1"

* Capital income
global i1a_if_condition "dag >= ${age_becomes_semi_responsible} & flag_deceased != 1"
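
The hunk above adds `deh_c4 != 0` to each of the four wage conditions. A minimal sketch of how such a condition global might be expanded in an `if` qualifier is shown below. The toy data, the age thresholds (16 and 75), and the reading of `deh_c4 == 0` as "education not assigned" are assumptions made for illustration; none of them is taken from the PR.

```stata
* Hypothetical toy data: values and variable meanings are assumed
clear
set obs 6
gen dgn           = mod(_n, 2)     // gender indicator, as used in the conditions
gen dag           = 20 + 10 * _n   // age
gen deh_c4        = mod(_n, 4)     // education category; 0 assumed to mean "not assigned"
gen flag_deceased = 0

* Assumed thresholds; the real values are set elsewhere in the master conditions file
global age_seek_employment 16
global age_force_retire    75

* Condition as it reads after this change
global W1fa_if_condition "dgn == 0 & dag >= ${age_seek_employment} & dag <= ${age_force_retire} & deh_c4 != 0 & flag_deceased != 1"

* The global expands in place, so it can follow any command's if qualifier
count if ${W1fa_if_condition}
list dgn dag deh_c4 if ${W1fa_if_condition}
```

Because the thresholds are expanded when the global is defined, `age_seek_employment` and `age_force_retire` must be set before the condition line is executed.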
@@ -9,10 +9,12 @@
* DATA: Longitudinal EU-SILC UDB version, 2005 - 2020
* AUTHORS: Clare Fenwick, Daria Popova, Ashley Burdett,
* Aleksandra Kolndrekaj
* LAST UPDATE: Jan 2026 AB
* LAST UPDATE: March 2026 AB
*
********************************************************************************
* NOTES:
* ENSURE THAT 00_master_conditions.do HAS ALREADY BEEN RUN.
*
* Before running these files, the cumulative panel for each file type
* (D, H, R, P) must be constructed. These cumulative panels should be created
* following the procedure set out in *GESIS Papers 2022/10*. The do-files to
@@ -115,7 +117,7 @@ global dir_ind "/Users/ashleyburdett/Library/CloudStorage/Box-Box"
// Aleksandra - C:/Users/ak25793/Box

* Working directory
global dir_work "$dir_ind/CeMPA shared area/_SimPaths/_SimPathsEU/initial_populations/PL"
global dir_work "$dir_ind/CeMPA shared area/_SimPaths/_SimPathsEU/input_processing/initial_populations/PL"

* Directory containing do files
global dir_do "$dir_work/do_files"
@@ -146,21 +148,21 @@ global dir_data_05_20 "$dir_data/orig_panel_2005_2020"
* DEFINE PARAMETERS & PROCESS IF CONDITIONS
*******************************************************************************/

do "$dir_ind/CeMPA shared area/_SimPaths/_SimPathsEU/00_master_conditions.do"
do "$dir_ind/CeMPA shared area/_SimPaths/_SimPathsEU/input_processing/00_master_conditions_PL.do"


/*******************************************************************************
* EXECUTE FILES
*******************************************************************************/
//do "$dir_do/01_prepare_pooled_data.do"

do "$dir_do/02_create_variables_PL.do"
do "$dir_do/02_create_variables_${country}.do"

do "$dir_do/03_create_benefit_units_PL.do"
do "$dir_do/03_create_benefit_units_${country}.do"

do "$dir_do/04_reweight_PL.do"
do "$dir_do/04_reweight_${country}.do"

do "$dir_do/05_drop_hholds_and_slice_PL.do"
do "$dir_do/05_drop_hholds_slice_and_refactoring_${country}.do"

do "$dir_do/06_check_yearly_data_PL.do"
do "$dir_do/06_check_yearly_data_${country}.do"
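
The new `do` lines depend on a `country` global being defined before they run. The sketch below shows how the file names resolve. It assumes that `country` is set to `PL` (possibly inside the master conditions file, which this diff does not show) and uses a placeholder `dir_do`. It only displays the resolved paths rather than executing them.

```stata
* Assumed values for illustration only
global country "PL"
global dir_do  "do_files"   // placeholder; the real path is built from dir_work

* Step names copied from the new do lines in the diff
local steps 02_create_variables 03_create_benefit_units 04_reweight ///
    05_drop_hholds_slice_and_refactoring 06_check_yearly_data

* Show the path each do line would resolve to
foreach step of local steps {
    display "$dir_do/`step'_${country}.do"
}
```

If `country` is undefined, the suffix expands to an empty string and Stata looks for files such as `02_create_variables_.do`, so a check like `assert "${country}" != ""` before the `do` lines may be worth adding.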

42 changes: 37 additions & 5 deletions input_processing/data_construction/PL/01_prepare_pooled_data_PL.do
@@ -1,5 +1,5 @@
********************************************************************************
* PROJECT: ESPON
* PROJECT: SimPaths EU
* DO-FILE NAME: 01_prepare_pooled_data.do
* DESCRIPTION: Compiles panel dataset from EU-SILC
********************************************************************************
@@ -23,10 +23,42 @@ merge these chunks of data into one cumulative dataset (separately for the
D-,H-,R- and P-data).
*/
/*
Initial populations: cross-sectional SILC for 2011-2023 (income 2010-2022),
2023 (income 2022)
Estimation sample: longitudinal SILC with observations from 2011-2023
(income 2010-2022)
STRUCTURE OF THIS FILE

The script builds a person-level panel dataset for a single country by
sequentially merging the four EU-SILC master files produced by the panel
construction scripts (01-04 in eu_silc_do_2025/).

Files are merged in the following order, with R as the base:

R (Personal Register) — loaded first as the base. Contains all persons
in the sample including children under 16. Key identifiers: upid
(unique person ID across releases), uhid (unique household ID), year.

P (Personal Data) — merged 1:1 on year+upid+uhid. Contains income and
personal variables for adults aged 16 and above only. After this merge:
- Adults (in both R and P): have full R and P variables
- Children (in R only, not P): retained with R variables only
- Records in P but not R: dropped (should not occur in clean data)

D (Household Register) — merged 1:m on year+uhid. D is household-level
so one D row maps to multiple persons. keep if _merge==3 retains only
persons whose household appears in D. A small number of households may
not merge — this is suspected to be an edge case from the cross-release
deduplication in 01_create_masterD.do but has not been fully investigated.

H (Household Data) — merged 1:m on year+uhid, same logic as D.

KEY IDENTIFIERS
upid — unique personal ID across releases (country + rotation group +
dropout year + pid). Not the same as the raw pid in the source data.
uhid — unique household ID across releases (same construction logic).
year — income reference year.

OUTPUT
${country}-SILC_pooled_all_obs_01.dta — person-level panel for the target
country, containing all household members (adults and children) with
combined R, P, D, and H variables. Flag variables (*_f, *_i) are dropped.
*/
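
The merge order described in the header comment can be sketched on toy data as follows. All values and the non-key variable names (`age`, `income`, `region`, `hh_income`) are hypothetical, and the actual merge commands in the script are not shown in this diff. The sketch writes the household merges as `merge m:1` because the person-level data is in memory and the household file is the using file; the comment's "1:m" describes the same one-household-to-many-persons relationship from the household side.

```stata
clear all
tempfile P D H

* Toy P file: adults only
input year upid uhid income
2015 101 1 1200
2015 201 2  900
end
save `P'

* Toy D file: one row per household-year
clear
input year uhid region
2015 1 10
2015 2 20
end
save `D'

* Toy H file: one row per household-year
clear
input year uhid hh_income
2015 1 2000
2015 2  900
end
save `H'

* Toy R file as the base: all persons, including a child (upid 102)
clear
input year upid uhid age
2015 101 1 40
2015 102 1  8
2015 201 2 35
end

isid year upid uhid              // the keys must uniquely identify persons

* R + P: keep adults and R-only children, drop records found only in P
merge 1:1 year upid uhid using `P'
drop if _merge == 2
drop _merge

* Add D, then H: keep only persons whose household is matched
merge m:1 year uhid using `D'
keep if _merge == 3
drop _merge

merge m:1 year uhid using `H'
keep if _merge == 3
drop _merge

list, sepby(uhid)
```

Running `tab _merge` before each `keep` would show how many households fail to match, which could help investigate the unmatched cases mentioned in the comment.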

********************************************************************************