* 2007_panel_eu_silc_d_ver_2020_09.do
*
* STATA Command Syntax File
* Stata 16.1;
*
* Transforms the EU-SILC CSV-data (as released by Eurostat) into a Stata system file
*
* EU-SILC Panel 2007 - / DOI: https://doi.org/10.2907/EUSILC2004-2019V.1
* When publishing statistics derived from the EU-SILC UDB, please state as source:
* "EU-SILC UDB – version of 2020-09"
*
* Household register file:
* This version of the EU-SILC has been delivered in the form of separate country files.
* The following do-file transforms the raw data into a single Stata file using all available country files.
* Country files are delivered in the format UDB_l*country_stub*07D.csv
*
*
* PLEASE NOTE
* For differences between data as described in the guidelines
* and the anonymised user database as well as country specific anonymisation measures see:
* L-2007 DIFFERENCES BETWEEN DATA COLLECTED.doc
*
* (c) GESIS 2021-02-03
* GESIS - Leibniz Institute for the Social Sciences
* German Microdata Lab
* Verena Lichtenberger; Heike Wirth; Valentina Ponomarenko
* https://www.gesis.org/gml/european-microdata/eu-silc/
*
* Contact: heike.wirth@gesis.org

/* Initialization commands */
clear
capture log close
set more off
set linesize 250
set varabbrev off
#delimit ;

* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ;
* CONFIGURATION SECTION - Start ;

* The following command should contain the complete path and
* name of the Stata log file.
* Change LOG_FILENAME to your filename ;
local log_file "LOG_FILENAME" ;

* The following command should contain the complete path where the CSV data files are stored
* Change CSV_PATH to your file path (e.g.: C:/EU-SILC/Longitudinal 2005-2018)
* Use forward slashes and keep path structure as delivered by Eurostat CSV_PATH/COUNTRY/YEAR;
global csv_path "CSV_PATH" ;

* The following command should contain the complete path and
* name of the STATA file, usual file extension "dta".
* Change STATA_FILENAME to your final filename ;
local stata_file "STATA_FILENAME" ;

* CONFIGURATION SECTION - End ;

* There should probably be nothing to change below this line ;
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ;

* The csv file for Greece is not consistently named.
* The following command changes the name on the hard drive in order for the macro to work properly.
* The renaming is permanent and needs only to be executed once for every file.
* Note: rename is a Windows shell command. On macOS or Linux use mv instead ;
cd "$csv_path/EL/2007" ;
shell rename "UDB_lGR07D.csv" "UDB_lEL07D.csv" ;

* Loop to open and convert csv country files into one dta file ;
tempfile temp ;
save `temp', emptyok ;

foreach CC in AT BE BG CY CZ DK EE EL ES FI FR HU IE IS IT LT LU LV NL NO PL PT SE SI SK UK { ;
      cd "$csv_path/`CC'/2007" ;
      import delimited using "UDB_l`CC'07D.csv", case(upper) clear ;
      * In NL, PT, RS and SI DB040 is missing and read as numeric.
      * To prevent errors in the append command, it needs to be set to string ;
      tostring DB040, replace ;
      append using `temp', force ;
      save `temp', replace ;
} ;

* No info on region is available for NL, PT, RS and SI ;
replace DB040="no info" if DB040=="."
      | DB040=="" ;

* Countries in data file are sorted in alphanumeric order ;
sort DB020 ;

log using "`log_file'", replace text ;

* Definition of variable labels ;
label variable DB010 "Year of the survey" ;
label variable DB020 "Country alphanumeric" ;
label variable DB030 "Household ID" ;
label variable DB040 "Region (NUTS 1 or 2)" ;
label variable DB040_F "Flag" ;
label variable DB060 "PSU-1 (First stage)" ;
label variable DB060_F "Flag" ;
label variable DB062 "PSU-2 (Second stage)" ;
label variable DB062_F "Flag" ;
label variable DB070 "Order of selection of PSU" ;
label variable DB070_F "Flag" ;
label variable DB075 "Rotational group" ;
label variable DB075_F "Flag" ;
label variable DB090 "Household cross-sectional weight" ;
label variable DB090_F "Flag" ;
label variable DB100 "Degree of urbanisation" ;
label variable DB100_F "Flag" ;
label variable DB110 "Household status" ;
label variable DB110_F "Flag" ;

* Definition of category labels ;
* DB030
* ID number see construction doc "UDB description" point 8.6.6 ;
label define DB040_F_VALUE_LABELS
-1 "missing"
1 "filled"
;
label define DB060_F_VALUE_LABELS
1 "filled"
-2 "not applicable"
;
label define DB062_F_VALUE_LABELS
1 "filled"
-2 "not applicable"
;
label define DB070_F_VALUE_LABELS
1 "filled"
-2 "not applicable"
;
label define DB075_F_VALUE_LABELS
1 "filled"
-2 "na (no rotational design is used)"
;
label define DB090_F_VALUE_LABELS
1 "filled"
;
label define DB100_VALUE_LABELS
1 "densely populated area"
2 "intermediate area"
3 "thinly populated area"
;
label define DB100_F_VALUE_LABELS
1 "filled"
-1 "missing"
;
label define DB110_VALUE_LABELS
1 "Hhld from prev. wave: At the same address as last interview"
2 "Hhld from prev. wave: Entire household moved to a private hhld within the country"
3 "Hhld no longer in-scope: Entire hhld moved to a collective hhld institution within the country"
4 "Hhld no longer in-scope: Hhld moved outside the country"
5 "Hhld no longer in-scope: Entire household died"
6 "Hhld no longer in-scope: Hhld does not contain sample person"
7 "Address non-contacted (unable to access, lost-no info on record on what happened to the hhld-)"
8 "New household for this wave: Split-off household"
9 "New household for this wave: New address added to the sample this wave or first wave"
10 "Fusion"
11 "Lost household (no information on record on what happened to the hhld)"
;
label define DB110_F_VALUE_LABELS
1 "filled"
;

* Attachment of category labels to variable ;
label values DB040_F DB040_F_VALUE_LABELS ;
label values DB060_F DB060_F_VALUE_LABELS ;
label values DB062_F DB062_F_VALUE_LABELS ;
label values DB070_F DB070_F_VALUE_LABELS ;
label values DB075_F DB075_F_VALUE_LABELS ;
label values DB090_F DB090_F_VALUE_LABELS ;
label values DB100 DB100_VALUE_LABELS ;
label values DB100_F DB100_F_VALUE_LABELS ;
label values DB110 DB110_VALUE_LABELS ;
label values DB110_F DB110_F_VALUE_LABELS ;

label data "Household Register file 2007" ;

compress ;
save "`stata_file'", replace ;

log close ;
set more on ;
#delimit cr