PIAAC Data Analyses: Data Manipulation and Analysis Using Base R
Video 1 - PIAAC Analysis Using Base R: Estimating adult literacy using PVs (Documentation available here)
This online tutorial and associated R script demonstrate how to estimate the mean and standard deviation of adult literacy proficiency scores, along with their standard errors, using Plausible Values (PVs) from PIAAC data. The code selects the relevant PV and weight variables, restricts the dataset to complete cases, and defines custom functions that compute weighted estimates while accounting for both imputation variance (across PVs) and sampling variance (through replicate weights). The approach follows the PIAAC methodology and can be adapted to country-specific data structures and replication designs.
Citation: Courtney, M.G.R. & Nurumov, K. (2025). PIAAC Data Analyses: Data Manipulation and Analysis Using Base R. Online Tutorial, Video 1 – PIAAC Analysis Using Base R: Estimating adult literacy using PVs. GESIS – Leibniz Institute for the Social Sciences, Mannheim. Available at https://www.gesis.org/data-on-adult-education/workshops
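The estimation described for Video 1 can be sketched roughly as follows. This is a minimal illustration on simulated data, not the tutorial's own script: the helper names (`wmean`, `wsd`, `pv_estimate`), the delete-one-group jackknife weights, and the multiplier `jk_factor` are assumptions made for the example. In real PIAAC data the variance multiplier depends on the country's replication design and should be taken from the PIAAC documentation.

```r
set.seed(1)
n <- 400; n_pv <- 10; n_rep <- 80

# Simulated stand-in for a PIAAC extract (NOT real data)
theta <- rnorm(n, mean = 270, sd = 45)                       # latent proficiency
pv    <- sapply(seq_len(n_pv), function(i) theta + rnorm(n, 0, 15))
w0    <- runif(n, 0.5, 2)                                    # full-sample weight

# Assumed replication scheme: delete-one-group jackknife with 80 groups
grp <- rep(seq_len(n_rep), length.out = n)
wr  <- sapply(seq_len(n_rep), function(r)
  ifelse(grp == r, 0, w0 * n_rep / (n_rep - 1)))
jk_factor <- (n_rep - 1) / n_rep   # assumption: must match the actual design

wmean <- function(x, w) sum(w * x) / sum(w)
wsd   <- function(x, w) sqrt(sum(w * (x - wmean(x, w))^2) / sum(w))

# Combine a weighted statistic across PVs and replicate weights
pv_estimate <- function(stat, pv, w0, wr, jk_factor) {
  m      <- ncol(pv)
  est_pv <- apply(pv, 2, stat, w = w0)          # one estimate per PV
  samp_var_pv <- sapply(seq_len(m), function(i) {
    rep_est <- apply(wr, 2, function(w) stat(pv[, i], w))
    jk_factor * sum((rep_est - est_pv[i])^2)    # sampling variance for PV i
  })
  samp_var  <- mean(samp_var_pv)                # average over PVs
  imp_var   <- var(est_pv)                      # between-PV (imputation) variance
  total_var <- samp_var + (1 + 1 / m) * imp_var
  c(estimate = mean(est_pv), se = sqrt(total_var))
}

res_mean <- pv_estimate(wmean, pv, w0, wr, jk_factor)
res_sd   <- pv_estimate(wsd,   pv, w0, wr, jk_factor)
print(round(rbind(mean = res_mean, sd = res_sd), 2))
```

The same `pv_estimate` function serves both the mean and the standard deviation because the statistic is passed in as an argument; only the weighted statistic changes, while the variance combination stays the same.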
Video 2 - PIAAC Analysis Using Base R: Estimating hourly earnings (non-PVs) (Documentation available here)
This online tutorial and associated R script illustrate how to estimate the weighted mean and standard deviation of a non-plausible value (non-PV) variable, along with their standard errors, using PIAAC data. The example variable is hourly earnings including bonuses. The analysis selects the variable and its associated replicate weights, filters to complete cases, and applies custom functions that account for sampling variance through the replication method. Because the variable is not based on plausible values, there is no imputation variance component; the functions estimate central tendency and dispersion for continuous variables in a way that is consistent with the PIAAC weighting and variance estimation framework.
Citation: Courtney, M.G.R. & Nurumov, K. (2025). PIAAC Data Analyses: Data Manipulation and Analysis Using Base R. Online Tutorial, Video 2 – PIAAC Analysis Using Base R: Estimating hourly earnings. GESIS – Leibniz Institute for the Social Sciences, Mannheim. Available at https://www.gesis.org/data-on-adult-education/workshops
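For a non-PV variable, the workflow described for Video 2 might look like the sketch below. It runs on simulated data; the column names `EARNHRBONUS`, `SPFWT0`, and `SPFWT1` to `SPFWT80` are assumed here to mirror the PIAAC public-use files, and the helper `rep_estimate`, the jackknife weights, and `jk_factor` are illustrative assumptions rather than the tutorial's own code.

```r
set.seed(2)
n <- 400; n_rep <- 80

# Simulated stand-in for a PIAAC extract (NOT real data)
earn <- rlnorm(n, meanlog = 3, sdlog = 0.5)
earn[sample(n, 40)] <- NA                       # some respondents lack earnings
w0  <- runif(n, 0.5, 2)
grp <- rep(seq_len(n_rep), length.out = n)
wr  <- sapply(seq_len(n_rep), function(r)
  ifelse(grp == r, 0, w0 * n_rep / (n_rep - 1)))
colnames(wr) <- paste0("SPFWT", seq_len(n_rep)) # assumed replicate-weight names
dat <- data.frame(EARNHRBONUS = earn, SPFWT0 = w0, wr)

# Keep only rows with no missing values
dat <- dat[complete.cases(dat), ]
rep_cols  <- paste0("SPFWT", seq_len(n_rep))
jk_factor <- (n_rep - 1) / n_rep   # assumption: must match the actual design

wmean <- function(x, w) sum(w * x) / sum(w)
wsd   <- function(x, w) sqrt(sum(w * (x - wmean(x, w))^2) / sum(w))

# Point estimate from the full-sample weight, SE from the replicate weights
rep_estimate <- function(stat, x, w0, wr, jk_factor) {
  est     <- stat(x, w0)
  rep_est <- apply(wr, 2, function(w) stat(x, w))
  c(estimate = est, se = sqrt(jk_factor * sum((rep_est - est)^2)))
}

wr_mat    <- as.matrix(dat[rep_cols])
earn_mean <- rep_estimate(wmean, dat$EARNHRBONUS, dat$SPFWT0, wr_mat, jk_factor)
earn_sd   <- rep_estimate(wsd,   dat$EARNHRBONUS, dat$SPFWT0, wr_mat, jk_factor)
print(round(rbind(mean = earn_mean, sd = earn_sd), 3))
```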
Video 3 - PIAAC Analysis Using Base R: Estimating differences between two means using PVs (Documentation available here)
This online tutorial and associated R script illustrate how to estimate the difference between two group means for adult literacy proficiency scores using Plausible Values (PVs) from PIAAC data. The script selects the PV and weight variables, incorporates a binary grouping variable (e.g., gender), and applies replicate weights to compute weighted means for each group. Custom functions account for both sampling and imputation variance, yielding the mean difference, its standard error, and the corresponding significance test. The approach provides a way to compare literacy performance between demographic groups that follows the PIAAC variance estimation framework.
Citation: Courtney, M.G.R. & Nurumov, K. (2025). PIAAC Data Analyses: Data Manipulation and Analysis Using Base R. Online Tutorial, Video 3 – PIAAC Analysis Using Base R: Estimating differences between two means using PVs. GESIS – Leibniz Institute for the Social Sciences, Mannheim. Available at https://www.gesis.org/data-on-adult-education/workshops
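One way to sketch the group comparison described for Video 3 is to treat the difference itself as the statistic and recompute it for every PV and every replicate weight. The example below uses simulated data; the grouping codes (1 and 2), the helper `wdiff`, the jackknife weights, and `jk_factor` are assumptions, and the p-value uses a normal approximation, which may differ from the reference distribution used in the tutorial.

```r
set.seed(3)
n <- 800; n_pv <- 10; n_rep <- 80

# Simulated stand-in for a PIAAC extract (NOT real data)
gender <- sample(1:2, n, replace = TRUE)        # assumed coding: 1 and 2
theta  <- rnorm(n, 270, 45) + ifelse(gender == 1, 5, -5)
pv     <- sapply(seq_len(n_pv), function(i) theta + rnorm(n, 0, 15))
w0     <- runif(n, 0.5, 2)
grp    <- rep(seq_len(n_rep), length.out = n)
wr     <- sapply(seq_len(n_rep), function(r)
  ifelse(grp == r, 0, w0 * n_rep / (n_rep - 1)))
jk_factor <- (n_rep - 1) / n_rep   # assumption: must match the actual design

wmean <- function(x, w) sum(w * x) / sum(w)
# Weighted mean of group 1 minus weighted mean of group 2
wdiff <- function(x, w, g)
  wmean(x[g == 1], w[g == 1]) - wmean(x[g == 2], w[g == 2])

# Difference per PV using the full-sample weight
diff_pv <- sapply(seq_len(n_pv), function(i) wdiff(pv[, i], w0, gender))

# Sampling variance: replicate differences per PV, averaged over PVs
samp_var <- mean(sapply(seq_len(n_pv), function(i) {
  rep_diff <- apply(wr, 2, function(w) wdiff(pv[, i], w, gender))
  jk_factor * sum((rep_diff - diff_pv[i])^2)
}))
imp_var <- var(diff_pv)                         # imputation variance

diff_est <- mean(diff_pv)
diff_se  <- sqrt(samp_var + (1 + 1 / n_pv) * imp_var)
z_value  <- diff_est / diff_se
p_value  <- 2 * pnorm(-abs(z_value))            # normal approximation (assumption)
print(round(c(difference = diff_est, se = diff_se, z = z_value, p = p_value), 4))
```

Recomputing the difference within each replicate, rather than combining two separately estimated standard errors, lets the replicate weights carry any covariance between the two group means.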
Video 4 - PIAAC Analysis Using Base R: Estimating differences between two means (hourly earnings, non-PVs) (Documentation available here)
This online tutorial and associated R script show how to estimate differences in hourly earnings between two groups using PIAAC data while accounting for the complex sampling design. The script applies replicate weights to compute weighted means, standard errors, and significance tests for non-plausible value (non-PV) variables.
Citation: Courtney, M.G.R. & Nurumov, K. (2025). PIAAC Data Analyses: Data Manipulation and Analysis Using Base R. Online Tutorial, Video 4 – PIAAC Analysis Using Base R: Estimating differences between two means (hourly earnings). GESIS – Leibniz Institute for the Social Sciences, Mannheim. Available at https://www.gesis.org/data-on-adult-education/workshops
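For the non-PV comparison described for Video 4, only the sampling variance is needed, so the calculation reduces to a single pass over the replicate weights. The sketch below uses simulated earnings; the group coding, the jackknife weights, `jk_factor`, and the normal approximation for the p-value are all assumptions made for illustration.

```r
set.seed(4)
n <- 800; n_rep <- 80

# Simulated stand-in for a PIAAC extract (NOT real data)
gender <- sample(1:2, n, replace = TRUE)        # assumed coding: 1 and 2
earn   <- rlnorm(n, meanlog = ifelse(gender == 1, 3.1, 3.0), sdlog = 0.5)
w0     <- runif(n, 0.5, 2)
grp    <- rep(seq_len(n_rep), length.out = n)
wr     <- sapply(seq_len(n_rep), function(r)
  ifelse(grp == r, 0, w0 * n_rep / (n_rep - 1)))
jk_factor <- (n_rep - 1) / n_rep   # assumption: must match the actual design

wmean <- function(x, w) sum(w * x) / sum(w)
wdiff <- function(x, w, g)
  wmean(x[g == 1], w[g == 1]) - wmean(x[g == 2], w[g == 2])

# Full-sample difference and one replicate difference per replicate weight
earn_diff <- wdiff(earn, w0, gender)
rep_diff  <- apply(wr, 2, function(w) wdiff(earn, w, gender))

# No imputation variance term: earnings are observed, not plausible values
earn_se <- sqrt(jk_factor * sum((rep_diff - earn_diff)^2))
earn_z  <- earn_diff / earn_se
earn_p  <- 2 * pnorm(-abs(earn_z))              # normal approximation (assumption)
print(round(c(difference = earn_diff, se = earn_se, z = earn_z, p = earn_p), 4))
```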
Video 5 - PIAAC Analysis Using Base R: Regression using PVs as dependent and independent variable (Documentation available here)
This online tutorial and accompanying R script demonstrate regression-based analyses using Plausible Values (PVs) in the PIAAC study to examine adult literacy and reading behavior. Two applications are illustrated: (I) PVs as dependent variables to model literacy proficiency, and (II) PVs as independent variables to predict reading activities at home. The script guides users through importing and preparing PIAAC data, applying full-sample and replicate weights, and estimating regression coefficients across all PVs. It computes both sampling and imputation variances, derives standard errors, t-values, and adjusted degrees of freedom using Welch–Satterthwaite and Johnson–Rust methods, and calculates corresponding p-values. The workflow produces statistically robust, population-level estimates consistent with international large-scale assessment standards.
Citation: Courtney, M.G.R. & Nurumov, K. (2025). PIAAC Data Analyses: Data Manipulation and Analysis Using Base R. Online Tutorial, Video 5 – PIAAC Analysis Using Base R: Regression using PVs as dependent and independent variable. GESIS – Leibniz Institute for the Social Sciences, Mannheim. Available at https://www.gesis.org/data-on-adult-education/workshops
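Application (I) from Video 5, with PVs as the dependent variable, can be sketched by fitting one weighted regression per PV and per replicate weight and then combining the coefficients. The example runs on simulated data; the predictor `educ`, the helper `fit_coef`, the jackknife weights, and `jk_factor` are hypothetical. The Welch–Satterthwaite and Johnson–Rust degrees-of-freedom adjustments mentioned in the description are deliberately left out, and a normal approximation stands in for the p-values.

```r
set.seed(5)
n <- 400; n_pv <- 10; n_rep <- 80

# Simulated stand-in for a PIAAC extract (NOT real data)
educ  <- rnorm(n, 12, 3)                        # hypothetical predictor
theta <- 200 + 6 * educ + rnorm(n, 0, 35)
pv    <- sapply(seq_len(n_pv), function(i) theta + rnorm(n, 0, 15))
w0    <- runif(n, 0.5, 2)
grp   <- rep(seq_len(n_rep), length.out = n)
wr    <- sapply(seq_len(n_rep), function(r)
  ifelse(grp == r, 0, w0 * n_rep / (n_rep - 1)))
jk_factor <- (n_rep - 1) / n_rep   # assumption: must match the actual design

# Weighted least-squares coefficients for one outcome vector and one weight
fit_coef <- function(y, w) coef(lm(y ~ educ, weights = w))

# Coefficients per PV with the full-sample weight (rows = coefficients)
est_pv <- sapply(seq_len(n_pv), function(i) fit_coef(pv[, i], w0))

# Sampling variance per coefficient and PV, then averaged over PVs
samp_var_pv <- sapply(seq_len(n_pv), function(i) {
  rep_est <- apply(wr, 2, function(w) fit_coef(pv[, i], w))
  jk_factor * rowSums((rep_est - est_pv[, i])^2)
})
samp_var <- rowMeans(samp_var_pv)
imp_var  <- apply(est_pv, 1, var)               # between-PV variance

b  <- rowMeans(est_pv)
se <- sqrt(samp_var + (1 + 1 / n_pv) * imp_var)
reg <- data.frame(estimate = b, se = se, t = b / se,
                  p = 2 * pnorm(-abs(b / se)))  # normal approximation (assumption)
print(round(reg, 4))
```

Application (II), with PVs as predictors, would follow the same loop structure with the PV column moved to the right-hand side of the formula; it is not shown here.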