Dataset: CIS 3 - Cross-sectional
The weighting factors should have been based on the ratio between the number of enterprises or employees in the realised sample and the total number of enterprises or employees in each stratum of the frame population, after correction for enterprises that no longer existed and for reclassification in terms of size or NACE (and after adjustment for non-response). Where a non-response analysis was carried out, its results were used in the calculation of the weighting factors.
Depending on the country, the weight to use is either Weight or Weightnr (variable "w2use") in the CIS 3 dataset.
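The stratum-level ratio described above can be illustrated with a minimal sketch. It assumes the conventional direction of the ratio (frame population count divided by realised sample count), which the text does not state explicitly; the function name and the stratum labels are hypothetical.

```python
def stratum_weights(frame_counts, realised_counts):
    """Return one weighting factor per stratum.

    Assumption: weight = corrected frame population count / realised sample count.
    Both dictionaries map a stratum label to a count of enterprises (or employees).
    """
    weights = {}
    for stratum, population in frame_counts.items():
        realised = realised_counts.get(stratum, 0)
        if realised <= 0:
            # A stratum without respondents cannot be weighted this way.
            raise ValueError(f"no realised sample in stratum {stratum!r}")
        weights[stratum] = population / realised
    return weights


# Hypothetical strata: (NACE division, size class)
example = stratum_weights(
    {("24", "small"): 100, ("24", "large"): 30},
    {("24", "small"): 20, ("24", "large"): 30},
)
```

In this sketch a fully enumerated stratum receives a weight of 1, and a stratum where one in five enterprises responded receives a weight of 5.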
The integrated CIS 3 dataset includes data from 15 participating countries:
- Czech Republic
Total N=49761. Total number of variables = 141.
The CIS 3 anonymisation method
The main method used to achieve anonymisation is the micro-aggregation process (MAP), which modifies the individual data so that an enterprise can no longer be identified as such, i.e. identifying a respondent (enterprise) by its exact values is not feasible.
The CIS 3 anonymisation method is structured in the following work steps: pre-work on the data, micro-aggregation, global recoding, evaluation of the disclosure risk, data suppression, and release of the final data file.
For the pre-work part, the micro-data for each country were aggregated to the NACE 2-digit level and to 3 size classes (small, medium, large). The regional dimension was collapsed and data were only aggregated at national level. The enterprise identifier (Id), stratum A (strA) and stratum B (StrB) were removed. Some other variables were recoded.
In the first work step, the 14 metric variables were micro-aggregated using the individual ranking method. Only one metric variable at a time is considered, and its values are ranked in ascending order. The observations are then grouped in threes, and each value is replaced with the weighted mean of its group.
After micro-aggregating the numeric variables of all country files, the frequencies were calculated for three variables: Nace, Nuts and Size00. They were then also calculated for Nace, Nuts, Size00 and Size98. The variables were recoded so that at least three enterprises exist for each combination. The sum of the weights was calculated for the same combinations of variables, and the variables were recoded so that for each combination the sum of the weights is at least 20.
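The individual ranking step described above can be sketched as follows. This is an illustration under stated assumptions, not the Eurostat implementation: the function name is hypothetical, and the handling of a leftover tail of fewer than three observations (merged into the last group) is an assumption, since the text does not say how such cases were treated.

```python
def microaggregate(values, weights, group_size=3):
    """Micro-aggregate one metric variable by individual ranking.

    Values are ranked in ascending order, grouped in runs of `group_size`,
    and each value is replaced by the weighted mean of its group.
    Assumption: a short tail is merged into the last full group.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices in ascending value order
    result = [None] * n
    start = 0
    while start < n:
        end = start + group_size
        if n - end < group_size:
            end = n  # absorb a tail that is too small to form its own group
        group = order[start:end]
        total_weight = sum(weights[i] for i in group)
        mean = sum(values[i] * weights[i] for i in group) / total_weight
        for i in group:
            result[i] = mean
        start = end
    return result


# Hypothetical turnover values for six enterprises, all with weight 1
aggregated = microaggregate([10, 1, 4, 7, 100, 2], [1, 1, 1, 1, 1, 1])
```

Because each variable is ranked and grouped separately, an enterprise generally falls into different groups for different variables, which is what distinguishes individual ranking from multivariate micro-aggregation.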
After the micro-aggregation, the disclosure risk was evaluated. By computing the p-percent, two cases could be distinguished. If the Enterprise_Id of one enterprise was mentioned more than 3 times, recoding was needed. The method applied was the same as the one used for the size frequencies and the weight frequencies. Otherwise, no recoding was done.
The cells with a p-percent of at least 15% were kept. For the remaining cells, if the Enterprise_Id of one enterprise was mentioned more than 3 times and if at least three of the variables Turn, Turn98, Exp, Exp98, Paval, Rtot and Invta were concerned, the entire enterprise was removed from the micro dataset. Otherwise, only the cells concerned were removed.
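The frequency rules referred to here (at least three enterprises per combination, and a sum of weights of at least 20) can be checked with a short sketch. The function name and record layout are assumptions for illustration; the weight is read from "w2use" as named in the dataset description, and the sketch only flags combinations that would need recoding rather than performing the recoding itself.

```python
from collections import defaultdict


def unsafe_combinations(records, keys, weight_key="w2use",
                        min_count=3, min_weight=20.0):
    """Return the key combinations that violate either frequency rule.

    `records` is a list of dicts, one per enterprise (assumed layout).
    A combination is flagged if it has fewer than `min_count` enterprises
    or if its weights sum to less than `min_weight`.
    """
    counts = defaultdict(int)
    weight_sums = defaultdict(float)
    for record in records:
        combo = tuple(record[k] for k in keys)
        counts[combo] += 1
        weight_sums[combo] += record[weight_key]
    return sorted(
        combo for combo in counts
        if counts[combo] < min_count or weight_sums[combo] < min_weight
    )


# Hypothetical records: three combinations, two of which break a rule
records = (
    [{"Nace": "24", "Nuts": "CZ", "Size00": "small", "w2use": 10.0}] * 3
    + [{"Nace": "25", "Nuts": "CZ", "Size00": "small", "w2use": 5.0}] * 3
    + [{"Nace": "26", "Nuts": "CZ", "Size00": "large", "w2use": 50.0}] * 2
)
flagged = unsafe_combinations(records, ["Nace", "Nuts", "Size00"])
```

Flagged combinations would then be merged with neighbouring categories (global recoding) until no combination is returned.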
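The suppression decision described above can be written as a small rule. This sketch assumes that the p-percent values have already been computed and that each enterprise contributes at most one cell per variable, so that "mentioned more than 3 times" means the enterprise appears in more than three cells below the threshold. The function name and the variable "OtherVar" are hypothetical.

```python
# Variables named in the anonymisation summary
KEY_VARIABLES = {"Turn", "Turn98", "Exp", "Exp98", "Paval", "Rtot", "Invta"}


def suppression_plan(p_percent_by_variable, threshold=15.0,
                     max_mentions=3, min_key_hits=3):
    """Decide what to suppress for one enterprise.

    `p_percent_by_variable` maps a variable name to the p-percent (in %)
    of the cell this enterprise contributes to (assumed input layout).
    Returns ("remove_enterprise", cells) or ("suppress_cells", cells),
    where `cells` lists the variables whose p-percent is below the threshold.
    """
    risky = sorted(v for v, p in p_percent_by_variable.items() if p < threshold)
    key_hits = [v for v in risky if v in KEY_VARIABLES]
    if len(risky) > max_mentions and len(key_hits) >= min_key_hits:
        return ("remove_enterprise", risky)
    return ("suppress_cells", risky)


# Hypothetical enterprise with four risky cells, three of them key variables
plan = suppression_plan(
    {"Turn": 5.0, "Exp": 10.0, "Rtot": 3.0, "OtherVar": 2.0, "Invta": 40.0}
)
```

An enterprise with no cells below the threshold yields an empty cell list, meaning nothing is suppressed for it.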
All removed cells are represented by “N.A.”.
Eurostat. THE THIRD COMMUNITY INNOVATION SURVEY (CIS 3) SUMMARY OF THE ANONYMISATION METHOD. Available on the Eurostat CIS 3 DVD.