Research Data Management
Week 1 (03-07 October 2022)
Lecturers: Mahadia Tunga; Somoe Mkwachu; Ignace Kabano, PhD
About the lecturers
Mahadia Tunga is a co-founder of the Tanzania Data Lab (dLab), which aims to strengthen the data ecosystem in Tanzania and Africa through capacity development. Mahadia is trained as a computer scientist and specialised in data science. She has extensive experience in managing capacity development projects and gender-based and youth engagement programs, with a special interest in young girls. Sample projects can be found on Twitter (@dlabtz and @TungaMahadia) and at https://bit.ly/2K1kA3e and https://bit.ly/2lHPO2q. Since 2015, Mahadia has delivered a number of strategic consultancies in research and capacity building programs on open data, data innovation, management, visualisation and analysis to several governments, non-governmental organisations and private entities. She has trained over 2000 individuals and 50 organisations in Tanzania, Uganda, Congo, South Africa, Egypt, and other countries.
Somoe Mkwachu is a Data Trainer and Program Coordinator at the Tanzania Data Lab (dLab). She has over 15 years of experience in training, consulting and program management. She works at the University of Dar es Salaam Computing Centre, where she conducts training and leads the Consultancy Unit. She has delivered strategic consultancies with organizations such as the Bank of Tanzania, Tanzania National Park, Tanzania National Roads, Vocational Education and Training Authority, Tanzania Revenue Authority and President’s Office – Public Service Management. She has also been actively involved in organizing and conducting data science training programs at the Tanzania Data Lab project since March 2016.
Kabano H. Ignace is a senior lecturer of Demography and Statistics in the Department of Applied Statistics, School of Economics, College of Business and Economics, and is the head of training at the African Centre of Excellence in Data Science at the University of Rwanda. He holds an MSc in Demography from the State University of Groningen and a PhD in Demography from Utrecht in the Netherlands. He has completed a Training of Trainers (ToT) on Data Management with Software Application from Harvard University (USA). With 18 years in academia, he has taught numerous courses related to demography, statistics, economics and research methodology in both the social and data sciences. He is an expert on resettlement action plans and has consulted for public and private institutions, international organizations and local NGOs.
Short course description
The course “Research Data Management” introduces participants to the strategies, processes, and measures required to assure the quality, understandability, and (re)usability of research data from an Open Science and Open Data perspective. Not only is the replicability of research data and research findings considered an integral part of good scientific practice; more and more research funders also require active data management to ensure that data is of high quality and can be re-used by researchers for new research purposes. Participants will gain relevant information on openness in science and replicability, helping them ensure that their research data is FAIR (findable, accessible, interoperable, and reusable).
The course will cover a) researching data, b) data management plans, c) data collection, d) data processing, e) ethical and legal aspects of data sharing, f) documentation and metadata, g) data storing and archiving. The course will introduce the FAIR principles to guide researchers in creating re-usable research data, increasing transparency as well as replicability of research findings.
Each day comprises six hours of classroom instruction, combining lectures on the theoretical foundations of the literature with discussions and practical examples, giving participants the opportunity to discuss their own research projects and data.
The course is targeted at researchers and practitioners who produce qualitative or quantitative data and want to learn how to manage this data efficiently and ensure its reusability, or who work with quantitative data and want to understand how the FAIR principles can be implemented in their research. In the course, participants will develop a) familiarity with the idea of Open Science and the principles of FAIR data; b) an understanding of research data management; c) the skills to set up and implement a data management plan; d) the ability to handle research data efficiently; and e) the skills to prepare data in a way that makes it re-usable by other researchers.
Day 1 | Introduction to Open Science, the FAIR principles, and research data management
Day 2 | Exploring existing data sources; ethical and legal aspects of data collection and sharing
Day 3 | Data collection and cleaning
Day 4 | Preparing data for reuse
Day 5 | Data storing, archiving, and sharing
Course prerequisites
Participants should be experienced in working with quantitative research data and be well-versed in using one of the main statistical software packages, such as Stata, SPSS or R.
Target group
Participants will find the course useful if they:
- are social science researchers at an early stage of study planning or data collection, working with quantitative data (principal investigators, researchers who are part of project teams, individual researchers and PhD students);
- are faced with challenges related to data protection, data cleaning and documentation and have little experience in dealing with them so far;
- aim to share their data for re-use after the end of the research project and/or want to learn how to ensure reproducibility of their research findings.
Course and learning objectives
By the end of the course participants will:
- have gained a basic understanding of research data management in social science research within the larger data lifecycle;
- be familiar with techniques of data cleaning and data documentation, as well as with preparing their data for re-use;
- be aware of ethical and legal challenges to data sharing resulting from data protection regulations and intellectual property rights;
- be familiar with applying re-use licenses to their data.