
Data Management for Wits: About us

The following is general advice; data varies hugely between types of research and projects.

Services Offered

                                            Data Cleaning
  •  Cleaning refers to identifying incomplete or incorrect parts of the data and then replacing, modifying, or deleting the data that detracts from the meaning of the data set. It is also a process of creating, removing, and structuring variables so that a given statistical test or analytical procedure becomes possible.
    •After cleaning, a data set should conform to a schema and to the assigned meaning of the variables. Informally, it should connect to at least the operational terms within the research question.
    •The actual process varies according to the data. For text, it may involve removing typographical errors or validating and correcting values against a known list of entities.
    •It is an iterative process and, unfortunately, can itself introduce errors.
    •Data wrangling also involves version control and merging different rounds of data collection, as well as merging, splitting, and re-merging variables.
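As a rough illustration of validating and correcting values against a known list, here is a minimal Python sketch. The field names, the list of provinces, and the table of corrections are all hypothetical and exist only for this example; a real project would build them from its own schema.

```python
# Hypothetical reference list of valid values for a "province" field.
KNOWN_PROVINCES = {"Gauteng", "Limpopo", "Western Cape"}

# Hypothetical corrections for common typos; in practice this table
# grows iteratively as new errors are found.
CORRECTIONS = {
    "gauteng": "Gauteng",
    "guateng": "Gauteng",
    "limpopo": "Limpopo",
    "western cape": "Western Cape",
}


def clean_rows(rows):
    """Return (cleaned, rejected) lists without changing the input rows."""
    cleaned, rejected = [], []
    for row in rows:
        province = (row.get("province") or "").strip()
        age = (row.get("age") or "").strip()
        # Correct known typos; unknown values pass through unchanged.
        province = CORRECTIONS.get(province.lower(), province)
        # Reject rows that are incomplete or fail validation, and keep
        # them for review rather than deleting them silently.
        if province not in KNOWN_PROVINCES or not age.isdigit():
            rejected.append(row)
            continue
        cleaned.append({"id": row["id"], "province": province, "age": int(age)})
    return cleaned, rejected


raw = [
    {"id": 1, "province": " guateng ", "age": "34"},
    {"id": 2, "province": "Limpopo", "age": ""},
    {"id": 3, "province": "Atlantis", "age": "51"},
    {"id": 4, "province": "Western Cape", "age": "27"},
]
cleaned, rejected = clean_rows(raw)
```

Keeping the rejected rows, and leaving the raw data untouched, makes it easier to spot errors that the cleaning step itself introduces.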

                                     Data Analysis

  • Data analysis is a process of testing the information against theoretical constructs. These can be as organized as a hypothesis or as general as a theoretical framework. It is also a process of bringing order, structure, and meaning to the mass of collected data, which can be called making meaning.

    •There are two sides to analysis; one involves following the standards of a discipline. In the humanities, this involves the application of empathy alongside critical thinking. In science, there is an element of puzzle solving.

    •Data visualization is a creative process that makes a meaningful story, often for publication, out of a mass of results.

    •Data mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes. (ref)

Research Data Management

  • Data Management is a project management process that guides the researcher in acquiring, validating, storing, protecting, and processing required data to ensure the accessibility, reliability, and timeliness of the data for its users.
  • Research Data Management services refer in particular to cleaning, uploading, and organizing data produced in particular investigations or research projects, and to its ongoing storage, access, and preservation.
  • These services support the full data lifecycle, including data management planning, digital citation, and metadata creation and conversion.
  • The aim of the RDM services is to ensure research integrity, to enable the use of existing data for future research endeavors, and to ensure that researchers are compliant with funder and publisher requirements.
  • Data management software solutions make processing, validation, and other essential functions simpler and less time-intensive.

                                              Benefits of Data Management  click here.

         Data Interpretation and Visualization
  • Data interpretation is a difficult two-fold process. The researcher needs to understand, as objectively as possible, what the data they have collected is telling their field of study. Equally, the researcher needs to fit the data into an overall theoretical framework that gives the partial information meaning.
  • Interpreting data is a primary research skill; data services help with resources and software to assist the researcher.
  • In particular, we encourage the researcher to explore data visualization and its tools. We also encourage researchers to use the insights of data journalists on how to make data interesting.


Metadata is a set of data that describes and gives information about other data. It can be as simple as the author, date, and title. With research data, it typically involves a description of the type of data, the collection methods, and the variables. The links below have examples.

Put another way, metadata is data [information] that provides information about other data.

  • Three distinct types of metadata exist: descriptive metadata, structural metadata, and administrative metadata.
  • Descriptive metadata describes a resource for purposes such as discovery and identification. It can include elements such as title, abstract, author, and keywords.
  • Structural metadata is metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters.
  • Administrative metadata provides information to help manage a resource, such as when and how it was created, file type and other technical information, and who can access it.
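To make the three types concrete, here is a minimal Python sketch of a metadata record for a single data set. Every field name and value is hypothetical and follows no particular metadata standard; a repository or funder will normally prescribe its own schema.

```python
import json

# Hypothetical metadata record, grouped by the three types described above.
record = {
    "descriptive": {
        # Helps others discover and identify the data set.
        "title": "Example household survey",
        "author": "A. Researcher",
        "keywords": ["survey", "households"],
    },
    "structural": {
        # Describes how the parts of the data set fit together.
        "files": ["round1.csv", "round2.csv", "codebook.pdf"],
        "variables": ["id", "province", "age"],
    },
    "administrative": {
        # Helps manage the resource: creation, format, and access.
        "created": "2020-01-15",
        "file_type": "text/csv",
        "access": "restricted",
    },
}

# Serialize the record so it can be stored alongside the data files.
metadata_json = json.dumps(record, indent=2)
print(metadata_json)
```

Storing a record like this next to the data files means the description travels with the data when it is shared or archived.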

Why do Research Data Management?

Researchers tell us that they do data management for two reasons:

  1. They want to, or they want their students to, because they need to make sure that the data does not become a huge mess. This is really important if you have a complex project or you are a first-time researcher. It's really a form of project management. In that case, we mostly want to make sure your data management plan is reasonable and you have the right software for the job.
  2. They have to, because their funder, supervisor, or journal has requirements. In that case, we are here to make compliance as painless as possible, to work as much as possible within your current workflow, and to make sure what you do has scientific validity. We know this is not easy, so we work with the E-research office to get you the resources you need. PLEASE contact us when you are writing your grant, not afterward.

What do you need help with?
Publication in journal: 2 votes (4.26%)
NRF: 1 vote (2.13%)
Thesis: 9 votes (19.15%)
Ethics: 4 votes (8.51%)
Qual software: atlas-ti: 18 votes (38.3%)
DMP: 2 votes (4.26%)
Training: HELP: 0 votes (0%)
Supervisor's requirements for data management: 6 votes (12.77%)
DOI: 0 votes (0%)
I CANT SHARE: 5 votes (10.64%)
Total Votes: 47

Books recommended

Definition of Research Data

An Introduction to the Basics of Research Data Management (YouTube video by Louise Patterton)

A great deal of energy has been put into defining the difference between quantitative and qualitative data. Different authors use definitions that vary across fields. We would say that, for the purposes of helping you work with your data, the distinction does not matter and can be a distraction.

Data is that which you can use to create a research result, or which underlies the result. It is the information that you, as a researcher in a field of inquiry, can legitimately point to in making arguments and coming to conclusions. It is, in other words, the stuff you study. Whatever that stuff is, so long as it gets you to a result, it's data.

For the purposes of data services, quantitative data is data to which you can apply statistical tests. It matters immensely whether it is ordinal or not. It is important to check whether it is normally distributed; in other words, does it meet the assumptions that underlie the particular test? However, even if it is numbers, if you can't use statistics on it then it is not useful to see it as quantitative.
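As one rough way to screen a variable before choosing a test, the Python sketch below computes sample skewness using only the standard library. It is a quick screen, not a formal normality test, and the cut-off of 1 is an assumed rule of thumb rather than a standard; the example values are made up.

```python
import statistics


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


def looks_roughly_symmetric(values, threshold=1.0):
    """Assumed rule of thumb: |skewness| below the threshold."""
    return abs(skewness(values)) < threshold


symmetric = [1, 2, 3, 4, 5]  # made-up values, evenly spread
skewed = [1, 1, 1, 2, 10]    # made-up values with a long right tail

print(looks_roughly_symmetric(symmetric))
print(looks_roughly_symmetric(skewed))
```

A strongly skewed variable is a prompt to look at a plot of the data and to consider a test with different assumptions, not a verdict on its own.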

Qualitative data is data to which you can apply the processes and theories of qualitative inquiry. This means that if you are doing policy work, a website is data; if you are doing computer science, the download statistics of that website are data.

If you feel our definition is too basic, then take a look at the video. It was made by the NeDICC group, mainly data librarians, of which Wits is a member. We are scattered around the country, so if you need help and you are not at Wits, remember that we can always refer you to a data librarian near you. For a lot more useful information from NeDICC, click here.