Data Management for Wits: Research Data Management for Wits Postgraduates

The following is general advice; data varies hugely between types of research and projects.

What data will your research generate?

What data will your research generate, and what is your plan for managing the data?

“Data” is defined as materials generated or collected during the course of conducting research. Examples of humanities data could include citations, software code, algorithms, digital tools, documentation, databases, geospatial coordinates (for example, from archaeological digs), reports, and articles. Excluded, however, are things such as preliminary analyses, drafts of papers, plans for future research, peer-review assessments, communications with colleagues, materials that must remain confidential until they are published, and information whose release would result in an invasion of personal privacy (for example, information that could be used to identify a particular person who was one of the subjects of a research study).

Source: Data Management Plans for NEH Office of Digital Humanities

What does Wits offer?

Mainly, Wits offers SAFE, SECURE, LEGAL DATA STORAGE.

PLEASE USE IT: Wits researchers have 1 terabyte of secure storage available via their staff or student G-drive account. Additionally, the university can provide access to HPC services and data transfer facilities. Big data and other specialized requirements can be discussed and assessed on an individual basis with the research office. You need to send a request to E-research with your costs attached. We can help you work those out.


You might ask, as many have: why bother? I have a backup.

The problem is that Dropbox is a general file manager. Research data needs to be stored with quite a lot of care, and with its data documentation and metadata attached. Why? It needs to be secure because that is part of acting with research integrity, and secure data needs some organization, even if only so that we know which parts are personal, which parts have ID numbers, and so on. What do we mean by secure? We mean that the data, especially while you are using it, is protected from anyone who should not be looking at it, and in that way is ethically managed. You cannot agree to let anyone else own or look at personal data that you have collected. There is no doubt that Dropbox, Google Drive and even just emailing your Gmail account are great for organizing your life or even your PhD, but once you put the data up there, those services might end up seeing it.

SO, remember that Wits has negotiated separate and safe terms and conditions for its Google Drive. It's not the same Google Drive as the public one. Now, if it is your personal data and you have read the latest terms of service, you can decide to give that data away (although we would like you, as a citizen of the net, to stop and think! Where and how that data is being used is not always good for us). But you are not allowed ethically, responsibly or legally to do the same with your research data, because it is not only your data: it is also the university's, the respondents', the funders' and in some cases the country's data.
So the university got a Google app for education. It's easy. It looks like Google storage and it is from Google, but it's managed by Wits under different terms and conditions. IT'S FREE. PLEASE use it for your research data, but if you don't have a plan, it's going to fill up fast.

What does Wits require?

That is up to each supervisor and student to determine in a contract. So please put it in the contract. But it cannot be overstated that your supervisor IS NOT YOUR DATA MANAGER. Emailing data to them is not the same as saving it. Clarify what will happen to the data at the end of the research before there is conflict or unmet expectations. Remember, if you are publishing you have to put more work into data management than if you are not. The supervisor is responsible for the quality of your work, so they need to see the data and know that it exists and that it yields the results you are claiming. An examiner of the thesis or an examining committee can also ask to see your data. The school or department of the university might well be working on a research thrust. That means that they are gathering all the research projects together to make progress. If that is the case, your dataset will need to be compatible with the other datasets in your department's collection. Don't forget that, in addition, the ethics committee can audit you. So treat your data like it's going to be examined.

A normal set of requirements would be:

Overall, a dataset, properly cleaned and organized so that the results reported in the thesis can be replicated. It seems like a lot of work, but really it's just a very detailed methodology. The dataset should have:

  1. Code, if any; otherwise code books, software types and file types.
  2. Raw and final versions of data, with some record of changes made.
  3. A complete history, logs or research journals, depending on discipline.
  4. Metadata, comments or README files.
  5. Intellectual property statements.
  6. The methodology expanded into a project, data management or integrated risk management plan; alternatively, an SOP or fieldwork notes.
  7. Details of any methodological reviews, decision making, and processing.
  8. Ethics, proposal, consents, amendments, rights, and obligations.
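
Item 2 in the list asks for raw and final versions of the data with some record of changes made. One possible way to keep such a record, offered only as a minimal sketch and not as a Wits requirement, is to store a checksum for every file and compare the checksums before and after you work on the data. The function names below are hypothetical and chosen purely for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file path (relative to the folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Label every file that was added, removed or modified."""
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            changes[name] = "added"
        elif name not in new:
            changes[name] = "removed"
        elif old[name] != new[name]:
            changes[name] = "modified"
    return changes
```

Saving the output of `build_manifest` next to the raw data, and again next to the final data, gives a simple record of which files changed between the two versions. It does not say why they changed, so it complements a research journal or log rather than replacing it.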

There should be a naming protocol and it's useful to have some kind of version control, either with names or with software.
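
As an illustration of a naming protocol with version control "with names", here is a minimal sketch. The pattern `project_content_YYYYMMDD_vNN.ext` is an assumption made for this example, not a Wits standard, and the function names are hypothetical; agree on your own protocol with your supervisor.

```python
import re
from datetime import date

# Assumed protocol: project_content_YYYYMMDD_vNN.ext, lower case, no spaces.
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<content>[a-z0-9-]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project: str, content: str, day: date, version: int, ext: str) -> str:
    """Build a file name and reject anything that breaks the protocol."""
    name = f"{project}_{content}_{day:%Y%m%d}_v{version:02d}.{ext}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"name does not follow the protocol: {name}")
    return name


def next_version(existing: list[str], project: str, content: str) -> int:
    """Return the next version number for a given project and content."""
    versions = [
        int(match["version"])
        for match in map(NAME_PATTERN.match, existing)
        if match and match["project"] == project and match["content"] == content
    ]
    return max(versions, default=0) + 1


first = build_name("phd", "interviews-raw", date(2024, 3, 5), 1, "csv")
print(first)  # phd_interviews-raw_20240305_v01.csv
print(next_version([first], "phd", "interviews-raw"))  # 2
```

Putting the date and a zero-padded version number in the name keeps files in order when sorted alphabetically. Version control software does the same job with a full history, so use whichever fits your discipline.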