
Data Management for Wits: Software for research data: analytics, management, organisation and visualisation

The following is general advice; data varies hugely between types of research and projects.

Qualitative Tools: Atlas.ti, Nvivo and MaxQDA, Grammarly

The university is currently investigating a site licence for the paid version of Grammarly. There are individual seats in use across the university, so please ask your school or supervisor whether they have a seat. If you want a free version, go to the Grammarly app. You can also use Hemingway.

Qualitative Coding  Environments

We have very limited seats for MaxQDA and NVivo at Wits, and they are restricted to certain faculties. We have slightly more for Atlas.ti, but availability depends on how many students are using it at one time. Please contact us and ask whether we have any space for Atlas.ti. That said, we find that Voyant is better at word crunching and making word clouds than the paid software. So if you do not have the money, your research will not suffer and might even benefit from using several free, specialized programs. On the other hand, the paid software gives you better data management and is very user friendly. It is your choice, and we offer training and run demos so that you can decide before you make the significant investment of time it takes to learn new software.

If you think you are going to need a seat, please apply in time. We have limited space, so if you are not using your seat, please give it up so that someone else can use it.

There is also very powerful free software used by journalists called Pinpoint: https://journaliststudio.google.com/pinpoint/c. It can do basic search as well as location and person search. It uses AI and includes transcription, so it is better than Atlas.ti in that respect. It also allows coding, which it calls labels. However, you do need to think about confidentiality. If you need more in-depth linguistic work, use Voyant Tools instead: https://voyant-tools.org

Remember that if all you need is a good basic coder, you can use https://www.taguette.org/. It does what it says it does, which is tag (i.e. code) data and let you view text sections by code. PLEASE do not use confidential data in the web browser version; download the software instead.
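To make the idea of "viewing text sections by code" concrete, here is a minimal sketch of the underlying data structure. It is not how Taguette is implemented; the `Segment` class, the `segments_by_code` function and the example interview data are all hypothetical and invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A highlighted stretch of text with the code (tag) applied to it."""
    document: str  # hypothetical document name
    text: str      # the highlighted passage
    code: str      # the tag applied to the passage


def segments_by_code(segments):
    """Group segments under each code, keeping their original order."""
    grouped = defaultdict(list)
    for seg in segments:
        grouped[seg.code].append(seg)
    return dict(grouped)


# Hypothetical example data, invented for illustration only.
segments = [
    Segment("interview_01.txt", "I could not afford the licence.", "cost"),
    Segment("interview_01.txt", "The training helped a lot.", "support"),
    Segment("interview_02.txt", "Free tools were enough for me.", "cost"),
]

# Print every coded passage, one code at a time.
for code, items in segments_by_code(segments).items():
    print(code)
    for seg in items:
        print(f"  [{seg.document}] {seg.text}")
```

If a grouping like this is all your project needs, a basic coder is probably enough.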

Wits Software

Why software and which Software

Most data analysis software combines elements of analysis and management. Typically, coding programs help with document control and the visualization of codes as well as with coding itself. Statistical programs keep audit logs and often function as a sort of version control. Data management software with data collection helps researchers organize, track and audit; it also collects metadata. However, none of this is automatic. In qualitative coding software, a large component of the software is organizational and representational. Statistical software is less about data management, and is often only one stage in a project that moves through multiple programs.

Software is a method

So when you are choosing and using software, do a methodology literature review.

Any software that is used in research needs to be justified just like any other research method. Please remember that while software can be used IN MULTIPLE WAYS, it is not neutral. The structure of the software is part of the methodology and should be treated as such. Never forget that software has authors and is designed to solve problems, like every other research method. For example, ATLAS.ti is based on a grounded theory method of creating codes. NVivo, as the name suggests, looks for in vivo quotes, i.e. the standout phrases that sum up the points. QCAmap is the only software that makes its qualitative assumptions clear, but all software has them. Please remember that software is part of choosing a method, and it is a method in itself. A methodology literature review has a slightly different focus from a subject literature review. If you are struggling, your subject librarian can help you. PS: please don't forget to cite software. It is a work and it deserves a citation.

Creating results from data is very important because, no matter how much time you put into the rest of the research cycle, it is the visualization that shows your results. We offer training on all of this software, BUT YOU MUST bring your own data. A general course will not help you get to grips with what you need to do.


This is a program used by professional linguists. However, you can also use it to code your text, and we particularly recommend it for image annotation. It is a highly detailed coder for systematic functional analysis.

Tableau and Qlik are the leading software. They both have merits. However, be warned: the results you see in the demos and tutorials come from putting in very clean data that is well organized to answer the research question. Throwing raw data at this software will not work.

If you need additional documentation to prove that you are a student in order to fill in this form or this one, you are welcome to request it from the library, from Joalane.Mathe@wits.ac.za or in person at the data services office. We will verify your current registration status and help you download and set up the software.

To download software use the following links:

They are both very generous to students, offering software basically for free, but remember that you do need to complete your project on time.

Resource links :

Is a classroom online. Tableau in addition offers a Data analytics for university students guide.

The Best software?
There is no true best software. It depends on how much time you have and how many things you are doing to your data. Setting up software projects requires time and thought, as does writing metadata and organizing data to meet the software's requirements. Learning new software well enough to use it to its full capacity can take up to two months. In a research project, that can be considerable time and energy devoted to software. If you are doing research in a field that typically generates measurements or findings from a sensor machine with proprietary software, then you, or rather your lab, have to make a choice. Are you going to use an electronic lab system and pull the data in, or are you going to manage the data inside the sensor environment? We would recommend transforming the data into a non-proprietary format. Your next lab might have a different machine, all machines eventually fail, and companies often change or go out of business. However, in a short-term project, staying inside the machine environment often makes sense.
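As a rough illustration of moving measurements into a non-proprietary format, the sketch below writes readings to a CSV file and keeps the metadata in a JSON file beside it. It assumes you have already read the values out of the instrument software into plain Python dictionaries; the function name `export_readings`, the file names, the field names and the sample values are all hypothetical.

```python
import csv
import json
import tempfile
from pathlib import Path


def export_readings(readings, metadata, out_dir):
    """Write readings to CSV and their metadata to a JSON file next to it."""
    if not readings:
        raise ValueError("no readings to export")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / "readings.csv"
    meta_path = out_dir / "readings.metadata.json"

    # Column names are taken from the first reading.
    fieldnames = list(readings[0].keys())
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(readings)

    with meta_path.open("w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2)
    return csv_path, meta_path


# Hypothetical readings and metadata, invented for illustration only.
readings = [
    {"timestamp": "2024-01-01T10:00:00", "temperature_c": 21.4},
    {"timestamp": "2024-01-01T10:05:00", "temperature_c": 21.9},
]
metadata = {
    "instrument": "hypothetical sensor",
    "units": {"temperature_c": "degrees Celsius"},
}

# Write into a temporary folder so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    csv_path, meta_path = export_readings(readings, metadata, tmp)
    print(csv_path.name, meta_path.name)
```

Keeping the metadata in a separate plain-text file means the units and instrument details stay readable even if the original software is no longer available.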

The first question we would like you to consider in choosing software is whether you, as a researcher or research student, will be using this software in the future. If so, the cost of learning and setup is acceptable. Research training also involves acquiring skill in and familiarity with particular software. It is part of the learning outcomes and needs to be counted among the skills acquired. However, not all software is likely to increase your skills.

However, if you only want certain capabilities for a brief period, then we would suggest a consultation so that we can find the simplest solution. In other words, the best software is the one that works with the least investment in time and setup for your data set and personal circumstances. That means that sometimes the best software is none. Coding can be done effectively in Word using search. Zotero can be used to tag and organize documents.

Aquad is NOT that user-friendly, but it is very powerful. It has capacities for coding, networks and connections. It also has a statistics interface, so theoretically it is the ideal tool for mixed methods. A demo project comes with it. Aquad does audio, video and picture coding to a basic level. We can also train you in it. It is the closest free project to the capacities of paid software. Its developers offer coding: "within the framework of the code-paradigm,

  • with Boolean minimization to identify types and
  • by means of textual sequence-analysis to reconstruct case structures based on strict hypothesis testing (following the approach of objective hermeneutics).
  • An interface with the statistical software „R“ (open source) allows to combine qualitative and quantitative analysis; the scripts were modified and more scripts were added to version 7.3."

However, be clear: if R is strange to you, do not try to use R in Aquad for the first time. PS: don't freak out if you see error messages in German. The software is not entirely in English, but hey, neither is the world.

Find more on their website

Prof. i.R. Dr. Dr. h.c. Günter L. Huber
Viktor-Renner-Str. 39
72074 Tübingen
Telefon: +49 (0) 7071 – 88 51 47
Email: info@aquad.de

In the library, we are biased toward QCAmap. WHY? Well, it is really a software version of the book Qualitative Content Analysis by Philipp Mayring. That means it comes loaded with a large number of rule-based flowcharts linked to the book. This software is ideal for the first-time coder. It is also useful for small projects doing basic coding. However, its real strength lies in the way it keeps the research question connected to the coding process. It also offers a unique take on summarization. We recommend both the book and the software for those who need clear, step-by-step instructions on how to code.

 Find more on their website

Stéfan Sinclair and Geoffrey Rockwell wrote Voyant in a very interesting collaborative process. The result is software that represents the words in a text in a way that works for the digital humanities, not for coding or linguistics. It is important to use this tool as a first step in looking at the language structure of your text, but not as an automatic coder. Nonetheless, it produces lovely outputs and, for English text, often reveals patterns. We also urge you to explore the rest of their articles, software and book.
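The kind of word counting that sits behind a word cloud can be sketched in a few lines. This is only an illustration of the general technique, not Voyant's own method; the tiny stopword list, the function name `word_frequencies` and the sample sentence are assumptions made for the example.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}


def word_frequencies(text, stopwords=STOPWORDS):
    """Count lowercase words in the text, skipping common stopwords."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(word for word in words if word not in stopwords)


# Hypothetical sample text, invented for illustration only.
sample = "The data is the story, and the story is in the data."

for word, count in word_frequencies(sample).most_common(5):
    print(word, count)
```

Counts like these show which words dominate a text, which is why they are a useful first look but not a substitute for coding.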

The Centre for Oral History and Digital Storytelling has produced a fairly basic piece of software, Stories Matter. It works very well in a project that has multiple interviewees and interviewers. The important thing to do when working with the program is to LOOK DOWN: the Stories Matter menus are located at the bottom of the open screen. Stories Matter works to keep oral text ORAL. It is important to consider both the transcript and the audio, as each holds separate and important levels of information. In other words, Stories Matter can be used without transcription to code directly to the audio.

ELAN is an amazing free tool, released with a set of other language tools from the Max Planck Institute, for working with video and language. It is largely used for research projects in sign language, but it can be used as a highly granular annotator for any video because of its focus on body language and actions. Be warned: it is designed for professional linguists, but it has applications beyond that. Cite ELAN as: Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, The Netherlands, https://tla.mpi.nl/tools/tla-tools/, as presented in Sloetjes, H., & Wittenburg, P. (2008). Annotation by category – ELAN and ISO DCR. In: Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008). Video Research Lab is well worth checking out as well; they focus on data in video.

Laurence Anthony wrote AntConc to solve several research problems of his own. However, it has become clear that his set of corpus linguistics tools can be used by text miners in many ways. This is an advanced set of tools, and you will have to brush up on your linguistics.
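One staple of corpus linguistics tools is the keyword-in-context (KWIC) concordance, which lists every occurrence of a word with a few words on either side. The sketch below shows the general idea only; it is not AntConc's implementation, and the function name `keyword_in_context`, the `width` parameter and the sample text are assumptions made for illustration.

```python
import re


def keyword_in_context(text, keyword, width=3):
    """Return (left, keyword, right) tuples for each occurrence of keyword."""
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token == target:
            # Take up to `width` words on each side of the match.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Hypothetical sample text, invented for illustration only.
sample = "Software is a method. Choosing software is choosing a method."

for left, word, right in keyword_in_context(sample, "software", width=2):
    print(f"{left:>20} | {word} | {right}")
```

Reading the matches side by side is what makes patterns in word use visible, which is the part that requires the linguistics.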

Zotero is a basic library tool, and on its own it can function as a very basic data management system, especially if you are doing something like a policy study. However, if you add Paper Machines to it, it becomes quite a powerful way of working with text. "Paper Machines is an open-source extension for the Zotero bibliographic management software. Its purpose is to allow individual researchers to generate analysis and visualizations of user-provided corpora, without requiring extensive computational resources or technical knowledge."
They have the usual videos, and the library also provides training. It is NOT a coder as such; it is a visualization method.

How to cite software

T.G. Golda, P.D. Hough, and G. Gay, APPSPACK (Asynchronous parallel pattern search package); software available at http://software.sandia.gov/appspack.

Software is often linked to data

G.Y. Wang, Z.M. Zhu, S. Cui, and J.H. Wang, Data from: Glucocorticoid induces coordinating between Glutamatergic and GABAergic neurons in the amygdala, Dryad Digital Repository, 2017; dataset available at https://doi.org/10.5061/dryad.k9q7h.

Data in code

MultiSimplex 2.0. Grabitech Solutions AB, Sundvall, Sweden, 2000; software available at http://www.multisimplex.com.

From the Taylor and Francis style guide

Principles

  1. Importance: Software should be considered a legitimate and citable product of research. Software citations should be accorded the same importance in the scholarly record as citations of other research products, such as publications and data; they should be included in the metadata of the citing work, for example in the reference list of a journal article, and should not be omitted or separated. Software should be cited on the same basis as any other research product such as a paper or a book, that is, authors should cite the appropriate set of software products just as they cite the appropriate set of papers.
  2. Credit and Attribution: Software citations should facilitate giving scholarly credit and normative, legal attribution to all contributors to the software, recognizing that a single style or mechanism of attribution may not be applicable to all software.
  3. Unique Identification: A software citation should include a method for identification that is machine actionable, globally unique, interoperable, and recognized by at least a community of the corresponding domain experts, and preferably by general public researchers.
  4. Persistence: Unique identifiers and metadata describing the software and its disposition should persist – even beyond the lifespan of the software they describe.
  5. Accessibility: Software citations should facilitate access to the software itself and to its associated metadata, documentation, data, and other materials necessary for both humans and machines to make informed use of the referenced software.
  6. Specificity: Software citations should facilitate identification of, and access to, the specific version of a software that was used. Software identification should be as specific as necessary, such as using version numbers, revision numbers, or variants such as platforms.
Copyright © 2011-2017 FORCE11. All Rights Reserved.
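The principles on unique identification and specificity can be turned into a simple checklist: record the creator, the exact version and an identifier before you write the citation. The sketch below is one possible way to do that, using the MultiSimplex details given earlier. The `SoftwareCitation` class and its output layout are hypothetical and are not an official Taylor and Francis or FORCE11 format.

```python
from dataclasses import dataclass, fields


@dataclass
class SoftwareCitation:
    """The details needed to cite a specific version of some software."""
    creator: str
    title: str
    version: str
    year: int
    identifier: str  # a URL or DOI that locates the software

    def missing_fields(self):
        """Return the names of any fields that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def as_text(self):
        """Format the citation as one line (an illustrative layout only)."""
        return (
            f"{self.creator}, {self.title} (version {self.version}), "
            f"{self.year}; software available at {self.identifier}."
        )


citation = SoftwareCitation(
    creator="Grabitech Solutions AB",
    title="MultiSimplex",
    version="2.0",
    year=2000,
    identifier="http://www.multisimplex.com",
)

# Only print the citation once every detail has been recorded.
if not citation.missing_fields():
    print(citation.as_text())
```

Recording the version number at the time you run your analysis is the easiest way to satisfy the specificity principle later.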

There are thousands of types, parts and kinds of software. For a full list

Data Presentation

Results can be presented in textual and non-textual form. See the resources below for best practices on non-textual data presentation (i.e. charts, tables, etc.).

Data Visualization

Information Is Beautiful

The Information is Beautiful Awards are an excellent way to see just how well, and how beautifully, data can be presented.

Wits School of Arts-Digital Arts

It's important to remember that academic results are also creative. If you want to learn more, look here, and if you want a consultation or want to hire a digital artist, look at the school's Facebook page. Digital arts is important because it points out that data should be understandable by any audience. It focuses on the user experience. Your research should be, and can be, explained in your visualization of results.

Vizsweet

A high-end tool for creating beautiful, interactive data visualizations and stories. They also have an app.

Podcast Episode

This is a truly informative resource. The podcast is over but the work and resources are excellent. You can also learn quite a bit about beer.

Kaggle is the place to do data science projects

Here is great data, plus a learning experience and the chance to win money. Basically, if you are looking for a research topic look here first.

CDC (Centers for Disease Control and Prevention)

Data tables and statistics. Be sure to check whether you are looking at raw data from surveillance or at derived data.

Havocscope Black Market 

"Havocscope provides information and threat intelligence on the global black market. Due to the ability of transnational threats to cause financial losses and social harms, key statistics and data about the illegal economy are provided to help mitigate this risk. The information about the black market has been collected from government agencies, academic studies, media reports, and reported data from our sources."
