Conducting an Analysis of a Qualitative Dataset Using the Waikato Environment for Knowledge Analysis (WEKA)

Report No. ARL-TR-7182
Authors: Robert A Sottilare
Date/Pages: February 2015; 62 pages
Abstract: The purpose of this technical report is to provide an exemplar for conducting an analysis of a qualitative dataset using machine learning techniques. Qualitative data are expressed as natural-language descriptions (e.g., categories or attributes) rather than as numbers, as in quantitative datasets. It is often difficult to evaluate the relationship between qualitative variables of interest and outcomes (dependent variables), but machine learning techniques offer simplified methods to classify these outcomes. WEKA, the Waikato Environment for Knowledge Analysis, is a popular suite of machine learning algorithms developed at the University of Waikato in New Zealand that can be used to analyze both qualitative and quantitative data. To illustrate the use of WEKA on a qualitative dataset, we selected a known set of primate species with the goal of classifying each species into 1 of 3 classes (prosimians, monkeys, and apes) based on 7 qualitative attributes. Using an existing dataset with known relationships allows us to evaluate a large number of WEKA algorithms in a relatively short time and to validate their accuracy, with the goal of identifying best practices for analyzing qualitative data.
Distribution: Approved for public release
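
Illustrative sketch (not taken from the report): the abstract describes classifying species into 1 of 3 classes from qualitative attributes. One of the simplest rule learners for nominal data is 1R (WEKA ships a version as OneR), which picks the single attribute whose values best predict the class. The pure-Python code below is a minimal reimplementation of that idea for illustration only. It does not call WEKA, and the attribute names (tail, nose, size) and all rows are invented toy values, not the report's 7 attributes or its primate dataset.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows (NOT the report's data): (nominal attributes, class label).
ROWS = [
    ({"tail": "yes", "nose": "wet", "size": "small"}, "prosimian"),
    ({"tail": "yes", "nose": "wet", "size": "small"}, "prosimian"),
    ({"tail": "yes", "nose": "dry", "size": "small"}, "monkey"),
    ({"tail": "yes", "nose": "dry", "size": "medium"}, "monkey"),
    ({"tail": "yes", "nose": "dry", "size": "medium"}, "monkey"),
    ({"tail": "no", "nose": "dry", "size": "medium"}, "ape"),
    ({"tail": "no", "nose": "dry", "size": "large"}, "ape"),
    ({"tail": "no", "nose": "dry", "size": "large"}, "ape"),
]


def train_one_r(rows):
    """Return (attribute, value->class rule, default class) with the fewest training errors."""
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    best = None
    for attr in sorted(rows[0][0]):  # sorted for a deterministic tie-break
        counts = defaultdict(Counter)
        for features, label in rows:
            counts[features[attr]][label] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[f[attr]] != label for f, label in rows)
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    _, attr, rule = best
    return attr, rule, default


def predict(model, features):
    """Classify one instance; unseen attribute values fall back to the default class."""
    attr, rule, default = model
    return rule.get(features[attr], default)


def accuracy(model, rows):
    """Fraction of rows the model labels correctly."""
    return sum(predict(model, f) == label for f, label in rows) / len(rows)


model = train_one_r(ROWS)
print("chosen attribute:", model[0])
print("training accuracy:", accuracy(model, ROWS))
```

The accuracy printed here is measured on the training rows themselves, which overstates real performance. A fuller evaluation would hold data out, for example with cross-validation.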

Last Update / Reviewed: February 1, 2015