RapidMiner

Introduction

Purpose

RapidMiner provides convenient tools for data loading, exploration, modelling, and visualization.

UI and API

RapidMiner comes with a graphical user interface (GUI), provides a Java API, and also has a scripting interface.

Data types

Attribute values can be nominal (binominal for two classes, polynominal for more), integer, double (real), or varchar (text).
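To make the distinction between these types concrete, here is a minimal Python sketch that guesses a type label for a column of values. The function name guess_value_type and its rules are assumptions made for illustration, not RapidMiner's own type inference, and the sketch does not try to tell free text apart from nominal values.

```python
def guess_value_type(values):
    """Guess a type label for a column (hypothetical helper, not RapidMiner code)."""
    distinct = {v for v in values if v is not None}

    def is_int(v):
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(v, int) and not isinstance(v, bool)

    def is_number(v):
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if distinct and all(is_int(v) for v in distinct):
        return "integer"
    if distinct and all(is_number(v) for v in distinct):
        return "double"
    # Anything non-numeric is treated as nominal: at most two classes
    # is binominal, more than two is polynominal.
    if len(distinct) <= 2:
        return "binominal"
    return "polynominal"


play = guess_value_type(["yes", "no", "yes"])
outlook = guess_value_type(["sunny", "rain", "overcast"])
humidity = guess_value_type([85, 90, 78])
temperature = guess_value_type([21.5, 19, 25.1])
```

Under these assumed rules, play is labelled binominal, outlook polynominal, humidity integer, and temperature double.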

Processes

A data mining process can be visualized as a tree of operators, whose leaf nodes correspond to the results. Processes are edited in the design view, where a process looks like an electronic circuit. This is effectively programming with a GUI.

Operators

Operators are grouped into categories such as modeling, data transformation, repository access, evaluation/validation, and process control (looping, conditions). These groups contain various subgroups.

Each operator has an input and output, and may require additional settings.

An operator may be nested, that is, itself composed of sub-operators.
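The idea of operators with an input, an output, settings, and nesting can be sketched in plain Python. This is a conceptual model only: the class names Operator and NestedOperator, the run method, and the example operators are all made up for illustration and are not part of the RapidMiner Java API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Operator:
    """A single step: takes input data, returns output data (hypothetical model)."""
    name: str
    apply: Callable[[Any, Dict[str, Any]], Any]
    settings: Dict[str, Any] = field(default_factory=dict)

    def run(self, data):
        return self.apply(data, self.settings)


@dataclass
class NestedOperator:
    """An operator composed of sub-operators, run in order."""
    name: str
    children: List[Any] = field(default_factory=list)

    def run(self, data):
        # The output of each sub-operator becomes the input of the next.
        for child in self.children:
            data = child.run(data)
        return data


drop_missing = Operator(
    "drop_missing",
    lambda rows, settings: [r for r in rows if r is not None],
)
scale = Operator(
    "scale",
    lambda rows, settings: [r * settings["factor"] for r in rows],
    settings={"factor": 10},
)
process = NestedOperator("process", [drop_missing, scale])

result = process.run([1, None, 3])  # [10, 30]
```

Because NestedOperator exposes the same run method as Operator, a nested operator can itself be a child of another nested operator, which is what gives the process its tree shape.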

Data view

The data view has separate tabs for the meta-data, the actual data, and plotting.

The meta-data view shows useful summaries, such as the mean and standard deviation for numeric data, and the mode and least frequent class for nominal data.
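The summaries listed above are easy to reproduce by hand, which helps when checking what the meta-data view reports. The sketch below uses only the Python standard library. The function names are hypothetical, and it assumes the sample standard deviation, which may differ from the formula RapidMiner uses.

```python
from collections import Counter
from statistics import mean, stdev


def summarize_numeric(values):
    """Mean and sample standard deviation (needs at least two values)."""
    return {"mean": mean(values), "std": stdev(values)}


def summarize_nominal(values):
    """Most frequent class (mode) and least frequent class."""
    counts = Counter(values).most_common()  # sorted from most to least frequent
    return {"mode": counts[0][0], "least": counts[-1][0]}


age_summary = summarize_numeric([20, 30, 40])
outlook_summary = summarize_nominal(
    ["sunny", "sunny", "sunny", "rain", "rain", "overcast"]
)
```

For the example columns, age_summary has a mean of 30 and a standard deviation of 10, and outlook_summary reports sunny as the mode and overcast as the least frequent class.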