CMM510 Data Mining
Devised by W.Ji ; updated by C.H.Bryant and IArana.
© The Robert Gordon University
School of Computing
P1/7
Lab
Data Mining by Clementine
Aims
• To become familiar with the Clementine programming interface
• To learn about the visualization functions in Clementine
• To predict iris types using Clementine modelling functions
Background
The Clementine data mining toolkit was originally developed by Integral
Solutions Limited. The company was acquired by SPSS Inc. in 1998.
SPSS (Statistical Package for the Social Sciences) is a software package for
comprehensive data mining (not its original purpose) and analytic applications
that support decision making. The main strength of SPSS lies in statistical
analysis: it contains a systematic series of statistical functions, from
descriptive analysis, through parametric and nonparametric tests, to nonlinear
regression.
Clementine is regarded as a complement to SPSS, providing many intelligent
modelling functions that go beyond the traditional statistical techniques; C5.0
is one such example. Clementine and SPSS run independently. However, to
strengthen Clementine's speciality without losing generality in statistical
analysis, Clementine not only embeds most SPSS functions in its interface but
also provides a facility to export its processes to SPSS.
As a data mining tool, Clementine follows the basic routine of preprocessing,
modelling and postprocessing to reveal the information and knowledge behind
the data.
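The three-stage routine can be illustrated outside Clementine with a minimal Python sketch. This is not how Clementine works internally: the function names (preprocess, build_model, predict, postprocess) are hypothetical, and a simple nearest-centroid classifier is used as a stand-in for Clementine's modelling functions such as C5.0. The six rows are taken from the classic iris data set, and the sketch assumes that every attribute takes at least two distinct values so that it can be rescaled.

```python
# Illustrative sketch only: plain Python, not Clementine. Nearest-centroid is a
# stand-in for Clementine's modelling functions (it is NOT C5.0).

# Six rows from the classic iris data set:
# (sepal length, sepal width, petal length, petal width, species)
ROWS = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]


def preprocess(rows):
    """Preprocessing: split off the labels and rescale each attribute to [0, 1]."""
    features = [row[:-1] for row in rows]
    labels = [row[-1] for row in rows]
    lows = [min(col) for col in zip(*features)]
    highs = [max(col) for col in zip(*features)]
    # Assumes hi > lo for every attribute (true for the rows above).
    scaled = [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(f, lows, highs))
        for f in features
    ]
    return scaled, labels


def build_model(features, labels):
    """Modelling: compute one mean point (centroid) per species."""
    groups = {}
    for f, label in zip(features, labels):
        groups.setdefault(label, []).append(f)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*members))
        for label, members in groups.items()
    }


def predict(model, feature):
    """Return the species whose centroid is closest to the given point."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feature, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def postprocess(model, features, labels):
    """Postprocessing: summarise the model as accuracy on the data it was built from."""
    hits = sum(predict(model, f) == y for f, y in zip(features, labels))
    return hits / len(labels)


features, labels = preprocess(ROWS)
model = build_model(features, labels)
accuracy = postprocess(model, features, labels)
print(f"Accuracy on the training rows: {accuracy:.2f}")
```

In Clementine the same three stages are carried out through its graphical interface rather than in code; the sketch is only meant to show how the output of each stage feeds the next.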
Clementine Programming Interface
Start Clementine
Desktop => start => programs => Clementine Desktop 9.0 => Clementine
Desktop 9.0
The Clementine programming interface is shown in Fig. 1.
Fig. 1 Clementine programming interface
Seek Help
On the Clementine toolbar, clicking Help and then Help Topics opens the
Clementine user manual. You may need to