-*- mode: text -*-

+----------------------------------------------------------------------+
| This archive contains a simple implementation of the Conditional     |
| Mutual Information Maximization for feature selection.               |
+----------------------------------------------------------------------+
| Written by François Fleuret                                          |
| Contact <francois.fleuret@epfl.ch> for comments & bug reports        |
| Copyright (C) 2004 EPFL                                              |
+----------------------------------------------------------------------+

$Id: README,v 1.1 2005/03/03 15:52:35 fleuret Exp $

0/ INTRODUCTION

  The CMIM feature selection scheme is designed to select a small
  number of binary features among a very large set in the context of
  two-class classification. It consists of picking features one after
  another so as to maximize the conditional mutual information between
  the selected feature and the class to predict, given any one of the
  features already picked. Such a criterion picks features which are
  both individually informative and pairwise weakly dependent. CMIM
  stands for Conditional Mutual Information Maximization. See

  Fast Binary Feature Selection with Conditional Mutual Information
  Francois Fleuret
  JMLR 5 (Nov): 1531--1555, 2004
  http://www.jmlr.org/papers/volume5/fleuret04a/fleuret04a.pdf
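
  The greedy criterion described above can be sketched as follows.
  This is a naive version written for illustration only, not the code
  shipped in this archive: the names Column, entropy, cond_mutual_info
  and cmim_select are hypothetical, and probabilities are estimated by
  plain counting. Each feature keeps a score equal to the smallest
  conditional mutual information seen so far over the features already
  picked, and the feature with the highest score is picked next.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Column = std::vector<int>;  // one 0/1 value per sample

// Entropy, in bits, of the empirical distribution given by cell counts.
template <std::size_t N>
double entropy(const std::array<double, N>& counts) {
  double total = 0.0;
  for (double c : counts) total += c;
  double h = 0.0;
  for (double c : counts)
    if (c > 0.0) h -= (c / total) * std::log2(c / total);
  return h;
}

// Empirical I(Y; X | Z) = H(Y,Z) + H(X,Z) - H(X,Y,Z) - H(Z).
double cond_mutual_info(const Column& y, const Column& x, const Column& z) {
  assert(!y.empty() && x.size() == y.size() && z.size() == y.size());
  std::array<double, 8> xyz{};
  std::array<double, 4> yz{}, xz{};
  std::array<double, 2> zc{};
  for (std::size_t s = 0; s < y.size(); ++s) {
    xyz[4 * x[s] + 2 * y[s] + z[s]] += 1.0;
    yz[2 * y[s] + z[s]] += 1.0;
    xz[2 * x[s] + z[s]] += 1.0;
    zc[z[s]] += 1.0;
  }
  return entropy(yz) + entropy(xz) - entropy(xyz) - entropy(zc);
}

// Returns the indices of k features, in the order they were picked.
std::vector<std::size_t> cmim_select(const std::vector<Column>& features,
                                     const Column& y, std::size_t k) {
  assert(k <= features.size());
  // Conditioning on a constant column gives the plain I(Y; X).
  const Column constant(y.size(), 0);
  std::vector<double> score;
  for (const Column& x : features)
    score.push_back(cond_mutual_info(y, x, constant));

  std::vector<std::size_t> picked;
  while (picked.size() < k) {
    const std::size_t best = static_cast<std::size_t>(
        std::max_element(score.begin(), score.end()) - score.begin());
    picked.push_back(best);
    // A feature is only as good as its worst case over the picked ones.
    for (std::size_t n = 0; n < features.size(); ++n)
      score[n] = std::min(score[n],
                          cond_mutual_info(y, features[n], features[best]));
    score[best] = -1.0;  // mark as used so it is never picked again
  }
  return picked;
}
```

  With such a score, an exact copy of a feature already picked drops
  to a score of about zero, while a feature that brings new
  information about the class keeps a high score.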

1/ INSTALLATION

  To compile and test, just type 'make test'

  This small test generates a sample set for a toy problem and tests
  CMIM, MIM and a random feature selection with the naive Bayesian
  learner.  The two populations of the toy problem live in the [0,1]^2
  square. The positive population is in x^2+y^2 < 1/4 and the negative
  population is everything else.  Look at create_samples.cc for more
  details.  The features are responses of linear classifiers generated
  at random.
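
  As a rough illustration of such a toy problem, here is a minimal
  sketch. It is not create_samples.cc: the uniform sampling of the
  points, the way the random linear classifiers are drawn, and all the
  names (LinearFeature, toy_class, response, make_toy_samples) are
  assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical linear classifier: responds 1 when a*x + b*y + c > 0.
struct LinearFeature { double a, b, c; };

// Class of a point: 1 inside the disc x^2 + y^2 < 1/4, 0 elsewhere.
int toy_class(double x, double y) { return x * x + y * y < 0.25 ? 1 : 0; }

int response(const LinearFeature& f, double x, double y) {
  return f.a * x + f.b * y + f.c > 0.0 ? 1 : 0;
}

struct ToySamples {
  std::vector<std::vector<int>> features;  // features[s][f], each 0 or 1
  std::vector<int> labels;                 // labels[s], each 0 or 1
};

ToySamples make_toy_samples(std::size_t nb_samples, std::size_t nb_features,
                            unsigned seed) {
  assert(nb_features > 0);
  std::mt19937 rng(seed);
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  std::normal_distribution<double> gauss(0.0, 1.0);

  // Assumption: random direction, boundary through a random point of
  // the square, so that most features split the samples non-trivially.
  std::vector<LinearFeature> lines;
  for (std::size_t f = 0; f < nb_features; ++f) {
    const double a = gauss(rng), b = gauss(rng);
    const double px = unit(rng), py = unit(rng);
    lines.push_back(LinearFeature{a, b, -(a * px + b * py)});
  }

  ToySamples out;
  for (std::size_t s = 0; s < nb_samples; ++s) {
    const double x = unit(rng), y = unit(rng);  // point of [0,1]^2
    std::vector<int> row;
    for (const LinearFeature& l : lines) row.push_back(response(l, x, y));
    out.features.push_back(row);
    out.labels.push_back(toy_class(x, y));
  }
  return out;
}
```

  No single linear feature can separate a quarter disc from the rest
  of the square, which is why several of them have to be combined.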

2/ DATA FILE FORMAT

  Each data file, either for training or testing, starts with the
  number of samples and the number of features. Then come two lines
  for every sample: one with the values of the features (0/1) and one
  with the value of the class to predict (0/1).  Check the train.dat
  and test.dat generated by create_samples for an example.

  The test file has the same format, and the real class is used to
  estimate the error rates.  During testing, the response of the naive
  Bayes classifier before thresholding is saved in a result file (3rd
  parameter of the --test option).
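
  A reader for the data file layout described above could look like
  the sketch below. It is not the parser used by this archive: DataSet,
  read_bit, read_data_set and parse_data_set are hypothetical names,
  and since the exact separators are not specified here, the sketch
  simply skips whitespace and accepts the 0/1 values whether or not
  they are separated by blanks.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct DataSet {
  std::vector<std::vector<int>> features;  // features[s][f], each 0 or 1
  std::vector<int> labels;                 // labels[s], each 0 or 1
};

// Reads the next non-blank character and checks that it is '0' or '1'.
int read_bit(std::istream& in) {
  char c = 0;
  if (!(in >> c) || (c != '0' && c != '1'))
    throw std::runtime_error("expected a 0/1 value");
  return c - '0';
}

// Header (nb samples, nb features), then for each sample its feature
// values followed by its class.
DataSet read_data_set(std::istream& in) {
  std::size_t nb_samples = 0, nb_features = 0;
  if (!(in >> nb_samples >> nb_features))
    throw std::runtime_error("expected <nb samples> <nb features>");
  DataSet data;
  for (std::size_t s = 0; s < nb_samples; ++s) {
    std::vector<int> row(nb_features);
    for (int& bit : row) bit = read_bit(in);
    data.features.push_back(row);
    data.labels.push_back(read_bit(in));
  }
  assert(data.features.size() == data.labels.size());
  return data;
}

// Convenience wrapper to parse a data set held in a string.
DataSet parse_data_set(const std::string& text) {
  std::istringstream in(text);
  return read_data_set(in);
}
```

  For instance, parse_data_set("2 3\n0 1 1\n1\n1 0 0\n0\n") yields two
  samples of three features each, with classes 1 and 0.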

3/ OPTIONS

  --silent

    Switches off all output to stdout

  --feature-selection <random|mim|cmim>

    Selects the feature selection method

  --classifier <bayesian|perceptron>

    Selects the classifier type

  --error <standard|ber>

    Chooses which error to minimize during bias estimation for the
    CMIM + naive Bayesian.

    standard = P(f(X) = 0, Y = 1) + P(f(X) = 1, Y = 0)

    ber      = (P(f(X) = 0 | Y = 1) + P(f(X) = 1 | Y = 0))/2
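
    The two error measures above can be estimated from predicted and
    real classes as in the sketch below. The function names
    standard_error and balanced_error are hypothetical and the code
    is only an illustration of the formulas, with probabilities
    replaced by empirical frequencies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// standard = P(f(X) = 0, Y = 1) + P(f(X) = 1, Y = 0)
double standard_error(const std::vector<int>& predicted,
                      const std::vector<int>& truth) {
  assert(!truth.empty() && predicted.size() == truth.size());
  std::size_t wrong = 0;
  for (std::size_t i = 0; i < truth.size(); ++i)
    if (predicted[i] != truth[i]) ++wrong;
  return static_cast<double>(wrong) / static_cast<double>(truth.size());
}

// ber = (P(f(X) = 0 | Y = 1) + P(f(X) = 1 | Y = 0)) / 2
double balanced_error(const std::vector<int>& predicted,
                      const std::vector<int>& truth) {
  assert(predicted.size() == truth.size());
  std::size_t nb_pos = 0, nb_neg = 0, missed_pos = 0, false_pos = 0;
  for (std::size_t i = 0; i < truth.size(); ++i) {
    if (truth[i] == 1) {
      ++nb_pos;
      if (predicted[i] == 0) ++missed_pos;
    } else {
      ++nb_neg;
      if (predicted[i] == 1) ++false_pos;
    }
  }
  assert(nb_pos > 0 && nb_neg > 0);  // both classes must be present
  return 0.5 * (static_cast<double>(missed_pos) / static_cast<double>(nb_pos) +
                static_cast<double>(false_pos) / static_cast<double>(nb_neg));
}
```

    With one positive sample among eight and a classifier that always
    answers 0, the standard error is 0.125 while the ber is 0.5: the
    ber does not reward ignoring the smaller class.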

  --nb-features <int: nb of features>

    Sets the number of features to select

  --cross-validation <file: data set> <int: nb test samples> <int: nb loops>

    Runs cross-validation

  --train <file: data set> <file: classifier>

    Builds a classifier and saves it to disk

  --test <file: classifier> <file: data set> <file: result>

    Loads a classifier and tests it on a data set

4/ LICENCE

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License version 2 as
  published by the Free Software Foundation.

  This program is distributed in the hope that it will be useful, but
  WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
  General Public License for more details.
