MCFS (Monte Carlo Feature Selection) is a feature selection method that can be applied to
high-dimensional data (thousands to millions of features). The algorithm is implemented in Java, but a user-friendly R package (rmcfs) is also available.
The first version of MCFS was published in 2004/2005, and the final version was published in 2008 in the journal Bioinformatics:
- M. Dramiński, A. Rada-Iglesias, S. Enroth, C. Wadelius, J. Koronacki, J. Komorowski, "Monte Carlo feature
selection for supervised classification", Bioinformatics 24(1): 110-117 (2008).
- M. Dramiński, J. Koronacki, J. Ćwik, J. Komorowski, "Monte Carlo Gene Screening for Supervised Classification",
Proceedings of the EUROFUSE 2004 Workshop on Data and Knowledge Engineering, B. De Baets, R. De Caluwe, G. de Tre, J. Fodor, J. Kacprzyk,
S. Zadrozny (eds): Current Issues in Data and Knowledge Engineering, Akademicka Oficyna Wydawnicza EXIT, Warszawa 2004.
MCFS-ID (Monte Carlo Feature Selection and Interdependence Discovery) is an extension of the original MCFS idea that produces interdependency
graphs (ID-Graphs). Features that are interdependent (not correlated!) are represented as nodes connected by directed edges in the ID-Graph. The initial version of the ID extension was published
in 2010. The latest paper, which describes rmcfs (the R implementation), was published in 2018. Please cite that paper if you use
rmcfs in your research.