d_optimal


SYNOPSIS

d_optimal  [pc=<number of PCs; defaults to the number of PCs of the current PLS model>]  \
    [{percent_remove=<1 to 90; defaults to 50>  \
    | design_points=<0.1 * number of variables to (number of variables - 1);  \
    defaults to 0.5 * number of variables>}]  \
    [type={WEIGHTS | LOADINGS; defaults to WEIGHTS}]


DESCRIPTION

The d_optimal keyword carries out a variable selection according to D-optimal design: it selects a subset of variables, whose size is defined by the user, so as to minimize the determinant of the dispersion matrix (the inverse of the information matrix, whose determinant is therefore maximized). The D-optimal algorithm can operate in the space of either PLS partial weights (type=WEIGHTS) or PLS loadings (type=LOADINGS). An excellent tutorial on D-optimal design has been written by De Aguiar et al. [1], while for its application in the 3D-QSAR field one may refer to the work by Baroni et al. [2]. In Open3DQSAR's implementation the D-optimal design is obtained by means of a k-exchange algorithm [3].

The extent of the variable selection can be specified in two alternative ways: either by supplying the percent_remove option with the percentage of variables to be eliminated, or by supplying the design_points option with the exact number of variables to be retained. In either case, no more than 90% of the original variables may be removed.
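
To make the selection criterion more concrete, the following Python sketch shows a simplified exchange procedure. It is only an illustration under stated assumptions, not Open3DQSAR's code and not the k-exchange algorithm of reference [3]: each variable is assumed to be represented by one row of PLS weights (or loadings) with one column per PC, the starting design is naively taken as the first n_keep variables, and all function names are hypothetical. Maximizing det(X'X) over the selected rows is equivalent to minimizing the determinant of the dispersion matrix.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0  # treat as singular
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def information_det(rows):
    """det(X'X), where X holds the rows (variables) currently selected."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    return det(xtx)


def d_optimal_select(candidates, n_keep):
    """Swap selected and unselected variables while det(X'X) increases.

    candidates: one row per variable, one column per PC (assumption).
    n_keep: number of variables to retain; should be at least the
    number of columns, otherwise det(X'X) is always zero.
    """
    selected = list(range(n_keep))  # naive starting design (assumption)
    improved = True
    while improved:
        improved = False
        best = information_det([candidates[i] for i in selected])
        for pos in range(n_keep):
            for cand in range(len(candidates)):
                if cand in selected:
                    continue
                trial = selected[:]
                trial[pos] = cand
                d = information_det([candidates[i] for i in trial])
                # accept only strict improvements, so the loop terminates
                if d > best + 1e-12 * abs(best):
                    best, selected, improved = d, trial, True
    return sorted(selected), best


# Hypothetical weights of 7 variables on 2 PCs
weights = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9],
           [0.5, 0.5], [2.0, 0.0], [0.0, 2.0]]
kept, criterion = d_optimal_select(weights, n_keep=2)
print(kept, criterion)
```

On this toy input the procedure retains the two variables whose weight vectors are longest and mutually orthogonal, which is the behaviour expected from a D-optimal criterion. An exchange of this kind may stop at a local optimum, so the result can depend on the starting design.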

EXAMPLES

# the following command performs a D-optimal variable selection operating in the space of PLS partial weights, taking into account the first 3 principal components, with the aim of removing 40% of the original variables
d_optimal  type=WEIGHTS  pc=3  percent_remove=40

# the following command performs a D-optimal variable selection operating in the space of PLS loadings, taking into account the same number of principal components extracted when the PLS model was built, with the aim of retaining 1500 variables
d_optimal  type=LOADINGS  design_points=1500
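
The two ways of specifying the extent of the selection are related by simple arithmetic. The Python sketch below is not part of Open3DQSAR: the function names, the rounding rule and the 2500-variable dataset are assumptions made purely for illustration. It converts a percent_remove value into the corresponding number of retained variables and checks a design_points value against the range given in the synopsis.

```python
def design_points_from_percent(n_variables, percent_remove=50):
    """Variables retained after removing percent_remove percent.

    The rounding rule is an assumption; Open3DQSAR may round differently.
    """
    if not 1 <= percent_remove <= 90:
        raise ValueError("percent_remove must be between 1 and 90")
    return round(n_variables * (100 - percent_remove) / 100)


def design_points_in_range(n_variables, design_points):
    """True if design_points lies within the range stated in the synopsis."""
    return 0.1 * n_variables <= design_points <= n_variables - 1


# Hypothetical dataset with 2500 variables
n_vars = 2500
print(design_points_from_percent(n_vars, 40))   # removing 40% of 2500
print(design_points_in_range(n_vars, 1500))     # inside the allowed range
print(design_points_in_range(n_vars, 100))      # would remove more than 90%
```

For such a dataset, percent_remove=40 and design_points=1500 would request the same number of retained variables.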


REFERENCES

  1. De Aguiar, P. F.; Bourguignon, B.; Khots, M. S.; Massart, D. L.; Phan-Than-Luu, R. Chemometrics Intell. Lab. Syst. 1995, 30, 199-210.
  2. Baroni, M.; Costantino, G.; Cruciani, G.; Riganelli, D.; Valigi, R.; Clementi, S. Quant. Struct.-Act. Relat. 1993, 12, 9-20.
  3. Johnson, M. E.; Nachtsheim, C. J. Technometrics 1983, 25, 271-277.

Last update:
May 31. 2015 20:39:42
