
class TMVA::CostComplexityPruneTool: public TMVA::IPruneTool

CostComplexityPruneTool - a class to prune a decision tree using the Cost Complexity method
(see "Classification and Regression Trees" by Leo Breiman et al.)

Some definitions:

T_max - the initial, usually highly overtrained tree, that is to be pruned back
R(T) - quality index (Gini, misclassification rate, or other) of a tree T
~T - set of terminal nodes in T
T' - the pruned subtree of T_max that has the best quality index R(T')
alpha - the prune strength parameter in Cost Complexity pruning (R_alpha(T) = R(T) + alpha*|~T|)

There are two running modes in CostComplexityPruneTool: (i) one may select a prune strength and prune
the tree T_max until the criterion
    alpha < (R(T) - R(t)) / (|~T_t| - 1)

is true for all nodes t in T, or (ii) the algorithm finds the sequence of critical points
alpha_k < alpha_k+1 < ... < alpha_K such that the last tree in the sequence, T_K, is root(T_max),
and then selects the optimally pruned subtree, defined to be the subtree with the best quality
index for the validation sample.

Function Members (Methods)

private:
void              InitTreePruningMetaData(TMVA::DecisionTreeNode* n)
TMVA::MsgLogger&  Log() const
void              Optimize(TMVA::DecisionTree* dt, Double_t weights)

Data Members

private:
TMVA::MsgLogger*                 fLogger            ! output stream to save logging information
Int_t                            fOptimalK          ! the optimal index of the prune sequence
vector<TMVA::DecisionTreeNode*>  fPruneSequence     ! map of weakest links (i.e., branches to prune) -> pruning index
vector<Double_t>                 fPruneStrengthList ! map of alpha -> pruning index
vector<Double_t>                 fQualityIndexList  ! map of R(T) -> pruning index
TMVA::SeparationBase*            fQualityIndexTool  ! the quality index used to calculate R(t), R(T) = sum[t in ~T]{ R(t) }

Class Charts

Inheritance Chart:
TMVA::IPruneTool
   <- TMVA::CostComplexityPruneTool

Function documentation

CostComplexityPruneTool( SeparationBase* qualityIndex = NULL )
virtual ~CostComplexityPruneTool()
PruningInfo* CalculatePruningInfo(TMVA::DecisionTree* dt, const TMVA::IPruneTool::EventSample* testEvents = NULL, Bool_t isAutomatic = kFALSE)
 Calculate the prune sequence for a given tree.
void InitTreePruningMetaData(TMVA::DecisionTreeNode* n)
 Set the metadata used for cost complexity pruning.
void Optimize(TMVA::DecisionTree* dt, Double_t weights)
 Optimize the pruning sequence.
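
The role of the optimal index can be illustrated with a small stand-alone sketch. It assumes a list of quality indices, one per step of the prune sequence (in the spirit of fQualityIndexList), and assumes that a smaller value is better, as for a misclassification rate. This page does not state which direction the class itself uses, and SelectOptimalK is a hypothetical name, not a TMVA method.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not a TMVA function.
// Returns the index k of the smallest quality index in the list, or -1 if
// the list is empty. Ties are resolved in favour of the earliest index.
// Assumption: lower quality index means a better tree.
int SelectOptimalK(const std::vector<double>& qualityIndexList)
{
   int best = -1;
   for (std::size_t k = 0; k < qualityIndexList.size(); ++k) {
      if (best < 0 || qualityIndexList[k] < qualityIndexList[best]) {
         best = static_cast<int>(k);
      }
   }
   return best;
}
```

With a parallel list of prune strengths (in the spirit of fPruneStrengthList), the entry at the returned index would give the alpha of the selected subtree.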