CostComplexityPruneTool - a class to prune a decision tree using the Cost Complexity method (see "Classification and Regression Trees" by Leo Breiman et al.)

Some definitions:

T_max - the initial, usually highly overtrained tree that is to be pruned back
R(T) - quality index (Gini, misclassification rate, or other) of a tree T
~T - set of terminal nodes in T
T' - the pruned subtree of T_max that has the best quality index R(T')
alpha - the prune strength parameter in Cost Complexity pruning (R_alpha(T) = R(T) + alpha*|~T|)

There are two running modes in CostComplexityPruneTool:

(i) one may select a prune strength and prune the tree T_max until the criterion

    alpha < (R(T) - R(t)) / (|~T_t| - 1)

is true for all nodes t in T, or

(ii) the algorithm finds the sequence of critical points alpha_k < alpha_k+1 < ... < alpha_K such that T_K = root(T_max) and then selects the optimally pruned subtree, defined to be the subtree with the best quality index for the validation sample.
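Running mode (i) can be illustrated with a small, self-contained sketch. It is not the TMVA implementation: the `Node` type and the functions `countLeaves`, `subtreeR`, `criticalAlpha`, `pruneToStrength` and `makeExampleTree` are hypothetical names introduced for illustration only. The sketch assumes the textbook form of the critical prune strength from Breiman et al., g(t) = (R(t) - R(T_t)) / (|~T_t| - 1), where T_t is the branch rooted at node t and R(T_t) is the sum of R over its terminal nodes, and it assumes a binary tree in which every internal node has two children.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <utility>

// Hypothetical stand-in for a decision tree node (not TMVA::DecisionTreeNode).
struct Node {
    double r;  // R(t): quality index of this node if it were a terminal node
    std::unique_ptr<Node> left, right;

    explicit Node(double r_, std::unique_ptr<Node> l = nullptr,
                  std::unique_ptr<Node> rt = nullptr)
        : r(r_), left(std::move(l)), right(std::move(rt)) {}

    bool isLeaf() const { return !left && !right; }
};

// |~T_t|: number of terminal nodes in the branch rooted at t.
std::size_t countLeaves(const Node& t) {
    if (t.isLeaf()) return 1;
    return countLeaves(*t.left) + countLeaves(*t.right);
}

// R(T_t): sum of R over the terminal nodes of the branch rooted at t.
double subtreeR(const Node& t) {
    if (t.isLeaf()) return t.r;
    return subtreeR(*t.left) + subtreeR(*t.right);
}

// Critical prune strength g(t) = (R(t) - R(T_t)) / (|~T_t| - 1).
// Assumption: textbook definition; only defined for internal nodes.
double criticalAlpha(const Node& t) {
    assert(!t.isLeaf());
    return (t.r - subtreeR(t)) / static_cast<double>(countLeaves(t) - 1);
}

// Find the internal node with the smallest critical alpha (the "weakest link").
void findWeakest(Node& t, Node*& weakest, double& minAlpha) {
    if (t.isLeaf()) return;
    const double a = criticalAlpha(t);
    if (a < minAlpha) { minAlpha = a; weakest = &t; }
    findWeakest(*t.left, weakest, minAlpha);
    findWeakest(*t.right, weakest, minAlpha);
}

// Collapse weakest links until alpha < g(t) holds for every internal node t.
void pruneToStrength(Node& root, double alpha) {
    while (true) {
        Node* weakest = nullptr;
        double minAlpha = std::numeric_limits<double>::infinity();
        findWeakest(root, weakest, minAlpha);
        if (!weakest || alpha < minAlpha) break;  // criterion satisfied everywhere
        weakest->left.reset();                    // turn the weakest link into a leaf
        weakest->right.reset();
    }
}

// Small three-leaf example tree with made-up R values.
std::unique_ptr<Node> makeExampleTree() {
    auto inner = std::make_unique<Node>(0.2, std::make_unique<Node>(0.1),
                                        std::make_unique<Node>(0.05));
    return std::make_unique<Node>(0.5, std::move(inner), std::make_unique<Node>(0.2));
}
```

With the made-up values in `makeExampleTree`, the inner node has a critical alpha of about 0.05 and the root about 0.075, so a prune strength between these two values collapses only the inner branch, while a sufficiently large prune strength collapses the tree to its root.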
virtual | ~CostComplexityPruneTool() |
virtual TMVA::PruningInfo* | CalculatePruningInfo(TMVA::DecisionTree* dt, const TMVA::IPruneTool::EventSample* testEvents = __null, Bool_t isAutomatic = kFALSE) |
TMVA::CostComplexityPruneTool | CostComplexityPruneTool(TMVA::SeparationBase* qualityIndex = __null) |
TMVA::CostComplexityPruneTool | CostComplexityPruneTool(const TMVA::CostComplexityPruneTool&) |
Double_t | TMVA::IPruneTool::GetPruneStrength() const |
TMVA::IPruneTool | TMVA::IPruneTool::IPruneTool() |
TMVA::IPruneTool | TMVA::IPruneTool::IPruneTool(const TMVA::IPruneTool&) |
Bool_t | TMVA::IPruneTool::IsAutomatic() const |
TMVA::CostComplexityPruneTool& | operator=(const TMVA::CostComplexityPruneTool&) |
void | TMVA::IPruneTool::SetAutomatic() |
void | TMVA::IPruneTool::SetPruneStrength(Double_t alpha) |
void | InitTreePruningMetaData(TMVA::DecisionTreeNode* n) |
TMVA::MsgLogger& | Log() const |
void | Optimize(TMVA::DecisionTree* dt, Double_t weights) |
Double_t | TMVA::IPruneTool::B | |
Double_t | TMVA::IPruneTool::S | |
Double_t | TMVA::IPruneTool::fPruneStrength |
TMVA::MsgLogger* | fLogger | ! output stream to save logging information |
Int_t | fOptimalK | ! the optimal index of the prune sequence |
vector<TMVA::DecisionTreeNode*> | fPruneSequence | ! map of weakest links (i.e., branches to prune) -> pruning index |
vector<Double_t> | fPruneStrengthList | ! map of alpha -> pruning index |
vector<Double_t> | fQualityIndexList | ! map of R(T) -> pruning index |
TMVA::SeparationBase* | fQualityIndexTool | ! the quality index used to calculate R(t), R(T) = sum[t in ~T]{ R(t) } |
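
The three lists above are indexed by the same pruning index, and fOptimalK picks one entry of that sequence. The following is a minimal sketch of how such a selection could look, not the TMVA code itself. The function name `selectOptimalK` is hypothetical, and the sketch assumes that the "best" quality index is the smallest one (as it would be for an impurity or misclassification rate); if a larger value is better for the chosen quality index, the comparison has to be inverted.

```cpp
#include <cstddef>
#include <vector>

// Return the pruning index k with the smallest quality index,
// or -1 if the prune sequence is empty.
// Assumption: lower quality index means a better pruned subtree.
int selectOptimalK(const std::vector<double>& qualityIndexList) {
    int best = -1;
    for (std::size_t k = 0; k < qualityIndexList.size(); ++k) {
        if (best < 0 || qualityIndexList[k] < qualityIndexList[best]) {
            best = static_cast<int>(k);
        }
    }
    return best;
}
```

Given the selected index, the matching prune strength would then be read from the list of alpha values at the same position, and the weakest links up to that index would be the branches to prune.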
Calculate the prune sequence for a given tree.
Set the meta data used for cost complexity pruning.