Plot overlayed or stacked MC histograms with DATA on top * The same binning for all histograms is required * Until now only 1d histograms are supported For drawing the histograms in grouped/ungrouped mode use SetGroupHistograms(). All MC stack histograms can be unified into a single-coloured histogram by SetUnifyMCStack(). The total statistical uncertainty of the MC can be shown by using SetDrawMCError(). The errors are displayed as a rectangular hatched area on top of each bin. Usually the error per bin for the DATA histogram as well as for the total MC stack is computed as sqrt(sum of squares of weight) for each bin, which is ROOT's standard error handling for histograms. For verly low entry numbers inside a bin such kind of errors are not valid any more since the error function becomes more and more poissonic and the entry number is bound to be larger equal than zero which results in asymmetric error bars and in non-zero errors for bins without any entry. To obtain a better description, for entry numbers less then 33 the upper and lower confidence limits are estimated for a confidence level of 68% by the method of Feldman and Cousins (see the class TFeldmanCousins or the Statistics Review of the Particle Data Group). This method provides a smooth transition from the symmetric gaussian errors for large entry numbers to the completely asymmetric errors for zero-entry bins. This gives a realistic description of the uncertainties, however it slows down the speed of drawing the histograms. The limitt of 33 entries was chosen to have a fair trade-off between a good description and the time needed for drawing. Fitter: The plot gives the possibility to fit the MC distributions to the data distribution. This is done using the TFractionFitter class contained in ROOT. Starting from your plot object HepDataMCPlot *pl, the fit can be easily performed. First start with: pl->PrepareFit(); This creates the needed HepTemplate objects. Folders are treated as one template per default. If you want to have all samples as single templates, use pl->PrepareFit(kTRUE); If you want you can specify start values for the template fractions pl->SetTemplateStartValue(template_name,initValue,initErr); and/or bounds: pl->SetTemplateBounds(template_name,low,up); The functions pl->ListFolders() and pl->ListSingleSamples() list all folders and single samples contained in the plot. Their names are used for the naming of the template objects. The fit is started by: pl->Fit(); The function returns the value 0, if the fit converged. You can list all obtained fractions by pl->ListTemplateFractions(); or get a fraction of a specific template: pl->GetTemplateFitFraction(template_name); Note that the fit uses only non-empty histograms. Templates for empty histograms are not created. Finally you can plot the data together with the fitted MC as stacked histogram by: pl->DrawFit(); Author: Oliver Maria Kind <mailto: kind@mail.desy.de> Update: $Id: HepDataMCPlot.cxx,v 1.73 2017/05/12 13:38:47 kaphle Exp $ Copyright: 2009 (C) Oliver Maria Kind
virtual void | TObject::DoError(int level, const char* location, const char* fmt, va_list va) const |
void | TObject::MakeZombie() |
void | AddOverflowUnderflowBins(TH1* hist) |
TH1F* | CompareHistograms(TH1* hObs = 0, TH1* hExp = 0, bool neglectUncertainty = false, bool variableBinning = false, TH1* hPull = 0) |
void | DrawDataMCRatio() |
void | DrawNoStack(Option_t* option = "") |
void | DrawSignificance() |
void | DrawStack(Option_t* option = "") |
Double_t | pja_normal_quantile(double p) |
Double_t | pValuePoisson(unsigned int nObs, double nExp) |
Double_t | pValuePoissonError(unsigned int nObs, double E = 1, double V = 1) |
Double_t | pValueToSignificance(double p, bool excess = true) |
void | SetColorsNoStack() |
void | SetColorsStack() |
TGraphAsymmErrors* | SetErrors(TH1F* FromHist, Bool_t IsData, Option_t* option = "") |
void | SetupAxis() |
void | SetupPad() |
void | SetXRange(TPad* pad, Double_t xmin, Double_t xmax) |
void | SetYRange(TPad* pad, Double_t ymin, Double_t ymax) |
static TObject::(anonymous) | TObject::kBitMask | |
static TObject::EStatusBits | TObject::kCanDelete | |
static TObject::EStatusBits | TObject::kCannotPick | |
static TObject::EStatusBits | TObject::kHasUUID | |
static TObject::EStatusBits | TObject::kInvalidObject | |
static TObject::(anonymous) | TObject::kIsOnHeap | |
static TObject::EStatusBits | TObject::kIsReferenced | |
static TObject::EStatusBits | TObject::kMustCleanup | |
static TObject::EStatusBits | TObject::kNoContextMenu | |
static TObject::(anonymous) | TObject::kNotDeleted | |
static TObject::EStatusBits | TObject::kObjInCanvas | |
static TObject::(anonymous) | TObject::kOverwrite | |
static TObject::(anonymous) | TObject::kSingleKey | |
static TObject::(anonymous) | TObject::kWriteDelete | |
static TObject::(anonymous) | TObject::kZombie |
TString | TNamed::fName | object identifier |
TString | TNamed::fTitle | object title |
Float_t | fAddRange | Extra space added when setting axis limit (default: 1.1) |
Float_t | fAtlasLabelPosX | X position of the Atlas label |
Float_t | fAtlasLabelPosY | Y position of the Atlas label |
TString | fAtlasLabelStatus | Status label (example: "work in progress") |
Float_t | fCMSEnergyLabelPosX | X position of the cms energy label |
Float_t | fCMSEnergyLabelPosY | Y position of the cms energy label |
TCanvas* | fCanvas | ! Pointer to the canvas (if plot is drawn in a canvas) |
TString | fCenterOfMassEnergyLabel | Character string which denotes the centre of mass energy (example: "13") |
Bool_t | fDataMCRatioCenterOne | Switch for drawing data/mc ratio centered at one |
Bool_t | fDrawCMSLumiAtlasLabel | Draw Labels,if kTRUE |
Bool_t | fDrawData | Draw DATA histogram (default=on) |
Bool_t | fDrawDataErrorX | ! Draw x-errors for data histogram (default=on) |
Bool_t | fDrawDataMCRatio | Switch for drawing data/mc ratio |
Bool_t | fDrawDataZeroEntries | ! Draw markers for bins with no data |
Bool_t | fDrawDataZeroEntryErrors | Draw the Data error bars even for 0 data entry bins |
Bool_t | fDrawMCError | Draw the total MC uncertainty on top |
Bool_t | fDrawNormError | Include the normalization error when computing the total uncertainty |
Bool_t | fDrawSignalOverlay | Flag for drawing the signal MC (assumed to be the least entry in the MC histogram stack) as overlay instead as part of the stack |
Bool_t | fDrawSignificance | Switch for drawing significance plot |
Bool_t | fFirstDraw | ! Drawing the first time ? |
Bool_t | fFirstLegend | ! Drawing the legend the first time ? |
TFractionFitter* | fFitter | ! TFractionFitter instance. |
TGraphAsymmErrors* | fGraphDATA | ! Graph for drawing data |
Bool_t | fGroupHistograms | Group histograms when drawing and in the legend |
TH1F* | fHistDATA | Pointer to DATA histograms for fast access |
TH1F* | fHistMCTop | ! Pointer to top-most MC histogram of the stack |
TH1* | fHistogramLastInStack | ! Pointer to the last histogram inside the MC stack (supposed to be tha MC signal) |
TH1* | fHistogramMainPad | ! Pointer to the histogram used for drawing the axis of the stacked plot |
TH1* | fHistogramRatioPad1 | ! Pointer to the histogram used for drawing the axis the first ratio plot |
TH1* | fHistogramRatioPad2 | ! Pointer to the histogram used for drawing the axis second ratio plot |
TLegend* | fLegend | Legend |
TLegend* | fLegendLarge | Large legend |
Float_t | fLumiDATA | Luminosity of data sample |
TString | fLuminosityLabel | string which denotes the luminosity (example: "4.7 pb^{-1}") |
Float_t | fLuminosityLabelPosX | X position of the luminosity label |
Float_t | fLuminosityLabelPosY | Y position of the luminosity label |
TGraphAsymmErrors* | fMCErrors | Graph for drawing the MC error band |
TList* | fMCFitTemplates | List of MC templates given to the fitter |
THStack* | fMCFittedHistStack | MC HStack object for fitted MC histograms |
TList* | fMCFolders | List of folders to group MC samples |
THStack* | fMCHistStack | MC HStack object |
TList* | fMCSingleSamples | List of single MC samples, not belonging to folder |
TPad* | fMainPad | ! Pointer to the pad of the stacked plot |
Int_t | fN_MCHistograms | No. of MC histograms (used for histogram id) |
Float_t | fRatioHeight | Height of Ratio plots w.r.t stack plot ([0,1], default: 0.4) |
TLine* | fRatioLine | ! Line for Ratio plots indicating 0 or 1 |
TPad* | fRatioPad1 | ! Pointer to the pad of the first ratio plot |
TPad* | fRatioPad2 | ! Pointer to the pad of the second ratio plot |
Float_t | fScaleOverlay | Scale factor of the overlay histogram |
Bool_t | fUnifyMCStack | Draw MC stack histograms as one |
Bool_t | fUseAtlasStyle | Use ATLAS style if it is true |
Bool_t | fUseOverflowUnderflow | Add overflow in last bin and underflow in first bin |
Bool_t | fUseShortCMSLumiLabel | ! Use short label for CMS + Lumi |
TString* | fXTitle | Title of x-axis |
Bool_t | fYRangeUser | Flag if user defined y range should be used |
Double_t | fYRangeUserMax | Y-axis limit (max) |
Double_t | fYRangeUserMin | Y-axis limit (min) |
TString* | fYTitle | Title of y-axis |
static const Double_t | fgFeldmanCousinsCL | ! Confidence level for error calculation |
static const Int_t | fgFeldmanCousinsMaxEntries | ! Separates Feldman-Cousins error |
static const Color_t | fgUnifyColor | ! Color used for unify the MC stack |
Inheritance Chart: | |||||||||||||
|
Add grouped MC histogram to stack Grouped legend ?
Draw this object - by default, two pads are drawn, one for the stack and one for the significance. Best results are obtained if drawn in a Canvas of specified height and width (with default config) width height No ratio: 700 500 1 ratio: 700 633 2 ratio: 700 765 Options available: "NOSTACK" - Overlay MC histograms rather than stack them "NOLEG" - Do not draw a legend "NORM" - Normalize the plots to Integral of 1 in case of "nostack" option. Use the option like hist->Draw("nostack,norm"). !!!ATTENTION!!! The histograms are scaled after the option "norm". You have to close root and restart it to get again the unnormalized plots
Strip filesystem path from the full path of the given directory
Set range of y-axis Needs to be called before plot is drawn!
Set range of y-axis of ratio plot Needs to be called before plot is drawn!
Draw legend In case of gswitched on grouping of histograms only the group entries but not the entries inside the groupe will be displayed
Draw the Data error bars even for 0 data entry bins
Add overflow to last bin and underflow to first bin
Internal helper function. Creates the graph for drawing (asymmetric) errors. For large entries numbers the errors are taken from the given histogram (as well as the xy-positions). For low numbers (<33) the errors are estimated by the FeldmanCousins method.
Computes upper and lower errors for given number of events in bin of histogram h. For low numbers (<33) the errors are estimated by the FeldmanCousins method.
Compute the normalization uncertainty per bin For each grouped template an uncertainty must be provided
Prepares fit using TFractionFitter HepTemplate objects are created from the list of MC samples. Per default, sample groups are treated as one template. Set single_samples to TRUE, if you want to fit each single sample. If prepare_effs is set to TRUE, cut efficiencies are calculated and assigned to the template objects. For that, it is assumed that the number of entries in the fit histogram is equal to the number of events after all cuts applied. The denominator for the efficiencies is taken from the event info histogram. In case of folders a weighted sum is calculated. The efficiencies are probably needed to calculate cross sections from the fit results.
Perform the fit Create TFractionFitter instance. Start values and constraints given to template objects are set and the fit is started. Returns the return value from TFractionFitter.
Get template fraction for specific template
Get template fraction error for specific template
Get template cut efficiency for specific template
Performs a rebinning of the stacked plot. Rebinning is applied to all histograms (data and MC).
Scales only a group of histograms by the given scale
Draw ATLAS + label Functions taken from official ATLAS style Adjust position in case user adds more space to the right margin x += 0.05 - gStyle->GetPadRightMargin();
Draw latex text on given position Functions taken from official ATLAS style
Saves group-histograms and single sample histograms in a
given file.
The name of the histograms consists of the folder/sample name
and a suffix, e.g. "ttbar_jesup"
Saves group-histograms and single sample histograms and
returns a list of histograms
The name of the histograms consists of the folder/sample name
and a suffix, e.g. "ttbar_jesup"
Draw MC histograms in stacked mode In case of stacked histogram mode draw the histograms in filled mode with their fill color. Use the histogram line colour of the current style for drawing the histogram outline
Setup colour scheme for stacked mode In case of grouped drawing mode set the line and fill attributes of the grouped histograms first. The histograms within one group get all the same colour for filling and the histogram line so that they appear as a single histogram
Draw Data/MC ratio The data/mc ratio is plotted with dotted markers and their uncertainty corresponds to the relative uncertainty from data. Furthermore the uncertainty on the prediction (i.e. MC/MC) is shown as shaded uncertainty band.
Setup axis labels and axis range Title size needs to be rescaled when using ratio plots. Size depends whether axis is redrawn in the ratio pad or main pad.
Given two ROOT histograms (with the same binning!) containing the observed and expected counts, create and return a histogram showing the significance of their bin-to-bin discrepancies. If the histogram representing the expectation (second input parameter) has non-zero bin "errors", these are considered the standard deviations representing the full uncertainty and the significance is computed accordingly, unless this is disabled (third parameter). The last input pointer is the pull histogram, which is filled with all z-values, without discarding the bins for which the p-value is larger than 0.5: in case of ideal match, the pulls follow a standard normal distribution. A typical binnin for the pulls is from -5 to +5 with 20 bins. Code from "Plotting the Differences Between Data and Expectation" by Georgios Choudalakis and Diego Casadei Eur. Phys. J. Plus 127 (2012) 25 http://dx.doi.org/10.1140/epjp/i2012-12025-y (http://arxiv.org/abs/1111.2062) This code is covered by the GNU General Public License: http://www.gnu.org/licenses/gpl.html Diego Casadei <casadei@cern.ch> 7 Dec 2011
Normal quantile computed following Peter John Acklam's pseudo-code algorithm for rational approximation (pjacklam@online.no) http://home.online.no/~pjacklam/notes/invnorm The algorithm below assumes p is the input and x is the output. Coefficients in rational approximations. a(1) <- -3.969683028665376e+01 a(2) <- 2.209460984245205e+02 a(3) <- -2.759285104469687e+02 a(4) <- 1.383577518672690e+02 a(5) <- -3.066479806614716e+01 a(6) <- 2.506628277459239e+00 b(1) <- -5.447609879822406e+01 b(2) <- 1.615858368580409e+02 b(3) <- -1.556989798598866e+02 b(4) <- 6.680131188771972e+01 b(5) <- -1.328068155288572e+01 c(1) <- -7.784894002430293e-03 c(2) <- -3.223964580411365e-01 c(3) <- -2.400758277161838e+00 c(4) <- -2.549732539343734e+00 c(5) <- 4.374664141464968e+00 c(6) <- 2.938163982698783e+00 d(1) <- 7.784695709041462e-03 d(2) <- 3.224671290700398e-01 d(3) <- 2.445134137142996e+00 d(4) <- 3.754408661907416e+00 Define break-points. p_low <- 0.02425 p_high <- 1 - p_low Rational approximation for lower region. if 0 < p < p_low q <- sqrt(-2*log(p)) x <- (((((c(1)*q+c(2))*q+c(3))*q+c(4))*q+c(5))*q+c(6)) ((((d(1)*q+d(2))*q+d(3))*q+d(4))*q+1) endif Rational approximation for central region. if p_low <= p <= p_high q <- p - 0.5 r <- q*q x <- (((((a(1)*r+a(2))*r+a(3))*r+a(4))*r+a(5))*r+a(6))*q (((((b(1)*r+b(2))*r+b(3))*r+b(4))*r+b(5))*r+1) endif Rational approximation for upper region. if p_high < p < 1 q <- sqrt(-2*log(1-p)) x <- -(((((c(1)*q+c(2))*q+c(3))*q+c(4))*q+c(5))*q+c(6)) ((((d(1)*q+d(2))*q+d(3))*q+d(4))*q+1) endif The absolute value of the relative error (x_approx - x)/x is <= 1.15e-9 for all input values p which can be represented in IEEE double precision arithmetic. Code from "Plotting the Differences Between Data and Expectation" by Georgios Choudalakis and Diego Casadei Eur. Phys. J. Plus 127 (2012) 25 http://dx.doi.org/10.1140/epjp/i2012-12025-y (http://arxiv.org/abs/1111.2062) This code is covered by the GNU General Public License: http://www.gnu.org/licenses/gpl.html
p-value for Poisson distribution, no uncertainty on the parameter Diego Casadei <casadei@cern.ch> Oct 2011 Last update: 4 Nov 2011 (using incomplete Gamma from ROOT) Consider Poi(k|nExp) and compute the p-value which corresponds to the observation of nObs counts. When nObs > nExp there is an excess of observed events and p-value = P(n>=nObs|nExp) = \sum_{n=nObs}^{\infty} Poi(n|nExp) = 1 - \sum_{n=0}^{nObs-1} Poi(n|nExp) = 1 - e^{-nExp} \sum_{n=0}^{nObs-1} nExp^n / n! Otherwise (nObs <= nExp) there is a deficit and p-value = P(n<=nObs|nExp) = \sum_{n=0}^{nObs} Poi(n|nExp) = e^{-nExp} \sum_{n=0}^{nObs} nExp^n / n! Code from "Plotting the Differences Between Data and Expectation" by Georgios Choudalakis and Diego Casadei Eur. Phys. J. Plus 127 (2012) 25 http://dx.doi.org/10.1140/epjp/i2012-12025-y (http://arxiv.org/abs/1111.2062) This code is covered by the GNU General Public License: http://www.gnu.org/licenses/gpl.html
p-value for Poisson distribution when there is uncertainty on the
parameter
Diego Casadei <casadei@cern.ch> 6 Nov 2011
Last update: 3 Mar 2012 (logarithms used only for big numbers)
Consider Poi(k|nExp) and compute the p-value which corresponds to
the observation of nObs counts, in the case of uncertain nExp whose
variance is provided.
The prior for nExp is a Gamma density which matches the expectation
and variance provided as input. The marginal model is provided by
the Poisson-Gamma mixture, which is used to compute the p-value.
Gamma density: the parameters are
* a = shape param [dimensionless]
* b = rate param [dimension: inverse of x]
nExp ~ Ga(x|a,b) = [b^a/Gamma(a)] x^{a-1} exp(-bx)
One has E[x] = a/b and V[x] = a/b^2 hence
* b = E/V
* a = E*b
The integral of Poi(n|x) Ga(x|a,b) over x gives the (marginal)
probability of observing n counts as
b^a [Gamma(n+a) / Gamma(a)]
P(n|a,b) = -----------------------------
n! (1+b)^{n+a}
When nObs > nExp there is an excess of observed events and
p-value = P(n>=nObs) = \sum_{n=nObs}^{\infty} P(n)
= 1 - \sum_{n=0}^{nObs-1} P(n)
Otherwise (nObs <= nExp) there is a deficit and
p-value = P(n<=nObs) = \sum_{n=0}^{nObs} P(n)
To compute the sum, we use the following recurrent relation:
P(n=0) = [b/(1+b)]^a
P(n=1) = [b/(1+b)]^a a/(1+b) = P(n=0) a/(1+b)
P(n=2) = [b/(1+b)]^a a/(1+b) (a+1)/[2(1+b)] = P(n=1) (a+1)/[2(1+b)]
... ...
P(n=k) = P(n=k-1) (a+k-1) / [k(1+b)]
and to avoid rounding errors, we work with logarithms.
Code from
"Plotting the Differences Between Data and Expectation"
by Georgios Choudalakis and Diego Casadei
Eur. Phys. J. Plus 127 (2012) 25
http://dx.doi.org/10.1140/epjp/i2012-12025-y
(http://arxiv.org/abs/1111.2062)
This code is covered by the GNU General Public License:
http://www.gnu.org/licenses/gpl.html
Convert a p-value into a right-tail normal significance, i.e. into
the number of Gaussian standard deviations which correspond to it.
Code from
"Plotting the Differences Between Data and Expectation"
by Georgios Choudalakis and Diego Casadei
Eur. Phys. J. Plus 127 (2012) 25
http://dx.doi.org/10.1140/epjp/i2012-12025-y
(http://arxiv.org/abs/1111.2062)
This code is covered by the GNU General Public License:
http://www.gnu.org/licenses/gpl.html
Optimize binning for given HepDataMCPlot (this plot should have many bin (~100 - ~1000) To optimize the binning, the quantity Z_ijwith
is computed.
and
are parameters for adjusting the number of bins. The binning is determined in a iterative procedure. Starting with
, then 1) Compute
2) Check if
2.a) Condition is true -> Merge all bins from i to j Set j = i-1 and go back to 1) 2.b) Condition is false -> Reduce i by one and go back to 1). Repeat this until i == 1.
Scale factor of the overlay histogram (see SetDrawSignalOverlay())
Flag for drawing the signal MC (assumed to be the least entry in the MC histogram stack) as overlay instead as part of the stack
Flag for drawing the signal MC (assumed to be the least entry in the MC histogram stack) as overlay instead as part of the stack