RooFit
New infrastructure for toy MC studies
A new class RooStudyManager has been added that is intended
to replace the present RooMCStudy framework for toy MC
studies on the time scale of ROOT release 5.26.
The present RooMCStudy is a small monolithic driver to
execute 'generate-and-fit' style MC studies for a given pdf. It
provides some room for customization, through modules inheriting from
RooAbsMCStudyModule that can modify the standard behavior, but its
design limits the amount of flexibility.
In the new RooStudyManager design, the functionality of
RooMCStudy has been split into two classes: class
RooStudyManager which manages the logistics of running
repetitive studies and class RooGenFitStudy which implements
the functionality of the 'generate-and-fit'-style study of RooMCStudy.
The new design has two big advantages:
- Complete freedom in the design of studies and of the data that they return, either by tailoring the behavior of RooGenFitStudy or
by using another study module that inherits from RooAbsStudy.
- More flexibility in the mode of execution. The new study manager can execute all study
modules inline (as was done in RooMCStudy), but also parallelized through PROOF (at present
only PROOF-lite is supported), as well as in batch.
The code fragment below illustrates the use of the new study manager:
// Create workspace with p.d.f
RooWorkspace* ww = new RooWorkspace("ww") ;
ww->factory("Gaussian::g(x[-10,10],mean[-10,10],sigma[3,0.1,10])") ;
RooGenFitStudy gfs ;
gfs.setGenConfig("g","x",NumEvents(1000)) ;
gfs.setFitConfig("g","x",PrintLevel(-1)) ;
RooStudyManager mgr(*ww,gfs) ;
mgr.run(1000) ; // execute 1000 toys inline
mgr.runProof(10000,"") ; // execute 10000 toys through PROOF-lite
gfs.summaryData()->Print() ;
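To make the 'generate-and-fit' idea concrete without depending on ROOT, the following is a minimal standalone sketch in plain C++. It is not RooFit code: the names ToyResult, runToy and runStudy are hypothetical, and the "fit" is simply the sample mean, which is the maximum-likelihood estimate of a Gaussian mean under the assumption that sigma is known. Each toy generates a sample, fits it, and records the pull, which is the kind of summary data a study module would return.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Result of one toy: fitted mean, its estimated error, and the pull.
struct ToyResult {
    double fittedMean;
    double error;
    double pull;
};

// Generate nEvents Gaussian events and "fit" the mean.
// Assumption: sigma is known, so the ML estimate of the mean is the
// sample mean, with error sigma / sqrt(nEvents).
ToyResult runToy(std::mt19937& rng, double trueMean, double sigma, int nEvents) {
    assert(nEvents > 0 && sigma > 0.0);
    std::normal_distribution<double> gauss(trueMean, sigma);
    double sum = 0.0;
    for (int i = 0; i < nEvents; ++i) sum += gauss(rng);
    const double mean = sum / nEvents;
    const double err = sigma / std::sqrt(static_cast<double>(nEvents));
    return {mean, err, (mean - trueMean) / err};
}

// Run nToys independent toys one after another, which is the role a
// study manager plays when it executes a study inline.
std::vector<ToyResult> runStudy(int nToys, double trueMean, double sigma,
                                int nEvents, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<ToyResult> results;
    results.reserve(nToys);
    for (int i = 0; i < nToys; ++i)
        results.push_back(runToy(rng, trueMean, sigma, nEvents));
    return results;
}
```

For an unbiased fit with correct errors, the pulls collected by runStudy should be distributed approximately as a unit Gaussian. Because the toys are independent, the loop in runStudy is also the part that can be distributed over parallel workers.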
Workspace and factory improvements
The workspace class RooWorkspace has been augmented with several
new features:
- The import() method now supports a new argument RenameAllVariablesExcept(const char* suffix, const char* keepList) which
will rename all variables of the imported function by extending them with a supplied suffix,
except for a given list of variables, which are not renamed.
- A new utility function importFromFile() has been added, which is similar to import(), except that it takes a string
specifier for the object to be imported rather than a reference. The string is expected to be of the form
fileName:workspaceName:objectName and simplifies import of objects from other workspaces on file. The importFromFile()
method accepts all arguments accepted by the standard import() method.
- Generic objects (inheriting from TObject) can now also be stored in the workspace under an alias name, rather
than under their own name, which simplifies management of objects of types like TMatrixD that do not have a settable name.
ws.import(matrix,"cov_matrix") ;
- New accessors have been added that return a RooArgSet of all elements of the workspace of a given type, e.g.
allVars(), allPdfs().
- The Print() method now accepts option "t", which prints the contents tree-style instead of as a flat list of components,
as illustrated below
*** Print() ***
p.d.f.s
-------
RooProdPdf::bkg[ ptBkgPdf * mllBkgPdf * effBkgPdf|pt ] = 0.267845
RooEfficiency::effBkgPdf[ cat=cut effFunc=effBkg ] = 0.76916
RooEfficiency::effSigPdf[ cat=cut effFunc=effSig ] = 0.899817
RooAddPdf::genmodel[ Nsig * sig + Nbkg * bkg ] = 0.502276
RooPolynomial::mllBkgPdf[ x=mll coefList=(mbkg_slope) ] = 0.775
RooGaussian::mllSigPdf[ x=mll mean=msig_mean sigma=msig_sigma ] = 1
RooExponential::ptBkgPdf[ x=pt c=pbkg_slope ] = 0.449329
RooExponential::ptSigPdf[ x=pt c=psig_slope ] = 0.818731
RooProdPdf::sig[ ptSigPdf * mllSigPdf * effSigPdf|pt ] = 0.736708
functions
--------
RooFormulaVar::effBkg[ actualVars=(pt,ab,mb,sb) formula="0.5*@1*(1+TMath::Erf((@0-@2)/@3))" ] = 0.76916
RooFormulaVar::effSig[ actualVars=(pt,as,ms,ss) formula="0.5*@1*(1+TMath::Erf((@0-@2)/@3))" ] = 0.899817
*** Print("t") ***
p.d.f.s
-------
RooAddPdf::genmodel[ Nsig * sig + Nbkg * bkg ] = 0.502276
RooProdPdf::sig[ ptSigPdf * mllSigPdf * effSigPdf|pt ] = 0.736708
RooExponential::ptSigPdf[ x=pt c=psig_slope ] = 0.818731
RooGaussian::mllSigPdf[ x=mll mean=msig_mean sigma=msig_sigma ] = 1
RooEfficiency::effSigPdf[ cat=cut effFunc=effSig ] = 0.899817
RooFormulaVar::effSig[ actualVars=(pt,as,ms,ss) formula="0.5*@1*(1+TMath::Erf((@0-@2)/@3))" ] = 0.899817
RooProdPdf::bkg[ ptBkgPdf * mllBkgPdf * effBkgPdf|pt ] = 0.267845
RooExponential::ptBkgPdf[ x=pt c=pbkg_slope ] = 0.449329
RooPolynomial::mllBkgPdf[ x=mll coefList=(mbkg_slope) ] = 0.775
RooEfficiency::effBkgPdf[ cat=cut effFunc=effBkg ] = 0.76916
RooFormulaVar::effBkg[ actualVars=(pt,ab,mb,sb) formula="0.5*@1*(1+TMath::Erf((@0-@2)/@3))" ] = 0.76916
- The workspace factory can now access all objects in the generic object store of the workspace, e.g.
TMatrixDSym* cov
RooWorkspace w("w") ;
w.import(*cov,"cov") ;
w.factory("MultiVarGaussian::mvg({x[-10,10],y[-10,10]},{3,5},cov)") ;
- The workspace factory now correctly identifies and matches typedef-ed names in factory constructor
specifications.
- All objects created by the factory and inserted into the workspace get a string attribute "factory_tag"
that contains the reduced factory string that was used to create that object, e.g.
RooWorkspace w("w") ;
w.factory("Gaussian::g(x[-10,10],m[0],s[3])") ;
cout << w.pdf("g")->getStringAttribute("factory_tag") << endl ;
RooGaussian::g(x,m,s)
- Previously, any factory call that would create an object with the name of an object that already existed
resulted in an error. Now, this will only happen if the factory tag of the existing object is different
from the tag of the new object
w.factory("Gaussian::g(x[-10,10],m[0],s[3])") ;
w.factory("Chebychev::g(x[-10,10],{0,1,2})") ; // Now OK, x has identical spec, existing x will be used.
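The name-reuse rule described above can be illustrated with a small standalone sketch. This is not the RooWorkspace implementation: TagRegistry and its declare() method are hypothetical, and the tag strings used with it are arbitrary illustrations rather than the tags that the factory actually generates.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a name -> factory_tag bookkeeping table.
class TagRegistry {
public:
    enum class Outcome { Created, Reused, Conflict };

    // Declare an object by name with the tag describing how it is built.
    // A repeated declaration is accepted only if the tags are identical.
    Outcome declare(const std::string& name, const std::string& tag) {
        assert(!name.empty());
        auto it = tags_.find(name);
        if (it == tags_.end()) {
            tags_[name] = tag;          // first declaration: create
            return Outcome::Created;
        }
        // Same tag: reuse the existing object. Different tag: error.
        return it->second == tag ? Outcome::Reused : Outcome::Conflict;
    }

private:
    std::map<std::string, std::string> tags_;
};
```

Declaring the same name twice with an identical tag yields Reused, while a second declaration with a different tag yields Conflict, which corresponds to the error case that remains.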
Improvements to functions and pdfs
- Additions to, and reorganization of, the morphing operator classes.
The existing class RooLinearMorph, which
implements 'Alex Read' morphing, has been renamed RooIntegralMorph. A new class RooMomentMorph
has been added (contribution from Max Baak and Stefan Gadatsch) that implements a different morphing algorithm
based on shifting the mean and variance of the input pdfs. The new moment morphing class can also interpolate
between multiple input templates and works with multi-dimensional input pdfs. One of the appealing features
is that no expensive calculations are required to calculate the interpolated pdf shapes after the pdf
initialization. An extension that allows morphing in two parameters is foreseen for the next ROOT release.
- Progress indication in plot projections
RooAbsReal::plotOn() now accepts a new argument ShowProgress() that will print a dot for every
function evaluation performed in the process of creating the plot. This can be useful when plotting very expensive
functions such as profile likelihoods.
- Automatic handling of constraint terms
It is no longer necessary to add a Constrain() argument to fitTo() calls to have internal constraints
applied. Any pdf term appearing in a product that does not contain an observable and shares one or more parameters
with another pdf term in the same product that does contain an observable is automatically picked up as a constraint term.
For example, given a dataset D(x) which defines variable x as observable, the default logic works out as follows
F(x,a,b)*G(a,a0,a1) --> G is constraint term (a also appears in F(x))
F(x,a,b)*G(y,c,d) --> G is dropped (factorizing term)
A Constrain(y) argument in the above example will still force term G(y,c,d) to be interpreted as a constraint term.
- Automatic caching of numeric integral calculations
Integrals that require numeric integration in two or more dimensions are now automatically cached in the expensive object store.
The expensive object store allows such values to be cached between different instances of integral objects that represent the
same configuration. If integrals are created from an object (function or pdf) that lives in a RooWorkspace, the
expensive object cache of the workspace will be used instead of the global store instance, and values stored in the workspace
store will also be persisted if the workspace is persisted. The global caching behavior of integral objects can be
controlled through RooRealIntegral::setCacheAllNumeric(Int_t nDimNumMin).
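The default rule for automatic constraint terms described above can be written out as a short standalone sketch. This is an illustration of the stated logic only, not the RooFit implementation: Term, TermRole and classifyTerms are hypothetical names, each term is reduced to the set of variable names it depends on, and the explicit Constrain() override is not modelled.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class TermRole { Regular, Constraint, Dropped };

// A pdf term in a product, reduced to the variables it depends on.
struct Term {
    std::string name;
    std::set<std::string> vars;
};

// True if the two sets have at least one element in common.
static bool intersects(const std::set<std::string>& a,
                       const std::set<std::string>& b) {
    for (const auto& v : a)
        if (b.count(v)) return true;
    return false;
}

// Classify every term of a product given the observables of the dataset.
std::vector<TermRole> classifyTerms(const std::vector<Term>& product,
                                    const std::set<std::string>& observables) {
    assert(!observables.empty());
    // Collect the parameters of all terms that contain an observable.
    std::set<std::string> paramsOfObsTerms;
    for (const auto& t : product) {
        if (!intersects(t.vars, observables)) continue;
        for (const auto& v : t.vars)
            if (!observables.count(v)) paramsOfObsTerms.insert(v);
    }
    std::vector<TermRole> roles;
    for (const auto& t : product) {
        if (intersects(t.vars, observables))
            roles.push_back(TermRole::Regular);      // depends on data
        else if (intersects(t.vars, paramsOfObsTerms))
            roles.push_back(TermRole::Constraint);   // shares a parameter
        else
            roles.push_back(TermRole::Dropped);      // factorizing term
    }
    return roles;
}
```

With observables {x}, the product F(x,a,b)*G(a,a0,a1) classifies G as Constraint, while F(x,a,b)*G(y,c,d) classifies G as Dropped, matching the two example lines in the text.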
Miscellaneous improvements data classes
- The RooAbsData::tree() method has been restored. It will only return a TTree* pointer for datasets
that are based on a RooTreeDataStore implementation, i.e. not for the composite datasets mentioned below.
- A new composite data storage class RooCompositeDataStore has been added that allows composite
RooDataSet objects to be constructed without copying the input data.
// Make 2 input datasets and an index category
// (kTRUE exports the workspace contents to the interpreter namespace w::)
RooWorkspace w("w",kTRUE) ;
w.factory("Gaussian::g(x[-10,10],m[-10,10],s[3,0.1,10])") ;
w.factory("Uniform::u(x)") ;
w.factory("index[S,B]") ;
RooDataSet* d1 = w::g.generate(w::x,1000) ;
RooDataSet* d2 = w::u.generate(w::x,1000) ;
// Make monolithic composite dataset (copies input data)
RooDataSet d12("d12","d12",w::x,Index(w::index),Import("S",*d1),Import("B",*d2)) ;
//-----------------------------------------------------------------------------
// NEW: make virtual composite dataset (input data is linked, no data is copied)
RooDataSet d12a("d12a","d12a",w::x,Index(w::index),Link("S",*d1),Link("B",*d2)) ;
//-----------------------------------------------------------------------------
// Fit composite dataset to dummy model
w.factory("SUM::model(fsig[0,1]*g,u)") ;
w::model.fitTo(d12a) ;
For virtual composite datasets it is also possible to join a mix of binned and unbinned datasets
(represented as a RooDataSet with weights).
- The setWeightVar() method has been deprecated as it is very difficult to support on-the-fly redefinition
of the event weight variable in the new data store scheme. To declare a data set weighted,
use the WeightVar() modifier of the constructor instead, e.g.:
RooDataSet wdata("wdata","wdata",RooArgSet(x,y,wgt),WeightVar(wgt)) ;
- The RooHist class that represents data as a histogram in a RooPlot has been modified
so that it can show approximate Poisson errors for non-integer data. These approximate
errors are calculated by interpolating the error bars of the nearest integers. NB: A weighted dataset
plotted with RooAbsData::plotOn() will by default show sum-of-weights-squared errors. Only
when Poisson errors are forced through a DataError(RooAbsData::Poisson) argument are these
approximate Poisson error bars shown.
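The interpolation of error bars for non-integer bin contents mentioned above can be sketched as follows. This is a standalone illustration under two assumptions that are not stated in the text: the interpolation is taken to be linear, and the error bars for integer counts are supplied by the caller. The names ErrorBar, interpolatedError and sqrtError are hypothetical, and sqrtError is only a simple stand-in, not a proper Poisson interval.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>

// (lower, upper) error bar lengths for a bin content.
using ErrorBar = std::pair<double, double>;

// Error bar for a non-integer content n, obtained by linear interpolation
// (an assumption) between the error bars of floor(n) and floor(n) + 1.
ErrorBar interpolatedError(double n,
                           const std::function<ErrorBar(int)>& integerError) {
    assert(n >= 0.0);
    const int lo = static_cast<int>(std::floor(n));
    const double frac = n - lo;
    const ErrorBar eLo = integerError(lo);
    if (frac == 0.0) return eLo;             // integer content: no interpolation
    const ErrorBar eHi = integerError(lo + 1);
    return {eLo.first + frac * (eHi.first - eLo.first),
            eLo.second + frac * (eHi.second - eLo.second)};
}

// Stand-in for illustration only: symmetric sqrt(n) error for integer n.
ErrorBar sqrtError(int n) {
    const double e = std::sqrt(static_cast<double>(n));
    return {e, e};
}
```

In a real application the callback would return the asymmetric Poisson interval for an integer count; the interpolation step itself is unchanged.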
Miscellaneous improvements other
- The RooFit message service class RooMsgService has been augmented with a stack that
can store its configuration state. A call to saveState() will save the
present configuration, which can be restored through a subsequent call to restoreState().
- In addition to the method RooAbsArg::printCompactTree(), which is mostly intended for
debugging, a new method RooAbsArg::printComponentTree() has been added that prints
the tree structure of a pdf in a more user-friendly, content-oriented way. The printing
of the leaf nodes (the variables) is omitted in this method to keep the output compact.
RooStats
This release contains significant bug fixes and it is strongly
recommended to update to this version if you are using an older one.
Major Changes in LimitCalculator and HypoTestCalculator classes: usage of ModelConfig class
- The RooStats calculator interfaces have been changed to use the ModelConfig class.
All the setter methods taking parameter lists, pdf instances and names have been removed from the interfaces.
The SetWorkspace(RooWorkspace &) method has also been removed, while a SetModel(const ModelConfig &)
function has been introduced.
- Users are expected to pass all the model information using the
ModelConfig class rather than via the
RooWorkspace or by specifying the pdf and parameter
objects directly in the constructors.
Setter methods using pdf instances and parameter lists are maintained in the derived classes, such as the ProfileLikelihoodCalculator or the HybridCalculator, but those taking a string for the name of the pdf have been removed.
- The calculator classes no longer keep a pointer to the workspace; instead they contain pointers to the pdf, the data and the parameters required to run the calculator. These pointers are managed externally, by the user or by the RooWorkspace. They can be passed either directly to the classes, for example via the constructor, or by using the ModelConfig class. The ModelConfig class acts as an interface to the workspace in order to load and store all the
needed information.
ProfileLikelihoodCalculator, LikelihoodInterval
- The Minos algorithm of Minuit is now used to find the limits of the likelihood intervals, instead of directly searching for the roots of the RooProfileLL class. Minos is used via the ROOT::Math::Minimizer interface. By default TMinuit is used; one can also use Minuit2 by calling ROOT::Math::MinimizerOptions::SetDefaultMinimizer("Minuit2").
- The LikelihoodInterval class now provides two new methods: FindLimits, which finds both the upper and lower interval bounds, and GetContourPoints, which finds the 2D contour points defining the likelihood interval. GetContourPoints is now used by the LikelihoodIntervalPlot class to draw the 2D contour.
- New tutorials have been added: rs501_ProfileLikelihoodCalculator_limit.C and rs502_ProfileLikelihoodCalculator_significance.C for getting the interval limits and significance using the ProfileLikelihoodCalculator. The tutorials can be run on a set of Poisson data or Gaussian-over-flat data, with a model that optionally includes nuisance parameters. The data can be generated with the rs500 tutorials.
HybridCalculator
- The constructor signature taking a name and a title string has been removed, for consistency with all the other calculator classes. Name and title can optionally be set using the SetName and SetTitle methods. Please note that this change is not backward compatible.
- An option to use binned generation has been added (via SetGenerateBinned).
- An estimate of the error on the obtained p-values is now computed in the HybridResult class, thanks to Matthias Wolf. The errors can be obtained with HybridResult::CLbError(), HybridResult::CLsplusbError() or HybridResult::CLsError().
- A new tutorial has been added for showing the usage of the hybrid calculator: rs505_HybridCalculator_significance.C
New class HypoTestInverter
- New class for performing a hypothesis test inversion by scanning
the hypothesis test results of the HybridCalculator for
various values of the parameter of interest. An upper (or lower) limit can be derived by looking at the
confidence level curve of the result as a function of the parameter of
interest, at the point where it intersects the desired confidence level.
- The class implements the IntervalCalculator interface and returns a HypoTestInverterResult object. The result is a SimpleInterval, which returns the upper limit value to the user via the method UpperLimit.
- The HypoTestInverter implements various options for performing the scan. HypoTestInverter::RunFixedScan will scan the parameter of interest using a fixed grid. HypoTestInverter::RunAutoScan will perform an automatic scan to find the curve optimally and will stop when the desired precision is obtained.
The confidence level value at a single given point can also be computed via HypoTestInverter::RunOnePoint.
- The class can scan the CLs+b values (default) or alternatively CLs (if the
method HypoTestInverter::UseCLs has been called).
- The estimated error due to the MC toys statistics from the HybridCalculator is propagated into the limits obtained from the HypoTestResult
- A new tutorial rs801_HypoTestInverter.C has been added in the tutorials/roostats directory to show the usage of this class.
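The step of reading a limit off a scanned curve can be sketched in plain C++. This is not the HypoTestInverter code: ScanPoint and findCrossing are hypothetical names, the scan is assumed to be sorted by increasing parameter value, and linear interpolation between neighbouring scan points is an assumption. The target is the value of the scanned quantity (CLs+b or CLs) at which the limit is defined; the value 0.05 used below for a 95% confidence level is likewise an illustrative assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One scan point: parameter of interest and the CLs+b (or CLs) value there.
struct ScanPoint {
    double poi;
    double cl;
};

// Find the first parameter value at which the scanned curve crosses
// `target`, interpolating linearly between neighbouring scan points.
// Returns false if the curve never reaches the target within the scan.
bool findCrossing(const std::vector<ScanPoint>& scan, double target,
                  double& limit) {
    assert(target > 0.0 && target < 1.0);
    for (std::size_t i = 1; i < scan.size(); ++i) {
        const ScanPoint& a = scan[i - 1];
        const ScanPoint& b = scan[i];
        if (a.cl == target) {                 // exactly on a scan point
            limit = a.poi;
            return true;
        }
        if ((a.cl - target) * (b.cl - target) < 0.0) {
            const double t = (target - a.cl) / (b.cl - a.cl);
            limit = a.poi + t * (b.poi - a.poi);
            return true;
        }
    }
    if (!scan.empty() && scan.back().cl == target) {
        limit = scan.back().poi;
        return true;
    }
    return false;
}
```

A fixed-grid scan supplies all the points up front, whereas an automatic scan would add points near the crossing until the interpolated limit is stable to the requested precision.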
New class BayesianCalculator
- New class for calculating a Bayesian interval using numerical integration. It implements the IntervalCalculator interface and returns a SimpleInterval as its result.
- The BayesianCalculator::GetInterval() method returns a SimpleInterval which contains the lower and upper value of the Bayesian interval obtained from the posterior probability for the given confidence level.
- The class also returns the posterior pdf (BayesianCalculator::GetPosteriorPdf()) obtained by integrating (marginalizing) over the nuisance parameters.
- It currently works only for one-dimensional problems, relying on RooFit to perform analytical or numerical integration.
- A plot of the posterior and the desired interval can be obtained using BayesianCalculator::GetPosteriorPlot().
- A new tutorial rs701_BayesianCalculator.C has been added in the tutorials/roostats directory to show the usage of this class.
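As an illustration of obtaining an interval from a one-dimensional posterior by numerical integration, here is a minimal standalone sketch. It is not the BayesianCalculator implementation: centralInterval is a hypothetical function, the choice of a central interval (equal probability in both tails) and of trapezoidal integration on a uniform grid are assumptions, and the marginalization over nuisance parameters is assumed to have been done already.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

// Central interval [lo, hi] containing a fraction `cl` of an (unnormalized)
// one-dimensional posterior defined on [xmin, xmax].
std::pair<double, double> centralInterval(
        const std::function<double(double)>& posterior,
        double xmin, double xmax, double cl, int nSteps = 10000) {
    assert(xmax > xmin && cl > 0.0 && cl < 1.0 && nSteps > 1);
    const double h = (xmax - xmin) / nSteps;

    // Cumulative integral of the posterior using the trapezoidal rule.
    std::vector<double> cdf(nSteps + 1, 0.0);
    double prev = posterior(xmin);
    for (int i = 1; i <= nSteps; ++i) {
        const double cur = posterior(xmin + i * h);
        cdf[i] = cdf[i - 1] + 0.5 * h * (prev + cur);
        prev = cur;
    }
    const double norm = cdf[nSteps];
    assert(norm > 0.0);

    // Invert the cumulative integral, interpolating within a grid cell.
    auto quantile = [&](double p) -> double {
        const double area = p * norm;
        for (int i = 1; i <= nSteps; ++i) {
            if (cdf[i] >= area) {
                const double span = cdf[i] - cdf[i - 1];
                const double t = span > 0.0 ? (area - cdf[i - 1]) / span : 0.0;
                return xmin + (i - 1 + t) * h;
            }
        }
        return xmax;
    };

    const double pLo = 0.5 * (1.0 - cl);
    return {quantile(pLo), quantile(1.0 - pLo)};
}
```

The posterior passed in would be the product of likelihood and prior as a function of the parameter of interest; it does not need to be normalized, since the sketch divides by the total integral.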
MCMCCalculator
- The prior function can now be specified in the constructor of the class, giving a signature similar to that of the BayesianCalculator class. When no prior is specified, it is assumed to be part of the global model (pdf) passed to the class.
Improvements and Bug fixes
- Various improvements and fixes have also been applied to all the calculator classes. Internally, the RooArgSet objects are now used by value instead of through a pointer.
- All the calculators can be constructed in a consistent way, either by passing pdf pointers and the set defining the parameters or by passing a reference to a ModelConfig object.
- The result classes are now more consistent and have similar constructors. In addition to a default constructor, all of them can be constructed by passing first a name and then all the quantities (objects or values) needed for the specific result type. The title can optionally be set using the SetTitle method inherited from TNamed.