root/cvsroot/UserCode/ShallowTools/doc/README
Revision: 1.1
Committed: Sat Aug 23 16:46:29 2008 UTC (16 years, 8 months ago) by bbetchar
Branch: MAIN
CVS Tags: V03-00-02, V03-00-01, V03-00-00, V02-00-02, V02-00-01, V02-00-00, V01-01-01, V01-01-00, V01-00-01, V01-00-00, HEAD

SimpleTree Documentation
------------------------

* Justification and Philosophy

Before SimpleTree, CMSSW could produce two formats that one might use
for analysis: the EDM tree and the Histogram. Neither of these formats
is appropriate for the kind of interactive exploration of a dataset
that I find most useful in preliminary analysis.

Histograms are static displays of information, good for communicating
results or answering a single well-posed question. However,
interactive exploration via histograms produced directly from CMSSW
requires either rerunning CMSSW for each new question and subsequent
cut, or producing an immense number of histograms and anticipating the
correct questions and cuts in advance.

The EDM tree is browsable as a rootuple, and so can in principle be
explored dynamically. In practice this is not workable, because of the
prohibitively large file size, the long/deep variable names, and the
prohibitively slow processing of PoolOutputModule. The EDM tree is an
excellent format for consistent, traceable, reproducible, massively
parallel processing in CMSSW, but the requirements of these features
make interactive browsing impractical.

SimpleTree is meant to be a minimal, flexible format which allows
dynamic exploration. It is not meant for the production of private
"standard" rootuples, but rather to serve as an "n-dimensional"
Histogram. It is meant to reduce the frequency of running CMSSW and
the grid. It is NOT meant to facilitate abandonment of CMSSW.
SimpleTrees are meant to produce plots with simple cuts. If you find
yourself constructing complicated variables from the leaves of a
SimpleTree, or writing a ROOT macro to loop over the events in a
SimpleTree, you are clearly doing something wrong. Those activities
are best handled by the massive parallelism of CMSSW on the grid.


* Advantages

SimpleTree offers dynamic exploration via browsing and cuts.

SimpleTree is shallow: no need to hunt for the information.

SimpleTree is much smaller than a comparable EDM tree.
High statistics should fit on your local computer.

SimpleTree separates the construction of variables from their output format.

SimpleTree is easy to use.


* Disadvantages

SimpleTree files are much larger than Histogram files.

SimpleTrees store information inefficiently due to lack of hierarchy.

SimpleTree files retain no provenance information (by design).


* outputCommands

SimpleTree takes one configuration option, the cms.vstring
outputCommands. This is a series of "keep" and "drop" statements,
handled by the same software fragment that PoolOutputModule uses. You
can find documentation here:
https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideSelectingBranchesForOutput
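
As an illustration of how an ordered list of "keep" and "drop"
statements selects branches, here is a minimal Python sketch. It is
not the CMSSW implementation. It assumes that a later matching
statement overrides an earlier one and that a branch matched by no
statement is dropped. The function name select_branches, the branch
names, and the module label "myProducer" are all hypothetical.

```python
from fnmatch import fnmatchcase


def select_branches(branches, output_commands):
    """Return the branches that survive an ordered list of keep/drop statements.

    Assumptions (not taken from CMSSW source): each statement reads
    'keep <pattern>' or 'drop <pattern>', '*' matches any run of characters,
    the last matching statement wins, and unmatched branches are dropped.
    """
    kept = []
    for branch in branches:
        keep = False  # dropped unless some statement keeps it
        for command in output_commands:
            parts = command.split(None, 1)
            if len(parts) != 2 or parts[0] not in ("keep", "drop"):
                raise ValueError("bad statement: %r" % command)
            action, pattern = parts
            if fnmatchcase(branch, pattern):
                keep = (action == "keep")  # later statements override earlier ones
        if keep:
            kept.append(branch)
    return kept


# Hypothetical branch names of the form type_label_instance_process.
branches = [
    "doubles_myProducer_jetPt_ANA",
    "floats_myProducer_muonEta_ANA",
    "ints_otherProducer_nTracks_ANA",
]

# Drop everything, then keep only the products of one (hypothetical) producer.
commands = ["drop *", "keep *_myProducer_*_*"]

for name in select_branches(branches, commands):
    print(name)
```

Starting the list with "drop *" and then keeping only what you need
makes the selection explicit and keeps the SimpleTree small.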


* Chaining the CRAB results

You can easily combine the results of multiple jobs by using a TChain.
It is especially easy to use the wildcard (*) notation, as follows:

    $ root -l
    root [0] TChain ch("chain_name");
    root [1] ch.Add("file_name_*.root/subdir/tree");
    root [2] ch.Draw("someVar", "someOtherVar>cut");
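
Before chaining a large set of CRAB outputs, it can be useful to check
which files a wildcard pattern will pick up. The Python sketch below
only mimics the notation used above (a file pattern, followed by the
path to the tree after ".root/"). It uses the standard glob module
rather than ROOT, so ROOT's own wildcard handling remains
authoritative. The function name expand_chain_pattern is hypothetical.

```python
import glob
import os


def expand_chain_pattern(name):
    """Split 'files_*.root/subdir/tree' into (matching files, tree path).

    This is a stand-in for inspection only: it does not open any file
    and does not check that the tree exists inside the files.
    """
    head, sep, tree_path = name.partition(".root/")
    if not sep:
        # No tree path given: treat the whole string as a file pattern.
        return sorted(glob.glob(name)), ""
    file_pattern = head + ".root"
    return sorted(glob.glob(file_pattern)), tree_path


# List the files in the current directory that the pattern would match.
files, tree = expand_chain_pattern("file_name_*.root/subdir/tree")
print("tree path:", tree)
for path in files:
    print(os.path.basename(path))
```

If the list is empty or contains unexpected files, fix the pattern
before passing it to ch.Add().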


* Typical Workflow

1. Write an EDProducer which puts the variables you want into the EDM
   tree as C++ standard types or std::vectors of C++ standard types.
2. Write a config file with a Source, a TFileService, your EDProducer,
   and a SimpleTree.
3. Run CMSSW, locally or on the grid, using the config file of step (2).
4. Interactively browse the resulting rootuple, making cuts and comparisons.
5. Write a ROOT macro to format and output any histograms you want to present.