
\section{Offline Software}
\subsection{Sequence of Releases}
The following releases of CMSSW software were employed for the CSA06 challenge and pre-challenge activities:
\begin{itemize}
\item CMSSW\_0\_8\_x:  available July 2006, validated for large-scale simulation
\begin{itemize}
\item CMSSW\_0\_8\_1: minimum bias events
\item CMSSW\_0\_8\_2 and 0\_8\_3: for signal samples (as generator filters became available)
\end{itemize}
\item CMSSW\_1\_0\_x: available September 2006, validated for large-scale reconstruction, consisting of over 0.5M lines of code
\begin{itemize}
\item CMSSW\_1\_0\_2: minimum-bias reconstruction
\item CMSSW\_1\_0\_3: fixes to improve robustness of signal samples
\item CMSSW\_1\_0\_4: Alignment/Calibration skim dataset production
\item CMSSW\_1\_0\_6: Frontier access ready, analysis skims ready
\end{itemize}
\end{itemize}

% begin Filip
\subsection{Generation}

For the CSA06 exercise, 50 million events were requested to be
generated, simulated and
reconstructed; they consist of the 9 samples outlined below.
For all samples, the PYTHIA generator interface was used (version 6.227),
with the CTEQ5.1 Parton Density Functions. The underlying event was
described with Tune DWT (Rick Field).
Some of the samples were preselected using generator-level information;
for these samples, EDFilters were invoked right after the generation step.
The samples were:
\begin{enumerate}
\item {\em Minimum Bias}: 25 million events (produced with 
CMSSW\_0\_8\_1).
All non-elastic processes (including diffractive and double-diffractive) 
switched on.
\item {\em $t\bar{t}$}: 5 million events (produced with CMSSW\_0\_8\_2). 
All decay channels open.
\item{\em $Z \rightarrow \mu\mu$}: 2 million events (produced with 
CMSSW\_0\_8\_2).
\item{\em $W \rightarrow e \nu$}: 4 million events (produced with 
CMSSW\_0\_8\_3), selected in an $\eta$, $\phi$ range chosen so as to
illuminate 2 supermodules.
\item{\em Soft Muon Soup}: 2 million events (produced with 
CMSSW\_0\_8\_3), of which 1 million are inclusive muon events filtered from 
Min Bias and 1 million are $J/\psi$ events with $p_T(\mu) > 4$ GeV.
\item{\em Electroweak soup}: 5 million events (produced with 
CMSSW\_0\_8\_3) consisting of 
2.6 million $W \rightarrow l \nu$, 2.2 million Drell-Yan (mass $>$ 15 
GeV), 
0.1 million $H \rightarrow WW$ events and 0.1 million $WW$ events.  For 
the last two subsamples, the cross sections were artificially reweighted 
in order to produce the desired event mix. All 3 charged lepton 
generations are included.
\item{\em Jet Calibration Soup}: 1.2 million events (produced with 
CMSSW\_0\_8\_3) consisting of dijet and $Z$+jet events in various 
$\hat{p}_T$ bins, reweighted to give the event numbers desired by the 
Jet-MET group.
\item{\em{Exotics Soup}}: 1 million events (produced with CMSSW\_0\_8\_3) 
consisting of
0.22 million excited-quark (400 GeV) events (all decays), 0.39 million 
$Z'$ (700 GeV) events (all decays) and 0.39 million SUSY LM1 events (all 
decays).
\item{\em HLT Soup}: 5 million events (produced with CMSSW\_0\_8\_4) 
consisting of $W$ (forced to charged leptons), Drell-Yan (mass $>$ 20 GeV, 
forced to charged leptons), $t\bar{t}$ events (all decays)
and QCD dijets ($\hat{p}_T > 350$ GeV).
\end{enumerate}
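
As a cross-check of the request, the nine sample sizes listed above
(in millions of events) add up to
\[
25 + 5 + 2 + 4 + 2 + 5 + 1.2 + 1 + 5 = 50.2 ,
\]
which matches the 50 million events requested to within rounding. The
quoted sub-sample compositions are likewise self-consistent:
$1 + 1 = 2$ million for the Soft Muon Soup,
$2.6 + 2.2 + 0.1 + 0.1 = 5.0$ million for the Electroweak Soup, and
$0.22 + 0.39 + 0.39 = 1.0$ million for the Exotics Soup.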
%end Filip


\subsection{Simulation and Geometry}

The first step of the CSA06 challenge consisted of the preparation of
large simulated datasets, some of which included High Level Trigger
(HLT) tags. 

The produced samples, over 60M events in total, were obtained using the
CMSSW\_0\_8\_x series of releases. The Pythia-based physics generator
input and the generated datasets are described in the previous section.

Simulation in the CMSSW\_0\_8\_x series is based on the 7.1 version of
the Geant4 simulation toolkit. The full simulation chain consists of
the detailed description of the CMS detector setup in the 4 Tesla
magnetic field, particle propagation and physics process modeling, hit
collection in the sensitive detector elements and signal digitization
taking into account all relevant effects. Pile-up, however, was not available 
at this stage of integration.
In addition to the simulation chain, the simulation applications available
with the CMSSW\_0\_8\_x release are the GeometryProducer for visualization
debugging, and a set of hit and digi level packages which are part of
the Software Validation Suite (SVSuite).

The simulation chain, i.e. geometry, simulation and digitization, was
extensively validated in terms of description correctness, detector
response, physics quality and software robustness and performance,
using the SVSuite.

\subsubsection{Infrastructure}

Simulation infrastructure in CMSSW\_0\_8\_x included the following elements:

\begin{itemize}
\item User hooks or observers which allowed access to detector simulation
information at the beginning/end of a track, run, or event, and at every step
in Geant4 tracking.
\item Interface to a realistic magnetic field.
\item QGSP and LHEP physics lists in Geant4, as well as interface to set cuts
per sub-detector on the production of secondary particles by Geant4.
\item Use of random number service provided by the Framework project.
\item An exception catcher tool to skip events in cases of floating-point
exceptions (divide by zero, invalid operation, overflow, underflow).
\end{itemize}

Some of the infrastructure features which were missing for CSA06 and will be
added in the future are:

\begin{itemize}
\item Validation of a G4.8.1 based version of CMSSW and subsequent migration.
\item Commissioning of the geometry overlap detection tool.
\item Local magnetic field management.
\item Optimization of simulation parameters for speed and accuracy.
\item Development of the capability to overlay real data on Monte Carlo signal
events.
\item GFlash parameterization of hadronic showers.
\end{itemize}

\subsubsection{Performance and Robustness}

The simulation chain, which includes generation, Geant4 based detector 
simulation, and digitization, is very robust from the point of view of crash
rate. Crashes are very rare, on the order of 10$^{-4}$--10$^{-6}$ per event.
This robust performance, however, is partially due to the action of the
exception catcher, based on the FloatingPointException (FPE) service. 
Otherwise, there would be a large crash rate due to Geant4 unsafe features,
as well as overlap of CMS detector volumes. The fraction of events skipped
by the exception catcher tool is approximately 0.5$\%$ for minimum bias,
and 2$\%$ for QCD events. Preliminary studies using G4.8.1 based CMS 
simulations show a significantly improved behavior of Geant4. At the same
time, the geometry is currently being debugged to remove volume overlaps.
Timing measurements performed on Min-Bias, H(300~GeV)$\rightarrow ee \mu \mu$,
and heavy ion events gave average values of 48, 247, and 5976 seconds per 
event, respectively.
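
To give a feeling for the scale implied by these numbers, the
following back-of-the-envelope estimate combines them with the
minimum bias sample size quoted in the Generation section. It assumes,
purely for illustration, that the quoted average time and skip
fraction apply uniformly to all 25 million minimum bias events; the
hardware and exact conditions of the timing measurement are not
specified here, so the result is indicative only.
\[
N_{\mathrm{skipped}} \approx 0.005 \times 25 \times 10^{6}
  = 1.25 \times 10^{5}~\mbox{events},
\]
\[
T_{\mathrm{CPU}} \approx 48~\mathrm{s} \times 25 \times 10^{6}
  = 1.2 \times 10^{9}~\mathrm{s}
  \approx 1.4 \times 10^{4}~\mbox{CPU-days}.
\]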

\subsubsection{Geometry, Hits, Digi Implementation and Verification}

Before the CSA06 exercise, the geometry of all the sub-detector systems
except the electromagnetic calorimeter (ECAL) had been rewritten using the 
Detector Description Database (DDD) XML-based tools. It was also 
verified against the old implementation based on a machine translation to XML
of the CMSIM Geant3 geometry. The new material budget description of
all tracker sub-systems except the Forward Pixels (FPix) and the Tracker
Outer Barrel (TOB) is in excellent agreement with the old implementation.
The FPix description has been improved and is more accurate than the previous
one. The TOB differences are not yet understood and are under investigation. The
tracker group is performing a validation of the geometry description based
on weighing the actual unit modules.
The ECAL material budget in the new and old implementations was verified to be
identical, as expected. A new post-CSA06 DDD/xml based geometry is now 
completed for the ECAL barrel and pre-shower detectors and is being validated.
The agreement between the old and new HCAL geometry implementations is nearly
perfect. The muon geometry was also re-developed and awaits verification.
A Geometry SVSuite package was developed to test the material budget.
Hits and Digis were implemented in CMSSW for all sub-detector systems and
verified against OSCAR/ORCA results using their respective SVSuite packages.
Corrections and updates were, however, incorporated as the code was tested
on CSA06 and physics validation samples.
The forward detectors (Zero Degree Calorimeter (ZDC), Totem, Castor) were
also available for CSA06 as standalone CMSSW applications, outside the CMS 
detector simulation chain.


\subsection{Filtering and Streaming}
\label{sec:filtering}
CMSSW included filtering technology to select events at the
generator level, at HLT, and for skimming secondary
datasets. The ability to write out multiple output files
simultaneously was included. The ability to merge input files into a
single output file was also provided.

In order to simplify the definition of filtering modules, we used a
generic approach implemented with C++ templates. This could be
achieved because all RECO/AOD objects have the same naming convention
for member functions returning the same type of information (like
$p_{\rm T}$,
$E_{\rm T}$, and so on). Generic filter modules have been written for
the most commonly used selection criteria. Events are selected by a
filter module if they contain at least a specified number of reconstructed
objects passing a specified selection criterion. ``Primitive'' selection
criteria are defined based on a single variable cut and can be
combined with Boolean operations to create a richer variety of
selections. The generic modules have been instantiated for the object
types of interest (electrons, muons, jets, tracks, etc.) and for the
selection criteria of interest for each object type, creating in a
release a large ``suite'' of selector and filter modules ready to be
plugged in cascade into any skimming sequence. All cuts for all modules
are fully configurable via the parameter set mechanism provided by the
Framework. 
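
The selection logic described above can be written compactly. The
notation below is introduced here only for illustration and is not
taken from the CMSSW code: $C$ denotes the collection of reconstructed
objects of a given type in an event, $v(o)$ a variable of object $o$
(such as $p_{\rm T}$), $c$ a configurable cut value, and $n_{\min}$ the
configurable minimum number of objects. A primitive criterion based on
a single-variable cut (shown here as a lower bound) is
\[
s(o) = \left\{ \begin{array}{ll}
  1 & \mbox{if } v(o) > c, \\
  0 & \mbox{otherwise.}
\end{array} \right.
\]
Primitive criteria can be combined through Boolean operations, for
example $s = s_1 \wedge (s_2 \vee \neg s_3)$, and an event is accepted
by the filter if
\[
\sum_{o \in C} s(o) \geq n_{\min} .
\]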

\subsection{High Level Trigger}

A total of 12 HLT triggers from the Physics TDR Vol.~2 \cite{ptdr2}
trigger menu
were implemented in CMSSW for CSA06.  The CSA06 HLT triggers are
based on Monte Carlo information, using generator-level
photons, electrons, muons and taus, and jets clustered from
generator-level particles.  The trigger thresholds, all denoting
transverse momentum inside the relevant subdetector acceptance, are
listed in Table~\ref{tab:hlt-defs}.

\begin{table}[htb]
\centering
\caption{List of implemented HLT triggers, based on generator-level
particles. \label{tab:hlt-defs}}
\vspace{3mm}
\begin{tabular}{|l|l|l|}
\hline
Name & Mnemonic & Threshold (GeV) \\ \hline
Single Gamma & p1g & 80 \\
Double Gamma & p2g & 30,20 \\
Single Electron & p1e & 26 \\
Double Electron  & p2e & 12,12 \\
Single Muon & p1m & 19 \\
Double Muon & p2m  & 7,7 \\
Single Tau  & p1t  & 100 \\
Double Tau  & p2t  & 60,60 \\
Single Jet  & p1j  & 400 \\
DiJet & p2j & 350 \\
TriJet  & p3j  & 195 \\
Quad Jet  & p4j  & 80 \\
\hline
\end{tabular}
\end{table}

%% added by MWG /start

The above HLT triggers are identified by their mnemonics, or by a
non-negative consecutive integer (serving as an index into a
C++ vector).  The number is assigned based on the relative order in
which the HLT trigger paths appear in the configuration file.  Because
the workflow management converts configuration files to python files
and back, this order was mangled, and the expected bit positions did
not correspond to the actual ones.  To alleviate this problem, the
keyword {\em schedule} was introduced into the configuration language
during the CSA exercise.  This statement allows the order of paths to
be specified.  If the user does not specify a schedule explicitly, a
default schedule statement is implicitly generated from the order
encountered in the configuration file.

In general, access to specific trigger bits and trigger results should
rely on the mnemonics rather than integer enumerations.  Access through
mnemonics is not affected by re-ordering problems.  However,
integer positions use less memory and are
therefore used internally; recall that the HLT trigger table is the same
for a large number of events.

%% added by MWG /end

When the prepared sample of generated events was split into separate
datasets, similar triggers were grouped together such that 4 output
datasets were created: photons, electrons, muons, and jets
(essentially no events passed the generator-level thresholds for the
tau triggers).


\subsection{Reconstruction}
The CSA06 related goals were (in order of importance):
\begin{itemize}
\item   A release containing reconstruction modules for all high level objects
\item Reconstruction code stable, able to run on tens of millions of events without a significant crash rate
\item Reconstruction code able to process a few thousand events per job, without suffering from memory leaks. On complex event samples, this meant more than 12 hours of running without glitches
\item Reconstruction code usable for detector studies as well as first look at physics analysis.
\end{itemize}

While the first CSA06 oriented version was CMSSW\_1\_0\_0 (end of Sept.
2006), stability/performance tests were run during the summer using
integration releases, and led to fixing the majority of technical
problems. 

CMSSW\_1\_0\_0 contained reconstruction components for
\begin{itemize}
\item         Local detector reconstruction (clustering, segment building in muon chambers)
\item   Super Clustering in the ECAL with Brem recovery
\item   Full tracking using Kalman Filter and Combinatorial Track Finding
\item   Fast tracking (trigger like) using only the pixel subsystem
\item   Jet reconstruction (MidPoint and Iterative Cone algorithms)
\item   Missing $E_T$ reconstruction
\item   Electron reconstruction seeded with ECAL Super Clusters
\item   Photon reconstruction
\item   Standalone and Global Muon reconstruction (Muon + Tracker links)
\item   Primary vertices with full tracking and pixel only tracking
\item   B Tagging using track counting
\item   Tau Tagging using calorimetric and tracker (isolation) variables
\end{itemize}

The modules were organized in framework sequences (local, global and
high level) and were consistently run for all data samples. 
The code, after the summer tuning and some last-minute fixes,
was able to run with a negligible crash rate and without showing
memory problems on all of the CSA06 samples. 
The two extreme conditions (minimum bias low $p_{\rm T}$ events and $t\bar{t}$
events) are shown in Table~\ref{table:recotiming}. 

\begin{table}[h]
\vspace{3mm}
\centering
\caption{Time and memory footprint during CSA06 in two extreme cases. The memory is measured when the steady state is reached. }
\label{table:recotiming}
{\centering
\begin{tabular}[t]{|l|c|c|}
\hline
 & Minbias Events & TTbar Events \\\hline
Time (sec/ev) & 3.5 & 25\\
Memory Footprint (MB) & 400 & 700 \\\hline
\end{tabular}
\par}
\vspace{3mm}
\end{table}
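
As a rough consistency check between the timing in
Table~\ref{table:recotiming} and the job-length goal stated at the
beginning of this section, the running time of a job processing $N$
events at $t$ seconds per event is $T = N t$, neglecting
initialization overhead. Taking $N = 2000$ as a hypothetical job size
(the goal only states ``a few thousand'') gives
\[
T_{t\bar{t}} = 2000 \times 25~\mathrm{s} = 5.0 \times 10^{4}~\mathrm{s}
  \approx 13.9~\mathrm{h},
\qquad
T_{\mathrm{minbias}} = 2000 \times 3.5~\mathrm{s} = 7.0 \times 10^{3}~\mathrm{s}
  \approx 1.9~\mathrm{h},
\]
which is in line with the more than 12 hours of running quoted for
complex event samples.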
% END TABLE X


The main challenge we faced during summer and fall 2006 was to manage
the rapid deployment of the reconstruction packages as well as
maintain stability and backward-compatibility of data structures and
code to enable the CSA challenge to be run. As this was also the first
appearance of some of these packages in the new software environment,
validation was and is an extremely important and urgent task.

While much of the reconstruction code was quite recent at the start of
CSA06 T0 processing, checks made before and after the challenge have shown that
the reconstruction output is useful from a physics point of view, and
that it helps speed up the validation process on which the developers are now
focusing. In addition, the large, coordinated CSA06 effort, with
its variety of use cases for reconstruction software (plain, with
calibrations, reprocessing), has been valuable in uncovering some weak
points in the overall structure, e.g.\ a configuration file structure
that was too naive and not flexible enough
(which has since been changed). In fact,
many of the changes which went into the CMSSW\_1\_2\_0 development
release come from the lessons learnt during CSA06.  The
reconstruction code in CMSSW\_1\_0\_x proved to be adequate for
the scope and the goals of CSA06, and it serves as a solid base on which to
continue the development toward data-taking.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Calibration \& Alignment}
\label{sec:offlineswalca}
\label{calibtool}

In the context of CSA06 several exercises concerning the prompt
alignment and calibration work-flow have been carried out and results
of these exercises are described in Sections~\ref{sec:calib} and
\ref{sec:align}. The full 
chain work-flow for ECAL and HCAL calibration as well as for tracker
alignment has been tested using several techniques that are foreseen
to be used during CMS operation.
The calibration and alignment part of the CSA06 challenge
consisted of the following steps:
\begin{enumerate}
\item Prompt reconstruction at the Tier-0, reading alignment
and calibration constants from the offline database;
\item Production of a special skimmed data format, the {\bf AlCaReco},
which contains only the information relevant for a specific alignment
or calibration task;
\item Running alignment/calibration algorithms at Tier-1/2
centres or the Tier-0 using misaligned/miscalibrated AlCaReco
data;
\item Inserting the derived alignment/calibration objects into
the database and deploying them to the Tier-1/2 centres;
\item Re-reconstruction at Tier-1 centres reading these
updated constants;
\item Analysis jobs at Tier-2 centres comparing ideal, miscalibrated 
(misaligned) and calibrated (aligned) distributions.
\end{enumerate}

The CMS software is set up such that misalignment and miscalibration
do not necessarily have to be applied already during simulation or
reconstruction, but can also be applied on the fly even at the level of a
user analysis job. Therefore, in order to keep maximum flexibility, it
was decided not to misalign and miscalibrate the data reconstructed at
the Tier-0 (and hence the AlCaReco), but rather to use ideal constants
for the prompt reconstruction and AlCaReco production, and to apply
misalignment/miscalibration on the fly when running the
alignment/calibration algorithms. 

To achieve this, all calibration exercises made use of a common {\em
miscalibration tool}, available under \\ {\tt
CalibCalorimetry/CaloMiscalibTools}, which allows RecHits to be
miscalibrated at the reading stage based on predefined
scenarios. Similarly, the alignment exercises used a dedicated
tracker/muon {\em misalignment tool} (in {\tt
Alignment/TrackerAlignment} and {\tt Alignment/MuonAlignment}) which
is able to move/rotate all parts of the tracker and the muon detector.

In order to perform the exercise, several software ingredients were
put in place:
\begin{itemize}
\item Implementation of the actual calibration and alignment
algorithms;
\item Implementation of the various AlCaReco stream producers;
\item Preparation of a matrix defining which streams to create
from which dataset, as well as a combined central configuration file;
\item Insertion of the alignment and calibration objects into
the offline database;
\item Configuration fragments for reading database objects containing
alignment and calibration constants from the offline database,
and using them in the reconstruction.
\end{itemize}
The alignment and calibration algorithms were implemented
for the CMSSW\_1\_0\_0 release.

The following AlCaReco
streams were defined and implemented:
\begin{enumerate}

\item \verb=CSA06ZMuMu= \\
Tracker alignment using $Z^0\to \mu^+ \mu^- $ events. Only
the tracks to be used by the alignment algorithm, namely those
corresponding to the muons from the $Z^0$ decay 
and satisfying $p_T>10 \rm\ GeV$ are stored in the event,
using the {\tt AlignmentTrackSelector}.

\item \verb=CSA06MinBias= \\
Pixel tracker alignment using minimum bias events.
Only tracks with $p_T>1.5 \rm\ GeV$ and at least 6 reconstructed hits
are considered, and a minimum of two such tracks was requested
for the event to be kept. Again, the {\tt AlignmentTrackSelector}
is used.

\item \verb=CSA06ZMuMu_muon=  \\
Muon alignment using $Z^0\to \mu^+ \mu^- $ events.

\item \verb=AlcastreamElectron= \\   
Cell-wise ECAL calibration using $E/p$ from  isolated electrons.
%The single electron calibration exploits the comparison of the
%electron energy with its associated track momentum in order to derive
%the calibration coefficients for every calorimeter cell.  
In the AlCaReco, only the information relevant to the calibration is
retained: the RecHits associated to the selected electron and the
track associated to it. The {\tt ElectronSelector} package has been
used to select tracks with transverse momentum $p_T$ above a threshold of 20
GeV. An AlCaReco event from $W^+ \rightarrow e^+ \nu$ has a size of
about 3~kBytes.

\item \verb=AlcastreamEcalPhiSym= \\ 
Inter-calibration of ECAL rings using the phi symmetry method.
%This technique exploits the expected symmetry of energy deposits in
%calorimeter rings to intercalibrate the corresponding cells.
The only information which needs to be stored for each event is the
subset of ECAL RecHits with energy above a certain threshold, which is
introduced in order to prevent noise from contributing to the energy
sums.  The value of the threshold is 150~MeV for barrel crystals and
750~MeV for end-cap crystals.  Only a few tens of RecHits per event
have energy exceeding these thresholds.  The AlCaReco data format for
the $\phi$ symmetry calibration exercise is defined to consist solely
of the filtered RecHit collections for the ECAL barrel and
end-caps. The average size of $\phi$-symmetry AlCaReco events is 120
bytes.

\item \verb=AlcastreamHcalDijets= \\ 
HCAL Calibration using dijet balancing.

\item \verb=AlcastreamHcalIsotrk= \\ 
HCAL calibration using $E/p$ from isolated pion tracks.
%The isolated track is an HCAL calibration technique in which
%calorimeter responses are measured to single charged pions of known
%momentum.  Calibration is done by direct comparison of the HCAL
%response with the corresponding tracker information. 
The procedure is
carried out independently for pions that do and do not interact in the
ECAL, and is intended to provide calibration constants as a
function of energy on a tower-by-tower basis (HB and HE for
$|\eta|<2.1$).  The AlCaReco producer selects
reconstructed tracks by requiring spatial isolation from other tracks
in the event.  An isolated track was defined by: (a)
%\begin{itemize}
%\item  
no other charged particles within a cone of 0.5;
%\item 
(b) a conservative cut on $p > 1$ GeV to reject tracks that will not 
reach HCAL.
%\end{itemize}
In addition the original CaloTower collection is kept in the AlCaReco.

\item \verb=AlcastreamHcalMinbias= \\
HCAL inter-calibration using the phi symmetry method.
%In analogy with the ECAl case, this technique aimes to set the
%azimuthal symmetry for readouts in the same $\eta$ ring. 
The mean value and variance of the energy distribution in the readout are
used. If the energy in the readout is collected without zero-suppression,
i.e. negative values after pedestal subtraction are kept, the variance
is used. If the energy in the readout is collected in zero-suppression mode,
the mean value is used.  The AlCaReco content is limited to the
HB/HE, HF and HO collections of RecHits with no additional selection
applied.

\end{enumerate}
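
Two of the HCAL selections above can be stated more explicitly. The
formulas below are standard definitions given as a sketch; the exact
conventions used in the AlCaReco producers (for instance the cone
metric, or the normalization of the variance) are assumptions and are
not taken from the code. For the isolated-track stream, the cone of 0.5
is assumed to be measured in the usual $\eta$--$\phi$ metric, so that a
track is isolated if every other charged particle $j$ satisfies
\[
\Delta R_j = \sqrt{(\Delta\eta_j)^2 + (\Delta\phi_j)^2} > 0.5 ,
\]
where $\Delta\eta_j$ and $\Delta\phi_j$ are the separations between
particle $j$ and the candidate track. For the HCAL phi symmetry stream,
with $E_i$ the energy recorded in a given readout in event $i$ and $N$
the number of events, the two quantities referred to are
\[
\langle E \rangle = \frac{1}{N} \sum_{i=1}^{N} E_i ,
\qquad
\sigma_E^2 = \frac{1}{N} \sum_{i=1}^{N}
  \left( E_i - \langle E \rangle \right)^2 .
\]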

\begin{table}[t]
\centering
\caption{Matrix of produced AlCaReco streams.}
\vspace{3mm}
\label{tab:alcareco}
\begin{tabular}{|l|c|c|c|c|l|}
\hline
  & \multicolumn{4}{c|}{ Input Dataset } &  \\
 Output Stream Name & $Z^0\to\mu^+\mu^-$ & min. bias & QCD Jets & $W^\pm\to e^\pm \nu$ & Purpose \\
\hline
CSA06ZMuMu            & X &   &   &   & Tracker Alignment \\
CSA06MinBias          &   & X &   &   & Tracker Alignment \\ \hline
CSA06ZMuMu\_muon      & X &   &   &   & Muon Alignment \\ \hline
AlcastreamElectron    &   &   &   & X & ECAL Calibration  \\
AlcastreamEcalPhiSym  &   & X &   &   & ECAL Calibration \\ \hline
AlcastreamHcalDijets  &   &   & X &   & HCAL Calibration \\
AlcastreamHcalIsotrk  &   & X & X &   & HCAL Calibration \\
AlcastreamHcalMinbias &   & X &   &   & HCAL Calibration \\
\hline
\end{tabular}
\end{table}


The matrix that defines the streams to be produced for a given
data set is shown in Table~\ref{tab:alcareco}. In general, more than
one stream can be created per dataset, and more than one dataset can
be associated to a particular stream. Therefore, the capability of the
framework to write more than one output stream in parallel was crucial
for this part of the challenge.

A combined configuration file
(\verb=Configuration/Examples/data/AlCaReco.cfg=) was prepared, which
reads a RECO file and writes one or more AlCaReco streams according to
the matrix shown in Table~\ref{tab:alcareco}.  Technically, the
reconstruction itself and the AlCaReco streaming were performed as two
consecutive steps.

Furthermore, configuration fragments were added to the central
reconstruction configuration in order to enable the reading of ECAL
calibration constants as well as tracker and muon alignment constants
from the offline database, either directly from ORACLE, or using the
FRONTIER caching.

For more details on the alignment and calibration related offline
software, see~\cite{alcacsa06note}.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Event Content Definitions}
Raw data for this exercise is defined to be the ``digi'' representation
of the detector information. The RECO data consists of a set of
products produced during prompt reconstruction from the raw data that
allow an event to be re-reconstructed,
whereas AOD is a selected subset of essential information for
analysis. The Full event format (FEVT) includes the raw digi data and
the RECO data, with only a few intermediate tracking objects
dropped. As this was a challenge based on samples produced from Monte
Carlo generators, the Monte Carlo generator and detector simulation
data was also kept with the distributed samples, leading to ``RECOSim''
and ``AODSim'' formats. 

\subsubsection{RECO content}

Data samples in the RECO format contain mainly the following information:

\begin{itemize}
\item Track collections, including associated rec-hits, allowing track refits
\item   Tracker rec-hits
\item   Primary vertices
\item   Muon collections, both reconstructed locally in the muon detector, and combined with tracks reconstructed in the tracker detector
\item   Muon detectors rec-hits (including DT, CSC, RPC)
\item   Electron and photon collections
\item   Clusters and Super-clusters reconstructed in the electromagnetic calorimeter
\item   ECAL rec-hits
\item   Jet collections reconstructed with different algorithms and missing $E_T$
\item   Calorimetric towers
\item   HCAL rec-hits
\end{itemize}

Together with the reconstruction output, the RECOSim format stores
Geant4 tracks and vertices plus the full generator output. Jet
collections reconstructed at the generator level are also stored. 

\subsubsection{AOD content}

The AOD is a proper subset of the RECO. This allows AOD to be produced
by skimming data samples in the RECO or FEVT format without
the need to run any software module for data conversion. The content of
the AOD is defined essentially by dropping rec-hit collections from the RECO
format: 

\begin{itemize}
\item   Track collections without associated rec-hits (no track refit
  is possible with AOD) 
\item   Primary vertices
\item   Muon collections, both reconstructed locally in the muon
  detector, and combined with tracks reconstructed in the tracker
  detector. Track fits have no associated rec-hits 
\item   Muon detectors rec-hits (including DT, CSC, RPC)
\item   Electron and photon collections
\item   Clusters and Super-clusters reconstructed in the
  electromagnetic calorimeter 
\item   Jet collections reconstructed with different algorithms and missing $E_T$
\item   Calorimetric towers
\end{itemize}

Together with the physics objects and reconstruction information, the
AODSim format stores the full generator output. Jet collections
reconstructed at the Generator level are also stored. 

The output sizes for these formats for a few CSA samples are given in
Table~\ref{tab:evtsize}.  

\begin{table}[htb]
\centering
\caption{Event size for various data formats and samples.}
\vspace{3mm}
\label{tab:evtsize}
\begin{tabular}{|l|l|}
\hline
Format & Event size \\ \hline
\multicolumn{2}{|l|}{ {\bf Minimum bias events} }  \\ \hline
FEVT & 843 kB/evt \\ \hline
RECOSim & 202 kB/evt \\ \hline
AODSim & 83 kB/evt \\ \hline
\multicolumn{2}{|l|}{ {\bf T-Tbar events} }  \\ \hline
FEVT & 3408 kB/evt \\ \hline
RECOSim & 781 kB/evt \\ \hline
AODSim & 309 kB/evt  \\ \hline
\multicolumn{2}{|l|}{ {\bf EWK Soup events} }  \\ \hline
FEVT & 1730 kB/evt \\ \hline
RECOSim & 417 kB/evt \\ \hline
AODSim & 197 kB/evt \\ \hline
\end{tabular}
\end{table}
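
The reduction achieved by each format follows directly from
Table~\ref{tab:evtsize}. The ratios below are simple quotients of the
tabulated sizes, rounded to two significant figures:
\begin{center}
\begin{tabular}{|l|c|c|}
\hline
Sample & RECOSim/FEVT & AODSim/RECOSim \\ \hline
Minimum bias & 0.24 & 0.41 \\
$t\bar{t}$ & 0.23 & 0.40 \\
EWK Soup & 0.24 & 0.47 \\ \hline
\end{tabular}
\end{center}
For these samples, RECOSim is thus roughly a quarter of the FEVT size,
and AODSim is between 40\% and 47\% of the RECOSim size.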

