root/cvsroot/UserCode/benhoob/cmsnotes/ZMet2012/bkg.tex
Revision: 1.2
Committed: Wed Jun 27 16:38:31 2012 UTC (12 years, 10 months ago) by benhoob
Content type: application/x-tex
Branch: MAIN
Changes since 1.1: +5 -5 lines
Log Message:
various updates

1 benhoob 1.1 \clearpage
2     \section{Background Estimation Techniques}
3     \label{sec:bkg}
4    
5     In this section we describe the techniques used to estimate the SM backgrounds in our signal regions defined by requirements of large \MET.
6 benhoob 1.2 The SM backgrounds fall into four categories:
7 benhoob 1.1
8     \begin{itemize}
9 benhoob 1.2 \item \zjets: this is the dominant background after the preselection. The \MET\ in \zjets\ events is estimated with the
10 benhoob 1.1 ``\MET\ templates'' technique described in Sec.~\ref{sec:bkg_zjets};
\item Flavor-symmetric (FS) backgrounds: this category includes processes that produce 2 leptons of uncorrelated flavor. It is dominated
12     by \ttbar\ but also contains Z$\to\tau\tau$, WW, and single top processes. This is the dominant contribution in the signal regions, and it
13 benhoob 1.2 is estimated using a data control sample of e$\mu$ events as described in Sec.~\ref{sec:bkg_fs};
14 benhoob 1.1 \item WZ and ZZ backgrounds: this background is estimated from MC, after validating the MC modeling of these processes using data control
15 benhoob 1.2 samples with jets and exactly 3 leptons (WZ control sample) and exactly 4 leptons (ZZ control sample) as described in Sec.~\ref{sec:bkg_vz};
16 benhoob 1.1 \item Rare SM backgrounds: this background contains rare processes such as $t\bar{t}$V and triple vector boson processes VVV (V=W,Z).
17 benhoob 1.2 This background is estimated from MC as described in Sec.~\ref{sec:bkg_raresm}. {\bf TODO: add rare MC}
18 benhoob 1.1 \end{itemize}
19    
20     \subsection{Estimating the \zjets\ Background with \MET\ Templates}
21     \label{sec:bkg_zjets}
22    
The premise of this data-driven technique is that \MET\ in \zjets\ events
is produced by the hadronic recoil system and {\it not} by the leptons making up the Z.
Therefore, the basic idea of the \MET\ template method is to measure the \MET\ distribution in
a control sample which has no true \MET\ and the same general attributes regarding
fake \MET\ as in \zjets\ events. We thus use a sample of \gjets\ events, since both \zjets\
and \gjets\ events consist of a well-measured object recoiling against hadronic jets.
29    
30     For selecting photon-like objects, the very loose photon selection described in Sec.~\ref{sec:phosel} is used.
31     It is not essential for the photon sample to have high purity. For our purposes, selecting jets with predominantly
32     electromagnetic energy deposition in a good fiducial volume suffices to ensure that
33     they are well measured and do not contribute to fake \MET. The \gjets\ events are selected with a suite of
single photon triggers with \pt\ thresholds ranging from 22 to 90 GeV. The events are weighted by the trigger prescale
35     such that \gjets\ events evenly sample the conditions over the full period of data taking.
36     There remains a small difference in the PU conditions in the \gjets\ vs. \zjets\ samples due to the different
37     dependencies of the $\gamma$ vs. Z isolation efficiencies on PU. To account for this, we reweight the \gjets\ samples
38     to match the distribution of reconstructed primary vertices in the \zjets\ sample.
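
As a sketch of the weighting described above, and assuming that the prescale and pile-up corrections factorize (the symbols $w_i$, $p_i$, $f_{Z}$, and $f_{\gamma}$ are our own notation, introduced here only for illustration), the weight applied to a \gjets\ event $i$ recorded by a trigger with prescale $p_i$ and containing $n_{\rm vtx}$ reconstructed primary vertices can be written as
\begin{equation}
w_i = p_i \times \frac{f_{Z}(n_{\rm vtx})}{f_{\gamma}(n_{\rm vtx})},
\end{equation}
where $f_{Z}$ and $f_{\gamma}$ denote the unit-normalized distributions of the number of reconstructed primary vertices in the \zjets\ sample and in the prescale-weighted \gjets\ sample, respectively.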
39    
40     To account for kinematic differences between the hadronic systems in the control vs. signal
41     samples, we measure the \MET\ distributions in the \gjets\ sample in bins of the number of jets
and the scalar sum of jet transverse energies (\Ht). These \MET\ distributions are normalized to unit area to form ``\MET\ templates''.
The predicted \MET\ distribution for each \Z\ event is the template corresponding to the \njets\ and
\Ht\ of that event. The prediction for the full \Z\ sample is simply the sum of all such templates.
45     These templates are displayed in App.~\ref{app:templates}.
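
Schematically, and using notation introduced here for illustration only ($T_{jk}$ for the unit-normalized template in \njets\ bin $j$ and \Ht\ bin $k$, $N^{Z}_{jk}$ for the number of selected \Z\ events in that bin, and $E_{\rm T}^{\rm miss}$ for \MET), the predicted distribution can be written as
\begin{equation}
\frac{dN^{\rm pred}}{dE_{\rm T}^{\rm miss}} = \sum_{j,k} N^{Z}_{jk}\, T_{jk}(E_{\rm T}^{\rm miss}),
\qquad
\int T_{jk}(E_{\rm T}^{\rm miss})\, dE_{\rm T}^{\rm miss} = 1 .
\end{equation}
With this definition the prediction is normalized by construction to the total \Z\ yield, and the predicted yield in a given signal region is the integral of this distribution above the corresponding \MET\ threshold.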
46    
47     While there is in principle a small contribution from backgrounds other than \zjets\ in the preselection regions,
48     this contribution is only $\approx$3\% ($\approx$2\%) of the total sample in the inclusive search (targeted search),
as shown in Table~\ref{table:zyields_2j} (Table~\ref{table:zyields_2j_targeted}), and is therefore negligible compared to the total
50     background uncertainty.
51    
52     \subsection{Estimating the Flavor-Symmetric Background with e$\mu$ Events}
53     \label{sec:bkg_fs}
54    
In this subsection we describe the background estimate for the FS background. Since this background produces same-flavor (SF)
ee and $\mu\mu$ lepton pairs at the same rate as opposite-flavor (OF) e$\mu$ lepton pairs, the OF yield can be used to estimate the SF yield, after
57     correcting for the different electron vs. muon offline selection efficiencies and the different efficiencies for the ee, $\mu\mu$, and e$\mu$ triggers.
58    
59     An important quantity needed to translate from the OF yield to a prediction for the background in the SF final state is the ratio
60     $R_{\mu e} = \epsilon_\mu / \epsilon_e$, where $\epsilon_\mu$ ($\epsilon_e$) indicates the offline muon (electron) selection efficiency.
61     This quantity can be extracted from data using the observed Z$\to\mu\mu$ and Z$\to$ee yields in the preselection region, after correcting
62     for the different trigger efficiencies.
63    
64     Hence we define:
65    
66     \begin{itemize}
67     \item $N_{ee}^{\rm{trig}} = \epsilon_{ee}^{\rm{trig}}N_{ee}^{\rm{offline}}$,
68     \item $N_{\mu\mu}^{\rm{trig}} = \epsilon_{\mu\mu}^{\rm{trig}}N_{\mu\mu}^{\rm{offline}}$,
69     \item $N_{e\mu}^{\rm{trig}} = \epsilon_{e\mu}^{\rm{trig}}N_{e\mu}^{\rm{offline}}$.
70     \end{itemize}
71    
Here $N_{\ell\ell}^{\rm{trig}}$ denotes the number of selected events in the $\ell\ell$ channel passing the offline and trigger selection
(in other words, the number of recorded events), $\epsilon_{\ell\ell}^{\rm{trig}}$ is the trigger efficiency, and
$N_{\ell\ell}^{\rm{offline}}$ is the number of events that would have passed the offline selection if the trigger had an efficiency of 100\%.
75     Thus we calculate the quantity:
76    
77     \begin{equation}
78     R_{\mu e} = \sqrt{\frac{N_{\mu\mu}^{\rm{offline}}}{N_{ee}^{\rm{offline}}}} = \sqrt{\frac{N_{\mu\mu}^{\rm{trig}}/\epsilon_{\mu\mu}^{\rm{trig}}}{N_{ee}^{\rm{trig}}/\epsilon_{ee}^{\rm{trig}}}}
79     = \sqrt{\frac{80367/0.88}{54426/0.95}} = 1.26\pm0.07.
80     \end{equation}
81    
82     Here we have used the Z$\to\mu\mu$ and Z$\to$ee yields from Table~\ref{table:zyields_2j} and the trigger efficiencies quoted in Sec.~\ref{sec:datasets}.
83     The indicated uncertainty is due to the 3\% uncertainties in the trigger efficiencies. {\bf TODO: check for variation w.r.t. lepton \pt}.
84     The predicted yields in the ee and $\mu\mu$ final states are calculated from the observed e$\mu$ yield as
85    
86     \begin{itemize}
87     \item $N_{ee}^{\rm{predicted}} = \frac {N_{e\mu}^{\rm{trig}}} {\epsilon_{e\mu}^{\rm{trig}}} \frac {\epsilon_{ee}^{\rm{trig}}} {2 R_{\mu e}}
88     = \frac{N_{e\mu}^{\rm{trig}}}{0.92}\frac{0.95}{2\times1.26} = (0.41\pm0.04) \times N_{e\mu}^{\rm{trig}}$ ,
89     \item $N_{\mu\mu}^{\rm{predicted}} = \frac {N_{e\mu}^{\rm{trig}}} {\epsilon_{e\mu}^{\rm{trig}}} \frac {\epsilon_{\mu\mu}^{\rm{trig}} R_{\mu e}} {2}
90     = \frac {N_{e\mu}^{\rm{trig}}} {0.95} \frac {0.88 \times 1.26}{2} = (0.58\pm0.06) \times N_{e\mu}^{\rm{trig}}$,
91     \end{itemize}
92    
93     and the predicted yield in the combined ee and $\mu\mu$ channel is simply the sum of these two predictions:
94    
95     \begin{itemize}
96     \item $N_{ee+\mu\mu}^{\rm{predicted}} = (0.99\pm0.06)\times N_{e\mu}^{\rm{trig}}$.
97     \end{itemize}
98    
Note that the relative uncertainty in the combined ee and $\mu\mu$ prediction is smaller than those for the individual ee and $\mu\mu$ predictions
100     because the uncertainty in $R_{\mu e}$ cancels when summing the ee and $\mu\mu$ predictions. {\bf N.B. these uncertainties are preliminary}.
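
The square root in the expression for $R_{\mu e}$ arises because a dilepton event requires both leptons to be selected, so that, to the extent that the two leptons are selected independently, $N_{\mu\mu}^{\rm{offline}}\propto\epsilon_\mu^2$ and $N_{ee}^{\rm{offline}}\propto\epsilon_e^2$ for a common Z production rate. As an illustrative cross-check of the arithmetic and of the cancellation, using only the numbers quoted above and a first-order expansion in the shift $\delta R_{\mu e}$:
\begin{eqnarray}
R_{\mu e} &=& \sqrt{\frac{80367/0.88}{54426/0.95}} \approx \sqrt{\frac{91326}{57291}} \approx \sqrt{1.594} \approx 1.26, \\
\frac{\delta N_{ee}^{\rm{predicted}}}{N_{e\mu}^{\rm{trig}}} &\approx& -0.41\,\frac{\delta R_{\mu e}}{R_{\mu e}},
\qquad
\frac{\delta N_{\mu\mu}^{\rm{predicted}}}{N_{e\mu}^{\rm{trig}}} \approx +0.58\,\frac{\delta R_{\mu e}}{R_{\mu e}}, \\
\frac{\delta N_{ee+\mu\mu}^{\rm{predicted}}}{N_{e\mu}^{\rm{trig}}} &\approx& (0.58-0.41)\,\frac{\delta R_{\mu e}}{R_{\mu e}} = 0.17\,\frac{\delta R_{\mu e}}{R_{\mu e}}.
\end{eqnarray}
The ee prediction scales as $1/R_{\mu e}$ and the $\mu\mu$ prediction scales as $R_{\mu e}$, so a shift in $R_{\mu e}$ moves the two predictions in opposite directions and its effect on the sum largely, though not exactly, cancels.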
101    
102     To improve the statistical precision of the FS background estimate, we remove the requirement that the e$\mu$ lepton pair falls in the Z mass window.
103     Instead we scale the e$\mu$ yield by $K$, the efficiency for e$\mu$ events to satisfy the Z mass requirement, extracted from simulation. In Fig.~\ref{fig:K_incl}
we display the value of $K$ in data and simulation, for a variety of \MET\ requirements, for the inclusive analysis. Based on this we choose $K=0.14\pm0.02$
for all \MET\ regions except for \MET\ $>$ 300 GeV. For this region the statistical precision is reduced, so we inflate the uncertainty and choose $K=0.14\pm0.08$.
The corresponding plot for the targeted analysis, including the b-veto, is displayed in Fig.~\ref{fig:K_targeted}.
Based on this we choose $K=0.13\pm0.02$
for all \MET\ regions up to \MET\ $>$ 100 GeV. For higher \MET\ regions (\MET\ $>$ 150 GeV and above) the statistical precision is reduced,
so we inflate the uncertainty and choose $K=0.13\pm0.07$.
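
For illustration, and assuming that $K$ simply multiplies the factor relating the e$\mu$ yield to the combined ee and $\mu\mu$ prediction given above (the label $N_{e\mu}^{\rm{no\,mass\,cut}}$ for the e$\mu$ yield without the Z mass requirement is our own notation), the FS prediction in the Z mass window takes the form
\begin{equation}
N_{ee+\mu\mu}^{\rm{predicted}} = (0.99\pm0.06) \times K \times N_{e\mu}^{\rm{no\,mass\,cut}} .
\end{equation}
If the uncertainties in the two factors are treated as uncorrelated, the relative systematic uncertainty for the inclusive analysis with $K=0.14\pm0.02$ is approximately
\begin{equation}
\sqrt{\left(\frac{0.06}{0.99}\right)^2 + \left(\frac{0.02}{0.14}\right)^2} \approx \sqrt{(0.061)^2 + (0.143)^2} \approx 0.16 ,
\end{equation}
which is dominated by the uncertainty in $K$. For \MET\ $>$ 300 GeV, where $K=0.14\pm0.08$, the uncertainty in $K$ alone is approximately 57\%.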
110    
111     \begin{figure}[!ht]
112     \begin{center}
113     \begin{tabular}{cc}
114     \includegraphics[width=0.4\textwidth]{plots/K_incl.pdf} &
115     \includegraphics[width=0.4\textwidth]{plots/K_excl.pdf} \\
116     \end{tabular}
117     \caption{
118     The efficiency for e$\mu$ events to satisfy the dilepton mass requirement, $K$, in data and simulation for inclusive \MET\ intervals (left) and
exclusive \MET\ intervals (right) for the inclusive analysis. Based on this we choose $K=0.14\pm0.02$ for all \MET\ regions except \MET\ $>$ 300 GeV,
where we choose $K=0.14\pm0.08$.
121     {\bf plots made with 10\% of \zjets\ MC statistics, to be remade with full statistics}
122     \label{fig:K_incl}
123     }
124     \end{center}
125     \end{figure}
126    
127     \begin{figure}[!hb]
128     \begin{center}
129     \begin{tabular}{cc}
130     \includegraphics[width=0.4\textwidth]{plots/extractK_inclusive_bveto.pdf} &
131     \includegraphics[width=0.4\textwidth]{plots/extractK_exclusive_bveto.pdf} \\
132     \end{tabular}
133     \caption{
134     The efficiency for e$\mu$ events to satisfy the dilepton mass requirement, $K$, in data and simulation for inclusive \MET\ intervals (left) and
135     exclusive \MET\ intervals (right) for the targeted analysis, including the b-veto.
Based on this we choose $K=0.13\pm0.02$ for the \MET\ regions up to \MET\ $>$ 100 GeV.
For higher \MET\ regions we choose $K=0.13\pm0.07$.
138     {\bf plots made with 10\% of \zjets\ MC statistics, to be remade with full statistics}
139     \label{fig:K_targeted}
140     }
141     \end{center}
142     \end{figure}
143    
144     \clearpage
145    
146     \subsection{Estimating the WZ and ZZ Background with MC}
147     \label{sec:bkg_vz}
148    
149     Backgrounds from W($\ell\nu$)Z($\ell\ell$) where the W lepton is not identified or is outside acceptance, and Z($\nu\nu$)Z($\ell\ell$),
150     are estimated from simulation. The MC modeling of these processes is validated by comparing the MC predictions with data in control samples
151     with exactly 3 leptons (WZ control sample) and exactly 4 leptons (ZZ control sample).
152     The relevant WZ and ZZ MC samples are:
153    
154     \begin{itemize}
155     \footnotesize{
156     \item \verb=/WZJetsTo3LNu_TuneZ2_8TeV-madgraph-tauola/Summer12-PU_S7_START52_V9-v2/AODSIM= ($\sigma=1.058$ pb),
157     \item \verb=/ZZJetsTo4L_TuneZ2star_8TeV-madgraph-tauola/Summer12-PU_S7_START52_V9-v3/AODSIM= ($\sigma=0.093$ pb),
158     }
159     \end{itemize}
160     The WZJetsTo2L2Q, ZZJetsTo2L2Q, and ZZJetsTo2L2Nu samples are also used in this analysis but their contribution to the 3-lepton and 4-lepton
161     control samples is negligible.
162    
163     \subsubsection{WZ Validation Studies}
164     \label{sec:bkg_wz}
165    
166     A pure WZ sample can be selected in data with the requirements:
167    
168     \begin{itemize}
\item Exactly 3 leptons with $p_T>20$~GeV passing the analysis identification and isolation requirements,
\item 2 of the 3 leptons must form a pair with invariant mass in the Z window 81--101 GeV,
\item \MET\ $>$ 50 GeV (to suppress DY).
172     \end{itemize}
173    
174     The data and MC yields passing the above selection are in Table~\ref{tab:wz}.
175     The inclusive yields (without any jet requirements) agree within 17\%, which is approximately equal
176     to the uncertainty in the measured WZ cross section. A data vs. MC comparison of kinematic
177     distributions (jet multiplicity, \MET, Z \pt) is given in Fig.~\ref{fig:wz}. High \MET\
178     values in WZ and ZZ events arise from highly boosted W or Z bosons that decay leptonically,
and we therefore check that the MC does a reasonable job of reproducing the \pt\ distributions of the
180     leptonically decaying \Z. While the inclusive WZ yields are in reasonable agreement, we observe
181     an excess in data in events with at least 2 jets, corresponding to the jet multiplicity requirement
in our preselection. We observe 60 events in data while the MC predicts $33.8\pm5.2$~(stat), representing an excess of 78\%,
183     as indicated in Table~\ref{tab:wz2j}.
184     We note some possible causes for this discrepancy:
185    
186     \begin{itemize}
187    
188     \item The \zjets\ contribution is under-estimated here, for 2 reasons: first, because the \zjets\
yield passing a \MET\ $>$ 50 GeV requirement is under-estimated in MC and second, because the fake
190     rate is typically under-estimated in the MC. To get a rough idea for how much the excess depends
191     on the \zjets\ yield, if the \zjets\ yield is doubled then the excess is reduced from 78\% to 55\%.
192     {\bf currently using 10\% of \zjets\ MC, and there is 1 event with a weight of about 5, plots and tables to be remade with full \zjets\ stats}.
193    
194     \item The \ttbar\ contribution is under-estimated here because the fake
195     rate is typically under-estimated in the MC. To get a rough idea for how much the excess depends
196     on the \ttbar\ yield, if the \ttbar\ yield is doubled then the excess is reduced from 78\% to 57\%.
197    
198     \item Currently no attempt is made to reject jets from pile-up interactions, which may be responsible
for some of this excess. To check this, we increase the jet \pt\ requirement to 40 GeV which
200     helps to suppress PU jets and observe 39 events in data vs. an MC prediction of $25\pm5.2$~(stat),
201     decreasing the excess from 78\% to 58\%. In the future this may be improved by explicitly
202     requiring the jets to be consistent with originating from the signal primary vertex.
203    
204     \end{itemize}
205    
206     Based on these studies we currently assess an uncertainty of 80\% on the WZ yield.
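
The percentages quoted above can be reproduced from the yields in Tables~\ref{tab:wz} and~\ref{tab:wz2j}. As a worked cross-check (the symbol $\Delta$ for the relative excess of data over the total MC prediction is introduced here for illustration only):
\begin{eqnarray}
\Delta_{\rm{inclusive}} &=& \frac{190-162.3}{162.3} \approx 17\%, \\
\Delta_{\geq 2~\rm{jets}} &=& \frac{60-33.8}{33.8} \approx 78\%, \\
\Delta_{\geq 2~\rm{jets}}^{2\times Z+\rm{jets}} &=& \frac{60-(33.8+4.9)}{33.8+4.9} \approx 55\%, \\
\Delta_{\geq 2~\rm{jets}}^{2\times t\bar{t}} &=& \frac{60-(33.8+4.4)}{33.8+4.4} \approx 57\%.
\end{eqnarray}
Here doubling a background component corresponds to adding its nominal yield once more to the total MC prediction.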
207    
208     \begin{table}[htb]
209     \begin{center}
210     \caption{\label{tab:wz} Data and Monte Carlo yields passing the WZ preselection. }
211     \begin{tabular}{lccccc}
212     \hline
213     Sample & ee & $\mu\mu$ & e$\mu$ & total \\
214     \hline
215     WZ & 58.9 $\pm$ 0.7 & 82.2 $\pm$ 0.8 & 4.0 $\pm$ 0.2 &145.1 $\pm$ 1.0 \\
216     \ttbar & 0.6 $\pm$ 0.5 & 4.3 $\pm$ 1.5 & 3.0 $\pm$ 1.2 & 8.0 $\pm$ 2.0 \\
217     \zjets & 0.4 $\pm$ 0.4 & 4.9 $\pm$ 4.9 & 0.0 $\pm$ 0.0 & 5.3 $\pm$ 4.9 \\
218     ZZ & 1.4 $\pm$ 0.0 & 2.0 $\pm$ 0.0 & 0.1 $\pm$ 0.0 & 3.5 $\pm$ 0.0 \\
219     WW & 0.0 $\pm$ 0.0 & 0.2 $\pm$ 0.1 & 0.2 $\pm$ 0.1 & 0.3 $\pm$ 0.1 \\
220     single top & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.1 $\pm$ 0.1 \\
221     \hline
222     total SM MC & 61.3 $\pm$ 0.9 & 93.7 $\pm$ 5.2 & 7.3 $\pm$ 1.3 &162.3 $\pm$ 5.4 \\
223     data & 68 & 108 & 14 & 190 \\
224     \hline
225     \hline
226    
227     \end{tabular}
228     \end{center}
229     \end{table}
230    
231     \begin{table}[htb]
232     \begin{center}
\caption{\label{tab:wz2j} Data and Monte Carlo yields passing the WZ preselection and \njets\ $\geq$ 2. }
234     \begin{tabular}{lccccc}
235     \hline
236     Sample & ee & $\mu\mu$ & e$\mu$ & total \\
237     \hline
238     \hline
239     WZ & 9.8 $\pm$ 0.3 & 13.3 $\pm$ 0.3 & 0.6 $\pm$ 0.1 & 23.6 $\pm$ 0.4 \\
240     \ttbar & 0.2 $\pm$ 0.2 & 2.0 $\pm$ 0.9 & 2.2 $\pm$ 1.2 & 4.4 $\pm$ 1.5 \\
241     \zjets & 0.0 $\pm$ 0.0 & 4.9 $\pm$ 4.9 & 0.0 $\pm$ 0.0 & 4.9 $\pm$ 4.9 \\
242     ZZ & 0.3 $\pm$ 0.0 & 0.4 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.7 $\pm$ 0.0 \\
243     WW & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.1 $\pm$ 0.0 \\
244     single top & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
245     \hline
total SM MC & 10.3 $\pm$ 0.3 & 20.8 $\pm$ 5.0 & 2.8 $\pm$ 1.2 & 33.8 $\pm$ 5.2 \\
247     \hline
248     data & 23 & 32 & 5 & 60 \\
249     \hline
250     \hline
251    
252     \end{tabular}
253     \end{center}
254     \end{table}
255    
256     \begin{figure}[tbh]
257     \begin{center}
258     \includegraphics[width=1\linewidth]{plots/WZ.pdf}
259     \caption{\label{fig:wz}\protect
260     Data vs. MC comparisons for the WZ selection discussed in the text for \lumi.
261     The number of jets, missing transverse energy, and Z boson transverse momentum are displayed.
262     }
263     \end{center}
264     \end{figure}
265    
266     \clearpage
267    
268     \subsubsection{ZZ Validation Studies}
269     \label{sec:bkg_zz}
270    
271     A pure ZZ sample can be selected in data with the requirements:
272    
273     \begin{itemize}
\item Exactly 4 leptons with $p_T>20$~GeV passing the analysis identification and isolation requirements,
\item 2 of the 4 leptons must form a pair with invariant mass in the Z window 81--101 GeV.
276     \end{itemize}
277    
278     The data and MC yields passing the above selection are in Table~\ref{tab:zz}. Again we observe an
279     excess in data with respect to the MC prediction (29 observed vs. $17.3\pm0.1$~(stat) MC predicted).
280     After requiring at least 2 jets, we observe 2 events and the MC predicts $1.5\pm0.1$~(stat).
281     Based on this we apply an uncertainty of 80\% to the ZZ background.
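
As a worked cross-check using the yields quoted above (the symbol $\Delta$ for the relative excess of data over the total MC prediction is introduced here for illustration only), the inclusive excess is
\begin{equation}
\Delta_{\rm{inclusive}} = \frac{29-17.3}{17.3} \approx 68\% .
\end{equation}
In the sample with at least 2 jets, the 2 observed events carry a relative Poisson uncertainty of approximately $1/\sqrt{2}\approx 71\%$, so this sample alone cannot constrain the ZZ normalization to better than roughly this level.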
282    
283     \begin{table}[htb]
284     \begin{center}
285     \caption{\label{tab:zz} Data and Monte Carlo yields for the ZZ preselection. }
286     \begin{tabular}{lccccc}
287     \hline
288     Sample & ee & $\mu\mu$ & e$\mu$ & total \\
\hline
\hline
292     ZZ & 6.6 $\pm$ 0.0 & 9.9 $\pm$ 0.0 & 0.4 $\pm$ 0.0 & 17.0 $\pm$ 0.1 \\
293     WZ & 0.1 $\pm$ 0.0 & 0.2 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.3 $\pm$ 0.0 \\
294     \zjets & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
295     \ttbar & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
296     WW & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
297     single top & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
298     \hline
299     total SM MC & 6.7 $\pm$ 0.0 & 10.1 $\pm$ 0.1 & 0.5 $\pm$ 0.0 & 17.3 $\pm$ 0.1 \\
300     \hline
301     data & 13 & 16 & 0 & 29 \\
\hline
\hline
305     \end{tabular}
306     \end{center}
307     \end{table}
308    
309     \begin{figure}[tbh]
310     \begin{center}
311     \includegraphics[width=1\linewidth]{plots/ZZ.pdf}
312     \caption{\label{fig:zz}\protect
313     Data vs. MC comparisons for the $ZZ$ selection discussed in the text for \lumi.
314     The number of jets, missing transverse energy, and $Z$ boson transverse momentum are displayed.
315     }
316     \end{center}
317     \end{figure}
318    
319    
320    
321    
322     \subsection{Estimating the Rare SM Backgrounds with MC}
323     \label{sec:bkg_raresm}
324    
{\bf TODO: list samples, yields in preselection region, and \MET\ distribution}