% File: root/cvsroot/UserCode/benhoob/cmsnotes/ZMet2012/bkg.tex
% Revision: 1.4 (branch MAIN), committed Fri Jun 29 19:41:46 2012 UTC by benhoob
% Log message: remove fixme's, add dijet mass plot

%\clearpage
\section{Background Estimation Techniques}
\label{sec:bkg}

In this section we describe the techniques used to estimate the SM backgrounds in our signal regions, which are defined by requirements of large \MET.
The SM backgrounds fall into three categories:

\begin{itemize}
\item \zjets: this is the dominant background after the preselection. The \MET\ in \zjets\ events is estimated with the
``\MET\ templates'' technique described in Sec.~\ref{sec:bkg_zjets};
\item Flavor-symmetric (FS) backgrounds: this category includes processes that produce 2 leptons of uncorrelated flavor. It is dominated
by \ttbar\ but also contains Z$\to\tau\tau$, WW, and single top processes. This is the dominant contribution in the signal regions, and it
is estimated using a data control sample of e$\mu$ events as described in Sec.~\ref{sec:bkg_fs};
\item WZ and ZZ backgrounds: this background is estimated from MC, after validating the MC modeling of these processes using data control
samples with jets and exactly 3 leptons (WZ control sample) and exactly 4 leptons (ZZ control sample) as described in Sec.~\ref{sec:bkg_vz}.
%\item Rare SM backgrounds: this background contains rare processes such as $t\bar{t}$V and triple vector boson processes VVV (V=W,Z).
%This background is estimated from MC as described in Sec.~\ref{sec:bkg_raresm}. {\bf FIXME: add rare MC}
\end{itemize}

\subsection{Estimating the \zjets\ Background with \MET\ Templates}
\label{sec:bkg_zjets}

The premise of this data-driven technique is that \MET\ in \zjets\ events
is produced by the hadronic recoil system and {\it not} by the leptons making up the Z.
Therefore, the basic idea of the \MET\ template method is to measure the \MET\ distribution in
a control sample which has no true \MET\ and the same general attributes regarding
fake \MET\ as in \zjets\ events. We thus use a sample of \gjets\ events, since both \zjets\
and \gjets\ events consist of a well-measured object recoiling against hadronic jets.

For selecting photon-like objects, the very loose photon selection described in Sec.~\ref{sec:phosel} is used.
It is not essential for the photon sample to have high purity. For our purposes, selecting jets with predominantly
electromagnetic energy deposition in a good fiducial volume suffices to ensure that
they are well measured and do not contribute to fake \MET. The \gjets\ events are selected with a suite of
single photon triggers with \pt\ thresholds ranging from 22 to 90 GeV. The events are weighted by the trigger prescale
such that \gjets\ events evenly sample the conditions over the full period of data taking.
There remains a small difference in the pile-up (PU) conditions in the \gjets\ vs. \zjets\ samples due to the different
dependencies of the $\gamma$ vs. Z isolation efficiencies on PU. To account for this, we reweight the \gjets\ samples
to match the distribution of reconstructed primary vertices in the \zjets\ sample.

To account for kinematic differences between the hadronic systems in the control vs. signal
samples, we measure the \MET\ distributions in the \gjets\ sample in bins of the number of jets
and the scalar sum of jet transverse energies (\Ht). These \MET\ templates are extracted separately from the 5 single photon
triggers with thresholds 22, 36, 50, 75, and 90 GeV, so that the templates are effectively binned in photon \pt.
All \MET\ distributions are normalized to unit area to form ``\MET\ templates''.
The prediction of the \MET\ in each \Z\ event is the template which corresponds to the \njets,
\Ht, and Z \pt\ in the \zjets\ event. The prediction for the \Z\ sample is simply the sum of all such templates.
All templates are displayed in App.~\ref{app:templates}.

While there is in principle a small contribution from backgrounds other than \zjets\ in the preselection regions,
this contribution is only $\approx$3\% ($\approx$2\%) of the total sample in the inclusive search (targeted search),
as shown in Table~\ref{table:zyields_2j} (Table~\ref{table:zyields_2j_targeted}), and is therefore negligible compared to the total
background uncertainty.
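% Schematic summary of the template prediction (notation introduced here for illustration only)

As a compact summary of the template procedure described in this subsection, and using notation that is introduced here for illustration only,
let $T_{b}(x)$ denote the unit-normalized \gjets\ template in bin $b$ of jet multiplicity, \Ht, and photon trigger threshold, where $x$ is the \MET\ value.
Each \gjets\ event enters its template with a weight given by the trigger prescale times the vertex-multiplicity reweighting factor.
The predicted \MET\ distribution for a sample of $N_{Z}$ selected \Z\ events is then, schematically,
\[
\frac{dN^{\rm{pred}}}{dx} = \sum_{i=1}^{N_{Z}} T_{b(i)}(x), \qquad \int T_{b}(x)\,dx = 1,
\]
where $b(i)$ is the template bin matched to the jet multiplicity, \Ht, and Z \pt\ of the $i$-th event. Because each template has unit area,
the integral of the prediction equals the observed \Z\ yield by construction.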

\subsection{Estimating the Flavor-Symmetric Background with e$\mu$ Events}
\label{sec:bkg_fs}

In this subsection we describe the estimate of the FS background. Since this background produces equal rates of same-flavor (SF)
ee and $\mu\mu$ lepton pairs as opposite-flavor (OF) e$\mu$ lepton pairs, the OF yield can be used to estimate the SF yield, after
correcting for the different electron vs. muon offline selection efficiencies and the different efficiencies for the ee, $\mu\mu$, and e$\mu$ triggers.

An important quantity needed to translate from the OF yield to a prediction for the background in the SF final state is the ratio
$R_{\mu e} = \epsilon_\mu / \epsilon_e$, where $\epsilon_\mu$ ($\epsilon_e$) indicates the offline muon (electron) selection efficiency.
This quantity can be extracted from data using the observed Z$\to\mu\mu$ and Z$\to$ee yields in the preselection region, after correcting
for the different trigger efficiencies.

Hence we define:

\begin{itemize}
\item $N_{ee}^{\rm{trig}} = \epsilon_{ee}^{\rm{trig}}N_{ee}^{\rm{offline}}$,
\item $N_{\mu\mu}^{\rm{trig}} = \epsilon_{\mu\mu}^{\rm{trig}}N_{\mu\mu}^{\rm{offline}}$,
\item $N_{e\mu}^{\rm{trig}} = \epsilon_{e\mu}^{\rm{trig}}N_{e\mu}^{\rm{offline}}$.
\end{itemize}

Here $N_{\ell\ell}^{\rm{trig}}$ denotes the number of selected Z events in the $\ell\ell$ channel passing the offline and trigger selection
(in other words, the number of recorded and selected events), $\epsilon_{\ell\ell}^{\rm{trig}}$ is the trigger efficiency, and
$N_{\ell\ell}^{\rm{offline}}$ is the number of events that would have passed the offline selection if the trigger had an efficiency of 100\%.
Since the offline dilepton yield scales as the square of the single-lepton selection efficiency, we calculate the quantity:

\begin{equation}
R_{\mu e} = \sqrt{\frac{N_{\mu\mu}^{\rm{offline}}}{N_{ee}^{\rm{offline}}}} = \sqrt{\frac{N_{\mu\mu}^{\rm{trig}}/\epsilon_{\mu\mu}^{\rm{trig}}}{N_{ee}^{\rm{trig}}/\epsilon_{ee}^{\rm{trig}}}}
= \sqrt{\frac{80367/0.88}{54426/0.95}} = 1.26\pm0.07.
\end{equation}
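% Intermediate arithmetic for the evaluation of R_{\mu e} (added for illustration)

For reference, the intermediate values in this evaluation are as follows (rounded, and shown only to make the arithmetic explicit):
\[
\frac{80367}{0.88} \approx 91326, \qquad \frac{54426}{0.95} \approx 57291, \qquad
\sqrt{\frac{91326}{57291}} \approx \sqrt{1.594} \approx 1.26 .
\]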

Here we have used the Z$\to\mu\mu$ and Z$\to$ee yields from Table~\ref{table:zyields_2j} and the trigger efficiencies quoted in Sec.~\ref{sec:datasets}.
The indicated uncertainty is due to the 3\% uncertainties in the trigger efficiencies. %{\bf FIXME: check for variation w.r.t. lepton \pt}.
The predicted yields in the ee and $\mu\mu$ final states are calculated from the observed e$\mu$ yield as

\begin{itemize}
\item $N_{ee}^{\rm{predicted}} = \frac {N_{e\mu}^{\rm{trig}}} {\epsilon_{e\mu}^{\rm{trig}}} \frac {\epsilon_{ee}^{\rm{trig}}} {2 R_{\mu e}}
= \frac{N_{e\mu}^{\rm{trig}}}{0.92}\frac{0.95}{2\times1.26} = (0.41\pm0.05) \times N_{e\mu}^{\rm{trig}}$,
\item $N_{\mu\mu}^{\rm{predicted}} = \frac {N_{e\mu}^{\rm{trig}}} {\epsilon_{e\mu}^{\rm{trig}}} \frac {\epsilon_{\mu\mu}^{\rm{trig}} R_{\mu e}} {2}
= \frac {N_{e\mu}^{\rm{trig}}} {0.95} \frac {0.88 \times 1.26}{2} = (0.58\pm0.07) \times N_{e\mu}^{\rm{trig}}$,
\end{itemize}

and the predicted yield in the combined ee and $\mu\mu$ channel is simply the sum of these two predictions:

\begin{itemize}
\item $N_{ee+\mu\mu}^{\rm{predicted}} = (0.99\pm0.06)\times N_{e\mu}^{\rm{trig}}$.
\end{itemize}
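% Origin of the factors 1/(2 R_{\mu e}) and R_{\mu e}/2 (sketch added for illustration)

The factors $1/(2R_{\mu e})$ and $R_{\mu e}/2$ in the ee and $\mu\mu$ predictions above can be understood with a short counting argument, sketched here
under the assumption that the two leptons are selected independently. For a flavor-symmetric process the offline yields are then in the proportions
\[
N_{ee}^{\rm{offline}} : N_{\mu\mu}^{\rm{offline}} : N_{e\mu}^{\rm{offline}} = \epsilon_e^2 : \epsilon_\mu^2 : 2\epsilon_e\epsilon_\mu ,
\]
so that
\[
N_{ee}^{\rm{offline}} = \frac{N_{e\mu}^{\rm{offline}}}{2R_{\mu e}}, \qquad
N_{\mu\mu}^{\rm{offline}} = \frac{N_{e\mu}^{\rm{offline}}\,R_{\mu e}}{2} .
\]
Multiplying by the corresponding dilepton trigger efficiency and substituting
$N_{e\mu}^{\rm{offline}} = N_{e\mu}^{\rm{trig}}/\epsilon_{e\mu}^{\rm{trig}}$ gives the expressions above.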

Note that the relative uncertainty in the combined ee and $\mu\mu$ prediction is smaller than those for the individual ee and $\mu\mu$ predictions
because the uncertainty in $R_{\mu e}$ largely cancels when summing the ee and $\mu\mu$ predictions: the ee prediction scales as $1/R_{\mu e}$
while the $\mu\mu$ prediction scales as $R_{\mu e}$. %{\bf N.B. these uncertainties are preliminary}.

To improve the statistical precision of the FS background estimate, we remove the requirement that the e$\mu$ lepton pair falls in the Z mass window.
Instead we scale the e$\mu$ yield by $K$, the efficiency for e$\mu$ events to satisfy the Z mass requirement, extracted from simulation. In Fig.~\ref{fig:K_incl}
we display the value of $K$ in data and simulation, for a variety of \MET\ requirements, for the inclusive analysis. Based on this we choose $K=0.14\pm0.02$
for all \MET\ regions except for \MET\ $>$ 300 GeV. For this region the statistical precision is reduced, so we inflate the uncertainty and choose $K=0.14\pm0.08$.
The corresponding plot for the targeted analysis, including the b-veto, is displayed in Fig.~\ref{fig:K_targeted}.
Based on this we choose $K=0.13\pm0.02$
for all \MET\ regions up to \MET\ $>$ 100 GeV. For higher \MET\ regions (\MET\ $>$ 150 GeV and above) the statistical precision is reduced,
so we inflate the uncertainty and choose $K=0.13\pm0.07$.
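% Worked example of the K scaling (illustrative only; the symbol N^{all} and the uncorrelated-uncertainty assumption are not taken from the analysis)

As an illustration of how these factors combine, let $N_{e\mu}^{\rm{all}}$ (a symbol introduced here) denote the observed e$\mu$ yield without the Z mass requirement.
The estimated e$\mu$ yield inside the mass window is then $K \times N_{e\mu}^{\rm{all}}$, and for the inclusive-analysis regions with $K=0.14\pm0.02$
the combined prediction becomes
\[
N_{ee+\mu\mu}^{\rm{predicted}} \approx 0.99 \times 0.14 \times N_{e\mu}^{\rm{all}} \approx 0.14 \times N_{e\mu}^{\rm{all}} .
\]
If the uncertainties in the two factors are treated as uncorrelated (an assumption made here only for illustration), the relative systematic uncertainty is
\[
\sqrt{\left(\frac{0.06}{0.99}\right)^2 + \left(\frac{0.02}{0.14}\right)^2} \approx 0.16 ,
\]
which is dominated by the uncertainty in $K$. The statistical uncertainty in $N_{e\mu}^{\rm{all}}$ is not included in this number.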

\begin{figure}[!ht]
\begin{center}
\begin{tabular}{cc}
\includegraphics[width=0.4\textwidth]{plots/K_incl.pdf} &
\includegraphics[width=0.4\textwidth]{plots/K_excl.pdf} \\
\end{tabular}
\caption{
The efficiency for e$\mu$ events to satisfy the dilepton mass requirement, $K$, in data and simulation for inclusive \MET\ intervals (left) and
exclusive \MET\ intervals (right) for the inclusive analysis. Based on this we choose $K=0.14\pm0.02$ for all \MET\ regions except \MET\ $>$ 300 GeV,
where we choose $K=0.14\pm0.08$.
%{\bf FIXME plots made with 10\% of \zjets\ MC statistics, to be remade with full statistics}
\label{fig:K_incl}
}
\end{center}
\end{figure}

\begin{figure}[!hb]
\begin{center}
\begin{tabular}{cc}
\includegraphics[width=0.4\textwidth]{plots/extractK_inclusive_bveto.pdf} &
\includegraphics[width=0.4\textwidth]{plots/extractK_exclusive_bveto.pdf} \\
\end{tabular}
\caption{
The efficiency for e$\mu$ events to satisfy the dilepton mass requirement, $K$, in data and simulation for inclusive \MET\ intervals (left) and
exclusive \MET\ intervals (right) for the targeted analysis, including the b-veto.
Based on this we choose $K=0.13\pm0.02$ for the \MET\ regions up to \MET\ $>$ 100 GeV.
For higher \MET\ regions we choose $K=0.13\pm0.07$.
%{\bf FIXME plots made with 10\% of \zjets\ MC statistics, to be remade with full statistics}
\label{fig:K_targeted}
}
\end{center}
\end{figure}

\clearpage

\subsection{Estimating the WZ and ZZ Background with MC}
\label{sec:bkg_vz}

Backgrounds from W($\ell\nu$)Z($\ell\ell$) where the W lepton is not identified or is outside acceptance, and Z($\nu\nu$)Z($\ell\ell$),
are estimated from simulation. The MC modeling of these processes is validated by comparing the MC predictions with data in control samples
with exactly 3 leptons (WZ control sample) and exactly 4 leptons (ZZ control sample).
The relevant WZ and ZZ MC samples are:

\begin{itemize}
\footnotesize{
\item \verb=/WZJetsTo3LNu_TuneZ2_8TeV-madgraph-tauola/Summer12-PU_S7_START52_V9-v2/AODSIM= ($\sigma=1.058$ pb),
\item \verb=/ZZJetsTo4L_TuneZ2star_8TeV-madgraph-tauola/Summer12-PU_S7_START52_V9-v3/AODSIM= ($\sigma=0.093$ pb).
}
\end{itemize}
The WZJetsTo2L2Q, ZZJetsTo2L2Q, and ZZJetsTo2L2Nu samples are also used in this analysis but their contribution to the 3-lepton and 4-lepton
control samples is negligible.

\subsubsection{WZ Validation Studies}
\label{sec:bkg_wz}

A pure WZ sample can be selected in data with the requirements:

\begin{itemize}
\item Exactly 3 $p_T>20$~GeV leptons passing analysis identification and isolation requirements,
\item 2 of the 3 leptons must form a pair with invariant mass in the Z window 81--101 GeV,
\item \MET\ $>$ 50 GeV (to suppress DY).
\end{itemize}

The data and MC yields passing the above selection are in Table~\ref{tab:wz}.
The inclusive yields (without any jet requirements) agree within 17\%, which is approximately equal
to the uncertainty in the measured WZ cross section. A data vs. MC comparison of kinematic
distributions (jet multiplicity, \MET, Z \pt) is given in Fig.~\ref{fig:wz}. High \MET\
values in WZ and ZZ events arise from highly boosted W or Z bosons that decay leptonically,
and we therefore check that the MC does a reasonable job of reproducing the \pt\ distributions of the
leptonically decaying \Z. While the inclusive WZ yields are in reasonable agreement, we observe
an excess in data in events with at least 2 jets, corresponding to the jet multiplicity requirement
in our preselection. We observe 60 events in data while the MC predicts $33.8\pm5.2$~(stat), representing an excess of 78\%,
as indicated in Table~\ref{tab:wz2j}. We note some possible contributions to this discrepancy:

\begin{itemize}

\item The \zjets\ contribution is underestimated here, for 2 reasons: first, because the \zjets\
yield passing a \MET\ $>$ 50 GeV requirement is underestimated in MC and second, because the fake
rate is typically underestimated in the MC. To get a rough idea of how much the excess depends
on the \zjets\ yield: if the \zjets\ yield is doubled, the excess is reduced from 78\% to 55\%.
Also note that we are currently using 10\% of the \zjets\ MC sample and there is 1 event with a weight
of about 5, so the plots and tables will be remade with the full \zjets\ sample.

\item The \ttbar\ contribution is underestimated here because the fake
rate is typically underestimated in the MC. To get a rough idea of how much the excess depends
on the \ttbar\ yield: if the \ttbar\ yield is doubled, the excess is reduced from 78\% to 57\%.

\item Currently no attempt is made to reject jets from pile-up interactions, which may be responsible
for some of the excess at large \njets. To check this, we increase the jet \pt\ threshold to 40 GeV, which
helps to suppress PU jets, and observe 39 events in data vs. an MC prediction of $25\pm5.2$~(stat),
decreasing the excess from 78\% to 58\%. In the future this may be improved by explicitly
requiring the jets to be consistent with originating from the signal primary vertex.

\end{itemize}
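% Arithmetic behind the quoted excesses (yields taken from the WZ tables; added for illustration)

The excesses quoted above follow from the ratio of the observed yield to the total MC prediction. Using the yields in
Tables~\ref{tab:wz} and~\ref{tab:wz2j}, and neglecting uncertainties for this illustration,
\[
\frac{190}{162.3} - 1 \approx 0.17, \qquad \frac{60}{33.8} - 1 \approx 0.78 ,
\]
for the inclusive selection and the selection with at least 2 jets, respectively. Doubling the \zjets\ yield ($4.9$) or the \ttbar\ yield ($4.4$) in the latter gives
\[
\frac{60}{33.8 + 4.9} - 1 \approx 0.55, \qquad \frac{60}{33.8 + 4.4} - 1 \approx 0.57 .
\]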

Based on these studies we currently assess an uncertainty of 80\% on the WZ yield.

\begin{table}[htb]
\begin{center}
\caption{\label{tab:wz} Data and Monte Carlo yields passing the WZ preselection. }
\begin{tabular}{lcccc}
\hline
\hline
Sample & ee & $\mu\mu$ & e$\mu$ & total \\
\hline
WZ & 58.9 $\pm$ 0.7 & 82.2 $\pm$ 0.8 & 4.0 $\pm$ 0.2 & 145.1 $\pm$ 1.0 \\
\ttbar & 0.6 $\pm$ 0.5 & 4.3 $\pm$ 1.5 & 3.0 $\pm$ 1.2 & 8.0 $\pm$ 2.0 \\
\zjets & 0.4 $\pm$ 0.4 & 4.9 $\pm$ 4.9 & 0.0 $\pm$ 0.0 & 5.3 $\pm$ 4.9 \\
ZZ & 1.4 $\pm$ 0.0 & 2.0 $\pm$ 0.0 & 0.1 $\pm$ 0.0 & 3.5 $\pm$ 0.0 \\
WW & 0.0 $\pm$ 0.0 & 0.2 $\pm$ 0.1 & 0.2 $\pm$ 0.1 & 0.3 $\pm$ 0.1 \\
single top & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.1 $\pm$ 0.1 \\
\hline
total SM MC & 61.3 $\pm$ 0.9 & 93.7 $\pm$ 5.2 & 7.3 $\pm$ 1.3 & 162.3 $\pm$ 5.4 \\
data & 68 & 108 & 14 & 190 \\
\hline
\hline
\end{tabular}
\end{center}
\end{table}

\begin{table}[htb]
\begin{center}
\caption{\label{tab:wz2j} Data and Monte Carlo yields passing the WZ preselection and \njets\ $\geq$ 2. }
\begin{tabular}{lcccc}
\hline
\hline
Sample & ee & $\mu\mu$ & e$\mu$ & total \\
\hline
WZ & 9.8 $\pm$ 0.3 & 13.3 $\pm$ 0.3 & 0.6 $\pm$ 0.1 & 23.6 $\pm$ 0.4 \\
\ttbar & 0.2 $\pm$ 0.2 & 2.0 $\pm$ 0.9 & 2.2 $\pm$ 1.2 & 4.4 $\pm$ 1.5 \\
\zjets & 0.0 $\pm$ 0.0 & 4.9 $\pm$ 4.9 & 0.0 $\pm$ 0.0 & 4.9 $\pm$ 4.9 \\
ZZ & 0.3 $\pm$ 0.0 & 0.4 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.7 $\pm$ 0.0 \\
WW & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.1 $\pm$ 0.0 \\
single top & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
\hline
total SM MC & 10.3 $\pm$ 0.3 & 20.8 $\pm$ 5.0 & 2.8 $\pm$ 1.2 & 33.8 $\pm$ 5.2 \\
data & 23 & 32 & 5 & 60 \\
\hline
\hline
\end{tabular}
\end{center}
\end{table}

\begin{figure}[tbh]
\begin{center}
\includegraphics[width=1\linewidth]{plots/WZ.pdf}
\caption{\label{fig:wz}\protect
Data vs. MC comparisons for the WZ selection discussed in the text for \lumi.
The number of jets, missing transverse energy, and Z boson transverse momentum are displayed.
}
\end{center}
\end{figure}

\clearpage

\subsubsection{ZZ Validation Studies}
\label{sec:bkg_zz}

A pure ZZ sample can be selected in data with the requirements:

\begin{itemize}
\item Exactly 4 $p_T>20$~GeV leptons passing analysis identification and isolation requirements,
\item 2 of the 4 leptons must form a pair with invariant mass in the Z window 81--101 GeV.
\end{itemize}

The data and MC yields passing the above selection are in Table~\ref{tab:zz}, and a data vs. MC comparison of kinematic
distributions is given in Fig.~\ref{fig:zz}. Again we observe an
excess in data with respect to the MC prediction (29 observed vs. $17.3\pm0.1$~(stat) MC predicted).
After requiring at least 2 jets, we observe 2 events and the MC predicts $1.5\pm0.1$~(stat).
However, we have recently discovered that we may be using the wrong (too small) cross section for the ZZ sample,
and we are in contact with the MC generator group to determine the correct cross section.
Based on this we currently apply an uncertainty of 80\% to the ZZ background.
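% Size of the inclusive ZZ excess (yields taken from the ZZ table; added for illustration)

For orientation, and neglecting uncertainties, the size of the inclusive excess implied by the yields in Table~\ref{tab:zz} is
\[
\frac{29}{17.3} - 1 \approx 0.68 ,
\]
which is smaller than the 80\% uncertainty applied to the ZZ background.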

\begin{table}[htb]
\begin{center}
\caption{\label{tab:zz} Data and Monte Carlo yields for the ZZ preselection. }
\begin{tabular}{lcccc}
\hline
\hline
Sample & ee & $\mu\mu$ & e$\mu$ & total \\
\hline
ZZ & 6.6 $\pm$ 0.0 & 9.9 $\pm$ 0.0 & 0.4 $\pm$ 0.0 & 17.0 $\pm$ 0.1 \\
WZ & 0.1 $\pm$ 0.0 & 0.2 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.3 $\pm$ 0.0 \\
\zjets & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
\ttbar & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
WW & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
single top & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 & 0.0 $\pm$ 0.0 \\
\hline
total SM MC & 6.7 $\pm$ 0.0 & 10.1 $\pm$ 0.1 & 0.5 $\pm$ 0.0 & 17.3 $\pm$ 0.1 \\
data & 13 & 16 & 0 & 29 \\
\hline
\hline
\end{tabular}
\end{center}
\end{table}

\begin{figure}[tbh]
\begin{center}
\includegraphics[width=1\linewidth]{plots/ZZ.pdf}
\caption{\label{fig:zz}\protect
Data vs. MC comparisons for the ZZ selection discussed in the text for \lumi.
The number of jets, missing transverse energy, and Z boson transverse momentum are displayed.
}
\end{center}
\end{figure}

%\subsection{Estimating the Rare SM Backgrounds with MC}
%\label{sec:bkg_raresm}

%{\bf TODO: list samples, yields in preselection region, and show \MET\ distribution}