[ViewVC] Diff of: cvsroot/UserCode/claudioc/OSNote2010/datadriven.tex

Comparing UserCode/claudioc/OSNote2010/datadriven.tex (file contents):
Revision 1.10 by benhoob, Mon Nov 8 11:06:03 2010 UTC vs.
Revision 1.23 by benhoob, Mon Nov 15 10:11:17 2010 UTC

#	Line 3 \| Line 3
3		We have developed two data-driven methods to
4		estimate the background in the signal region.
5		The first one exploits the fact that
6	<	\met and \met$/\sqrt{\rm SumJetPt}$ are nearly
6	>	SumJetPt and \met$/\sqrt{\rm SumJetPt}$ are nearly
7		uncorrelated for the $t\bar{t}$ background
8		(Section~\ref{sec:abcd}); the second one
9		is based on the fact that in $t\bar{t}$ the
#	Line 12 \| Line 12 \| nearly the same as the $P_T$ of the pair
12		from $W$-decays, which is reconstructed as \met in the
13		detector.
14
15	<	In 30 pb$^{-1}$ we expect $\approx$ 1 SM event in
16	<	the signal region. The expectations from the LM0
17	<	and LM1 SUSY benchmark points are 5.6 and
18	<	2.2 events respectively.
15	>
16		%{\color{red} I took these
17		%numbers from the twiki, rescaling from 11.06 to 30/pb.
18		%They seem too large...are they really right?}
#	Line 24 \| Line 21 \| and LM1 SUSY benchmark points are 5.6 an
21		\subsection{ABCD method}
22		\label{sec:abcd}
23
24	<	We find that in $t\bar{t}$ events \met and
25	<	\met$/\sqrt{\rm SumJetPt}$ are nearly uncorrelated.
26	<	This is demonstrated in Figure~\ref{fig:uncor}.
24	>	We find that in $t\bar{t}$ events SumJetPt and
25	>	\met$/\sqrt{\rm SumJetPt}$ are nearly uncorrelated,
26	>	as demonstrated in Figure~\ref{fig:uncor}.
27		Thus, we can use an ABCD method in the \met$/\sqrt{\rm SumJetPt}$ vs
28		SumJetPt plane to estimate the background in a data-driven way.
29
30	<	\begin{figure}[tb]
30	>	\begin{figure}[bht]
31		\begin{center}
32		\includegraphics[width=0.75\linewidth]{uncorrelated.pdf}
33		\caption{\label{fig:uncor}\protect Distributions of SumJetPt
#	Line 39 \| Line 36 \| MET$/\sqrt{\rm SumJetPt}$.}
36		\end{center}
37		\end{figure}
38
39	<	\begin{figure}[bt]
39	>	\begin{figure}[tb]
40		\begin{center}
41		\includegraphics[width=0.5\linewidth, angle=90]{abcdMC.pdf}
42	<	\caption{\label{fig:abcdMC}\protect Distributions of SumJetPt
43	<	vs. MET$/\sqrt{\rm SumJetPt}$ for SM Monte Carlo. Here we also
47	<	show our choice of ABCD regions.}
42	>	\caption{\label{fig:abcdMC}\protect Distributions of MET$/\sqrt{\rm SumJetPt}$ vs.
43	>	SumJetPt for SM Monte Carlo. Here we also show our choice of ABCD regions.}
44		\end{center}
45		\end{figure}
46
#	Line 53 \| Line 49 \| Our choice of ABCD regions is shown in F
49		The signal region is region D. The expected number of events
50		in the four regions for the SM Monte Carlo, as well as the BG
51		prediction AC/B are given in Table~\ref{tab:abcdMC} for an integrated
52	<	luminosity of 30 pb$^{-1}$. The ABCD method is accurate
53	<	to about 10\%.
52	>	luminosity of 35 pb$^{-1}$. The ABCD method is accurate
53	>	to about 20\%, and we assess a corresponding systematic uncertainty on
54	>	the background prediction.
55		%{\color{red} Avi wants some statement about stability
56		%wrt changes in regions. I am not sure that we have done it and
57		%I am not sure it is necessary (Claudio).}
58
59	<	\begin{table}[htb]
59	>	\begin{table}[ht]
60		\begin{center}
61		\caption{\label{tab:abcdMC} Expected SM Monte Carlo yields for
62	<	30 pb$^{-1}$ in the ABCD regions.}
63	<	\begin{tabular}{|l|c|c|c|c||c|}
62	>	35 pb$^{-1}$ in the ABCD regions, as well as the predicted yield in
63	>	the signal region given by A $\times$ C / B. Here `SM other' is the sum
64	>	of non-dileptonic $t\bar{t}$ decays, $W^{\pm}$+jets, $W^+W^-$,
65	>	$W^{\pm}Z^0$, $Z^0Z^0$ and single top.}
66	>	\begin{tabular}{lccccc}
67	>	\hline
68	>	sample & A & B & C & D & A $\times$ C / B \\
69	>	\hline
70	>
71	>
72	>	\hline
73	>	$t\bar{t}\rightarrow \ell^{+}\ell^{-}$ & 7.96 & 33.07 & 4.81 & 1.20 & 1.16 \\
74	>	$Z^0 \rightarrow \ell^{+}\ell^{-}$ & 0.03 & 1.47 & 0.10 & 0.10 & 0.00 \\
75	>	SM other & 0.65 & 2.31 & 0.17 & 0.14 & 0.05 \\
76	>	\hline
77	>	total SM MC & 8.63 & 36.85 & 5.07 & 1.43 & 1.19 \\
78		\hline
68	–	Sample & A & B & C & D & AC/D \\ \hline
69	–	ttdil & 6.9 & 28.6 & 4.2 & 1.0 & 1.0 \\
70	–	Zjets & 0.0 & 1.3 & 0.1 & 0.1 & 0.0 \\
71	–	Other SM & 0.5 & 2.0 & 0.1 & 0.1 & 0.0 \\ \hline
72	–	total MC & 7.4 & 31.9 & 4.4 & 1.2 & 1.0 \\ \hline
79		\end{tabular}
80		\end{center}
81		\end{table}
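
The A $\times$ C / B column of the revision 1.23 table can be cross-checked from the A, B and C yields. The following is a minimal Python sketch, assuming the rounded yields printed in the table; the function name `abcd_prediction` and the dictionary layout are illustrative and are not part of the note.

```python
# Yields (A, B, C, D) copied from the revision 1.23 table, normalized to 35 pb^-1.
# The values are rounded as printed, so the last digit of a result can differ.
yields = {
    "ttbar dilepton": (7.96, 33.07, 4.81, 1.20),
    "Z -> ll": (0.03, 1.47, 0.10, 0.10),
    "SM other": (0.65, 2.31, 0.17, 0.14),
    "total SM MC": (8.63, 36.85, 5.07, 1.43),
}


def abcd_prediction(a, b, c):
    """Predicted yield in signal region D, assuming the two ABCD variables
    (SumJetPt and MET/sqrt(SumJetPt)) are uncorrelated."""
    if b <= 0:
        raise ValueError("region B must have a positive yield")
    return a * c / b


for sample, (a, b, c, d) in yields.items():
    pred = abcd_prediction(a, b, c)
    print(f"{sample:15s} predicted D = {pred:.2f}, MC D = {d:.2f}")
```

For the total SM MC row this gives a prediction of 1.19 against 1.43 events in region D, a ratio of about 0.83, which is in line with the quoted accuracy of about 20\%.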
#	Line 90 \| Line 96 \| In practice one has to rescale the resul
96		to account for the fact that any dilepton selection must include a
97		moderate \met cut in order to reduce Drell Yan backgrounds. This
98		is discussed in Section 5.3 of Reference~\cite{ref:ourvictory}; for a \met
99	<	cut of 50 GeV, the rescaling factor is obtained from the data as
99	>	cut of 50 GeV, the rescaling factor is obtained from the MC as
100
101		\newcommand{\ptll} {\ensuremath{P_T(\ell\ell)}}
102		\begin{center}
#	Line 115 \| Line 121 \| There are several effects that spoil the
121		$P_T(\ell\ell)$:
122		\begin{itemize}
123		\item $W$s in top events are polarized. Neutrinos are emitted preferentially
124	<	forward in the $W$ rest frame, thus the $P_T(\nu\nu)$ distribution is harder
124	>	parallel to the $W$ velocity while charged leptons are emitted preferentially
125	>	anti-parallel. Thus the $P_T(\nu\nu)$ distribution is harder
126		than the $P_T(\ell\ell)$ distribution for top dilepton events.
127		\item The lepton selection results in $P_T$ and $\eta$ cuts on the individual
128		leptons that have no simple correspondence to the neutrino requirements.
129		\item Similarly, the \met$>$50 GeV cut introduces an asymmetry between leptons and
130		neutrinos which is only partially compensated by the $K$ factor above.
131		\item The \met resolution is much worse than the dilepton $P_T$ resolution.
132	<	When convoluted with a falling spectrum in the tails of \met, this result
132	>	When convoluted with a falling spectrum in the tails of \met, this results
133		in a harder spectrum for \met than the original $P_T(\nu\nu)$.
134		\item The \met response in CMS is not exactly 1. This causes a distortion
135		in the \met distribution that is not present in the $P_T(\ell\ell)$ distribution.
#	Line 133 \| Line 140 \| of $P_T(\ell\ell)$ and $P_T(\nu\nu)$ do
140		sources. These events can affect the background prediction. Particularly
141		dangerous are high $P_T$ Drell Yan events that barely pass the \met$>$ 50
142		GeV selection. They will tend to push the data-driven background prediction up.
143	+	Therefore we estimate the number of DY events entering the background prediction
144	+	using the $R_{out/in}$ method as described in Sec.~\ref{sec:othBG}.
145		\end{itemize}
146
147		We have studied these effects in SM Monte Carlo, using a mixture of generator and
#	Line 155 \| Line 164 \| under different assumptions. See text f
164		4&Y & N & N & GEN & Y & Y & Y & 1.55 \\
165		5&Y & N & N & RECOSIM & Y & Y & Y & 1.51 \\
166		6&Y & Y & N & RECOSIM & Y & Y & Y & 1.58 \\
167	<	7&Y & Y & Y & RECOSIM & Y & Y & Y & 1.18 \\
167	>	7&Y & Y & Y & RECOSIM & Y & Y & Y & 1.38 \\
168	>	%%%NOTE: updated value 1.18 -> 1.46 since 2/3 DY events have been removed by updated analysis selections,
169	>	%%%dpt/pt cut and general lepton veto
170		\hline
171		\end{tabular}
172		\end{center}
#	Line 173 \| Line 184 \| Going from GEN to RECOSIM, the change in
184		% by $\approx 4\%$\footnote{We find that observed/predicted changes by roughly 0.1
185		%for each 1.5\% change in \met response.}.
186		Finally, contamination from non $t\bar{t}$
187	<	events can have a significant impact on the BG prediction. The changes between
188	<	lines 6 and 7 of Table~\ref{tab:victorybad} is driven by 3
189	<	Drell Yan events that pass the \met selection in Monte Carlo (thus the effect
190	<	is statistically not well quantified).
187	>	events can have a significant impact on the BG prediction.
188	>	%The changes between
189	>	%lines 6 and 7 of Table~\ref{tab:victorybad} is driven by 3
190	>	%Drell Yan events that pass the \met selection in Monte Carlo (thus the effect
191	>	%is statistically not well quantified).
192
193		An additional source of concern is that the CMS Madgraph $t\bar{t}$ MC does
194		not include effects of spin correlations between the two top quarks.
#	Line 196 \| Line 208 \| that the bias is at the few percent leve
208
209		Based on the results of Table~\ref{tab:victorybad}, we conclude that the
210		naive data driven background estimate based on $P_T{(\ell\ell)}$ needs to
211	<	be corrected by a factor of $ K = X \pm Y$.
211	>	be corrected by a factor of $ K_C = X \pm Y$.
212		The value of this correction factor as well as the systematic uncertainty
213		will be assessed using 38X ttbar madgraph MC. In the following we use
214	<	$K = 1$ for simplicity. Based on previous MC studies we foresee a correction
215	<	factor of $\approx 1.2 - 1.4$, and we will assess an uncertainty
214	>	$K_C = 1$ for simplicity. Based on previous MC studies we foresee a correction
215	>	factor of $K_C \approx 1.2 - 1.5$, and we will assess an uncertainty
216		based on the stability of the Monte Carlo tests under
217		variations of event selections, choices of \met algorithm, etc.
218		For example, we find that observed/predicted changes by roughly 0.1
#	Line 230 \| Line 242 \| in the ABCD method but not in the $P_T(\
242
243		The LM points are benchmarks for SUSY analyses at CMS. The effects
244		of signal contamination for a couple of such points are summarized
245	<	in Table~\ref{tab:sigcontABCD} and~\ref{tab:sigcontPT}.
234	<	Signal contamination is definitely an important
245	>	in Table~\ref{tab:sigcont}. Signal contamination is definitely an important
246		effect for these two LM points, but it does not totally hide the
247		presence of the signal.
248
249
250		\begin{table}[htb]
251		\begin{center}
252	<	\caption{\label{tab:sigcontABCD} Effects of signal contamination
253	<	for the background predictions of the ABCD method including LM0 or
254	<	LM1. Results
255	<	are normalized to 30 pb$^{-1}$.}
256	<	\begin{tabular}{|c|c||c|c||c|c|}
252	>	\caption{\label{tab:sigcont} Effects of signal contamination
253	>	for the two data-driven background estimates. The three columns give
254	>	the expected yield in the signal region and the background estimates
255	>	using the ABCD and $P_T(\ell \ell)$ methods. Results are normalized to 35~pb$^{-1}$.}
256	>	\begin{tabular}{lccc}
257		\hline
258	<	SM & BG Prediction & SM$+$LM0 & BG Prediction & SM$+$LM1 & BG Prediction \\
248	<	Background & SM Only & Contribution & Including LM0 & Contribution & Including LM1 \\ \hline
249	<	1.2 & 1.0 & 6.8 & 3.7 & 3.4 & 1.3 \\
258	>	& Yield & ABCD & $P_T(\ell \ell)$ \\
259		\hline
260	<	\end{tabular}
261	<	\end{center}
262	<	\end{table}
254	<
255	<	\begin{table}[htb]
256	<	\begin{center}
257	<	\caption{\label{tab:sigcontPT} Effects of signal contamination
258	<	for the background predictions of the $P_T(\ell\ell)$ method including LM0 or
259	<	LM1. Results
260	<	are normalized to 30 pb$^{-1}$.}
261	<	\begin{tabular}{|c|c||c|c||c|c|}
262	<	\hline
263	<	SM & BG Prediction & SM$+$LM0 & BG Prediction & SM$+$LM1 & BG Prediction \\
264	<	Background & SM Only & Contribution & Including LM0 & Contribution & Including LM1 \\ \hline
265	<	1.2 & 1.0 & 6.8 & 2.2 & 3.4 & 1.5 \\
260	>	SM only & 1.43 & 1.19 & 1.03 \\
261	>	SM + LM0 & 7.90 & 4.23 & 2.35 \\
262	>	SM + LM1 & 4.00 & 1.53 & 1.51 \\
263		\hline
264		\end{tabular}
265		\end{center}
266		\end{table}
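
The size of the signal contamination effect can be read off the revision 1.23 table by comparing the yield in the signal region with each background estimate. The following is a minimal Python sketch, assuming the values printed in the table and ignoring all statistical and systematic uncertainties; the helper `excess_over_prediction` is a hypothetical name, not something defined in the note.

```python
# (yield in signal region, ABCD estimate, P_T(ll) estimate), copied from the
# revision 1.23 signal-contamination table, normalized to 35 pb^-1.
# The note states that K_C = 1 is used for the P_T(ll) estimate at this stage.
table = {
    "SM only": (1.43, 1.19, 1.03),
    "SM + LM0": (7.90, 4.23, 2.35),
    "SM + LM1": (4.00, 1.53, 1.51),
}


def excess_over_prediction(observed, predicted):
    """Events in the signal region beyond the data-driven estimate."""
    return observed - predicted


for scenario, (n_signal_region, abcd, ptll) in table.items():
    print(
        f"{scenario:9s} "
        f"excess over ABCD = {excess_over_prediction(n_signal_region, abcd):.2f}, "
        f"excess over P_T(ll) = {excess_over_prediction(n_signal_region, ptll):.2f}"
    )
```

With LM0 included, the ABCD estimate rises from 1.19 to 4.23 events, yet the yield still exceeds it by 3.67 events (5.55 for the $P_T(\ell\ell)$ estimate). For LM1 the corresponding excesses are 2.47 and 2.49 events.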
267
268	+
269	+
270	+	%\begin{table}[htb]
271	+	%\begin{center}
272	+	%\caption{\label{tab:sigcontABCD} Effects of signal contamination
273	+	%for the background predictions of the ABCD method including LM0 or
274	+	%LM1. Results
275	+	%are normalized to 30 pb$^{-1}$.}
276	+	%\begin{tabular}{|c|c||c|c||c|c|}
277	+	%\hline
278	+	%SM & BG Prediction & SM$+$LM0 & BG Prediction & SM$+$LM1 & BG Prediction \\
279	+	%Background & SM Only & Contribution & Including LM0 & Contribution & Including LM1 \\ \hline
280	+	%1.2 & 1.0 & 6.8 & 3.7 & 3.4 & 1.3 \\
281	+	%\hline
282	+	%\end{tabular}
283	+	%\end{center}
284	+	%\end{table}
285	+
286	+	%\begin{table}[htb]
287	+	%\begin{center}
288	+	%\caption{\label{tab:sigcontPT} Effects of signal contamination
289	+	%for the background predictions of the $P_T(\ell\ell)$ method including LM0 or
290	+	%LM1. Results
291	+	%are normalized to 30 pb$^{-1}$.}
292	+	%\begin{tabular}{|c|c||c|c||c|c|}
293	+	%\hline
294	+	%SM & BG Prediction & SM$+$LM0 & BG Prediction & SM$+$LM1 & BG Prediction \\
295	+	%Background & SM Only & Contribution & Including LM0 & Contribution & Including LM1 \\ \hline
296	+	%1.2 & 1.0 & 6.8 & 2.2 & 3.4 & 1.5 \\
297	+	%\hline
298	+	%\end{tabular}
299	+	%\end{center}
300	+	%\end{table}
301	+

Diff Legend

–	Removed lines
+	Added lines
<	Changed lines
>	Changed lines
