\section{Background Estimates from Data}
\label{sec:datadriven}

To look for possible BSM contributions, we define two signal regions that preserve about
0.1\% of the dilepton $t\bar{t}$ events by adding requirements of large \MET\ and \Ht:

\begin{itemize}
\item high \MET\ signal region: \MET $>$ 275~GeV, \Ht $>$ 300~GeV,
\item high \Ht\ signal region: \MET $>$ 200~GeV, \Ht $>$ 600~GeV.
\end{itemize}

For the high \MET\ (high \Ht) signal region, the MC predicts 2.6 (2.5) SM events,
dominated by dilepton $t\bar{t}$; the expected LM1 yield is 17 (14) and the
expected LM3 yield is 4.3 (4.3). The signal regions are indicated in Fig.~\ref{fig:met_ht}.
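
For concreteness, the two selections can be summarized as in the following minimal
sketch (illustrative Python; the function names and the GeV-valued inputs are ours,
not part of the analysis code):

\begin{verbatim}
# Signal-region selections; met and ht are per-event values in GeV.
def in_high_met_region(met, ht):
    return met > 275.0 and ht > 300.0

def in_high_ht_region(met, ht):
    return met > 200.0 and ht > 600.0
\end{verbatim}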

We use three independent methods to estimate from data the backgrounds in the signal regions.
The first method is a novel technique based on the ABCD method, which we used in our 2010 analysis~\cite{ref:ospaper},
and exploits the fact that \HT\ and $y$ are nearly uncorrelated for the $t\bar{t}$ background;
this method is referred to as the ABCD' technique. First, we extract the $y$ and \Ht\ distributions
$f(y)$ and $g(H_T)$ from data, using events from control regions which are dominated by background.
Because $y$ and \Ht\ are only weakly correlated, the distribution of events in the $y$ vs.\ \Ht\ plane is described by:

\begin{equation}
\frac{\partial^2 N}{\partial y \, \partial H_T} = f(y)\,g(H_T),
\end{equation}

allowing us to deduce the number of events falling in any region of this plane. In particular,
we can deduce the number of events falling in our signal regions defined by requirements on \MET\ and \Ht.
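
Explicitly, the number of events expected in any region $R$ of the plane is

\begin{equation}
N(R) = \int_R f(y)\, g(H_T)\, dy\, dH_T .
\end{equation}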

We measure the $f(y)$ and $g(H_T)$ distributions using events in the regions indicated in Fig.~\ref{fig:abcdprimedata}.
Next, we randomly sample values of $y$ and \Ht\ from these distributions; each pair of $y$ and \Ht\ values is a pseudo-event.
We generate a large ensemble of pseudo-events and find the ratio $R_{S/C}$, the ratio of the
number of pseudo-events falling in the signal region to the number of pseudo-events
falling in a control region defined by the same requirements used to select the events
from which $f(y)$ and $g(H_T)$ are measured. We then
multiply this ratio by the number of events which fall in the control region in data
to obtain the predicted yield, i.e., $N_{\rm pred} = R_{S/C} \times N({\rm control})$.
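
The pseudo-event machinery can be sketched as follows (illustrative Python; the
histogram inputs are hypothetical, and we assume $y = \MET/\sqrt{\Ht}$ so that each
pseudo-event can be tested against the \MET\ and \Ht\ requirements):

\begin{verbatim}
import numpy as np

def sample_hist(centers, contents, n, rng):
    # Draw n values from a binned distribution given by its bin
    # centers and bin contents.
    p = contents / contents.sum()
    return rng.choice(centers, size=n, p=p)

def predict_yield(f_y, g_ht, n_control_data,
                  in_signal, in_control, n_pseudo=1000000, seed=0):
    # f_y, g_ht: (centers, contents) pairs measured in the
    # background-dominated control regions.
    # in_signal/in_control: vectorized cuts, e.g.
    #   lambda met, ht: (met > 275) & (ht > 300)
    rng = np.random.default_rng(seed)
    y  = sample_hist(*f_y,  n_pseudo, rng)  # sampled independently:
    ht = sample_hist(*g_ht, n_pseudo, rng)  # factorization ansatz
    met = y * np.sqrt(ht)                   # assumes y = MET/sqrt(HT)
    r_sc = (np.count_nonzero(in_signal(met, ht)) /
            np.count_nonzero(in_control(met, ht)))   # R_{S/C}
    return r_sc * n_control_data   # N_pred = R_{S/C} x N(control)
\end{verbatim}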

To estimate the statistical uncertainty in the predicted background, we smear the bin contents
of $f(y)$ and $g(H_T)$ according to their uncertainties. We repeat the prediction 20 times
with these smeared distributions and take the RMS of the deviations from the nominal prediction
as the statistical uncertainty. We have validated this technique with toy MC studies based on
event samples of similar size to the expected yield in data for 1~fb$^{-1}$.
Based on these studies, we correct the predicted background yields by factors of 1.2 $\pm$ 0.5
(1.0 $\pm$ 0.5) for the high \MET\ (high \Ht) signal region.
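
Continuing the sketch above, the smearing procedure might look like the following
(again illustrative; the bin-uncertainty arrays and cut functions are hypothetical inputs):

\begin{verbatim}
def smear(contents, errors, rng):
    # Fluctuate each bin within its uncertainty, clipping at zero.
    return np.clip(contents + rng.normal(0.0, errors), 0.0, None)

rng = np.random.default_rng(1)
nominal = predict_yield(f_y, g_ht, n_control, in_sig, in_con)
devs = []
for _ in range(20):   # repeat the prediction 20 times
    f_s = (f_y[0],  smear(f_y[1],  f_y_err,  rng))
    g_s = (g_ht[0], smear(g_ht[1], g_ht_err, rng))
    devs.append(predict_yield(f_s, g_s, n_control,
                              in_sig, in_con) - nominal)
stat_unc = np.sqrt(np.mean(np.square(devs)))  # RMS of deviations
\end{verbatim}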

The second background estimate, henceforth referred to as the dilepton transverse momentum ($\pt(\ell\ell)$) method,
is based on the idea~\cite{ref:victory} that in dilepton $t\bar{t}$ events the
\pt\ distributions of the charged leptons and neutrinos from $W$
decays are related, because of the common boosts from the top and $W$
decays. This relation is governed by the polarization of the $W$'s,
which is well understood in top
decays in the SM~\cite{Wpolarization,Wpolarization2} and can therefore be
reliably accounted for. We then use the observed
$\pt(\ell\ell)$ distribution to model the $\pt(\nu\nu)$ distribution,
which is identified with \MET. Thus, we use the number of observed
events with $\HT > 300\GeV$ and $\pt(\ell\ell) > 275\GeV$
($\HT > 600\GeV$ and $\pt(\ell\ell) > 200\GeV$)
to predict the number of background events with
$\HT > 300\GeV$ and $\MET > 275\GeV$ ($\HT > 600\GeV$ and $\MET > 200\GeV$).
In practice, two corrections must be applied to this prediction, as described below.

%
% Now describe the corrections
%
The first correction accounts for the $\MET > 50\GeV$ requirement in the
preselection, which is needed to reduce the DY background. We
rescale the prediction by a factor equal to the inverse of the
fraction of events passing the preselection which also satisfy the
requirement $\pt(\ell\ell) > 50\GeVc$.
For the \Ht\ $>$ 300~GeV requirement corresponding to the high \MET\ signal region,
we determine this correction from data and find $K_{50}=1.5 \pm 0.3$.
For the \Ht\ $>$ 600~GeV requirement corresponding to the high \Ht\ signal region,
we do not have enough events in data to determine this correction with statistical
precision, so we instead extract it from MC and find $K_{50}=1.3 \pm 0.2$.
The second correction ($K_C$) accounts for the known polarization of the $W$, which
introduces a difference between the $\pt(\ell\ell)$ and $\pt(\nu\nu)$
distributions. The correction $K_C$ also takes into account detector effects, such as the hadronic energy
scale and resolution, which affect \MET\ but not $\pt(\ell\ell)$.
The total correction factor is $K_{50} \times K_C = 2.2 \pm 0.9$ ($1.7 \pm 0.6$) for the
high \MET\ (high \Ht) signal region, where the uncertainty includes the MC statistical uncertainty
in the extraction of $K_C$ and the 5\% uncertainty in the hadronic energy scale~\cite{ref:jes}.
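
Schematically, the corrected prediction can be computed as follows (illustrative
Python; \texttt{ptll} and \texttt{ht} are hypothetical arrays of per-event values
in GeV, and the totals $K_{50} \times K_C$ are the values quoted above):

\begin{verbatim}
import numpy as np

def ptll_prediction(ptll, ht, ptll_cut, ht_cut, k_total):
    # Count events with pt(ll) above the MET threshold of the
    # signal region, then apply the total correction K50 * KC
    # (preselection, W polarization, detector effects).
    n = np.count_nonzero((ptll > ptll_cut) & (ht > ht_cut))
    return k_total * n

# High-MET region: K50 * KC = 2.2 +/- 0.9
n_pred_high_met = ptll_prediction(ptll, ht, 275.0, 300.0, 2.2)
# High-HT region:  K50 * KC = 1.7 +/- 0.6
n_pred_high_ht  = ptll_prediction(ptll, ht, 200.0, 600.0, 1.7)
\end{verbatim}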

Our third background estimation method is based on the fact that many models of new physics
produce an excess of SF with respect to OF lepton pairs. In SUSY, such an excess may be produced
in the decay $\chi_2^0 \to \chi_1^0 \ell^+\ell^-$ or in the decay of $Z$ bosons produced in
the cascade decays of heavy, colored objects. In contrast, for the \ttbar\ background the
rates of SF and OF lepton pairs are the same, as is also the case for other SM backgrounds
such as $W^+W^-$ and DY$\to\tau^+\tau^-$. We quantify the excess of SF over OF pairs using the
quantity

\begin{equation}
\label{eq:ofhighpt}
\Delta = R_{\mu e}N(ee) + \frac{1}{R_{\mu e}}N(\mu\mu) - N(e\mu),
\end{equation}

where $R_{\mu e} = 1.13 \pm 0.05$ is the ratio of muon to electron selection efficiencies,
evaluated by taking the square root of the ratio of the number of
$Z \to \mu^+\mu^-$ to $Z \to e^+e^-$ events in data, in the mass range 76--106~GeV with no jet or
\MET\ requirements. The quantity $\Delta$ is predicted to be 0 for processes with
uncorrelated lepton flavors. In order for this technique to work, the kinematic selection
applied to events in all dilepton flavor channels must be the same, which is not the case
for our default selection because the $Z$ mass veto is applied only to the same-flavor channels.
Therefore, when applying the OF subtraction technique, we apply the $Z$ mass veto also
to the $e\mu$ channel.
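
In code, $\Delta$ and the efficiency ratio are straightforward to evaluate (a minimal
sketch; the event counts are hypothetical inputs):

\begin{verbatim}
import math

def r_mue(n_z_mumu, n_z_ee):
    # sqrt of the ratio of Z->mumu to Z->ee counts in data
    # (76-106 GeV, no jet or MET requirements).
    return math.sqrt(n_z_mumu / n_z_ee)

def delta(n_ee, n_mumu, n_emu, r=1.13):
    # SF excess over OF, corrected for the muon/electron
    # selection-efficiency ratio R_mue defined above.
    return r * n_ee + n_mumu / r - n_emu
\end{verbatim}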

All background estimation methods based on data are in principle subject to signal contamination
in the control regions, which tends to decrease the significance of any signal
present in the data by increasing the background prediction.
In general, it is difficult to quantify these effects because we
do not know what signal may be present in the data. Having two
independent methods (in addition to expectations from MC)
adds redundancy, because signal contamination can have different effects
in the different control regions of the two methods.
For example, in the extreme case of a
BSM signal with identical distributions of $\pt(\ell\ell)$ and \MET, an excess of events might be seen
in the ABCD' method but not in the $\pt(\ell\ell)$ method.