\section{Background Estimates from Data}
\label{sec:datadriven}
We use three independent methods to estimate from data the background in the signal region.
The first method is a novel technique based on the ABCD method used in our 2010 analysis~\cite{ref:ospaper},
and exploits the fact that \HT\ and $y$ are nearly uncorrelated for the $t\bar{t}$ background;
this method is referred to as the ABCD' technique. First, we extract the $y$ and \HT\ distributions
$f(y)$ and $g(H_T)$ from data, using events from control regions that are dominated by background.
Because $y$ and \HT\ are only weakly correlated, we can predict the distribution of events in the $y$ vs.\ \HT\ plane as:

\begin{equation}
\frac{\partial^2 N}{\partial y \, \partial H_T} = f(y)g(H_T),
\end{equation}
allowing us to deduce the number of events falling in any region of this plane. In particular,
we can deduce the number of events falling in our signal regions defined by requirements on \MET\ and \HT.
|
17 |
We measure the $f(y)$ and $g(H_T)$ distributions using events in the regions indicated in Fig.~\ref{fig:abcdprime}
|
18 |
Next, we randomly sample values of $y$ and \Ht\ from these distributions; each pair of $y$ and \Ht\ values is a pseudo-event.
|
19 |
We generate a large ensemble of pseudo-events, and find the ratio $R_{S/C}$, the ratio of the
|
20 |
number of pseudo-events falling in the signal region to the number of pseudo-events
|
21 |
falling in a control region defined by the same requirements used to select events
|
22 |
to measure $f(y)$ and $g(H_T)$. We then
|
23 |
multiply this ratio by the number of \ttbar\ MC events which fall in the control region
|
24 |
to get the predicted yield, ie. $N_{pred} = R_{S/C} \times N({\rm control})$.
|
25 |
To estimate the statistical uncertainty in the predicted background, we smear the bin contents
|
26 |
of $f(y)$ and $g(H_T)$ according to their uncertainties. We repeat the prediction 20 times
|
27 |
with these smeared distributions, and take the RMS of the deviation from the nominal prediction
|
28 |
as the statistical uncertainty. We have studied this technique using toy MC studies based on
|
29 |
similar event samples of similar size to the expected yield in data for 1 fb$^{-1}$.
|
30 |
Based on these studies we correct the predicted backgrounds yields by factors of 1.2 $\pm$ 0.5
|
31 |
(1.0 $\pm$ 0.5) for the high \MET\ (high \Ht) signal region.
|
32 |
|
33 |
|
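For illustration, the pseudo-event procedure above can be sketched as follows. The binned distributions, region boundaries, relative bin uncertainties, and control-region yield below are placeholders, not the values used in the analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder 1D distributions f(y) and g(HT) "measured" in control regions:
# bin edges and normalized bin probabilities are purely illustrative.
y_edges = np.linspace(0.0, 10.0, 21)                       # y, GeV^(1/2)
y_prob = np.exp(-0.5 * np.arange(20)); y_prob /= y_prob.sum()
ht_edges = np.linspace(100.0, 900.0, 17)                   # HT, GeV
ht_prob = np.exp(-0.3 * np.arange(16)); ht_prob /= ht_prob.sum()

def sample(edges, prob, n):
    """Draw n values from a binned distribution, uniform within each bin."""
    idx = rng.choice(len(prob), size=n, p=prob)
    return edges[idx] + rng.random(n) * (edges[idx + 1] - edges[idx])

# Pseudo-events: (y, HT) pairs drawn independently, using the assumed
# factorization d2N/(dy dHT) = f(y) g(HT).
n_pseudo = 1_000_000
y = sample(y_edges, y_prob, n_pseudo)
ht = sample(ht_edges, ht_prob, n_pseudo)

# Hypothetical signal/control region definitions (placeholder cuts).
r_sc = (((y > 8.5) & (ht > 300.0)).sum()
        / ((y < 4.0) & (ht > 125.0)).sum())                # R_{S/C}
n_control_data = 250                                       # placeholder yield
n_pred = r_sc * n_control_data                             # predicted background

# Statistical uncertainty: smear the bin contents (here by a placeholder 10%
# relative uncertainty), repeat the prediction 20 times, take the RMS deviation.
preds = []
for _ in range(20):
    yp = rng.normal(y_prob, 0.1 * y_prob).clip(min=0); yp /= yp.sum()
    hp = rng.normal(ht_prob, 0.1 * ht_prob).clip(min=0); hp /= hp.sum()
    ys, hs = sample(y_edges, yp, n_pseudo), sample(ht_edges, hp, n_pseudo)
    s = (((ys > 8.5) & (hs > 300.0)).sum()
         / ((ys < 4.0) & (hs > 125.0)).sum())
    preds.append(s * n_control_data)
stat_unc = np.sqrt(np.mean((np.array(preds) - n_pred) ** 2))
```

The sampling step is where the weak $y$--\HT\ correlation assumption enters: drawing $y$ and \HT\ independently is exact only if the factorization holds.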

The second background estimate, henceforth referred to as the dilepton transverse momentum ($\pt(\ell\ell)$) method,
is based on the idea~\cite{ref:victory} that in dilepton $t\bar{t}$ events the
\pt\ distributions of the charged leptons and neutrinos from $W$
decays are related, because of the common boosts from the top and $W$
decays. This relation is governed by the polarization of the $W$'s,
which is well understood in top
decays in the SM~\cite{Wpolarization,Wpolarization2} and can therefore be
reliably accounted for. We then use the observed
$\pt(\ell\ell)$ distribution to model the $\pt(\nu\nu)$ distribution,
which is identified with \MET. Thus, we use the number of observed
events with $\HT > 300\GeV$ and $\pt(\ell\ell) > 275\GeV$
($\HT > 600\GeV$ and $\pt(\ell\ell) > 200\GeV$)
to predict the number of background events with
$\HT > 300\GeV$ and $\MET > 275\GeV$ ($\HT > 600\GeV$ and $\MET > 200\GeV$).
In practice, two corrections must be applied to this prediction, as described below.

%
% Now describe the corrections
%
The first correction accounts for the $\MET > 50\GeV$ requirement in the
preselection, which is needed to reduce the DY background. We
rescale the prediction by a factor equal to the inverse of the
fraction of events passing the preselection that also satisfy the
requirement $\pt(\ell\ell) > 50\GeV$.
For the $\HT > 300\GeV$ requirement corresponding to the high-\MET\ signal region,
we determine this correction from data and find $K_{50}=1.5 \pm 0.3$.
For the $\HT > 600\GeV$ requirement corresponding to the high-\HT\ signal region,
we do not have enough events in data to determine this correction with statistical
precision, so we instead extract it from MC and find $K_{50}=1.3 \pm 0.2$.
The second correction ($K_C$) is associated with the known polarization of the $W$, which
introduces a difference between the $\pt(\ell\ell)$ and $\pt(\nu\nu)$
distributions. The correction $K_C$ also takes into account detector effects, such as the hadronic energy
scale and resolution, which affect \MET\ but not $\pt(\ell\ell)$.
The total correction factor is $K_{50} \times K_C = 2.2 \pm 0.9$ ($1.7 \pm 0.6$) for the
high-\MET\ (high-\HT) signal region, where the uncertainty includes the MC statistical uncertainty
in the extraction of $K_C$ and the 5\% uncertainty in the hadronic energy scale~\cite{ref:jes}.

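Numerically, the corrected prediction amounts to scaling the observed $\pt(\ell\ell)$ tail count by the total correction factor. A minimal sketch for the high-\MET\ region, using the $K_{50} \times K_C$ value quoted above and a hypothetical observed count:

```python
import math

# Total correction K50 * KC for the high-MET signal region, as quoted in the
# text; the observed pt(ll) tail count below is a hypothetical placeholder.
k_tot, k_tot_err = 2.2, 0.9
n_obs = 4                      # events with HT > 300 GeV, pt(ll) > 275 GeV
n_obs_err = math.sqrt(n_obs)   # Poisson uncertainty on the observed count

# Predicted background in the signal region: N_pred = (K50 * KC) * N_obs.
n_pred = k_tot * n_obs

# Propagate the two uncertainties assuming they are uncorrelated.
rel = math.hypot(k_tot_err / k_tot, n_obs_err / n_obs)
n_pred_err = n_pred * rel
```

With the quoted $\sim$40\% uncertainty on $K_{50} \times K_C$, the correction factor dominates the total uncertainty until the observed tail count drops below roughly six events.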
Our third background estimation method is based on the fact that many models of new physics
produce an excess of SF with respect to OF lepton pairs. In SUSY, such an excess may be produced
in the decay $\chi_2^0 \to \chi_1^0 \ell^+\ell^-$ or in the decay of $Z$ bosons produced in
the cascade decays of heavy, colored objects. In contrast, for the \ttbar\ background the
rates of SF and OF lepton pairs are the same, as is also the case for other SM backgrounds
such as $W^+W^-$ or DY$\to\tau^+\tau^-$. We quantify the excess of SF over OF pairs using the
quantity

\begin{equation}
\label{eq:ofhighpt}
\Delta = R_{\mu e}N(ee) + \frac{1}{R_{\mu e}}N(\mu\mu) - N(e\mu),
\end{equation}

where $R_{\mu e} = 1.13 \pm 0.05$ is the ratio of muon to electron selection efficiencies.
This ratio is evaluated by taking the square root of the ratio of the number of observed
$Z \to \mu^+\mu^-$ to $Z \to e^+e^-$ events in the mass range 76--106\GeV, with no jet or
\MET\ requirements. The quantity $\Delta$ is predicted to be 0 for processes with
uncorrelated lepton flavors. In order for this technique to work, the kinematic selection
applied to events in all dilepton flavor channels must be the same, which is not the case
for our default selection because the $Z$ mass veto is applied only to the same-flavor channels.
Therefore, when applying the OF subtraction technique we apply the $Z$ mass veto also
to the $e\mu$ channel.

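The evaluation of Eq.~(\ref{eq:ofhighpt}) and its uncertainty can be sketched as follows; the $R_{\mu e}$ value is taken from the text, while the three event counts are hypothetical placeholders:

```python
import math

# Efficiency ratio from the text; the event counts are hypothetical.
r_mue, r_mue_err = 1.13, 0.05
n_ee, n_mumu, n_emu = 15, 20, 30

# Flavor-symmetry difference: Delta = R*N(ee) + N(mumu)/R - N(emu).
delta = r_mue * n_ee + n_mumu / r_mue - n_emu

# Statistical part: Poisson variances of the three counts, scaled by the
# coefficients with which each count enters Delta.
stat = math.sqrt(r_mue**2 * n_ee + n_mumu / r_mue**2 + n_emu)
# R_mue uncertainty, propagated via dDelta/dR = N(ee) - N(mumu)/R^2.
syst = abs(n_ee - n_mumu / r_mue**2) * r_mue_err
delta_err = math.hypot(stat, syst)
```

A flavor-uncorrelated sample gives $\Delta$ compatible with zero within this uncertainty, while an SF excess pushes $\Delta$ positive.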

All background estimation methods based on data are in principle subject to signal contamination
in the control regions, which tends to decrease the significance of any signal
present in the data by increasing the background prediction.
In general, it is difficult to quantify these effects because we
do not know what signal may be present in the data. Having two
independent methods (in addition to expectations from MC)
adds redundancy, because signal contamination can have different effects
in the different control regions of the two methods.
For example, in the extreme case of a
BSM signal with identical distributions of $\pt(\ell\ell)$ and \MET, an excess of events might be seen
in the ABCD' method but not in the $\pt(\ell\ell)$ method.