\section{Counting Experiments}
\label{sec:datadriven}

To look for possible BSM contributions, we define two signal regions that retain about
0.1\% of the dilepton $t\bar{t}$ events, by adding requirements of large \MET\ and \Ht:

\begin{itemize}
\item high \MET\ signal region: \MET\ $>$ 275~GeV, \Ht\ $>$ 300~GeV,
\item high \Ht\ signal region: \MET\ $>$ 200~GeV, \Ht\ $>$ 600~GeV.
\end{itemize}

For the high \MET\ (high \Ht) signal region, the MC predicts 2.6 (2.5) SM events,
dominated by dilepton $t\bar{t}$; the expected LM1 yield is 17 (14) and the
expected LM3 yield is 6.4 (6.7). The signal regions are indicated in Fig.~\ref{fig:met_ht}.
These signal regions are tighter than the one used in our published 2010 analysis because,
with the larger data sample, they give improved sensitivity to contributions from new physics.

We perform counting experiments in these signal regions, and use three independent methods to estimate the background in each signal region from data.
The first method is a novel technique based on the ABCD method, which we used in our 2010 analysis~\cite{ref:ospaper},
and exploits the fact that \HT\ and $y \equiv \MET/\sqrt{H_T}$ are nearly uncorrelated for the $t\bar{t}$ background;
this method is referred to as the ABCD' technique. First, we extract the $y$ and \Ht\ distributions
$f(y)$ and $g(H_T)$ from data, using events from control regions that are dominated by background.
Because $y$ and \Ht\ are only weakly correlated, the distribution of events in the $y$ vs.\ \Ht\ plane is described by:
\begin{equation}
\label{eq:abcdprime}
\frac{\partial^2 N}{\partial y \partial H_T} = f(y)g(H_T),
\end{equation}
allowing us to predict the number of events falling in any region of this plane. In particular,
we can predict the number of events falling in our signal regions, which are defined by requirements on \MET\ and \Ht.
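
As an illustration of how Eq.~\ref{eq:abcdprime} leads to a signal-region prediction, the following sketch integrates the factorized density over a region defined by $\MET > M_0$ and $H_T > H_0$. The threshold symbols $M_0$ and $H_0$ and the label $N_{\rm pred}$ are notation introduced here for illustration only, and the sketch assumes that the product $f(y)g(H_T)$ is normalized to the background yield:
\begin{equation}
N_{\rm pred} = \int_{H_0}^{\infty} g(H_T) \left[ \int_{M_0/\sqrt{H_T}}^{\infty} f(y)\, dy \right] dH_T .
\end{equation}
The lower limit of the $y$ integral depends on $H_T$ because a fixed \MET\ threshold corresponds to $y > M_0/\sqrt{H_T}$; for the high \MET\ signal region, for example, $M_0 = 275$~GeV and $H_0 = 300$~GeV.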

We measure the $f(y)$ and $g(H_T)$ distributions using events in the regions indicated in Fig.~\ref{fig:abcdprimedata},
and predict the background yields in the signal regions using Eq.~\ref{eq:abcdprime}.
%Next, we randomly sample values of $y$ and \Ht\ from these distributions; each pair of $y$ and \Ht\ values is a pseudo-event.
%We generate a large ensemble of pseudo-events, and find the ratio $R_{S/C}$, the ratio of the
%number of pseudo-events falling in the signal region to the number of pseudo-events
%falling in a control region defined by the same requirements used to select events
%to measure $f(y)$ and $g(H_T)$. We then
%multiply this ratio by the number events which fall in the control region in data
%to get the predicted yield, ie. $N_{pred} = R_{S/C} \times N({\rm control})$.
To estimate the statistical uncertainty in the predicted background, the bin contents
of $f(y)$ and $g(H_T)$ are smeared according to their Poisson uncertainties, the prediction is repeated 20 times
with these smeared distributions, and the RMS of the deviation from the nominal prediction is taken
as the statistical uncertainty. We have tested this technique with toy MC studies based on
event samples of similar size to the expected yield in data for 1~fb$^{-1}$.
Based on these studies, we correct the predicted background yields by factors of $1.2 \pm 0.5$
($1.0 \pm 0.5$) for the high \MET\ (high \Ht) signal region.
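
The statistical uncertainty described above can be written compactly. In the sketch below, $N_{\rm nom}$ denotes the nominal prediction and $N_i$ the prediction obtained from the $i$-th set of smeared distributions; both symbols are notation introduced here, and the expression assumes that the RMS is taken about the nominal value rather than about the mean of the repetitions:
\begin{equation}
\sigma_{\rm stat} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(N_i - N_{\rm nom}\right)^2}, \qquad n = 20 .
\end{equation}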

The second background estimate, henceforth referred to as the dilepton transverse momentum ($\pt(\ell\ell)$) method,
is based on the idea~\cite{ref:victory} that in dilepton $t\bar{t}$ events the
\pt\ distributions of the charged leptons and neutrinos from $W$
decays are related, because of the common boosts from the top and $W$
decays. This relation is governed by the polarization of the $W$ bosons,
which is well understood in top
decays in the SM~\cite{Wpolarization,Wpolarization2} and can therefore be
reliably accounted for. We therefore use the observed
$\pt(\ell\ell)$ distribution to model the $\pt(\nu\nu)$ distribution,
which is identified with \MET. Thus, we use the number of observed
events with $\HT > 300\GeV$ and $\pt(\ell\ell) > 275\GeV$
($\HT > 600\GeV$ and $\pt(\ell\ell) > 200\GeV$)
to predict the number of background events with
$\HT > 300\GeV$ and $\MET > 275\GeV$ ($\HT > 600\GeV$ and $\MET > 200\GeV$).
In practice, we apply two corrections to this prediction, following the same procedure as in Ref.~\cite{ref:ospaper}.
The first correction is $K_{50}=1.5 \pm 0.3$ ($1.3 \pm 0.2$) for the high \MET\ (high \Ht) signal region.
The second correction factor is $K_C = 1.5 \pm 0.5$ ($1.3 \pm 0.4$) for the
high \MET\ (high \Ht) signal region.
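
As a rough illustration of how these corrections might enter, and assuming that they are applied multiplicatively (the procedure itself is specified in Ref.~\cite{ref:ospaper}, not here), the prediction for a signal region with thresholds $M_0$ and $H_0$ would take the form
\begin{equation}
N_{\rm pred}(\MET > M_0,\ H_T > H_0) = K_{50}\, K_C\, N_{\rm obs}(\pt(\ell\ell) > M_0,\ H_T > H_0),
\end{equation}
where $M_0$, $H_0$, $N_{\rm pred}$, and $N_{\rm obs}$ are notation introduced here for illustration. With the values quoted above, the product $K_{50} K_C$ would be 2.25 (1.69) for the high \MET\ (high \Ht) signal region. If the two uncertainties were treated as uncorrelated, which is an assumption made only for this example, the uncertainty in the product would be about $\pm 0.9$ ($\pm 0.6$).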

Our third background estimation method is based on the fact that many models of new physics
produce an excess of SF with respect to OF lepton pairs, while for the \ttbar\ background the
rates of SF and OF lepton pairs are the same. Hence we make use of the OF subtraction technique
discussed in Sec.~\ref{sec:fit}, in which we performed a shape analysis of the dilepton mass distribution.
Here we perform a counting experiment, quantifying the excess of SF vs.\ OF pairs using the
quantity
\begin{equation}
\label{eq:ofhighpt}
\Delta = R_{\mu e}N(ee) + \frac{1}{R_{\mu e}}N(\mu\mu) - N(e\mu).
\end{equation}
This quantity is predicted to be zero for processes with
uncorrelated lepton flavors. For this technique to work, the kinematic selection
applied to events in all dilepton flavor channels must be the same, which is not the case
for our default selection because the $Z$ mass veto is applied only to the same-flavor channels.
Therefore, when applying the OF subtraction technique, we also apply the $Z$ mass veto
to the $e\mu$ channel.
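
To see why $\Delta$ vanishes for uncorrelated lepton flavors, consider the following sketch. It assumes that $R_{\mu e}$ is the ratio of muon to electron selection efficiencies, $R_{\mu e} = \epsilon_\mu/\epsilon_e$ (its definition is not restated in this section), and that each lepton is equally likely to be an electron or a muon. The efficiencies $\epsilon_e$ and $\epsilon_\mu$ and the number of produced dilepton events $N_0$ are notation introduced here for illustration. Under these assumptions the expected yields are
\begin{equation}
N(ee) = \frac{N_0\,\epsilon_e^2}{4}, \qquad
N(\mu\mu) = \frac{N_0\,\epsilon_\mu^2}{4}, \qquad
N(e\mu) = \frac{N_0\,\epsilon_e\epsilon_\mu}{2},
\end{equation}
so that
\begin{equation}
R_{\mu e}N(ee) + \frac{1}{R_{\mu e}}N(\mu\mu)
= \frac{N_0\,\epsilon_e\epsilon_\mu}{4} + \frac{N_0\,\epsilon_e\epsilon_\mu}{4}
= N(e\mu),
\end{equation}
and the expectation value of $\Delta$ is zero, up to statistical fluctuations.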

All background estimation methods based on data are in principle subject to signal contamination
in the control regions, which tends to increase the background prediction and thereby decrease
the significance of any signal that may be present in the data.
In general, it is difficult to quantify these effects because we
do not know what signal may be present in the data. Having three
independent methods (in addition to expectations from MC)
adds redundancy because signal contamination can have different effects
in the different control regions for the three methods.
For example, in the extreme case of a
BSM signal with identical distributions of $\pt(\ell \ell)$ and \MET, an excess of events might be seen
in the ABCD' method but not in the $\pt(\ell \ell)$ method.