1 |
< |
\section{Background Estimates from Data} |
1 |
> |
\section{Counting Experiments} |
2 |
|
\label{sec:datadriven} |
3 |
< |
We use three independent methods to estimate from data the background in the signal region. |
4 |
< |
The first method is a novel technique based on the ABCD method used in our 2010 analysis~\cite{ref:ospaper}, |
5 |
< |
and exploits the fact that \HT\ and $y$ are nearly uncorrelated for the $t\bar{t}$ background; |
3 |
> |
|
4 |
> |
To look for possible BSM contributions, we define two signal regions that reject all but |
5 |
> |
0.1\% of the dilepton $t\bar{t}$ events, by adding requirements of large \MET\ and \Ht: |
6 |
> |
|
7 |
> |
\begin{itemize} |
8 |
> |
\item high \MET\ signal region: \MET\ $>$ 275~GeV, \Ht\ $>$ 300~GeV, |
9 |
> |
\item high \Ht\ signal region: \MET\ $>$ 200~GeV, \Ht\ $>$ 600~GeV. |
10 |
> |
\end{itemize} |
11 |
> |
|
12 |
> |
For the high \MET\ (high \Ht) signal region, the MC predicts 2.6 (2.5) SM events, |
13 |
> |
dominated by dilepton $t\bar{t}$; the expected LM1 yield is 17 (14) and the |
14 |
> |
expected LM3 yield is 6.4 (6.7). The signal regions are indicated in Fig.~\ref{fig:met_ht}. |
15 |
> |
These signal regions are tighter than the one used in our published 2010 analysis since |
16 |
> |
with the larger data sample they allow us to explore phase space farther from the core |
17 |
> |
of the SM distributions. |
18 |
> |
|
19 |
> |
|
20 |
> |
We perform counting experiments in these signal regions, using three independent methods to estimate the background in each signal region from data. |
21 |
> |
The first method is a novel variation of the ABCD method used in our 2010 analysis~\cite{ref:ospaper}, |
22 |
> |
and exploits the fact that \HT\ and $y \equiv \MET/\sqrt{H_T}$ are nearly uncorrelated for the $t\bar{t}$ background; |
23 |
|
this method is referred to as the ABCD' technique. First, we extract the $y$ and \Ht\ distributions |
24 |
|
$f(y)$ and $g(H_T)$ from data, using events from control regions which are dominated by background. |
25 |
< |
Because $y$ and \Ht\ are weakly-correlated, we can predict the distribution of events in the $y$ vs. \Ht\ plane as: |
25 |
> |
Because $y$ and \Ht\ are weakly correlated, the distribution of events in the $y$ vs. \Ht\ plane is described by: |
26 |
|
|
27 |
|
\begin{equation} |
28 |
+ |
\label{eq:abcdprime} |
29 |
|
\frac{\partial^2 N}{\partial y \partial H_T} = f(y)g(H_T), |
30 |
|
\end{equation} |
31 |
|
|
32 |
|
allowing us to deduce the number of events falling in any region of this plane. In particular, |
33 |
|
we can deduce the number of events falling in our signal regions defined by requirements on \MET\ and \Ht. |
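
The signal-region yield implied by Eq.~\ref{eq:abcdprime} can be written out explicitly. The following is a sketch, assuming that $f(y)$ and $g(H_T)$ are normalized such that their product integrates to the total background yield, and using the definition $y \equiv \MET/\sqrt{H_T}$. The threshold symbols $M_0$ and $H_0$ are introduced here for illustration only:

\[
N_{\rm pred} = \int_{H_0}^{\infty} dH_T \; g(H_T) \int_{M_0/\sqrt{H_T}}^{\infty} dy \; f(y),
\]

where $(M_0, H_0) = (275, 300)$~GeV for the high \MET\ signal region and $(200, 600)$~GeV for the high \Ht\ signal region. The lower limit of the inner integral depends on $H_T$ because the requirement $\MET > M_0$ translates into $y > M_0/\sqrt{H_T}$.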
34 |
|
|
35 |
< |
We measure the $f(y)$ and $g(H_T)$ distributions using events in the regions indicated in Fig.~\ref{fig:abcdprimedata} |
36 |
< |
Next, we randomly sample values of $y$ and \Ht\ from these distributions; each pair of $y$ and \Ht\ values is a pseudo-event. |
37 |
< |
We generate a large ensemble of pseudo-events, and find the ratio $R_{S/C}$, the ratio of the |
38 |
< |
number of pseudo-events falling in the signal region to the number of pseudo-events |
39 |
< |
falling in a control region defined by the same requirements used to select events |
40 |
< |
to measure $f(y)$ and $g(H_T)$. We then |
41 |
< |
multiply this ratio by the number of \ttbar\ MC events which fall in the control region |
42 |
< |
to get the predicted yield, ie. $N_{pred} = R_{S/C} \times N({\rm control})$. |
43 |
< |
To estimate the statistical uncertainty in the predicted background, we smear the bin contents |
44 |
< |
of $f(y)$ and $g(H_T)$ according to their uncertainties. We repeat the prediction 20 times |
45 |
< |
with these smeared distributions, and take the RMS of the deviation from the nominal prediction |
46 |
< |
as the statistical uncertainty. We have studied this technique using toy MC studies based on |
47 |
< |
similar event samples of similar size to the expected yield in data for 1 fb$^{-1}$. |
48 |
< |
Based on these studies we correct the predicted backgrounds yields by factors of 1.2 $\pm$ 0.5 |
35 |
> |
We measure the $f(y)$ and $g(H_T)$ distributions using events in the regions indicated in Fig.~\ref{fig:abcdprimedata}, |
36 |
> |
and predict the background yields in the signal regions using Eq.~\ref{eq:abcdprime}. |
37 |
> |
%Next, we randomly sample values of $y$ and \Ht\ from these distributions; each pair of $y$ and \Ht\ values is a pseudo-event. |
38 |
> |
%We generate a large ensemble of pseudo-events, and find the ratio $R_{S/C}$, the ratio of the |
39 |
> |
%number of pseudo-events falling in the signal region to the number of pseudo-events |
40 |
> |
%falling in a control region defined by the same requirements used to select events |
41 |
> |
%to measure $f(y)$ and $g(H_T)$. We then |
42 |
> |
%multiply this ratio by the number events which fall in the control region in data |
43 |
> |
%to get the predicted yield, ie. $N_{pred} = R_{S/C} \times N({\rm control})$. |
44 |
> |
To estimate the statistical uncertainty in the predicted background, the bin contents |
45 |
> |
of $f(y)$ and $g(H_T)$ are smeared according to their Poisson uncertainties. |
46 |
> |
We have tested this technique with toy MC studies based on |
47 |
> |
event samples of similar size to the expected yield in data for 1 fb$^{-1}$. |
48 |
> |
Based on these studies we correct the predicted background yields by factors of 1.2 $\pm$ 0.5 |
49 |
|
(1.0 $\pm$ 0.5) for the high \MET\ (high \Ht) signal region. |
50 |
|
|
51 |
|
|
59 |
|
reliably accounted for. We then use the observed |
60 |
|
$\pt(\ell\ell)$ distribution to model the $\pt(\nu\nu)$ distribution, |
61 |
|
which is identified with \MET. Thus, we use the number of observed |
62 |
< |
events with $\HT > 300\GeV$ and $\pt(\ell\ell) > 275\GeV^{1/2}$ |
63 |
< |
($\HT > 600\GeV$ and $\pt(\ell\ell) > 200\GeV^{1/2}$ ) |
62 |
> |
events with $\HT > 300\GeV$ and $\pt(\ell\ell) > 275\GeV$ |
63 |
> |
($\HT > 600\GeV$ and $\pt(\ell\ell) > 200\GeV$) |
64 |
|
to predict the number of background events with |
65 |
< |
$\HT > 300\GeV$ and $\MET = > 275\GeV^{1/2}$ ($\HT > 600\GeV$ and $\MET = > 200\GeV^{1/2}$). |
66 |
< |
In practice, two corrections must be applied to this prediction, as described below. |
67 |
< |
|
68 |
< |
% |
69 |
< |
% Now describe the corrections |
52 |
< |
% |
53 |
< |
The first correction accounts for the $\MET > 50\GeV$ requirement in the |
54 |
< |
preselection, which is needed to reduce the DY background. We |
55 |
< |
rescale the prediction by a factor equal to the inverse of the |
56 |
< |
fraction of events passing the preselection which also satisfy the |
57 |
< |
requirement $\pt(\ell\ell) > 50\GeVc$. |
58 |
< |
For the \Ht $>$ 300 GeV requirement corresponding to the high \MET\ signal region, |
59 |
< |
we determine this correction from data and find $K_{50}=1.5 \pm 0.3$. |
60 |
< |
For the \Ht $>$ 600 GeV requirement corresponding to the high \Ht\ signal region, |
61 |
< |
we do not have enough events in data to determine this correction with statistical |
62 |
< |
precisions, so we instead extract it from MC and find $K_{50}=1.3 \pm 0.2$. |
63 |
< |
The second correction ($K_C$) is associated with the known polarization of the $W$, which |
64 |
< |
introduces a difference between the $\pt(\ell\ell)$ and $\pt(\nu\nu)$ |
65 |
< |
distributions. The correction $K_C$ also takes into account detector effects such as the hadronic energy |
66 |
< |
scale and resolution which affect the \MET\ but not $\pt(\ell\ell)$. |
67 |
< |
The total correction factor is $K_{50} \times K_C = 2.2 \pm 0.9$ ($1.7 \pm 0.6$) for the |
68 |
< |
high \MET (high \Ht) signal regions, where the uncertainty includes the MC statistical uncertainty |
69 |
< |
in the extraction of $K_C$ and the 5\% uncertainty in the hadronic energy scale~\cite{ref:jes}. |
65 |
> |
$\HT > 300\GeV$ and $\MET > 275\GeV$ ($\HT > 600\GeV$ and $\MET > 200\GeV$). |
66 |
> |
In practice, we apply two corrections to this prediction, following the same procedure as in Ref.~\cite{ref:ospaper}. |
67 |
> |
The first correction is $K_{50}=1.5 \pm 0.3$ ($1.3 \pm 0.2$) for the high \MET\ (high \Ht) signal region. |
68 |
> |
The second correction factor is $K_C = 1.5 \pm 0.5$ ($1.3 \pm 0.4$) for the |
69 |
> |
high \MET\ (high \Ht) signal region. |
70 |
|
|
71 |
|
Our third background estimation method is based on the fact that many models of new physics |
72 |
< |
produce an excess of SF with respect to OF lepton pairs. In SUSY, such an excess may produced |
73 |
< |
in the decay $\chi_2^0 \to \chi_1^0 \ell^+\ell^-$ or in the decay of $Z$ bosons produced in |
74 |
< |
the cascade decays of heavy, colored objects. In contrast, for the \ttbar\ background the |
75 |
< |
rates of SF and OF lepton pairs are the same, as is also the case for other SM backgrounds |
76 |
< |
such as $W^+W^-$ or DY$\to\tau^+\tau^-$. We quantify the excess of SF vs. OF pairs using the |
72 |
> |
produce an excess of SF with respect to OF lepton pairs, while for the \ttbar\ background the |
73 |
> |
rates of SF and OF lepton pairs are the same. Hence we make use of the OF subtraction technique |
74 |
> |
discussed in Sec.~\ref{sec:fit}, in which we performed a shape analysis of the dilepton mass distribution. |
75 |
> |
Here we instead perform a counting experiment, quantifying the excess of SF vs. OF pairs using the |
76 |
|
quantity |
77 |
|
|
78 |
|
\begin{equation} |
79 |
|
\label{eq:ofhighpt} |
80 |
< |
\Delta = R_{\mu e}N(ee) + \frac{1}{R_{\mu e}}N(\mu\mu) - N(e\mu), |
80 |
> |
\Delta = R_{\mu e}N(ee) + \frac{1}{R_{\mu e}}N(\mu\mu) - N(e\mu). |
81 |
|
\end{equation} |
82 |
|
|
83 |
< |
where $R_{\mu e} = 1.13 \pm 0.05$ is the ratio of muon to electron selection efficiencies. |
85 |
< |
This quantity is evaluated by taking the square root of the ratio of the number of observed |
86 |
< |
$Z \to \mu^+\mu^-$ to $Z \to e^+e^-$ events, in the mass range 76-106 GeV with no jets or |
87 |
< |
\met\ requirements. The quantity $\Delta$ is predicted to be 0 for processes with |
83 |
> |
This quantity is predicted to be 0 for processes with |
84 |
|
uncorrelated lepton flavors. In order for this technique to work, the kinematic selection |
85 |
|
applied to events in all dilepton flavor channels must be the same, which is not the case |
86 |
< |
for our default selection because the $Z$ mass veto is applied only to same-flavor channels.Therefore when applying the OF subtraction technique we also apply the $Z$ mass veto also |
86 |
> |
for our default selection because the $Z$ mass veto is applied only to same-flavor channels. |
87 |
> |
Therefore, when applying the OF subtraction technique, we also apply the $Z$ mass veto |
88 |
|
to the $e\mu$ channel. |
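
The vanishing of $\Delta$ for uncorrelated lepton flavors can be checked with a short calculation. This is a sketch under stated assumptions: $R_{\mu e} = \epsilon_\mu/\epsilon_e$ is taken to be the ratio of muon to electron selection efficiencies, the dilepton efficiency is assumed to factorize into per-lepton efficiencies, and electrons and muons are assumed to be produced at equal rates. The symbols $\epsilon_e$, $\epsilon_\mu$, and $N_0$ (the number of produced lepton pairs) are introduced here for illustration only. The expected yields are then $N(ee) = \epsilon_e^2 N_0/4$, $N(\mu\mu) = \epsilon_\mu^2 N_0/4$, and $N(e\mu) = \epsilon_e \epsilon_\mu N_0/2$, so that

\[
R_{\mu e} N(ee) + \frac{1}{R_{\mu e}} N(\mu\mu)
= \frac{\epsilon_e \epsilon_\mu N_0}{4} + \frac{\epsilon_e \epsilon_\mu N_0}{4}
= N(e\mu),
\]

and hence $\Delta = 0$ in expectation, independent of the individual efficiencies.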
89 |
|
|
90 |
|
All background estimation methods based on data are in principle subject to signal contamination |
91 |
|
in the control regions, which tends to decrease the significance of a signal |
92 |
|
that may be present in the data, by increasing the background prediction. |
93 |
|
In general, it is difficult to quantify these effects because we |
94 |
< |
do not know what signal may be present in the data. Having two |
94 |
> |
do not know what signal may be present in the data. Having three |
95 |
|
independent methods (in addition to expectations from MC) |
96 |
|
adds redundancy because signal contamination can have different effects |
97 |
< |
in the different control regions for the two methods. |
97 |
> |
in the different control regions for the three methods. |
98 |
|
For example, in the extreme case of a |
99 |
|
BSM signal with identical distributions of $\pt(\ell \ell)$ and \MET, an excess of events might be seen |
100 |
< |
in the ABCD method but not in the $\pt(\ell \ell)$ method. |
100 |
> |
in the ABCD' method but not in the $\pt(\ell \ell)$ method. |
101 |
|
|