Comparing UserCode/benhoob/cmsnotes/OSPAS2011/datadriven.tex (file contents):
Revision 1.2 by benhoob, Mon Jun 13 16:39:03 2011 UTC vs.
Revision 1.3 by benhoob, Mon Jun 13 18:08:56 2011 UTC

# Line 1 | Line 1
1   \section{Background Estimates from Data}
2   \label{sec:datadriven}
3 +
4 + To look for possible BSM contributions, we define two signal regions that preserve about
5 + 0.1\% of the dilepton $t\bar{t}$ events, by adding requirements of large \MET\ and \Ht:
6 +
7 + \begin{itemize}
8 + \item high \MET\ signal region: \MET\ $>$ 275~GeV, \Ht\ $>$ 300~GeV,
9 + \item high \Ht\ signal region:  \MET\ $>$ 200~GeV, \Ht\ $>$ 600~GeV.
10 + \end{itemize}
11 +
12 + For the high \MET\ (high \Ht) signal region, the MC predicts 2.6 (2.5) SM events,
13 + dominated by dilepton $t\bar{t}$; the expected LM1 yield is 17 (14) and the
14 + expected LM3 yield is 4.3 (4.3). The signal regions are indicated in Fig.~\ref{fig:met_ht}.
15 +
16   We use three independent methods to estimate the background in the signal regions from data.
17 < The first method is a novel technique based on the ABCD method used in our 2010 analysis~\cite{ref:ospaper},
17 > The first method is a novel technique based on the ABCD method, which we used in our 2010 analysis~\cite{ref:ospaper},
18   and exploits the fact that \HT\ and $y$ are nearly uncorrelated for the $t\bar{t}$ background;
19   this method is referred to as the ABCD' technique. First, we extract the $y$ and \Ht\ distributions
20   $f(y)$ and $g(H_T)$ from data, using events from control regions which are dominated by background.
21 < Because $y$ and \Ht\ are weakly-correlated, we can predict the distribution of events in the $y$ vs. \Ht\ plane as:
21 > Because $y$ and \Ht\ are weakly correlated, the distribution of events in the $y$ vs.\ \Ht\ plane is described by:
22  
23   \begin{equation}
24   \frac{\partial^2 N}{\partial y \partial H_T} = f(y)g(H_T),
# Line 20 | Line 33 | We generate a large ensemble of pseudo-e
33   number of pseudo-events falling in the signal region to the number of pseudo-events
34   falling in a control region defined by the same requirements used to select events
35   to measure $f(y)$ and $g(H_T)$. We then
36 < multiply this ratio by the number of \ttbar\ MC events which fall in the control region
36 > multiply this ratio by the number of events which fall in the control region in data
37   to get the predicted yield, i.e., $N_{pred} = R_{S/C} \times N({\rm control})$.
38   To estimate the statistical uncertainty in the predicted background, we smear the bin contents
39   of $f(y)$ and $g(H_T)$ according to their uncertainties. We repeat the prediction 20 times
40   with these smeared distributions, and take the RMS of the deviation from the nominal prediction
41   as the statistical uncertainty. We have studied this technique using toy MC studies based on
42 < similar event samples of similar size to the expected yield in data for 1 fb$^{-1}$.
42 > event samples of similar size to the expected yield in data for 1 fb$^{-1}$.
43   Based on these studies we correct the predicted background yields by factors of 1.2 $\pm$ 0.5
44   (1.0 $\pm$ 0.5) for the high \MET\ (high \Ht) signal region.
45  
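The ABCD' prediction and its statistical uncertainty described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code: the histograms are passed as hypothetical `(edges, contents)` pairs, the signal and control regions are supplied as placeholder predicates `in_signal(y, ht)` and `in_control(y, ht)`, values are drawn uniformly within each bin, and the bin contents are smeared with a Gaussian, none of which is specified in the text.

```python
import math
import random


def sample_from_hist(edges, contents, rng):
    """Draw one value from a binned distribution, uniform within the chosen bin."""
    i = rng.choices(range(len(contents)), weights=contents, k=1)[0]
    return rng.uniform(edges[i], edges[i + 1])


def predict(f_hist, g_hist, in_signal, in_control, n_control_data,
            n_pseudo=100_000, seed=1):
    """N_pred = R_{S/C} * N(control), with R_{S/C} taken from pseudo-events."""
    rng = random.Random(seed)
    n_sig = n_ctl = 0
    for _ in range(n_pseudo):
        y = sample_from_hist(f_hist[0], f_hist[1], rng)   # y drawn from f(y)
        ht = sample_from_hist(g_hist[0], g_hist[1], rng)  # H_T drawn independently from g(H_T)
        n_sig += bool(in_signal(y, ht))
        n_ctl += bool(in_control(y, ht))
    if n_ctl == 0:
        raise ValueError("no pseudo-events fell in the control region")
    return n_control_data * n_sig / n_ctl


def smear(hist, errors, rng):
    """Gaussian-smear bin contents (an assumption), clipping negatives to zero."""
    edges, contents = hist
    return edges, [max(0.0, rng.gauss(c, e)) for c, e in zip(contents, errors)]


def stat_uncertainty(f_hist, f_err, g_hist, g_err, in_signal, in_control,
                     n_control_data, n_trials=20, n_pseudo=100_000, seed=1):
    """RMS deviation of the smeared predictions from the nominal prediction."""
    nominal = predict(f_hist, g_hist, in_signal, in_control,
                      n_control_data, n_pseudo, seed)
    rng = random.Random(seed + 1)
    total = 0.0
    for _ in range(n_trials):
        trial = predict(smear(f_hist, f_err, rng), smear(g_hist, g_err, rng),
                        in_signal, in_control, n_control_data, n_pseudo, seed)
        total += (trial - nominal) ** 2
    return math.sqrt(total / n_trials)
```

The same pseudo-event seed is reused for the nominal and smeared predictions so that the RMS mainly reflects the smearing of $f(y)$ and $g(H_T)$ rather than pseudo-event sampling noise; this is a design choice of the sketch, not something stated in the note.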
# Line 41 | Line 54 | decays in the SM~\cite{Wpolarization,Wpo
54   reliably  accounted   for.   We then  use   the  observed
55   $\pt(\ell\ell)$ distribution to  model the $\pt(\nu\nu)$ distribution,
56   which is  identified with \MET.  Thus,  we use the  number of observed
57 < events  with $\HT > 300\GeV$ and $\pt(\ell\ell)  > 275\GeV^{1/2}$
57 > events  with $\HT > 300\GeV$ and $\pt(\ell\ell)  > 275\GeV$
58   ($\HT > 600\GeV$ and $\pt(\ell\ell)  > 200\GeV$)
59   to predict the  number of  background events  with
60 < $\HT >  300\GeV$ and  $\MET = > 275\GeV^{1/2}$ ($\HT >  600\GeV$ and  $\MET = > 200\GeV^{1/2}$).  
60 > $\HT >  300\GeV$ and  $\MET > 275\GeV$ ($\HT >  600\GeV$ and  $\MET > 200\GeV$).  
61   In  practice, two corrections must be applied to this prediction, as described below.
62  
63   %
# Line 55 | Line 68 | preselection, which is needed to  reduce
68   rescale  the  prediction by  a  factor equal  to  the  inverse of  the
69   fraction  of  events  passing  the preselection which  also  satisfy  the
70   requirement  $\pt(\ell\ell) >  50\GeVc$.  
71 < For the \Ht $>$ 300 GeV requirement corresponding to the high \MET\ signal region,
71 > For the \Ht\ $>$ 300 GeV requirement corresponding to the high \MET\ signal region,
72   we determine this correction from data and find  $K_{50}=1.5 \pm 0.3$.  
73 < For the \Ht $>$ 600 GeV requirement corresponding to the high \Ht\ signal region,
73 > For the \Ht\ $>$ 600 GeV requirement corresponding to the high \Ht\ signal region,
74   we do not have enough events in data to determine this correction with statistical
75 < precisions, so we instead extract it from MC and find $K_{50}=1.3 \pm 0.2$.
75 > precision, so we instead extract it from MC and find $K_{50}=1.3 \pm 0.2$.
76   The  second  correction ($K_C$) is  associated with the  known polarization  of the  $W$, which
77   introduces a difference  between the $\pt(\ell\ell)$ and $\pt(\nu\nu)$
78   distributions. The correction $K_C$ also takes into account detector effects such as the hadronic energy
# Line 69 | Line 82 | high \MET (high \Ht) signal regions, whe
82   in the extraction of $K_C$ and the 5\%  uncertainty in  the hadronic energy scale~\cite{ref:jes}.
83  
84   Our third background estimation method is based on the fact that many models of new physics
85 < produce an excess of SF with respect to OF lepton pairs. In SUSY, such an excess may produced
85 > produce an excess of SF with respect to OF lepton pairs. In SUSY, such an excess may be produced
86   in the decay $\chi_2^0 \to \chi_1^0 \ell^+\ell^-$ or in the decay of $Z$ bosons produced in
87   the cascade decays of heavy, colored objects. In contrast, for the \ttbar\ background the
88   rates of SF and OF lepton pairs are the same, as is also the case for other SM backgrounds
# Line 81 | Line 94 | quantity
94   \Delta = R_{\mu e}N(ee) + \frac{1}{R_{\mu e}}N(\mu\mu) - N(e\mu),
95   \end{equation}
96  
97 < where $R_{\mu e} = 1.13 \pm 0.05$ is the ratio of muon to electron selection efficiencies.
98 < This quantity is evaluated by taking the square root of the ratio of the number of observed
99 < $Z \to \mu^+\mu^-$ to $Z \to e^+e^-$ events, in the mass range 76-106 GeV with no jets or
97 > where $R_{\mu e} = 1.13 \pm 0.05$ is the ratio of muon to electron selection efficiencies,
98 > evaluated by taking the square root of the ratio of the number of
99 > $Z \to \mu^+\mu^-$ to $Z \to e^+e^-$ events in data, in the mass range 76--106~GeV with no jets or
100   \met\ requirements. The quantity $\Delta$ is predicted to be 0 for processes with
101   uncorrelated lepton flavors. In order for this technique to work, the kinematic selection
102   applied to events in all dilepton flavor channels must be the same, which is not the case
103 < for our default selection because the $Z$ mass veto is applied only to same-flavor channels.Therefore when applying the OF subtraction technique we also apply the $Z$ mass veto also
103 > for our default selection because the $Z$ mass veto is applied only to same-flavor channels.
104 > Therefore, when applying the OF subtraction technique, we apply the $Z$ mass veto also
105   to the $e\mu$ channel.
106  
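As a small worked check of why $\Delta$ vanishes for uncorrelated lepton flavors, the sketch below evaluates $R_{\mu e}$ and $\Delta$ for a toy sample. The function names and the per-lepton efficiencies are hypothetical, and the toy assumes that yields factorize into per-lepton efficiencies with true $ee$, $\mu\mu$ and $e\mu$ rates in the ratio 1:1:2.

```python
import math


def r_mu_e(n_z_mumu, n_z_ee):
    """Muon-to-electron efficiency ratio: sqrt(N(Z->mumu) / N(Z->ee))."""
    return math.sqrt(n_z_mumu / n_z_ee)


def of_subtraction_delta(n_ee, n_mumu, n_emu, r):
    """Delta = R * N(ee) + N(mumu) / R - N(emu)."""
    return r * n_ee + n_mumu / r - n_emu


# Toy inputs (made up for illustration): per-lepton efficiencies and the
# number of true ee pairs before selection.
eps_e, eps_mu, n_pairs = 0.8, 0.9, 1000.0

# Z yields scale with the square of the per-lepton efficiency.
r = r_mu_e(n_z_mumu=eps_mu**2, n_z_ee=eps_e**2)

# Flavor-uncorrelated sample: ee : mumu : emu = 1 : 1 : 2 before efficiencies.
delta = of_subtraction_delta(
    n_ee=n_pairs * eps_e**2,
    n_mumu=n_pairs * eps_mu**2,
    n_emu=2 * n_pairs * eps_e * eps_mu,
    r=r,
)
```

In this toy, both $R_{\mu e}N(ee)$ and $N(\mu\mu)/R_{\mu e}$ reduce to `n_pairs * eps_e * eps_mu`, so their sum cancels the $e\mu$ yield and `delta` is zero up to floating-point rounding.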
107   All background estimation methods based on data are in principle subject to signal contamination
# Line 100 | Line 114 | adds redundancy because signal contamina
114   in the different control regions for the two methods.
115   For example, in the extreme case of a
116   BSM signal with identical distributions of $\pt(\ell \ell)$ and \MET, an excess of events might be seen
117 < in the ABCD method but not in the $\pt(\ell \ell)$ method.
117 > in the ABCD' method but not in the $\pt(\ell \ell)$ method.
118  
