
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Backgrounds}\label{section:BG}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
This section reviews our evaluation of background in the $4\ell$ analysis.  We discuss expected yields and the predicted $m(4\ell)$ shapes, both of which are used in the limit and sensitivity calculations described in Section~\ref{sec:Extraction}.  We estimate Electroweak (EWK) backgrounds with Monte Carlo.  Our estimates of instrumental and jet backgrounds are data-driven.

%_________________________________________________________________
\subsection{Electroweak Backgrounds}\label{sec:EWK}
%_________________________________________________________________
We use the $ZZ \rightarrow 4\ell$, $WZ \rightarrow 3\ell$ and $Z\gamma$ MC samples listed in Table~\ref{tab:MC} to estimate yields and $m(4\ell)$ shapes for these backgrounds.  We correct the acceptances determined from simulation using the procedures described in Section~\ref{section:signalEff}.  The background yield for each process $i$ follows from its corrected acceptance ($\alpha^{c}_{i}$), evaluated separately in the $4e$, $4\mu$ and $2e2\mu$ channels:

\begin{eqnarray}
N^{exp}_{i}  & = & \alpha^{c}_{i}\,k_{i}\,\sigma_{i}\int\mathcal{L}\,dt
\end{eqnarray}

The cross sections ($\sigma_{i}$) and K-factors ($k_{i}$) used in the formula above are taken from Table~\ref{tab:xsec}.  Table~\ref{tab:MCBG} lists the $\alpha^{c}$ and the expected yields in $2.1\rm~fb^{-1}$ for each of the backgrounds.  Figure~\ref{fig:MCshapes} shows a yield-normalized stack of the corresponding $m(4\ell)$ distributions.
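
As an illustration of the units involved, consider a worked example with purely hypothetical inputs (these numbers are not taken from Table~\ref{tab:xsec} or from any table in this note): a process with $\sigma = 10\rm~fb$, $k = 1.0$ and $\alpha^{c} = 0.5$ would give, in $2.1\rm~fb^{-1}$,

\begin{eqnarray*}
N^{exp} & = & 0.5 \times 1.0 \times 10\rm~fb \times 2.1\rm~fb^{-1} = 10.5
\end{eqnarray*}

Cross sections quoted in pb must be converted to fb ($1\rm~pb = 10^{3}\rm~fb$) before multiplying by a luminosity expressed in $\rm fb^{-1}$.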

%-------------------------------------------------
\begin{table}[htb]
\begin{center}
\begin{tabular}{c|cc|cc|cc}
\hline
process   & $\alpha^{c}_{4e}$  & $N^{exp}_{4e}$ & $\alpha^{c}_{4\mu}$    & $N^{exp}_{4\mu}$   & $\alpha^{c}_{2e2\mu}$  & $N^{exp}_{2e2\mu}$ \\
\hline
$ZZ^{*}$  &  ~                 & ~              &  ~                 & ~              &  ~                 & ~              \\ 
$WZ$      &  ~                 & ~              &  ~                 & ~              &  ~                 & ~              \\ 
$Z\gamma$ &  ~                 & ~              &  ~                 & ~              &  ~                 & ~              \\ 
\hline
\end{tabular}
\caption{{\bf MC Background Yields.}\small{blah.}\label{tab:MCBG}}
\end{center}
\end{table}
%-------------------------------------------------

We consider two sources of systematic uncertainty on the EWK background predictions.  The first is due to the uncertainty on the efficiency scale-factors, which we propagate from Tables~\ref{tab:}-\ref{tab:} to the corrected acceptance for each channel.  The second concerns the shape of the $m(4\ell)$ distribution predicted by the MC, which we estimate by reweighting our POWHEG samples at generator level to the $m(4\ell)$ distributions predicted by MCFM.  We take the relative difference in shape as the uncertainty input to the limit calculation.  Table~\ref{tab:EWKsys} lists the yield uncertainties.  Figure~\ref{fig:EWKshapesSys} shows the relative shape differences we obtain after reweighting.

%-------------------------------------------------
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.5\linewidth]{figs/HF1.png}
\caption{MC Background Shapes.\label{fig:MCshapes} }
\end{center}
\end{figure}
%-------------------------------------------------

%-------------------------------------------------
\begin{figure}[bht]
\begin{center}
\includegraphics[width=0.5\linewidth]{figs/HF1.png}
\caption{EWK Shape Differences From MCFM Reweight.\label{fig:EWKshapesSys} }
\end{center}
\end{figure}
%-------------------------------------------------

%_________________________________________________________________
\subsection{Instrumental/Fake Backgrounds}\label{sec:fakes}
%_________________________________________________________________
$Z+jets$, $Zb\bar{b}/c\bar{c}$ and $t\bar{t}$ backgrounds (collectively, $\ell\ell jj$) contribute to the $4\ell$ signal region when jets in these events are either misidentified as leptons or produce real leptons through secondary interactions.  These processes are difficult to simulate accurately, so we estimate their contribution from data.  We assess $\ell\ell jj$ backgrounds using the ``fakeable object'' technique~\cite{fakeable}, employing ``fakerates'' defined with respect to loosely identified lepton candidates, referred to as {\it denominator objects}.  Electron and muon denominator selections are defined in Table~\ref{tab:fo}.

%-------------------------------------------------
\begin{table}[htb]
\begin{center}
\begin{tabular}{|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{Electron}                    & \multicolumn{2}{|c|}{Muon}  \\
\hline
variable               & requirement              & variable   & requirement          \\
\hline
$E_{T}$   &  $> 7\rm~GeV$            &                                  $p_{T}$                & $> 5\rm~GeV$            \\
$|dz|$    &   $< 0.1\rm~cm$          &                                  type                   & $\rm Global~||~Tracker$  \\
$|\eta|$  & $< 2.5$                  &                                  $|d_{0}|$              & $< 2\rm~mm$             \\
$H/E$     & $< 0.12~(0.1)$ in EB (EE) &                                 $Iso^{pf}_{0.3}$       & $< 3\times p_{T}$        \\
$iso_{trk}$ & $<0.3$                 &                                  ~                      & ~                        \\
$iso_{em}$  & $<0.3$                 &                                  ~                      & ~                        \\
$iso_{had}$ & $<0.3$                 &                                  ~                      & ~                        \\
\hline
\end{tabular}
\caption{Denominator Object Definitions}\label{tab:fo}
\end{center}
\end{table}
%-------------------------------------------------

We calculate the fakerates, $\epsilon_{FR}(p_{T},\eta)$, from samples of events that pass single-lepton triggers: \verb|HLT_Ele8| for electrons, and \verb|HLT_Mu8| or \verb|HLT_Mu13| for muons.  In both channels we reduce contamination from $W\rightarrow \ell\nu$ and $Z/\gamma^{*}\rightarrow\ell\ell$ by vetoing events with $MET > 20\rm~GeV$, with $m_{T} > 35\rm~GeV$, or with two or more denominator objects of $p_{T} > 10\rm~GeV$.  We enrich the samples in background by selecting only those denominator objects that are separated by $\Delta R(\eta,\phi) > 1.0$ from a reconstructed jet with $p_{T} > 35\rm~GeV$.  Figure~\ref{fig:FR} shows the electron and muon fakerates obtained from this procedure as a function of $p_{T}$.
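
In each $(p_{T},\eta)$ bin the fakerate can be written as a simple ratio.  The following is a sketch: $N_{pass}$ and $N_{den}$ are illustrative symbols introduced here for the number of denominator objects that also pass the full lepton selection and for the total number of denominator objects, and the binomial uncertainty is an assumption rather than a statement of the treatment actually used:

\begin{eqnarray*}
\epsilon_{FR}(p_{T},\eta) & = & \frac{N_{pass}(p_{T},\eta)}{N_{den}(p_{T},\eta)}, \\
\delta\epsilon_{FR} & \approx & \sqrt{\frac{\epsilon_{FR}\,(1-\epsilon_{FR})}{N_{den}}}
\end{eqnarray*}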

%-------------------------------------------------
\begin{figure}[tbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figs/bdt-medium-frpt.png}
\includegraphics[width=0.45\linewidth]{figs/frMu.png}
\caption{ {\bf Muon and Electron Fake Rates.}\label{fig:FR} }
\end{center}
\end{figure}
%-------------------------------------------------

We estimate $\ell\ell jj$ backgrounds in the signal region by applying the fakerates in events that contain a good Z1.  First, we select denominator objects that fail identification/isolation to prevent bias from real leptons.  Next, we loop over pairs of these denominator objects, weight each leg with $\epsilon_{FR}(p_{T},\eta)/(1-\epsilon_{FR}(p_{T},\eta))$ and apply the Z2 kinematic requirements ($12 < m(Z2) < 120\rm~GeV$).  The denominator in the weight term accounts for the fact that we only consider candidates that fail the full lepton selection.  Weighted pairs that pass the Z2 kinematic selection are summed to obtain an estimate of the $\ell\ell jj$ background.
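
The origin of the weight can be made explicit with a short derivation.  This is a sketch that assumes the fakerate measured in the jet-enriched sample applies unchanged to denominator objects in Z1 events; the symbols $N_{den}$, $N_{pass}$, $N_{fail}$ and $w_{i}$ are introduced only for this illustration.  If $N_{den}$ fake denominator objects populate a given $(p_{T},\eta)$ bin, then on average

\begin{eqnarray*}
N_{pass} & = & \epsilon_{FR}\,N_{den}, \\
N_{fail} & = & (1-\epsilon_{FR})\,N_{den}, \\
N_{pass} & = & \frac{\epsilon_{FR}}{1-\epsilon_{FR}}\,N_{fail}
\end{eqnarray*}

so each failing object, weighted by $w_{i} = \epsilon_{FR}(p_{T}^{i},\eta^{i})/(1-\epsilon_{FR}(p_{T}^{i},\eta^{i}))$, predicts the corresponding number of passing fakes.  For two fake legs the weights multiply, and the estimate becomes

\begin{eqnarray*}
N(\ell\ell jj) & = & \sum_{i<j} w_{i}\,w_{j}
\end{eqnarray*}

where the sum runs over pairs of failing denominator objects that satisfy the Z2 kinematic requirements.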

Table~\ref{tab:fakes} presents $\ell\ell jj$ background estimates for the $2.1\rm~fb^{-1}$ dataset.  We maximize the statistical power of the small $Z1 + \ge 2\rm~denominator$ sample by integrating over the flavor of the $Z1$ leptons and then dividing the $Z1$-inclusive prediction between the $4\ell_{e,\mu}$ and $2\ell_{e,\mu}2\ell_{\mu,e}$ channels.  The division is performed by assuming equal $ee$ and $\mu\mu$ $Z1$ branching ratios and using an acceptance factor ($=XXX$, measured from inclusive $Z\rightarrow ee,\mu\mu$ yields) to account for efficiency differences in the detection of electrons and muons.

%-------------------------------------------------
\begin{table}[htb]
\begin{center}
\begin{tabular}{c|c}
\hline
\multicolumn{2}{c}{Z1-Inclusive $\ell\ell jj$ Yields} \\
\hline
$Z1 + \mu\mu$      & $0.057 \pm X$ \\
$Z1 + ee$          & $X \pm Y$     \\
\hline
\multicolumn{2}{c}{Per-Channel $\ell\ell jj$ Yields} \\
\hline 
$4\mu$      & $0.044 \pm X$ \\
$4e$        & $X \pm Y$     \\
$2e2\mu$    & $(0.013 + Z) \pm Y$     \\
\hline
\end{tabular}
\caption{{\bf Expected $\ell\ell jj$ Events.}\label{tab:fakes}}
\end{center}
\end{table}
%-------------------------------------------------

It is difficult to predict shapes for the $m(4\ell)$ distributions of $\ell\ell jj$ background with the limited number of events containing a good $Z1$ and two failing denominator objects.  We increase the sample size by loosening the denominator and $Z2$ selections.  For the muon channel, we relax the isolation requirement in the denominator definition and remove the opposite-sign requirement in the $Z2$ selection.  For electrons we ... {\bf XXX}.  With these modifications we obtain the $m(4\ell)$ distributions shown in Figure~\ref{fig:mufakeshapes}.  We fit the observed shapes with Landau distributions and obtain an acceptable goodness-of-fit.  Consequently, we use Landau distributions to model the $m(4\ell)$ distribution of our $\ell\ell jj$ predictions, also shown in Figure~\ref{fig:mufakeshapes}.  
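
For reference, the Landau density has no closed form.  One common integral representation is given below, with $m \equiv m(4\ell)$, a location parameter $\mu$ and a scale parameter $c$; these parameter names are illustrative and do not correspond to the parameterization of any particular fitting package:

\begin{eqnarray*}
f(m) & = & \frac{1}{c}\,\phi\!\left(\frac{m-\mu}{c}\right), \\
\phi(\lambda) & = & \frac{1}{\pi}\int_{0}^{\infty} e^{-t\ln t - \lambda t}\,\sin(\pi t)\,dt
\end{eqnarray*}

The function rises sharply and falls off slowly toward large $m$, and it is used here purely as an empirical two-parameter shape.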

%-------------------------------------------------
\begin{figure}[tbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figs/muFakeShape-4m.png}
\includegraphics[width=0.45\linewidth]{figs/muFakeShape-2m.png}
\includegraphics[width=0.45\linewidth]{figs/muFakeShape-4m.png}
\includegraphics[width=0.45\linewidth]{figs/eleFakeShape-inclusive.png}
\caption{ Predicted $m(4\ell)$ Distributions for $\ell\ell jj$ Events.\label{fig:mufakeshapes} }
\end{center}
\end{figure}
%-------------------------------------------------

%_________________________________________________________________
\subsubsection{Cross Check and Systematics: Light Flavor }\label{sec:lflavor}
%_________________________________________________________________
We cross-check our procedures by predicting the number of fake leptons in independent control regions enriched in light flavor.  We require one lepton candidate with $p_{T} > 25\rm~GeV$ that passes our nominal lepton selection and one or more same-sign, same-flavor denominator objects.  We veto events with $m(\ell\ell)$ between $76$ and $106\rm~GeV$ to reduce real lepton contamination from Z decays.  

In the muon channel this selection produces a sample of pure background, of which the primary component is $W+jets$ with a jet faking a muon.  The smaller multi-jet backgrounds, consisting of both light and heavy flavor, contain at least two jets that both fake muons.  We reduce the heavy flavor contribution in this sample by requiring $|IP_{3D}/\sigma(IP_{3D})| < 3$ for all muon candidates and $MET > 25\rm~GeV$.  Relative abundances for events in which the denominator muon passes selection are determined by fitting the resulting MET distribution with a same-sign MC template for $W+jets$ and a Rayleigh distribution for multi-jets.  The fit result (Figure~\ref{fig:ssMuon}, left) indicates that $W+jets$ constitutes $\sim80\%$ of the sample.  Residual contributions from heavy flavor in the same-sign muon sample are therefore small.

%-------------------------------------------------
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.45\linewidth]{figs/ssMuMET.png}
\includegraphics[width=0.45\linewidth]{figs/ssMuMZ1.png}
\caption{Fakerate Predictions for Same-sign Muon Events.}\label{fig:ssMuon} 
\end{center}
\end{figure}
%-------------------------------------------------

Next, we attempt to predict the number of events containing two identified and isolated same-sign muons by applying our fakerates to denominator objects that fail selection.  We loop over all such objects, weight each with the appropriate factor of $\epsilon_{FR}(p_{T},\eta)/(1-\epsilon_{FR}(p_{T},\eta))$ and sum.  The expected and observed $m(\ell\ell)$ distributions are shown in the rightmost plot of Figure~\ref{fig:ssMuon}.  The shape of the predicted distribution agrees with the observation; however, the observed yield exceeds the prediction by $47.2\%$ of the predicted value.  

%This difference can be understood as a result of differences in the composition of the prediction sample (mainly light flavor) and that used to measure the fakerate (a mix of light and heavy flavor).  

For electrons ...
%For electrons, charge misidentification is significant enough to result in a noticible Z-peak.  The jet background is however easily estimated from a fit with a same-sign MC Z template and an exponential background PDF.  Events selected in data are shown in Figures~\ref{fig:ssMuon} and (\ref{fig:ssEle}) as points.  Table~\ref{tab:ssfakes} lists the total number of observed events in the muon-channel and the electron-channel background determined from the fit.

%-------------------------------------------------
\begin{figure}[htb]
\begin{center}
\includegraphics[width=0.45\linewidth]{figs/ssEleMET.png}
\includegraphics[width=0.45\linewidth]{figs/ssEleMZ1.png}
\caption{ Fakerate Predictions for Same-sign Electron Events.}\label{fig:ssEle} 
\end{center}
\end{figure}
%-------------------------------------------------

Table~\ref{tab:ssfakes} summarizes the results of this section.  We take $47.2\%$ ($X\%$) as the systematic uncertainty on the muon (electron) fakerate to account for potential biases in our prediction due to differences in light flavor composition.
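
For reference, the quoted muon value follows from the same-sign muon yields of Table~\ref{tab:ssfakes} when the difference is expressed relative to the prediction.  The normalization to the observed yield is shown only for comparison, and $N_{obs}$ and $N_{pred}$ are shorthand introduced here:

\begin{eqnarray*}
\frac{N_{obs} - N_{pred}}{N_{pred}} & = & \frac{159 - 108.04}{108.04} \approx 0.472, \\
\frac{N_{obs} - N_{pred}}{N_{obs}} & = & \frac{159 - 108.04}{159} \approx 0.32
\end{eqnarray*}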

%-------------------------------------------------
\begin{table}[tbh]
\begin{center}
\begin{tabular}{|c||c|c||c|}
\hline
channel                 & observed & predicted  & systematic     \\
\hline
\hline
same-sign $\mu\mu$      & $159$    & $108.04$  & $47.2\%$\\
same-sign $ee$          & $X$      & $Y$       &  $Z\%$ \\
\hline
\end{tabular}
\caption{{\bf Same-sign Control Yields.}\label{tab:ssfakes}}
\end{center}
\end{table}
%-------------------------------------------------

%_________________________________________________________________
\subsubsection{Cross Check and Systematics : Heavy Flavor }\label{sec:hflavor}
%_________________________________________________________________
Backgrounds from $t\bar{t}$ and $Zb\bar{b}/c\bar{c}$ involve real leptons from heavy flavor decays.  As with light flavor, a difference in the fraction of heavy flavor in the fakerate and prediction samples can lead to errors in estimation.  We assess the potential impact of heavy flavor composition differences by applying our fakerate in a sample of relatively pure $Zb\bar{b}/c\bar{c}$ and $t\bar{t}$.

The control region consists of events that contain a pair of leptons passing the $Z1$ selection and at least two additional denominator objects with $IP_{3D}/\sigma_{IP_{3D}} > 4$.  Denominators are defined according to the requirements of Table~\ref{tab:fo}.  We make no requirement on denominator charge or flavor.  The leftmost plot of Figure~\ref{fig:ZHF} compares the observed $m(Z1)$ distribution for events passing this selection in data with cross-section-normalized predictions from MC.  We observe $71$ events and predict $66.3 \pm 2.0$ with $Zb\bar{b}$ and $t\bar{t}$ MC.  Thus we confirm that the data sample is indeed dominated by heavy flavor.

Next, we require the high-IP denominator objects to additionally pass the more stringent lepton ID and isolation criteria used in our nominal Z2 selection.  We estimate $0.81 \pm 0.21$ events from MC and observe 2.  Electron and muon fakerates are then applied to the denominator objects in the original $71$ events and, following the procedures described in Section~\ref{sec:lflavor}, we predict $0.84 \pm 0.10$ events.  Given the consistent results, we assign no additional systematic uncertainty on our predicted  $\ell\ell jj$ background yields.
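
As a rough numerical check of this consistency, the two predictions and the observation can be compared directly.  This is a sketch under two assumptions that are not stated in the analysis itself: the quoted uncertainties are Gaussian and uncorrelated, and the observed count is Poisson-distributed about the fakerate prediction $\mu = 0.84$:

\begin{eqnarray*}
\frac{0.84 - 0.81}{\sqrt{0.10^{2} + 0.21^{2}}} & \approx & \frac{0.03}{0.23} \approx 0.13, \\
P(n \geq 2 \,|\, \mu = 0.84) & = & 1 - e^{-0.84}\,(1 + 0.84) \approx 0.21
\end{eqnarray*}

The fakerate and MC predictions thus differ by a small fraction of their combined uncertainty, and an observation of two or more events would be expected in roughly one experiment in five.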

%We then reinstate the $\sigma_{IP_{3D}}/IP_{3D} < 4$ cut and estimate $2.5 \pm 0.4$ events in the signal region from the $Zb\bar{b}$ and $t\bar{t}$ MC.  We take this prediction as an estimate of the heavy flavor contribution to our overall $\ell\ell jj$ background esimtate of $XXX$.  We assign a s sysmatic uncertainty on the estimated fraction Considering the We assignconsider th

%-------------------------------------------------
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=0.45\linewidth]{figs/HFmZ1.png}
\includegraphics[width=0.45\linewidth]{figs/HFm4l.png}
\caption{$m(Z1)$ and $m(4\ell)$ in the Heavy Flavor control region.}\label{fig:ZHF} 
\end{center}
\end{figure}
%-------------------------------------------------

We determine a shape for heavy flavor background in the signal region from the distribution of $m(4\ell)$ in the $Z1 + 2\times$ denominator events.  The rightmost plot of Figure~\ref{fig:ZHF} compares the $m(4\ell)$ distributions for this selection in data and (cross-section-normalized) simulation.  We fit a Landau distribution to each and compare the normalized PDFs in Figure~\ref{fig:HFshape}.  

%-------------------------------------------------
\begin{figure}[htbp]
\begin{center}
\includegraphics[width=0.5\linewidth]{figs/HFshape.png}
\caption{$m(4\ell)$ shapes in the Heavy Flavor control region.}\label{fig:HFshape} 
\end{center}
\end{figure}
%-------------------------------------------------
%_________________________________________________________________
\subsection{Cross Check and Systematics: $WZ$  }
%_________________________________________________________________
The estimate of $WZ$ background in Table~\ref{tab:MCBG} is entirely MC-based.  In addition to the leptons from the $W$ and $Z$ decays, one ``fake'' lepton is needed for this process to contribute in the $4\ell$ signal region.  We cross-check the MC prediction with an estimate obtained from the fakeable object method.  

We begin by requiring three fully selected leptons (two from the Z1 plus one additional) and $1+$ denominator objects.  We then perform a single loop to associate the denominator objects with the third lepton.  As before, we weight the denominators with $\epsilon_{FR}(p_{T},\eta)/(1-\epsilon_{FR}(p_{T},\eta))$, apply opposite-sign, same-flavor and kinematic selections and sum.  The additional, ID'ed lepton with which the denominators are paired is either a fake (from $Z+jets$) or a real lepton (from $WZ$ or $ZZ$ where one of the leptons is not reconstructed).  In order to extract the $WZ$ component of the measurement, we need to subtract off the $3\ell$ contribution predicted by MC for $ZZ$ as well as the double-fake estimate described in Section~\ref{sec:fakes}.  The latter is double-counted when performing a single denominator loop. 
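
The factor of two that accompanies the double-fake subtraction can be seen from a simple counting argument.  This is a sketch that assumes a single, uniform fakerate $\epsilon$ and $N$ events that each contain a Z1 and exactly two fake denominator objects; $N$ and $\epsilon$ are illustrative symbols, not measured quantities.  On average $2\epsilon(1-\epsilon)N$ of these events have one object passing and one failing the full selection, and $(1-\epsilon)^{2}N$ have both failing, so that

\begin{eqnarray*}
\mbox{single loop:} & & 2\epsilon(1-\epsilon)N \times \frac{\epsilon}{1-\epsilon} = 2\epsilon^{2}N, \\
\mbox{double loop:} & & (1-\epsilon)^{2}N \times \left(\frac{\epsilon}{1-\epsilon}\right)^{2} = \epsilon^{2}N
\end{eqnarray*}

The single loop therefore returns twice the double-fake yield $\epsilon^{2}N$, which is why the double-loop estimate is subtracted with a factor of two.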

\begin{eqnarray}
 N(WZ) &=& \left[\sum_{i=1}^{N_{d}}\frac{\epsilon_{FR}(p_{T}^{i},\eta^{i})}{1-\epsilon_{FR}(p_{T}^{i},\eta^{i})}\right]_{\ell\ell\ell} \nonumber \\
       & & -~2\times\left[\sum_{i=1}^{N_{d}}\sum_{j=i+1}^{N_{d}}\frac{\epsilon_{FR}(p_{T}^{i},\eta^{i})}{1-\epsilon_{FR}(p_{T}^{i},\eta^{i})}~\frac{\epsilon_{FR}(p_{T}^{j},\eta^{j})}{1-\epsilon_{FR}(p_{T}^{j},\eta^{j})}\right]_{\ell\ell} - N^{MC}_{3\ell}(ZZ)
\end{eqnarray}

Table~\ref{tab:} provides values for the terms in the equation above.  The result XXX

%-------------------------------------------------
\begin{table}[tbh]
\begin{center}
\begin{tabular}{|c|c|c|}
\hline
$4e$    &         $4\mu$  &       $2e2\mu$ \\
\hline
$X\pm Y$ &         $Z\pm Y$ &       $Z\pm Y$  \\
\hline
\end{tabular}
\caption{{\bf Data-driven Expected $WZ$ Yields.}\small{blah.}\label{tab:WZfake}}
\end{center}
\end{table}
%-------------------------------------------------


%_________________________________________________________________
\subsubsection{Data-Driven Systematics Summary}\label{sec:fakesys}
%_________________________________________________________________

We summarize the systematic uncertainties on $\ell\ell jj$ and $\ell\ell\ell j$ background yields in Table~\ref{tab:fakesyssummary}.  The relative yield differences discussed in Sections~\ref{sec:lflavor} and~\ref{sec:hflavor} are added in quadrature to address potential biases due to sample dependence.  The difference between the MC and data-driven predictions is assigned as a modeling uncertainty for $WZ$.

%-------------------------------------------------
\begin{table}[tbh]
\begin{center}
\begin{tabular}{|c|c|c|}
\hline
$4e$    &         $4\mu$  &       $2e2\mu$ \\
\hline
$X\pm Y$ &         $Z\pm Y$ &       $Z\pm Y$  \\
\hline
\end{tabular}
\caption{{\bf Data-Driven Systematics Summary.}\small{blah.}\label{tab:fakesyssummary}}
\end{center}
\end{table}
%-------------------------------------------------

The central shapes used for the $\ell\ell jj$ backgrounds are the Landaus fit to the $1.6\rm~fb^{-1}$ predictions.  Our alternate shapes are the high-statistics Landaus of Section~\ref{sec:} and the distributions from the high-IP control region.  We take the larger of the differences between the central shapes and the alternatives as a shape systematic.  Figure~\ref{fig:fakeshapesummary} shows the central shape and corresponding uncertainty envelope.
The high-statistics distributions are included in our evaluation of systematic uncertainties as one of these alternate shapes.
%-------------------------------------------------
\begin{figure}[tbp]
\begin{center}
\includegraphics[width=0.5\linewidth]{figs/m4l-HF.png}
\caption{Heavy Flavor $m(4\ell)$ Shape.\label{fig:fakeshapesummary} }
\end{center}
\end{figure}
%-------------------------------------------------
% Revision 1.2, committed Mon Nov 7 13:34:06 2011 UTC by dkralph (branch MAIN)