% Source file: root/cvsroot/COMP/CSA06DOC/preprod.tex (revision 1.13)
% Committed: Mon Feb 5 02:09:35 2007 UTC by acosta
\section{Pre-challenge Monte Carlo Production}

The Monte Carlo production for CSA06 started in mid-July. The original aim was
to produce 50 million events in total to be used as input for prompt
reconstruction.

Four teams from CIEMAT, DESY/RWTH, INFN/Bari, and the University of
Wisconsin, Madison, volunteered to run production using the Production
Agent (ProdAgent) for the first time at large scale. All production-related
job submissions were Grid-based; local submissions were not used at all.
After a short ramp-up during which sites prepared for
production (for example, most sites received the CMSSW software via
a centrally managed installation mechanism, while a few managed the
installation manually), a total of 28 sites offered resources for the
pre-production step.

Table~\ref{tab:prechallenge} shows the datasets by event category and the
number of events that were requested and actually produced. All four teams
started production with the simulation of minimum bias events.

\begin{table}[htb]
\centering
\caption{CSA06 pre-challenge production by event category (numbers of events in millions).}
\label{tab:prechallenge}
\vspace{3mm}
\begin{tabular}{|l|l|r|r|}
\hline
CMSSW release & Event category & Events produced ($10^6$) & Events requested ($10^6$) \\
\hline
0\_8\_1 & minbias & 39.8 & 25.0 \\
0\_8\_2 & TTbar & 5.8 & 5.0 \\
0\_8\_2 & Zmumu & 2.2 & 2.0 \\
0\_8\_3 & Wenu & 4.6 & 4.0 \\
0\_8\_3 & SoftMuon & 2.0 & 2.0 \\
0\_8\_3 & EWK Soup & 5.6 & 5.0 \\
0\_8\_3 & Jets & 1.2 & 1.2 \\
0\_8\_3 & Exo Soup & 1.0 & 1.0 \\
0\_8\_4 & HLT Soup & 5.0 & 5.0 \\
\hline
& Total & 67.2 & 50.2 \\
\hline
\end{tabular}
\end{table}

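As a quick consistency check on Table~\ref{tab:prechallenge}, the ratio of
produced to requested events and the share of the surplus coming from the
minimum bias sample follow directly from the quoted numbers (in millions of
events). The symbols $N_{\mathrm{prod}}$ and $N_{\mathrm{req}}$ are introduced
here only for this illustration:
\[
\frac{N_{\mathrm{prod}}}{N_{\mathrm{req}}} = \frac{67.2}{50.2} \approx 1.34,
\qquad
\frac{N_{\mathrm{prod}}^{\mathrm{minbias}} - N_{\mathrm{req}}^{\mathrm{minbias}}}
     {N_{\mathrm{prod}} - N_{\mathrm{req}}}
= \frac{39.8 - 25.0}{67.2 - 50.2} = \frac{14.8}{17.0} \approx 0.87 .
\]
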
The average processing time per minimum bias event, observed on a 3.6\,GHz Xeon
processor, and the event size are shown in
Figure~\ref{fig:minbias-processing-performance}.

\begin{figure}[htp]
\begin{center}
$\begin{array}{c@{\hspace{1in}}c}
\includegraphics[width=0.4\linewidth]{figs/Pre-prod-minbias.pdf} &
\includegraphics[width=0.4\linewidth]{figs/Pre-prod-minbias-size.pdf} \\ [-0.53cm]
\end{array}$
\end{center}
\caption{Minimum bias event processing time and event size. Some jobs
processed 1000 events per job and some fewer than 500 events per job.}
\label{fig:minbias-processing-performance}
\end{figure}

More complex signal events, such as those in the TTbar data sample, take
significantly more time to simulate. In the teams' experience, such an event
required about 4 minutes to complete.

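As a rough illustration of the scale, assume (purely for this estimate) that
each of the 5.8 million TTbar events in Table~\ref{tab:prechallenge} took the
quoted 4 minutes, ignoring failed jobs and differences between CPUs. The symbol
$T_{\mathrm{CPU}}$ is introduced here only for this sketch:
\[
T_{\mathrm{CPU}} \approx 5.8\times10^{6} \times 4\ \mathrm{min}
= 2.32\times10^{7}\ \mathrm{min}
\approx 3.9\times10^{5}\ \mathrm{CPU\ hours}.
\]
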
Regarding the job submission strategy, the teams found that the resource usage
information published by the Grid Information System is not useful for ranking
sites, since it lacks the job information associated with the particular VO
role. Therefore a static ranking was used, built according to the available
resources as they were discovered by the ProdAgent's Job Tracking component.

With an average of up to 100 jobs/hour per agent, the job submission
performance of the ProdAgent was rather low
(Figure~\ref{fig:submission-rate}). With the level of resources available to
the teams, it took a day or more until all CPUs could be utilized.
Given the anticipated scale of production for CSA06 and the fact that
four teams were each running two instances of ProdAgent, this was not a problem
for the CSA06 pre-production. However, it needs to be taken into account for
future production activities and is an area that needs improvement. Moving to
the new gLite Resource Broker with its bulk submission feature may help to some
extent, but this is certainly not the only area that needs to be looked at.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.6\linewidth]{figs/Pre-prod-submission-rate.pdf}
\end{center}
\caption{ProdAgent job submission rate}
\label{fig:submission-rate}
\end{figure}

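The submission figures quoted above can be turned into a rough upper bound.
Taking the quoted maximum of 100 jobs/hour per agent, and four teams running
two ProdAgent instances each, the aggregate submission rate
$R_{\mathrm{tot}}$ and the number of jobs $N_{\mathrm{day}}$ a single agent
can submit in one day are at most as follows (both symbols are introduced here
only for this illustration; jobs that finish and must be replaced during the
ramp-up are ignored):
\[
R_{\mathrm{tot}} \le 4 \times 2 \times 100\ \mathrm{jobs/hour} = 800\ \mathrm{jobs/hour},
\qquad
N_{\mathrm{day}} \le 100\ \mathrm{jobs/hour} \times 24\ \mathrm{hours} = 2400\ \mathrm{jobs}.
\]
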
Rather than using the output sandbox for the produced data, files are staged out
to the local Storage Element (SE). The performance of copying files
from the Worker Node disk to the SE, illustrated in
Figure~\ref{fig:stage-out-performance}, is very good for all the prominent SE
types CMS is using at sites (Castor, dCache and DPM).

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.6\linewidth]{figs/Pre-prod-stage-out.pdf}
\end{center}
\caption{Local stage-out performance}
\label{fig:stage-out-performance}
\end{figure}

As reported by the production teams, early production was affected by
instabilities in the JobTracking and MergeSensor components of ProdAgent and
required continuous attention from the operators. The problems
were solved in late August/early September.

Among the operational problems, one that turned out to be common across almost
all participating sites concerned data access between the farm of Worker Nodes
and the local SE, i.e. for stage-out and in particular for the merge process.
Given the many processes running in parallel, the latter was shown to stress
some of the deployed SEs to their limit. It is therefore important to
maintain a suitable ratio of CPU capacity to storage access bandwidth.

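One way to make this ratio concrete is the following back-of-the-envelope
condition. The symbols are introduced here for illustration only and are not
taken from the CSA06 measurements: $N_{\mathrm{jobs}}$ is the number of jobs
accessing the SE concurrently, $\bar{b}$ is the average data rate each of them
needs, and $B_{\mathrm{SE}}$ is the bandwidth the SE can sustain:
\[
N_{\mathrm{jobs}}\,\bar{b} \le B_{\mathrm{SE}}
\quad\Longleftrightarrow\quad
N_{\mathrm{jobs}} \le \frac{B_{\mathrm{SE}}}{\bar{b}} .
\]
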
To help operate the ProdAgent more efficiently, people from INFN/Bari
developed a monitoring tool that provides a comprehensive overview
of the current state and access to log files
from a single web page. Screenshots are shown in
Figures~\ref{fig:pa-monitoring} and~\ref{fig:pa-monitoring-1}.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.95\linewidth]{figs/Pre-prod-PA-mon.pdf}
\end{center}
\caption{ProdAgent monitoring tool developed by INFN/Bari}
\label{fig:pa-monitoring}
\end{figure}

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.95\linewidth]{figs/Pre-prod-PA-mon-1.pdf}
\end{center}
\caption{Bari ProdAgent monitoring summarizing important production job information}
\label{fig:pa-monitoring-1}
\end{figure}