\section{Tier-1 and Tier-2 Operations}

\subsection{Data Transfers}

The Tier-1 centers were expected to receive data from CERN at a rate
corresponding to 25\% of the 2008 pledge rate and to serve the data to
Tier-2 centers. The expected rate into the Tier-1 centers is shown in
Table~\ref{tab:tier01pledge}. Note that while the listed rates are
significantly less than the bandwidth to the WAN (see
Table~\ref{tab:tier1resources}), they fit within the storage
capability available for a 30 day challenge.

\begin{table}[htb]
\centering
\caption{Expected transfer rates from CERN to Tier-1 centers based on the MOU pledges.}
\label{tab:tier01pledge}
\begin{tabular}{|l|l|l|}
\hline
Site & Goal Rates (MB/s) & Threshold Rates (MB/s) \\
\hline
ASGC & 15 & 7.5 \\
CNAF & 25 & 12.5 \\
FNAL & 50 & 25 \\
GridKa & 25 & 12.5 \\
IN2P3 & 25 & 12.5 \\
PIC & 10 & 5 \\
RAL & 10 & 5 \\
\hline
\end{tabular}
\end{table}
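
As a rough cross-check of the storage statement above, the volume accumulated at a single Tier-1 center over the 30 day challenge follows directly from its goal rate; taking the largest entry in Table~\ref{tab:tier01pledge} as an example,
\[
50~\mathrm{MB/s} \times 86400~\mathrm{s/day} \times 30~\mathrm{days} \approx 130~\mathrm{TB},
\]
which sets the scale that must fit within the Tier-1 disk and tape capacity for the challenge.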

The Tier-2 centers are expected in the computing model \cite{model, ctdr} to transfer
data from the Tier-1 centers in bursts. The goal rate in CSA06 was 20MB/s,
with a threshold for success of 5MB/s. Achieving these metrics was
defined as sustaining the transfer rate for a 24 hour
period. At the beginning of CSA06 CMS concentrated primarily on
moving data from the ``associated'' Tier-1 centers to the Tier-2s. By
the end of the challenge most of the Tier-1 to Tier-2 permutations had
been attempted.
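
Sustaining these rates for a 24 hour period corresponds to well defined data volumes into a Tier-2 center:
\[
20~\mathrm{MB/s} \times 86400~\mathrm{s} \approx 1.7~\mathrm{TB},
\qquad
5~\mathrm{MB/s} \times 86400~\mathrm{s} \approx 0.4~\mathrm{TB},
\]
for the goal and threshold rates respectively.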

The total data transferred between sites in CSA06 is shown in
Figure~\ref{fig:totaltran}. This plot only includes wide area data
transfers; additionally, data was moved onto tape at the majority of
Tier-1 centers. Over the 45 days of the challenge CMS was able to
move more than 1 petabyte of data over the wide area.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/CSA06_CumTran}
\end{center}
\caption{The cumulative data volume transferred during CSA06 in TB.}
\label{fig:totaltran}
\end{figure}

Timeline:
\begin{itemize}

\item October 2, 2006: The Tier-0 to Tier-1 transfers began on the
first day of the challenge. In the first few hours 6 of 7 Tier-1
centers successfully received data. During the first week only
minimum bias was reconstructed, and at 40Hz the total rate out of the
CERN site did not meet the 150MB/s target rate.

\item October 3, 2006: All 7 Tier-1 sites were able to successfully
receive data and 8 Tier-2 centers were subscribed to data samples:
Belgium IIHE, UC San Diego, Wisconsin, Nebraska, DESY, Aachen, and
Estonia. There were successful transfers to 6 Tier-2 sites.

\item October 4, 2006: An additional 11 Tier-2 sites were subscribed
to data samples: Pisa, Purdue, CIEMAT, Caltech, Florida, Rome, Bari,
CSCS, IHEP, Belgium UCL, and Imperial College. Of the 19 registered
Tier-2 sites, 12 were able to receive data. Of those, 5 exceeded the
goal transfer rate for over an hour, and an additional 3 were over
the threshold rate.

\item October 5, 2006: Three additional Tier-2s were added, increasing
the number of participating sites above the goal of 20 Tier-2
centers. New hardware installed at IN2P3 for CSA06 began to exhibit
stability problems, leading to poor transfer efficiency.

\item October 9, 2006: RAL transitioned from a dCache SE to a Castor2
SE. The signal samples began being reconstructed at the Tier-0.

\item October 10-12, 2006: The Tier-1 sites had stable operations
through the week at an aggregate rate of approximately 100MB/s from
CERN. IFCA joined the Tier-1 to Tier-2 transfers, and its average
transfer rate over the day was observed at 14MB/s with a low error
rate.

\item October 13, 2006: Multiple subscriptions of the minimum bias
samples were made to some of the Tier-1 centers to increase the total
rate of data transfer from CERN. The number of participating Tier-2
sites increased to 23.

\item October 18, 2006: The PhEDEx transfer system held a lock in the
Oracle database which blocked other agents from continuing with
transfers. This problem appeared more frequently in the latter half
of the challenge when the load was higher.

\item October 20, 2006: The reconstruction rate was increased at the
Tier-0 to improve the output from CERN and to better exercise the
prompt reconstruction farm. The data rate from CERN approximately
doubled. An average rate over an hour of 600MB/s from CERN was
achieved.

\item October 25, 2006: The transfer rate from CERN was large, with
daily average rates of 250MB/s-300MB/s. The first transfer
backlogs began to appear.

\item October 30, 2006: Data reconstruction at the Tier-0 stopped.

\item October 31, 2006: PIC and ASGC finished transferring the assigned prompt reconstruction data from CERN.

\item November 2, 2006: FNAL and IN2P3 also completed the transfers.

\item November 3, 2006: RAL completed the transfers. The first of the
Tier-1 to any Tier-2 transfer validation tests began. The test involved
sending a small sample from a Tier-1 site to a validated Tier-2, in
the test case DESY, and then sending a small sample to all Tier-2
sites.

\item November 5, 2006: CNAF completed the Tier-0 transfers.

\item November 6, 2006: The Tier-1 to Tier-2 transfer testing continued.

\item November 9, 2006: GridKa completed the Tier-0 transfers.

\end{itemize}


\subsubsection{Transfers to Tier-1 Centers}

During CSA06 the Tier-1 centers met the transfer rate goals. In the
first week of the challenge, using minimum bias events, the total volume
of data out of CERN did not amount to 150MB/s unless the datasets were
subscribed to multiple sites. After the reconstruction rate was
increased at the Tier-0 the transfer rate easily exceeded the 150MB/s
target. The 30 day and 15 day averages are shown in
Table~\ref{tab:tier01csa06}. For the thirty day average all sites
except two exceeded the goal rate, and for the final 15 days all sites
easily exceeded the goal. Several sites doubled and tripled the goal
rate during the final two weeks of high volume transfers.

The WLCG metric for availability this year is 90\% for the Tier-1
sites. Applying this to the Tier-1 centers participating in CSA06
transfers, 6 of 7 Tier-1s reached the availability goal.
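
Availability here can be read as the fraction of active challenge days without an outage; as a rough guide, and assuming outage days are counted against the roughly 30 days of heavy Tier-0 to Tier-1 transfers,
\[
\mathrm{availability} \simeq 1 - \frac{\mathrm{outage\ days}}{\mathrm{active\ days}},
\]
so the 90\% target allows on the order of 3 outage days per site over that period.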

\begin{table}[htb]
\centering
\caption{Transfer rates during CSA06 between CERN and Tier-1 centers and the number of outage days during the active challenge activities. In the MSS column, parentheses indicate that the site either had scaling issues keeping up with the total rate to tape, or transferred only a portion of the data to tape.}
\label{tab:tier01csa06}
\begin{tabular}{|l|r|r|r|r|c|}
\hline
Site & Anticipated Rate (MB/s) & Last 30 Day Average (MB/s) & Last 15 Day Average (MB/s) & Outage (Days) & MSS Used \\
\hline
ASGC & 15 & 17 & 23 & 0 & (Yes) \\
CNAF & 25 & 26 & 37 & 0 & (Yes) \\
FNAL & 50 & 68 & 98 & 0 & Yes \\
GridKa & 25 & 23 & 28 & 3 & No \\
IN2P3 & 25 & 23 & 34 & 1 & Yes \\
PIC & 10 & 22 & 33 & 0 & No \\
RAL & 10 & 23 & 33 & 2 & Yes \\
\hline
\end{tabular}
\end{table}


The rate of data transferred averaged over 24 hours and the volume of
data transferred in 24 hours are shown in Figures~\ref{fig:tier01rate}
and~\ref{fig:tier01vol}. The start of the transfers during the first
week is visible on the left side of the plot, as well as the transfers
not reaching the target rate, shown as a horizontal red bar. The twin
peaks in excess of 300MB/s and 25TB of data moved correspond to the
over-subscription of data. The bottom of the graph has indicators of
the approximate Tier-0 reconstruction rate. Both the rate and the
volume figures show clearly the point when the Tier-0 trigger rate was
doubled to 100Hz. The daily average exceeded 350MB/s with more than
30TB moved. The hourly averages from CERN peaked at more than
650MB/s.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/Tier01rate}
\end{center}
\caption{The rate of data transferred from the Tier-0 to the Tier-1 centers in MB per second.}
\label{fig:tier01rate}
\end{figure}


\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/Tier01vol}
\end{center}
\caption{The total volume of data transferred from the Tier-0 to the Tier-1 centers in TB per day.}
\label{fig:tier01vol}
\end{figure}

The transferable volume plot shown in Figure~\ref{fig:tier01queue} is an
indicator of how well the sites are keeping up with the volume of data
from the Tier-0 reconstruction farm. During the first three weeks of
the challenge almost no backlog of files was accumulated by the Tier-1
centers. A hardware failure at IN2P3 resulted in a small
accumulation. The additional data subscriptions led to a spike in
data to transfer, but it was quickly cleared by the Tier-1 sites. The
most significant volumes of data waiting for transfer came at the end
of the challenge. During this time GridKa performed a dCache
storage upgrade that resulted in a large accumulation of data to
transfer. CNAF suffered a file server problem that reduced the amount
of available hardware. Additionally, RAL turned off the import system
for two days over a weekend to demonstrate the ability to recover from
a service interruption. The Tier-1 issues combined with PhEDEx
database connection interruptions under the heavy load of the final
week of transfers to produce a backlog of approximately 50TB over
the final days of the heavy challenge transfers. During this time
CERN continued to serve data at 350MB/s on average.
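
The backlog shown in the figure is simply the cumulative difference between the data made available for transfer and the data actually delivered. A minimal sketch of this bookkeeping, with purely illustrative daily volumes, is:
\begin{verbatim}
# Daily volumes in TB (illustrative values, not CSA06 measurements):
# data newly made available for transfer vs. data actually delivered.
available = [12, 12, 12, 25, 25, 30, 30]
delivered = [12, 12, 11, 20, 22, 24, 26]

backlog = []
pending = 0.0
for avail, dlv in zip(available, delivered):
    pending = max(0.0, pending + avail - dlv)  # queue cannot go negative
    backlog.append(pending)

print(backlog)  # volume still waiting for transfer at the end of each day
\end{verbatim}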


\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/Tier01queue}
\end{center}
\caption{The total volume of data waiting for transfer from the Tier-0 to the Tier-1 centers in TB per day.}
\label{fig:tier01queue}
\end{figure}

The CERN to Tier-1 transfer quality is shown in
Figure~\ref{fig:tier01qual}. In CMS the transfer quality is defined
as the fraction of transfer attempts that succeed. A link between two
sites with 100\% transfer quality completed each transfer on the
first attempt, while a 10\% transfer quality indicates each transfer
had to be attempted ten times on average to successfully complete.
Most transfers eventually complete, but low transfer quality uses the
transfer resources inefficiently and usually results in a low
utilization of the network.
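
Expressed as a formula, for a given link and time bin,
\[
\mathrm{quality} = \frac{N_{\mathrm{successful\ transfers}}}{N_{\mathrm{transfer\ attempts}}},
\]
so a link that needs ten attempts per file shows up as 10\% quality in the plots that follow.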

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/qualt0t1}
\end{center}
\caption{Transfer quality between CERN and Tier-1 centers over 30 days.}
\label{fig:tier01qual}
\end{figure}


The transfer quality plot compares very favorably to equivalent plots
made during the spring. The CERN Castor2 storage element performed
very stably throughout the challenge. There were two small
configuration issues that were very promptly addressed by the experts.
The Tier-1s also performed well throughout the challenge, with several
24 hour periods during which specific Tier-1s had no transfer errors. The
stability of the RAL SE before the transition to Castor2 can be seen
at the left side of the plot, as well as the intentional downtime to
demonstrate recovery on the right side of the plot. The IN2P3
hardware problems are visible during the first week and the GridKa
dCache upgrade is clearly visible during the last week. Most of the
other periods are solidly green. Both FNAL and PIC were above 70\%
efficient for every day of the challenge activities.


Tier-1 to Tier-1 transfers were considered to be beyond the scope of
CSA06, though the dataflow exists in the CMS computing model. During
CSA06 we had an opportunity to test Tier-1 to Tier-1 transfers while
recovering from backlogs of data when the samples were subscribed to
multiple sites. PhEDEx is designed to take the data from whichever
source site it can be efficiently transferred from. Figure~\ref{fig:t1t1}
shows the total Tier-1 to Tier-1 transfers during CSA06. With 7
Tier-1s there are 42 permutations of Tier-1 to Tier-1 transfers,
counting each direction separately. During CSA06 we successfully
exercised about half of them.
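
The counting of ordered source--destination pairs can be checked directly; a trivial sketch:
\begin{verbatim}
from itertools import permutations

tier1 = ["ASGC", "CNAF", "FNAL", "GridKa", "IN2P3", "PIC", "RAL"]

# Ordered (source, destination) pairs, counting each direction separately.
links = list(permutations(tier1, 2))
print(len(links))  # 42
\end{verbatim}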

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/T1T1Rate}
\end{center}
\caption{Transfer rate between Tier-1 centers during CSA06.}
\label{fig:t1t1}
\end{figure}

\subsubsection{Transfers to Tier-2 Centers}
In the CMS computing model the Tier-2s are expected to be able to
receive data from any Tier-1 site. In order to simplify CSA06
operations we began by concentrating on transfers from the
``associated'' Tier-1 sites, and in the final two weeks of the
challenge began a concerted effort on transfers from any Tier-1. The
associated Tier-1 center is the center operating the File Transfer
Service (FTS) server and hosting the channels for Tier-2 transfers.

The Tier-2 transfer metrics involved both participation and
performance. For CSA06, 27 CMS sites signed up to participate
in the challenge. Participation was defined as having successful
transfers on 80\% of the days during the challenge. By this metric there
were 21 sites that succeeded in participating in the challenge, which
is above the goal of 20.
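
A minimal sketch of how the participation metric can be evaluated, using purely hypothetical per-site numbers:
\begin{verbatim}
# Days (out of the challenge period) on which each site had at least one
# successful transfer; the site names and counts are illustrative only.
challenge_days = 30
successful_days = {"SiteA": 28, "SiteB": 22, "SiteC": 17}

for site, days in successful_days.items():
    participated = days / challenge_days >= 0.8
    print(site, "participating" if participated else "below the 80% metric")
\end{verbatim}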

The Tier-2 transfer performance goal was 20MB/s and the threshold
was 5MB/s. In the CMS computing model the Tier-2 transfers are
expected to occur in bursts. Data will be transferred to refresh a
Tier-2 cache, and then will be analyzed locally. The Tier-2 sites
were not expected to hit the goal transfer rates continuously
throughout the challenge. There were 12 sites that successfully
averaged above the goal rate for at least one 24 hour period, and an
additional 8 sites that averaged above the threshold rate for at least
one 24 hour period.
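
One straightforward way to evaluate this metric is to scan the hourly transfer rates for the best 24 hour average; the rates below are illustrative, not measured values:
\begin{verbatim}
# Hourly transfer rates into one Tier-2 in MB/s (three illustrative days).
hourly_rates = [2.0] * 24 + [30.0] * 24 + [4.0] * 24

GOAL, THRESHOLD = 20.0, 5.0  # CSA06 Tier-2 goal and threshold rates in MB/s

# Best average over any sliding 24 hour window.
best = max(sum(hourly_rates[i:i + 24]) / 24.0
           for i in range(len(hourly_rates) - 23))

if best >= GOAL:
    print("met the goal rate:", best, "MB/s")
elif best >= THRESHOLD:
    print("met the threshold rate:", best, "MB/s")
else:
    print("below threshold:", best, "MB/s")
\end{verbatim}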

The transfer rate over the 30 most active transfer days is shown in
Figure~\ref{fig:tier12rate}. The aggregate rate from Tier-1 to
Tier-2 centers was not as high as the total rate from CERN, which is
not an accurate reflection of the transfers expected in the CMS
computing model. In the CMS computing model there is more data
exported from the Tier-1s to the Tier-2s than total raw data coming
from CERN, because data is sent to multiple Tier-2s and the Tier-2s may
flush data from the cache and reload it at a later time. In CSA06 the
Tier-2 centers were subscribed to specific samples at the beginning
and then to specific skims when available.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/tier12rate}
\end{center}
\caption{Transfer rate between Tier-1 and Tier-2 centers during the first 30 days of CSA06.}
\label{fig:tier12rate}
\end{figure}

The ability of the Tier-1 centers to export data was successfully
demonstrated during the challenge, but several sites indicated
interference between receiving and exporting data. The quality of the
Tier-1 to Tier-2 data transfers is shown in Figure~\ref{fig:tier12qual}.
The quality is not nearly as consistently green as the CERN to Tier-1
plots, but the variation has a number of causes. Not all of the
Tier-1 centers are currently exporting data as efficiently as CERN,
especially in the presence of a high load of data ingests; in addition,
most of the Tier-2 sites do not have as much operational experience
receiving data as the Tier-1 sites do.

The Tier-1 to Tier-2 transfer quality looks very similar to the CERN
to Tier-1 transfer quality of 9-12 months ago. With a concerted
effort the Tier-1 to Tier-2 transfers should be able to reach the
quality of the current CERN to Tier-1 transfers before they are needed
to move large quantities of experiment data to users.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/tier12qual}
\end{center}
\caption{Transfer quality between Tier-1 and Tier-2 centers during the first 30 days of CSA06.}
\label{fig:tier12qual}
\end{figure}

There are a number of very positive examples of Tier-1 to Tier-2
transfers. Figure~\ref{fig:picqual} shows the results of the Tier-1
to all Tier-2 tests when PIC was the source of the dataset. A small
skim sample was chosen and within 24 hours 20 sites had successfully
received the dataset. The transfer quality over the 24 hour period
remained high, with successful transfers to all four continents
participating in CMS.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/PICQual}
\end{center}
\caption{Transfer quality between PIC and Tier-2 sites participating in the dedicated Tier-1 to Tier-2 transfer tests.}
\label{fig:picqual}
\end{figure}

Figure~\ref{fig:fnalrate} is an example of the very high export rates
the Tier-1 centers were able to achieve transferring data to Tier-2
centers. The peak rate on the plot is over 5Gb/s, which was
independently verified by the site network monitoring. This rate is
over 50\% of the anticipated Tier-1 data export rate expected in the
full sized system.
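
For comparison with the MB/s figures quoted elsewhere in this section, the peak converts as
\[
5~\mathrm{Gb/s} = 625~\mathrm{MB/s}.
\]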

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.7\linewidth]{figs/FNAL_Rate}
\end{center}
\caption{Transfer performance between FNAL and Tier-2 sites
participating in the
%dedicated
Tier-1 to Tier-2 transfer tests.}
\label{fig:fnalrate}
\end{figure}

Figure~\ref{fig:FZK_DESY} is an example of the very high rates observed in CSA06 for both Tier-1 export and Tier-2 import. The plot shows both the hourly average and the instantaneous rate. DESY achieved an import rate to disk of more than 400MB/s.

\begin{figure}[ht]
\begin{center}
$\begin{array}{c@{\hspace{1in}}c}
\includegraphics[width=0.4\linewidth]{figs/FZK_DESY_1} &
\includegraphics[width=0.4\linewidth]{figs/FZK_DESY_2} \\
%[-0.53cm]
\end{array}$
\end{center}
\caption{The plot on the left is the hourly average transfer rate between GridKa and DESY. The plot on the right is the instantaneous rate between the two sites measured with Ganglia.}
\label{fig:FZK_DESY}
\end{figure}


\subsection{Tier-1 Skim Job Production}
\label{sec:skims}

CSA06 tested the workflow to reduce primary datasets to manageable sizes
for analyses. Four production teams provided the centralized skim job workflow at
the Tier-1 centers. The produced secondary datasets were registered
in the Dataset Bookkeeping Service and accessed like any other data.
Common skim job tools were prepared based on Monte Carlo generator
information and reconstruction output, and both types were tested
(see Section~\ref{sec:filtering}). There was an
overwhelming response from the analysis demonstrations, and about
25 filters producing nearly 60 datasets were run, as compiled in
Table~\ref{tab:tier1skim}. A
variety of output formats for the secondary datasets were used (FEVT,
RECO, AOD, AlCaReco), and the selected fraction of
events ranged from $<1\%$ to $100\%$. Secondary dataset sizes ranged
from $<1$~GB to 2.5~TB. No requirement was imposed beforehand on the
restrictiveness of the filters for CSA06, hence those with very low
efficiencies are probably tighter than what one would apply in
practice.
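
The size of a secondary dataset follows roughly from the filter efficiency, the number of input events, and the per-event size of the chosen output format. A rough, purely illustrative estimate (the event count and per-event sizes below are assumptions, not CSA06 bookkeeping numbers):
\begin{verbatim}
# Illustrative estimate of a skim output size.
input_events = 25_000_000   # events in the primary dataset (assumed)
efficiency   = 0.01         # fraction of events passing the filter (assumed)
event_size   = {"FEVT": 1.5e6, "RECOSim": 0.5e6, "AODSim": 5e4}  # bytes/event (assumed)

for fmt, size in event_size.items():
    output_tb = input_events * efficiency * size / 1e12
    print(f"{fmt}: ~{output_tb:.2f} TB")
\end{verbatim}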

\begin{table}[phtb]
\centering
\caption{List of requested skim filters to run during CSA06 by group,
filter name, primary input dataset, efficiency, and input/output
data formats.}
\label{tab:tier1skim}
\begin{tabular}{|l|l|l|l|l|l|}
\hline
Group & Filter & Samples & Efficiencies & Input
format & Output format \\
\hline
hg & CSA06\_Tau\_Zand1lFilter.cfg & EWK & 14\% & FEVT &
RECOSim \\
hg & CSA06\_HiggsTau\_1lFilter.cfg & EWK & 36\% & FEVT &
RECOSim \\
hg & CSA06\_HiggsTau\_1lFilter.cfg & T-Tbar & 47\% & FEVT &
RECOSim \\
hg & CSA06\_HiggsWW\_WWFilter.cfg (bkgnd) & EWK & 1\% & FEVT &
FEVT \\
hg & CSA06\_HiggsWW\_WWFilter.cfg (signal) & EWK & 1\% & FEVT &
FEVT \\
hg & CSA06\_HiggsWW\_TTb\_Filter.cfg & T-Tbar & 4\% & FEVT &
FEVT \\
hg & CSA06\_Higgs\_mc2l\_Filter.cfg & EWK & 10\% & FEVT &
RECOSim \\
hg & CSA06\_Higgs\_mc2l\_Filter.cfg & Jets & 2\% & FEVT &
RECOSim \\
hg & CSA06\_Higgs\_mc2l\_Filter.cfg & HLT(e,mu) & 1\% &
FEVT & RECOSim \\
hg & CSA06\_Higgs\_mc2gamma\_Filter.cfg & EWK & 0 & FEVT &
RECOSim \\
hg & CSA06\_Higgs\_mc2gamma\_Filter.cfg & Jets & 34\% &
FEVT & RECOSim \\
hg & CSA06\_Higgs\_mc2gamma\_Filter.cfg & HLT(gam) &
0.4\% & FEVT & RECOSim \\
hg & CSA06\_Higgs\_mc2l\_Filter.cfg & TTbar & 14\% & FEVT & RECOSim \\
hg & CSA06\_Higgs\_mc2gamma\_Filter.cfg & TTbar & 8\% & FEVT &
RECOSim \\ \hline
sm & CSA06\_TTbar\_1lFilters.cfg (skim1efilter) & T-Tbar & 20\%
& FEVT & RECOSim \\
sm & CSA06\_TTbar\_1lFilters.cfg (skim1mufilter) & T-Tbar & 20\%
& FEVT & RECOSim \\
sm & CSA06\_TTbar\_1lFilters.cfg (skim1taufilter) & T-Tbar & 20\%
& FEVT & RECOSim \\
sm & CSA06\_TTbar\_dilepton.cfg & T-Tbar & $\sim$10\% & FEVT &
RECOSim \\
sm & CSA06\_MinimumBiasSkim.cfg & minbias & 100\% & FEVT & RECOSim \\
sm & CSA06\_UnderlyingEventJetsSkim.cfg (reco) & Jets &
$\sim$100\% & FEVT & RECOSim \\
sm & CSA06\_UnderlyingEventDYSkim.cfg & EWK & $\sim$10\% &
FEVT & RECOSim \\ \hline
eg & CSA06\_ZeeFilter.cfg (zeeFilter) & EWK & 3\% & FEVT &
RECOSim \\
eg & CSA06\_ZeeFilter.cfg (AlCaReco) & EWK & 3\% & FEVT &
AlcaReco \\
eg & CSA06\_AntiZmmFilter.cfg & Jets & 85\% & FEVT &
FEVT \\ \hline
mu & CSA06\_JPsi\_mumuFilter.cfg & SoftMuon & 50\% & FEVT &
FEVT \\
mu & CSA06\_JPsi\_mumuFilter.cfg & Zmumu & 50\% & FEVT & FEVT \\
mu & CSA06\_JPsi\_mumuFilter.cfg & EWK & 10\% & FEVT & FEVT \\
mu & CSA06\_WmunuFilter.cfg (reco) & EWK & 20\% & FEVT & AODSim \\
mu & CSA06\_WmunuFilter.cfg (reco) & SoftMuon & 60\% & FEVT &
AODSim \\
mu & CSA06\_ZmmFilter.cfg & Zmumu & 50\% & FEVT & RECOSim \\
mu & CSA06\_ZmmFilter.cfg & Jets & -- & FEVT & FEVT \\
mu & recoDiMuonExample.cfg (reco) & EWK & 20\% & FEVT & RECOSim \\
mu & recoDiMuonExample.cfg (reco) & Zmumu & 67\% & FEVT &
RECOSim \\ \hline
su & CSA06\_Exotics\_LM1Filter.cfg & Exotics & 39\% & FEVT &
FEVT \\
su & CSA06\_BSM\_mc2e\_Filter.cfg & Exotics & 2\% & FEVT &
FEVT \\
su & CSA06\_BSM\_mc2e\_Filter.cfg & EWK & $\sim 40\%$ & FEVT &
FEVT \\
su & CSA06\_BSM\_mc2e\_Filter.cfg & HLT(e) & -- & FEVT & FEVT \\
su & CSA06\_Exotics\_ZprimeDijetFilter.cfg & Exotics &
$\sim$30\% & FEVT & FEVT \\
su & CSA06\_Exotics\_QstarDijetFilter.cfg & Exotics &
$\sim$20\% & FEVT & FEVT \\
su & CSA06\_Exotics\_XQFilter.cfg & Exotics & 22\% & FEVT &
FEVT \\
su & CSA06\_Exotics\_ZprimeFilter.cfg & Exotics & 39\% &
FEVT & FEVT \\
su & CSA06\_Exotics\_LM1\_3IC5Jet30Filter.cfg (reco) & Exotics &
$25\%$ & FEVT & FEVT \\
su & CSA06\_TTbar\_2IC5Jet100ExoFilter.cfg (reco) & T-Tbar &
5\% & FEVT & FEVT \\ \hline
jm & CSA06\_QCD\_Skim.cfg (21 samples) & Jets & 100\% &
FEVT & FEVT \\ \hline
\end{tabular}
\end{table}



\subsection{Tier-1 Re-Reconstruction}
\label{sec:rereco}

The goal was to demonstrate re-reconstruction at a Tier-1 centre on
files first reconstructed and distributed by the Tier-0 centre,
including access and application of new constants from the
offline DB. Four teams were set up to demonstrate re-reconstruction on
at least 100K events at each of the Tier-1 centres.

\subsubsection{Baseline Approach}

Since re-reconstruction had not been tested before the start of CSA06,
a technical problem was encountered with a couple of reconstruction
modules when re-reconstruction was first attempted on November 4. The
issue had to do with multiple reconstruction products stored in the
Event, and the proper mechanism for accessing them. Once the issue was
diagnosed, the Tier-1 re-reconstruction workflow dropped
pixel tracking and vertexing (out of about 100 reconstruction modules),
and the processing worked correctly.
Re-reconstruction was demonstrated on $>$100K events at 6 Tier-1 centres.
For the Tracker and ECAL calibration exercises (see
Section~\ref{sec:calib}), new constants inserted
into the offline DB were used for the re-reconstruction, and the
resulting datasets were
published and accessible to CRAB jobs. Thus, CSA06 also demonstrated
the full reprocessing workflow.

\subsubsection{Two-Step Approach}

While the reconstruction issue described above was being diagnosed, a
brute-force two-step procedure was conducted in parallel to ensure
re-reconstruction at a Tier-1 centre. The approach consisted of first
skimming off the original Tier-0 reconstruction products, in analogy
with the physics skim job workflow described in
Section~\ref{sec:skims}, and then running reconstruction on the skimmed
events (i.e. two ProdAgent workflows). This approach was also
successfully demonstrated at the FNAL Tier-1 centre.


\subsection{Job Execution at Tier-1 and Tier-2}
\subsubsection{Job Robot}
The processing metrics defined for CSA06 foresaw that sites
offering computing capacity to CMS and participating in CSA06 would
complete an aggregate of 50k jobs per day. The goal was to exercise the
job submission infrastructure and to monitor the input/output rate.

\begin{itemize}
\item About 10k per day were intended as skimming and reconstruction jobs
at the Tier-1 centers
\item About 40k per day were expected to be a combination of user submitted
analysis jobs and robot submitted analysis-like jobs
\end{itemize}

The job robots are automated expert systems that simulate user analysis tasks
using the CMS Remote Analysis Builder (CRAB). Therefore they provide a reasonable
method to generate load on the system by running analysis on all data samples
at all sites individually. They consist of a component/agent based
structure which enables parallel execution. Job distribution to CMS compute
resources is accomplished by using Condor-G direct submission on the OSG sites
and gLite bulk submission on the EGEE sites.\\

The job workflow comprises four distinct steps:
\begin{itemize}
\item Job creation
\begin{itemize}
\item Data discovery using DBS/DLS
\item Job splitting according to user requirements
\item Preparation of job dependent files (incl. the jdl)
\end{itemize}
\item Job submission
\begin{itemize}
\item Check if there are any compatible resources in the Grid Information System known to the submission system
\item Submit the job to the Grid submission component (Resource Broker or Condor-G) through the CMS bookkeeping component (BOSS)
\end{itemize}
\item Job status check
\item Job output retrieval
\begin{itemize}
\item Retrieve the job output from the sandbox located on the Resource Broker (EGEE sites) or the common filesystem (OSG sites)
\end{itemize}
\end{itemize}

The job robot executes all four steps of the workflow described above on a large scale.\\
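
A schematic sketch of this four-step loop is given below; the helper functions are placeholders standing in for the CRAB/BOSS/Grid interfaces, not their actual APIs, and the dataset name is hypothetical.
\begin{verbatim}
def create_jobs(dataset):
    """Data discovery (DBS/DLS), job splitting, preparation of job files."""
    return [{"dataset": dataset, "block": i, "status": "created"}
            for i in range(10)]

def submit(job):
    """Hand the job to the Grid submission component (RB or Condor-G)."""
    job["status"] = "submitted"

def check_status(job):
    """Poll the bookkeeping system for the current job state."""
    job["status"] = "done"          # placeholder outcome

def retrieve_output(job):
    """Fetch the output sandbox from the RB or the shared filesystem."""
    job["status"] = "retrieved"

for job in create_jobs("/CSA06/MinBias/RECO"):   # hypothetical dataset name
    submit(job)
    check_status(job)
    if job["status"] == "done":
        retrieve_output(job)
\end{verbatim}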

Apart from job submission, the monitoring of job execution over the
entire chain of steps involved plays an important role. CMS has
chosen to use a product called Dashboard, a development that is part
of the CMS Integration Program. It is a joint effort of LCG's
ARDA project and the MonAlisa team, in close collaboration with the CMS
developers working on job submission tools for production and analysis.
The objective of the Dashboard is to provide a complete view of the CMS
activity independently of the Grid flavour (i.e. OSG vs. EGEE). The
Dashboard maintains and displays the quantitative characteristics of the
usage pattern by including CMS-specific information, and it reports problems
of various kinds.\\

The monitoring information used in CSA06 is available via a web interface
and includes the following categories:
\begin{itemize}
\item Quantities - how many jobs are running, pending, successfully
completed, or failed, per user, per site, per input data collection, and
the distribution of these quantities over time
\item Usage of the resources (CPU, memory consumption, I/O rates), and
distribution over time with aggregation on different levels
\item Distribution of resources between different application areas
(i.e. analysis vs. production), different analysis groups and individual
users
\item Grid behaviour - success rate, failure reasons as a function of time,
site and data collection
\item CMS application behaviour
\item Distribution of data samples over sites and analysis groups
\end{itemize}

Timeline:
\begin{itemize}
\item October 15, 2006: The job robots have started analysis submission. 10k
jobs were submitted by two robot instances, with 90\% of them going to OSG sites
using Condor-G direct submission and 10\% going through the traditional LCG
Resource Broker (RB) to EGEE sites. In preparation for moving to the gLite RB,
thereby improving the submission rate to EGEE sites, bulk submission was
integrated into CRAB and is currently being tested.

\item October 17, 2006: Job robot submissions continue at a larger scale. There
was an issue found with the bulk submission feature used at EGEE sites leaving
jobs hanging indefinitely. The explanation was that parsing
of file names in the RB input sandbox failed for file name lengths of exactly 110
characters. The problem, located in the gLite User Interface (UI), was solved by
rebuilding the UI code to include a new version of the libtar library. A new
version of the UI was made available to the job robot operations team within a
day.\\
A total of 20k jobs were submitted in the past 24 hours. A large number of jobs
seemed not to report all the site information to the
Dashboard, which resulted in a major fraction being marked as ``unknown'' in the report.
The effect needs to be understood.\\
Apart from the jobs affected by the problem mentioned above, the efficiency
for successfully completed jobs is very high.

\item October 19, 2006: Robotic job submission via both Condor-G direct
submission and gLite RB bulk submission is activated. The job completion efficiency
remains very high for some sites. Over the course of the past day nearly 2000
jobs were completed at Caltech with only 5 failures.

\item October 20, 2006: The number of ``unknown'' jobs is decreasing following
further investigations by the robot operations team. The job completion efficiency
remains high, though the total number of submissions is lower than in the previous
days. A large number of sites running the PBS batch system have taken their
resources off the Grid because of a critical security vulnerability. Sites
applied the respective patch at short notice and were back to normal operation
within a day or two.

\item October 23, 2006: Over the weekend significant scaling issues were
encountered in the robot. Those were mainly associated with the mySQL
server holding the BOSS DB. On the gLite submission side a problem was
found with projects comprising more than 2000 jobs. A limit was
introduced, with the consequence that the same data files are accessed
more often.

\item October 24, 2006: There were again scaling problems observed in the
job robots. Switching to a central mySQL database for both robots
led to the database developing a lock state. Though the locks
automatically clear within 10 to 30 minutes, the effect has an impact on
the overall job submission rate. To resolve the issue two databases
were created, one for each robot. While the Condor-G side performs well,
the gLite robot continues to develop locking. A memory leak leading to
robot crashes was observed in CRAB/BOSS submission through gLite. The
robot operations team is working with the BOSS developers on a solution.

\item October 25, 2006: The BOSS developers have analyzed the problem
reported yesterday as a ``scaling issue'' and found that an SQL statement
issued by CRAB was incomplete, leading to long table rows being accessed
and resulting in a heavy load on the database server. The CRAB developers
made a new release available the same day, and the robot operations
team found that the robots have been running fine since.

\item October 26, 2006: Following the decision to move from analysis
of data produced with CMSSW\_1\_0\_3 to more recent data
produced with CMSSW\_1\_0\_5, a number of sites were not selected
and therefore not participating, since they still lacked the respective
datasets.

\item November 1, 2006: The submission rate reached by the job robots
is currently at about 25k jobs per day. To improve scaling up to the
desired rate, 11 robots were set up and are currently submitting to OSG
and EGEE sites.

\item November 2, 2006: The total number of jobs was on the order of
21k. Due to more sites having datasets published in DBS/DLS that were
created with CMSSW\_1\_0\_5, the number of participating sites has increased.
Both the total application efficiency and the Grid efficiency are over 99\%.

\item November 6, 2006: The number of submitted and completed jobs is still increasing.
30k jobs have successfully passed all steps in the past 24 hours. 24
Tier-2 sites are now publishing data and are accepting jobs from the robot.
The efficiency remains high.

\item November 7, 2006: The combined job robot, production and analysis submissions
exceeded the goal of 55k per day. The activity breakdown is shown in
Figure~\ref{fig:breakdown}.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.9\linewidth]{figs/jobs-breakdown-1102}
\end{center}
\caption{Dashboard view of job breakdown by activity (7 Nov - 8 Nov).}
\label{fig:breakdown}
\end{figure}

The job robot submissions by site are shown in Figure~\ref{fig:jobs-per-site}.
Six out of seven Tier-1
centers are included in the job robot. As expected, the Tier-2 centers are
still dominating the submissions. The addition of the Tier-1 centers has
driven the job robot submission rates past the load that can be sustained
by a single mySQL job monitor.

\begin{figure}[htp]
\begin{center}
\includegraphics[width=0.9\linewidth]{figs/jobs-per-site-1102}
\end{center}
\caption{Dashboard view of job breakdown by site (7 Nov - 8 Nov).}
\label{fig:jobs-per-site}
\end{figure}
\end{itemize}