\section{Data Management}
\subsection{Dataset Bookkeeping System}
The Dataset Bookkeeping System (DBS) for CSA06 included the
functionality needed for cataloging Monte Carlo data and tracking some
of the processing history. Included were the data-related concepts of
Dataset, File, File Block and Data Tier. The processing-related
concepts of Application and Application Configuration were provided to
track the actual operations that were performed to produce the
data. In addition, data parentage relationships were provided. A
client-level API enabled the creation of each of the entities
described above. File information, including size, number of events,
status and Logical File Name (LFN), is included as attributes of each
file. A discovery service was developed that enabled users to find
data of interest for further processing and analysis.
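As an illustration of the kind of client-level access described above, the
sketch below shows how a client might register a file and query the discovery
service over HTTP. It is only a minimal sketch: the endpoint URL, the API and
parameter names, and the response handling are hypothetical and do not
reproduce the actual DBS client API.
\begin{verbatim}
# Hypothetical sketch of HTTP access to a DBS-like catalog (not the real API).
import urllib.parse
import urllib.request

DBS_URL = "https://example.cern.ch/dbs/servlet/DBSServlet"  # hypothetical endpoint

def insert_file(lfn, size, events, block, status="VALID"):
    """Register one file entry with its basic attributes."""
    payload = urllib.parse.urlencode({
        "api": "insertFile",      # hypothetical API name
        "lfn": lfn,
        "size": size,
        "events": events,
        "block": block,
        "status": status,
    }).encode()
    with urllib.request.urlopen(DBS_URL, data=payload) as resp:
        return resp.read().decode()

def list_datasets(pattern="*"):
    """Discovery-style query: find datasets matching a pattern."""
    query = urllib.parse.urlencode({"api": "listDatasets", "pattern": pattern})
    with urllib.request.urlopen(DBS_URL + "?" + query) as resp:
        return resp.read().decode()
\end{verbatim}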

\subsubsection{Deployment and Operation}
The architecture of the DBS service included a middle-tier server
running a CGI script under Apache. All client access to the server was
through an HTTP API. The CGI script was written in Perl and accessed the
database via the Perl DBI module. All activity for CSA06 was
established on the CMSR production Oracle database server, with one
Global DBS account and a half dozen so-called Local DBS accounts. The
procedure was to produce Monte Carlo data under the control of four
``Prod Agents'', each with access to its own Local DBS instance. When the
data was appropriately merged and validated, its catalog entries were
migrated to the global catalog for use by CMS at large. This migration
allowed block-by-block transfer of datasets to be done
through a simple API.
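To make the Local-to-Global flow concrete, the sketch below outlines a
block-by-block migration loop of the kind described above. The function names
and the notion of a ``closed'' block check are assumptions made for
illustration; they are not the actual migration API.
\begin{verbatim}
# Hypothetical sketch of block-by-block migration from a Local DBS
# instance to the Global DBS instance (illustrative only).

def migrate_dataset(local_dbs, global_dbs, dataset):
    """Copy the catalog entries of one dataset, one file block at a time."""
    for block in local_dbs.list_blocks(dataset):      # assumed helper
        if not local_dbs.is_closed(block):            # only merged/validated data
            continue
        files = local_dbs.list_files(block)           # file entries with LFN, size, ...
        global_dbs.insert_block(dataset, block)       # create the block globally
        global_dbs.insert_files(block, files)         # then its file entries
\end{verbatim}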

There were two servers provided for CSA06, a ``test'' and a
``production'' machine. Both servers were dual-processor 2.8~GHz
Pentium machines with 2~GB of memory. The test server was heavily used
by remote sites, including for the initial data production, CMS job
robot submissions, and final skimming operations. The production server
was used by the Tier-0 reconstruction farm, and some skimming operations
near the end of CSA06. There were also ongoing CMS activities
included in the loads for the test server that were not related to
CSA06. Access statistics for the service were obtained by mining the
Apache access log files for each of the servers. The activity for the
production server during the month of November is shown in
Fig.~\ref{fig:dbs-prod-stats-chart} and
Fig.~\ref{fig:dbs-prod-stats-table}. The important features to
observe in these data are the number of pages served (Pages) and the
total amount of data transferred (Bandwidth). As an example of a
particularly busy day, November 27 showed 220k pages (query requests)
and over 10~GB of data. This is a request rate of over 2.5~Hz, and the
server CPU was around 50\% loaded. Demand on the test server was
heavy in October and the first part of November, with peak rates of
around 3~Hz in mid-October and again in early November.
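The access numbers quoted above were obtained by mining the Apache access
logs. The sketch below shows one simple way such daily statistics could be
derived from a log in the standard combined format; the log path is a
placeholder and the script is illustrative rather than the tool actually used.
\begin{verbatim}
# Illustrative sketch: daily request counts, rates and bytes served
# from an Apache access log in combined format (not the actual tool used).
import re
from collections import defaultdict

LOGFILE = "access_log"                     # placeholder path
date_re = re.compile(r'\[(\d+/\w+/\d+):')  # e.g. [27/Nov/2006:12:00:01 ...]
bytes_re = re.compile(r'" \d{3} (\d+)')    # status code followed by byte count

pages = defaultdict(int)
bandwidth = defaultdict(int)
with open(LOGFILE) as log:
    for line in log:
        m = date_re.search(line)
        if not m:
            continue
        day = m.group(1)
        pages[day] += 1
        b = bytes_re.search(line)
        if b:
            bandwidth[day] += int(b.group(1))

for day in sorted(pages):
    rate = pages[day] / 86400.0            # average requests per second
    print(day, pages[day], "pages", round(rate, 2), "Hz",
          bandwidth[day] / 1e9, "GB")
\end{verbatim}
For example, the 220k pages quoted for November 27 correspond to an average
request rate of $220\,000 / 86\,400 \approx 2.5$~Hz.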
\begin{figure}[hbtp]
\begin{center}
\resizebox{15cm}{!}{\includegraphics{figs/dbs-prod-server-stats-nov-chart}}
\caption{DBS production server November daily statistics. The bars on the chart for each day represent the number/amount of ``Visits'', ``Pages'', ``Hits'', and ``Bandwidth'' served. The scale and legend for the bar chart can be determined from the data in the table in Fig.~\ref{fig:dbs-prod-stats-table}.}
\label{fig:dbs-prod-stats-chart}
\end{center}
\end{figure}

\begin{figure}[hbtp]
\begin{center}
\resizebox{12cm}{!}{\includegraphics{figs/dbs-prod-server-stats-nov-table}}
\caption{DBS production server November daily statistics.}
\label{fig:dbs-prod-stats-table}
\end{center}
\end{figure}
\subsubsection{Experience from CSA06}
The overall operation and performance of the system was very good
throughout the course of the CSA06 exercise. The limited functionality
provided by the schema and API was sufficient for the test, although
many additional features are needed for the ultimate system. The
dataset propagation from Local-scope to Global-scope worked
seamlessly. Maintenance of the server code was simple and
straightforward. The clients were easily integrated with DLS, CRAB and
the Prod Agent. The support needed for the DBS production server was
minimal.

Clients occasionally reported slow response during peak periods, but
the servers held up well. The service can easily be scaled by adding
machines and a load-balancing mechanism, such as round-robin DNS, and
this will be examined for the final system. The loads of the CSA06
operation were artificially inflated because many Local DBS instances
were being managed centrally at CERN, in addition to the Global
instance. In the final system the local instances will not be operated
at CERN. There were two incidents which resulted in service
interruption, both caused by problems with the central CMSR database
system. Ultimately, the DBS service rates needed will be reduced by
the fact that only Global-instance traffic will go through the central
CERN service.

There were several specific problems and concerns that can be noted
based on the CSA06 experience:
\begin{enumerate}
\item Lack of proper communication between FNAL and Tier-0 on DBS needs resulted in some concepts missing from the schema and API functionality.
\item Parameter Set information was not properly stored and there were not enough APIs to relate datasets with these parameter sets.
\item Provenance information for a dataset and a file was not properly stored in DBS.
\item Block management was not automated and was done externally on an irregular basis. This led to the transfer of both open and closed blocks. Also, blocks that were transferred initially could not be uniquely and universally identified, but this was fixed.
\item The merge remapping API was not used, and thus not tested, because the needed functionality was not available in the Framework Job Report and Prod Agent.
\item Dataset migration from Local-scope to Global-scope was found to be slower than desired. It performed at a rate of 1000 files per minute, which caused problems for the ``test'' server behind the cmsdoc proxy server, which timed out after 15 minutes (see the estimate below).
\end{enumerate}
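The timeout constraint in the last item can be estimated directly from the
quoted numbers: at the observed migration rate, the largest migration that
could complete within the proxy window was roughly
\[
  1000\;\mathrm{files/min} \times 15\;\mathrm{min} = 15\,000\;\mathrm{files},
\]
so any single migration involving more files than this would be cut off by the
proxy timeout.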
These issues will be addressed in the next-generation DBS being implemented after CSA06.

\subsection{Data Location Services}

The data location service (DLS) operated in CMS was based on the
LCG File Catalog (LFC) infrastructure provided by the EGEE grid
project. Data in CMS is divided into blocks, which are logical
groupings of data files. The block-to-file mapping is maintained in
the DBS. The advantage of file blocks is that they reduce the number
of entries that need to be tracked in the DLS catalog. Instead of an
entry for each file there is an entry for every block, which in CSA06
typically contained a few hundred files.
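The reduction in catalog entries can be illustrated with a toy model of the
two catalogs, shown below. The structures and numbers are purely illustrative;
they are not the DBS or DLS schemas.
\begin{verbatim}
# Toy model of block-based location tracking (illustrative only).
# DBS keeps the block-to-file mapping; DLS keeps only block-to-site locations.

files_per_block = 300          # "a few hundred files" per block in CSA06
n_blocks = 20

# DBS side: every file is catalogued, grouped into blocks.
dbs_blocks = {
    f"/PrimaryDS/Tier/block-{i}": [f"/store/file-{i}-{j}.root"
                                   for j in range(files_per_block)]
    for i in range(n_blocks)
}

# DLS side: one entry per block, listing the sites that host it.
dls_locations = {block: ["T1_Example_A", "T2_Example_B"]
                 for block in dbs_blocks}

print("file entries tracked in DBS:    ", sum(len(v) for v in dbs_blocks.values()))
print("location entries tracked in DLS:", len(dls_locations))
\end{verbatim}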

\subsubsection{Deployment and Operation}

The LFC was deployed as a service provided by the WLCG, and the DLS
tools developed by CMS were deployed at the locations that needed to
query or update DLS entries. The tools were checked out from CVS,
compiled, and installed on a standard WLCG user interface (UI) machine.

\subsubsection{CSA06 Experience}
The DLS performed stably over the challenge. New data blocks were
created once per day for each dataset. This created a maximum of
about twenty new entries per day, so the load from creating production
datasets was small. The user analysis jobs and the load-generating job
robot jobs queried the DLS to determine the location of data blocks,
but only when creating new workflows. The query rate was larger than
the new-entry rate, but the DLS performed well.

The largest load in the system came from PhEDEx agents updating the
DLS entries with data block locations at sites. The PhEDEx agents
update the DLS with the status of all the complete blocks at a site
on a ten-minute interval. This kept the latency for publishing
complete blocks low, and was manageable with the small number of
blocks used in CSA06. As the number of blocks grows, CMS may need to
investigate local site caching of the DLS information and only update
the DLS with changes since the previous block publication.
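The caching idea mentioned above amounts to publishing only the difference
between the current set of complete blocks at the site and what was published
in the previous cycle. The sketch below illustrates this; the function names
and call structure are assumptions for illustration, not PhEDEx or DLS code.
\begin{verbatim}
# Illustrative sketch of delta publication of complete blocks to a DLS-like
# catalog: only changes since the last publication cycle are sent.

def publish_delta(complete_blocks, published_cache, dls_add, dls_remove):
    """complete_blocks:  set of blocks currently complete at the site.
    published_cache:     set of blocks published in the previous cycle.
    dls_add, dls_remove: callables that update the remote catalog."""
    new_blocks = complete_blocks - published_cache
    gone_blocks = published_cache - complete_blocks
    for block in new_blocks:
        dls_add(block)            # register newly completed blocks
    for block in gone_blocks:
        dls_remove(block)         # retract blocks no longer held here
    return set(complete_blocks)   # becomes the cache for the next cycle
\end{verbatim}
With this scheme, a ten-minute cycle in which nothing changed would generate no
catalog traffic at all, instead of re-publishing every complete block at the site.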

\subsection{Data File Catalogs}

CMS utilized a technique called the trivial file catalog (TFC) to
provide the data catalogs for the sites. The TFC utilizes a consistent
namespace at each site to provide the catalog functionality that maps
a logical file name to a physical file name in the storage system.
There is a local site configuration file that points the applications
to the common namespace.
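The mapping performed by a TFC can be thought of as a small set of
protocol-dependent rewrite rules applied to the logical file name. The sketch
below illustrates the idea with made-up site prefixes and protocol names; it
does not reproduce the actual CMS TFC rule files.
\begin{verbatim}
# Illustrative LFN-to-PFN mapping in the spirit of a trivial file catalog.
# Prefixes and protocol names are hypothetical examples.
import re

TFC_RULES = {
    # protocol: (LFN pattern, PFN template)
    "direct": (r"^/store/(.*)", r"/storage/cms/store/\1"),
    "rfio":   (r"^/store/(.*)", r"rfio:///castor/example.org/cms/store/\1"),
    "dcap":   (r"^/store/(.*)", r"dcap://dcache.example.org/pnfs/cms/store/\1"),
}

def lfn_to_pfn(lfn, protocol):
    """Apply the site's rewrite rule for the requested access protocol."""
    pattern, template = TFC_RULES[protocol]
    if re.match(pattern, lfn) is None:
        raise ValueError("LFN not covered by the TFC rules: " + lfn)
    return re.sub(pattern, template, lfn)

print(lfn_to_pfn("/store/mc/CSA06/sample/file.root", "dcap"))
\end{verbatim}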

\subsubsection{Deployment and Operation}
The TFC was deployed during the site validation phase of service
challenge 4. The local site configurations were entered into the
common CMS CVS repository, which provided tracking and aided
debugging by remote experts. The TFC was successfully deployed at
all sites participating in the challenge activities, and the feedback
on deployment and operations was generally positive. All underlying
storage systems could be accommodated and the site instructions were
detailed. The local configuration file also provides the location of
the local database cache and the local storage element to the
application, and could be used for other site-specific elements.
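The kind of information carried by the local configuration file can be
summarized as in the sketch below. Only the items mentioned in the text are
shown, and the layout and field names are a hypothetical illustration rather
than the actual CMS site configuration format.
\begin{verbatim}
# Hypothetical summary of a site-local configuration (illustrative only).
SITE_LOCAL_CONFIG = {
    "site": "T2_Example_Site",
    "tfc_rules": "path/to/storage-rules",          # common namespace mapping rules
    "local_database_cache": "http://cache.example.org:3128",
    "local_storage_element": "se.example.org",     # default SE for the application
}
\end{verbatim}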

\subsubsection{CSA06 Experience}
The TFC scaled well during the challenge. Even at sites with a high
load of applications, the logical-to-physical file name mappings were
reliably resolved. The TFC did not place too high a load on the
namespaces of the underlying storage systems. An additional factor
of four in the TFC rate should be possible with all the currently used
storage systems.

\subsection{Data Transfer Mechanism}

Data in CMS was transferred between sites using the Physics Experiment
Data Export (PhEDEx) system. PhEDEx relies on underlying grid
file transfer protocols to physically move the files. While PhEDEx is
capable of using bare gridFTP to replicate files, only File Transfer
Service (FTS) driven transfers and Storage Resource Manager (SRM)
transfers were operated during CSA06.

\subsubsection{Deployment and Operation}

CMS deployed an architecture where the FTS servers were located at
each Tier-1 and supported channels for groups of ``associated'' Tier-2
centers. The association between Tier-1 and Tier-2 centers was
intended for channel hosting, as the data can be sourced from any
Tier-1 center in the CMS computing model \cite{model, ctdr}. The FTS
channels relied on SRM transfers, and the FNAL Tier-1 center also
supported SRM transfers driven directly by srmcp. The stability of the
SRM service at the sites varied, but the percentage of transfers that
succeeded on the first attempt improved over similar tests during
service challenge 4, indicating that the services are maturing.
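As an illustration of the srmcp-driven mode mentioned above, a transfer of a
single file can be scripted as in the sketch below. The SRM endpoints and
paths are placeholders, and the exact client options in use at the time may
have differed.
\begin{verbatim}
# Illustrative sketch: driving an SRM copy directly with srmcp.
# Endpoints and paths are placeholders, not real CSA06 locations.
import subprocess

SOURCE = "srm://se-source.example.org:8443/cms/store/mc/CSA06/file.root"
DEST   = "srm://se-dest.example.org:8443/cms/store/mc/CSA06/file.root"

# srmcp takes a source and a destination URL; additional options are omitted.
result = subprocess.run(["srmcp", SOURCE, DEST])
if result.returncode != 0:
    print("transfer failed; it would normally be retried by the transfer agent")
\end{verbatim}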

\subsubsection{CSA06 Experience}

The architecture deployed in CMS for FTS transfers, with channels
hosted at the associated Tier-1 centers for the supported Tier-2
centers, leads to a large number of FTS channels. The number of FTS
channels supported at Tier-1 centers was larger than the number of FTS
channels supported at CERN. The deployed FTS architecture will be
re-examined for scalability and supportability.
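A rough count shows why this architecture multiplies channels. The site
numbers below are assumptions chosen only for illustration; the point is that
a Tier-1 hosting channels from every possible Tier-1 source to each of its
associated Tier-2 centers carries many more channels than the CERN server,
which hosts the Tier-0 to Tier-1 channels.
\begin{verbatim}
# Illustrative channel count for the deployed FTS architecture
# (site counts are assumptions, not the actual CSA06 numbers).
n_tier1 = 7            # assumed number of Tier-1 centers
n_assoc_tier2 = 4      # assumed Tier-2 centers associated with one Tier-1

# CERN FTS server: one channel per Tier-0 -> Tier-1 link.
channels_at_cern = n_tier1

# A Tier-1 FTS server: data for its associated Tier-2s can be sourced from
# any Tier-1, so it hosts one channel per (Tier-1 source, Tier-2) pair.
channels_at_one_tier1 = n_tier1 * n_assoc_tier2

print("channels hosted at CERN:      ", channels_at_cern)
print("channels hosted at one Tier-1:", channels_at_one_tier1)
\end{verbatim}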

\subsection{Data Access}

The CMS application was able to read successfully from the local
storage element using RFIO, RFIO2, and dCache during CSA06. During
the challenge the local file access was largely sequential, and all
protocols were able to meet the application input and output needs.
The initial goal of the challenge was to reach 1~MB/s per batch slot
at Tier-1 and Tier-2 centers. On average CMS was able to reach
approximately half the anticipated rate, which was improved after the
end of the challenge with protocol-specific tuning for the
application.
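The per-slot target translates directly into an aggregate read rate that a
site's storage must sustain. The slot counts in the sketch below are
assumptions used only to illustrate the scaling; the 1~MB/s goal and the
roughly-half achieved fraction are the figures quoted above.
\begin{verbatim}
# Illustrative aggregate-throughput estimate from the per-slot goal
# (slot counts are assumed, not actual CSA06 site sizes).
goal_per_slot_mb_s = 1.0        # CSA06 goal: 1 MB/s per batch slot
achieved_fraction = 0.5         # roughly half the goal was reached on average

for slots in (100, 400, 1000):
    goal = slots * goal_per_slot_mb_s
    achieved = goal * achieved_fraction
    print(f"{slots:5d} slots: goal {goal:6.0f} MB/s, "
          f"achieved about {achieved:6.0f} MB/s")
\end{verbatim}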