ViewVC Help
View File | Revision Log | Show Annotations | Root Listing
root/cvsroot/COMP/CRAB/python/crab_help.py
Revision: 1.88
Committed: Thu Jan 22 10:45:27 2009 UTC (16 years, 3 months ago) by spiga
Content type: text/x-python
Branch: MAIN
Changes since 1.87: +4 -4 lines
Log Message:
update

File Contents

# Content
1
2 ###########################################################################
3 #
4 # H E L P F U N C T I O N S
5 #
6 ###########################################################################
7
8 import common
9
10 import sys, os, string
11
12 import tempfile
13
14 ###########################################################################
15 def usage():
16 print 'in usage()'
17 usa_string = common.prog_name + """ [options]
18
19 The most useful general options (use '-h' to get complete help):
20
21 -create -- Create all the jobs.
22 -submit n -- Submit the first n available jobs. Default is all.
23 -status [range] -- check status of all jobs.
24 -getoutput|-get [range] -- get back the output of all jobs: if range is defined, only of selected jobs.
25 -extend -- Extend an existing task to run on new fileblocks if there.
26 -publish [dbs_url] -- after the getouput, publish the data user in a local DBS instance.
27 -kill [range] -- kill submitted jobs.
28 -resubmit [range] -- resubmit killed/aborted/retrieved jobs.
29 -copyData [range] -- copy locally the output stored on remote SE.
30 -renewCredential -- renew credential on the server.
31 -clean -- gracefully cleanup the directory of a task.
32 -match|-testJdl [range] -- check if resources exist which are compatible with jdl.
33 -list [range] -- show technical job details.
34 -postMortem [range] -- provide a file with information useful for post-mortem analysis of the jobs.
35 -printId [range] -- print the job SID or Task Unique ID while using the server.
36 -createJdl [range] -- provide files with a complete Job Description (JDL).
37 -validateCfg [fname] -- parse the ParameterSet using the framework's Python API.
38 -continue|-c [dir] -- Apply command to task stored in [dir].
39 -h [format] -- Detailed help. Formats: man (default), tex, html, txt.
40 -cfg fname -- Configuration file name. Default is 'crab.cfg'.
41 -debug N -- set the verbosity level to N.
42 -v -- Print version and exit.
43
44 "range" has syntax "n,m,l-p" which correspond to [n,m,l,l+1,...,p-1,p] and all possible combination
45
46 Example:
47 crab -create -submit 1
48 """
49 print usa_string
50 sys.exit(2)
51
52 ###########################################################################
53 def help(option='man'):
54 help_string = """
55 =pod
56
57 =head1 NAME
58
59 B<CRAB>: B<C>ms B<R>emote B<A>nalysis B<B>uilder
60
61 """+common.prog_name+""" version: """+common.prog_version_str+"""
62
63 This tool B<must> be used from an User Interface and the user is supposed to
64 have a valid Grid certificate.
65
66 =head1 SYNOPSIS
67
68 B<"""+common.prog_name+"""> [I<options>] [I<command>]
69
70 =head1 DESCRIPTION
71
72 CRAB is a Python program intended to simplify the process of creation and submission of CMS analysis jobs to the Grid environment .
73
74 Parameters for CRAB usage and configuration are provided by the user changing the configuration file B<crab.cfg>.
75
76 CRAB generates scripts and additional data files for each job. The produced scripts are submitted directly to the Grid. CRAB makes use of BossLite to interface to the Grid scheduler, as well as for logging and bookkeeping.
77
78 CRAB supports any CMSSW based executable, with any modules/libraries, including user provided ones, and deals with the output produced by the executable. CRAB provides an interface to CMS data discovery services (DBS and DLS), which are completely hidden to the final user. It also splits a task (such as analyzing a whole dataset) into smaller jobs, according to user requirements.
79
80 CRAB can be used in two ways: StandAlone and with a Server.
81 The StandAlone mode is suited for small task, of the order of O(100) jobs: it submits the jobs directly to the scheduler, and these jobs are under user responsibility.
82 In the Server mode, suited for larger tasks, the jobs are prepared locally and then passed to a dedicated CRAB server, which then interacts with the scheduler on behalf of the user, including additional services, such as automatic resubmission, status caching, output retrieval, and more.
83 The CRAB commands are exactly the same in both cases.
84
85 CRAB web page is available at
86
87 I<http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/>
88
89 =head1 HOW TO RUN CRAB FOR THE IMPATIENT USER
90
91 Please, read all the way through in any case!
92
93 Source B<crab.(c)sh> from the CRAB installation area, which have been setup either by you or by someone else for you.
94
95 Modify the CRAB configuration file B<crab.cfg> according to your need: see below for a complete list. A template and commented B<crab.cfg> can be found on B<$CRABDIR/python/crab.cfg>
96
97 ~>crab -create
98 create all jobs (no submission!)
99
100 ~>crab -submit 2 -continue [ui_working_dir]
101 submit 2 jobs, the ones already created (-continue)
102
103 ~>crab -create -submit 2
104 create _and_ submit 2 jobs
105
106 ~>crab -status
107 check the status of all jobs
108
109 ~>crab -getoutput
110 get back the output of all jobs
111
112 ~>crab -publish
113 publish all user outputs in the DBS specified in the crab.cfg (dbs_url_for_publication) or written as argument of this option
114
115 =head1 RUNNING CMSSW WITH CRAB
116
117 =over 4
118
119 =item B<A)>
120
121 Develop your code in your CMSSW working area. Do anything which is needed to run interactively your executable, including the setup of run time environment (I<eval `scramv1 runtime -sh|csh`>), a suitable I<ParameterSet>, etc. It seems silly, but B<be extra sure that you actually did compile your code> I<scramv1 b>.
122
123 =item B<B)>
124
125 Source B<crab.(c)sh> from the CRAB installation area, which have been setup either by you or by someone else for you. Modify the CRAB configuration file B<crab.cfg> according to your need: see below for a complete list.
126
127 The most important parameters are the following (see below for complete description of each parameter):
128
129 =item B<Mandatory!>
130
131 =over 6
132
133 =item B<[CMSSW]> section: datasetpath, pset, splitting parameters, output_file
134
135 =item B<[USER]> section: output handling parameters, such as return_data, copy_data etc...
136
137 =back
138
139 =item B<Run it!>
140
141 You must have a valid voms-enabled Grid proxy. See CRAB web page for details.
142
143 =back
144
145 =head1 HOW TO RUN ON CONDOR-G
146
147 The B<Condor-G> mode for B<CRAB> is a special submission mode next to the standard Resource Broker submission. It is designed to submit jobs directly to a site and not using the Resource Broker.
148
149 Due to the nature of B<Condor-G> submission, the B<Condor-G> mode is restricted to OSG sites within the CMS Grid, currently the 7 US T2: Florida(ufl.edu), Nebraska(unl.edu), San Diego(ucsd.edu), Purdue(purdue.edu), Wisconsin(wisc.edu), Caltech(ultralight.org), MIT(mit.edu).
150
151 =head2 B<Requirements:>
152
153 =over 2
154
155 =item installed and running local Condor scheduler
156
157 (either installed by the local Sysadmin or self-installed using the VDT user interface: http://www.uscms.org/SoftwareComputing/UserComputing/Tutorials/vdt.html)
158
159 =item locally available LCG or OSG UI installation
160
161 for authentication via Grid certificate proxies ("voms-proxy-init -voms cms" should result in valid proxy)
162
163 =item set the environment variable EDG_WL_LOCATION to the edg directory of the local LCG or OSG UI installation
164
165 =back
166
167 =head2 B<What the Condor-G mode can do:>
168
169 =over 2
170
171 =item submission directly to multiple OSG sites,
172
173 the requested dataset must be published correctly by the site in the local and global services.
174 Previous restrictions on submitting only to a single site have been removed. SE and CE whitelisting
175 and blacklisting work as in the other modes.
176
177 =back
178
179 =head2 B<What the Condor-G mode cannot do:>
180
181 =over 2
182
183 =item submit jobs if no condor scheduler is running on the submission machine
184
185 =item submit jobs if the local condor installation does not provide Condor-G capabilities
186
187 =item submit jobs to an LCG site
188
189 =item support Grid certificate proxy renewal via the myproxy service
190
191 =back
192
193 =head2 B<CRAB configuration for Condor-G mode:>
194
195 The CRAB configuration for the Condor-G mode only requires one change in crab.cfg:
196
197 =over 2
198
199 =item select condor_g Scheduler:
200
201 scheduler = condor_g
202
203 =back
204
205 =head1 COMMANDS
206
207 =over 4
208
209 =item B<-create>
210
211 Create the jobs: from version 1_3_0 it is only possible to create all jobs.
212 The maximum number of jobs depends on dataset and splitting directives. This set of identical jobs accessing the same dataset are defined as a task.
213 This command create a directory with default name is I<crab_0_date_time> (can be changed via ui_working_dir parameter, see below). Inside this directory it is placed whatever is needed to submit your jobs. Also the output of your jobs (once finished) will be place there (see after). Do not cancel by hand this directory: rather use -clean (see).
214 See also I<-continue>.
215
216 =item B<-submit [range]>
217
218 Submit n jobs: 'n' is either a positive integer or 'all' or a [range]. Default is all.
219 If 'n' is passed as argument, the first 'n' suitable jobs will be submitted. Please note that this is behaviour is different from other commands, where -command N means act the command to the job N, and not to the first N jobs. If a [range] is passed, the selected jobs will be submitted.
220 This option must be used in conjunction with -create (to create and submit immediately) or with -continue (which is assumed by default), to submit previously created jobs. Failure to do so will stop CRAB and generate an error message. See also I<-continue>.
221
222 =item B<-continue [dir] | -c [dir]>
223
224 Apply the action on the task stored on directory [dir]. If the task directory is the standard one (crab_0_date_time), the more recent in time is taken. Any other directory must be specified.
225 Basically all commands (but -create) need -continue, so it is automatically assumed. Of course, the standard task directory is used in this case.
226
227 =item B<-status>
228
229 Check the status of the jobs, in all states. All the info (e.g. application and wrapper exit codes) will be available only after the output retrieval.
230
231 =item B<-getoutput|-get [range]>
232
233 Retrieve the output declared by the user via the output sandbox. By default the output will be put in task working dir under I<res> subdirectory. This can be changed via config parameters. B<Be extra sure that you have enough free space>. See I<range> below for syntax.
234
235 =item B<-publish [dbs_url]>
236
237 Publish user output in a local DBS instance after retrieving of output. By default the publish uses the dbs_url_for_publication specified in the crab.cfg file, otherwise you can write it as argument of this option.
238
239 =item B<-resubmit [range]>
240
241 Resubmit jobs which have been previously submitted and have been either I<killed> or are I<aborted>. See I<range> below for syntax.
242
243 =item B<-extend>
244
245 Create new jobs for an existing task, checking if new blocks are available for the given dataset.
246
247 =item B<-kill [range]>
248
249 Kill (cancel) jobs which have been submitted to the scheduler. A range B<must> be used in all cases, no default value is set.
250
251 =item B<-copyData [range]>
252
253 Copy locally (on current working directory) the output previously stored on remote SE by the jobs. Of course, only if copy_data option has been set.
254
255 =item B<-renewCredential >
256
257 If using the server modality, this command allows to delegate a valid credential (proxy/token) to the server associated with the task.
258
259 =item B<-match|-testJdl [range]>
260
261 Check if the job can find compatible resources. It is equivalent of doing I<edg-job-list-match> on edg.
262
263 =item B<-printId [range]>
264
265 Just print the job identifier, which can be the SID (Grid job identifier) of the job(s) or the taskId if you are using CRAB with the server or local scheduler Id. If [range] is "full", the the SID of all the jobs are printed, also in the case of submission with server.
266
267 =item B<-printJdl [range]>
268
269 Collect the full Job Description in a file located under share directory. The file base name is File- .
270
271 =item B<-postMortem [range]>
272
273 Try to collect more information of the job from the scheduler point of view.
274
275 =item B<-list [range]>
276
277 Dump technical information about jobs: for developers only.
278
279 =item B<-clean [dir]>
280
281 Clean up (i.e. erase) the task working directory after a check whether there are still running jobs. In case, you are notified and asked to kill them or retrieve their output. B<Warning> this will possibly delete also the output produced by the task (if any)!
282
283 =item B<-help [format] | -h [format]>
284
285 This help. It can be produced in three different I<format>: I<man> (default), I<tex> and I<html>.
286
287 =item B<-v>
288
289 Print the version and exit.
290
291 =item B<range>
292
293 The range to be used in many of the above commands has the following syntax. It is a comma separated list of jobs ranges, each of which may be a job number, or a job range of the form first-last.
294 Example: 1,3-5,8 = {1,3,4,5,8}
295
296 =back
297
298 =head1 OPTION
299
300 =over 4
301
302 =item B<-cfg [file]>
303
304 Configuration file name. Default is B<crab.cfg>.
305
306 =item B<-debug [level]>
307
308 Set the debug level: high number for high verbosity.
309
310 =back
311
312 =head1 CONFIGURATION PARAMETERS
313
314 All the parameter describe in this section can be defined in the CRAB configuration file. The configuration file has different sections: [CRAB], [USER], etc. Each parameter must be defined in its proper section. An alternative way to pass a config parameter to CRAB is via command line interface; the syntax is: crab -SECTION.key value . For example I<crab -USER.outputdir MyDirWithFullPath> .
315 The parameters passed to CRAB at the creation step are stored, so they cannot be changed by changing the original crab.cfg . On the other hand the task is protected from any accidental change. If you want to change any parameters, this require the creation of a new task.
316 Mandatory parameters are flagged with a *.
317
318 B<[CRAB]>
319
320 =over 4
321
322 =item B<jobtype *>
323
324 The type of the job to be executed: I<cmssw> jobtypes are supported
325
326 =item B<scheduler *>
327
328 The scheduler to be used: I<glitecoll> is the more efficient grid scheduler and should be used. Other choice are I<glite>, same as I<glitecoll> but without bulk submission (and so slower) or I<condor_g> (see specific paragraph) or I<edg> which is the former Grid scheduler, which will be dismissed in some future
329 From version 210, also local scheduler are supported, for the time being only at CERN. I<LSF> is the standard CERN local scheduler or I<CAF> which is LSF dedicated to CERN Analysis Facilities.
330
331 =item B<use_server>
332
333 To use the server for job handling (recommended) 0=no (default), 1=true. The server to be used will be found automatically from a list of available ones: it can also be specified explicitly by using I<server_name> (see below)
334
335 =item B<server_name>
336
337 To use the CRAB-server support it is needed to fill this key with server name as <Server_DOMAIN> (e.g. cnaf,fnal). If this is set, I<use_server> is set to true automatically.
338 If I<server_name=None> crab works in standalone way, same as using I<use_server=0> and no I<server_name>.
339 The server available to users can be found from CRAB web page.
340
341 =back
342
343 B<[CMSSW]>
344
345 =over 4
346
347 =item B<datasetpath *>
348
349 the path of processed dataset as defined on the DBS. It comes with the format I</PrimaryDataset/DataTier/Process> . In case no input is needed I<None> must be specified.
350
351 =item B<runselection *>
352
353 within a dataset you can restrict to run on a specific run number or run number range. For example runselection=XYZ or runselection=XYZ1-XYZ2 .
354
355 =item B<use_parent *>
356
357 within a dataset you can ask to run over the related parent files too. E.g., this will give you access to the RAW data while running over a RECO sample. Setting use_parent=True CRAB determines the parent files from DBS and will add secondaryFileNames = cms.untracked.vstring( <LIST of parent FIles> ) to the pool source section of your parameter set.
358
359 =item B<pset *>
360
361 the ParameterSet to be used. Both .cfg and .py parameter sets are supported for the relevant versions of CMSSW.
362
363 =item I<Of the following three parameter exactly two must be used, otherwise CRAB will complain.>
364
365 =item B<total_number_of_events *>
366
367 the number of events to be processed. To access all available events, use I<-1>. Of course, the latter option is not viable in case of no input. In this case, the total number of events will be used to split the task in jobs, together with I<event_per_job>.
368
369 =item B<events_per_job*>
370
371 number of events to be accessed by each job. Since a job cannot cross the boundary of a fileblock it might be that the actual number of events per job is not exactly what you asked for. It can be used also with No input.
372
373 =item B<number_of_jobs *>
374
375 Define the number of job to be run for the task. The number of event for each job is computed taking into account the total number of events required as well as the granularity of EventCollections. Can be used also with No input.
376
377 =item B<output_file *>
378
379 the output files produced by your application (comma separated list). From CRAB 2_2_2 onward, if TFileService is defined in user Pset, the corresponding output file is automatically added to the list of output files. User can avoid this by setting B<skip_TFileService_output> = 1 (default is 0 == file included). The Edm output produced via PoolOutputModule can be automatically added by setting B<get_edm_output> = 1 (default is 0 == no)
380
381 =item B<skip_TFileService_output>
382
383 Force CRAB to skip the inclusion of file produced by TFileService to list of output files. Default is I<0>, namely the file is included.
384
385 =item B<get_edm_output>
386
387 Force CRAB to add the EDM output file, as defined in PSET in PoolOutputModule (if any) to be added to the list of output files. Default is 0 (== no inclusion)
388
389 =item B<increment_seeds>
390
391 Specifies a comma separated list of seeds to increment from job to job. The initial value is taken
392 from the CMSSW config file. I<increment_seeds=sourceSeed,g4SimHits> will set sourceSeed=11,12,13 and g4SimHits=21,22,23 on
393 subsequent jobs if the values of the two seeds are 10 and 20 in the CMSSW config file.
394
395 See also I<preserve_seeds>. Seeds not listed in I<increment_seeds> or I<preserve_seeds> are randomly set for each job.
396
397 =item B<preserve_seeds>
398
399 Specifies a comma separated list of seeds to which CRAB will not change from their values in the user
400 CMSSW config file. I<preserve_seeds=sourceSeed,g4SimHits> will leave the Pythia and GEANT seeds the same for every job.
401
402 See also I<increment_seeds>. Seeds not listed in I<increment_seeds> or I<preserve_seeds> are randomly set for each job.
403
404 =item B<first_run>
405
406 First run to be generated in a generation jobs. Relevant only for no-input workflow.
407
408 =item B<generator>
409
410 Name of the generator your MC job is using. Some generators require CRAB to skip events, others do not.
411 Possible values are pythia, comphep, and madgraph. This will skip events in your generator input file.
412
413 =item B<executable>
414
415 The name of the executable to be run on remote WN. The default is cmsrun. The executable is either to be found on the release area of the WN, or has been built on user working area on the UI and is (automatically) shipped to WN. If you want to run a script (which might internally call I<cmsrun>, use B<USER.script_exe> instead.
416
417 =item I<DBS and DLS parameters:>
418
419 =item B<dbs_url>
420
421 The URL of the DBS query page. For expert only.
422
423 =item B<show_prod>
424
425 To enable CRAB to show data hosted on Tier1s sites specify I<show_prod> = 1. By default those data are masked.
426
427 =item B<no_block_boundary>
428
429 To remove fileblock boundaries in job splitting specify I<no_block_boundary> = 1.
430
431 =back
432
433 B<[USER]>
434
435 =over 4
436
437 =item B<additional_input_files>
438
439 Any additional input file you want to ship to WN: comma separated list. IMPORTANT NOTE: they will be placed in the WN working dir, and not in ${CMS_SEARCH_PATH}. Specific files required by CMSSW application must be placed in the local data directory, which will be automatically shipped by CRAB itself. You do not need to specify the I<ParameterSet> you are using, which will be included automatically. Wildcards are allowed.
440
441 =item B<script_exe>
442
443 A user script that will be run on WN (instead of default cmsrun). It is up to the user to setup properly the script itself to run on WN enviroment. CRAB guarantees that the CMSSW environment is setup (e.g. scram is in the path) and that the modified pset.cfg will be placed in the working directory, with name CMSSW.cfg . The user must ensure that a job report named crab_fjr.xml will be written. This can be guaranteed by passing the arguments "-j crab_fjr.xml" to cmsRun in the script. The script itself will be added automatically to the input sandbox so user MUST NOT add it within the B<USER.additional_input_files>.
444
445 =item B<ui_working_dir>
446
447 Name of the working directory for the current task. By default, a name I<crab_0_(date)_(time)> will be used. If this card is set, any CRAB command which require I<-continue> need to specify also the name of the working directory. A special syntax is also possible, to reuse the name of the dataset provided before: I<ui_working_dir : %(dataset)s> . In this case, if e.g. the dataset is SingleMuon, the ui_working_dir will be set to SingleMuon as well.
448
449 =item B<thresholdLevel>
450
451 This has to be a value between 0 and 100, that indicates the percentage of task completeness (jobs in a ended state are complete, even if failed). The server will notify the user by e-mail (look at the field: B<eMail>) when the task will reach the specified threshold. Works just with the server_mode = 1.
452
453 =item B<eMail>
454
455 The server will notify the specified e-mail when the task will reaches the specified B<thresholdLevel>. A notification is also sent when the task will reach the 100\% of completeness. This field can also be a list of e-mail: "B<eMail = user1@cern.ch, user2@cern.ch>". Works just with the server_mode = 1.
456
457 =item B<return_data *>
458
459 The output produced by the executable on WN is returned (via output sandbox) to the UI, by issuing the I<-getoutput> command. B<Warning>: this option should be used only for I<small> output, say less than 10MB, since the sandbox cannot accommodate big files. Depending on Resource Broker used, a size limit on output sandbox can be applied: bigger files will be truncated. To be used in alternative to I<copy_data>.
460
461 =item B<outputdir>
462
463 To be used together with I<return_data>. Directory on user interface where to store the output. Full path is mandatory, "~/" is not allowed: the default location of returned output is ui_working_dir/res .
464
465 =item B<logdir>
466
467 To be used together with I<return_data>. Directory on user interface where to store the standard output and error. Full path is mandatory, "~/" is not allowed: the default location of returned output is ui_working_dir/res .
468
469 =item B<copy_data *>
470
471 The output (only that produced by the executable, not the std-out and err) is copied to a Storage Element of your choice (see below). To be used as an alternative to I<return_data> and recommended in case of large output.
472
473 =item B<storage_element>
474
475 To be used with <copy_data>=1
476 If you want to copy the output of your analysis in a official CMS Tier2 or Tier3, you have to write the CMS Site Name of the site, as written in the SiteDB https://cmsweb.cern.ch/sitedb/reports/showReport?reportid=se_cmsname_map.ini (i.e T2_IT_legnaro). You have also to specify the <remote_dir>(see below)
477
478 If you want to copy the output in a not_official_CMS remote site you have to specify the complete storage element name (i.e se.xxx.infn.it).You have also to specify the <storage_path> and the <storage_port> if you do not use the default one(see below).
479
480 =item B<user_remote_dir>
481
482 To be used with <copy_data>=1 and <storage_element> official CMS sites.
483 This is the directory or tree of directories where your output will be stored. This directory will be created under the mountpoint ( which will be discover by CRAB if an official CMS storage Element has been used, or taken from the crab.cfg as specified by the user). B<NOTE> This part of the path will be used as logical file name of your files in the case of publication without using an official CMS storage Element.
484
485 =item B<storage_path>
486
487 To be used with <copy_data>=1 and <storage_element> not official CMS sites.
488 This is the full path of the Storage Element writeable by all, the mountpoint of SE (i.e /srm/managerv2?SFN=/pnfs/se.xxx.infn.it/yyy/zzz/)
489
490
491 =item B<storage_pool>
492
493 If you are using CAF scheduler, you can specify the storage pool where to write your output.
494 The default is cmscafuser. If you do not want to use the default, you can overwrite it specifing None
495
496 =item B<storage_port>
497
498 To choose the storage port specify I<storage_port> = N (default is 8443) .
499
500 =item B<publish_data*>
501
502 To be used with <copy_data>=1
503 To publish your produced output in a local istance of DBS set publish_data = 1
504 All the details about how to use this functionality are written in https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrabForPublication
505 N.B 1) if you are using an official CMS site to stored data, the remote dir will be not considered. The directory where data will be stored is decided by CRAB, following the CMS policy in order to be able to re-read published data.
506 2) if you are using a not official CMS site to store data, you have to check the <lfn>, that will be part of the logical file name of you published files, in order to be able to re-read the data.
507
508 =item B<publish_data_name>
509
510 You produced output will be published in your local DBS with dataset name <primarydataset>/<publish_data_name>/USER
511
512 =item B<dbs_url_for_publication>
513
514 Specify the URL of your local DBS istance where CRAB has to publish the output files
515
516 =item B<srm_version>
517
518 To choose the srm version specify I<srm_version> = (srmv1 or srmv2).
519
520 =item B<xml_report>
521
522 To be used to switch off the screen report during the status query, enabling the db serialization in a file. Specifying I<xml_report> = FileName CRAB will serialize the DB into CRAB_WORKING_DIR/share/FileName.
523
524 =item B<usenamespace>
525
526 To use the automate namespace definition (perfomed by CRAB) it is possible to set I<usenamespace>=1. The same policy used for the stage out in case of data publication will be applied.
527
528 =item B<debug_wrapper>
529
530 To enable the higer verbose level on wrapper specify I<debug_wrapper> = 1. The Pset contents before and after the CRAB maipulation will be written together with other useful infos.
531
532 =item B<deep_debug>
533
534 To be used in case of unexpected job crash when the sdtout and stderr files are lost. Submitting again the same jobs specifying I<deep_debug> = 1 these files will be reported back. NOTE: it works only on standalone mode for debugging purpose.
535
536 =item B<dontCheckSpaceLeft>
537
538 Set it to 1 to skip the check of free space left on your working directory before attempting to get the output back. Default is 0 (=False)
539
540 =back
541
542 B<[EDG]>
543
544 =over 4
545
546 =item B<RB>
547
548 Which RB you want to use instead of the default one, as defined in the configuration of your UI. The ones available for CMS are I<CERN> and I<CNAF>. They are actually identical, being a collection of all RB/WMS available for CMS: the configuration files needed to change the broker will be automatically downloaded from CRAB web page and used.
549 You can use any other RB which is available, if you provide the proper configuration files. E.g., for RB XYZ, you should provide I<edg_wl_ui.conf.CMS_XYZ> and I<edg_wl_ui_cmd_var.conf.CMS_XYZ> for EDG RB, or I<glite.conf.CMS_XYZ> for glite WMS. These files are searched for in the current working directory, and, if not found, on crab web page. So, if you put your private configuration files in the working directory, they will be used, even if they are not available on crab web page.
550 Please get in contact with crab team if you wish to provide your RB or WMS as a service to the CMS community.
551
552 =item B<proxy_server>
553
554 The proxy server to which you delegate the responsibility to renew your proxy once expired. The default is I<myproxy.cern.ch> : change only if you B<really> know what you are doing.
555
556 =item B<role>
557
558 The role to be set in the VOMS. See VOMS documentation for more info.
559
560 =item B<group>
561
562 The group to be set in the VOMS, See VOMS documentation for more info.
563
564 =item B<dont_check_proxy>
565
566 If you do not want CRAB to check your proxy. The creation of the proxy (with proper length), its delegation to a myproxyserver is your responsibility.
567
568 =item B<requirements>
569
570 Any other requirements to be add to JDL. Must be written in compliance with JDL syntax (see LCG user manual for further info). No requirement on Computing element must be set.
571
572 =item B<additional_jdl_parameters:>
573
574 Any other parameters you want to add to jdl file:semicolon separated list, each
575 item B<must> be complete, including the closing ";".
576
577 =item B<wms_service>
578
579 With this field it is also possible to specify which WMS you want to use (https://hostname:port/pathcode) where "hostname" is WMS name, the "port" generally is 7443 and the "pathcode" should be something like "glite_wms_wmproxy_server".
580
581 =item B<max_cpu_time>
582
583 Maximum CPU time needed to finish one job. It will be used to select a suitable queue on the CE. Time in minutes.
584
585 =item B<max_wall_clock_time>
586
587 Same as previous, but with real time, and not CPU one.
588
589 =item B<ce_black_list>
590
591 All the CE (Computing Element) whose name contains the following strings (comma separated list) will not be considered for submission. Use the dns domain (e.g. fnal, cern, ifae, fzk, cnaf, lnl,....). You may use hostnames or CMS Site names (T2_DE_DESY) or substrings.
592
593 =item B<ce_white_list>
594
595 Only the CE (Computing Element) whose name contains the following strings (comma separated list) will be considered for submission. Use the dns domain (e.g. fnal, cern, ifae, fzk, cnaf, lnl,....). You may use hostnames or CMS Site names (T2_DE_DESY) or substrings. Please note that if the selected CE(s) does not contain the data you want to access, no submission can take place.
596
597 =item B<se_black_list>
598
599 All the SE (Storage Element) whose name contains the following strings (comma separated list) will not be considered for submission.It works only if a datasetpath is specified. You may use hostnames or CMS Site names (T2_DE_DESY) or substrings.
600
601 =item B<se_white_list>
602
603 Only the SE (Storage Element) whose name contains the following strings (comma separated list) will be considered for submission.It works only if a datasetpath is specified. Please note that if the selected CE(s) does not contain the data you want to access, no submission can take place. You may use hostnames or CMS Site names (T2_DE_DESY) or substrings.
604
605 =item B<remove_default_blacklist>
606
607 CRAB enforce the T1s Computing Eelements Black List. By default it is appended to the user defined I<CE_black_list>. To remove the enforced T1 black lists set I<remove_default_blacklist>=1.
608
609 =item B<virtual_organization>
610
611 You don\'t want to change this: it\'s cms!
612
613 =item B<retry_count>
614
615 Number of time the Grid will try to resubmit your job in case of Grid related problem.
616
617 =item B<shallow_retry_count>
618
619 Number of time shallow resubmission the Grid will try: resubmissions are tried B<only> if the job aborted B<before> start. So you are guaranteed that your jobs run strictly once.
620
621 =item B<maxtarballsize>
622
623 Maximum size of tar-ball in Mb. If bigger, an error will be generated. The actual limit is that on the RB input sandbox. Default is 9.5 Mb (sandbox limit is 10 Mb)
624
625 =item B<skipwmsauth>
626
627 Temporary useful parameter to allow the WMSAuthorisation handling. Specifying I<skipwmsauth> = 1 the pyopenssl problmes will disappear. It is needed working on gLite UI outside of CERN.
628
629 =back
630
631 B<[LSF]> or B<[CAF]>
632
633 =over 4
634
635 =item B<queue>
636
637 The LSF queue you want to use: if none, the default one will be used. For CAF, the proper queue will be automatically selected.
638
639 =item B<resource>
640
641 The resources to be used within a LSF queue. Again, for CAF, the right one is selected.
642
643 =item B<copyCommand>
644
645 To define the command to be used to copy both Input and Output sandboxes to final location. Default is cp
646
647 =back
648
649 =head1 FILES
650
651 I<crab> uses a configuration file I<crab.cfg> which contains configuration parameters. This file is written in the INI-style. The default filename can be changed by the I<-cfg> option.
652
653 I<crab> creates by default a working directory 'crab_0_E<lt>dateE<gt>_E<lt>timeE<gt>'
654
655 I<crab> saves all command lines in the file I<crab.history>.
656
657 =head1 HISTORY
658
659 B<CRAB> is a tool for the CMS analysis on the Grid environment. It is based on the ideas from CMSprod, a production tool originally implemented by Nikolai Smirnov.
660
661 =head1 AUTHORS
662
663 """
664 author_string = '\n'
665 for auth in common.prog_authors:
666 #author = auth[0] + ' (' + auth[2] + ')' + ' E<lt>'+auth[1]+'E<gt>,\n'
667 author = auth[0] + ' E<lt>' + auth[1] +'E<gt>,\n'
668 author_string = author_string + author
669 pass
670 help_string = help_string + author_string[:-2] + '.'\
671 """
672
673 =cut
674 """
675
676 pod = tempfile.mktemp()+'.pod'
677 pod_file = open(pod, 'w')
678 pod_file.write(help_string)
679 pod_file.close()
680
681 if option == 'man':
682 man = tempfile.mktemp()
683 pod2man = 'pod2man --center=" " --release=" " '+pod+' >'+man
684 os.system(pod2man)
685 os.system('man '+man)
686 pass
687 elif option == 'tex':
688 fname = common.prog_name+'-v'+common.prog_version_str
689 tex0 = tempfile.mktemp()+'.tex'
690 pod2tex = 'pod2latex -full -out '+tex0+' '+pod
691 os.system(pod2tex)
692 tex = fname+'.tex'
693 tex_old = open(tex0, 'r')
694 tex_new = open(tex, 'w')
695 for s in tex_old.readlines():
696 if string.find(s, '\\begin{document}') >= 0:
697 tex_new.write('\\title{'+common.prog_name+'\\\\'+
698 '(Version '+common.prog_version_str+')}\n')
699 tex_new.write('\\author{\n')
700 for auth in common.prog_authors:
701 tex_new.write(' '+auth[0]+
702 '\\thanks{'+auth[1]+'} \\\\\n')
703 tex_new.write('}\n')
704 tex_new.write('\\date{}\n')
705 elif string.find(s, '\\tableofcontents') >= 0:
706 tex_new.write('\\maketitle\n')
707 continue
708 elif string.find(s, '\\clearpage') >= 0:
709 continue
710 tex_new.write(s)
711 tex_old.close()
712 tex_new.close()
713 print 'See '+tex
714 pass
715 elif option == 'html':
716 fname = common.prog_name+'-v'+common.prog_version_str+'.html'
717 pod2html = 'pod2html --title='+common.prog_name+\
718 ' --infile='+pod+' --outfile='+fname
719 os.system(pod2html)
720 print 'See '+fname
721 pass
722 elif option == 'txt':
723 fname = common.prog_name+'-v'+common.prog_version_str+'.txt'
724 pod2text = 'pod2text '+pod+' '+fname
725 os.system(pod2text)
726 print 'See '+fname
727 pass
728
729 sys.exit(0)