root/cvsroot/COMP/CRAB/python/crab_help.py
Revision: 1.182
Committed: Wed Sep 11 12:52:35 2013 UTC by belforte
Content type: text/x-python
Branch: MAIN
CVS Tags: CRAB_2_9_1, CRAB_2_9_1_pre2, HEAD
Changes since 1.181: +1 -1 lines
Log Message:
fix twiki reference, see: https://savannah.cern.ch/bugs/?102552

File Contents

# User Rev Content
1 nsmirnov 1.1
2     ###########################################################################
3     #
4     # H E L P F U N C T I O N S
5     #
6     ###########################################################################
7    
8     import common
9    
10     import sys, os, string
11 spiga 1.34
12 nsmirnov 1.1 import tempfile
13    
14     ###########################################################################
15     def usage():
16 slacapra 1.43 print 'in usage()'
17 nsmirnov 1.1 usa_string = common.prog_name + """ [options]
18 slacapra 1.3
19     The most useful general options (use '-h' to get complete help):
20    
21 spiga 1.100 -create -- Create all the jobs.
22     -submit n -- Submit the first n available jobs. Default is all.
23 slacapra 1.102 -status -- check status of all jobs.
24     -getoutput|-get [range] -- get back the output of all jobs: if a range is given, only of the selected jobs.
25     -publish -- after the getoutput, publish the user data in a local DBS instance.
26 fanzago 1.163 -publishNoInp -- after the getoutput, publish the user data in the local DBS instance, removing input data files
27 ewv 1.104 -checkPublication [dbs_url datasetpath] -- checks if a dataset is published in a DBS.
28 spiga 1.100 -kill [range] -- kill submitted jobs.
29 fanzago 1.161 -resubmit range or all -- resubmit killed/aborted/retrieved jobs.
30     -forceResubmit range or all -- resubmit jobs regardless of their status.
31 ewv 1.133 -copyData [range [dest_se or dest_endpoint]] -- copy your produced output, already stored on a remote SE,
32     either locally (to crab_working_dir/res) or to another remote SE.
33 spiga 1.100 -renewCredential -- renew credential on the server.
34     -clean -- gracefully cleanup the directory of a task.
35     -match|-testJdl [range] -- check if resources exist which are compatible with jdl.
36     -report -- print a short report about the task
37     -list [range] -- show technical job details.
38     -postMortem [range] -- provide a file with information useful for post-mortem analysis of the jobs.
39 belforte 1.169 -printId -- print the SID for all jobs in task
40 spiga 1.100 -createJdl [range] -- provide files with a complete Job Description (JDL).
41     -validateCfg [fname] -- parse the ParameterSet using the framework's Python API.
42 spiga 1.135 -cleanCache -- clean SiteDB and CRAB caches.
43 mcinquil 1.154 -uploadLog [jobid] -- upload main log files to a central repository
44 spiga 1.100 -continue|-c [dir] -- Apply command to task stored in [dir].
45     -h [format] -- Detailed help. Formats: man (default), tex, html, txt.
46     -cfg fname -- Configuration file name. Default is 'crab.cfg'.
47     -debug N -- set the verbosity level to N.
48     -v -- Print version and exit.
49 nsmirnov 1.1
50 slacapra 1.4 "range" has syntax "n,m,l-p" which corresponds to [n,m,l,l+1,...,p-1,p] and all possible combinations
51    
52 nsmirnov 1.1 Example:
53 slacapra 1.26 crab -create -submit 1
54 nsmirnov 1.1 """
55 slacapra 1.43 print usa_string
56 nsmirnov 1.1 sys.exit(2)
57    
58     ###########################################################################
59     def help(option='man'):
60     help_string = """
61     =pod
62    
63     =head1 NAME
64    
65     B<CRAB>: B<C>ms B<R>emote B<A>nalysis B<B>uilder
66    
67 slacapra 1.3 """+common.prog_name+""" version: """+common.prog_version_str+"""
68 nsmirnov 1.1
69 slacapra 1.19 This tool B<must> be used from a User Interface and the user is supposed to
70 fanzago 1.37 have a valid Grid certificate.
71 nsmirnov 1.1
72     =head1 SYNOPSIS
73    
74 slacapra 1.13 B<"""+common.prog_name+"""> [I<options>] [I<command>]
75 nsmirnov 1.1
76     =head1 DESCRIPTION
77    
78 ewv 1.52 CRAB is a Python program intended to simplify the process of creation and submission of CMS analysis jobs to the Grid environment.
79 nsmirnov 1.1
80 slacapra 1.3 Parameters for CRAB usage and configuration are provided by the user by editing the configuration file B<crab.cfg>.
81 nsmirnov 1.1
82 spiga 1.48 CRAB generates scripts and additional data files for each job. The produced scripts are submitted directly to the Grid. CRAB makes use of BossLite to interface to the Grid scheduler, as well as for logging and bookkeeping.
83 nsmirnov 1.1
84 ewv 1.52 CRAB supports any CMSSW based executable, with any modules/libraries, including user provided ones, and deals with the output produced by the executable. CRAB provides an interface to CMS data discovery services (DBS and DLS), which are completely hidden from the final user. It also splits a task (such as analyzing a whole dataset) into smaller jobs, according to user requirements.
85 nsmirnov 1.1
86 slacapra 1.46 CRAB can be used in two ways: StandAlone and with a Server.
87     The StandAlone mode is suited for small tasks, of the order of 100 jobs: it submits the jobs directly to the scheduler, and these jobs are under user responsibility.
88 ewv 1.52 In the Server mode, suited for larger tasks, the jobs are prepared locally and then passed to a dedicated CRAB server, which then interacts with the scheduler on behalf of the user and provides additional services, such as automatic resubmission, status caching, output retrieval, and more.
89 slacapra 1.46 The CRAB commands are exactly the same in both cases.
90    
91 slacapra 1.13 CRAB web page is available at
92    
93 spiga 1.94 I<https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrab>
94 slacapra 1.6
95 slacapra 1.19 =head1 HOW TO RUN CRAB FOR THE IMPATIENT USER
96    
97 ewv 1.52 Please, read all the way through in any case!
98 slacapra 1.19
99     Source B<crab.(c)sh> from the CRAB installation area, which has been set up either by you or by someone else for you.
100    
101 fanzago 1.163 Modify the CRAB configuration file B<crab.cfg> according to your needs: see below for a complete list. Template, commented B<crab.cfg> files can be found at B<$CRABDIR/python/full_crab.cfg> (detailed cfg) and B<$CRABDIR/python/minimal_crab.cfg> (only basic parameters).
102 slacapra 1.19
103 ewv 1.44 ~>crab -create
104 slacapra 1.19 create all jobs (no submission!)
105    
106 spiga 1.25 ~>crab -submit 2 -continue [ui_working_dir]
107 slacapra 1.19 submit 2 jobs, the ones already created (-continue)
108    
109 slacapra 1.26 ~>crab -create -submit 2
110 slacapra 1.19 create _and_ submit 2 jobs
111    
112 spiga 1.25 ~>crab -status
113 slacapra 1.19 check the status of all jobs
114    
115 spiga 1.25 ~>crab -getoutput
116 slacapra 1.19 get back the output of all jobs
117    
118 ewv 1.44 ~>crab -publish
119     publish all user outputs in the DBS specified in the crab.cfg (dbs_url_for_publication) or given as an argument to this option
120 fanzago 1.42
121 slacapra 1.20 =head1 RUNNING CMSSW WITH CRAB
122 nsmirnov 1.1
123 slacapra 1.3 =over 4
124    
125     =item B<A)>
126    
127 ewv 1.160 Develop your code in your CMSSW working area. Do anything which is needed to run your executable interactively, including the setup of the run time environment (I<cmsenv>), a suitable I<ParameterSet>, etc. It seems silly, but B<be extra sure that you actually did compile your code> with I<scram b>.
128 slacapra 1.3
129 ewv 1.44 =item B<B)>
130 slacapra 1.3
131 slacapra 1.20 Source B<crab.(c)sh> from the CRAB installation area, which has been set up either by you or by someone else for you. Modify the CRAB configuration file B<crab.cfg> according to your needs: see below for a complete list.
132    
133     The most important parameters are the following (see below for complete description of each parameter):
134    
135     =item B<Mandatory!>
136    
137     =over 6
138    
139     =item B<[CMSSW]> section: datasetpath, pset, splitting parameters, output_file
140    
141     =item B<[USER]> section: output handling parameters, such as return_data, copy_data etc...
142    
143     =back
144    
145     =item B<Run it!>
146    
147 fanzago 1.37 You must have a valid voms-enabled Grid proxy. See CRAB web page for details.
148 slacapra 1.20
149     =back
150    
151 spiga 1.94 =head1 RUNNING MULTICRAB
152    
153 ewv 1.98 MultiCRAB is a CRAB extension to submit the same job to multiple datasets in one go.
154 spiga 1.94
155 ewv 1.98 The use case for multicrab is when you have analysis code that you want to run on several datasets, typically some signals plus some backgrounds (for MC studies)
156 spiga 1.94 or on different streams/configurations/runs for real data taking. You want to run exactly the same code, and the crab.cfg files differ only in a few keys:
157 ewv 1.98 certainly datasetpath, but also other keys, such as e.g. total_number_of_events, in case you want to run on all signals but only a fraction of the background, or anything else.
158 spiga 1.94 Without multicrab, you would have to create a set of crab.cfg files, one for each dataset you want to access, and submit several instances of CRAB, saving the output to different locations.
159     Multicrab is meant to automate this procedure.
160     In addition to the usual crab.cfg, there is a new configuration file called multicrab.cfg. The syntax is very similar to that of crab.cfg, namely
161     [SECTION] <crab.cfg Section>.Key=Value
162    
163     Please note that it is mandatory to add explicitly the crab.cfg [SECTION] in front of [KEY].
164     The role of multicrab.cfg is to apply modifications to the template crab.cfg, some of which are common to all tasks, and some of which are task specific.
165    
166     =head2 So there are two sections:
167    
168     =over 2
169    
170 ewv 1.98 =item B<[COMMON]>
171 spiga 1.94
172     section: applies to all tasks, and is fully equivalent to modifying the template crab.cfg directly
173    
174 ewv 1.98 =item B<[DATASET]>
175 spiga 1.94
176 ewv 1.98 section: there can be an arbitrary number of these sections, one for each dataset you want to run. The names are free (except COMMON and MULTICRAB), and they will be used as the ui_working_dir for the task as well as an appendix to the user_remote_dir in case of output copy to a remote SE. So, the task corresponding to a section, say [SIGNAL], will be placed in directory SIGNAL, and the output will be put under /SIGNAL/, i.e. SIGNAL will be added as the last subdirectory in the user_remote_dir. A minimal sketch is shown after this list.
177 spiga 1.94
178     =back
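As an illustration, a minimal multicrab.cfg might look like the following (a sketch; dataset paths and section names are purely hypothetical):

    [COMMON]
    CMSSW.total_number_of_events = -1

    [SIGNAL]
    CMSSW.datasetpath = /MySignal/MyProduction/AODSIM

    [BACKGROUND]
    CMSSW.datasetpath = /MyBackground/MyProduction/AODSIM
    CMSSW.total_number_of_events = 100000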
179    
180     For further details please visit
181    
182     I<https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideMultiCrab>
183    
184 slacapra 1.19 =head1 HOW TO RUN ON CONDOR-G
185    
186     The B<Condor-G> mode for B<CRAB> is a special submission mode next to the standard Resource Broker submission. It is designed to submit jobs directly to a site and not using the Resource Broker.
187    
188 ewv 1.52 Due to the nature of B<Condor-G> submission, the B<Condor-G> mode is restricted to OSG sites within the CMS Grid, currently the 7 US T2: Florida(ufl.edu), Nebraska(unl.edu), San Diego(ucsd.edu), Purdue(purdue.edu), Wisconsin(wisc.edu), Caltech(ultralight.org), MIT(mit.edu).
189 slacapra 1.19
190     =head2 B<Requirements:>
191    
192     =over 2
193    
194     =item installed and running local Condor scheduler
195    
196     (either installed by the local Sysadmin or self-installed using the VDT user interface: http://www.uscms.org/SoftwareComputing/UserComputing/Tutorials/vdt.html)
197    
198     =item locally available LCG or OSG UI installation
199    
200 ewv 1.44 for authentication via Grid certificate proxies ("voms-proxy-init -voms cms" should result in valid proxy)
201 slacapra 1.19
202 spiga 1.96 =item set the environment variable GRID_WL_LOCATION to the edg directory of the local LCG or OSG UI installation
203 slacapra 1.19
204     =back
205    
206     =head2 B<What the Condor-G mode can do:>
207    
208     =over 2
209    
210 ewv 1.52 =item submission directly to multiple OSG sites,
211 slacapra 1.19
212 ewv 1.52 the requested dataset must be published correctly by the site in the local and global services.
213     Previous restrictions on submitting only to a single site have been removed. SE and CE whitelisting
214     and blacklisting work as in the other modes.
215 slacapra 1.19
216     =back
217    
218     =head2 B<What the Condor-G mode cannot do:>
219    
220     =over 2
221    
222     =item submit jobs if no condor scheduler is running on the submission machine
223    
224     =item submit jobs if the local condor installation does not provide Condor-G capabilities
225    
226 ewv 1.52 =item submit jobs to an LCG site
227 slacapra 1.19
228 fanzago 1.37 =item support Grid certificate proxy renewal via the myproxy service
229 slacapra 1.19
230     =back
231    
232     =head2 B<CRAB configuration for Condor-G mode:>
233    
234 ewv 1.52 The CRAB configuration for the Condor-G mode only requires one change in crab.cfg:
235 nsmirnov 1.1
236 slacapra 1.19 =over 2
237 slacapra 1.3
238 slacapra 1.19 =item select condor_g Scheduler:
239 slacapra 1.4
240 slacapra 1.19 scheduler = condor_g
241 slacapra 1.4
242 slacapra 1.19 =back
243 slacapra 1.4
244 edelmann 1.121
245     =head1 HOW TO RUN ON NORDUGRID ARC
246    
247     The ARC scheduler can be used to submit jobs to sites running the NorduGrid
248 fanzago 1.163 ARC grid middleware. To use it you need to have the ARC client
249 edelmann 1.121 installed.
250    
251     =head2 B<CRAB configuration for ARC mode:>
252    
253     The ARC scheduler requires some changes to crab.cfg:
254    
255     =over 2
256    
257     =item B<scheduler:>
258    
259     Select the ARC scheduler:
260     scheduler = arc
261 ewv 1.133
262 edelmann 1.121 =item B<requirements>, B<additional_jdl_parameters:>
263    
264     Use xrsl code instead of jdl for these parameters.
265    
266 ewv 1.133 =item B<max_cpu_time>, B<max_wall_clock_time:>
267 edelmann 1.121
268 belforte 1.174 When using ARC scheduler, for parameters max_cpu_time and max_wall_clock_time,
269     you can use units, e.g. "72 hours" or "3 days", just like with the xrsl attributes
270 edelmann 1.122 cpuTime and wallTime. If no unit is given, minutes is assumed by default.
271 edelmann 1.121
272     =back
273    
274     =head2 B<CRAB Commands:>
275    
276     Most CRAB commands behave approximately the same with the ARC scheduler, with only some minor differences:
277    
278     =over 2
279    
280     =item B<*> B<-printJdl|-createJdl> will print xrsl code instead of jdl.
281    
282     =back
283    
284    
285    
286    
287 ewv 1.52 =head1 COMMANDS
288 slacapra 1.4
289 spiga 1.142 =head2 B<-create>
290 slacapra 1.4
291 slacapra 1.26 Create the jobs: from version 1_3_0 it is only possible to create all jobs.
292 ewv 1.52 The maximum number of jobs depends on the dataset and splitting directives. This set of identical jobs accessing the same dataset is defined as a task.
293 slacapra 1.4 This command creates a directory with default name I<crab_0_date_time> (this can be changed via the ui_working_dir parameter, see below). Everything needed to submit your jobs is placed inside this directory. The output of your jobs (once finished) will also be placed there (see below). Do not delete this directory by hand: rather use -clean (see below).
294     See also I<-continue>.
295    
296 spiga 1.142 =head2 B<-submit [range]>
297 slacapra 1.4
298 ewv 1.98 Submit n jobs: 'n' is either a positive integer or 'all' or a [range]. The default is all.
299 ewv 1.160 If 'n' is passed as an argument, the first 'n' suitable jobs will be submitted. Please note that this behaviour is different from other commands, where -command N means apply the command to job N, and not to the first N jobs. If a [range] is passed, the selected jobs will be submitted. In order to only submit job number M use this syntax (note the trailing comma): I<crab -submit M,>
300 belforte 1.155
301 ewv 1.98 This option may be used in conjunction with -create (to create and submit immediately) or with -continue (which is assumed by default) to submit previously created jobs. Failure to do so will stop CRAB and generate an error message. See also I<-continue>.
302 slacapra 1.4
303 spiga 1.142 =head2 B<-continue [dir] | -c [dir]>
304 slacapra 1.4
305 ewv 1.98 Apply the action on the task stored in directory [dir]. If the task directory is the standard one (crab_0_date_time), the most recent in time is assumed. Any other directory must be specified.
306     Basically all commands (except -create) need -continue, so it is automatically assumed. Of course, the standard task directory is used in this case.
307 slacapra 1.4
308 slacapra 1.176 =head2 B<-status [options]>
309 nsmirnov 1.1
310 slacapra 1.176 Check the status of all jobs. With the server, the full status, including application and wrapper exit codes, is available as soon as a job ends. In StandAlone mode it is necessary to retrieve (crab -get) the job output first to obtain the exit codes. The status is printed on the console as a table with 7 columns: ID (identifier in the task), END (whether the job is completed or not; the CRAB server resubmits failed jobs, therefore: N=the server is still working on this job, Y=the server is done and the status will not change anymore), STATUS (the job status), ACTION (some additional status info useful for experts), ExeExitCode (exit code from cmsRun; if not zero it means cmsRun failed), JobExitCode (the exit code assigned by CRAB and reported by dashboard), E_HOST (the CE where the job executed). A comma separated list of options can be passed to -status (which does not accept a range). The options implemented are: I<-status short>, which skips the detailed job-per-job status, printing only the summary; I<-status color>, which adds some coloring to the summary status. The color code is the following: Green for successfully finished jobs, Red for jobs which ended unsuccessfully, Blue for jobs done but not retrieved, Yellow for jobs still to be submitted, default color for all other jobs, namely those running or pending on the grid. The color will be used only if the output stream is capable of accepting it. The two options can coexist: I<-status short,color>.
311 nsmirnov 1.1
312 spiga 1.142 =head2 B<-getoutput|-get [range]>
313 nsmirnov 1.1
314 slacapra 1.102 Retrieve the output declared by the user via the output sandbox. By default the output will be put in the task working dir under the I<res> subdirectory. This can be changed via config parameters. B<Be extra sure that you have enough free space>. From version 2_3_x, the available free space is checked in advance. See I<range> below for syntax.
315 nsmirnov 1.1
316 spiga 1.142 =head2 B<-publish>
317 fanzago 1.42
318 ewv 1.98 Publish user output in a local DBS instance after the retrieval of output. By default publish uses the dbs_url_for_publication specified in the crab.cfg file, otherwise you can supply it as an argument of this option.
319 fanzago 1.117 Warnings about publication:
320    
321     CRAB publishes only EDM files (in the FJR they are written in the tag <File>)
322    
323     CRAB publishes multiple EDM files in the same USER dataset if they are produced by a job and written in the <File> tag of the FJR.
324    
325     It is not possible for the user to select only one file to publish, nor to publish two files in two different USER datasets.
326    
327 fanzago 1.42
328 spiga 1.142 =head2 B<-checkPublication [-USER.dbs_url_for_publication=dbs_url -USER.dataset_to_check=datasetpath -debug]>
329 fanzago 1.97
330 ewv 1.98 Check if a dataset is published in a DBS. This option is automatically called at the end of the publication step, but it can also be used as a standalone command. By default it reads the parameters (USER.dbs_url_for_publication and USER.dataset_to_check) in your crab.cfg. You can overwrite the defaults in crab.cfg by passing these parameters as options. Using the -debug option, you will get detailed info about the files of published blocks.
331 fanzago 1.97
332 belforte 1.159 =head2 B<-publishNoInp>
333    
334     To be used only if you know why and are sure of what you are doing, or if crab support persons told you to use it. It is meant for situations where crab -publish fails because the framework job report xml file contains input files not present in DBS. It will publish the dataset anyhow, while marking it as Unknown Provenance to indicate that parentage information is partial. Those datasets will not be accepted for promotion to the Global Scope DBS. In all other respects this works as crab -publish.
335    
336 fanzago 1.161 =head2 B<-resubmit range or all>
337 nsmirnov 1.1
338 slacapra 1.175 Resubmit jobs which have been previously submitted and are either I<killed> or I<aborted>. See I<range> below for syntax. Also possible with the key I<bad>, which will resubmit all jobs which are I<killed>, I<aborted>, I<failed submission>, or I<retrieved> but with non-zero exit status (with the exception of wrapper exit status 60307).
339 nsmirnov 1.1
340 fanzago 1.161 =head2 B<-forceResubmit range or all>
341 slacapra 1.125
342     Same as -resubmit but without any check of the actual status of the job: please use with caution, as you can have problems if both the original job and the resubmitted one actually run and try to write the output on a SE. This command is meant to be used if killing is not possible or not working, but you know that the job failed or will fail. See I<range> below for syntax.
343    
344 spiga 1.142 =head2 B<-kill [range]>
345 nsmirnov 1.1
346 slacapra 1.4 Kill (cancel) jobs which have been submitted to the scheduler. A range B<must> be used in all cases, no default value is set.
347 nsmirnov 1.1
348 spiga 1.142 =head2 B<-copyData [range -dest_se=the official SE name or -dest_endpoint=the complete endpoint of the remote SE]>
349 slacapra 1.58
350 ewv 1.118 Option that can be used only if your output has been previously copied by CRAB to a remote SE.
351 fanzago 1.115 By default the copyData copies your output from the remote SE locally on the current CRAB working directory (under res). Otherwise you can copy the output from the remote SE to another one, specifying either -dest_se=<the remote SE official name> or -dest_endpoint=<the complete endpoint of remote SE>. If dest_se is used, CRAB finds the correct path where the output can be stored.
352    
353     Example: crab -copyData --> output copied to crab_working_dir/res directory
354     crab -copyData -dest_se=T2_IT_Legnaro --> output copied to the legnaro SE, directory discovered by CRAB
355 ewv 1.118 crab -copyData -dest_endpoint=srm://<se_name>:8443/xxx/yyyy/zzzz --> output copied to the se <se_name> under
356     /xxx/yyyy/zzzz directory.
357 slacapra 1.58
358 spiga 1.142 =head2 B<-renewCredential >
359 mcinquil 1.59
360 spiga 1.80 If using the server mode, this command allows you to delegate a valid credential (proxy/token) to the server associated with the task.
361 mcinquil 1.59
362 spiga 1.142 =head2 B<-match|-testJdl [range]>
363 nsmirnov 1.1
364 fanzago 1.163 Check if the job can find compatible resources. It is the equivalent of doing I<glite-wms-job-list-match> on edg.
365 nsmirnov 1.1
366 belforte 1.171 =head2 B<-printId>
367 slacapra 1.20
368 belforte 1.169 Just print the Scheduler Job Identifier (e.g. the Grid job identifier) of the jobs in the task.
369 slacapra 1.20
370 spiga 1.142 =head2 B<-createJdl [range]>
371 spiga 1.53
372 ewv 1.64 Collect the full Job Description in a file located under the share directory. The file base name is File- .
373 spiga 1.53
374 spiga 1.142 =head2 B<-postMortem [range]>
375 nsmirnov 1.1
376 slacapra 1.46 Try to collect more information about the job from the scheduler point of view.
377 fanzago 1.163 This is the only way to obtain info about the failure reason of aborted jobs.
378 nsmirnov 1.1
379 spiga 1.142 =head2 B<-list [range]>
380 slacapra 1.13
381 ewv 1.52 Dump technical information about jobs: for developers only.
382 slacapra 1.13
383 spiga 1.142 =head2 B<-report>
384 slacapra 1.89
385 fanzago 1.170 Print a short report about the task, namely the total number of events and files processed/requested/available, the name of the dataset path, a summary of the status of the jobs, and so on. A summary file of the runs and luminosity sections processed is written to the res subdirectory as lumiSummary.json and can be used as input to tools that compute the luminosity, like lumiCalc.py. In the same subdirectory a file containing all the input runs and lumis, called InputLumiSummaryOfTask.json, and a file containing the runs and lumis missing due to failed jobs, called missingLumiSummary.json, are also produced. The missingLumiSummary.json can be used as a lumi_mask file to create a new task in order to analyse the missing data (instead of resubmitting the failed jobs).
386 slacapra 1.89
387 spiga 1.142 =head2 B<-clean [dir]>
388 nsmirnov 1.1
389 slacapra 1.26 Clean up (i.e. erase) the task working directory after a check for still running jobs. If there are any, you are notified and asked to kill them or retrieve their output. B<Warning>: this may also delete the output produced by the task (if any)!
390 nsmirnov 1.1
391 spiga 1.142 =head2 B<-cleanCache>
392 calloni 1.110
393 spiga 1.135 Clean up (i.e. erase) the SiteDB and CRAB cache content.
394 calloni 1.110
395 mcinquil 1.154 =head2 B<-uploadLog [jobid]>
396    
397     Upload main log files to a central repository. It prints a link to be forwarded to support people (e.g. the crab feedback hypernews).
398    
399 ewv 1.160 It can optionally take a job id as input. It does not allow job ranges/lists.
400 mcinquil 1.154
401 fanzago 1.161 Uploaded files are: crab.log, crab.cfg, job logging info, summary file and a metadata file.
402     If you specify the jobid, the job standard output and FJR will also be uploaded. Warning: in this case you need to run getoutput first!
403 fanzago 1.163 In the case of aborted jobs you have to upload the postMortem file too: create it with crab -postMortem jobid and then upload the files specifying the jobid number, as in the example below.
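For example, for an aborted job number 3 (job number purely illustrative):

    crab -postMortem 3
    crab -uploadLog 3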
404 spiga 1.142
405     =head2 B<-validateCfg [fname]>
406 spiga 1.141
407     Parse the ParameterSet using the framework's Python API in order to perform a sanity check of the CMSSW configuration file.
408 fanzago 1.161 You have to create your task with crab -create and then validate the config file with crab -validateCfg.
409 spiga 1.141
410 spiga 1.142 =head2 B<-help [format] | -h [format]>
411 nsmirnov 1.1
412 slacapra 1.4 This help. It can be produced in four different I<formats>: I<man> (default), I<tex>, I<html> and I<txt>.
413 nsmirnov 1.1
414 spiga 1.142 =head2 B<-v>
415 nsmirnov 1.1
416 slacapra 1.4 Print the version and exit.
417 nsmirnov 1.1
418 spiga 1.142 =head2 B<range>
419 nsmirnov 1.1
420 slacapra 1.13 The range to be used in many of the above commands has the following syntax. It is a comma separated list of job ranges, each of which may be a job number, or a job range of the form first-last.
421 slacapra 1.4 Example: 1,3-5,8 = {1,3,4,5,8}
422 nsmirnov 1.1
423 spiga 1.142 =head1 OPTIONS
424 nsmirnov 1.1
425 spiga 1.142 =head2 B<-cfg [file]>
426 nsmirnov 1.1
427 slacapra 1.4 Configuration file name. Default is B<crab.cfg>.
428 nsmirnov 1.1
429 spiga 1.142 =head2 B<-debug [level]>
430 nsmirnov 1.1
431 slacapra 1.13 Set the debug level: a higher number for higher verbosity.
432 nsmirnov 1.1
433 slacapra 1.5 =head1 CONFIGURATION PARAMETERS
434    
435 spiga 1.25 All the parameters described in this section can be defined in the CRAB configuration file. The configuration file has different sections: [CRAB], [USER], etc. Each parameter must be defined in its proper section. An alternative way to pass a config parameter to CRAB is via the command line interface; the syntax is: crab -SECTION.key value . For example I<crab -USER.outputdir MyDirWithFullPath> .
436 slacapra 1.5 The parameters passed to CRAB at the creation step are stored, so they cannot be changed by changing the original crab.cfg. On the other hand, the task is protected from any accidental change. If you want to change any parameters, this requires the creation of a new task.
437 slacapra 1.6 Mandatory parameters are flagged with a *.
438 slacapra 1.5
439 spiga 1.142 =head2 B<[CRAB]>
440 slacapra 1.5
441 spiga 1.142 =head3 B<jobtype *>
442 slacapra 1.5
443 fanzago 1.164 The type of the job to be executed: I<cmssw> jobtypes are supported. No default value.
444 slacapra 1.6
445 fanzago 1.163 =head3 B<scheduler *>
446     The scheduler to be used: I<glite> or I<condor_g> (see the specific paragraph) are Grid schedulers to be used with the gLite or OSG middleware. In addition, there is an I<arc> scheduler to be used with the NorduGrid ARC middleware.
447 fanzago 1.164 From version 210, local schedulers are also supported, for the time being only at CERN: I<LSF> is the standard CERN local scheduler, while I<CAF> is LSF dedicated to the CERN Analysis Facilities. I<condor> is the scheduler to submit jobs to the US LPC CAF. No default value.
448 slacapra 1.5
449 spiga 1.142 =head3 B<use_server>
450 slacapra 1.81
451 fanzago 1.164 To use the server for job handling (recommended): 0=no (default), 1=yes. The server to be used will be found automatically from a list of available ones; it can also be specified explicitly by using I<server_name> (see below). Server usage is compulsory for tasks with more than 500 created jobs. Default value = 0.
452 slacapra 1.81
453 spiga 1.142 =head3 B<server_name>
454 mcinquil 1.35
455 slacapra 1.81 To use the CRAB-server support you need to fill this key with the server name as <Server_DOMAIN> (e.g. cnaf, fnal). If this is set, I<use_server> is set to true automatically.
456     If I<server_name=None> crab works in standalone mode, the same as using I<use_server=0> and no I<server_name>.
457 fanzago 1.164 The servers available to users can be found on the CRAB web page. No default value.
458 mcinquil 1.35
459 spiga 1.142 =head2 B<[CMSSW]>
460 slacapra 1.20
461 spiga 1.142 =head3 B<datasetpath *>
462 slacapra 1.20
463 fanzago 1.164 The path of the processed or analysis dataset as defined in DBS. It comes with the format I</PrimaryDataset/DataTier/Process[/OptionalADS]>. If no input is needed I<None> must be specified. When running on an analysis dataset, the job splitting must be specified by luminosity block rather than event. Analysis datasets are only treated accurately on a lumi-by-lumi level with CMSSW 3_1_x and later. No default value.
464 spiga 1.90
465 spiga 1.142 =head3 B<runselection *>
466 ewv 1.52
467 fanzago 1.164 Within a dataset you can restrict yourself to run on a specific run number or run number range, for example runselection=XYZ or runselection=XYZ1-XYZ2. A run number range will include both runs XYZ1 and XYZ2. Combining runselection with a lumi_mask runs on the intersection of the two lists. No default value.
468 afanfani 1.50
469 spiga 1.142 =head3 B<use_parent>
470 spiga 1.57
471 ewv 1.108 Within a dataset you can ask to run over the related parent files too. E.g., this will give you access to the RAW data while running over a RECO sample. With use_parent=1, CRAB determines the parent files from DBS and will add secondaryFileNames = cms.untracked.vstring( <LIST of parent files> ) to the pool source section of your parameter set.
472 fanzago 1.164 This setting is supposed to work both with splitting by lumis and splitting by events. Default value = 0.
473 spiga 1.57
474 spiga 1.142 =head3 B<pset *>
475 slacapra 1.20
476 fanzago 1.164 The python ParameterSet to be used. No default value.
477 slacapra 1.20
478 spiga 1.142 =head3 B<pycfg_params *>
479 ewv 1.111
480 belforte 1.182 These parameters are passed to the python config file, as explained in https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideAboutPythonConfigFile#Passing_Command_Line_Arguments_T
481 ewv 1.111
482 spiga 1.142 =head3 B<lumi_mask>
483 spiga 1.136
484 fanzago 1.164 The filename of a JSON file that describes which runs and lumis to process. CRAB will skip luminosity blocks not listed in the file. When using this setting, you must also use the split by lumi settings rather than split by event as described below. Combining runselection with a lumi_mask runs on the intersection of the two lists. No default value.
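As an illustration, the JSON file maps each run to a list of inclusive lumi ranges; a minimal sketch with hypothetical run and lumi numbers:

    {"190456": [[1, 40], [78, 201]], "190457": [[1, 35]]}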
485 spiga 1.136
486 ewv 1.148 =head3 B<Splitting jobs by Lumi>
487 slacapra 1.20
488 spiga 1.142 =over 4
489 slacapra 1.22
490 ewv 1.148 =item B<NOTE: Exactly two of these three parameters must be used: total_number_of_lumis, lumis_per_job, number_of_jobs.> Split by lumi (or by run, explained below) is required for real data. Because jobs in split by lumi mode process entire files rather than partial files, you will often end up with fewer jobs processing more lumis than you are expecting. Additionally, a single job cannot analyze files from multiple blocks in DBS. All job splitting parameters in split by lumi mode are "advice" to CRAB rather than determinative. A configuration sketch is shown after these parameters.
491 slacapra 1.22
492 spiga 1.142 =back
493 slacapra 1.22
494 ewv 1.148 =head4 B<total_number_of_lumis *>
495    
496 fanzago 1.164 The number of luminosity blocks to be processed. -1 for processing a whole dataset. Your task will process this many lumis regardless of how the jobs are actually split up. If you do not specify this, the total number of lumis processed will be number_of_jobs x lumis_per_job. No default value
497 ewv 1.148
498     =head4 B<lumis_per_job *>
499 ewv 1.108
500 fanzago 1.164 The number of luminosity blocks to be accessed by each job. Since a job cannot access less than a whole file, it may be that the actual number of lumis per job is more than you asked for. No default value.
501 ewv 1.108
502 ewv 1.148 =head4 B<number_of_jobs *>
503 ewv 1.108
504 fanzago 1.164 Define the number of jobs to be run for the task. This parameter is common between split by lumi and split by event modes. In split by lumi mode, the number of jobs will only approximately match this value. No default value
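For instance, to process a whole dataset with roughly 50 lumis per job, one might set exactly two of the three parameters (a sketch):

    [CMSSW]
    total_number_of_lumis = -1
    lumis_per_job = 50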
505 ewv 1.108
506 ewv 1.148 =head3 B<Splitting jobs by Event>
507 slacapra 1.22
508 spiga 1.142 =over 4
509 slacapra 1.22
510 fanzago 1.164 =item B<NOTE: Exactly two of these three parameters must be used: total_number_of_events, events_per_job, number_of_jobs.> Otherwise CRAB will complain. Only MC data can be split by event. No default value
511 spiga 1.142
512     =back
513 slacapra 1.22
514 ewv 1.148 =head4 B<total_number_of_events *>
515 spiga 1.142
516 fanzago 1.164 The number of events to be processed. To access all available events, use I<-1>. Of course, the latter option is not viable in case of no input. In this case, the total number of events will be used to split the task in jobs, together with I<events_per_job>. No default value.
517 spiga 1.142
518 ewv 1.148 =head4 B<events_per_job*>
519 spiga 1.142
520 fanzago 1.164 The number of events to be accessed by each job. Since a job cannot cross the boundary of a fileblock it might be that the actual number of events per job is not exactly what you asked for. It can be used also with no input. No default value.
521 spiga 1.142
522 ewv 1.148 =head4 B<number_of_jobs *>
523 spiga 1.142
524 fanzago 1.164 Define the number of jobs to be run for the task. The number of events for each job is computed taking into account the total number of events required as well as the granularity of EventCollections. It can also be used with no input. No default value.
525 spiga 1.142
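For instance, to generate 100000 MC events in 50 jobs of about 2000 events each, one might set (a sketch):

    [CMSSW]
    total_number_of_events = 100000
    number_of_jobs = 50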
526 ewv 1.149 =head4 B<split_by_event *>
527    
528 fanzago 1.164 This setting is for experts only. If you don't know why you want to use it, you don't want to use it. Set the value to 1 to enable split by event on data. CRAB then behaves like old versions of CRAB which did not enforce split by lumi for data. Default value = 0.
529 ewv 1.149
530 spiga 1.142 =head3 B<split_by_run>
531 spiga 1.90
532 fanzago 1.164 To activate run-based splitting (each job will access a different run) use I<split_by_run>=1. You can also define I<number_of_jobs> and/or I<runselection>. NOTE: run-based splitting combined with event-based splitting is not available. Default value = 0.
533 spiga 1.90
534 spiga 1.142 =head3 B<output_file *>
535 slacapra 1.22
536 fanzago 1.164 The output files produced by your application (comma separated list). From CRAB 2_2_2 onward, if TFileService is defined in the user Pset, the corresponding output file is automatically added to the list of output files. The user can avoid this by setting B<skip_TFileService_output> = 1 (default is 0 == file included). The EDM output produced via PoolOutputModule can be automatically added by setting B<get_edm_output> = 1 (default is 0 == no). B<Warning>: it is not allowed to have a PoolOutputModule and not save its output somewhere, since it is a waste of resources on the WN. In case you really want to do that, and if you really know what you are doing (hint: you don't!), you can use I<ignore_edm_output=1>. No default value.
537 slacapra 1.61
538 spiga 1.142 =head3 B<skip_TFileService_output>
539 slacapra 1.61
540 fanzago 1.164 Force CRAB to skip the inclusion of the file produced by TFileService in the list of output files. Default value = 0, namely the file is included.
541 slacapra 1.20
542 spiga 1.142 =head3 B<get_edm_output>
543 slacapra 1.63
544 fanzago 1.164 Force CRAB to add the EDM output file, as defined in the PSet PoolOutputModule (if any), to the list of output files. Default value = 0 (= no inclusion).
545 slacapra 1.63
546 spiga 1.142 =head3 B<increment_seeds>
547 ewv 1.47
548 spiga 1.142 Specifies a comma separated list of seeds to increment from job to job. The initial value is taken
549 ewv 1.47 from the CMSSW config file. I<increment_seeds=sourceSeed,g4SimHits> will set sourceSeed=11,12,13 and g4SimHits=21,22,23 on
550     subsequent jobs if the values of the two seeds are 10 and 20 in the CMSSW config file.
551    
552     See also I<preserve_seeds>. Seeds not listed in I<increment_seeds> or I<preserve_seeds> are randomly set for each job.
553    
554 spiga 1.142 =head3 B<preserve_seeds>
555 ewv 1.47
556 spiga 1.142 Specifies a comma separated list of seeds to which CRAB will not change from their values in the user
557 ewv 1.47 CMSSW config file. I<preserve_seeds=sourceSeed,g4SimHits> will leave the Pythia and GEANT seeds the same for every job.
558    
559     See also I<increment_seeds>. Seeds not listed in I<increment_seeds> or I<preserve_seeds> are randomly set for each job.
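A sketch combining the two, using the seed names from the examples above:

    [CMSSW]
    increment_seeds = sourceSeed
    preserve_seeds = g4SimHits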
560    
561 spiga 1.142 =head3 B<first_lumi>
562 slacapra 1.30
563 spiga 1.142 Relevant only for Monte Carlo production for which it defaults to 1. The first job will generate events with this lumi block number, subsequent jobs will
564 ewv 1.133 increment the lumi block number. Setting this number to 0 (not recommended) means CMSSW will not be able to read multiple such files as they
565     will all have the same run, lumi and event numbers. This check in CMSSW can be bypassed by setting
566 ewv 1.118 I<process.source.duplicateCheckMode = cms.untracked.string('noDuplicateCheck')> in the input source, should you need to
567 fanzago 1.164 read files produced without setting first_run (in old versions of CRAB) or first_lumi. Default value = 1.
568 slacapra 1.30
569 spiga 1.142 =head3 B<generator>
570 ewv 1.79
571 spiga 1.142 Name of the generator your MC job is using. Some generators require CRAB to skip events, others do not.
572 ewv 1.104 Possible values are pythia (default), comphep, lhe, and madgraph. This will skip events in your generator input file.
573 ewv 1.78
574 spiga 1.142 =head3 B<executable>
575 slacapra 1.30
576 fanzago 1.164 The name of the executable to be run on the remote WN. The default is cmsRun. The executable is either to be found in the release area of the WN, or has been built in the user working area on the UI and is (automatically) shipped to the WN. If you want to run a script (which might internally call I<cmsRun>), use B<USER.script_exe> instead. Default value = cmsRun.
577 slacapra 1.30
578 spiga 1.142 =head3 I<DBS and DLS parameters:>
579 slacapra 1.30
580 spiga 1.142 =head3 B<dbs_url>
581 slacapra 1.6
582 fanzago 1.164 The URL of the DBS query page. For experts only. Default value: the global DBS http://cmsdbsprod.cern.ch/cms_dbs_prod_global/servlet/DBSServlet
583 slacapra 1.13
584 spiga 1.142 =head3 B<show_prod>
585 spiga 1.84
586 ewv 1.98 To enable CRAB to show data hosted on Tier1s sites specify I<show_prod> = 1. By default those data are masked.
587 spiga 1.86
588 spiga 1.142 =head3 B<subscribed>
589 spiga 1.116
590     By setting the flag I<subscribed> = 1 only the replicas that are subscribed to their site are considered. The default is to return all replicas. The intended use of this flag is to avoid sending jobs to sites based on data that is being moved or deleted (and thus not subscribed).
591    
592 spiga 1.142 =head3 B<no_block_boundary>
593 spiga 1.86
594 fanzago 1.164 To remove fileblock boundaries in job splitting specify I<no_block_boundary> = 1. Default value = 0.
595 spiga 1.84
596 spiga 1.142 =head2 B<[USER]>
597 slacapra 1.13
598 spiga 1.142 =head3 B<additional_input_files>
599 slacapra 1.6
600 fanzago 1.164 Any additional input file you want to ship to WN: comma separated list. IMPORTANT NOTE: they will be placed in the WN working dir, and not in ${CMS_SEARCH_PATH}. Specific files required by CMSSW application must be placed in the local data directory ($CMSSW_BASE/src/data), which will be automatically shipped by CRAB itself, without specifying them as additional_input_files. You do not need to specify the I<ParameterSet> you are using, which will be included automatically. Wildcards are allowed. No default value.
601 slacapra 1.6
602 spiga 1.142 =head3 B<script_exe>
603 slacapra 1.31
604 belforte 1.147 A user script that will be run on the WN (instead of the default cmsRun). It is up to the user to set up the script properly to run in the WN environment. CRAB guarantees that the CMSSW environment is set up (e.g. scram is in the path) and that the modified pset.py will be placed in the working directory, with name pset.py. The user must ensure that a properly named job report will be written; this can be done e.g. by calling cmsRun within the script as "cmsRun -j $RUNTIME_AREA/crab_fjr_$NJob.xml -p pset.py". The script itself will be added automatically to the input sandbox, so the user MUST NOT add it to B<USER.additional_input_files>.
605 fanzago 1.163 Arguments: CRAB automatically passes the job index as the first argument of script_exe.
606 fanzago 1.164 The MaxEvents number is set by CRAB in the environment variable "$MaxEvents", so the script can read this value directly from there. No default value.
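A minimal sketch of such a script (the echo line is purely illustrative; the cmsRun call follows the description above):

    #!/bin/sh
    NJob=$1    # CRAB passes the job index as the first argument
    echo "processing up to $MaxEvents events"    # set by CRAB in the environment
    cmsRun -j $RUNTIME_AREA/crab_fjr_$NJob.xml -p pset.py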
607 slacapra 1.31
608 spiga 1.142 =head3 B<script_arguments>
609 spiga 1.105
610     Any arguments you want to pass to the B<USER.script_exe>: comma separated list.
611 fanzago 1.163 CRAB automatically passes the job index as the first argument of script_exe.
612 fanzago 1.164 The MaxEvents number is set by CRAB in the environment variable "$MaxEvents". So the script can read this value directly from there. No default value.
613 spiga 1.105
614 spiga 1.142 =head3 B<ui_working_dir>
615 slacapra 1.6
616 fanzago 1.164 Name of the working directory for the current task. By default, a name I<crab_0_(date)_(time)> will be used. If this card is set, any CRAB command which requires I<-continue> also needs to specify the name of the working directory. A special syntax is also possible, to reuse the name of the dataset provided before: I<ui_working_dir : %(dataset)s> . In this case, if e.g. the dataset is SingleMuon, the ui_working_dir will be set to SingleMuon as well. Default value = crab_0_(date)_(time).
617 slacapra 1.6
618 spiga 1.142 =head3 B<thresholdLevel>
619 mcinquil 1.35
620 fanzago 1.164 This has to be a value between 0 and 100 that indicates the percentage of task completeness (jobs in an ended state count as complete, even if failed). The server will notify the user by e-mail (see the field B<eMail>) when the task reaches the specified threshold. Works only when using the server. Default value = 100.
621 mcinquil 1.35
622 spiga 1.142 =head3 B<eMail>
623 mcinquil 1.35
624 fanzago 1.164 The server will notify the specified e-mail address when the task reaches the specified B<thresholdLevel>. A notification is also sent when the task reaches 100% completeness. This field can also be a list of e-mail addresses: "B<eMail = user1@cern.ch, user2@cern.ch>". Works only when using the server. No default value.
625 mcinquil 1.126
626 spiga 1.142 =head3 B<client>
627 mcinquil 1.126
628 fanzago 1.163 Specify the client storage protocol that can be used to interact with the server in B<CRAB.server_name>. The default is the value in the server configuration.
629 mcinquil 1.35
630 spiga 1.142 =head3 B<return_data *>
631 slacapra 1.6
632 fanzago 1.164 The output produced by the executable on the WN is returned (via output sandbox) to the UI by issuing the I<-getoutput> command. B<Warning>: this option should be used only for I<small> outputs, say less than 10MB, since the sandbox cannot accommodate big files. Depending on the Resource Broker used, a size limit on the output sandbox can be applied: bigger files will be truncated. To be used as an alternative to I<copy_data>. Default value = 0.
633 slacapra 1.6
634 spiga 1.142 =head3 B<outputdir>
635 slacapra 1.6
636 belforte 1.167 To be used together with I<return_data>. Directory on the user interface where the output is stored. Full path is mandatory, "~/" is not allowed: the default location of returned output is ui_working_dir/res . BEWARE: does not work with scheduler=CAF
637 slacapra 1.6
638 spiga 1.142 =head3 B<logdir>
639 slacapra 1.6
640 ewv 1.52 To be used together with I<return_data>. Directory on the user interface where the standard output and error are stored. Full path is mandatory, "~/" is not allowed: the default location of returned output is ui_working_dir/res .
641 slacapra 1.6
642 spiga 1.142 =head3 B<copy_data *>
643 slacapra 1.6
644 fanzago 1.164 The output (only the file produced by the analysis executable, not the std-out and err) is copied to a Storage Element of your choice (see below). To be used as an alternative to I<return_data> and recommended in case of large output. Default value = 0.
645 slacapra 1.6
646 spiga 1.142 =head3 B<storage_element>
647 slacapra 1.6
648 fanzago 1.71 To be used with <copy_data>=1
649 belforte 1.178 If you want to copy the output of your analysis to an official CMS Tier2 or Tier3, you have to write the CMS Site Name of the site, as written in SiteDB at https://cmsweb.cern.ch/sitedb/prod/sites (e.g. T2_IT_Legnaro). You also have to specify the <user_remote_dir> (see below).
650 fanzago 1.71
651 fanzago 1.164 If you want to copy the output to a non-official CMS remote site you have to specify the complete storage element name (e.g. se.xxx.infn.it). You also have to specify the <storage_path>, and the <storage_port> if you do not use the default one (see below). No default value.
652 fanzago 1.71
653 spiga 1.142 =head3 B<user_remote_dir>
654 fanzago 1.71
655     To be used with <copy_data>=1 and <storage_element> official CMS sites.
656 ewv 1.104 This is the directory or tree of directories where your output will be stored. This directory will be created under the mountpoint (which will be discovered by CRAB if an official CMS Storage Element is used, or taken from the crab.cfg as specified by the user). B<NOTE>: this part of the path will be used as the logical file name of your files in the case of publication without using an official CMS Storage Element. Generally it should start with "/store".
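For instance, to copy the output to an official CMS site (site name and directory purely illustrative):

    [USER]
    copy_data = 1
    storage_element = T2_IT_Legnaro
    user_remote_dir = /store/user/myname/myanalysis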
657 slacapra 1.6
658 spiga 1.142 =head3 B<storage_path>
659 slacapra 1.6
660 fanzago 1.71 To be used with <copy_data>=1 and <storage_element> not official CMS sites.
661     This is the full path of the Storage Element writeable by all, i.e. the mountpoint of the SE (e.g. /srm/managerv2?SFN=/pnfs/se.xxx.infn.it/yyy/zzz/).
662 fanzago 1.164 No default value.
663 fanzago 1.71
664 spiga 1.142 =head3 B<storage_port>
665 spiga 1.70
666 fanzago 1.164 To choose the storage port specify I<storage_port> = N. Default value = 8443.
667 spiga 1.70
668 fanzago 1.165 =head3 B<caf_lfn>
669     Running at CAF, you can decide in which mountpoint to copy your output, by selecting the first part of the LFN.
670     The default value is /store/caf/user.
671     To test the EOS area you can use caf_lfn = /store/eos/user
672    
673 spiga 1.142 =head3 B<local_stage_out *>
674 fanzago 1.101
675 fanzago 1.164 This option enables the local stage out of produced output to the "close storage element" where the job is running, in case of failure of the remote copy to the Storage Element chosen by the user in the crab.cfg. It has to be used with the copy_data option. In the case of a backup copy, the publication of data is forbidden. Set I<local_stage_out> = 1. Default value = 0.
676 fanzago 1.101
677 spiga 1.142 =head3 B<publish_data*>
678 fanzago 1.71
679     To be used with <copy_data>=1
680     To publish your produced output in a local instance of DBS, set publish_data = 1.
681 fanzago 1.77 All the details about how to use this functionality are written in https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrabForPublication
682 ewv 1.78 N.B. 1) if you are using an official CMS site to store data, the remote dir will not be considered. The directory where data will be stored is decided by CRAB, following the CMS policy, in order to be able to re-read published data.
683     2) if you are using a non-official CMS site to store data, you have to check the <lfn> that will be part of the logical file name of your published files, in order to be able to re-read the data.
684 fanzago 1.164 Default value = 0.
685 fanzago 1.71
686 spiga 1.142 =head3 B<publish_data_name>
687 fanzago 1.71
688 fanzago 1.164 Your produced output will be published in your local DBS with dataset name <primarydataset>/<publish_data_name>/USER. No default value.
689 fanzago 1.71
690 spiga 1.142 =head3 B<dbs_url_for_publication>
691 fanzago 1.71
692 fanzago 1.164 Specify the URL of your local DBS instance where CRAB has to publish the output files. No default value.
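A sketch of the publication settings (dataset name and URL purely illustrative):

    [USER]
    copy_data = 1
    publish_data = 1
    publish_data_name = myanalysis_v1
    dbs_url_for_publication = https://your-local-dbs.example.org/servlet/DBSServlet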
693 fanzago 1.71
694 spiga 1.93
695 spiga 1.142 =head3 B<xml_report>
696 spiga 1.51
697 fanzago 1.164 To be used to switch off the screen report during the status query, enabling the DB serialization in a file. By specifying I<xml_report> = FileName, CRAB will serialize the DB into CRAB_WORKING_DIR/share/FileName. No default value.
698 slacapra 1.6
699 spiga 1.142 =head3 B<usenamespace>
700 spiga 1.55
701 fanzago 1.164 To use the automated namespace definition (performed by CRAB) it is possible to set I<usenamespace>=1. The same policy used for the stage out in case of data publication will be applied. Default value = 0.
702 spiga 1.54
703 spiga 1.142 =head3 B<debug_wrapper>
704 spiga 1.55
705 fanzago 1.164 To enable a higher verbosity level in the wrapper, specify I<debug_wrapper> = 1. The Pset contents before and after the CRAB manipulation will be written, together with other useful info. Default value = 0.
706 spiga 1.54
707 spiga 1.142 =head3 B<deep_debug>
708 spiga 1.75
709 ewv 1.78 To be used in case of an unexpected job crash when the stdout and stderr files are lost. By submitting the same jobs again with I<deep_debug> = 1, these files will be reported back. NOTE: it works only in standalone mode, for debugging purposes.
710 spiga 1.75
711 spiga 1.142 =head3 B<dontCheckSpaceLeft>
712 slacapra 1.68
713     Set it to 1 to skip the check of free space left on your working directory before attempting to get the output back. Default is 0 (=False)
714    
715 spiga 1.142 =head3 B<check_user_remote_dir>
716 spiga 1.124
717 fanzago 1.164 To avoid stage out failures CRAB checks the remote location content at the creation time. By setting I<check_user_remote_dir>=0 crab will skip the check. Default value = 0.
718 spiga 1.124
719 belforte 1.159 =head3 B<tasktype>
720    
721 fanzago 1.164 Expert only parameter. Not to be used. Default value = analysis.
722 belforte 1.159
723 belforte 1.177 =head3 B<ssh_control_persist>
724    
725     Expert only parameter. Not to be used. Default value = 3600. Behaves like ControlPersist in ssh_config but time is only supported in seconds.
726    
727 spiga 1.142 =head2 B<[GRID]>
728 nsmirnov 1.1
729 belforte 1.172 In square brackets is the name of the scheduler(s) a parameter applies to, in case it does not apply to all.
730    
731     =head3 B<RB [glite]>
732 slacapra 1.6
733 fanzago 1.164 Which WMS you want to use instead of the default one, as defined in the configuration file automatically downloaded by CRAB from the CMSDOC web page. You can use any other WMS which is available, if you provide the proper configuration files. E.g., for gLite WMS XYZ, you should provide I<0_GET_glite_wms_XYZ.conf>, where XYZ is the RB value. These files are searched for in the cache dir (~/.cms_crab) and, if not found, on the cmsdoc web page. So, if you put your private configuration files in the cache dir, they will be used, even if they are not available on the crab web page.
734     Please get in contact with crab team if you wish to provide your WMS as a service to the CMS community.
735 slacapra 1.6
736 belforte 1.172 =head3 B<role [glite]>
737 slacapra 1.26
738 belforte 1.168 The role to be set in the VOMS. Beware that simultaneous use of I<role> and I<group> is not supported. See the VOMS documentation for more info. No default value.
739 slacapra 1.26
740 belforte 1.172 =head3 B<group [glite]>
741 slacapra 1.27
742 belforte 1.168 The group to be set in the VOMS. Beware that simultaneous use of I<role> and I<group> is not supported. See the VOMS documentation for more info. No default value.
743 slacapra 1.27
744 spiga 1.142 =head3 B<dont_check_proxy>
745 slacapra 1.28
746 ewv 1.52 Set this if you do not want CRAB to check your proxy. The creation of the proxy (with proper length) and its delegation to a myproxy server are your responsibility.
747 slacapra 1.28
748 spiga 1.142 =head3 B<dont_check_myproxy>
749 spiga 1.95
750 fanzago 1.164 If you want to switch off only the proxy renewal, set I<dont_check_myproxy>=1. The proxy delegation to a myproxy server is your responsibility. Default value = 0.
751 spiga 1.95
752 belforte 1.172 =head3 B<requirements [glite]>
753 slacapra 1.6
754 fanzago 1.164 Any other requirements to be added to the JDL. Must be written in compliance with the JDL syntax (see the LCG user manual for further info). No requirement on the Computing Element must be set. No default value.
755 slacapra 1.6
756 belforte 1.172 =head3 B<additional_jdl_parameters [glite, remoteGlidein]>
757 slacapra 1.27
758 spiga 1.48 Any other parameters you want to add to the jdl file: semicolon separated list; each
759 belforte 1.172 item in the list must end with a ";", including the closing one. No default value.
760     Works both for gLite and remoteGlidein.
761 spiga 1.48
762 belforte 1.172 =head3 B<wms_service [glite]>
763 spiga 1.48
764 fanzago 1.164 With this field it is also possible to specify which WMS you want to use (https://hostname:port/pathcode), where "hostname" is the WMS name, "port" is generally 7443, and "pathcode" should be something like "glite_wms_wmproxy_server". No default value.
765 slacapra 1.27
766 spiga 1.142 =head3 B<max_cpu_time>
767 slacapra 1.6
768 fanzago 1.164 Maximum CPU time needed to finish one job. It will be used to select a suitable queue on the CE. Time in minutes. Default value = 130.
769 slacapra 1.6
770 spiga 1.142 =head3 B<max_wall_clock_time>
771 slacapra 1.6
772 fanzago 1.164 Same as the previous, but for real (wall clock) time, not CPU time. No default value.
773 slacapra 1.6
774 belforte 1.172 =head3 B<max_rss [remoteGlidein]>
775    
776     Maximum Resident Set Size (memory) needed for one job. It will be used to select a suitable queue on the CE and to adjust the crab watchdog. Memory needed in MBytes. Default value = 2300.
777    
778     =head3 B<ce_black_list [glite]>
779 slacapra 1.6
780 ewv 1.66 All the CE (Computing Element) whose name contains the following strings (comma separated list) will not be considered for submission. Use the dns domain (e.g. fnal, cern, ifae, fzk, cnaf, lnl,....). You may use hostnames or CMS Site names (T2_DE_DESY) or substrings.
781 fanzago 1.164 By default, T0 and T1 sites are in the blacklist.
782 slacapra 1.6
783 belforte 1.172 =head3 B<ce_white_list [glite]>
784 slacapra 1.6
785 ewv 1.66 Only the CE (Computing Element) whose name contains the following strings (comma separated list) will be considered for submission. Use the dns domain (e.g. fnal, cern, ifae, fzk, cnaf, lnl,....). You may use hostnames or CMS Site names (T2_DE_DESY) or substrings. Please note that if the selected CE(s) does not contain the data you want to access, no submission can take place.
786 slacapra 1.27
787 belforte 1.172 =head3 B<se_black_list [glite,glidein,remoteGlidein]>
788 slacapra 1.27
789 ewv 1.66 All the SE (Storage Element) whose name contains the following strings (comma separated list) will not be considered for submission. It works only if a datasetpath is specified. You may use hostnames or CMS Site names (T2_DE_DESY) or substrings.
790 fanzago 1.164 By default, T0 and T1 sites are in the blacklist.
791 slacapra 1.27
792 belforte 1.172 =head3 B<se_white_list [glite,glidein,remoteGlidein]>
793 slacapra 1.27
794 ewv 1.66 Only the SE (Storage Element) whose name contains the following strings (comma separated list) will be considered for submission. It works only if a datasetpath is specified. Please note that if the selected SE(s) do not contain the data you want to access, no submission can take place. You may use hostnames or CMS Site names (T2_DE_DESY) or substrings.
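For instance (site names purely illustrative):

    [GRID]
    se_white_list = T2_DE_DESY
    ce_black_list = cern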
795 slacapra 1.6
796 belforte 1.172 =head3 B<remove_default_blacklist [glite]>
797 spiga 1.73
798 fanzago 1.164 CRAB enforces the T1 Computing Elements black list. By default it is appended to the user defined I<ce_black_list>. To remove the enforced T1 black list, set I<remove_default_blacklist>=1. Default value = 0.
799 spiga 1.73
800 belforte 1.172 =head3 B<data_location_override [remoteGlidein]>
801    
802 belforte 1.181 Overrides the data location list obtained from DLS/PhEDEx with the list of sites indicated. Same syntax as se_white_list. Up to the user to make sure that needed data can be read nevertheless. Note: ONLY WORKS INSIDE crab.cfg at crab -create time, not when issued in the command line as crab -submit -GRID.data_location_override=...
803 belforte 1.172
804     =head3 B<allow_overflow [remoteGlidein]>
805    
806     Tells glidein whether it can overflow this job, i.e. run it at another site and access data via xrootd if the sites where the data are located are full. Set to 0 to disallow overflow. Default value = 1.
807    
808 ewv 1.146 =head2 B<[LSF]> or B<[CAF]> or B<[PBS]> or B<[SGE]>
809 slacapra 1.46
810 spiga 1.142 =head3 B<queue>
811 slacapra 1.46
812 mcinquil 1.123 The LSF/PBS queue you want to use: if none, the default one will be used. For CAF, the proper queue will be automatically selected.
813 slacapra 1.46
814 spiga 1.142 =head3 B<resource>
815 slacapra 1.46
816 mcinquil 1.123 The resources to be used within a LSF/PBS queue. Again, for CAF, the right one is selected.
817 slacapra 1.46
818 spiga 1.142 =head3 B<group>
819 spiga 1.141
820     The physics GROUP which the user belongs to (for example PHYS_SUSY). By specifying it, the LSF accounting and fair share per sub-group is done properly.
821    
822 nsmirnov 1.1 =head1 FILES
823    
824 slacapra 1.6 I<crab> uses a configuration file I<crab.cfg> which contains configuration parameters. This file is written in the INI-style. The default filename can be changed by the I<-cfg> option.
825 nsmirnov 1.1
826 slacapra 1.6 I<crab> creates by default a working directory 'crab_0_E<lt>dateE<gt>_E<lt>timeE<gt>'
827 nsmirnov 1.1
828     I<crab> saves all command lines in the file I<crab.history>.
829    
830 belforte 1.153 I<crab> downloads some configuration files from the internet and keeps cached copies in the ~/.cms_crab and ~/.cms_sitedbcache directories. The location of those caches can be redirected using the environment variables CMS_SITEDB_CACHE_DIR and CMS_CRAB_CACHE_DIR.
831    
832 nsmirnov 1.1 =head1 HISTORY
833    
834 ewv 1.52 B<CRAB> is a tool for CMS analysis in the Grid environment. It is based on the ideas from CMSprod, a production tool originally implemented by Nikolai Smirnov.
835 nsmirnov 1.1
836     =head1 AUTHORS
837    
838     """
839     author_string = '\n'
840     for auth in common.prog_authors:
841     #author = auth[0] + ' (' + auth[2] + ')' + ' E<lt>'+auth[1]+'E<gt>,\n'
842     author = auth[0] + ' E<lt>' + auth[1] +'E<gt>,\n'
843     author_string = author_string + author
844     pass
845     help_string = help_string + author_string[:-2] + '.'\
846     """
847    
848     =cut
849 slacapra 1.19 """
850 nsmirnov 1.1
851     pod = tempfile.mktemp()+'.pod'
852     pod_file = open(pod, 'w')
853     pod_file.write(help_string)
854     pod_file.close()
855    
856     if option == 'man':
857     man = tempfile.mktemp()
858     pod2man = 'pod2man --center=" " --release=" " '+pod+' >'+man
859     os.system(pod2man)
860     os.system('man '+man)
861     pass
862     elif option == 'tex':
863     fname = common.prog_name+'-v'+common.prog_version_str
864     tex0 = tempfile.mktemp()+'.tex'
865     pod2tex = 'pod2latex -full -out '+tex0+' '+pod
866     os.system(pod2tex)
867     tex = fname+'.tex'
868     tex_old = open(tex0, 'r')
869     tex_new = open(tex, 'w')
870     for s in tex_old.readlines():
871     if string.find(s, '\\begin{document}') >= 0:
872     tex_new.write('\\title{'+common.prog_name+'\\\\'+
873     '(Version '+common.prog_version_str+')}\n')
874     tex_new.write('\\author{\n')
875     for auth in common.prog_authors:
876     tex_new.write(' '+auth[0]+
877     '\\thanks{'+auth[1]+'} \\\\\n')
878     tex_new.write('}\n')
879     tex_new.write('\\date{}\n')
880     elif string.find(s, '\\tableofcontents') >= 0:
881     tex_new.write('\\maketitle\n')
882     continue
883     elif string.find(s, '\\clearpage') >= 0:
884     continue
885     tex_new.write(s)
886     tex_old.close()
887     tex_new.close()
888     print 'See '+tex
889     pass
890     elif option == 'html':
891     fname = common.prog_name+'-v'+common.prog_version_str+'.html'
892     pod2html = 'pod2html --title='+common.prog_name+\
893     ' --infile='+pod+' --outfile='+fname
894     os.system(pod2html)
895     print 'See '+fname
896     pass
897 slacapra 1.33 elif option == 'txt':
898     fname = common.prog_name+'-v'+common.prog_version_str+'.txt'
899     pod2text = 'pod2text '+pod+' '+fname
900     os.system(pod2text)
901     print 'See '+fname
902     pass
903 nsmirnov 1.1
904     sys.exit(0)