24 |
|
-getoutput|-get [range] -- get back the output of all jobs: if a range is defined, only of the selected jobs. |
25 |
|
-extend -- Extend an existing task to run on new fileblocks, if any. |
26 |
|
-publish -- after the getoutput, publish the user data in a local DBS instance. |
27 |
+ |
-publishNoInp -- after the getoutput, publish the user data in the local DBS instance, removing the input data files. |
28 |
|
-checkPublication [dbs_url datasetpath] -- check if a dataset is published in a DBS. |
29 |
|
-kill [range] -- kill submitted jobs. |
30 |
|
-resubmit [range or all] -- resubmit killed/aborted/retrieved jobs. |
37 |
|
-report -- print a short report about the task. |
38 |
|
-list [range] -- show technical job details. |
39 |
|
-postMortem [range] -- provide a file with information useful for post-mortem analysis of the jobs. |
40 |
< |
-printId [range] -- print the job SID or Task Unique ID while using the server. |
40 |
> |
-printId [range or full] -- print the job SID or Task Unique ID while using the server (use full to get the SIDs). |
41 |
|
-createJdl [range] -- provide files with a complete Job Description (JDL). |
42 |
|
-validateCfg [fname] -- parse the ParameterSet using the framework's Python API. |
43 |
|
-cleanCache -- clean SiteDB and CRAB caches. |
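43 |
+ |
For example, a typical post-submission sequence could look like this (the job range and job id are illustrative): |
43 |
+ |
 ~>crab -getoutput 1-10 |
43 |
+ |
 ~>crab -resubmit 3 |
43 |
+ |
 ~>crab -publish |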
99 |
|
|
100 |
|
Source B<crab.(c)sh> from the CRAB installation area, which has been set up either by you or by someone else for you. |
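100 |
+ |
For example (the installation path is illustrative; use crab.csh under (t)csh): |
100 |
+ |
 ~>source /your/CRAB/install/area/crab.sh |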
101 |
|
|
102 |
< |
Modify the CRAB configuration file B<crab.cfg> according to your need: see below for a complete list. A template and commented B<crab.cfg> can be found on B<$CRABDIR/python/crab.cfg> |
102 |
> |
Modify the CRAB configuration file B<crab.cfg> according to your needs: see below for a complete list. Commented template configurations can be found in B<$CRABDIR/python/full_crab.cfg> (detailed cfg) and B<$CRABDIR/python/minimal_crab.cfg> (only basic parameters). |
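102 |
+ |
For example, you can start from the minimal template (sketch): |
102 |
+ |
 ~>cp $CRABDIR/python/minimal_crab.cfg crab.cfg |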
103 |
|
|
104 |
|
~>crab -create |
105 |
|
create all jobs (no submission!) |
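105 |
+ |
The created jobs can then be submitted in a separate step, e.g. with B<-submit> (sketch): |
105 |
+ |
 ~>crab -submit |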
246 |
|
=head1 HOW TO RUN ON NORDUGRID ARC |
247 |
|
|
248 |
|
The ARC scheduler can be used to submit jobs to sites running the NorduGrid |
249 |
< |
ARC grid middleware. To use it you'll need to have the ARC client |
249 |
> |
ARC grid middleware. To use it you need to have the ARC client |
250 |
|
installed. |
251 |
|
|
252 |
|
=head2 B<CRAB configuration for ARC mode:> |
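252 |
+ |
A minimal sketch of the relevant B<crab.cfg> fragment (values are illustrative): |
252 |
+ |
 [CRAB] |
252 |
+ |
 jobtype = cmssw |
252 |
+ |
 scheduler = arc |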
366 |
|
|
367 |
|
=head2 B<-match|-testJdl [range]> |
368 |
|
|
369 |
< |
Check if the job can find compatible resources. It is equivalent of doing I<edg-job-list-match> on edg. |
369 |
> |
Check if the job can find compatible resources. It is the equivalent of running I<glite-wms-job-list-match> with the gLite middleware. |
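369 |
+ |
For example (sketch): |
369 |
+ |
 ~>crab -match |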
370 |
|
|
371 |
|
=head2 B<-printId [range]> |
372 |
|
|
379 |
|
=head2 B<-postMortem [range]> |
380 |
|
|
381 |
|
Try to collect more information about the job from the scheduler's point of view. |
382 |
+ |
This is also the only way to obtain information about the failure reason of aborted jobs. |
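382 |
+ |
For example, for an aborted job (the job id is illustrative): |
382 |
+ |
 ~>crab -postMortem 5 |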
383 |
|
|
384 |
|
=head2 B<-list [range]> |
385 |
|
|
405 |
|
|
406 |
|
Uploaded files are: crab.log, crab.cfg, job logging info, summary file and a metadata file. |
407 |
|
If you specify the jobid, the job standard output and fjr will also be uploaded. Warning: in this case you need to run the getoutput first! |
408 |
< |
In the case of aborted jobs you can upload the postMortem file, creating it with crab -postMortem jobid and then uploading files specifying the jobid number. |
408 |
> |
In the case of aborted jobs you also have to upload the postMortem file: create it with crab -postMortem jobid and then upload the files, specifying the jobid number. |
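408 |
+ |
A sketch of the sequence for an aborted job, assuming the upload option documented in this section is B<-uploadLog> (the job id is illustrative): |
408 |
+ |
 ~>crab -postMortem 5 |
408 |
+ |
 ~>crab -uploadLog 5 |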
409 |
|
|
410 |
|
=head2 B<-validateCfg [fname]> |
411 |
|
|
447 |
|
|
448 |
|
The type of the job to be executed: the I<cmssw> jobtype is supported. |
449 |
|
|
450 |
< |
The scheduler to be used: I<glitecoll> is the more efficient grid scheduler and should be used. Other choice are I<glite>, same as I<glitecoll> but without bulk submission (and so slower) or I<condor_g> (see specific paragraph) or I<edg> which is the former Grid scheduler, which will be dismissed in some future. In addition, there's an I<arc> scheduler to be used with the NorduGrid ARC middleware. |
451 |
< |
From version 210, also local scheduler are supported, for the time being only at CERN. I<LSF> is the standard CERN local scheduler or I<CAF> which is LSF dedicated to CERN Analysis Facilities. |
450 |
> |
=head3 B<scheduler *> |
451 |
> |
The scheduler to be used: I<glite> or I<condor_g> (see the specific paragraph) are the Grid schedulers for the gLite or OSG middleware respectively. In addition, there is an I<arc> scheduler to be used with the NorduGrid ARC middleware. |
452 |
> |
From version 210, local schedulers are also supported, for the time being only at CERN. I<LSF> is the standard CERN local scheduler, while I<CAF> is LSF dedicated to the CERN Analysis Facility. I<condor> is the scheduler used to submit jobs to the US LPC CAF. |
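452 |
+ |
For example, in the [CRAB] section of B<crab.cfg> (sketch): |
452 |
+ |
 scheduler = glite |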
453 |
|
|
454 |
|
=head3 B<use_server> |
455 |
|
|
456 |
< |
To use the server for job handling (recommended) 0=no (default), 1=true. The server to be used will be found automatically from a list of available ones: it can also be specified explicitly by using I<server_name> (see below) |
456 |
> |
Whether to use the server for job handling (recommended): 0=no (default), 1=yes. The server to be used will be chosen automatically from a list of available ones; it can also be specified explicitly using I<server_name> (see below). Server usage is compulsory for tasks with more than 500 created jobs. |
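456 |
+ |
For example, in the [CRAB] section (sketch): |
456 |
+ |
 use_server = 1 |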
457 |
|
|
458 |
|
=head3 B<server_name> |
459 |
|
|
607 |
|
=head3 B<script_exe> |
608 |
|
|
609 |
|
A user script that will be run on the WN (instead of the default cmsRun). It is up to the user to set up the script properly so that it runs in the WN environment. CRAB guarantees that the CMSSW environment is set up (e.g. scram is in the path) and that the modified pset.py will be placed in the working directory, with the name pset.py. The user must ensure that a properly named job report is written; this can be done e.g. by calling cmsRun within the script as "cmsRun -j $RUNTIME_AREA/crab_fjr_$NJob.xml -p pset.py". The script itself will be added automatically to the input sandbox, so the user MUST NOT add it to B<USER.additional_input_files>. |
610 |
+ |
Arguments: CRAB automatically passes the job index as the first argument of script_exe. |
611 |
+ |
The MaxEvents number is set by CRAB in the environment variable "$MaxEvents", so the script can read this value directly from there. |
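611 |
+ |
A minimal sketch of such a script (the echo line is illustrative; the cmsRun line follows the pattern quoted above): |
611 |
+ |
 #!/bin/sh |
611 |
+ |
 # $1 is the job index passed by CRAB; $MaxEvents, $NJob and $RUNTIME_AREA come from the CRAB runtime |
611 |
+ |
 echo "Job $1 will process up to $MaxEvents events" |
611 |
+ |
 cmsRun -j $RUNTIME_AREA/crab_fjr_$NJob.xml -p pset.py |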
612 |
|
|
613 |
|
=head3 B<script_arguments> |
614 |
|
|
615 |
|
Any arguments you want to pass to B<USER.script_exe>, as a comma-separated list. |
616 |
+ |
CRAB automatically passes the job index as the first argument of script_exe. |
617 |
+ |
The MaxEvents number is set by CRAB in the environment variable "$MaxEvents", so the script can read this value directly from there. |
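617 |
+ |
For example (the values are illustrative): |
617 |
+ |
 script_arguments = myTag,42 |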
618 |
|
|
619 |
|
=head3 B<ui_working_dir> |
620 |
|
|
630 |
|
|
631 |
|
=head3 B<client> |
632 |
|
|
633 |
< |
Specify the client that can be used to interact with the server in B<CRAB.server_name>. The default is the value in the server configuration. |
633 |
> |
Specify the client storage protocol that can be used to interact with the server in B<CRAB.server_name>. The default is the value in the server configuration. |
634 |
|
|
635 |
|
=head3 B<return_data *> |
636 |
|
|
646 |
|
|
647 |
|
=head3 B<copy_data *> |
648 |
|
|
649 |
< |
The output (only that produced by the executable, not the std-out and err) is copied to a Storage Element of your choice (see below). To be used as an alternative to I<return_data> and recommended in case of large output. |
649 |
> |
The output (only the files produced by the analysis executable, not the stdout and stderr) is copied to a Storage Element of your choice (see below). To be used as an alternative to I<return_data>; recommended in case of large output. |
650 |
|
|
651 |
|
=head3 B<storage_element> |
652 |
|
|
665 |
|
To be used with I<copy_data>=1 when the I<storage_element> is not an official CMS site. |
666 |
|
This is the full path on the Storage Element, writable by all: the mountpoint of the SE (e.g. /srm/managerv2?SFN=/pnfs/se.xxx.infn.it/yyy/zzz/). |
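666 |
+ |
A sketch for a non-official site, reusing the example path above: |
666 |
+ |
 copy_data = 1 |
666 |
+ |
 storage_element = se.xxx.infn.it |
666 |
+ |
 # the parameter name below is an assumption; its own heading falls outside this excerpt |
666 |
+ |
 storage_path = /srm/managerv2?SFN=/pnfs/se.xxx.infn.it/yyy/zzz/ |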
667 |
|
|
661 |
- |
|
662 |
- |
=head3 B<storage_pool> |
663 |
- |
|
664 |
- |
If you are using CAF scheduler, you can specify the storage pool where to write your output. |
665 |
- |
The default is cmscafuser. If you do not want to use the default, you can overwrite it specifing None |
666 |
- |
|
668 |
|
=head3 B<storage_port> |
669 |
|
|
670 |
|
To choose the storage port, specify I<storage_port> = N (the default is 8443). |
690 |
|
Specify the URL of your local DBS instance where CRAB has to publish the output files. |
691 |
|
|
692 |
|
|
692 |
- |
=head3 B<srm_version> |
693 |
- |
|
694 |
- |
To choose the srm version specify I<srm_version> = (srmv1 or srmv2). |
695 |
- |
|
693 |
|
=head3 B<xml_report> |
694 |
|
|
695 |
|
To be used to switch off the screen report during the status query, enabling DB serialization to a file instead. If you specify I<xml_report> = FileName, CRAB will serialize the DB into CRAB_WORKING_DIR/share/FileName. |