This shows you the differences between two versions of the page.
computing:department:unix:jobs:condor [2015/12/14 16:15] – allan | computing:department:unix:jobs:condor [2015/12/14 16:43] (current) – [Where do jobs run] allan | ||
---|---|---|---|
Line 4: | Line 4: | ||
Condor is a software framework for distributed parallel computation. At Minnesota, we use it both to manage workload on dedicated server clusters, and (in some cases) to farm out work to idle desktop computers | Condor is a software framework for distributed parallel computation. At Minnesota, we use it both to manage workload on dedicated server clusters, and (in some cases) to farm out work to idle desktop computers | ||
- | Each Physics Linux system is tagged with a "cluster" name corresponding to the research group or entity to which it belongs. It is possible to run jobs on any cluster, but if you are unsure whether you should be using a particular group of machines, please [[mailto: | + | Each Physics Linux system is tagged with a "CondorGroup" name corresponding to the research group or entity to which it belongs. It is possible to run jobs on any cluster, but if you are unsure whether you should be using a particular group of machines, please [[mailto: |
< | < | ||
You can see which servers belong to each cluster, and what their capabilities are, on the [[https:// | You can see which servers belong to each cluster, and what their capabilities are, on the [[https:// | ||
Line 32: | Line 32: | ||
* [[http:// | * [[http:// | ||
- | ===== slot resources | + | ===== Submitting batch jobs ===== |
- | Almost all our condor slots are set up as " | + | ==== Use vanilla environment ==== |
- | * '' | + | Unless you've specifically used '' |
- | * '' | + | |
- | * '' | + | |
- | ===== Site Specific Information ===== | + | universe |
+ | |||
+ | ==== Limit email output ==== | ||
- | Our physics " | + | Notification = error |
- | +CondorGroup | + | ==== Request slot resources ==== |
- | You should read the manual page for '' | + | Almost all our condor slots are set up as " |
+ | request_cpus = 1 | ||
+ | request_memory = 2048 | ||
- | Some example | + | ==== Think about your data access ==== |
+ | |||
+ | Never use your home directory for job I/O - you should probably be using a ''/ | ||
+ | |||
+ | ===== Where do jobs run ===== | ||
+ | |||
+ | By default, the cluster which executes a job is determined by the machine where you issue the '' | ||
+ | |||
+ | You can override this behavior in your submit file by manipulating the CondorGroup job ClassAd. For example, you could place the following line in your job file to make the jobs run on the CMS server farm: | ||
+ | |||
+ | +CondorGroup = " | ||
- | | + | In addition, you can let your jobs run on **any** cluster, with the condition that they will be pre-empted |
- | | + | |
- | | + | |
- | | + | |
- | * novafarm - NOvA servers | + | |
- | | + | |
+ | ===== Why won't my job run ===== | ||
- | Anyone may submit jobs to " | + | Some commands |
- | We currently support the vanilla, standard, and MPI universes. | + | condor_q -analyze < |
+ | condor_q -better < | ||
+ | |||
+ | Why is a job not running on a particular machine | ||
+ | condor_q -better < | ||
+ | condor_q -better: | ||
+ | |||
+ | |
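
Putting the submit settings from the new revision together, a submit file might look roughly like the sketch below. It is an illustration under stated assumptions, not a site-approved template: the executable name ''my_job.sh'' and the output, error and log file names are hypothetical, and the ''+CondorGroup'' line is left commented out with a placeholder because the correct group name depends on which cluster you are entitled to use.

<code>
# Minimal HTCondor submit description (sketch; all file names are hypothetical)
universe       = vanilla
executable     = my_job.sh
output         = my_job.$(Cluster).$(Process).out
error          = my_job.$(Cluster).$(Process).err
log            = my_job.$(Cluster).log

# Only send email if the job ends with an error
notification   = error

# Resources requested from the slot (request_memory is in MB)
request_cpus   = 1
request_memory = 2048

# Optional: override the default cluster.
# Replace the placeholder with a real group name before uncommenting.
# +CondorGroup = "<group>"

queue
</code>

A file like this would be submitted with ''condor_submit'' followed by the file name, and the resulting job can then be inspected with ''condor_q''.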