School of Physics & Astronomy
School of Physics and Astronomy Wiki


computing:department:unix:jobs:condor

Differences

This shows you the differences between two versions of the page.

computing:department:unix:jobs:condor [2015/12/14 16:28] allan
computing:department:unix:jobs:condor [2015/12/14 16:43] (current) – [Where do jobs run] allan
Line 35: Line 35:
  
 ==== Use vanilla environment ====
- 
  
 Unless you've specifically used ''condor_compile'' to build your programs, you'll need to submit your jobs in the "vanilla" universe.
Line 42: Line 41:
      
 ==== Limit email output ====
- 
  
   Notification = error
Line 48: Line 46:
 ==== Request slot resources ====
  
 +Almost all our condor slots are set up as "partitionable", which means you can request what resources are needed. Try to specify appropriate values in your submit file so that condor can reserve resources for your job (without this, your job will be given default values which may cause it to be held). For example:
 +  request_cpus = 1
 +  request_memory = 2048
  
-Almost all our condor slots are set up as "partitionable", which means you can request what resources are needed. Try to specify appropriate values in your submit file so that condor can reserve resources for your job (without this, your job will be given default values which may cause it to be held).
+==== Think about your data access ====
  
-  * ''request_cpus'' (defaults to 1)
-  * ''request_memory'' (defined in megabytes; defaults to the ImageSize or JobVMemory parameters).
-  * ''request_disk'' - (defined in kilobytes; defaults to the DiskUsage parameter. You probably don't need to worry about this one).
+Never use your home directory for job I/O; you should probably be using a ''/data'' volume.
+
+
  
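The settings above can be combined into one submit file. The following is a minimal sketch, not a tested template for this cluster: the ''/data/example/myuser'' paths, the script name and the argument are hypothetical placeholders, and the resource values are the ones from the example above (''request_memory'' is in megabytes).

```text
# Minimal submit file sketch; all paths and file names are hypothetical.
universe       = vanilla

# Keep the program and all job I/O on a /data volume, not in your home directory.
initialdir     = /data/example/myuser/job1
executable     = /data/example/myuser/run_analysis.sh
arguments      = input.dat

output         = job.out
error          = job.err
log            = job.log

# Only send email if the job fails.
notification   = error

# Resources to reserve from the partitionable slot (memory in MB).
request_cpus   = 1
request_memory = 2048

queue
```

Comments are kept on their own lines here rather than after a value, which is the safest form for a submit file.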
 ===== Where do jobs run =====
Line 63: Line 62:
   +CondorGroup = "cmsfarm"
  
-In additionl, you can let your jobs run on **any** cluster by adding the following line
+In addition, you can let your jobs run on **any** cluster, with the condition that they will be pre-empted (i.e. killed) if a job with a higher rank (based on the group owning the cluster) needs to run. You can enable this behavior with:
   +CanEvict = True
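Note that the leading ''+'' on the ''CondorGroup'' and ''CanEvict'' lines is part of the submit file syntax (it adds a custom attribute to the job ClassAd) and is not a diff marker. As a rough sketch, assuming the two settings are meant to be used together, the end of a submit file might look like this:

```text
# Tail of a submit file (sketch). Lines starting with "+" add custom
# attributes to the job ClassAd and must come before the queue statement.
+CondorGroup = "cmsfarm"
+CanEvict = True

queue
```

String values such as the group name need double quotes; the boolean ''True'' does not.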
  
-In this case your job can run on any cluster, but will be pre-empted (ie killed) if a job with a higher rank (based on the group owning the cluster) needs to run.
+===== Why won't my job run =====
  
 +Some commands to help analyze why your job isn't matching:
 +
 +  condor_q -analyze <jobid>
 +  condor_q -better <jobid>
 +  
 +To see why a job is not running on a particular machine:
 +  condor_q -better <jobid> -machine <name>
 +  condor_q -better:reverse -machine <name>
 +  
 +  
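If you run these checks often, they can be wrapped in a small shell function. This is only a sketch: the name ''why_idle'' and the ''CONDOR_Q'' override are hypothetical conveniences, not part of condor, and the function simply runs the ''condor_q'' commands listed above in order.

```shell
# Hypothetical helper: run the matching diagnostics for one job.
# Usage: why_idle <jobid> [machine]
# Set CONDOR_Q=echo to preview the arguments without contacting condor.
why_idle() {
    jobid=$1
    machine=${2:-}
    q=${CONDOR_Q:-condor_q}

    if [ -z "$jobid" ]; then
        echo "usage: why_idle <jobid> [machine]" >&2
        return 2
    fi

    # General matchmaking analysis for the job.
    "$q" -analyze "$jobid"
    "$q" -better "$jobid"

    # Optional: analysis against one specific machine.
    if [ -n "$machine" ]; then
        "$q" -better "$jobid" -machine "$machine"
    fi
}
```

For example, ''why_idle 1234.0 node01'' (job id and machine name made up here) runs all three checks, while ''CONDOR_Q=echo why_idle 1234.0'' only prints the arguments that would be passed to ''condor_q''.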
computing/department/unix/jobs/condor.1450132102.txt.gz · Last modified: 2015/12/14 16:28 by allan