On Fri, 28 Oct 2011, Krisztian Krajczar wrote:
> The mps_stat.pl command reports that the Pede job for the alignment of ideal
> geometry failed. However, there are outputs produced in the directory you
> indicated in your earlier email.
>
> I have checked the Pede dump in search for any errors, but found no errors.

Hi Krisztian,

(adding Claus as pede expert, asking for advice at the end)

indeed this is the first file to look into. And it does not look healthy:
it simply stops at some point. The last line should be something like

 < Millepede II-P ending ... Wed Oct 26 22:52:11 2011

as in mp0896/jobData/jobm/pede.dump. MPS looks for that line and reports
failure since it is not there.

> The memory usage was normal, although it was slightly higher than for the
> previous alignments:
>
> Memory space: total 32.000000 GB
>               used  31.226771 GB = 97.58 %
>
> In STDOUT I found a possible source of the "fail" report of the mps_stat.pl.
> One of the automatic root macros failed to run:
>
> ---------
> Processing readPedeHists.C+("print nodraw")...
> Info in : creating shared library
> /pool/lsf/krajczar/182146920/./readPedeHists_C.so
> Error in : failed reading x-y-dx-dy content
> ---------

Before that I see

sh: line 1: 27036 CPU time limit exceeded /afs/cern.ch/user/c/ckleinw/bin/rev81/pede_32GB pedeSteerMaster.txt > pede.dump

and that tells us the reason why pede did not run through - it is a serious
problem!
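The two checks described above (is the end line present in pede.dump, and does STDOUT report an exceeded CPU limit) can be sketched in a few lines of Python. This is only an illustration, not what mps_stat.pl actually does: the function names are made up, and the marker strings are simply copied from the lines quoted in this mail.

```python
# Marker strings copied from the log lines quoted in this mail (assumption:
# they appear literally like this in other jobs as well).
END_MARKER = "Millepede II-P ending"
CPU_LIMIT_MARKER = "CPU time limit exceeded"


def pede_finished(dump_text):
    """True if the last non-empty line of pede.dump contains the end marker."""
    lines = [line for line in dump_text.splitlines() if line.strip()]
    return bool(lines) and END_MARKER in lines[-1]


def hit_cpu_limit(stdout_text):
    """True if any line of the job STDOUT reports an exceeded CPU limit."""
    return any(CPU_LIMIT_MARKER in line for line in stdout_text.splitlines())


def diagnose(dump_text, stdout_text):
    """Hypothetical helper: classify a pede job from its two log texts."""
    if pede_finished(dump_text):
        return "finished"
    if hit_cpu_limit(stdout_text):
        return "killed by CPU limit"
    return "incomplete, reason unknown"
```

To use it, read jobData/jobm/pede.dump and jobData/jobm/STDOUT into strings and pass both to diagnose().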
The failure is also reported in alignment.log.gz from CMSSW:

%MSG-e Alignment: AfterModEndJob PedeReader() 28-Oct-2011 07:12:10 CEST PostEndRun
Problem opening pede output file millepede.res
%MSG
%MSG-i Alignment: AfterModEndJob PedeReader::read() 28-Oct-2011 07:12:10 CEST PostEndRun
will read parameters for run range 1 - 4294967295
%MSG
%MSG-i Alignment: AfterModEndJob PedeReader::read() 28-Oct-2011 07:12:10 CEST PostEndRun
0 parameters for 0 alignables

What you point to is a consequence of that: pede did not run through, so
millepede.his, which holds histogram-like information about the pede job,
is not well formed and cannot be converted correctly into ROOT/.ps. That is
where the error you see comes from.

> For the previous rounds of alignments this problem did not appear.
>
> Reference:
> /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/mp0897/jobData/jobm/pede.dump
> /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/mp0897/jobData/jobm/STDOUT
>
> Is this a serious issue? Can I submit the Pede jobs for the weighted samples
> regardless this error?

The question is: why does it need more CPU when starting from the ideal
geometry (but with bows)?
Internally pede uses an iterative procedure (MINRES) to solve the big matrix
equation, and this is done three (4?) times with your settings. After each
solution there is a line search in 1D. Procedures like that tend to have
difficulties if we start too close to the final result, needing more MINRES
iterations (see e.g. the last page of
mp0896/jobData/jobm/millepede.his.ps.gz for how much this can vary in a
successful job)...

So - what to do?
1) We can introduce a bit of noise in the procedure by adding some random
   misalignment.
1a) If that does not help, we could remove the bow misalignment and the
    bow determination from the alignment job - in the very end we could
    probably use the bows that are the result of the jobs starting from
    the current MC scenario.
2) I'll ask for a larger CPU limit on the special millepede queue.
Claus - do you have another suggestion?

about 1)
add to configs

process.AlignmentProducer.doMisalignmentScenario = True
process.AlignmentProducer.MisalignmentScenario = cms.PSet(
    setRotations = cms.bool(True),
    setTranslations = cms.bool(True),
    seed = cms.int32(1234567),
    distribution = cms.string('gaussian'), #fixed'),
    setError = cms.bool(True),
    TIBBarrels = cms.PSet(DetUnits = cms.PSet(
        dXlocal = cms.double(0.001))
    )
    # same for TIDEndcaps, TECEndcap, TPBBarrels and TPEEndcaps
    # but leave out TOB for now
)

about 1a)
- set up a new directory
- remove the process.trackerBowedSensors stuff from startgeometry.txt
- deselect the bow parameters in alignables.txt:
  * set the last three '1' to '0' for single sensors (SelectorBowed)
  * remove 'SelectorTwoBowed' and add the double sensor modules
    (TOB, outer TEC) to SelectorBowed with the '101111 000'
    parameterisation.

Cheers
Gero

-- 
-----------------------------------------------------------------------
Gero Flucke - Analysis Centre, Helmholtz Alliance "Physics at the Terascale"
              * Statistics Tools -
CMS: Tracker Alignment Convenor
DESY/CMS, Notkestr. 85, D-22607 Hamburg, Germany
Bldg. 1e, Rm. 02.501
phone: +49 (0)40 8998 3525
fax:   +49 (0)40 8998 3092