On Wed, 2 Nov 2011, Krisztian Krajczar wrote:

> Dear all,
>
> please find the draft of my talk at:
> https://krajczar.web.cern.ch/krajczar/03.11.11_TrackerDPG.pdf
>
> Comments are welcome,

Hi Krisztian,

p. 2
typo: ALCARECO
remark: having similar samples in MC as in data is the first approach to
        reach the main goal of similar precision

p. 3
it would be nice to have the numbers used in data as comparison. Even better
to see how close we are to the data samples in pt, eta etc., i.e. to show
some plots comparing data and MC (for data we have the plots from Joerg shown
two weeks ago at least...)

p. 4
maybe motivate the different setups:
mp0896: similar to data approach
mp0897: starting from a geometry that has no Z-mass phi-mode
        (Joerg's studies showed that we do not fully get rid of it with
        alignment) (but failed due to CPU problems)
mp0899: as 897 from ideal with reduced # of params to see the effect on the
        Z-mass

p. 5/6
Geometry comparison is the wrong title: Bows and Kinks
It would be enough to show this for one scenario since they are sufficiently
similar (assuming you also checked the third). But then I'd prefer zoomed
plots:
  p.SetMaxDev(75)
  p.DrawSurfaceDeformations("result start", "", 6)
for the bows (mean for 2 sensor modules) and delta-shifts
  p.DrawSurfaceDeformations()
  p.GetHistManager()->SetNumHistsXY(3,2)
  p.GetHistManager()->Draw()
to get the kinks and delta-bows
(in both cases it is good to show under/overflow in stats box)
It would be interesting if you could also check this per subdetector
(p.SetSubDetId()) just to see whether there are problems somewhere...
(Note: ideal MC has only zeros)

p. 7:
please add that the CPU issues need follow-up...

Thanks and sorry for quite some comments...
Cheers
Gero

>> Date: Wed, 2 Nov 2011 17:33:18 +0100
>> From: Krisztian Krajczar
>> To: roberto castello
>> Cc: Gero Flucke, Adam Agocs,
>>     Vesztergombi Gyorgy,
>>     Alessio Bonato,
>>     Natalie Heracleous
>> Subject: Re: MC alignment task
>>
>> Hi Roberto,
>>
>> please find my comments inline.
>>
>>> in view of tomorrow's presentation, and in order to avoid confusion, could
>>> we agree on the names for the different scenarios produced?
>>> Just to recap, because I got a bit lost in the mail exchange:
>>>
>>> In /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/
>>>
>>> 1) mp0896/jobData/jobm/alignments_MP.db --> name: MC_fromSTARTUP_def
>>> (question: is it supposed to be the same object as in
>>> mp0895/jobData/jobm/alignments_MP.db ?)
>>
>> yes; this geometry was produced starting from the current MC scenario +
>> sensor bow parameters from data.
>>
>>> 2) mp0896/jobData/jobm1/alignments_MP.db --> name: MC_fromSTARTUP_cosmUP
>>
>> Cosmics are weighted up by a factor of 2.
>>
>>> 3) mp0899/jobData/jobm/alignments_MP.db --> name: MC_fromIDEAL_def
>>
>> For these geometries the starting MC scenario was indeed ideal and no bow
>> misalignment was applied.
>>
>>> 4) mp0899/jobData/jobm1/alignments_MP.db --> name: MC_fromIDEAL_cosmUP
>>
>> Cosmics are weighted up by a factor of 2.
>>
>>> 5) mp0899/jobData/jobm2/alignments_MP.db --> name: MC_fromIDEAL_cosmDN
>>
>> Cosmics are weighted down by a factor of 2.
>>
>>> Please comment or suggest other more appropriate names.
>>
>> I would gladly adopt your naming convention. I will use these names on my
>> slides (which I will send later today to the recipients of this email, for
>> reference and to ask for comments).
>>
>> Thanks,
>> Krisztian
>>
>>> roberto
>>>
>>> On Oct 31, 2011, at 9:58 AM, Gero Flucke wrote:
>>>
>>>> On Sun, 30 Oct 2011, Krisztian Krajczar wrote:
>>>>
>>>>> all the pede jobs have finished; let me summarize here the locations of
>>>>> the various geometries.
>>>>>
>>>>> Starting from current MC scenario:
>>>>> mp0896/jobData/jobm/alignments_MP.db (default)
>>>>> mp0896/jobData/jobm1/alignments_MP.db (cosmics weighted up)
>>>>> mp0896/jobData/jobm2/alignments_MP.db [*] (cosmics weighted down)
>>>>>
>>>>> Starting from ideal alignment (the bow-misalignment and the bow
>>>>> determination were removed from the alignment jobs, according to the
>>>>> "1a)" recipe):
>>>>> mp0899/jobData/jobm/alignments_MP.db (default)
>>>>> mp0899/jobData/jobm1/alignments_MP.db (cosmics weighted up)
>>>>> mp0899/jobData/jobm2/alignments_MP.db (cosmics weighted down)
>>>>
>>>> Hi Roberto, Natalie,
>>>> any first feedback from Z-validation on
>>>>
>>>> mp0896/jobData/jobm/
>>>> mp0896/jobData/jobm1/
>>>> (both to be used with bows in the .db file)
>>>>
>>>> and
>>>>
>>>> mp0899/jobData/jobm/
>>>> mp0899/jobData/jobm1/
>>>> mp0899/jobData/jobm2/
>>>>
>>>> (all without using bows)?
>>>>
>>>> Cheers
>>>>
>>>> Gero
>>>>>
>>>>> I have checked the Pede dumps of these jobs again, and found that [*]
>>>>> failed with the following message (all the other jobs ended correctly!):
>>>>> ------------------
>>>>> Record 26600000 ... still reading
>>>>>
>>>>> Read cache usage (#blocks, #records, min,max records/block
>>>>> 11062 26693609 989 2725
>>>>> Write cache usage (#flush,#overrun,,peak(levels))
>>>>> 88496 3492, 75.6% 74.4% 106.7% 197.5%
>>>>>
>>>>> Data rejected in initial loop:
>>>>> 85 (rank deficit/NaN) 0 (Ndf=0) 930
>>>>> (huge) 382 (large)
>>>>> ------------------
>>>>>
>>>>> I will simply try to resubmit this job. The other geometries are final.
>>>>>
>>>>> Cheers,
>>>>> Krisztian
>>>>>
>>>>>> method "1a)" does work, the Pede job finished correctly (output db file
>>>>>> is at
>>>>>> /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/mp0899/jobData/jobm).
>>>>>> I go on with the submission of the weighted cosmics samples.
>>>>>> Cheers,
>>>>>> Krisztian
>>>>>>> method "1)" did not work, the Pede job failed again with the same
>>>>>>> symptoms. The output in pede.dump simply stops again without reaching
>>>>>>> the end:
>>>>>>> Record 12900000 ... still reading
>>>>>>> Record
>>>>>>> The dump is at
>>>>>>> /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/mp0898/jobData/jobm
>>>>>>> I will proceed with your other method, "1a)".
>>>>>>> Cheers,
>>>>>>> Krisztian
>>>>>>>> thanks for the comments!
>>>>>>>> I will modify the alignment_x.py config files according to your
>>>>>>>> suggestion "1)".
>>>>>>>> I have moved the diagnostic files to a backup directory for future
>>>>>>>> reference:
>>>>>>>> /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/mp0897/backup_failingJobVer1
>>>>>>>> Cheers,
>>>>>>>> Krisztian
>>>>>>>>>> The mps_stat.pl command reports that the Pede job for the alignment
>>>>>>>>>> of ideal geometry failed. However, there are outputs produced in
>>>>>>>>>> the directory you indicated in your earlier email.
>>>>>>>>>> I have searched the Pede dump for errors, but found none.
>>>>>>>>> Hi Krisztian,
>>>>>>>>> (adding Claus as pede expert asking for advice in the end)
>>>>>>>>> indeed this is the first file to look into. It does not look
>>>>>>>>> healthy: it simply stops at some point - the last line should be
>>>>>>>>> something like
>>>>>>>>>
>>>>>>>>> < Millepede II-P ending ... Wed Oct 26 22:52:11 2011
>>>>>>>>> as in mp0896/jobData/jobm/pede.dump. MPS looks for that line and
>>>>>>>>> reports failure since it is not there.
>>>>>>>>>> The memory usage was normal, although it was slightly higher than
>>>>>>>>>> for the previous alignments:
>>>>>>>>>> Memory space: total 32.000000 GB
>>>>>>>>>> used 31.226771 GB = 97.58 %
>>>>>>>>>> In STDOUT I found a possible source of the "fail" report of
>>>>>>>>>> mps_stat.pl.
>>>>>>>>>> One of the automatic root macros failed to run:
>>>>>>>>>> ---------
>>>>>>>>>> Processing readPedeHists.C+("print nodraw")...
>>>>>>>>>> Info in : creating shared library
>>>>>>>>>> /pool/lsf/krajczar/182146920/./readPedeHists_C.so
>>>>>>>>>> Error in : failed reading x-y-dx-dy
>>>>>>>>>> content
>>>>>>>>>> ---------
>>>>>>>>> Before that I see
>>>>>>>>> sh: line 1: 27036 CPU time limit exceeded
>>>>>>>>> /afs/cern.ch/user/c/ckleinw/bin/rev81/pede_32GB pedeSteerMaster.txt
>>>>>>>>> > pede.dump
>>>>>>>>> and that tells us the reason why pede did not run through - it is a
>>>>>>>>> serious problem! It is also stated in alignment.log.gz from CMSSW:
>>>>>>>>> %MSG-e Alignment: AfterModEndJob PedeReader() 28-Oct-2011 07:12:10
>>>>>>>>> CEST PostEndRun
>>>>>>>>> Problem opening pede output file millepede.res
>>>>>>>>> %MSG
>>>>>>>>> %MSG-i Alignment: AfterModEndJob PedeReader::read() 28-Oct-2011
>>>>>>>>> 07:12:10 CEST PostEndRun
>>>>>>>>> will read parameters for run range 1 - 4294967295
>>>>>>>>> %MSG
>>>>>>>>> %MSG-i Alignment: AfterModEndJob PedeReader::read() 28-Oct-2011
>>>>>>>>> 07:12:10 CEST PostEndRun
>>>>>>>>> 0 parameters for 0 alignables
>>>>>>>>> What you point to is a consequence of that: pede did not run
>>>>>>>>> through, so millepede.his, with histogram-like information about the
>>>>>>>>> pede job, is malformed and cannot be correctly converted into
>>>>>>>>> ROOT/.ps - and that is where the error you see comes from.
>>>>>>>>>> For the previous rounds of alignments this problem did not appear.
>>>>>>>>>> Reference:
>>>>>>>>>> /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/mp0897/jobData/jobm/pede.dump
>>>>>>>>>> /afs/cern.ch/cms/CAF/CMSALCA/ALCA_TRACKERALIGN/MP/MPproduction/mp0897/jobData/jobm/STDOUT
>>>>>>>>>> Is this a serious issue? Can I submit the Pede jobs for the
>>>>>>>>>> weighted samples regardless of this error?
>>>>>>>>> The question is:
>>>>>>>>> Why does it need more CPU starting from ideal (but with bows)?
>>>>>>>>> Internally it is using an iterative procedure (MINRES) for solving
>>>>>>>>> the big matrix - and this is done three (4?) times with your
>>>>>>>>> settings. Then after each solution there is a line search in 1D.
>>>>>>>>> Procedures like that tend to have difficulties if we start too close
>>>>>>>>> to the final result (needing more MINRES iterations - see e.g. the
>>>>>>>>> last page of mp0896/jobData/jobm/millepede.his.ps.gz for how much
>>>>>>>>> this can vary in a successful job)...
>>>>>>>>> So - what to do?
>>>>>>>>> 1) We can introduce a bit of noise in the procedure by adding some
>>>>>>>>>    random misalignment.
>>>>>>>>> 1a) If that does not help, we could remove the bow-misalignment and
>>>>>>>>>    the bow determination from the alignment job - in the very end we
>>>>>>>>>    could probably use the bows that are the result of the jobs
>>>>>>>>>    starting from current MC scenario
>>>>>>>>> 2) I'll ask for a larger CPU limit on the special millepede queue.
>>>>>>>>> Claus - do you have another suggestion?
>>>>>>>>> about 1)
>>>>>>>>> add to configs
>>>>>>>>> process.AlignmentProducer.doMisalignmentScenario = True
>>>>>>>>> process.AlignmentProducer.MisalignmentScenario = cms.PSet(
>>>>>>>>>     setRotations = cms.bool(True),
>>>>>>>>>     setTranslations = cms.bool(True),
>>>>>>>>>     seed = cms.int32(1234567),
>>>>>>>>>     distribution = cms.string('gaussian'), #fixed'),
>>>>>>>>>     setError = cms.bool(True),
>>>>>>>>>     TIBBarrels = cms.PSet(DetUnits = cms.PSet(
>>>>>>>>>         dXlocal = cms.double(0.001))
>>>>>>>>>     )
>>>>>>>>>     # same for TIDEndcaps, TECEndcap, TPBBarrels and TPEEndcaps
>>>>>>>>>     # but leave out TOB for now
>>>>>>>>> )
>>>>>>>>> about 1a)
>>>>>>>>> - set up a new directory
>>>>>>>>> - remove process.trackerBowedSensors stuff from startgeometry.txt
>>>>>>>>> - deselect the bow parameters in alignables.txt:
>>>>>>>>>   * set the last three '1' to '0' for single sensors (SelectorBowed)
>>>>>>>>>   * remove 'SelectorTwoBowed' and add double sensor modules (TOB,
>>>>>>>>>     outer TEC) to SelectorBowed with '101111 000' parameterisation.
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>> Gero
>>>>>>>>> --
>>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>> Gero Flucke
>>>>>>>>>  - Analysis Centre, Helmholtz Alliance "Physics at the Terascale"
>>>>>>>>>    * Statistics Tools
>>>>>>>>>  - CMS: Tracker Alignment Convenor
>>>>>>>>> DESY/CMS, Notkestr. 85, D-22607 Hamburg, Germany
>>>>>>>>> Bldg. 1e, Rm. 02.501
>>>>>>>>> phone: +49 (0)40 8998 3525
>>>>>>>>> fax: +49 (0)40 8998 3092
>>>>>
>>>>
>>>> --
>>>> -----------------------------------------------------------------------
>>>> Gero Flucke
>>>>  - Analysis Centre, Helmholtz Alliance "Physics at the Terascale"
>>>>    * Statistics Tools
>>>>  - CMS: Tracker Alignment Convenor
>>>> DESY/CMS, Notkestr. 85, D-22607 Hamburg, Germany
>>>> Bldg. 1e, Rm.
>>>> 02.501
>>>> phone: +49 (0)40 8998 3525
>>>> fax: +49 (0)40 8998 3092
>>>
>>> ______________________________________
>>>
>>> Roberto Castello
>>> Centre for Cosmology, Particle Physics and
>>> Phenomenology - CP3 - UC Louvain, Belgium
>>> phone (CERN): +41. 22.767.4739
>>> ______________________________________
>>>
>>
>

--
-----------------------------------------------------------------------
Gero Flucke
 - Analysis Centre, Helmholtz Alliance "Physics at the Terascale"
   * Statistics Tools
 - CMS: Tracker Alignment Convenor
DESY/CMS, Notkestr. 85, D-22607 Hamburg, Germany
Bldg. 1e, Rm. 02.501
phone: +49 (0)40 8998 3525
fax: +49 (0)40 8998 3092
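
The failure diagnosis discussed in this thread (a healthy pede.dump ends with a "Millepede II-P ending" line, and the failing job's STDOUT contained "CPU time limit exceeded") can be sketched as a small check. This is only an illustration under stated assumptions: the function names, the `tail_lines` parameter and the returned status strings are hypothetical, and it does not reproduce what mps_stat.pl actually does internally. Only the two marker strings and the file names pede.dump and STDOUT are taken from the messages above.

```python
from pathlib import Path

# Marker strings quoted in the thread; everything else here is illustrative.
PEDE_END_MARKER = "Millepede II-P ending"
CPU_LIMIT_MARKER = "CPU time limit exceeded"


def pede_finished(dump_text, tail_lines=5):
    """True if one of the last non-empty lines of pede.dump has the end marker."""
    lines = [line for line in dump_text.splitlines() if line.strip()]
    return any(PEDE_END_MARKER in line for line in lines[-tail_lines:])


def hit_cpu_limit(stdout_text):
    """True if the batch STDOUT reports that the CPU time limit was exceeded."""
    return CPU_LIMIT_MARKER in stdout_text


def diagnose(dump_text, stdout_text=""):
    """Classify a pede job from the text of its pede.dump and STDOUT."""
    if pede_finished(dump_text):
        return "ok"
    if hit_cpu_limit(stdout_text):
        return "failed: CPU time limit exceeded"
    return "failed: pede.dump incomplete"


def diagnose_job(job_dir):
    """Hypothetical helper: read pede.dump and STDOUT from a job directory."""
    job = Path(job_dir)
    dump = (job / "pede.dump").read_text(errors="replace")
    stdout_file = job / "STDOUT"
    stdout = stdout_file.read_text(errors="replace") if stdout_file.exists() else ""
    return diagnose(dump, stdout)
```

For example, `diagnose_job("mp0897/jobData/jobm")` would read the two files from that directory (relative to the MPproduction area) and return one of the three status strings; a truncated dump that stops at "Record ... still reading" is reported as failed.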