Comments from Conor
===================

These comments are based on the draft dated 25 September.

General comments:
Overall, I think the paper is in pretty good shape. Most of my comments are fairly minor. However, I do have a suggestion to change how the plots and tables containing data are presented. At present, plots containing data are shown as part of the bkg estimate section, and as part of the NN discriminant description. I think it would be better if all plots and tables containing data were put into a new Results section. I explain more in the detailed comments.
% thanks

Abstract:
Good abstract. Perhaps you would like to add how the cross-section limits obtained compare to the SM Higgs expected values?
% added "... pb or 7.5 to 101.9 times the SM expectation for ..."

Page 1, line 36: 'one more' -> 'another'
% fixed

Page 3, line 41: "In this paper, we present ..."
Perhaps you should say this overview line before explaining the analysis improvements in lines 35-40? The order could be:
- discuss previous CDF paper
- state that this paper is an update to 1.9 fb-1
- brief description of analysis improvements
- contents of the current paper
% rephrased

Page 3, line 62: "about 30 um of which is due to ..." -> I think "of which about 30 um is due to ..." reads a little better, but this is just a personal preference, so you are free to ignore it if you disagree.
% fixed

Page 3, line 66: "E_T = E sin(theta)"
You already defined E_T in line 53, so there is no need to do it again.
% fixed

P4, L 71: "Plug" is a little bit too much CDF jargon. Better to use 'forward', I think. Or at least define "plug" = "forward" right away, e.g. "in the forward calorimeter, known as the 'plug'".
% fixed

P4, L 81-88: do you require muons to have minimum ionizing deposits in the calorimeter? Might be worth adding that to the nice description here.
% No, no minimum energy is required in the calorimeter.

P6, L 142: add 'non-overlapping' or 'exclusive' to make it clear, as in "Three non-overlapping b-tagged event categories are considered"
% fixed with 'exclusive'

P6, L 143: 'event' -> 'events'
% fixed

P6, L 145: 'NN' -> I don't think that NN has been defined yet. I prefer to use Neural Net, or make sure the abbreviation is defined already.
% fixed

P6, L 147: "With a SecVtx mistag rate of ..."
I agree with this statement, but I don't see its relevance to this particular section. It belongs in a later section on the algorithms, not in the event selection section.
% removed

P6, L 154: "vertex" -> "Vertex"
All your other subsections have capital letters for every word...
% fixed

P7, L 158: "NN" -> again, I prefer 'Neural Net'
% fixed

P7, L 160-163: "The b-quark has a relatively long lifetime..."
This type of conceptual discussion belongs in the introduction to this section, before you start discussing any of the individual algorithms.
% fixed

P9, L 225: "heavy-flavor jets from events with an electron candidate"
You should explain briefly why this sample is used for validation. It's not clear to me. Is it a b-enriched sample?
% fixed

P9, L 228: you should define the NN output variable first. Something like: "The output of the neural net is a value ranging from 0 to 1, denoting the probability that ... etc"
% fixed

P9, L 232: "At these cut values .."
I would put this sentence directly after the values chosen, before talking about the scale factors.
% fixed

P9, L 228-234: Clearly the selection of the cut value is a trade-off between efficiency and purity. But it's not clear from the current description why you chose the 90% efficiency value. For example, if you were willing to cut tighter and reduce the efficiency to 80%, would it massively or only marginally improve the purity?
% Mistags are a small part of the background, so cutting harder does not improve the search sensitivity.

P10, L 236 onwards: It's a nice description, but I think it would help the reader if you could more explicitly address how jet-prob tagging differs from SECVTX.
% rephrased

P10, L 246: 'While' -> 'Since'
% fixed

P10, L 251: '72 categories' -> the reader may be interested in how you make 72 categories - 4 bins in eta, 9 in pt, 2 in quality?
% removed 72.

P11, L 292: "due to" -> "of"
% fixed

P11, L 292: "mimic" -> "mimicked"
% fixed

P11, L 293: remove 'production', just leave "by other processes"
% fixed

P12, L 299: 'with' -> 'by producing'
% fixed

P12, L 302: 'be observed via' -> 'result from'
% fixed

P12, L 303-304: 'due to some unknown reasons' - this may be true, but I wouldn't say it like this!
% rephrased as 'due to some limitations'

P12, L 305: '(pretag)' -> '(known as the 'pretag' sample)'

P12, L 311: '4 sideband sectors' - well, one is actually the signal region, not a sideband, so better just to say '4 sectors'
% rephrased

P14, L 358: 'due to MET trigger bias' - suggest reminding the reader that the MET+PEM trigger is used for the forward region, which causes the MET trigger bias
% fixed

P14, L 369: 'jets with at least 2 tracks well measured in the silicon detector' - taggable jets were already defined on P13, L 327, so no need to define them again
% ok as is, since the previous definition no longer exists.

P14, L 371-378: suggest a slightly different ordering of the information presented in this paragraph, to make it a bit clearer. Perhaps something like: "Negative mistags are defined as tags with unphysical... The positive mistag rate can be obtained from the negative mistag rate with an additional correction factor, reflecting an enhancement of positive mistags due to light-flavor ... The correction factor is measured in a control sample ... The systematic uncertainty ..."
% rephrased as suggested

P14, L 378: 'control sample' - how is this control sample defined?
% inclusive jet samples

P15, L 389: 'used' -> 'derived and applied'
% fixed

P15, L 394: 'programs' -> 'event simulators'?
% fixed

P15, L 399: 'programs' -> 'simulations'?
% fixed

P15, L 403-406: the description of 1B, 1C, 2B, 2C should go in the caption for Table II, as is done already for Table III
% fixed

P18, Section E: it seems to me that this section, which is described as 'Summary of Background Estimate', is really the 'Results' section. The figures and tables here are not just the background - they also show the data! I think a better idea would be to create a new section called 'Results' and put these figures and tables in it. The ordering of the sections would now be:
Sec V    Background (subsections A-D; no need to include the summary E)
Sec VI   Signal Acceptance
Sec VII  NN Discriminant
Sec VIII Results (encompassing all plots and tables that contain data, including the NN discriminant plots)
Sec IX   Higgs Limit (or perhaps make this a subsection of Results?)
The idea behind this suggested reordering is that all plots and tables that contain data are collected together in the same Results section, and all analysis methods (bkg estimates, signal acceptance, NN discriminant) are described before any data is presented.
% created a new Results section as you suggested and made the Higgs limit a subsection of Results

P18, Fig 1: 'In this plots, the number of central and plug region are merged' -> 'This plot combines the information from the central and plug regions of the detector.' Although you say this in the text anyway, so it may not be necessary to say it in the caption.
% fixed

Also, the order of tables and figures should match the way it is described in the text. One way that seems reasonable to me is to put the figures first, since they show all jet bins, then show the tables, since they have the details of the signal bin. Alternatively, you could separate them according to b-tag category, and give the appropriate fig and table for each category in order.
It doesn't matter too much, as long as there is some logic to it, and the text description should match that.
% rephrased as suggested

Table V, VI, VII: 'as a function of jet multiplicity' should be removed from the caption in all these tables, since these results are just for the signal bin Njet=2. Also, 'central' and 'plug' should probably start with upper-case letters.
% fixed

P19, L 442: 'programs' -> 'generators'? [I just don't like calling these programs]
% fixed

P19, L 442: 'The PYTHIA program was used ...' -> just 'PYTHIA is used ...' Note also the suggested change to present tense.
% fixed

P19, L 447: 'where epsilon ...respectively' -> I have a preference for explaining the symbols as 'where A is ..., B is ...' but this is not a big deal, so the authors should do what they prefer here. I just point out my preference.
% ok as is

P20, L 453: 'kinematics' is a very jargon-y word. Do you mean the 'event selection criteria'?
% fixed

P20, L 455-456: It may be worth a little more explanation of where the e_z0, e_trigger and e_lepton are obtained from.
% rephrased.

P22, Fig 2: why are the plots from these two b-tag categories combined into the same figure, while the SECVTX+NN gets a figure all of its own? Unless there is a particular reason for this, I suggest you give each b-tag category its own figure, so that the current Fig 2 gets split into 2 separate figures.
% will combine the two double-tag categories, since their S/B is similar. So there will be one figure for single tag and one for double tags.

P22, Fig 3: 'various b-tag strategies' - this phrase occurs a lot, but I don't really like it. I prefer 'categories' to 'strategies'. And 'various' sounds a bit too random ... how about 'the selected' or something like that to reinforce the idea that you intentionally chose these 3 categories?
% fixed

P23, Table VIII, IX: In the tables themselves, replace 'One tag w/NN tag' with 'One SECVTX with NN tag'.
That should be clear enough, so you can remove the explanation of the tag categories from the captions. The reader will be familiar with the 3 categories by this stage of the paper.
% fixed

P23, L 481-482: this statement of the increase in acceptance gained from the inclusion of plug electrons probably belongs somewhere earlier in this section, rather than right at the end...
% fixed

P24, L 491: '...from a list of 76 possible choices considered from two jets, MET, and lepton kinematics and correlation between them' - this part of the sentence could be phrased better. Perhaps something like: "... from a list of 76 possible variables, based on the kinematic information of the two jets, lepton and MET in the events (including correlations between these objects)."
% fixed

P24, L 500: 'constant against the Higgs mass' -> 'constant as a function of the Higgs mass'
% fixed

P24, L 504: personally, I would replace the comma here with a colon before the list of the 6 variables, and then in each item, I would replace the current colon with a comma. However, I'm sure the PRD copyeditors will have their own suggestions on how to handle this.
% fixed

P25, Fig 4 & P26, Fig 5: why not combine these two into 1 figure which shows all six variables used in the NN? At the same time, since they contain data, I think they should be moved into the new Results section that I proposed earlier.
% plots are no longer needed.

P27, Fig 7: similar comment to Fig 2 - why are these 2 categories combined while the other category gets its own figure? Unless there is some reason, I think each category should get its own figure.
% will combine the two double-tag categories

P27, Eq 13: this equation and its accompanying text seem unnecessary to me. Eq 15, since it includes syst uncertainties, is the only one you need to give and explain here.
% removed eq 15.

P28, L 553: 'credibility' -> 'confidence'?
% ok as is

P28, L 561: you could probably shorten this sentence to just say 'the three b-tag categories' if you wanted...
% fixed

P28, L 573: 'of forward' -> 'of the forward'
% fixed

P29, Fig 8: the caption should state that this is not the cross-section limit, but rather the limit divided by the SM expectation.
% fixed

P29, Conclusion: it might be nice to add here how the observed limit corresponds to the SM expectation, i.e. the information in Fig 8.
% fixed