Top Mass Measurement in the Lepton + Jets Channel
Using a Multivariate Template Method

Authors
Igor Volobouev, Pedro Movilla Fernandez, John Freeman, Angela B. Galtieri, Jeremy Lys
(E-mail to all authors)

Lawrence Berkeley National Laboratory

Result

Our numerical value for the mass of the top quark is
(status of May 2004)

Mtop = 179.6 +6.4/-6.3 (stat.) ± 6.8 (syst.) GeV/c^2

which we extracted from 33 top candidate events found in 162 pb^-1 of Run II data. The plot at the right shows the reconstructed invariant top mass distribution together with the Monte Carlo expectation for the signal at Mtop = 180 GeV/c^2 and for the various backgrounds.

Method
In the following we describe our analysis method. Click on the pictures to get the PostScript versions of the plots shown.
  • For more details, CDF collaborators can check our CDF Note 6970 (password required).
  • Also a public conference note is available.
  • A summary talk is given here.

    1. Introduction

    We measure the top quark mass using ttbar events. Each top quark decays into a W boson and a b quark. One of the W bosons decays into a charged lepton plus a neutrino and the other into two quarks. Each event is fitted using energy-momentum conservation, and a likelihood procedure is then used to determine the top mass. We attempt to reduce the jet systematics by calibrating the jet energy scale event by event using the W mass constraint. We reduce the statistical uncertainty by estimating, event by event, the probability of picking the correct permutation and reweighting the events according to this probability. We improve the signal/background separation by using other kinematic variables in addition to the reconstructed top mass. We introduce fewer assumptions into the analysis by using nonparametric statistical techniques: kernel density estimation is employed to build multivariate templates, and local polynomial regression is used to ensure likelihood continuity.

    2. Event Fitting

    We apply the standard selection criteria for ttbar events in the semileptonic decay mode and perform generic corrections of measured event quantities (such as jet energies) known as ''level 7'' corrections. In addition to these corrections, we apply top-specific corrections developed by our group. For the event reconstruction, we use a special-purpose kinematic fitter. A jet energy scale factor (JES) is included in the kinematic mass fit of the hadronic W using a Gaussian constraint. The constraint is a tunable parameter to be optimized in the analysis. For a given permutation, all jets in the event are scaled by the jet energy scale obtained from the W mass fit. It turns out that adjusting the jet energy scale to get the best W mass in each event leads to a reduction in the jet systematic error (among other things because this procedure partially compensates for fluctuations in the jet fragmentation). However, the price for this reduction is an increase in the statistical uncertainty, which occurs because the fitted energy scale is allowed to fluctuate from event to event.
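
    The idea of a Gaussian-constrained scale factor can be illustrated with a minimal sketch. It is not the kinematic fitter used in the analysis: it assumes that scaling both jets of the hadronic W by a common factor s scales their invariant mass by s, that the ''constraint'' parameter acts as the width sigma_jes of a Gaussian centered at 1, and that the dijet mass has a hypothetical resolution sigma_w. The function names are illustrative only.

```python
M_W = 80.4  # GeV/c^2, nominal W boson mass


def chi2(s, m_jj, sigma_w, sigma_jes):
    """Toy chi-square: W mass term plus Gaussian constraint on the scale s."""
    return ((s * m_jj - M_W) / sigma_w) ** 2 + ((s - 1.0) / sigma_jes) ** 2


def fit_jes(m_jj, sigma_w, sigma_jes):
    """Return the scale s that minimizes chi2(s) (closed form, chi2 is quadratic in s).

    m_jj      -- measured dijet invariant mass of the hadronic W candidate
    sigma_w   -- assumed dijet mass resolution (hypothetical parameter)
    sigma_jes -- assumed width of the Gaussian JES constraint
    """
    a = m_jj / sigma_w**2
    b = 1.0 / sigma_jes**2
    return (a * M_W + b) / (a * m_jj + b)


if __name__ == "__main__":
    for sigma_jes in (0.03, 0.07, 0.20):
        print(sigma_jes, round(fit_jes(70.0, 10.0, sigma_jes), 4))
```

    In this toy model a tight constraint keeps s close to 1, while a loose one lets s move toward M_W / m_jj, which mirrors the trade-off between systematic and statistical uncertainty described above.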

    The fitted jet energy scale differs from one jet-to-parton assignment (''permutation'') to another. The plot at the right demonstrates that, for the correct permutation, the scale shifts induced by the W mass constraint compensate, on average, for systematic shifts.

    We consider only those solutions for the (ambiguous) neutrino momenta which give the closest values for the masses of the top quark and the anti-top quark. Furthermore, we require that the solutions found by the fitter be consistent with the b-tagging information provided by the SVX detector.
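
    One common source of such an ambiguity is the quadratic equation obtained when the lepton-neutrino invariant mass is fixed to the W mass. The sketch below assumes this textbook treatment (massless lepton and neutrino, neutrino transverse momentum taken as known); how the actual fitter handles the neutrino is not specified here, and the helper names are hypothetical.

```python
import math

M_W = 80.4  # GeV/c^2, nominal W boson mass


def neutrino_pz_solutions(lepton, nu_px, nu_py):
    """Solve m(lepton + neutrino) = M_W for the neutrino pz.

    lepton is (E, px, py, pz) of a massless charged lepton; nu_px and nu_py
    are the neutrino transverse momentum components. Returns two pz values.
    """
    e, px, py, pz = lepton
    mu = 0.5 * M_W**2 + px * nu_px + py * nu_py
    pt2_lep = px * px + py * py
    pt2_nu = nu_px * nu_px + nu_py * nu_py
    # A negative discriminant means no real solution; it is clamped to zero here.
    disc = max(mu * mu - pt2_lep * pt2_nu, 0.0)
    root = e * math.sqrt(disc)
    return [(mu * pz + root) / pt2_lep, (mu * pz - root) / pt2_lep]


def closest_top_masses(candidates):
    """candidates: list of (m_top_leptonic, m_top_hadronic), one pair per solution.

    Return the index of the pair with the smallest mass difference.
    """
    return min(range(len(candidates)),
               key=lambda i: abs(candidates[i][0] - candidates[i][1]))
```

    With the two pz values in hand, each solution yields a pair of reconstructed top masses, and closest_top_masses picks the solution for which the two masses agree best.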

    3. Signal Subsamples

    Our method uses three different signal templates: (1) correct permutation samples, (2) incorrect permutation samples (i.e. the jets are associated with the quarks but swapped) and (3) samples with incorrect jet assignment (e.g. if a gluon jet is associated with a quark). Knowledge of the subsamples is important since the correct template has much better mass resolution. The Figures at the bottom use ttbar Monte Carlo events to show that the events for which the correct combination was chosen by the fitter give a narrower distribution of the reconstructed mass. In addition, the dependence of the reconstructed mass on the generated mass has a slope of 1. This is not true for the other two cases, i.e., if the correct jets are chosen by the fit but the jet assignment to each top is not correct, or if not all the chosen jets are the correct ones.

    For each event we compute the probability that the solution with the smallest χ² is the correct one, and weight the event accordingly. The weighting is performed by using different template fractions for the different types of permutations.

    The plot at the right shows that the χ² distribution by itself is of limited use for discriminating the correct permutation from the incorrect ones. The distributions for the first two cases (correct combination; correct jets but wrong combination) are very similar, and the third case has only a slightly wider distribution.

    Instead, we employ a model in which we consider differences between the χ² values of the best permutation and all other permutations. This model is inspired by viewing permutation selection as a ''diffusion'' process in the space of χ² values, in which events can ''drift'' from the region where the best-χ² permutation selects the correct jet-to-parton assignment into regions where this is no longer true. The illustration at the right shows the χ² values for two particular permutation types (numbered 0 and 1) obtained from ttbar MC events with one b tag. Events in which permutation 0 is correct are marked with blue dots, and events in which permutation 1 is correct are marked with red dots. If the permutation with the smallest χ² were always correct, there would be no red points above the χ²_0 = χ²_1 boundary and no blue points below it. Instead, we observe diffusion-like mixing of the red and blue points near the boundary.

    The fraction of incorrectly assigned permutations appears to decrease exponentially as a function of the difference between permutation χ² values. This motivates assigning correct permutation probabilities according to the formula given below. (In the current implementation of our analysis we do not attempt to improve the separation between the samples with incorrect jet assignment and the other subsamples.) The various ''diffusion'' model parameters have to be determined from fits.
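
    The analysis formula itself is not reproduced in this text. Purely as an illustration of the exponential behaviour, the sketch below assumes a simple form in which each permutation receives a weight exp(-Δχ²/τ), with Δχ² measured from the best permutation and τ a hypothetical ''diffusion'' scale that would have to be fitted. It should not be read as the parametrization actually used.

```python
import math


def permutation_probabilities(chi2_values, tau):
    """Assumed toy model: weight each permutation by exp(-(chi2 - chi2_min) / tau).

    chi2_values -- chi-square of every permutation considered for one event
    tau         -- hypothetical scale parameter controlling the mixing
    Returns normalized probabilities in the same order as chi2_values.
    """
    chi2_min = min(chi2_values)
    weights = [math.exp(-(c - chi2_min) / tau) for c in chi2_values]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # The probability of the best permutation grows with the chi-square gap.
    for gap in (0.0, 2.0, 8.0):
        print(gap, round(permutation_probabilities([1.0, 1.0 + gap], 2.0)[0], 3))
```

    In this toy form two permutations with equal χ² are equally likely to be correct, and the probability that the best one is wrong falls off exponentially once the gap is large compared with τ.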

    We also include additional kinematical information with weak dependence on the top mass into the definition of the correct permutation probability. The approximate Bayesian update method allows us to include kinematic variables in order to improve the separation between the correct permutation samples and the other subsamples.
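
    A Bayesian update of this kind can be sketched as follows. The sketch assumes that the update multiplies the prior odds of the correct permutation by the ratio of the variable's densities in the correct and incorrect subsamples, and that successive variables are treated as independent (one possible reading of ''approximate''). The function name and the example numbers are illustrative.

```python
def bayes_update(prior, density_ratio):
    """Update the correct-permutation probability with one kinematic variable.

    prior         -- probability before looking at the variable (0 <= prior < 1)
    density_ratio -- f_correct(x) / f_incorrect(x) evaluated at the observed x
    """
    num = prior * density_ratio
    return num / (num + (1.0 - prior))


if __name__ == "__main__":
    p = 0.60                    # hypothetical probability from the chi-square model
    p = bayes_update(p, 0.4)    # first variable disfavours the correct permutation
    p = bayes_update(p, 1.3)    # second variable mildly favours it
    print(round(p, 3))
```

    A density ratio of 1 leaves the probability unchanged, so a variable only matters in regions where the two subsamples differ.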

    The first kinematic variable is cos(φ), where φ is the angle between the lepton and the b quark on the leptonic side of the decay, evaluated in the rest frame of the leptonic W. The distribution of this variable is shown in the two Figures at the right for the correct (left) and incorrect (right) permutations. With this variable we try to improve the subsample separation near cos(φ)=-1, where the correct permutation is suppressed by the top decay matrix element.
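
    Evaluating an angle in a particle's rest frame requires a Lorentz boost. The following sketch uses the standard boost formula with four-vectors stored as (E, px, py, pz); the function names are illustrative and not taken from the analysis code.

```python
import numpy as np


def boost_to_rest_frame(p, frame):
    """Boost four-vector p = (E, px, py, pz) into the rest frame of `frame`."""
    p = np.asarray(p, dtype=float)
    frame = np.asarray(frame, dtype=float)
    beta = frame[1:] / frame[0]          # velocity of the frame (c = 1)
    b2 = beta @ beta
    if b2 == 0.0:
        return p.copy()
    gamma = 1.0 / np.sqrt(1.0 - b2)
    bp = beta @ p[1:]
    energy = gamma * (p[0] - bp)
    momentum = p[1:] + ((gamma - 1.0) * bp / b2 - gamma * p[0]) * beta
    return np.concatenate(([energy], momentum))


def cos_angle(a, b):
    """Cosine of the angle between the three-momenta of two four-vectors."""
    return float(a[1:] @ b[1:] / (np.linalg.norm(a[1:]) * np.linalg.norm(b[1:])))


def cos_phi(lepton, b_quark, w_boson):
    """cos of the lepton-b angle evaluated in the rest frame of the leptonic W."""
    return cos_angle(boost_to_rest_frame(lepton, w_boson),
                     boost_to_rest_frame(b_quark, w_boson))
```

    The same boost can be reused for cos(θ) variables defined in a top quark rest frame by passing the top four-vector as the frame.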

    The second kinematic variable is the product cos(θ1)·cos(θ2), where θ1 is the angle between the lepton momentum and the beam axis in the rest frame of the semileptonically decaying top quark, and θ2 is the angle between the direction of the light quark and the beam axis in the rest frame of the hadronically decaying top quark. We use the light quark which, in the rest frame of the parent W, makes the smaller angle with the b quark originating from the hadronically decaying top. This variable is sensitive to the ttbar spin correlations. The figure at the right shows the angular distributions for the correct and incorrect permutations. A more quantitative representation of the differences between these two subsamples is given by the normalized density ratios of the cos(θ1)·cos(θ2) product.

    The distributions of the correct permutation probabilities thus obtained from ttbar MC events are given by the top two histograms shown in the Figure to the right. The top left plot contains events with exactly one b tag, the top right plot events with two b tags. The latter plot shows that tagging information is very useful for increasing the correct permutation probability.

    In order to check the reliability of the modeling of the correct permutation coefficients, we compare the predicted probabilities of the fit with the fractions of correct permutations found in ttbar MC events. The two bottom plots in the Figure to the right show these comparisons for events with one and two b tags, respectively. The agreement between predicted and observed fractions is satisfactory.

    4. Templates

    For the construction of the templates we have chosen a nonparametric multivariate density reconstruction method called Kernel Density Estimation (KDE). The density at any given point in the multivariate space receives contributions from Monte Carlo events in such a way that events are weighted more heavily when they appear closer to the desired point. The distances between points are determined by a metric chosen as the inverse of the robust sample covariance matrix. We use a radially symmetric kernel function that behaves like a Gaussian probability density near the origin but decays more slowly at infinity. We choose a global bandwidth which minimizes the asymptotic mean integrated squared error (AMISE) of the density estimator for a sample of points drawn from a multivariate Gaussian distribution.
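
    A minimal KDE sketch in the same spirit is given below. It deliberately simplifies two ingredients, and both simplifications are assumptions of the sketch: it uses the ordinary sample covariance instead of a robust estimate, and a plain Gaussian kernel instead of the heavier-tailed kernel described above. The bandwidth is the normal reference rule, which is the AMISE-optimal choice for a Gaussian kernel applied to Gaussian data.

```python
import numpy as np


def kde(sample, points):
    """Gaussian-kernel density estimate with a covariance-based metric.

    sample -- array of shape (n, d), e.g. Monte Carlo events
    points -- array of shape (m, d) where the density is evaluated
    """
    sample = np.asarray(sample, dtype=float)
    points = np.asarray(points, dtype=float)
    n, d = sample.shape
    cov = np.cov(sample, rowvar=False).reshape(d, d)   # plain (non-robust) covariance
    inv_cov = np.linalg.inv(cov)
    h = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))       # normal reference bandwidth
    diff = points[:, None, :] - sample[None, :, :]
    # Squared distance in the metric given by the inverse covariance matrix.
    r2 = np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff) / h**2
    norm = n * h**d * np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * r2).sum(axis=1) / norm


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Toy two-dimensional sample standing in for (reconstructed mass, jet ET sum).
    mc = rng.multivariate_normal([175.0, 300.0], [[400.0, 300.0], [300.0, 2500.0]], 2000)
    print(kde(mc, np.array([[175.0, 300.0], [250.0, 300.0]])))
```

    Using the inverse covariance as the metric makes the estimate insensitive to the very different scales of the template variables.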

    The plot at the right is an illustration of 2-dimensional templates based on the invariant top mass and the scalar sum of the transverse energies of the four leading jets. The first and second columns show the templates for Mtop = 150 GeV/c^2 and 200 GeV/c^2, respectively. The top row is for events with the correct jet combination, the second row is for combinations that use the correct jets but with an incorrect assignment, and the last row is for events in which one or more of the four jets are not decay products of the two top candidates. The templates for the three types of backgrounds are shown in the rightmost column.

    5. Likelihood

    We define the likelihood of the observed data sample according to the formula shown in the plot below. In addition to our primary variable, which is the top mass taken from the jet permutation with the lowest χ², the likelihood considers further event observables (e.g. the sum of the transverse energies of the four leading jets).

    In the current version of our analysis we set the background constraint function to 1 and allow the background fraction to float freely. The signal density is composed of the three signal templates for a given generated top mass, using the probabilities of picking the correct permutation and the correct jets, respectively. We model the background density using the templates for the different background types, weighted by background composition coefficients obtained from the ttbar cross section studies.

    The likelihood is not normalized, but we maintain an approximate normalization by estimating the coefficients for picking the correct permutation and the correct jets in a manner which introduces very little dependence on the top mass.
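
    Since the likelihood formula appears only in a figure, the following sketch shows one way such a mixture could be assembled. The way the two probabilities are combined into three template weights is an assumption of the sketch (p_jets for all four jets being correct, p_perm for the permutation being correct given correct jets), as are all function and argument names.

```python
import math


def event_density(x, mass, f_b, p_jets, p_perm,
                  signal_templates, background_templates, bg_coeffs):
    """Per-event density: signal mixture plus weighted background templates.

    signal_templates     -- (correct, wrong permutation, wrong jets), each f(x, mass)
    background_templates -- list of callables f(x)
    bg_coeffs            -- background composition coefficients summing to 1
    """
    f_correct, f_wrong_perm, f_wrong_jets = signal_templates
    signal = (p_jets * p_perm * f_correct(x, mass)
              + p_jets * (1.0 - p_perm) * f_wrong_perm(x, mass)
              + (1.0 - p_jets) * f_wrong_jets(x, mass))
    background = sum(c * f(x) for c, f in zip(bg_coeffs, background_templates))
    return (1.0 - f_b) * signal + f_b * background


def neg_log_likelihood(events, mass, f_b,
                       signal_templates, background_templates, bg_coeffs):
    """events: list of (x, p_jets, p_perm); returns -sum(log density)."""
    total = 0.0
    for x, p_jets, p_perm in events:
        total -= math.log(event_density(x, mass, f_b, p_jets, p_perm,
                                        signal_templates, background_templates,
                                        bg_coeffs))
    return total
```

    Evaluating neg_log_likelihood on the grid of generated top masses, with f_b left free, gives the discrete set of likelihood values that the next step interpolates.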

    Since the likelihood is defined only for a discrete set of top mass values for which signal templates were generated, we introduce continuity and smoothness by using a local quadratic polynomial least squares regression with the top mass as the predictor and the log likelihood as the response. Interpolation is performed for each event separately, and the resulting log likelihood curves are then added to obtain the likelihood of the whole sample. The effect of local regression is illustrated in the Figure on the right.
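
    Local quadratic regression can be sketched in a few lines. The tricube weight function and the bandwidth value are assumptions of this sketch; the text above does not state which weighting the analysis uses.

```python
import numpy as np


def local_quadratic(masses, log_l, m0, bandwidth):
    """Estimate log L at mass m0 from values on a discrete mass grid.

    A quadratic in (m - m0) is fitted by weighted least squares, with tricube
    weights that vanish for |m - m0| >= bandwidth. At least three grid points
    must fall inside the bandwidth for the fit to be determined.
    """
    masses = np.asarray(masses, dtype=float)
    log_l = np.asarray(log_l, dtype=float)
    u = np.abs(masses - m0) / bandwidth
    w = np.where(u < 1.0, (1.0 - u**3) ** 3, 0.0)
    design = np.vander(masses - m0, 3, increasing=True)   # columns 1, dm, dm^2
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], log_l * sw, rcond=None)
    return coef[0]   # fitted value at m = m0


if __name__ == "__main__":
    grid = np.arange(150.0, 205.0, 5.0)
    toy = -0.5 * ((grid - 178.0) / 6.0) ** 2 + np.random.default_rng(2).normal(0, 0.05, grid.size)
    print([round(local_quadratic(grid, toy, m, 15.0), 3) for m in (176.0, 178.0, 180.0)])
```

    Because the fit is repeated at every requested mass, the resulting curve is continuous and smooth even though the input is defined only on the template grid.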

    6. Data Set + Background

    Top event candidates are required to have a lepton (muon or electron) and four jets. If the event has more than four jets, the four jets with the highest transverse energy (ET) are considered for event fitting. ET > 15 GeV is required for the first three jets and ET > 8 GeV for the fourth. Events with the fourth jet in the 8-15 GeV range are called 3.5-jet events. At least one jet must be tagged as a bottom jet by a displaced vertex algorithm. Backgrounds from non-top events come from several sources, mostly from lepton misidentification (non-W), from W bbar events, from W+jets events in which a light quark jet has been tagged as a b jet (mistag), from W or Z events which have the same topology as top events, and from single top production events. The table shows the number of observed events in 162 pb^-1 of data, the number of fitted events and the number of background events for each category.
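
    The jet requirements can be written down as a short sketch. Only the thresholds come from the text above; the function name, the ''4-jet'' label for events whose fourth jet exceeds 15 GeV, and the handling of a fourth jet at exactly 15 GeV are assumptions. Lepton identification and b tagging are reduced to simple flags.

```python
def classify_event(jet_ets, has_lepton, n_btags):
    """Return '4-jet', '3.5-jet', or None if the event fails the selection.

    jet_ets    -- transverse energies of all jets in the event, in GeV
    has_lepton -- True if an electron or muon candidate was found
    n_btags    -- number of jets tagged by the displaced vertex algorithm
    """
    if not has_lepton or n_btags < 1 or len(jet_ets) < 4:
        return None
    leading = sorted(jet_ets, reverse=True)[:4]   # four highest-ET jets
    if min(leading[:3]) <= 15.0 or leading[3] <= 8.0:
        return None
    return "4-jet" if leading[3] > 15.0 else "3.5-jet"


if __name__ == "__main__":
    print(classify_event([62.0, 41.0, 33.0, 12.0, 9.0], True, 1))
```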

    7. Optimization

    For the multivariate template method we seek a set of ''good'' variables which both increase the sensitivity of the likelihood to the top quark mass and improve the discrimination between signal and background. The set of variables should also be optimal with respect to the systematic uncertainties of the measurement.

    It is desirable to make a reasonable pre-selection among them without depending on the results of the CPU-intensive machinery of the whole top mass analysis. Therefore, we investigated the power of various kinematic energy variables (e.g. combinations of sums of transverse jet momenta) and angular/shape variables (like acoplanarity) to distinguish signal events from background events. We evaluated various histogram distance measures (related to the well-known Kullback-Leibler divergence) which quantify the dissimilarity of probability density distributions. Among other things, we calculated a distance we refer to as the ''symmetrized K distance'', which basically sums up the logarithmic differences of two differential distributions. In addition, we calculated the Kolmogorov-Smirnov distance KS (which measures the distance between the cumulative distributions).
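
    Two such histogram distances can be sketched as follows. The exact definition of the ''symmetrized K distance'' is not given in this text, so the sketch assumes the symmetrized Kullback-Leibler (Jeffreys) form, the sum over bins of (p - q)·ln(p/q); the small eps that protects empty bins is also an assumption.

```python
import numpy as np


def normalize(hist):
    """Convert bin contents to bin probabilities."""
    hist = np.asarray(hist, dtype=float)
    return hist / hist.sum()


def symmetrized_kl(h1, h2, eps=1e-12):
    """Assumed form of the symmetrized distance: sum of (p - q) * ln(p / q)."""
    p = normalize(h1) + eps
    q = normalize(h2) + eps
    return float(np.sum((p - q) * np.log(p / q)))


def ks_distance(h1, h2):
    """Largest difference between the two cumulative distributions."""
    return float(np.max(np.abs(np.cumsum(normalize(h1)) - np.cumsum(normalize(h2)))))


if __name__ == "__main__":
    signal = [2, 10, 30, 40, 18]        # toy histograms with identical binning
    background = [25, 35, 25, 10, 5]
    print(symmetrized_kl(signal, background), ks_distance(signal, background))
```

    Both functions expect histograms with identical binning and return 0 for identical shapes, so larger values indicate better signal/background discrimination.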

    The plot at the right shows the numerical values of the distances between signal ttbar MC (HERWIG) and W+Jets background MC (ALPGEN+HERWIG) for various event observables. These and other distance measures not mentioned here lead to similar conclusions: kinematic energy quantities discriminate signal from background better than angle or shape variables, whereas purely leptonic energy variables have almost no discriminative power. We found that the sum of the transverse energies of the four leading jets is the top performer. The invariant top mass has only average power to distinguish signal from background.


    We optimize the JES constraint parameter by generating many pseudoexperiments, each with the number of observed data events having a mass fit. The pseudoexperiments draw events from the ttbar Monte Carlo templates and from the background templates (three different ones: mistag, non-W, and the W+bb+2jets Monte Carlo for the other sources of background). The total number of background events (34% of the 32 fitted events) is obtained from a likelihood fit of the data using a JES constraint of 0.05.

    The optimization is done using 2-dimensional templates, one variable being the reconstructed top mass, the other chosen among the variables discussed above. Four choices for the second variable are shown in the plots. The top plot shows the statistical error versus the JES constraint parameter. The middle plot shows the expected systematic uncertainty due to systematic uncertainties on the jet momenta. The lower plot shows the two uncertainties added in quadrature. The vertical bars show the expected uncertainties for one pseudoexperiment for the four cases at each value of the JES constraint. Based on these results we have chosen the sum of the transverse momenta of the four leading jets as the second variable and a value of 0.07 for the JES constraint. Although for the four cases shown it is not clear that we need to use two-dimensional templates, the choice of the second variable is made mostly on the basis of the variable studies discussed above.

    8. Results

    The likelihood procedure applied to the data (using a JES constraint of 0.07) provides a mass value and a background fraction (shown below) as a function of the top mass. The negative log-likelihood curve is given in the leftmost Figure below. The plot in the middle shows the median and width of the pull distributions as a function of the generated mass, as obtained from pseudoexperiments. The pull width is 1.10 on average and the average bias is very small. We therefore multiply the statistical error obtained from the likelihood fit by a factor of 1.10. The expected statistical error (multiplied by the factor of 1.10) is shown in the plot at right. The triangles show the observed error value. The results of the fit for the top mass and for its statistical error are
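
    The role of the pull width can be illustrated with a toy study. The sketch replaces the template likelihood fit by a simple sample mean with its standard error, so it shows only the mechanics of a pull distribution: if the reported errors are too small by some factor, the pull width exceeds 1 by that factor. All names and numbers in the sketch are illustrative assumptions.

```python
import numpy as np


def pull_distribution(true_mass, resolution, n_events, n_pseudo,
                      error_scale=1.0, seed=0):
    """Pulls (fitted - true) / error from toy pseudoexperiments.

    Each pseudoexperiment draws n_events Gaussian 'masses'; the estimator is
    the sample mean and the error is error_scale times its standard error.
    """
    rng = np.random.default_rng(seed)
    data = rng.normal(true_mass, resolution, size=(n_pseudo, n_events))
    fitted = data.mean(axis=1)
    errors = error_scale * data.std(axis=1, ddof=1) / np.sqrt(n_events)
    return (fitted - true_mass) / errors


if __name__ == "__main__":
    pulls = pull_distribution(178.0, 30.0, 33, 20000, error_scale=1.0 / 1.1)
    print("median:", np.median(pulls), "width:", pulls.std())
```

    In the toy with error_scale = 1/1.1 the pull width is larger by a factor of 1.1 than with correctly estimated errors, which is the situation that multiplying the errors by the pull width is meant to correct.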

    Mtop = 179.6 +6.4/-6.3 GeV/c^2.

    From the background fraction versus top mass plot shown below we extract the most probable background fraction by interpolating between the two templates closest to the obtained mass. The result for the most probable background fraction and for its statistical error is


    fb = 0.34 ± 0.14

    The leftmost plot below shows the mass distribution for the 33 events reconstructed with a JES constraint of 0.07. The points are the data; the other distributions are for the different backgrounds and the ttbar Monte Carlo. The middle plot shows the JES values obtained for each event in the fit with a JES constraint of 0.07. The scatter plot on the right shows the sum of the transverse momenta of the four leading jets versus the reconstructed top mass, superimposed on the expectation from Monte Carlo. The dots represent the data.

    Systematic uncertainties have been determined from many pseudoexperiments of 33 events each, by modifying the default value of a given parameter and evaluating the resulting mass shift. The contributions from the jet energy uncertainties are the largest. The other listed sources contribute little when added in quadrature.


    Last modified: Fri Aug 6 22:53:21 PDT 2004
    Pedro Movilla Fernandez