Data Analysis with Tcl and Histo-Scope

In our work with silicon particle detectors and their electronics at LBNL, we often encounter tasks that require visualization and analysis of large amounts of data. For example, a typical CDF silicon half-ladder is a 512-channel device with 8-bit resolution. The baseline pulse height distribution needed to calibrate this device is a two-dimensional histogram with 512x256 bins. About 10^3 of these histograms have to be analyzed every time the CDF silicon system is calibrated (usually once a week). Another common task is to display silicon ladder readout in real time on a test stand in order to understand how it is affected by various settings of the readout electronics and by different grounding and bypassing schemes.
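To make the size and layout of such a histogram concrete, here is a minimal pure-Tcl sketch of a 512x256 baseline histogram and of the per-channel mean one might extract from it. It deliberately uses no hs or Histo-Scope calls. The procedure names (fill_bin, channel_mean) and the synthetic readings are made up for illustration and are not taken from the CDF calibration code.

```tcl
# Pure-Tcl sketch: no hs or Histo-Scope calls are used here.
set n_channels 512   ;# channels per half-ladder (from the text)
set n_adc      256   ;# 8-bit resolution gives 256 pulse-height values

# Increment the bin for one (channel, adc) reading.
# Bins live in a sparse array keyed by "channel,adc".
proc fill_bin {histVar channel adc} {
    upvar 1 $histVar hist
    set key "$channel,$adc"
    if {[info exists hist($key)]} {
        incr hist($key)
    } else {
        set hist($key) 1
    }
}

# Mean pulse height of one channel, computed from its row of bins.
proc channel_mean {histVar channel n_adc} {
    upvar 1 $histVar hist
    set entries 0
    set sum 0
    for {set adc 0} {$adc < $n_adc} {incr adc} {
        if {[info exists hist($channel,$adc)]} {
            set count $hist($channel,$adc)
            incr entries $count
            incr sum [expr {$count * $adc}]
        }
    }
    if {$entries == 0} {
        error "channel $channel has no entries"
    }
    return [expr {double($sum) / $entries}]
}

# Synthetic readings (hypothetical): every channel cycles through
# ADC values 99, 100, 101.
for {set event 0} {$event < 30} {incr event} {
    for {set ch 0} {$ch < $n_channels} {incr ch} {
        fill_bin baseline $ch [expr {99 + $event % 3}]
    }
}

puts "total bins: [expr {$n_channels * $n_adc}]"
puts "mean pulse height, channel 7: [channel_mean baseline 7 $n_adc]"
```

A loop like this is fine for a demonstration, but interpreted Tcl is slow for about 10^3 histograms of this size.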

In order to solve a variety of such tasks efficiently, we have developed a Tcl interface to Histo-Scope called "hs". Histo-Scope is a free, cross-platform tool for data analysis and visualization developed by the Computing Division at FNAL. It provides a set of highly interactive graphing and plotting widgets, especially notable for its zooming capabilities, dynamic plot updating and animation, ability to "drag-and-drop" one plot into another, and for a superb implementation of the "virtual trackball" method used to rotate two-dimensional histograms and three-dimensional scatter plots. Histo-Scope also provides a portable I/O mechanism for storing histograms and n-tuples. However, in order to be really useful for data acquisition and analysis, Histo-Scope needed a scripting language and a richer set of analysis methods.

The hs Tcl extension addresses this need. It wraps all functions found in the Histo-Scope C API, implements a variety of additional data manipulation procedures, interfaces data fitting and Fourier transform tools from CERNLIB to Histo-Scope data structures, and provides several GUI widgets which assist in common analysis tasks. Together with one of the GNU readline extensions (we use rdl, although tclreadline seems to be more popular), the hs extension forms a convenient environment for various data acquisition and analysis projects. Compared to other free data analysis programs traditionally employed by particle physicists, such as PAW, Mn_Fit, or ROOT, this environment gives a healthy productivity boost to its users because it is based on a well-established scripting language. Once a programmer becomes familiar with the tools, such common service tasks as parsing text files, exception handling, inter-process communication, networking, retrieving data from remote computers, interfacing data acquisition libraries and hardware, database access, GUI building, CGI scripting, sending e-mails, etc., can be coded in Tcl very efficiently (or even become trivial) because the appropriate facilities are built into the language itself, provided by its standard library, or implemented in one of the numerous Tcl extensions.
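As a small illustration of two of these service tasks, parsing a text file and handling errors, here is a sketch in plain Tcl. The file format ("name value" lines with '#' comments), the procedure names, and the sample values are hypothetical. Nothing here depends on the hs extension.

```tcl
# Parse lines of the form "name value"; a leading '#' marks a comment.
# Returns a flat name/value list suitable for "array set".
proc parse_settings {text} {
    set result {}
    set lineno 0
    foreach line [split $text "\n"] {
        incr lineno
        set line [string trim $line]
        if {$line eq "" || [string index $line 0] eq "#"} continue
        if {[scan $line "%s %f" name value] != 2} {
            error "line $lineno: expected \"name value\", got \"$line\""
        }
        lappend result $name $value
    }
    return $result
}

# Read a settings file, closing the channel even if the read fails.
proc read_settings {filename} {
    set chan [open $filename r]
    set failed [catch {read $chan} text]
    close $chan
    if {$failed} { error $text }
    return [parse_settings $text]
}

# Made-up test-stand settings, parsed from a string instead of a file.
set sample "# test-stand settings\nbias_voltage 42.5\n\nthreshold 12\n"
array set settings [parse_settings $sample]
puts "bias_voltage = $settings(bias_voltage)"

# Malformed input raises an ordinary Tcl error that "catch" can trap.
if {[catch {parse_settings "threshold"} msg]} {
    puts "rejected bad input: $msg"
}
```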

By providing an interface to Histo-Scope and elements of CERNLIB, the hs extension brings a high-quality histogramming, data plotting, and fitting facility to Tcl. Although this functionality by itself is not uncommon among various software libraries and data analysis programs, in combination with a large collection of diverse Tcl/Tk packages the hs extension is worth serious consideration as a candidate for your data plotting and analysis tool of choice. At LBNL it has been successfully used in several projects related to the development of silicon particle detectors and their electronics.

Here is a collection of sample plots produced with the hs extension and Histo-Scope. Some GUI snapshots are also available. Note, however, that all Histo-Scope plots and hs GUI widgets are highly dynamic and interactive, so static pictures may fail to give an adequate impression of the capabilities of this tool. You can also take a look at the extension's reference manual.

Since fast number crunching is not among Tcl's strong points, the hs extension avoids speed bottlenecks in the analysis of large data samples by on-the-fly generation, compilation, and dynamic loading of C code. Dynamic loading of FORTRAN code is also supported for certain purposes (in particular, user-provided data fitting functions may be written in FORTRAN).
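The first of these three steps, generating C source from a script, can be sketched in a few lines of plain Tcl. This shows only the general idea and is not the code the hs extension actually uses. The procedure name make_c_filter, the generated function signature, and the assumption that a data row is an array of floats are all hypothetical. Compiling the result into a shared library and loading it are omitted because they depend on the local compiler and platform.

```tcl
# Build the C source for a row filter from a cut expression.
# "name" and "expression" are inserted verbatim, so both must be valid C.
proc make_c_filter {name expression} {
    set template {#include <math.h>

/* Generated code: returns 1 when a row passes the cut. */
int @NAME@(const float *row)
{
    return (@EXPR@) ? 1 : 0;
}
}
    return [string map [list @NAME@ $name @EXPR@ $expression] $template]
}

# Hypothetical cut on the first two columns of a row.
set source [make_c_filter pass_cut {row[0] > 10.0f && fabs(row[1]) < 3.0}]
puts $source
```

The point of the approach is that the user writes the cut as a short expression in a script, while the inner loop over the data runs as compiled code.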

Histo-Scope and HBOOK files can be converted into each other (with some limitations) using the "tohbook" and "hbook2hs" programs which come with the Histo-Scope and hs distributions. If you have some projects for which you use HBOOK, PAW, or Mn_Fit, you may be able to analyze your data with Histo-Scope. Certain simple objects in ROOT files (1d histograms, 2d histograms, n-tuples, and directories) can also be converted to Histo-Scope format using the root2hs program.

The LBNL Histo-Scope distribution addresses some deficiencies found in the latest open source release of Histo-Scope from FNAL (version 4.0). In particular, all known bugs have been fixed, and a variety of improvements have been made to ensure publication-quality output: font management, support for color scale plots, the ability to put simple figures on top of Histo-Scope graphics and to import EPS files into any plot, the use of LaTeX for complex annotations, etc. Still, the simplicity of the underlying data model limits Histo-Scope's applicability. Only datasets which can be adequately mapped into histograms and n-tuples can be efficiently analyzed. Another notable problem is that n-tuples can only be as large as the machine's virtual memory allows.

The LBNL Histo-Scope distribution is available as a collection of precompiled binaries and static libraries for several UNIX platforms and as source code. You should be able to compile the hs extension easily on any computer where Tcl (version 8.3 or newer), Tk, Histo-Scope, and CERNLIB are installed, so it is distributed only as source code. Since CERNLIB is distributed under the terms of the GPL, the hs extension code is also licensed under the GPL, although less restrictive forms of licensing may be considered in the future if CERNLIB changes its licensing scheme. The hs extension can also be compiled and used without CERNLIB and/or Tk, although in these cases some of its functionality will not be available.

Download LBNL Histo-Scope
Download the latest version of the hs Tcl extension


Questions? E-mail to igv@lbl.gov