- This package is a self-contained implementation of functional decomposition (FD), an
- unbinned, parametric solution for fitting mass spectra, conducting searches and
- producing limits. FD decomposes a dataset into a complete set of orthogonal
- functions (the orthonormal exponentials), whose coefficients can be extracted
- from the data by direct computation. It uses a penalized likelihood to
- determine the appropriate number of terms to retain from the infinite series.
- -------------------------
- RUNNING THE EXAMPLE CASES
- -------------------------
- There are three examples provided, with increasing levels of complexity. These
- are:
- bkg_only Decompose a single smooth spectrum using the orthonormal
- exponentials.
- sig_bkg Decompose a smooth spectrum with two known resonances. Use
- the orthonormal exponentials as a background model and two
- Gaussians to model the two known peaks.
- sig_bkg_scan Decompose a smooth spectrum with two known resonances, and
- perform a search for a new resonance. Use the orthonormal
- exponentials for the background, two Gaussians to model the
- two known peaks, and scan a third peak through several masses
- and widths.
- The first two examples should run in only a few minutes. The third example
- tests some ~600 different signal hypotheses, and can take rather longer. On
- my Dell XPS13 9350, using a fairly large sample of 5e7 events, this example
- takes about 1.5 hours to run from scratch.
- There are four main contributions to the run time:
- 1.) Decomposing data: 415s. Linear in the number of input events.
- 2.) Determining hyperparameters: 281s. Linear in the size of the initial
- search grid.
- 3.) Decomposing signal models: 3380s. Linear in the number of signal models
- and the number of events used to simulate each signal.
- 4.) Calculating limits and p-values: ~1200s. Linear in the number of signal
- models.
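As a sanity check, the four timings above add up to roughly the 1.5 hours quoted for the full scan. The snippet below only does that arithmetic; the variable names are illustrative and are not part of FD.

```shell
# Timings quoted in the list above, in seconds (the last one is approximate).
data=415
hyper=281
signal=3380
limits=1200

# Total run time in seconds.
total=$((data + hyper + signal + limits))

# Convert to hours with two decimals (the shell has integer arithmetic only).
hours=$(awk -v t="$total" 'BEGIN { printf "%.2f", t / 3600 }')

echo "total: ${total} s (${hours} h)"
```

Because each contribution is linear in its inputs, halving the number of signal models should roughly halve the two largest terms.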
- Check the config files for each example (located in <example>/base.conf). The
- comments describe the various parameters and their function. You probably want
- to adjust 'Nthread' to match your number of CPU cores before running the
- examples. The remaining parameters should require no adjustment (but feel free
- to play around).
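To pick a value for 'Nthread', you can ask the system how many CPU cores it has. This is a minimal sketch that assumes either GNU 'nproc' or 'getconf' is available. It only prints the number and does not touch base.conf, so set 'Nthread' by hand.

```shell
# Count the online CPU cores. nproc comes with GNU coreutils;
# getconf is used as a fallback on systems without it.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

echo "Suggested Nthread: ${cores}"
```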
- Each of the examples can be run as follows:
- 1. Enter the FD directory and set up the code:
- cd <FD_DIR>
- . bin/setup_bash.sh
- This will also check if the required Python packages are available, and warn
- you if they are not.
- 2. Enter the example directory and import / generate the test data:
- cd Examples/<Example_Name>
- fd_generate.py --setname Test --varname Myy --wgtname weight --size 12000000
- fd_import.py ../InputSignalData/*
- 'fd_generate.py' produces a random sample of background-like data. Change the
- size parameter if you want more/fewer events. 'fd_import.py' imports several
- datasets from the CSV files in 'InputSignalData'. These contain Gaussian
- signal shapes that can be injected into the background-like sample to
- simulate resonances.
- 3. Run the scan:
- fd_scan.py
- In a real application, you would use 'fd_import.py' to read in your data (either
- as .csv or .root). Note that for the 'bkg_only' example, 'fd_import.py' is
- unnecessary and can be skipped.
- That's it! The output will be located in 'Output/*'. Each plot is saved as an
- individual pdf file, and additionally all plots are saved together as a
- multipage pdf in 'Output/all.pdf'.
- One last thing to note: all decompositions and likelihood calculations are
- cached on disk in '<example>/Cache/*'. Any repeated computations will hit the
- disk cache, which vastly speeds things up when making small changes (e.g. plot
- tweaks or including additional signal models).
- The cache files are named using all relevant parameters along with a checksum of
- the dataset. This ensures that if you change parameters or cuts, the cache will
- not accidentally use values computed using a different configuration. If you'd
- like to force FD to completely re-run from scratch, just delete the Cache
- directory. It will be automatically re-created and re-populated the next time
- 'fd_scan.py' is run.
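Clearing the cache can be sketched as follows. This assumes you are in the example directory and that the cache lives in a subdirectory named 'Cache', as described above. The size report is only for information.

```shell
# Cache directory, relative to the example directory (assumed location).
cache_dir="Cache"

if [ -d "$cache_dir" ]; then
    # Show how much disk space the cache uses before removing it.
    du -sh "$cache_dir"
    rm -rf "$cache_dir"
else
    echo "No cache directory found, nothing to remove."
fi
```

The next run of 'fd_scan.py' then recomputes everything, so expect the full run time again.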