|  | 
- This package is a self-contained implementation of functional decomposition (FD), an
 
- unbinned, parametric solution for fitting mass spectra, conducting searches and
 
- producing limits. FD decomposes a dataset into a complete set of orthogonal
 
- functions (the orthonormal exponentials), whose coefficients can be extracted
 
- from the data by direct computation.  It uses a penalized likelihood to
 
- determine the appropriate number of terms to retain from the infinite series.
 
- -------------------------
 
- RUNNING THE EXAMPLE CASES
 
- -------------------------
 
- There are three examples provided, with increasing levels of complexity. These
 
- are:
 
-    bkg_only      Decompose a single smooth spectrum using the orthonormal
 
-                  exponentials.
 
-    sig_bkg       Decompose a smooth spectrum with two known resonances.  Use
 
-                  the orthonormal exponentials as a background model and two
 
-                  Gaussians to model the two known peaks.
 
-    sig_bkg_scan  Decompose a smooth spectrum with two known resonances, and
 
-                  perform a search for a new resonance. Use the orthonormal
 
-                  exponentials for the background, two Gaussians to model the
 
-                  two known peaks, and scan a third peak through several masses
 
-                  and widths.
 
- The first two examples should run in only a few minutes.  The third example
 
- tests roughly 600 different signal hypotheses, and can take rather longer. On
 
- my Dell XPS13 9350, using a fairly large sample of 5e7 events, this example
 
- takes about 1.5 hours to run from scratch.
 
- There are four main contributions to the run time:
 
-   1.) Decomposing data: 415s. Linear in the number of input events.
 
-   2.) Determining hyperparameters: 281s. Linear in the size of the initial
 
-       search grid.
 
-   3.) Decomposing signal models: 3380s.  Linear in the number of signal models
 
-       and the number of events used to simulate each signal.
 
-   4.) Calculating limits and p-values: ~1200s.  Linear in the number of signal
 
-       models.
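
As a quick sanity check, the stage timings listed above can be added up and
compared with the quoted total of about 1.5 hours. This is only an illustrative
tally; the dictionary keys are shorthand labels, not names used by FD.

```python
# Per-stage wall-clock times quoted above, in seconds (the last is approximate).
stage_seconds = {
    "decompose data": 415,
    "determine hyperparameters": 281,
    "decompose signal models": 3380,
    "limits and p-values": 1200,
}

total_seconds = sum(stage_seconds.values())
total_hours = total_seconds / 3600.0

# Print each stage with its share of the total run time.
for stage, seconds in stage_seconds.items():
    share = 100.0 * seconds / total_seconds
    print(f"{stage:28s} {seconds:5d} s  ({share:4.1f}%)")
print(f"{'total':28s} {total_seconds:5d} s  (~{total_hours:.1f} h)")
```

Decomposing the signal models accounts for nearly two thirds of the total, so
reducing the number of signal hypotheses is the most effective way to shorten a
run.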
 
- Check the config files for each example (located in <example>/base.conf). The
 
- comments describe the various parameters and their function.  You probably want
 
- to adjust 'Nthread' to match your number of CPU cores before running the
 
- examples.  The remaining parameters should require no adjustment (but feel free
 
- to play around).
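
If you are unsure how many cores your machine has, a short Python snippet can
report it. This is a minimal sketch: 'suggested_nthread' is a hypothetical
helper, not part of FD, and os.cpu_count() reports logical CPUs, which may
exceed the number of physical cores on machines with hyperthreading.

```python
import os


def suggested_nthread():
    """Return the number of logical CPUs, falling back to 1 if it is unknown."""
    return os.cpu_count() or 1


print("Suggested Nthread:", suggested_nthread())
```

Copy the printed value into the 'Nthread' entry of <example>/base.conf by hand.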
 
- Each of the examples can be run as follows:
 
- 1. Enter the FD directory and set up the code:
 
-        cd <FD_DIR>
 
-        . bin/setup_bash.sh
 
-    This will also check if the required Python packages are available, and warn
 
-    you if they are not.
 
- 2. Enter the example directory and import / generate the test data:
 
-        cd Examples/<Example_Name>
 
-        fd_generate.py --setname Test --varname Myy --wgtname weight --size 12000000
 
-        fd_import.py ../InputSignalData/*
 
-    'fd_generate.py' produces a random sample of background-like data. Change the
 
-    size parameter if you want more/fewer events.  'fd_import.py' imports several
 
-    datasets from the CSV files in 'InputSignalData'.  These contain Gaussian
 
-    signal shapes that can be injected into the background-like sample to
 
-    simulate resonances.
 
- 3. Run the scan:
 
-        fd_scan.py
 
- In a real application, you would use 'fd_import.py' to read in your data (either
 
- as .csv or .root).  Note that for the 'bkg_only' example, 'fd_import.py' is
 
- unnecessary and can be skipped.
 
- That's it! The output will be located in 'Output/*'.  Each plot is saved as an
 
- individual pdf file, and additionally all plots are saved together as a
 
- multipage pdf in 'Output/all.pdf'.
 
- One last thing to note:  all decompositions and likelihood calculations are
 
- cached on disk in '<example>/Cache/*'.  Any repeated computations will hit the
 
- disk cache, which vastly speeds things up when making small changes (e.g. plot
 
- tweaks or including additional signal models).
 
- The cache files are named using all relevant parameters along with a checksum of
 
- the dataset.  This ensures that if you change parameters or cuts, the cache will
 
- not accidentally use values computed using a different configuration.  If you'd
 
- like to force FD to completely re-run from scratch, just delete the Cache
 
- directory. It will be automatically re-created and re-populated the next time
 
- 'fd_scan.py' is run.
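
The general idea of keying a cache on both the parameters and a checksum of the
data can be sketched as follows. This is not FD's actual naming scheme: the
function 'cache_file_name', the parameter names 'Nbasis' and 'Lambda', the
'.cache' suffix and the choice of SHA-256 are all assumptions made for
illustration.

```python
import hashlib


def cache_file_name(params, dataset_bytes):
    """Build a hypothetical cache file name from parameters plus a data checksum."""
    # Sort the parameters so the name does not depend on dictionary order.
    param_part = "_".join(f"{key}={params[key]}" for key in sorted(params))
    # A short prefix of the SHA-256 digest identifies the dataset contents.
    checksum = hashlib.sha256(dataset_bytes).hexdigest()[:16]
    return f"{param_part}_{checksum}.cache"


data = b"101.2,1.0\n125.0,1.0\n"
name_a = cache_file_name({"Nbasis": 32, "Lambda": 0.5}, data)
name_b = cache_file_name({"Lambda": 0.5, "Nbasis": 32}, data)
name_c = cache_file_name({"Nbasis": 64, "Lambda": 0.5}, data)
name_d = cache_file_name({"Nbasis": 32, "Lambda": 0.5}, data + b"130.0,1.0\n")

print(name_a == name_b)  # same parameters and data: same cache file
print(name_a == name_c)  # changed parameter: different cache file
print(name_a == name_d)  # changed dataset: different cache file
```

With a scheme like this, changing either a parameter or the dataset produces a
new file name, so stale results are never picked up by mistake.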
 
 
  |