Workspace Manipulations#
In this chapter, you will learn about various workspace manipulations, including how to convert from HistFactory XML+ROOT workspaces to pyhf. We’ll cover some common pitfalls, such as the locations of the ROOT files and how to set the base path for the conversion.
Getting the XML+ROOT#
Note, getting the XML+ROOT won’t necessarily be covered as part of the tutorial as it requires ROOT (though ROOT is installed in the Binder instance).
If you want to practice extracting out the HistFactory files from the workspace, first create the workspace like so:
# Need to be in the directory containing config directory
from os import chdir
from pathlib import Path
_top_level_dir = Path.cwd()
chdir(_top_level_dir.joinpath("data", "multichannel_histfactory"))
! hist2workspace config/example.xml
[#2] INFO:HistFactory -- hist2workspace is less verbose now. Use -v and -vv for more details.
[#2] PROGRESS:HistFactory -- Getting histogram ./data/data.root:/signal_data
[#2] PROGRESS:HistFactory -- Getting histogram ./data/data.root:/signal_signal
[#2] PROGRESS:HistFactory -- Getting histogram ./data/data.root:/signal_bkg
[#2] PROGRESS:HistFactory -- Getting histogram ./data/data.root:/signal_bkgerr
[#2] PROGRESS:HistFactory -- Getting histogram ./data/data.root:/control_data
[#2] PROGRESS:HistFactory -- Getting histogram ./data/data.root:/control_bkg
[#2] PROGRESS:HistFactory -- Getting histogram ./data/data.root:/control_bkgerr
[#2] PROGRESS:HistFactory -- Starting to process channel: channel1
[#2] PROGRESS:HistFactory --
-----------------------------------------
Starting to process 'channel1' channel with 1 observables
-----------------------------------------
[#2] PROGRESS:HistFactory --
-----------------------------------------
import model into workspace
-----------------------------------------
[#2] PROGRESS:HistFactory -- Writing sample: signal
[#2] PROGRESS:HistFactory -- Writing sample: bkg
[#2] PROGRESS:HistFactory -- Saved all histograms
[#2] PROGRESS:HistFactory -- Saved Measurement
[#2] PROGRESS:HistFactory -- Successfully wrote channel to file
[#2] PROGRESS:HistFactory -- Starting to process channel: channel2
[#2] PROGRESS:HistFactory --
-----------------------------------------
Starting to process 'channel2' channel with 1 observables
-----------------------------------------
[#2] PROGRESS:HistFactory --
-----------------------------------------
import model into workspace
-----------------------------------------
WARNING: Can't find parameter of interest: SigXsecOverSM in Workspace. Not setting in ModelConfig.
[#2] PROGRESS:HistFactory -- Writing sample: bkg
[#2] PROGRESS:HistFactory -- Saved all histograms
[#2] PROGRESS:HistFactory -- Saved Measurement
[#2] PROGRESS:HistFactory -- Successfully wrote channel to file
[#2] PROGRESS:HistFactory --
-----------------------------------------
Entering combination
-----------------------------------------
[#2] PROGRESS:HistFactory -- Merging data for channel channel1
[#2] PROGRESS:HistFactory -- Merging data for channel channel2
[#2] PROGRESS:HistFactory --
-----------------------------------------
Importing combined model
-----------------------------------------
[#2] PROGRESS:HistFactory --
-----------------------------------------
create toy data for channelCat[channel1,channel2]
-----------------------------------------
[#2] PROGRESS:HistFactory -- Writing combined workspace to file: ./results/example_combined_GaussExample_model.root
[#2] PROGRESS:HistFactory -- Writing combined measurement to file: ./results/example_combined_GaussExample_model.root
[#2] PROGRESS:HistFactory -- Writing sample: signal
[#2] PROGRESS:HistFactory -- Writing sample: bkg
[#2] PROGRESS:HistFactory -- Writing sample: bkg
[#2] PROGRESS:HistFactory -- Saved all histograms
[#2] PROGRESS:HistFactory -- Saved Measurement
and you’ll notice a few new files being made!
$ ls -lhF results/
total 136K
-rw-r--r-- 1 jovyan jovyan 40K Nov 8 21:01 example_channel1_GaussExample_model.root
-rw-r--r-- 1 jovyan jovyan 38K Nov 8 21:01 example_channel2_GaussExample_model.root
-rw-r--r-- 1 jovyan jovyan 47K Nov 8 21:01 example_combined_GaussExample_model.root
-rw-r--r-- 1 jovyan jovyan 503 Nov 8 21:01 example_GaussExample.root
-rw-r--r-- 1 jovyan jovyan 26 Nov 8 21:01 example_results.table
! ls -lhF results/
total 116K
-rw-r--r-- 1 root root 503 Jul 4 10:36 example_GaussExample.root
-rw-r--r-- 1 root root 38K Jul 4 10:36 example_channel1_GaussExample_model.root
-rw-r--r-- 1 root root 22K Jul 4 10:36 example_channel2_GaussExample_model.root
-rw-r--r-- 1 root root 44K Jul 4 10:36 example_combined_GaussExample_model.root
-rw-r--r-- 1 root root 26 Jul 4 10:36 example_results.table
In particular, example_combined_GaussExample_model.root is the file that contains the RooStats::HistFactory::Measurement object:
$ root results/example_combined_GaussExample_model.root
------------------------------------------------------------
| Welcome to ROOT 6.18/04 https://root.cern |
| (c) 1995-2019, The ROOT Team |
| Built for macosx64 on Sep 11 2019, 15:38:23 |
| From tags/v6-18-04@v6-18-04 |
| Try '.help', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------
root [0]
Attaching file results/example_combined_GaussExample_model.root as _file0...
RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby
Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
All rights reserved, please read http://roofit.sourceforge.net/license.txt
(TFile *) 0x7ffaa30d2130
root [1] .ls
TFile** results/example_combined_GaussExample_model.root
TFile* results/example_combined_GaussExample_model.root
KEY: RooWorkspace combined;1 combined
KEY: TProcessID ProcessID0;1 e1e9272e-fddb-11ea-86b3-1556a8c0beef
KEY: TDirectoryFile channel1_hists;1 channel1_hists
KEY: TDirectoryFile channel2_hists;1 channel2_hists
KEY: RooStats::HistFactory::Measurement GaussExample;1
from which you can extract out the necessary XML files as well:
root [2] GaussExample->PrintXML()
Printing XML Files for measurement: GaussExample
Printing XML Files for channel: channel1
Finished printing XML files
Printing XML Files for channel: channel2
Finished printing XML files
Finished printing XML files
To do this programmatically, you can either write a ROOT macro
// printXML.C
int printXML() {
TFile* _file0 = TFile::Open("results/example_combined_GaussExample_model.root");
_file0->Get<RooStats::HistFactory::Measurement>("GaussExample")->PrintXML();
return 0;
}
and run it
$ root -l -b -q printXML.C
but we can also do the same with PyROOT in as many lines
import ROOT
_file0 = ROOT.TFile.Open("results/example_combined_GaussExample_model.root")
_file0.GaussExample.PrintXML()
Welcome to JupyROOT 6.28/04
[#2] PROGRESS:HistFactory -- Printing XML Files for measurement: GaussExample
[#2] PROGRESS:HistFactory -- Printing XML Files for channel: channel1
[#2] PROGRESS:HistFactory -- Finished printing XML files
[#2] PROGRESS:HistFactory -- Printing XML Files for channel: channel2
[#2] PROGRESS:HistFactory -- Finished printing XML files
[#2] PROGRESS:HistFactory -- Finished printing XML files
which dumps them into the same directory you ran from:
$ ls -lhF
total 24K
drwxr-xr-x 2 jovyan jovyan 4.0K Nov 8 19:52 config/
drwxr-xr-x 2 jovyan jovyan 4.0K Nov 8 19:52 data/
-rw-r--r-- 1 jovyan jovyan 1.1K Nov 8 21:01 GaussExample_channel1.xml
-rw-r--r-- 1 jovyan jovyan 794 Nov 8 21:01 GaussExample_channel2.xml
-rw-r--r-- 1 jovyan jovyan 459 Nov 8 21:01 GaussExample.xml
drwxr-xr-x 2 jovyan jovyan 4.0K Nov 8 21:01 results/
! ls -lhF
total 24K
-rw-r--r-- 1 root root 458 Jul 4 10:36 GaussExample.xml
-rw-r--r-- 1 root root 1.1K Jul 4 10:36 GaussExample_channel1.xml
-rw-r--r-- 1 root root 793 Jul 4 10:36 GaussExample_channel2.xml
drwxr-xr-x 2 root root 4.0K Jul 4 10:21 config/
drwxr-xr-x 2 root root 4.0K Jul 4 10:21 data/
drwxr-xr-x 2 root root 4.0K Jul 4 10:36 results/
chdir(_top_level_dir)
XML to JSON#
via the command line#
So pyhf comes with a lot of nifty utilities you can access. The documentation for the command line can be found via pyhf --help or online.
! pyhf --help
Usage: pyhf [OPTIONS] COMMAND [ARGS]...
Top-level CLI entrypoint.
Options:
--version Show the version and exit.
--cite, --citation Print the bibtex citation for this software
-h, --help Show this message and exit.
Commands:
cls Compute CLs value(s) for a given pyhf workspace.
combine Combine two workspaces into a single workspace.
completions Generate shell completion code.
contrib Contrib experimental operations.
digest Use hashing algorithm to calculate the workspace digest.
fit Perform a maximum likelihood fit for a given pyhf workspace.
inspect Inspect a pyhf JSON document.
json2xml Convert pyhf JSON back to XML + ROOT files.
patchset Operations involving patchsets.
prune Prune components from the workspace.
rename Rename components of the workspace.
sort Sort the workspace.
xml2json Entrypoint XML: The top-level XML file for the PDF...
Let’s focus for now on pyhf xml2json, which requires that you have installed pyhf[xmlio] (pyhf with the xmlio option).
python -m pip install pyhf[xmlio]
Again, the online documentation covers this command as well.
! pyhf xml2json --help
Usage: pyhf xml2json [OPTIONS] ENTRYPOINT_XML
Entrypoint XML: The top-level XML file for the PDF definition.
Options:
--basedir PATH The base directory for the XML files to
point relative to.
-v, --mount PATH:PATH Consists of two fields, separated by a colon
character ( : ). The first field is the
local path to where files are located, the
second field is the path where the file or
directory are saved in the XML
configuration. This is similar in spirit to
Docker.
--output-file TEXT The location of the output json file. If not
specified, prints to screen.
--track-progress / --hide-progress
--validation-as-error / --validation-as-warning
-h, --help Show this message and exit.
Let’s remind ourselves of what the top-level XML file looks like, as this is the ENTRYPOINT_XML.
! tail -n +15 data/multichannel_histfactory/config/example.xml | cat -n
1 <!DOCTYPE Combination SYSTEM 'HistFactorySchema.dtd'>
2
3 <Combination OutputFilePrefix="./results/example">
4 <Input>./config/example_signal.xml</Input>
5 <Input>./config/example_control.xml</Input>
6 <Measurement Name="GaussExample" Lumi="1." LumiRelErr="0.1" ExportOnly="True">
7 <POI>SigXsecOverSM</POI>
8 <ParamSetting Const="True">Lumi</ParamSetting>
9 </Measurement>
10 </Combination>
So to explain these options:
basedir specifies the base directory that all paths in the XML files are relative to. As you can see from lines 3, 4, and 5, this should be the directory containing results/ and config/.
output-file specifies the output JSON file. If one is not specified, the JSON is printed to the screen, which you can redirect into a file if you want (pyhf xml2json ... > workspace.json).
hide-progress disables the progress bars when running the script… but we like progress bars 🙂
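To make the role of basedir concrete, here is a minimal sketch that lists the Input entries of a top-level file and joins them onto a base directory. It uses only the standard library and an inline copy of the XML shown above (with the DOCTYPE line left out). The helper name resolve_inputs is hypothetical, and this is not claimed to be how pyhf itself resolves paths.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Inline copy of the top-level XML shown above (DOCTYPE line omitted).
TOP_LEVEL_XML = """
<Combination OutputFilePrefix="./results/example">
  <Input>./config/example_signal.xml</Input>
  <Input>./config/example_control.xml</Input>
  <Measurement Name="GaussExample" Lumi="1." LumiRelErr="0.1" ExportOnly="True">
    <POI>SigXsecOverSM</POI>
    <ParamSetting Const="True">Lumi</ParamSetting>
  </Measurement>
</Combination>
"""


def resolve_inputs(xml_text, basedir):
    """Hypothetical helper: join each <Input> path onto basedir."""
    root = ET.fromstring(xml_text.strip())
    base = PurePosixPath(basedir)
    return [str(base / element.text.strip()) for element in root.findall("Input")]


resolved = resolve_inputs(TOP_LEVEL_XML, "data/multichannel_histfactory")
for path in resolved:
    print(path)
```

If basedir pointed at the config/ directory instead, the joined paths would contain config/config/ and the channel files would not be found, which is the pitfall this option exists to avoid.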
Let’s go ahead and run this command, but we won’t specify the output file so it goes to the screen. We’ll also disable the progress tracking, just so we have a nicer output for this tutorial.
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | cat -n
1 {
2 "channels": [
3 {
4 "name": "channel1",
5 "samples": [
6 {
7 "data": [
8 10.0,
9 35.0
10 ],
11 "modifiers": [
12 {
13 "data": null,
14 "name": "SigXsecOverSM",
15 "type": "normfactor"
16 }
17 ],
18 "name": "signal"
19 },
20 {
21 "data": [
22 100.0,
23 150.0
24 ],
25 "modifiers": [
26 {
27 "data": null,
28 "name": "lumi",
29 "type": "lumi"
30 },
31 {
32 "data": [
33 10.000000149011612,
34 10.000000521540642
35 ],
36 "name": "uncorrshape_signal",
37 "type": "shapesys"
38 }
39 ],
40 "name": "bkg"
41 }
42 ]
43 },
44 {
45 "name": "channel2",
46 "samples": [
47 {
48 "data": [
49 200.0,
50 350.0
51 ],
52 "modifiers": [
53 {
54 "data": null,
55 "name": "lumi",
56 "type": "lumi"
57 },
58 {
59 "data": [
60 5.000000074505806,
61 10.000000055879354
62 ],
63 "name": "uncorrshape_control",
64 "type": "shapesys"
65 }
66 ],
67 "name": "bkg"
68 }
69 ]
70 }
71 ],
72 "measurements": [
73 {
74 "config": {
75 "parameters": [
76 {
77 "auxdata": [
78 1.0
79 ],
80 "bounds": [
81 [
82 0.5,
83 1.5
84 ]
85 ],
86 "fixed": true,
87 "inits": [
88 1.0
89 ],
90 "name": "lumi",
91 "sigmas": [
92 0.1
93 ]
94 },
95 {
96 "bounds": [
97 [
98 0.0,
99 10.0
100 ]
101 ],
102 "inits": [
103 1.0
104 ],
105 "name": "SigXsecOverSM"
106 }
107 ],
108 "poi": "SigXsecOverSM"
109 },
110 "name": "GaussExample"
111 }
112 ],
113 "observations": [
114 {
115 "data": [
116 110.0,
117 155.0
118 ],
119 "name": "channel1"
120 },
121 {
122 "data": [
123 205.0,
124 345.0
125 ],
126 "name": "channel2"
127 }
128 ],
129 "version": "1.0.0"
130 }
Only 130 lines for the entire workspace! Not too shabby. If we look through a couple of pieces:
line 2: specify a list of channels
line 5: specify the samples for channel1
lines 6-10: specify the expected event rate for the signal sample in channel1
line 11: specify a list of modifiers (e.g. parameters that modify the sample)
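The same pieces can be reached programmatically once the JSON is loaded. The sketch below assumes a trimmed copy of the output above (only the signal sample of channel1) held in a string; in practice you would load the file written with --output-file using json.load.

```python
import json

# Trimmed copy of the xml2json output above: channel1 with its signal sample only.
WORKSPACE_JSON = """
{
  "channels": [
    {
      "name": "channel1",
      "samples": [
        {
          "data": [10.0, 35.0],
          "modifiers": [
            {"data": null, "name": "SigXsecOverSM", "type": "normfactor"}
          ],
          "name": "signal"
        }
      ]
    }
  ]
}
"""

spec = json.loads(WORKSPACE_JSON)
channel = spec["channels"][0]  # line 2: the list of channels
sample = channel["samples"][0]  # line 5: the samples of channel1
rates = sample["data"]  # lines 6-10: expected event rates per bin
modifier_names = [m["name"] for m in sample["modifiers"]]  # line 11: modifiers

print(channel["name"], sample["name"], rates, modifier_names)
```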
Similarly, if we continue down to the second half of this JSON, we hit line 72, which specifies a list of measurements for this workspace. In fact, we only have one measurement, called GaussExample, with the parameter of interest defined as SigXsecOverSM. This measurement also specifies additional parameter configuration, such as details for the luminosity modifier (parameter name lumi).
Nearly at the end, the next part of this specification is for the observations (observed data) on line 113. Each observation corresponds to a channel, where channel1 has two bins, and channel2 also has two bins.
Finally, we have a version, which specifies the version of the schema used for the JSON HistFactory. In this case, we’re using 1.0.0, whose workspace definition is at https://pyhf.readthedocs.io/en/v0.7.5/schemas/1.0.0/workspace.json, which in turn refers to https://pyhf.readthedocs.io/en/v0.7.5/schemas/1.0.0/defs.json.
What’s really nice about the schema definition is that it allows anyone to write their own tooling/scripting to build up the workspace and quickly check if it matches the schema. This will get you 90% of the way there in having a valid workspace to work with.
There are some additional checks that cannot be done by the schema, such as detecting name conflicts or ensuring that all samples in a channel have the same binning structure. The good news is that these checks can be done simply by loading the workspace into a pyhf.Workspace object, which will do the schema validation as well as the additional checks.
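As an illustration of the kind of check a JSON schema alone cannot express, the sketch below verifies that every sample in a channel has the same number of bins as that channel’s observation. The function name check_binning is hypothetical and the logic is a simplified stand-in, not pyhf’s own validation code.

```python
def check_binning(spec):
    """Hypothetical helper: return a list of binning problems found in spec."""
    observed_bins = {obs["name"]: len(obs["data"]) for obs in spec["observations"]}
    problems = []
    for channel in spec["channels"]:
        expected = observed_bins.get(channel["name"])
        if expected is None:
            problems.append(f"{channel['name']}: no observation")
            continue
        for sample in channel["samples"]:
            nbins = len(sample["data"])
            if nbins != expected:
                problems.append(
                    f"{channel['name']}/{sample['name']}: {nbins} bins, expected {expected}"
                )
    return problems


# Reduced spec using the channel1 yields from the workspace above.
good_spec = {
    "channels": [
        {"name": "channel1", "samples": [
            {"name": "signal", "data": [10.0, 35.0]},
            {"name": "bkg", "data": [100.0, 150.0]},
        ]},
    ],
    "observations": [{"name": "channel1", "data": [110.0, 155.0]}],
}
# Same spec with one bin dropped from the bkg sample (deliberately broken).
bad_spec = {
    "channels": [
        {"name": "channel1", "samples": [
            {"name": "signal", "data": [10.0, 35.0]},
            {"name": "bkg", "data": [100.0]},
        ]},
    ],
    "observations": good_spec["observations"],
}

good_problems = check_binning(good_spec)
bad_problems = check_binning(bad_spec)
print(good_problems)
print(bad_problems)
```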
Speaking of pyhf.Workspace objects…
via the python interface#
Let’s do the exact same thing, but from the Python interpreter:
import pyhf
import pyhf.readxml # not imported by default!
spec = pyhf.readxml.parse(
"data/multichannel_histfactory/config/example.xml", "data/multichannel_histfactory"
)
So we’re not going to dump this out. We already did that above. Let’s just quickly go ahead and load it into a pyhf.Workspace object because we can.
ws = pyhf.Workspace(spec)
print(f" channels: {ws.channels}")
print(f" nbins: {ws.channel_nbins}")
print(f" samples: {ws.samples}")
print(f" modifiers: {ws.modifiers}")
print(f"observations: {ws.observations}")
channels: ['channel1', 'channel2']
nbins: {'channel1': 2, 'channel2': 2}
samples: ['bkg', 'signal']
modifiers: [('SigXsecOverSM', 'normfactor'), ('lumi', 'lumi'), ('uncorrshape_control', 'shapesys'), ('uncorrshape_signal', 'shapesys')]
observations: {'channel1': [110.0, 155.0], 'channel2': [205.0, 345.0]}
Already, we’re seeing a lot of information about this workspace as it’s rather inspectable. Remember, this is not a model. What we call a ‘model’ is the combination of the channel specification with a measurement… that is, a workspace together with one of its measurements uniquely defines a model. A model might choose a particular parameter of interest to measure or set specific parameters as constant during the fit. These configurations are all stored in the measurements key we saw above. We’ll explore more about models in the next chapter.
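To see what a measurement contributes, the sketch below reads the GaussExample entry from a copy of the measurements block shown earlier and pulls out the parameter of interest and the parameters marked as fixed. The helper get_measurement is hypothetical; it only walks the dictionary and builds no model.

```python
# Copy of the "measurements" block from the JSON output above.
measurements = [
    {
        "name": "GaussExample",
        "config": {
            "poi": "SigXsecOverSM",
            "parameters": [
                {
                    "name": "lumi",
                    "auxdata": [1.0],
                    "bounds": [[0.5, 1.5]],
                    "fixed": True,
                    "inits": [1.0],
                    "sigmas": [0.1],
                },
                {"name": "SigXsecOverSM", "bounds": [[0.0, 10.0]], "inits": [1.0]},
            ],
        },
    }
]


def get_measurement(all_measurements, name):
    """Hypothetical helper: return the measurement with the given name."""
    for measurement in all_measurements:
        if measurement["name"] == name:
            return measurement
    raise KeyError(name)


config = get_measurement(measurements, "GaussExample")["config"]
poi = config["poi"]
# Parameters without a "fixed" key are treated as floating in this sketch.
fixed = [p["name"] for p in config["parameters"] if p.get("fixed", False)]
print(f"poi: {poi}, fixed parameters: {fixed}")
```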
Let’s move on to more things we can do with the command line.
Workspace Inspection#
Now that we have a working command for converting our XML to JSON, let’s go ahead and take advantage of the JSON output by piping it to pyhf inspect, which will print out a nice summary of our workspace.
! pyhf inspect --help
Usage: pyhf inspect [OPTIONS] [WORKSPACE]
Inspect a pyhf JSON document.
Example:
.. code-block:: shell
$ curl -sL https://raw.githubusercontent.com/scikit-
hep/pyhf/main/docs/examples/json/2-bin_1-channel.json | pyhf inspect
Summary ------------------ channels 1
samples 2 parameters 2 modifiers 2
channels nbins ---------- ----- singlechannel
2
samples ---------- background
signal
parameters constraint modifiers ----------
---------- ---------- mu
unconstrained normfactor uncorr_bkguncrt
constrained_by_poisson shapesys
measurement poi parameters ----------
---------- ---------- (*) Measurement mu
(none)
Options:
--output-file TEXT The location of the output json file. If not specified,
prints to screen.
--measurement TEXT
-h, --help Show this message and exit.
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf inspect
Summary
------------------
channels 2
samples 2
parameters 4
modifiers 4
channels nbins
---------- -----
channel1 2
channel2 2
samples
----------
bkg
signal
parameters constraint modifiers
---------- ---------- ----------
SigXsecOverSM unconstrained normfactor
lumi constrained_by_normal lumi
uncorrshape_control constrained_by_poisson shapesys
uncorrshape_signal constrained_by_poisson shapesys
measurement poi parameters
---------- ---------- ----------
(*) GaussExample SigXsecOverSM lumi,SigXsecOverSM
Immediately, we get a lot of useful information. We can see the number of channels, samples, parameters, and modifiers. Then we get a breakdown of the channels (and the number of bins for each channel), the samples, and the parameters. Finally, we see a list of measurements defined in the workspace, with the (*) denoting the default measurement if one is not specified.
Could the number of parameters and modifiers differ?
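The channel, sample, and modifier counts in the summary can be reproduced by walking the JSON. The sketch below assumes a reduced copy of the workspace that keeps only names and modifier types; it deliberately does not compute the parameter count, which depends on the model that pyhf builds.

```python
# Reduced copy of the workspace above: names and modifier types only.
channels = [
    {"name": "channel1", "samples": [
        {"name": "signal", "modifiers": [
            {"name": "SigXsecOverSM", "type": "normfactor"},
        ]},
        {"name": "bkg", "modifiers": [
            {"name": "lumi", "type": "lumi"},
            {"name": "uncorrshape_signal", "type": "shapesys"},
        ]},
    ]},
    {"name": "channel2", "samples": [
        {"name": "bkg", "modifiers": [
            {"name": "lumi", "type": "lumi"},
            {"name": "uncorrshape_control", "type": "shapesys"},
        ]},
    ]},
]

channel_names = sorted(c["name"] for c in channels)
# Samples and modifiers are counted once even if they appear in several channels.
sample_names = sorted({s["name"] for c in channels for s in c["samples"]})
modifiers = sorted(
    {(m["name"], m["type"]) for c in channels for s in c["samples"] for m in s["modifiers"]}
)

print(f"channels  {len(channel_names)}")
print(f"samples   {len(sample_names)}")
print(f"modifiers {len(modifiers)}")
```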
“Normalizing” a Workspace#
There comes a time when you need to compare two workspaces to determine what changed between them. Depending on how each workspace was generated, you might first need to “sort” it. pyhf sort is a utility that will normalize the workspace for you, so that operations like calculating a checksum (pyhf digest) give the same result for equivalent workspaces.
Simple workspaces like the ones we’re using in this tutorial are nearly sorted already… however, this is rarely true in the real world. In the output below, notice how bkg is now the first sample and signal is the second sample in channel1 after sorting.
! pyhf sort --help
Usage: pyhf sort [OPTIONS] [WORKSPACE]
Sort the workspace.
See :func:`pyhf.workspace.Workspace.sorted` for more information.
Example:
.. code-block:: shell
$ curl -sL https://raw.githubusercontent.com/scikit-
hep/pyhf/main/docs/examples/json/2-bin_1-channel.json | pyhf sort | jq
'.' | md5 8be5186ec249d2704e14dd29ef05ffb0
.. code-block:: shell
$ curl -sL https://raw.githubusercontent.com/scikit-
hep/pyhf/main/docs/examples/json/2-bin_1-channel.json | jq -S '.channels
|=sort_by(.name)|.channels[].samples|=sort_by(.name)|.channels[].samples
[].modifiers|=sort_by(.name,.type)|.observations|=sort_by(.name)' | md5
8be5186ec249d2704e14dd29ef05ffb0
Options:
--output-file TEXT The location of the output json file. If not specified,
prints to screen.
-h, --help Show this message and exit.
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf sort
{
"channels": [
{
"name": "channel1",
"samples": [
{
"data": [
100.0,
150.0
],
"modifiers": [
{
"data": null,
"name": "lumi",
"type": "lumi"
},
{
"data": [
10.000000149011612,
10.000000521540642
],
"name": "uncorrshape_signal",
"type": "shapesys"
}
],
"name": "bkg"
},
{
"data": [
10.0,
35.0
],
"modifiers": [
{
"data": null,
"name": "SigXsecOverSM",
"type": "normfactor"
}
],
"name": "signal"
}
]
},
{
"name": "channel2",
"samples": [
{
"data": [
200.0,
350.0
],
"modifiers": [
{
"data": null,
"name": "lumi",
"type": "lumi"
},
{
"data": [
5.000000074505806,
10.000000055879354
],
"name": "uncorrshape_control",
"type": "shapesys"
}
],
"name": "bkg"
}
]
}
],
"measurements": [
{
"config": {
"parameters": [
{
"bounds": [
[
0.0,
10.0
]
],
"inits": [
1.0
],
"name": "SigXsecOverSM"
},
{
"auxdata": [
1.0
],
"bounds": [
[
0.5,
1.5
]
],
"fixed": true,
"inits": [
1.0
],
"name": "lumi",
"sigmas": [
0.1
]
}
],
"poi": "SigXsecOverSM"
},
"name": "GaussExample"
}
],
"observations": [
{
"data": [
110.0,
155.0
],
"name": "channel1"
},
{
"data": [
205.0,
345.0
],
"name": "channel2"
}
],
"version": "1.0.0"
}
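For intuition, the jq expression in the help text above can be mirrored in plain Python. This is a sketch of that expression only (channels, samples, modifiers, and observations); the function name sort_spec is hypothetical. As the output above shows, pyhf sort also reorders the measurement parameters, which this sketch leaves alone.

```python
import copy


def sort_spec(spec):
    """Hypothetical helper mirroring the jq expression from `pyhf sort --help`."""
    out = copy.deepcopy(spec)  # leave the input untouched
    out["channels"].sort(key=lambda c: c["name"])
    for channel in out["channels"]:
        channel["samples"].sort(key=lambda s: s["name"])
        for sample in channel["samples"]:
            sample["modifiers"].sort(key=lambda m: (m["name"], m["type"]))
    out["observations"].sort(key=lambda o: o["name"])
    return out


# channel1 in the order produced by xml2json above: signal comes before bkg.
spec = {
    "channels": [
        {"name": "channel1", "samples": [
            {"name": "signal", "data": [10.0, 35.0], "modifiers": [
                {"name": "SigXsecOverSM", "type": "normfactor", "data": None},
            ]},
            {"name": "bkg", "data": [100.0, 150.0], "modifiers": [
                {"name": "lumi", "type": "lumi", "data": None},
                {"name": "uncorrshape_signal", "type": "shapesys",
                 "data": [10.000000149011612, 10.000000521540642]},
            ]},
        ]},
    ],
    "observations": [{"name": "channel1", "data": [110.0, 155.0]}],
}

sorted_spec = sort_spec(spec)
original_order = [s["name"] for s in spec["channels"][0]["samples"]]
sorted_order = [s["name"] for s in sorted_spec["channels"][0]["samples"]]
print(original_order, "->", sorted_order)
```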
Computing a digest#
Next up is a way to determine if two workspaces are equivalent, simply by comparing their computed digest. Note that the digest is based on the literal contents of the workspace, so values that differ only at the floating-point level are not treated as identical. That is, 2.19999999 and 2.2000001 will be treated as different values in the digest calculation, just as they are in Python. We’ll show here why sorting is very important.
! pyhf digest --help
Usage: pyhf digest [OPTIONS] [WORKSPACE]
Use hashing algorithm to calculate the workspace digest.
Returns: digests (:obj:`dict`): A mapping of the hashing algorithms used
to the computed digest for the workspace.
Example:
.. code-block:: shell
$ curl -sL https://raw.githubusercontent.com/scikit-
hep/pyhf/main/docs/examples/json/2-bin_1-channel.json | pyhf digest
sha256:dad8822af55205d60152cbe4303929042dbd9d4839012e055e7c6b6459d68d73
Options:
-a, --algorithm TEXT The hashing algorithm used to compute the
workspace digest.
-j, --json / -p, --plaintext Output the hash values as a JSON dictionary or
plaintext strings
-h, --help Show this message and exit.
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf digest
sha256:50165e8ef034c514fb77e8f05a15a002c02bd659f001657952b79e0552470f79
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf sort | \
pyhf digest
sha256:27a35f6874cf91f9b38916cf948ac18ee650f1b578a93107b9b212c8752b1310
Remember that the ordering of the samples will have switched through the sorting.
The sha256 algorithm is used to compute the checksum for this workspace. This means that one can generally “normalize” all workspaces and then compute the digest, so that equivalent workspaces give the same value. As with all command line functionality you’ve seen so far, there are equivalent ways to do it through Python.
print(f"Unsorted: {pyhf.utils.digest(ws)}")
print(f"Sorted: {pyhf.utils.digest(pyhf.Workspace.sorted(ws))}")
Unsorted: 50165e8ef034c514fb77e8f05a15a002c02bd659f001657952b79e0552470f79
Sorted: 27a35f6874cf91f9b38916cf948ac18ee650f1b578a93107b9b212c8752b1310
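The two points made in this section, that list ordering and tiny floating-point differences both change the digest, can be illustrated with the standard library. The sketch assumes the digest is a SHA-256 hash of the JSON text with sorted dictionary keys; the helper toy_digest is hypothetical, and its output is not claimed to match pyhf digest.

```python
import hashlib
import json


def toy_digest(obj):
    """Hypothetical helper: SHA-256 of the JSON text with sorted dictionary keys."""
    text = json.dumps(obj, sort_keys=True)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


signal = {"name": "signal", "data": [10.0, 35.0]}
bkg = {"name": "bkg", "data": [100.0, 150.0]}

# Key order inside a dictionary does not matter because keys are sorted...
same_keys = toy_digest({"name": "bkg", "data": [100.0, 150.0]}) == toy_digest(
    {"data": [100.0, 150.0], "name": "bkg"}
)
# ...but the order of items in a list does, which is why sorting comes first.
same_list_order = toy_digest([signal, bkg]) == toy_digest([bkg, signal])
# Nearly equal floats serialize to different text, so their digests differ too.
same_floats = toy_digest([2.19999999]) == toy_digest([2.2000001])

print(same_keys, same_list_order, same_floats)
```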
“Pruning” away items#
Sometimes you want to manipulate workspaces by removing channels or samples or systematics (or measurements). This can be useful when trying to debug fits, or to build background-only workspaces, or to clean up a workspace.
! pyhf prune --help
Usage: pyhf prune [OPTIONS] [WORKSPACE]
Prune components from the workspace.
See :func:`pyhf.workspace.Workspace.prune` for more information.
Options:
--output-file TEXT The location of the output json file. If not
specified, prints to screen.
-c, --channel <CHANNEL>...
-s, --sample <SAMPLE>...
-m, --modifier <MODIFIER>...
-t, --modifier-type [histosys|lumi|normfactor|normsys|shapefactor|shapesys|staterror]
--measurement <MEASUREMENT>...
-h, --help Show this message and exit.
prune channels#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf prune -c channel1 | \
pyhf inspect
Traceback (most recent call last):
File "/usr/local/venv/bin/pyhf", line 8, in <module>
sys.exit(cli())
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/cli/spec.py", line 82, in inspect
model = ws.model()
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/workspace.py", line 447, in model
return Model(modelspec, **config_kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py", line 780, in __init__
self.config.set_poi(poi_name)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py", line 464, in set_poi
raise exceptions.InvalidModel(
pyhf.exceptions.InvalidModel: The parameter of interest 'SigXsecOverSM' cannot be fit as it is not declared in the model specification.
prune samples#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf prune -s signal | \
pyhf inspect
Traceback (most recent call last):
File "/usr/local/venv/bin/pyhf", line 8, in <module>
sys.exit(cli())
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/cli/spec.py", line 82, in inspect
model = ws.model()
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/workspace.py", line 447, in model
return Model(modelspec, **config_kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py", line 780, in __init__
self.config.set_poi(poi_name)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py", line 464, in set_poi
raise exceptions.InvalidModel(
pyhf.exceptions.InvalidModel: The parameter of interest 'SigXsecOverSM' cannot be fit as it is not declared in the model specification.
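Both tracebacks above have the same cause: the signal sample, which lives only in channel1, is the only place where the SigXsecOverSM normfactor is attached, so removing either one leaves a parameter of interest that no modifier declares. The sketch below makes this visible on a reduced copy of the workspace; prune_sample and declared_modifiers are hypothetical helpers, not pyhf functions.

```python
def prune_sample(channels, sample_name):
    """Hypothetical helper: drop a sample by name from every channel."""
    return [
        {**channel, "samples": [s for s in channel["samples"] if s["name"] != sample_name]}
        for channel in channels
    ]


def declared_modifiers(channels):
    """Hypothetical helper: names of all modifiers still attached to a sample."""
    return {m["name"] for c in channels for s in c["samples"] for m in s["modifiers"]}


poi = "SigXsecOverSM"
# Reduced copy of the workspace above (shapesys modifiers omitted for brevity).
channels = [
    {"name": "channel1", "samples": [
        {"name": "signal", "modifiers": [{"name": "SigXsecOverSM", "type": "normfactor"}]},
        {"name": "bkg", "modifiers": [{"name": "lumi", "type": "lumi"}]},
    ]},
    {"name": "channel2", "samples": [
        {"name": "bkg", "modifiers": [{"name": "lumi", "type": "lumi"}]},
    ]},
]

poi_before = poi in declared_modifiers(channels)
poi_after = poi in declared_modifiers(prune_sample(channels, "signal"))
print(poi_before, poi_after)
```

The pruned workspace itself can still be useful, for example as a background-only workspace, but inspecting it with the original measurement fails because that measurement still names SigXsecOverSM as its parameter of interest.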
prune modifiers#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf prune -m uncorrshape_signal | \
pyhf inspect
Summary
------------------
channels 2
samples 2
parameters 3
modifiers 3
channels nbins
---------- -----
channel1 2
channel2 2
samples
----------
bkg
signal
parameters constraint modifiers
---------- ---------- ----------
SigXsecOverSM unconstrained normfactor
lumi constrained_by_normal lumi
uncorrshape_control constrained_by_poisson shapesys
measurement poi parameters
---------- ---------- ----------
(*) GaussExample SigXsecOverSM lumi,SigXsecOverSM
prune modifier types#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf prune -t shapesys | \
pyhf inspect
Summary
------------------
channels 2
samples 2
parameters 2
modifiers 2
channels nbins
---------- -----
channel1 2
channel2 2
samples
----------
bkg
signal
parameters constraint modifiers
---------- ---------- ----------
SigXsecOverSM unconstrained normfactor
lumi constrained_by_normal lumi
measurement poi parameters
---------- ---------- ----------
(*) GaussExample SigXsecOverSM lumi,SigXsecOverSM
Renaming items#
In addition to removing items, you might want to rename your channels, samples, modifiers, or measurement names. This can be useful for creating modifier correlations, or removing modifier correlations, or just cleaning up your workspace to get it ready for publication.
! pyhf rename --help
Usage: pyhf rename [OPTIONS] [WORKSPACE]
Rename components of the workspace.
See :func:`pyhf.workspace.Workspace.rename` for more information.
Options:
--output-file TEXT The location of the output json file. If not
specified, prints to screen.
-c, --channel <PATTERN> <REPLACE>...
-s, --sample <PATTERN> <REPLACE>...
-m, --modifier <PATTERN> <REPLACE>...
--measurement <PATTERN> <REPLACE>...
-h, --help Show this message and exit.
rename channels#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf rename -c channel1 SR -c channel2 CR | \
pyhf inspect
Summary
------------------
channels 2
samples 2
parameters 4
modifiers 4
channels nbins
---------- -----
CR 2
SR 2
samples
----------
bkg
signal
parameters constraint modifiers
---------- ---------- ----------
SigXsecOverSM unconstrained normfactor
lumi constrained_by_normal lumi
uncorrshape_control constrained_by_poisson shapesys
uncorrshape_signal constrained_by_poisson shapesys
measurement poi parameters
---------- ---------- ----------
(*) GaussExample SigXsecOverSM lumi,SigXsecOverSM
rename samples#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf rename -s bkg background | \
pyhf inspect
Summary
------------------
channels 2
samples 2
parameters 4
modifiers 4
channels nbins
---------- -----
channel1 2
channel2 2
samples
----------
background
signal
parameters constraint modifiers
---------- ---------- ----------
SigXsecOverSM unconstrained normfactor
lumi constrained_by_normal lumi
uncorrshape_control constrained_by_poisson shapesys
uncorrshape_signal constrained_by_poisson shapesys
measurement poi parameters
---------- ---------- ----------
(*) GaussExample SigXsecOverSM lumi,SigXsecOverSM
rename modifiers#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf rename -m uncorrshape_signal corrshape -m uncorrshape_control corrshape | \
pyhf inspect
Traceback (most recent call last):
File "/usr/local/venv/bin/pyhf", line 8, in <module>
sys.exit(cli())
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/cli/spec.py", line 82, in inspect
model = ws.model()
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/workspace.py", line 447, in model
return Model(modelspec, **config_kwargs)
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py", line 774, in __init__
modifiers, _nominal_rates = _nominal_and_modifiers_from_spec(
File "/usr/local/venv/lib/python3.10/site-packages/pyhf/pdf.py", line 133, in _nominal_and_modifiers_from_spec
raise exceptions.InvalidModel(
pyhf.exceptions.InvalidModel: Trying to add paramset shapesys/corrshape on bkg sample in channel2 channel but other paramsets exist with the same name.
This failure is expected: both shapesys modifiers now share the name corrshape, and as the error message says, pyhf will not add a shapesys paramset whose name is already in use.
rename measurements#
! pyhf xml2json --basedir data/multichannel_histfactory data/multichannel_histfactory/config/example.xml --hide-progress | \
pyhf rename --measurement GaussExample FitConfig | \
pyhf inspect
Summary
------------------
channels 2
samples 2
parameters 4
modifiers 4
channels nbins
---------- -----
channel1 2
channel2 2
samples
----------
bkg
signal
parameters constraint modifiers
---------- ---------- ----------
SigXsecOverSM unconstrained normfactor
lumi constrained_by_normal lumi
uncorrshape_control constrained_by_poisson shapesys
uncorrshape_signal constrained_by_poisson shapesys
measurement poi parameters
---------- ---------- ----------
(*) FitConfig SigXsecOverSM lumi,SigXsecOverSM