Using HEPData#
import json
import pyhf
import pyhf.contrib.utils
Preserved on HEPData#
As of this tutorial, ATLAS has published 18 full statistical models to HEPData.
Let’s explore the 1Lbb workspace a little bit, shall we?
Getting the Data#
We’ll use the pyhf[contrib] extra (which relies on requests and tarfile) to download the archive behind the HEPData-minted DOI and extract the files we need.
pyhf.contrib.utils.download(
"https://doi.org/10.17182/hepdata.90607.v3/r3", "1Lbb-likelihoods"
)
This will nicely download and extract everything we need.
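For intuition, the extraction half of that step can be sketched with the standard library alone. This is not pyhf’s implementation: extract_tar_bytes and make_demo_archive are hypothetical helpers, and the archive is built in memory so that the example runs without network access.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def extract_tar_bytes(raw: bytes, dest: Path) -> list[str]:
    """Unpack a (possibly gzipped) tar archive held in memory into dest.

    Hypothetical helper for illustration. Only regular files are written,
    and only under their base name, so nothing can land outside dest.
    """
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            name = Path(member.name).name
            (dest / name).write_bytes(archive.extractfile(member).read())
            written.append(name)
    return sorted(written)


def make_demo_archive() -> bytes:
    """Build a tiny tar.gz standing in for the downloaded payload."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, text in [("BkgOnly.json", "{}"), ("README.md", "demo")]:
            payload = text.encode("utf-8")
            info = tarfile.TarInfo(name=f"likelihoods/{name}")
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    extracted = extract_tar_bytes(make_demo_archive(), Path(tmp) / "demo-likelihoods")
print(extracted)
```

In practice the raw bytes would come from an HTTP response body; the real helper also takes care of resolving the DOI for us.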
!ls -lavh 1Lbb-likelihoods
total 59M
drwxr-xr-x 2 root root 4.0K Jul 4 10:32 .
drwxr-xr-x 6 root root 4.0K Jul 4 10:32 ..
-rw-r--r-- 1 1000 1000 4.3M May 7 2020 BkgOnly.json
-rw-r--r-- 1 1000 1000 1.4K May 30 2020 README.md
-rw-r--r-- 1 1000 1000 55M May 31 2020 patchset.json
Instantiate our objects#
We have a background-only workspace BkgOnly.json and a signal patchset collection patchset.json. Let’s create our Python objects and play with them:
with open("1Lbb-likelihoods/BkgOnly.json") as spec_file:
    spec = json.load(spec_file)
with open("1Lbb-likelihoods/patchset.json") as patchset_file:
    patchset = pyhf.PatchSet(json.load(patchset_file))
So what did the analyzers give us for signal patches?
Patching in Signals#
Let’s look at this pyhf.PatchSet object, which provides a user-friendly way to interact with many signal patches at once.
PatchSet#
patchset
<pyhf.patchset.PatchSet object with 125 patches at 0x7fa57ffbb3d0>
Oh wow, we’ve got 125 patches. What information does it have?
print(f"description: {patchset.description}")
print(f" digests: {patchset.digests}")
print(f" labels: {patchset.labels}")
print(f" references: {patchset.references}")
print(f" version: {patchset.version}")
description: signal patchset for the SUSY EWK 1Lbb analysis
digests: {'sha256': '2563672e1a165384faf49f1431e921d88c9c07ec10f150d5045576564f98f820'}
labels: ['m1', 'm2']
references: {'hepdata': 'ins1755298'}
version: 1.0.0
So we’ve got a useful description of the signal patches… there’s a digest. Does that match the background-only workspace we have?
pyhf.utils.digest(spec)
'2563672e1a165384faf49f1431e921d88c9c07ec10f150d5045576564f98f820'
It does! In fact, this sort of verification check will be done automatically when applying patches using pyhf.PatchSet, as we will see shortly. To manually verify, simply run pyhf.PatchSet.verify on the workspace. No error means everything is fine. It will loudly complain otherwise.
patchset.verify(spec)
No error, whew. Let’s move on.
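To see why a digest makes a good fingerprint, here is a minimal sketch of the idea using only hashlib and json. It assumes the digest is a SHA-256 over a sorted-key JSON serialization; sha256_digest is a hypothetical stand-in, the specs are toy dictionaries, and pyhf’s exact serialization may differ in detail.

```python
import hashlib
import json


def sha256_digest(obj) -> str:
    """SHA-256 of a canonical (sorted-key) JSON serialization of obj."""
    canonical = json.dumps(obj, sort_keys=True, ensure_ascii=False).encode("utf8")
    return hashlib.sha256(canonical).hexdigest()


# Toy stand-ins for a workspace spec: a and b differ only in key order,
# while c differs in content (the channel name).
spec_a = {"channels": [{"name": "SR", "samples": []}], "version": "1.0.0"}
spec_b = {"version": "1.0.0", "channels": [{"samples": [], "name": "SR"}]}
spec_c = {"version": "1.0.0", "channels": [{"samples": [], "name": "CR"}]}

print(sha256_digest(spec_a) == sha256_digest(spec_b))  # same content
print(sha256_digest(spec_a) == sha256_digest(spec_c))  # different content
```

Sorting the keys before hashing is what makes the fingerprint depend on the content of the workspace rather than on the order in which the JSON happened to be written.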
The labels m1 and m2 tell us that the signal patches are parametrized in a 2-dimensional space, likely as \(m_1 = \tilde{\chi}_1^\pm\) and \(m_2 = \tilde{\chi}_1^0\)… but I guess we’ll see?
The references field lists the references for this dataset, which currently points at the HEPData record.
Next, the version is the version of the schema set we’re using with pyhf (1.0.0).
And last, but certainly not least… its patches:
patchset.patches
[<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_0(1000, 0)' at 0x7fa580006710>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_100(1000, 100)' at 0x7fa5800309d0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_150(1000, 150)' at 0x7fa580031870>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_200(1000, 200)' at 0x7fa580030bb0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_250(1000, 250)' at 0x7fa5800316f0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_300(1000, 300)' at 0x7fa5800313c0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_350(1000, 350)' at 0x7fa580031570>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_400(1000, 400)' at 0x7fa5800309a0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_1000_50(1000, 50)' at 0x7fa5800318a0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_150_0(150, 0)' at 0x7fa5800317e0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_165_35(165, 35)' at 0x7fa580030d90>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_175_0(175, 0)' at 0x7fa580031a50>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_175_25(175, 25)' at 0x7fa580030f40>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_190_60(190, 60)' at 0x7fa580031060>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_200_0(200, 0)' at 0x7fa5800311b0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_200_25(200, 25)' at 0x7fa580031480>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_200_50(200, 50)' at 0x7fa580031510>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_225_0(225, 0)' at 0x7fa5800316c0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_225_25(225, 25)' at 0x7fa5800319f0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_225_50(225, 50)' at 0x7fa5800318d0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_250_0(250, 0)' at 0x7fa580033bb0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_250_100(250, 100)' at 0x7fa580031120>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_250_25(250, 25)' at 0x7fa580032e30>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_250_50(250, 50)' at 0x7fa580033a60>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_250_75(250, 75)' at 0x7fa580030940>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_275_0(275, 0)' at 0x7fa580033040>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_275_25(275, 25)' at 0x7fa580032fe0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_275_50(275, 50)' at 0x7fa580032f80>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_275_75(275, 75)' at 0x7fa5800327d0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_300_0(300, 0)' at 0x7fa580032770>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_300_150(300, 150)' at 0x7fa580032380>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_300_25(300, 25)' at 0x7fa580032320>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_300_50(300, 50)' at 0x7fa5800322c0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_300_75(300, 75)' at 0x7fa580032260>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_325_0(325, 0)' at 0x7fa580032e60>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_325_50(325, 50)' at 0x7fa5800331f0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_350_0(350, 0)' at 0x7fa5800332e0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_350_100(350, 100)' at 0x7fa5800326e0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_350_150(350, 150)' at 0x7fa580032da0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_350_200(350, 200)' at 0x7fa580032d40>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_350_25(350, 25)' at 0x7fa5800325c0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_350_50(350, 50)' at 0x7fa580032560>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_350_75(350, 75)' at 0x7fa580032410>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_375_0(375, 0)' at 0x7fa5800323b0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_375_50(375, 50)' at 0x7fa5800335e0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_400_0(400, 0)' at 0x7fa580033640>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_400_100(400, 100)' at 0x7fa5800336a0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_400_150(400, 150)' at 0x7fa5800337c0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_400_200(400, 200)' at 0x7fa5800338e0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_400_25(400, 25)' at 0x7fa580033d30>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_400_250(400, 250)' at 0x7fa580033af0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_400_50(400, 50)' at 0x7fa5800334c0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_425_0(425, 0)' at 0x7fa580033460>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_450_0(450, 0)' at 0x7fa580033400>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_450_100(450, 100)' at 0x7fa5800333a0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_450_150(450, 150)' at 0x7fa580033340>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_450_200(450, 200)' at 0x7fa580031180>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_450_250(450, 250)' at 0x7fa5800311e0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_450_300(450, 300)' at 0x7fa580031360>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_450_50(450, 50)' at 0x7fa580031210>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_0(500, 0)' at 0x7fa5800313f0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_100(500, 100)' at 0x7fa580031ea0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_150(500, 150)' at 0x7fa580031a80>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_200(500, 200)' at 0x7fa580031bd0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_250(500, 250)' at 0x7fa580031cf0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_300(500, 300)' at 0x7fa580031f90>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_350(500, 350)' at 0x7fa580032110>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_500_50(500, 50)' at 0x7fa580032170>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_535_400(535, 400)' at 0x7fa5800321d0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_550_0(550, 0)' at 0x7fa580032830>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_550_100(550, 100)' at 0x7fa580032890>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_550_150(550, 150)' at 0x7fa5800328f0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_550_200(550, 200)' at 0x7fa580032950>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_550_250(550, 250)' at 0x7fa5800329b0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_550_300(550, 300)' at 0x7fa580032a10>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_550_50(550, 50)' at 0x7fa580032a70>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_0(600, 0)' at 0x7fa580032ad0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_100(600, 100)' at 0x7fa580032b30>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_150(600, 150)' at 0x7fa580032b90>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_200(600, 200)' at 0x7fa580032bf0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_250(600, 250)' at 0x7fa580032c50>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_300(600, 300)' at 0x7fa580032cb0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_350(600, 350)' at 0x7fa580032d10>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_400(600, 400)' at 0x7fa580032ec0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_600_50(600, 50)' at 0x7fa580032f20>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_650_0(650, 0)' at 0x7fa5800330a0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_650_100(650, 100)' at 0x7fa580033100>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_650_150(650, 150)' at 0x7fa5800332b0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_650_200(650, 200)' at 0x7fa5800320e0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_650_250(650, 250)' at 0x7fa580032050>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_650_300(650, 300)' at 0x7fa580032020>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_650_50(650, 50)' at 0x7fa580031d80>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_0(700, 0)' at 0x7fa580030ca0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_100(700, 100)' at 0x7fa580031f00>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_150(700, 150)' at 0x7fa580031de0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_200(700, 200)' at 0x7fa580033b80>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_250(700, 250)' at 0x7fa580033c10>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_300(700, 300)' at 0x7fa580032440>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_350(700, 350)' at 0x7fa5800324a0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_400(700, 400)' at 0x7fa580032dd0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_700_50(700, 50)' at 0x7fa580033550>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_750_100(750, 100)' at 0x7fa580032680>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_750_150(750, 150)' at 0x7fa580032620>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_750_200(750, 200)' at 0x7fa580031fc0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_750_250(750, 250)' at 0x7fa580031ba0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_750_300(750, 300)' at 0x7fa580030c10>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_750_50(750, 50)' at 0x7fa580033a30>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_0(800, 0)' at 0x7fa580033c40>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_100(800, 100)' at 0x7fa580033cd0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_150(800, 150)' at 0x7fa580033d60>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_200(800, 200)' at 0x7fa5800339d0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_250(800, 250)' at 0x7fa580033970>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_300(800, 300)' at 0x7fa580033910>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_350(800, 350)' at 0x7fa580033850>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_400(800, 400)' at 0x7fa5800337f0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_800_50(800, 50)' at 0x7fa580033760>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_0(900, 0)' at 0x7fa5800336d0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_100(900, 100)' at 0x7fa580033df0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_150(900, 150)' at 0x7fa580033e50>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_200(900, 200)' at 0x7fa580033eb0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_250(900, 250)' at 0x7fa580033f10>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_300(900, 300)' at 0x7fa580033f70>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_350(900, 350)' at 0x7fa580033fd0>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_400(900, 400)' at 0x7fa57e008070>,
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_50(900, 50)' at 0x7fa57e0080d0>]
So we can see all the patches listed both by name, such as C1N2_Wh_hbb_900_250, and by a pair of points, such as (900, 250). Why is this useful? The PatchSet object acts like a special dictionary look-up where it will grab the patch you need based on the unique key you provide it.
For example, we can look up by name
patchset["C1N2_Wh_hbb_900_250"]
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_250(900, 250)' at 0x7fa580033f10>
or by the pair of points
patchset[(900, 250)]
<pyhf.patchset.Patch object 'C1N2_Wh_hbb_900_250(900, 250)' at 0x7fa580033f10>
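The two look-ups above can be mimicked with a pair of plain dictionaries. The sketch below is a toy: ToyPatchSet is a hypothetical class, not pyhf’s implementation, and its entries are bare dictionaries rather than real patches.

```python
class ToyPatchSet:
    """Toy container that resolves an entry by name or by its values tuple."""

    def __init__(self, patches):
        # patches: iterable of (name, values) pairs
        self._by_name = {}
        self._by_values = {}
        for name, values in patches:
            entry = {"name": name, "values": tuple(values)}
            self._by_name[name] = entry
            self._by_values[tuple(values)] = entry

    def __getitem__(self, key):
        # Strings are treated as names, anything else as a sequence of values.
        if isinstance(key, str):
            return self._by_name[key]
        return self._by_values[tuple(key)]

    def __len__(self):
        return len(self._by_name)


toy = ToyPatchSet(
    [
        ("C1N2_Wh_hbb_900_250", (900, 250)),
        ("C1N2_Wh_hbb_650_0", (650, 0)),
    ]
)
same = toy["C1N2_Wh_hbb_900_250"] is toy[(900, 250)]
print(same)
```

Both keys resolve to the same object, which is why the two outputs above show the same memory address.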
Patches#
A pyhf.PatchSet is a collection of pyhf.Patch objects. So what is a patch? It contains enough information to apply the signal patch to the corresponding background-only workspace (matched by digest).
patch = patchset["C1N2_Wh_hbb_900_250"]
print(f" name: {patch.name}")
print(f"values: {patch.values}")
name: C1N2_Wh_hbb_900_250
values: (900, 250)
Most importantly, it contains the patch information itself. Specifically, pyhf.Patch inherits from jsonpatch.JsonPatch, a class from the third-party jsonpatch package, which provides JSON Patch support in Python. That means we can simply apply the patch to our workspace directly!
print(f" samples pre-patch: {pyhf.Workspace(spec).samples}")
print(f"samples post-patch: {pyhf.Workspace(patch.apply(spec)).samples}")
samples pre-patch: ['diboson', 'multiboson', 'singletop', 'ttbar', 'tth', 'ttv', 'vh', 'wjets', 'zjets']
samples post-patch: ['C1N2_Wh_hbb_900_250', 'diboson', 'multiboson', 'singletop', 'ttbar', 'tth', 'ttv', 'vh', 'wjets', 'zjets']
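To make the patching step less magical, here is a stripped-down sketch of a single JSON Patch "add" operation applied to a toy spec. apply_add and sample_names are hypothetical helpers written for this note; the sketch skips most of the JSON Patch specification (other operations, escape sequences), which the jsonpatch package handles properly.

```python
import copy


def apply_add(document, path, value):
    """Apply one JSON Patch style "add" operation and return a patched copy.

    Simplified sketch: no "~0"/"~1" escape handling, and only "add".
    """
    patched = copy.deepcopy(document)  # leave the input untouched
    *parents, last = path.lstrip("/").split("/")
    target = patched
    for token in parents:
        target = target[int(token)] if isinstance(target, list) else target[token]
    if isinstance(target, list):
        # "-" means "append"; otherwise insert at the given index.
        index = len(target) if last == "-" else int(last)
        target.insert(index, value)
    else:
        target[last] = value
    return patched


def sample_names(spec):
    """Sorted, de-duplicated sample names across all channels."""
    return sorted({s["name"] for c in spec["channels"] for s in c["samples"]})


toy_spec = {
    "channels": [{"name": "SR", "samples": [{"name": "ttbar"}, {"name": "wjets"}]}]
}
toy_patch = {"op": "add", "path": "/channels/0/samples/0", "value": {"name": "signal"}}

patched_spec = apply_add(toy_spec, toy_patch["path"], toy_patch["value"])
print(sample_names(toy_spec))
print(sample_names(patched_spec))
```

The original toy spec is left unchanged, mirroring how applying the patch above returned a new spec that we then wrapped in a pyhf.Workspace.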
Or, more quickly, from the PatchSet object:
print(f" samples pre-patch: {pyhf.Workspace(spec).samples}")
print(f"samples post-patch: {pyhf.Workspace(patchset.apply(spec, (900, 250))).samples}")
samples pre-patch: ['diboson', 'multiboson', 'singletop', 'ttbar', 'tth', 'ttv', 'vh', 'wjets', 'zjets']
samples post-patch: ['C1N2_Wh_hbb_900_250', 'diboson', 'multiboson', 'singletop', 'ttbar', 'tth', 'ttv', 'vh', 'wjets', 'zjets']
Patching via Model Creation#
One last way to apply patches is to patch the models as we build them from the background-only workspace, instead of patching the workspace itself. This makes it easier to treat the background-only workspace as immutable and to patch in signal models only when grabbing the model. Check it out.
workspace = pyhf.Workspace(spec)
First, load up our background-only spec into the workspace. Then let’s create a model.
model = workspace.model(patches=[patchset["C1N2_Wh_hbb_900_250"]])
print(f"samples (workspace): {workspace.samples}")
print(f"samples ( model ): {model.config.samples}")
samples (workspace): ['diboson', 'multiboson', 'singletop', 'ttbar', 'tth', 'ttv', 'vh', 'wjets', 'zjets']
samples ( model ): ['C1N2_Wh_hbb_900_250', 'diboson', 'multiboson', 'singletop', 'ttbar', 'tth', 'ttv', 'vh', 'wjets', 'zjets']
Doing Physics#
So we want to try and reproduce part of the contour. At least convince ourselves we’re doing physics and not fauxsics. … Anyway… Let’s remind ourselves of the 1Lbb contour, as we don’t have the photographic memory of the ATLAS SUSY conveners.
So let’s work around the 700-900 GeV \(\tilde{\chi}_1^\pm, \tilde{\chi}_2^0\) region. We’ll look at two points here:
C1N2_Wh_hbb_650_0(650, 0), which is below the contour and excluded
C1N2_Wh_hbb_1000_0(1000, 0), which is above the contour and not excluded
Let’s perform a “standard” hypothesis test (with \(\mu = 1\) null BSM hypothesis) on both of these and use the \(\text{CL}_\text{s}\) values to convince ourselves that we just did reproducible physics!?!
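As a reminder, the usual definition of \(\text{CL}_\text{s}\) is the ratio below. The symbols \(p_{s+b}\) and \(p_b\) are notation assumed for this note (they do not appear elsewhere in the tutorial): they are the p-values of the observed test statistic under the signal-plus-background and background-only hypotheses.

```latex
\[
\text{CL}_\text{s} = \frac{\text{CL}_{s+b}}{\text{CL}_{b}} = \frac{p_{s+b}}{1 - p_{b}}
\]
```

A small \(\text{CL}_\text{s}\) means the data disfavor the signal-plus-background hypothesis relative to background-only.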
Doing Physics, for real now#
model_below = workspace.model(patches=[patchset["C1N2_Wh_hbb_650_0"]])
model_above = workspace.model(patches=[patchset["C1N2_Wh_hbb_1000_0"]])
We’ve made our models. Let’s test hypotheses!
Note: this will not be as instantaneous as our simple models…but it should still be pretty fast!
test_poi = 1.0
result_below = pyhf.infer.hypotest(
test_poi,
workspace.data(model_below),
model_below,
test_stat="qtilde",
return_expected_set=True,
)
print(f"Observed CLs: {result_below[0]}")
print(f"Expected CLs band: {[exp.tolist() for exp in result_below[1]]}")
Observed CLs: 0.01713743352836708
Expected CLs band: [4.841520257669322e-06, 9.132648108169354e-05, 0.0014659244755989953, 0.01732811440517817, 0.12149672085917977]
result_above = pyhf.infer.hypotest(
test_poi,
workspace.data(model_above),
model_above,
test_stat="qtilde",
return_expected_set=True,
)
print(f"Observed CLs: {result_above[0]}")
print(f"Expected CLs band: {[exp.tolist() for exp in result_above[1]]}")
Observed CLs: 0.5856814546999648
Expected CLs band: [0.04988361901086988, 0.12645246726332207, 0.29258722130396997, 0.5694216651677977, 0.8476004404120806]
And as you can see, we’re getting the results we expect. Excluded models are those for which \(\text{CL}_\text{s} < 0.05\): the \((650, 0)\) point, with an observed \(\text{CL}_\text{s}\) of about 0.017, is excluded, while the \((1000, 0)\) point, at about 0.586, is not. Additionally, the \(-2\sigma\) edge of the expected band for the \((1000, 0)\) point (about 0.0499) sits just below the 0.05 threshold, which is consistent with what we observe in the figure above.
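The exclusion decision itself is just a comparison against the threshold. The sketch below hard-codes the numbers printed above so that it runs without pyhf; is_excluded and the results dictionary are hypothetical names introduced for this note.

```python
CLS_THRESHOLD = 0.05  # exclusion threshold quoted in the text


def is_excluded(cls_value: float, threshold: float = CLS_THRESHOLD) -> bool:
    """A signal hypothesis is excluded when its CLs falls below the threshold."""
    return cls_value < threshold


# (observed CLs, expected CLs band), copied from the outputs above.
results = {
    (650, 0): (
        0.01713743352836708,
        [4.841520257669322e-06, 9.132648108169354e-05, 0.0014659244755989953,
         0.01732811440517817, 0.12149672085917977],
    ),
    (1000, 0): (
        0.5856814546999648,
        [0.04988361901086988, 0.12645246726332207, 0.29258722130396997,
         0.5694216651677977, 0.8476004404120806],
    ),
}

for point, (observed, expected) in results.items():
    verdict = "excluded" if is_excluded(observed) else "not excluded"
    n_below = sum(is_excluded(value) for value in expected)
    print(f"{point}: observed CLs = {observed:.4f} -> {verdict}; "
          f"{n_below}/5 expected band values below threshold")
```

Running the same comparison over all 125 patches is how one would start tracing out the full contour, though each hypothesis test takes noticeably longer than this toy loop.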