Performing a Combination

We’ll demonstrate how combining workspaces works, drawing on everything we’ve learned so far.

Loading the Workspace

To do so, we’ll load a simple workspace with one channel and two bins.

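import json

import pyhf

# Read the serialized workspace specification from disk.
with open("data/2-bin_1-channel.json") as serialized:
    spec = json.load(serialized)

# Build a workspace object from the specification.
workspace = pyhf.Workspace(spec)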

Combine Workspaces

Let’s first try the naive approach and combine the workspace with itself.

pyhf.Workspace.combine(workspace, workspace)
---------------------------------------------------------------------------
InvalidWorkspaceOperation                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 pyhf.Workspace.combine(workspace, workspace)

File /usr/local/venv/lib/python3.10/site-packages/pyhf/workspace.py:757, in Workspace.combine(cls, left, right, join, merge_channels)
    752     log.warning(
    753         "You are using an unsafe join operation. This will silence exceptions that might be raised during a normal 'outer' operation."
    754     )
    756 new_version = _join_versions(join, left['version'], right['version'])
--> 757 new_channels = _join_channels(
    758     join, left['channels'], right['channels'], merge=merge_channels
    759 )
    760 new_observations = _join_observations(
    761     join, left['observations'], right['observations']
    762 )
    763 new_measurements = _join_measurements(
    764     join, left['measurements'], right['measurements']
    765 )

File /usr/local/venv/lib/python3.10/site-packages/pyhf/workspace.py:129, in _join_channels(join, left_channels, right_channels, merge)
    125     common_channels = {c['name'] for c in left_channels}.intersection(
    126         c['name'] for c in right_channels
    127     )
    128     if common_channels:
--> 129         raise exceptions.InvalidWorkspaceOperation(
    130             f"Workspaces cannot have any channels in common with the same name: {common_channels}. You can also try a different join operation: {Workspace.valid_joins}."
    131         )
    133 elif join == 'outer':
    134     counted_channels = collections.Counter(
    135         channel['name'] for channel in joined_channels
    136     )

InvalidWorkspaceOperation: Workspaces cannot have any channels in common with the same name: {'singlechannel'}. You can also try a different join operation: ['none', 'outer', 'left outer', 'right outer'].

As we can see, a workspace cannot be combined with itself, because the two copies share channel names. pyhf works hard to ensure that a combination “makes sense”.
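The check that raised this error can be sketched in plain Python. The snippet below mirrors the set intersection visible in the traceback; the helper name `common_channel_names` and the toy channel lists are illustrative assumptions, not part of pyhf.

```python
def common_channel_names(left_channels, right_channels):
    """Return the channel names that appear in both lists of channel dicts."""
    left_names = {channel["name"] for channel in left_channels}
    right_names = {channel["name"] for channel in right_channels}
    return left_names & right_names


# Toy channel lists (illustrative only; real channels also carry samples).
left = [{"name": "singlechannel"}]
right_same = [{"name": "singlechannel"}]
right_renamed = [{"name": "othersinglechannel"}]

clash = common_channel_names(left, right_same)
no_clash = common_channel_names(left, right_renamed)

print(clash)     # the shared name is what blocks the combination
print(no_clash)  # an empty set means the channel names are disjoint
```

Running a check like this before calling `pyhf.Workspace.combine` shows which names would need to be changed.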

Let’s rename the channel, along with the modifier and the measurement, and then try to combine again.

other_workspace = workspace.rename(
    channels={"singlechannel": "othersinglechannel"},
    modifiers={"uncorr_bkguncrt": "otheruncorr_bkguncrt"},
    measurements={"Measurement": "OtherMeasurement"},
)

combined_workspace = pyhf.Workspace.combine(workspace, other_workspace)
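The renaming step has to be consistent across the specification, since a channel name appears both in the channel list and in the matching observation. A minimal sketch of that bookkeeping on a plain dictionary follows. The helper `rename_channel` and the toy spec are hypothetical and only illustrate the idea; this is not how `workspace.rename` is implemented.

```python
import copy


def rename_channel(spec, old_name, new_name):
    """Return a copy of a workspace-like dict with one channel renamed."""
    renamed = copy.deepcopy(spec)  # leave the caller's spec untouched
    # The channel name must change in both sections to stay consistent.
    for section in ("channels", "observations"):
        for entry in renamed.get(section, []):
            if entry["name"] == old_name:
                entry["name"] = new_name
    return renamed


# Toy spec with made-up values (illustrative only).
toy_spec = {
    "channels": [{"name": "singlechannel", "samples": []}],
    "observations": [{"name": "singlechannel", "data": [50.0, 60.0]}],
}

toy_renamed = rename_channel(toy_spec, "singlechannel", "othersinglechannel")

print(toy_renamed["channels"][0]["name"])
print(toy_renamed["observations"][0]["name"])
```

Working on a deep copy mirrors the tutorial, where `workspace` itself is still usable after `other_workspace` has been created from it.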

And did we combine?

print(f"    channels: {combined_workspace.channels}")
print(f"       nbins: {combined_workspace.channel_nbins}")
print(f"     samples: {combined_workspace.samples}")
print(f"   modifiers: {combined_workspace.modifiers}")
print(f"measurements: {combined_workspace.measurement_names}")
    channels: ['othersinglechannel', 'singlechannel']
       nbins: {'othersinglechannel': 2, 'singlechannel': 2}
     samples: ['background', 'signal']
   modifiers: [('mu', 'normfactor'), ('otheruncorr_bkguncrt', 'shapesys'), ('uncorr_bkguncrt', 'shapesys')]
measurements: ['Measurement', 'OtherMeasurement']

Indeed. At this point, we can use all the functionality we expect of pyhf, such as running a hypothesis test on each workspace:

model = workspace.model()
data = workspace.data(model)
test_poi = 1.0

pyhf.infer.hypotest(test_poi, data, model, test_stat="qtilde")
array(0.49567314)

other_model = other_workspace.model()
other_data = other_workspace.data(other_model)

pyhf.infer.hypotest(test_poi, other_data, other_model, test_stat="qtilde")
array(0.49567314)

combined_model = combined_workspace.model()
combined_data = combined_workspace.data(combined_model)

pyhf.infer.hypotest(test_poi, combined_data, combined_model, test_stat="qtilde")
multiple measurements defined. Taking the first measurement.
array(0.37128219)