HEP-like Simple Workspace

Just as in the previous chapter, we’re going to go up “one level” — this time from models to workspaces.

The Simple Model

First, let’s remind ourselves of what the simple model looked like…

import pyhf
import json

model = pyhf.simplemodels.hepdata_like(signal_data=[5.0, 10.0],
                                       bkg_data=[50.0, 60.0], 
                                       bkg_uncerts=[5.0, 12.0])
print(json.dumps(model.spec, indent=2))
{
  "channels": [
    {
      "name": "singlechannel",
      "samples": [
        {
          "name": "signal",
          "data": [
            5.0,
            10.0
          ],
          "modifiers": [
            {
              "name": "mu",
              "type": "normfactor",
              "data": null
            }
          ]
        },
        {
          "name": "background",
          "data": [
            50.0,
            60.0
          ],
          "modifiers": [
            {
              "name": "uncorr_bkguncrt",
              "type": "shapesys",
              "data": [
                5.0,
                12.0
              ]
            }
          ]
        }
      ]
    }
  ]
}

The Simple Workspace

Let’s go ahead and load up a workspace we’ve included as part of this tutorial.

with open("data/2-bin_1-channel.json") as serialized:
  spec = json.load(serialized)

workspace = pyhf.Workspace(spec)
workspace
<pyhf.workspace.Workspace object at 0x7f036868a450>

What did we just make? A pyhf.Workspace object. Since a Workspace subclasses dict, we can serialize it directly with json.dumps to check out the specification.

print(json.dumps(workspace, indent=2))
{
  "channels": [
    {
      "name": "singlechannel",
      "samples": [
        {
          "name": "signal",
          "data": [
            5.0,
            10.0
          ],
          "modifiers": [
            {
              "name": "mu",
              "type": "normfactor",
              "data": null
            }
          ]
        },
        {
          "name": "background",
          "data": [
            50.0,
            60.0
          ],
          "modifiers": [
            {
              "name": "uncorr_bkguncrt",
              "type": "shapesys",
              "data": [
                5.0,
                12.0
              ]
            }
          ]
        }
      ]
    }
  ],
  "observations": [
    {
      "name": "singlechannel",
      "data": [
        52.5,
        65.0
      ]
    }
  ],
  "measurements": [
    {
      "name": "Measurement",
      "config": {
        "poi": "mu",
        "parameters": []
      }
    }
  ],
  "version": "1.0.0"
}

As you can see, the channel specification looks exactly like our simple model, but the workspace additionally specifies the observations and the (one) measurement. A workspace can encapsulate multiple model definitions by defining multiple measurements. This is a simple example, however, so don’t expect a lot of bells and whistles.

Getting the model

So how do we get the model from the workspace? If you recall from running pyhf inspect a while back, there’s a default measurement. For a workspace with one measurement, we expect the default measurement to be exactly that one.

workspace.get_measurement()
{'name': 'Measurement', 'config': {'poi': 'mu', 'parameters': []}}

But we can also get the measurement by name:

workspace.get_measurement(measurement_name='Measurement')
{'name': 'Measurement', 'config': {'poi': 'mu', 'parameters': []}}

Or by index (it is a list of measurements, after all):

workspace.get_measurement(measurement_index=0)
{'name': 'Measurement', 'config': {'poi': 'mu', 'parameters': []}}

Or create a new measurement on the fly by specifying the name for the parameter of interest:

workspace.get_measurement(poi_name='mu')
{'name': 'NormalMeasurement', 'config': {'poi': 'mu', 'parameters': []}}

What does this mean for us though? Well, when we ask for a model, we specify the measurement that we want to use with it. See the documentation for more information. In this case, let’s build the model for the default measurement.

model = workspace.model()

And this gives us a pyhf.Model object that we can use just as we did in the previous chapter.

print(f'  channels: {model.config.channels}')
print(f'     nbins: {model.config.channel_nbins}')
print(f'   samples: {model.config.samples}')
print(f' modifiers: {model.config.modifiers}')
print(f'parameters: {model.config.parameters}')
print(f'  nauxdata: {model.config.nauxdata}')
print(f'   auxdata: {model.config.auxdata}')
  channels: ['singlechannel']
     nbins: {'singlechannel': 2}
   samples: ['background', 'signal']
 modifiers: [('mu', 'normfactor'), ('uncorr_bkguncrt', 'shapesys')]
parameters: ['mu', 'uncorr_bkguncrt']
  nauxdata: 2
   auxdata: [100.0, 25.0]
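Those auxiliary data are not magic: for a shapesys modifier, each bin contributes an effective Poisson count of (background / uncertainty)² to the constraint term, which we can reproduce by hand:

```python
# Effective Poisson counts for the shapesys constraint term, per bin:
# (background / uncertainty)**2
bkg = [50.0, 60.0]
unc = [5.0, 12.0]
aux = [(b / u) ** 2 for b, u in zip(bkg, unc)]
print(aux)  # [100.0, 25.0]
```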

Getting the Observations

One last thing we’ll want to do is extract the observations for the model.

workspace.data(model)
[52.5, 65.0, 100.0, 25.0]

Just like before, we’re getting both the actual data of the main model and the auxiliary data for the constraint model. And just like before, we can ask for data without auxiliary information:

workspace.data(model, with_aux=False)
[52.5, 65.0]

There’s not much left beyond this, as we’ve explored all the functionality at the pyhf.Model level. The workspace is a nifty user-friendly tool to make it easier to interact with the HistFactory JSON workspaces in the wild. We’ll explore more functionality related to workspaces in a future chapter.