Introduction to Workspaces#

As in the previous chapter, we’re going to go up “one level”: this time from models to workspaces.

The Simple Model#

First, let’s remind ourselves of what the simple model looked like…

import json

import pyhf

# Rebuild the two-bin, single-channel model from the previous chapter
model = pyhf.simplemodels.uncorrelated_background(
    signal=[5.0, 10.0], bkg=[50.0, 60.0], bkg_uncertainty=[5.0, 12.0]
)
print(json.dumps(model.spec, indent=2))
{
  "channels": [
    {
      "name": "singlechannel",
      "samples": [
        {
          "name": "signal",
          "data": [
            5.0,
            10.0
          ],
          "modifiers": [
            {
              "name": "mu",
              "type": "normfactor",
              "data": null
            }
          ]
        },
        {
          "name": "background",
          "data": [
            50.0,
            60.0
          ],
          "modifiers": [
            {
              "name": "uncorr_bkguncrt",
              "type": "shapesys",
              "data": [
                5.0,
                12.0
              ]
            }
          ]
        }
      ]
    }
  ]
}

The Simple Workspace#

Let’s go ahead and load up a workspace we’ve included as part of this tutorial.

# Load the workspace specification shipped with this tutorial
with open("data/2-bin_1-channel.json") as serialized:
    spec = json.load(serialized)

workspace = pyhf.Workspace(spec)
workspace
<pyhf.workspace.Workspace object at 0x7fc8c59708b0>

What did we just make? This returns a pyhf.Workspace object. Let’s check out the specification.

print(json.dumps(workspace, indent=2))
{
  "channels": [
    {
      "name": "singlechannel",
      "samples": [
        {
          "name": "signal",
          "data": [
            5.0,
            10.0
          ],
          "modifiers": [
            {
              "name": "mu",
              "type": "normfactor",
              "data": null
            }
          ]
        },
        {
          "name": "background",
          "data": [
            50.0,
            60.0
          ],
          "modifiers": [
            {
              "name": "uncorr_bkguncrt",
              "type": "shapesys",
              "data": [
                5.0,
                12.0
              ]
            }
          ]
        }
      ]
    }
  ],
  "observations": [
    {
      "name": "singlechannel",
      "data": [
        53.0,
        65.0
      ]
    }
  ],
  "measurements": [
    {
      "name": "Measurement",
      "config": {
        "poi": "mu",
        "parameters": []
      }
    }
  ],
  "version": "1.0.0"
}

As you can see, the channel specification looks exactly like our simple model, but the workspace additionally specifies the observations and the (one) measurement. A workspace can encapsulate multiple model definitions by defining multiple measurements, as sketched below. This is a simple example, however, so don’t expect a lot of bells and whistles.
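
To make that concrete, here’s a minimal sketch (not part of the tutorial’s data files: the second measurement’s name and its parameter settings are made up for illustration, and it assumes a measurement parameter block only needs the name plus the fields being overridden) of the same channels and observations carrying two measurements:

# A sketch only: reuse the channels/observations loaded into `spec` above and
# attach a second, hypothetical measurement that overrides the bounds on mu
multi_spec = dict(
    spec,
    measurements=[
        {"name": "Measurement", "config": {"poi": "mu", "parameters": []}},
        {
            "name": "BoundedMu",
            "config": {
                "poi": "mu",
                "parameters": [{"name": "mu", "bounds": [[0.0, 5.0]]}],
            },
        },
    ],
)
pyhf.Workspace(multi_spec).get_measurement(measurement_name="BoundedMu")

Each measurement picks its own POI and parameter settings on top of the shared channel definitions, which is how one workspace can describe several related model configurations.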

Getting the model#

So how do we get the model from the workspace? We need a measurement. Workspaces are required to have at least one measurement defined, and for a workspace with a single measurement, the default measurement is exactly that one.

workspace.get_measurement()
{'name': 'Measurement', 'config': {'poi': 'mu', 'parameters': []}}

But we can also get the measurement by name:

workspace.get_measurement(measurement_name="Measurement")
{'name': 'Measurement', 'config': {'poi': 'mu', 'parameters': []}}

Or by index (it is a list of measurements, after all):

workspace.get_measurement(measurement_index=0)
{'name': 'Measurement', 'config': {'poi': 'mu', 'parameters': []}}

What does this mean for us, though? Well, when we ask for a model, we specify the measurement that we want to use with it. Each of the measurements above has no additional parameter configurations on top of the existing model specification, and they all declare that the parameter of interest is mu.

See the documentation for more information. In this case, let’s build the model for the default measurement.

model = workspace.model()
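
Since this workspace has a single measurement, the default is all we need, but for completeness: to the best of my understanding, the measurement-selection keyword arguments are forwarded through to get_measurement(), so a specific measurement can be requested by name when building the model.

# Equivalent here, since "Measurement" is the only measurement in this workspace
model = workspace.model(measurement_name="Measurement")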

Either way, we get back a pyhf.Model object that we can use just as we did in the previous chapter.

print(f"  channels: {model.config.channels}")
print(f"     nbins: {model.config.channel_nbins}")
print(f"   samples: {model.config.samples}")
print(f" modifiers: {model.config.modifiers}")
print(f"parameters: {model.config.parameters}")
print(f"  nauxdata: {model.config.nauxdata}")
print(f"   auxdata: {model.config.auxdata}")
  channels: ['singlechannel']
     nbins: {'singlechannel': 2}
   samples: ['background', 'signal']
 modifiers: [('mu', 'normfactor'), ('uncorr_bkguncrt', 'shapesys')]
parameters: ['mu', 'uncorr_bkguncrt']
  nauxdata: 2
   auxdata: [100.0, 25.0]
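
As a quick aside (this check is mine, not part of the tutorial files): for a shapesys modifier the auxiliary data are the effective Poisson counts (nominal / uncertainty)² per bin, which is where the [100.0, 25.0] above comes from.

# (50 / 5)**2 = 100.0 and (60 / 12)**2 = 25.0, matching model.config.auxdata
[(nom / unc) ** 2 for nom, unc in zip([50.0, 60.0], [5.0, 12.0])]
[100.0, 25.0]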

Getting the Observations#

One last thing we’ll want to do is extract the observations for the model.

workspace.data(model)
[53.0, 65.0, 100.0, 25.0]

Just like before, we get both the observed data for the main model and the auxiliary data for the constraint model. And just like before, we can ask for the data without the auxiliary information:

workspace.data(model, include_auxdata=False)
[53.0, 65.0]
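
And as a final sketch (just reusing pyhf.Model methods from the previous chapter, nothing specific to workspaces), the data assembled by the workspace slots directly into the model’s likelihood:

# Evaluate the log-likelihood at the suggested initial parameter values,
# using the observed + auxiliary data assembled by the workspace
data = workspace.data(model)
init_pars = model.config.suggested_init()
model.logpdf(init_pars, data)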

There’s not much left beyond this, as we’ve already explored all of the functionality at the pyhf.Model level. The workspace is a nifty, user-friendly tool that makes it easier to interact with HistFactory JSON workspaces in the wild. We’ll explore more workspace-related functionality in a future chapter.