{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using HEPData" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "import pyhf\n", "import pyhf.contrib.utils" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preserved on HEPData" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As of this tutorial, ATLAS has [published 18 full statistical models to HEPData](https://scikit-hep.org/pyhf/citations.html#published-statistical-models)\n", "\n", "
\n", "\n", "Let's explore the 1Lbb workspace a little bit shall we?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Getting the Data\n", "\n", "We'll use the `pyhf[contrib]` extra (which relies on `requests` and `tarfile`) to download the HEPData minted DOI and extract the files we need." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pyhf.contrib.utils.download(\n", " \"https://doi.org/10.17182/hepdata.90607.v3/r3\", \"1Lbb-likelihoods\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This will nicely download and extract everything we need." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!ls -lavh 1Lbb-likelihoods" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Instantiate our objects\n", "\n", "We have a background-only workspace `BkgOnly.json` and a signal patchset collection `patchset.json`. Let's create our python objects and play with them:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "spec = json.load(open(\"1Lbb-likelihoods/BkgOnly.json\"))\n", "patchset = pyhf.PatchSet(json.load(open(\"1Lbb-likelihoods/patchset.json\")))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So what did the analyzers give us for signal patches?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Patching in Signals\n", "\n", "Let's look at this [`pyhf.PatchSet`](https://pyhf.readthedocs.io/en/v0.7.5/_generated/pyhf.patchset.PatchSet.html#pyhf.patchset.PatchSet) object which provides a user-friendly way to interact with many signal patches at once.\n", "\n", "### PatchSet" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "patchset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Oh wow, we've got 125 patches. What information does it have?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"description: {patchset.description}\")\n", "print(f\" digests: {patchset.digests}\")\n", "print(f\" labels: {patchset.labels}\")\n", "print(f\" references: {patchset.references}\")\n", "print(f\" version: {patchset.version}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So we've got a useful description of the signal patches... there's a digest. Does that match the background-only workspace we have?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pyhf.utils.digest(spec)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It does! In fact, this sort of verification check will be done automatically when applying patches using `pyhf.PatchSet` as we will see shortly. To manually verify, simply run `pyhf.PatchSet.verify` on the workspace. No error means everything is fine. It will loudly complain otherwise." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "patchset.verify(spec)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "No error, whew. Let's move on.\n", "\n", "The labels `m1` and `m2` tells us that we have the signal patches parametrized in 2-dimensional space, likely as $m_1 = \\tilde{\\chi}_1^\\pm$ and $m_2 = \\tilde{\\chi}_1^0$... but I guess we'll see?\n", "\n", "The references list the references for this dataset, which is pointing at the hepdata record for now.\n", "\n", "Next, the version is the version of the schema set we're using with `pyhf` (`1.0.0`).\n", "\n", "And last, but certainly not least... its patches:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "patchset.patches" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So we can see all the patches listed both by name such as `C1N2_Wh_hbb_900_250` as well as a pair of points `(900, 250)`. Why is this useful? The `PatchSet` object acts like a special dictionary look-up where it will grab the patch you need based on the unique key you provide it.\n", "\n", "For example, we can look up by name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "patchset[\"C1N2_Wh_hbb_900_250\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or by the pair of points" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "patchset[(900, 250)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Patches\n", "\n", "A `pyhf.PatchSet` is a collection of `pyhf.Patch` objects. What is a patch indeed? It contains enough information about how to apply the signal patch to the corresponding background-only workspace (matched by digest)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "patch = patchset[\"C1N2_Wh_hbb_900_250\"]\n", "print(f\" name: {patch.name}\")\n", "print(f\"values: {patch.values}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most importantly, it contains the patch information itself. Specifically, this inherits from the `jsonpatch.JsonPatch` object, which is a 3rd party module providing native support for json patching in python. That means we can simply apply the patch to our workspace directly!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\" samples pre-patch: {pyhf.Workspace(spec).samples}\")\n", "print(f\"samples post-patch: {pyhf.Workspace(patch.apply(spec)).samples}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or, more quickly, from the `PatchSet` object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\" samples pre-patch: {pyhf.Workspace(spec).samples}\")\n", "print(f\"samples post-patch: {pyhf.Workspace(patchset.apply(spec, (900, 250))).samples}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Patching via Model Creation\n", "\n", "One last way to apply the patching is to, instead of patching workspaces, we patch the models as we build them from the background-only workspace. This maybe makes it easier to treat the background-only workspace as immutable, and patch in signal models when grabbing the model. Check it out." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "workspace = pyhf.Workspace(spec)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, load up our background-only spec into the workspace. Then let's create a model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = workspace.model(patches=[patchset[\"C1N2_Wh_hbb_900_250\"]])\n", "print(f\"samples (workspace): {workspace.samples}\")\n", "print(f\"samples ( model ): {model.config.samples}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Doing Physics\n", "\n", "So we want to try and reproduce part of the contour. At least convince ourselves we're doing *physics* and not *fauxsics*. ... Anyway... Let's remind ourselves of the 1Lbb contour as we don't have the photographic memory of the ATLAS SUSY conveners\n", "\n", "