The 13 TeV 2020 Data
❗ For detailed information about this release, you can read "Review of the 13 TeV ATLAS Open Data release."
A set of proton-proton (pp) collision data has been released to the public by the ATLAS Collaboration for educational purposes. The data were collected by the ATLAS detector at the LHC at a centre-of-mass energy of 13 TeV during 2016 and correspond to an integrated luminosity of 10 fb⁻¹. The pp collision data are accompanied by a set of Monte Carlo (MC) simulated samples describing several processes, which are used to model the expected distributions of different signal and background events.
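As a rough illustration of how the integrated luminosity enters when simulation is compared with data, the sketch below computes the expected number of events for a process, N = σ × L, and the corresponding per-event weight for a simulated sample. The function and parameter names are hypothetical and are not branch names from the release; the cross section and sum of generator weights would have to come from the sample metadata.

```python
PB_TO_FB = 1000.0  # unit conversion: 1 pb = 1000 fb


def expected_events(cross_section_pb, luminosity_fb=10.0):
    """Expected number of events N = sigma * L (sigma in pb, L in fb^-1)."""
    return cross_section_pb * PB_TO_FB * luminosity_fb


def per_event_weight(cross_section_pb, sum_of_weights, mc_weight=1.0, luminosity_fb=10.0):
    """Weight that scales one simulated event to the data luminosity.

    sum_of_weights is the sum of generator weights of the full sample
    (hypothetical input, assumed to be available from the sample metadata).
    """
    if sum_of_weights <= 0:
        raise ValueError("sum_of_weights must be positive")
    return mc_weight * expected_events(cross_section_pb, luminosity_fb) / sum_of_weights
```

For example, a process with a cross section of 1 pb would be expected to yield about 10 000 events in 10 fb⁻¹ before any selection.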
The released samples are provided in a simplified data format, reducing the information content of the original data analysis format used within the ATLAS Collaboration.
The resulting format is a ROOT ntuple with more than 80 branches. Readers not familiar with ROOT, a modular scientific software toolkit, can refer to the ROOT documentation, which provides a rich set of tutorials and code examples.
Several final-state collections are provided within this release. The corresponding multiplicities of final-state objects, minimum transverse momentum requirements and collection names are shown below:
Final-state categories | Leading object pT (min) [GeV] | Collection name |
---|---|---|
leptons | 25 | 1lep |
leptons | 25 | 2lep |
leptons | 25 | 3lep |
leptons | 25 | 4lep |
large-R jet & lepton | 250 (large-R jet), 25 (lepton) | 1largeRjet1lep |
lepton & hadronic τ | 20 (hadronic τ), 25 (lepton) | 1lep1tau |
photons | 35 | GamGam |
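
The thresholds in the table above can be kept in a small lookup, which is handy when checking that a user-level selection is not looser than the one already applied to a collection. This is a minimal sketch: the object labels (`lepton`, `tau`, `largeRjet`, `photon`) are hypothetical keys chosen for illustration, the leading object of the `1lep` to `4lep` collections is assumed to be a lepton, and the comparison is assumed to be inclusive (`>=`).

```python
# Minimum leading-object pT in GeV per collection, copied from the table above.
MIN_LEADING_PT_GEV = {
    "1lep": {"lepton": 25.0},
    "2lep": {"lepton": 25.0},
    "3lep": {"lepton": 25.0},
    "4lep": {"lepton": 25.0},
    "1largeRjet1lep": {"largeRjet": 250.0, "lepton": 25.0},
    "1lep1tau": {"tau": 20.0, "lepton": 25.0},
    "GamGam": {"photon": 35.0},
}


def passes_leading_pt(collection, leading_pt_gev):
    """Return True if every required leading object meets its pT threshold.

    leading_pt_gev maps an object label to the pT (in GeV) of the leading
    object of that type; a missing object counts as failing the requirement.
    """
    thresholds = MIN_LEADING_PT_GEV[collection]
    return all(leading_pt_gev.get(obj, 0.0) >= cut for obj, cut in thresholds.items())
```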
Reconstructed physics objects
Several reconstructed physics objects (electrons, muons, photons, hadronically decaying tau-leptons, small-R jets, large-R jets) are contained within the 13 TeV ATLAS Open Data, and their preselection requirements are detailed below:
Electron (e) | Muon (μ) | Photon (γ) |
---|---|---|
InDet & EMCAL rec. | InDet & MS rec. | InDet & EMCAL rec. |
loose identification | loose identification | tight identification |
loose isolation | loose isolation | loose isolation |
GeV | GeV | GeV |
Hadronically decaying τ-leptons (τ-had) | Small-R jets | Large-R jets |
---|---|---|
InDet & EMCAL rec. | EMCAL & HCAL rec. | EMCAL & HCAL rec. |
medium identification | anti-kt, R = 0.4 | anti-kt, R = 1.0 |
GeV | GeV | GeV |
1 or 3 associated tracks | b-tagging (MV2c10) | trimming: , |
The 13 TeV ATLAS Open Data events are selected by applying several event-quality and trigger criteria, and classified according to the type and multiplicity of reconstructed objects with high transverse momentum. Several standard selection requirements, referred to as preselection, are applied to each of the reconstructed physics objects within the 13 TeV ATLAS Open Data, as detailed in the table below:
Electrons & Muons | Small-R jets | Photons | Large-R jets | |
---|---|---|---|---|
GeV | GeV | GeV | GeV | |
GeV | ||||
In addition, several data-quality criteria ensure that the detector was functioning properly. Events are rejected if they contain reconstructed jets associated with energy deposits that can arise from hardware problems, beam-halo events or cosmic-ray showers. Furthermore, events are required to have at least one reconstructed vertex with two or more associated tracks.
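The vertex requirement is already applied to the released events, so the following sketch only spells out its logic. The input, a list holding the number of tracks associated with each reconstructed vertex, is a hypothetical representation and not a branch of the released ntuples.

```python
def has_good_vertex(tracks_per_vertex, min_tracks=2):
    """True if at least one reconstructed vertex has min_tracks or more tracks."""
    return any(n_tracks >= min_tracks for n_tracks in tracks_per_vertex)
```

An event with vertices of 1 and 3 associated tracks would pass, while an event whose vertices all have a single track, or an event with no vertex at all, would be rejected.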
Processes
The 13 TeV ATLAS Open Data set comprises not only pp collision data recorded with the ATLAS detector in 2016 but also MC simulation samples describing several Standard Model (SM) and beyond the Standard Model (BSM) processes, which are used to model the expected distributions of different signal and background events. All simulated samples were processed through the same reconstruction algorithms and analysis chain as the data and subjected to a loose event preselection to reduce processing time.
The simulated SM processes include top-quark-pair production, single-top production, production of weak bosons in association with jets (W+jets, Z+jets), production of a pair of bosons (diboson WW, WZ, ZZ) and SM Higgs production. This basic set is complemented by simulations of BSM processes (heavy Z' and SUSY production). The MC samples released in the 13 TeV ATLAS Open Data are described below:
Top-quark production
Process | Unique "channelNumber" | Generator, hadronisation | Additional information |
---|---|---|---|
tt̄+jets | 410000 | Powheg-Box V2 + Pythia 8 | only and decays of tt̄-system |
single (anti)top t-channel | (410012) 410011 | Powheg-Box V1 + Pythia 6 | |
single (anti)top Wt-channel | (410014) 410013 | Powheg-Box V2 + Pythia 6 | |
single (anti)top s-channel | (410026) 410025 | Powheg-Box V2 + Pythia 6 | |
W/Z (+jets) production
Process | Unique "channelNumber" | Generator, hadronisation | Additional information |
---|---|---|---|
| | 361100 – 361108 | Powheg-Box V2 + Pythia 8 | LO accuracy up to Njets = 1 |
| | 361500 – 361505 | Powheg-Box V2 + Pythia 8 | LO accuracy up to 3-jets final states |
| | 361400 – 361441 | Sherpa 2.2 | LO accuracy up to 3-jets final states |
Diboson production
Process | Unique "channelNumber" | Generator, hadronisation | Additional information |
---|---|---|---|
| | 363359, 363360 | Sherpa 2.2 | final states |
| | 363492 | Sherpa 2.2 | final states |
| | 363356 | Sherpa 2.2 | final states |
| | 363490 | Sherpa 2.2 | final states |
| | 363358 | Sherpa 2.2 | final states |
| | 363489 | Sherpa 2.2 | final states |
| | 363491 | Sherpa 2.2 | final states |
| | 363493 | Sherpa 2.2 | final states |
SM Higgs production (m_H = 125 GeV)
Process | Unique "channelNumber" | Generator, hadronisation | Additional information |
---|---|---|---|
| | 345324 | Powheg-Box V2 + Pythia 8 | final states |
| | 345323 | Powheg-Box V2 + Pythia 8 | final states |
| | 345060 | Powheg-Box V2 + Pythia 8 | final states |
| | 344235 | Powheg-Box V2 + Pythia 8 | final states |
| | 341947 | Pythia 8 | final states |
| | 341964 | Pythia 8 | final states |
| | 343981 | Powheg-Box V2 + Pythia 8 | final states |
| | 345041 | Powheg-Box V2 + Pythia 8 | final states |
| γγ | 345318, 345319 | Powheg-Box V2 + Pythia 8 | final states |
| | 341081 | aMC@NLO + Pythia 8 | final states |
BSM production
Process | Unique "channelNumber" | Generator, hadronisation | Additional information |
---|---|---|---|
| | 301325 | Pythia 8 | TeV |
| | 392985 | aMC@NLO + Pythia 8 | GeV, GeV |
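
Since each simulated sample carries a unique "channelNumber", the tables above can be turned into a simple lookup that assigns a sample to a broad process category, for instance when grouping backgrounds in a stacked histogram. This is only a sketch: the category labels and the function name are illustrative, and the W/Z (+jets) ranges are assumed to be inclusive and contiguous.

```python
# Channel numbers copied from the tables above; category labels are illustrative.
TOP = {410000, 410011, 410012, 410013, 410014, 410025, 410026}
DIBOSON = {363356, 363358, 363359, 363360, 363489, 363490, 363491, 363492, 363493}
HIGGS = {341081, 341947, 341964, 343981, 344235, 345041,
         345060, 345318, 345319, 345323, 345324}
BSM = {301325, 392985}
# Assumption: the quoted ranges are inclusive and every number inside is a valid sample.
WZ_JETS_RANGES = [(361100, 361108), (361400, 361441), (361500, 361505)]


def classify_channel(channel_number):
    """Map a channelNumber to a broad process category, or 'unknown'."""
    if channel_number in TOP:
        return "top"
    if any(low <= channel_number <= high for low, high in WZ_JETS_RANGES):
        return "W/Z+jets"
    if channel_number in DIBOSON:
        return "diboson"
    if channel_number in HIGGS:
        return "Higgs"
    if channel_number in BSM:
        return "BSM"
    return "unknown"
```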
General Capabilities of the Datasets
The publicly released datasets can be used for educational purposes with different levels of task difficulty.
At a beginner level, one could visualise the content of the datasets and produce simple distributions. An intermediate-level task would consist of making histograms with collision data after some basic selection. Advanced-level tasks would allow for a deeper look into the ATLAS data, with possibilities of measuring real event properties and physical quantities.
A non-exhaustive list of possible tasks with the proposed datasets includes:
- Comparisons of several distributions of event variables for simulated signal and background events.
- Finding variables that are able to separate signal from background (jet multiplicity, transverse momenta of jets and leptons, lepton isolation, b-tagging, missing transverse energy, angular distributions).
- Development and modification of cuts on these variables in order to improve the separation of signal from background.
- Optimisation of the signal-over-background ratio and estimation of the purity based on simulation only.
- Comparisons of the selection efficiency between data and simulation.
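The cut-based tasks in the list above can be sketched as a simple threshold scan on one discriminating variable, using simulation only. The sketch below is a toy illustration, not an ATLAS procedure: events are assumed to be given as (value, weight) pairs, the cut is assumed to keep events with a value above the threshold, and the purity is taken to be S / (S + B).

```python
def scan_cut(signal, background, thresholds):
    """Scan a lower cut on one variable and report yields per threshold.

    signal, background: lists of (value, weight) pairs for simulated events.
    Returns a list of (cut, S, B, S_over_B, purity) tuples.
    """
    results = []
    for cut in thresholds:
        s = sum(weight for value, weight in signal if value > cut)
        b = sum(weight for value, weight in background if value > cut)
        s_over_b = s / b if b > 0 else float("inf")
        purity = s / (s + b) if (s + b) > 0 else 0.0
        results.append((cut, s, b, s_over_b, purity))
    return results


if __name__ == "__main__":
    # Toy inputs: values could be, e.g., a lepton pT in GeV.
    toy_signal = [(50.0, 1.0), (60.0, 1.0), (70.0, 1.0)]
    toy_background = [(20.0, 1.0), (30.0, 1.0), (55.0, 1.0)]
    for row in scan_cut(toy_signal, toy_background, [0.0, 40.0]):
        print(row)
```

Comparing the rows for different thresholds shows how tightening a cut trades signal yield against background rejection.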
Advanced-level tasks might include:
- Derivation of production cross sections and masses of objects.
- Reconstruction of the objects (quarks or bosons) by assigning the detector physics objects (jets, leptons, missing energy) to the hypothetical decay trees.
- Estimation of the impact of other sources of systematic uncertainties (luminosity uncertainty, b-tagging efficiency, background modelling) by adding approximate and conservative values.
- A test-bed for new data-analysis techniques, e.g. kinematic fitting procedures, multivariate discrimination of signal from background and other machine learning tasks.
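As one example of the mass-reconstruction tasks listed above, the invariant mass of a system of objects can be computed from their summed four-momenta, m² = E² − |p|². The sketch below assumes each object is described by transverse momentum, pseudorapidity, azimuthal angle and energy in consistent units; the function names are hypothetical and the units used in the released ntuples should be checked before applying it.

```python
import math


def four_vector(pt, eta, phi, energy):
    """Convert (pT, eta, phi, E) to Cartesian components (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    return (energy, px, py, pz)


def invariant_mass(objects):
    """Invariant mass of a list of (pT, eta, phi, E) tuples.

    Small negative m^2 values caused by rounding are clipped to zero.
    """
    e_tot = px_tot = py_tot = pz_tot = 0.0
    for pt, eta, phi, energy in objects:
        e, px, py, pz = four_vector(pt, eta, phi, energy)
        e_tot += e
        px_tot += px
        py_tot += py
        pz_tot += pz
    m2 = e_tot**2 - px_tot**2 - py_tot**2 - pz_tot**2
    return math.sqrt(max(m2, 0.0))
```

Two massless leptons emitted back to back in the transverse plane, each with pT = E = 45.6 GeV, give an invariant mass of 91.2 GeV, which is the kind of check one might run before applying the function to a dilepton selection.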