Limitations
Although these data are all intended for use in scientific research, they have limitations. Keep these limitations in mind when using the datasets in order to avoid false-positive results or dead ends. Some of the most important limitations are documented here; if you have any questions about whether a particular analysis is possible, please feel free to contact us. If the analysis is not possible with the available public data, you are also welcome to consider a short-term association with ATLAS in order to perform the analysis as a member of the collaboration would.
Data format limitations
The PHYSLITE proton-proton collision data format released for research use does not contain all charged particle tracks in the event, nor does it contain all calorimeter clusters or particle flow objects. As a result, some analyses cannot be performed:
- Analyses requiring detailed track information for all tracks (e.g. charged particle spectrum analyses, energy flow analyses)
- Analyses requiring new jet collections or jet grooming (e.g. studies of top-jet tagging or jet cross-sections as a function of jet radius)
- Analyses examining specific hadron decay information (e.g. B-hadron studies)
The PHYSLITE data format is constructed to support "standard" analyses. This also implies that analyses requiring non-standard configurations of specific tools may be difficult or impossible. For example:
- Searches for long-lived particles that require specific information to reconstruct the long-lived particle are very difficult, because most of that information is unavailable
- Measurements and searches with non-standard physics object definitions or reconstruction (e.g. customized heavy-flavor hadron tagging, modified photon reconstruction) are not possible
- Measurements using information from the forward detectors in ATLAS are not currently supported (e.g. some diffractive physics measurements)
Data sample limitations
In all cases it is important to respect a good runs list, which removes data affected by detector data quality issues. In some rare cases, careful checks might reveal additional data quality issues. The most straightforward way to look for these is to check the distribution of the events of interest in time (or across data-taking runs). Physics should generally be uniform in time; detector issues are unlikely to be.
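The two checks described above can be sketched in a few lines of Python. This is a minimal illustration and not an ATLAS tool: the good runs list is represented here as a plain dictionary of run numbers and lumiblock ranges, and the names `GOOD_RUNS`, `passes_grl`, `yield_per_lumi`, and `flag_outlier_runs`, as well as the run numbers and luminosity values, are all hypothetical. In practice, the good runs list and per-run luminosities would come from the files distributed with the data.

```python
from collections import Counter
from statistics import median

# Hypothetical good runs list: run number -> inclusive (first, last) lumiblock ranges.
# The run numbers and ranges are made up for illustration.
GOOD_RUNS = {
    1001: [(1, 50), (60, 120)],
    1002: [(5, 200)],
}


def passes_grl(run, lumiblock, grl=GOOD_RUNS):
    """Return True if (run, lumiblock) falls inside a good range of the list."""
    return any(first <= lumiblock <= last for first, last in grl.get(run, []))


def yield_per_lumi(events, lumi_per_run, grl=GOOD_RUNS):
    """Count selected events per run and divide by that run's luminosity.

    events: iterable of (run, lumiblock) pairs for the events of interest.
    lumi_per_run: dict of run number -> integrated luminosity (any fixed unit).
    """
    counts = Counter(run for run, lb in events if passes_grl(run, lb, grl))
    return {run: counts.get(run, 0) / lumi for run, lumi in lumi_per_run.items()}


def flag_outlier_runs(rates, tolerance=0.5):
    """Return runs whose rate differs from the median rate by more than
    `tolerance` times the median (a crude uniformity check)."""
    mid = median(rates.values())
    return sorted(run for run, rate in rates.items()
                  if abs(rate - mid) > tolerance * mid)
```

For example, `flag_outlier_runs({1: 10.0, 2: 11.0, 3: 9.0, 4: 30.0})` returns `[4]`, since only run 4 lies far from the median rate. The fixed fractional tolerance is an arbitrary choice made for brevity; a real check should compare each run's yield with its expected statistical fluctuation, particularly for runs with few selected events.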
The samples that have been released are sufficient to establish standard backgrounds and generically applicable systematic uncertainties. In some cases, establishing key systematic uncertainties requires additional samples that have not been released (e.g. heavy quark fragmentation variations for top mass measurements). Some signal samples and variation samples have also been left out of the release in order to reduce the total storage space required (e.g. W-boson mass variation samples that are required for a template fit to the W-boson mass, or a wide variety of beyond-the-Standard-Model signal samples that are used in various searches for new particles).
In principle it is possible to produce your own samples. In practice, this is extremely complicated and strongly discouraged. The collaboration maintains significant infrastructure for producing samples that, for example, match the data-taking conditions and pileup profile of the data period being modeled. Getting these details right without significant help from a collaboration member or access to internal resources generally requires major computing resources and a prohibitive time investment.
If there are specific samples that would be beneficial to your analysis, you are welcome to request their release.