The CERN Open Data Client
For remote data access, you can use the CERN Open Data Client together with the xrootd
protocol. The client retrieves the file URLs, and xrootd lets you read the data directly from CERN’s storage systems.
It is currently not possible to retrieve the URLs of the .root files in the 2020 release for education;
support for this is in progress. For more information, see feat(records): add individual ATLAS 2020 ROOT files.
Install the cernopendata-client package along with the fsspec-xrootd package by running:
pip install cernopendata-client fsspec-xrootd
Retrieving File URLs
Once the client is installed, you can use it to obtain the URLs of the data files, for example, via their DOI (Digital Object Identifier). To get the file URLs from this record, you can use the following command:
cernopendata-client get-file-locations --doi 10.7483/OPENDATA.ATLAS.TC5G.AC24 --protocol xrootd
This command will return a file location like the following:
root://eospublic.cern.ch//eos/opendata/atlas/OutreachDatasets/2016-07-29/MC/mc_147770.Zee.root
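If you would rather collect the URLs from within Python, one option is to call the same command through `subprocess` and split its output into a list. The following is only a minimal sketch: the function names `parse_file_locations` and `get_file_locations` are hypothetical, and it assumes the client prints one URL per line on standard output, as in the example above.

```python
import subprocess


def parse_file_locations(cli_output):
    """Return the xrootd URLs found in the client's output (assumed one per line)."""
    urls = []
    for line in cli_output.splitlines():
        line = line.strip()
        # Keep only lines that look like xrootd URLs; skip blanks and other text.
        if line.startswith("root://"):
            urls.append(line)
    return urls


def get_file_locations(doi):
    """Run the cernopendata-client command shown above and return its URLs as a list."""
    result = subprocess.run(
        [
            "cernopendata-client",
            "get-file-locations",
            "--doi",
            doi,
            "--protocol",
            "xrootd",
        ],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the client exits with an error
    )
    return parse_file_locations(result.stdout)
```

With the client installed and network access available, `get_file_locations("10.7483/OPENDATA.ATLAS.TC5G.AC24")` should return the file locations of the record as a Python list, which you can then loop over.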
Accessing Data
Using the xrootd protocol, you can stream the data directly into your code without downloading the full dataset to your local machine. Here’s a Python example using ROOT to open a file:
import ROOT
# Open the remote file over xrootd; its contents are read on demand
file_url = "root://eospublic.cern.ch//eos/opendata/atlas/OutreachDatasets/2016-07-29/MC/mc_147770.Zee.root"
file = ROOT.TFile.Open(file_url)
Because only the parts of the file that your code reads are transferred, this approach allows for efficient access to large datasets hosted remotely on CERN's servers.
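Remote access can fail, for example when the network is unavailable or the URL is mistyped, so it may help to check the result before using the file. The sketch below wraps the open call in a hypothetical helper, `open_remote_root_file`, which is not part of ROOT or the client. It assumes you want an exception rather than a silent failure, and it imports ROOT inside the function so that the URL check works even where ROOT is not installed.

```python
def open_remote_root_file(file_url):
    """Open a ROOT file over xrootd and raise an error if it cannot be read."""
    if not file_url.startswith("root://"):
        raise ValueError(f"Expected an xrootd URL starting with root://, got: {file_url}")

    import ROOT  # deferred import: only needed once the URL looks valid

    root_file = ROOT.TFile.Open(file_url)
    # Treat a null result or a "zombie" file as a failed open.
    if not root_file or root_file.IsZombie():
        raise OSError(f"Could not open {file_url}")
    return root_file


def list_contents(root_file):
    """Return (name, class name) pairs for the top-level objects in the file."""
    return [(key.GetName(), key.GetClassName()) for key in root_file.GetListOfKeys()]
```

Calling `list_contents(open_remote_root_file(file_url))` with the URL from the example above should then show which trees and histograms the file contains, which is a convenient first step before reading any data.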