4-023

Distributing file-based data to remote sites within the BaBar Collaboration

Tim Adye, Alvise Dorigo, Alessandra Forti, Emanuele Leonardi on behalf of the BaBar Collaboration's Computing group

Over the first year and a half of data taking, the SLAC-based BaBar experiment accumulated more than 150 million e+ e- events at the Υ(4S) centre-of-mass energy. This data yield is expected to increase rapidly in the coming years. To ease the heavy load that this amount of data places on the SLAC computing facilities, and to give the worldwide BaBar community access to physics analysis data, several external sites have built local computing farms.

Handling and optimising the export of several terabytes of data to external sites over a non-dedicated network connection required a set of procedures that automatically check for new data created at SLAC and import them to local storage.

We developed a set of tools to meet these requirements for physics analysis data stored in ROOT files.

Central to these tools is a data-import procedure based on information stored in a relational database. This allows easy and efficient tracking of new data appearing at SLAC and precise selection of the type and amount of data to import, and it automatically balances data storage over many disk volumes. Efficient, fast network data transfer was achieved through the use of advanced FTP-like tools. A related tool handles the dynamic backup of imported data, implementing a multi-file archiving procedure that improves the efficiency of data storage on high-speed tape systems, including the HPSS system in use at SLAC.
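
The database-driven selection and disk balancing described above can be illustrated with a minimal sketch. It is not the collaboration's implementation: the `files` table, its columns, the function names and the greedy "most free space first" rule are all assumptions made for illustration, and an in-memory SQLite database stands in for the relational database.

```python
import sqlite3


def select_new_files(conn, data_type, max_bytes):
    """Return (name, size) rows not yet imported, oldest first,
    stopping before the total size would exceed max_bytes."""
    rows = conn.execute(
        "SELECT name, size FROM files "
        "WHERE data_type = ? AND imported = 0 ORDER BY created",
        (data_type,),
    ).fetchall()
    selected, total = [], 0
    for name, size in rows:
        if total + size > max_bytes:
            break
        selected.append((name, size))
        total += size
    return selected


def assign_to_volumes(files, free_space):
    """Greedy balancing (an assumed rule): place each file, largest
    first, on the volume that currently has the most free space."""
    free = dict(free_space)
    plan = {}
    for name, size in sorted(files, key=lambda f: -f[1]):
        volume = max(free, key=free.get)
        if free[volume] < size:
            raise RuntimeError(f"no volume can hold {name}")
        plan[name] = volume
        free[volume] -= size
    return plan


def build_demo_catalogue():
    """Hypothetical catalogue; sizes are in arbitrary units."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (name TEXT, size INTEGER, data_type TEXT, "
        "created INTEGER, imported INTEGER)"
    )
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
        [
            ("run1.root", 400, "analysis", 1, 1),  # already imported
            ("run2.root", 300, "analysis", 2, 0),
            ("run3.root", 500, "analysis", 3, 0),
            ("run4.root", 200, "raw", 4, 0),       # different data type
            ("run5.root", 600, "analysis", 5, 0),  # would exceed the quota
        ],
    )
    return conn
```

With the demo catalogue, requesting up to 1000 units of "analysis" data selects `run2.root` and `run3.root`, and the two files end up on different volumes. A real import would then transfer each file and mark it as imported in the database.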

Keywords:

Data distribution - Archiving - BaBar - ROOT

Contact:

Dr. Alvise Dorigo (I.N.F.N. Padova)  Alvise.Dorigo@pd.infn.it