4-025
Simulation Analysis of the
Optimal Storage Resource Allocation for Large HENP Databases
Jinbaek Kim and Arie Shoshani
Large High Energy and
Nuclear Physics (HENP) databases are commonly stored on robotic tape systems
because of cost considerations. Selected subsets of the data are later
copied into disk caches for analysis or data
mining. Because of the relatively
long time to mount, seek, and read a tape, it is important to minimize the
number of times that data is cached to disk.
Having too little disk cache forces files to be removed from disk
prematurely, reducing the chance that they can be shared with other users.
Similarly, having too few tape drives leaves a large disk cache
underused, because the throughput of the tape system becomes the bottleneck.
The right balance between tape and disk resources depends on the patterns of
requests for the data. In this
paper, we describe a simulation that characterizes such a system in terms of the
resources and the request patterns. We
learn from the simulation which parameters affect the performance of the system
the most. We also observe from the
simulation that there is a point beyond which it is not worth investing in
additional resources because the added benefit is marginal.
We call this point the "point-of-no-benefit" (or PNB), and show
that using this concept we can more easily discover the relationship of various
parameters to the performance of the system.
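
To make the point-of-no-benefit idea concrete, the following is a minimal sketch and not the authors' simulation. It assumes an LRU replacement policy, equal-sized files, a skewed (1/rank) request pattern, and a 1% improvement threshold; none of these are stated in the abstract, and all function names are hypothetical. It models only the disk cache and ignores the number of tape drives.

```python
import random
from collections import OrderedDict


def make_requests(n_files, n_requests, seed=0):
    """Generate file requests with a skewed (1/rank) popularity (an assumption)."""
    rng = random.Random(seed)
    weights = [1.0 / (rank + 1) for rank in range(n_files)]
    return rng.choices(range(n_files), weights=weights, k=n_requests)


def tape_reads(cache_slots, requests):
    """Count tape reads (cache misses) for an LRU disk cache of cache_slots files."""
    cache = OrderedDict()
    misses = 0
    for f in requests:
        if f in cache:
            cache.move_to_end(f)           # hit: mark as most recently used
        else:
            misses += 1                    # miss: file must be read from tape
            cache[f] = True
            if len(cache) > cache_slots:
                cache.popitem(last=False)  # evict the least recently used file
    return misses


def point_of_no_benefit(sizes, costs, threshold=0.01):
    """Return the first size whose next increment improves cost by less than
    threshold, measured relative to the cost at the smallest size."""
    for i in range(len(sizes) - 1):
        gain = (costs[i] - costs[i + 1]) / costs[0]
        if gain < threshold:
            return sizes[i]
    return sizes[-1]


requests = make_requests(n_files=200, n_requests=5000)
sizes = list(range(10, 201, 10))           # disk cache sizes, in files
costs = [tape_reads(s, requests) for s in sizes]
pnb = point_of_no_benefit(sizes, costs)
print("tape reads per cache size:", list(zip(sizes, costs)))
print("estimated PNB (cache size in files):", pnb)
```

Changing the request pattern (for example, the skew in `make_requests`) shifts the estimated PNB, which is the kind of dependence on request patterns that the abstract describes.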
Keywords: optimal storage resources, storage allocation, storage simulation analysis
Contact: Dr. Arie Shoshani (Lawrence Berkeley National Laboratory), shoshani@lbl.gov