Lightweight Preservation Environment (LPE)
Historically, the LPE has been a complete data grid with preservation features. For the purposes of ADAPT, the LPE needs to run alongside other data grid technologies as a supplement instead of existing independently.
Proposed System
We propose the following implementation for the next version of the LPE.
1. Built entirely on SRB
Integration with SRB is a requirement at this point, since we must work with SDSC. For this reason, I think we should start there, and leave integration with Globus components until later.
2. Lightweight
We let the SRB do most of the heavy lifting: the SRB protocol for moving data, and the SRB MCAT for storing location information as well as locating masters. Basically, the LPE would exist as a data manager on a per-master basis, giving a one-to-one correspondence between replica sites and SRB masters. The data manager handles storing and enforcing per-replica policy information.
3. De-centralized policy
Policy is managed on a replica-by-replica basis. For this version, policy consists of a positive and bounded number of required replicas, a lower-bounded frequency for checking these replicas, and an expiration date (which we don't have to act on). When a file is replicated from one replica site to the next, its policy is replicated as well and enforced independently.
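As a rough illustration of the per-replica policy described above, here is a minimal Python sketch. The class name, field names, and the two bound constants are hypothetical, since the proposal only says the values are bounded. The sketch also reads "lower-bounded frequency" as an upper bound on the interval between checks, which is an assumption.

```python
from dataclasses import dataclass, replace
from datetime import date, datetime, timedelta
from typing import Optional

# Hypothetical bounds; the proposal does not give concrete values.
MAX_REPLICAS = 16
MAX_CHECK_INTERVAL = timedelta(days=365)


@dataclass(frozen=True)
class ReplicaPolicy:
    """Policy stored alongside a single replica (illustrative only)."""

    required_replicas: int
    check_interval: timedelta
    expires: Optional[date] = None  # recorded, but not acted on in this version

    def __post_init__(self) -> None:
        # "Positive and bounded" number of required replicas.
        if not 1 <= self.required_replicas <= MAX_REPLICAS:
            raise ValueError("required_replicas is out of bounds")
        # A lower bound on check frequency is an upper bound on the interval.
        if not timedelta(0) < self.check_interval <= MAX_CHECK_INTERVAL:
            raise ValueError("check_interval is out of bounds")

    def is_check_due(self, last_checked: datetime, now: datetime) -> bool:
        """Return True once at least one full interval has elapsed."""
        return now - last_checked >= self.check_interval

    def copy_for_new_replica(self) -> "ReplicaPolicy":
        """Return an equal, independent policy to ship with a new replica."""
        return replace(self)
```

Each data manager would hold its own copy and enforce it locally, so no central policy store is needed.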
4. Simple interface
By keeping the interface simple and abstract, we can extend this system later beyond the SRB.
PAWN uploads and registers files with the LPE in a single step. It might then optionally ask for proof that the file was stored. An initial push might also initiate a copy into the deep archive.
The inter-manager interface is the same, with a store-and-register operation to create a new replica and a proof operation to verify that the remote replica is correct.
For retrieval, the LPE provides functionality to find replicas and to retrieve a single replica, and would also make the aforementioned proof operation available to retrieval clients.
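The operations described above (store-and-register, proof, find, retrieve) could be captured in an abstract interface along these lines. This is only a sketch: every class and method name here is hypothetical and not an existing LPE, PAWN, or SRB API. The proposal also does not say what a "proof" looks like; the nonce-based digest below is one assumed possibility.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Dict, List


class DataManager(ABC):
    """Hypothetical interface shared by PAWN, peer managers, and retrieval clients."""

    @abstractmethod
    def store_and_register(self, name: str, data: bytes) -> None:
        """Store the file and register it in a single step."""

    @abstractmethod
    def prove(self, name: str, nonce: bytes) -> str:
        """Return evidence that the named file is held intact."""

    @abstractmethod
    def find_replicas(self, name: str) -> List[str]:
        """Return the replica sites known to hold the named file."""

    @abstractmethod
    def retrieve(self, name: str) -> bytes:
        """Return the contents of a single replica."""


def expected_proof(data: bytes, nonce: bytes) -> str:
    """Assumed proof format: a digest over a caller-chosen nonce plus the data."""
    return hashlib.sha256(nonce + data).hexdigest()


class InMemoryManager(DataManager):
    """Toy stand-in for an SRB-backed manager, for illustration only."""

    def __init__(self, site: str) -> None:
        self.site = site
        self._files: Dict[str, bytes] = {}

    def store_and_register(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def prove(self, name: str, nonce: bytes) -> str:
        return expected_proof(self._files[name], nonce)

    def find_replicas(self, name: str) -> List[str]:
        # A real manager would consult the MCAT; this one only knows itself.
        return [self.site] if name in self._files else []

    def retrieve(self, name: str) -> bytes:
        return self._files[name]
```

Because callers only see `DataManager`, a non-SRB backend could later be swapped in without changing PAWN or the inter-manager code.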
For this to work, I assume self-identifying names. That is, the file name should be a cryptographic digest of the file itself, so that an error in file replication can be immediately detected. It also helps prevent two different files with the same name from showing up on different replica sites at the same time.
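A minimal sketch of self-identifying names follows. The choice of SHA-256 and the function names are assumptions; the proposal only calls for "a cryptographic digest of the file itself".

```python
import hashlib
from typing import BinaryIO


def digest_name(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Derive a file's name from its contents, reading in chunks."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def verify_replica(name: str, stream: BinaryIO) -> bool:
    """Return True if the replica's contents still match its name."""
    return digest_name(stream) == name
```

A receiving manager could call `verify_replica` right after a transfer and reject the copy on a mismatch, which is the immediate error detection described above.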
Implementation Issues
The following must be feasible for this to work. The easier these are, the easier it will be to implement the LPE.
1. Master location through federated MCATs
2. Inter-zone replication
3. Finding the actual local filename for a given replica from the master, for local I/O operations
To ensure inter-zone replication, we could code the manager to pick replica sites in other zones before picking more local replica sites.
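The zone preference described above might look something like the following sketch. The `Site` type and the function name are hypothetical, and a real manager would presumably get its candidate list from the federated MCATs.

```python
from typing import List, NamedTuple


class Site(NamedTuple):
    """A candidate replica site (illustrative only)."""

    name: str
    zone: str


def pick_replica_sites(local_zone: str, candidates: List[Site], needed: int) -> List[Site]:
    """Pick up to `needed` sites, preferring sites outside the local zone."""
    # False sorts before True, so remote-zone sites come first. sorted() is
    # stable, so the original order is kept within each group.
    ordered = sorted(candidates, key=lambda site: site.zone == local_zone)
    return ordered[:needed]
```

With this ordering, local sites are used only once the remote candidates run out.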
If we wanted to keep things really simple, we could associate the data manager with each MCAT in a federated MCAT system instead of with each master, but this might make verification operations prohibitively expensive.
I think this should get us started with a system that meets our needs in a relatively short amount of time.
-- GaryJackson - 04 May 2005
Comments, etc.
What would storing the master location in the MCAT cost us later in terms of running the LPE on non-SRB or mixed systems?
Within the SRB, they use DCE-based GUID generation for unique IDs. PAWN already tracks these upon ingest and updates its manifest in the SRB. Switching to a digest is doable.
How would any non-file information be preserved, i.e., hierarchy information in the SRB, as we replicate between zones? Or do we not care, and leave this up to front-end interfaces to present the user with a nice view?
-- MikeSmorul - 11 May 2005