<<O>>  Difference Topic DigArch (r1.12 - 03 Apr 2006 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

 <<O>>  Difference Topic DigArch (r1.11 - 12 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

 <<O>>  Difference Topic DigArch (r1.9 - 05 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Line: 16 to 16

  • review all v1 components, identify success/failure
    • identify which components need to be rewritten
Changed:
<
<
  • Secure communication and look at authorization roles
>
>
  • Secure communication and look at authorization within a distributed archive

  • Modularize data transfer
  • pluggable low-level data store (tsm, srb, etc)
  • releasable beta
 <<O>>  Difference Topic DigArch (r1.8 - 05 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Changed:
<
<
Goals
>
>
Version 1 Goals

  • Automatic replication of packages
  • Create data nodes that support preservation functionality
Line: 10 to 10

    • package sets are grouped together to allow for nodes to validate data in the sets
  • Scale to 100 nodes
  • Focus on data nodes
Added:
>
>
  • internal prototype, document results in TR

Version 2 Goals - assuming preservation approach is feasible

  • review all v1 components, identify success/failure
    • identify which components need to be rewritten
  • Secure communication and look at authorization roles
  • Modularize data transfer
  • pluggable low-level data store (tsm, srb, etc)
  • releasable beta

External services to provide

Line: 23 to 33

  • web service for control information (axis)
    • security layered in later w/ wss4j as appropriate
  • Replication occurs at the entire fileset level.
Changed:
<
<
  • all tomcat/servlet based
>
>
  • All tomcat/servlet based

Terminology

Line: 34 to 44

Changed:
<
<
  • pick a uuid - use sha-256 digest of packages for now.
>
>

Components

Changed:
<
<
Manager
>
>
Manager

  • The manager is disposable,
    • all information stored is useful for active preservation
 <<O>>  Difference Topic DigArch (r1.7 - 05 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Line: 79 to 79

  • Creating peer-to-peer management functionality (research project)
  • cryptographically self-naming packages. Packages are identified by sha-256 digest in this prototype.
Added:
>
>

Comments

 <<O>>  Difference Topic DigArch (r1.6 - 03 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Line: 49 to 49

    • receive updates of package locations from nodes
    • receive fileset and fileset group information from fileset group masters
Changed:
<
<
Nodes
>
>
Data Nodes

  • Nodes are authoritative for package information.
  • provide data storage
Line: 77 to 77

  • Access control, calls will be secured later with ws-security as time permits
  • Versioning and format migration of data.
  • Creating peer-to-peer management functionality (research project)
Added:
>
>
  • cryptographically self-naming packages. Packages are identified by sha-256 digest in this prototype.
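
The note above says packages are identified by their sha-256 digest in this prototype. A minimal sketch of what that could look like follows. It assumes the digest is taken over the entire package byte stream and rendered as lowercase hex, neither of which is specified on this page. The class and method names (`PackageNaming`, `packageId`, `verify`) are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a package id from the package content itself.
final class PackageNaming {
    private PackageNaming() {}

    /** Returns the lowercase hex SHA-256 digest of the package bytes. */
    static String packageId(byte[] packageBytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(packageBytes);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes print as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    /** True when the content still hashes to the id it was stored under. */
    static boolean verify(String claimedId, byte[] packageBytes) {
        return claimedId.equals(packageId(packageBytes));
    }
}
```

Because the id is derived from the content, a node holding a package could answer an integrity challenge by recomputing the digest and comparing it with the id, without consulting the manager.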

 <<O>>  Difference Topic DigArch (r1.5 - 02 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Goals

Changed:
<
<
  • Automatic replication of packages according to package requirements
>
>
  • Automatic replication of packages

  • Create data nodes that support preservation functionality
  • Nodes are able to perform preservation actions unassisted (replication, checking, etc.)
Changed:
<
<
  • Create simple disposable manager to trigger preservation actions, because nodes do not take initiative
>
>
    • package sets are grouped together to allow for nodes to validate data in the sets

  • Scale to 100 nodes
  • Focus on data nodes

External services to provide

Changed:
<
<
  • Ingestion of prior-created packages according to client-specified requirements
>
>
  • Ingestion of prior-created packages according to pre-defined storage classes.

  • Query status of packages (acknowledge replication)
  • URL based access to packages and items contained within packages
  • Location service for packages (on manager)
Line: 42 to 42

  • The manager is disposable,
    • all information stored is useful for active preservation
Changed:
<
<
    • Manager is not authoritative for any package - level information, merely a cache.
  • Create replication plan from ingestion requirements and inform appropriate nodes
  • Store location of nodes
>
>
    • Manager is not authoritative for any package level information, merely a cache.
  • Create storage classes for placement of packages into fileset groups
  • Store location of filesets and fileset groups

  • Cache location of packages and provide lookup service
    • receive updates of package locations from nodes
Changed:
<
<
    • Cached package plans
  • Triggers data purges as instructed.
  • triggers replication based on file plans as replicas disappear, are invalidated, etc.
>
>
    • receive fileset and fileset group information from fileset group masters

Nodes

Line: 57 to 55

  • provide data storage
    • self-validate data stored in data stores
  • cache package metadata (ID, checksums, etc)
Deleted:
<
<
  • contains a plan manager to schedule data push to other nodes
    • plan fulfillments supplied by Manager
    • priority given to certain actions (first replica...)

  • Access servlet (url) to packages and items in a package
  • Allow for external challenge of a package's integrity
  • Activity history (purge, add, etc...) on packages
Added:
>
>
  • nodes contain filesets which belong to fileset groups.
    • fileset groups actively manage packages stored within.

Package

  • Compound object made up of many files
    • Packages are just storage containers, and are not aware of contextual information outside its uuid.
Added:
>
>
    • files have minimal types attached (metadata, manifest, data)

  • Simple format, similar to tar (id,size,data,id,size,data....)
Changed:
<
<
  • Contains plan, checksum, uuid.
>
>
  • contains checksum, uuid.

  • package named with global uuid, but internal file ids are unique only within the package
  • 64-bit size
 <<O>>  Difference Topic DigArch (r1.4 - 02 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Line: 33 to 33

Changed:
<
<
  • pick a uuid
>
>

Components

 <<O>>  Difference Topic DigArch (r1.3 - 02 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Line: 22 to 22

  • Raw sockets for bulk transfer
  • web service for control information (axis)
    • security layered in later w/ wss4j as appropriate
Added:
>
>
  • Replication occurs at the entire fileset level.

  • all tomcat/servlet based
Added:
>
>
Terminology

Work Plan

First Steps

 <<O>>  Difference Topic DigArch (r1.2 - 02 Dec 2005 - MikeSmorul)

META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Line: 36 to 36

Manager

Changed:
<
<
  • Create replication plan from ingestion requirements
>
>
  • The manager is disposable,
    • all information stored is useful for active preservation
    • Manager is not authoritative for any package - level information, merely a cache.
  • Create replication plan from ingestion requirements and inform appropriate nodes

  • Store location of nodes
  • Cache location of packages and provide lookup service
    • receive updates of package locations from nodes
Line: 46 to 49

Nodes

Added:
>
>
  • Nodes are authoritative for package information.

  • provide data storage
    • self-validate data stored in data stores
  • cache package metadata (ID, checksums, etc)
Line: 58 to 62

Package

  • Compound object made up of many files
Added:
>
>
    • Packages are just storage containers, and are not aware of contextual information outside its uuid.

  • Simple format, similar to tar (id,size,data,id,size,data....)
  • Contains plan, checksum, uuid.
  • package named with global uuid, but internal file ids are unique only within the package
Line: 69 to 74

  • Creating of an AIP and organizing metadata/data (it's just data in a package)
  • Access control, calls will be secured later with ws-security as time permits
  • Versioning and format migration of data.
Added:
>
>
  • Creating peer-to-peer management functionality (research project)

 <<O>>  Difference Topic DigArch (r1.1 - 01 Dec 2005 - MikeSmorul)
Line: 1 to 1
Added:
>
>
META TOPICPARENT WebHome

ADAPT Middle Layer Prototype

Goals

  • Automatic replication of packages according to package requirements
  • Create data nodes that support preservation functionality
  • Nodes are able to perform preservation actions unassisted (replication, checking, etc.)
  • Create simple disposable manager to trigger preservation actions, because nodes do not take initiative
  • Scale to 100 nodes
  • Focus on data nodes

External services to provide

  • Ingestion of prior-created packages according to client-specified requirements
  • Query status of packages (acknowledge replication)
  • URL based access to packages and items contained within packages
  • Location service for packages (on manager)

Technology

  • Raw sockets for bulk transfer
  • web service for control information (axis)
    • security layered in later w/ wss4j as appropriate
  • all tomcat/servlet based

Work Plan

First Steps

Components

Manager

  • Create replication plan from ingestion requirements
  • Store location of nodes
  • Cache location of packages and provide lookup service
    • receive updates of package locations from nodes
    • Cached package plans
  • Triggers data purges as instructed.
  • triggers replication based on file plans as replicas disappear, are invalidated, etc.
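
The package location cache and lookup service described for the manager can be sketched as a small in-memory map. This is only an illustration under stated assumptions: nodes are identified by a plain string, and updates arrive as "stored" or "removed" reports. The class and method names are hypothetical, and the page does not describe the actual web service interface.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory cache: package id -> nodes that reported holding it.
final class LocationCacheSketch {
    private final Map<String, Set<String>> nodesByPackage = new ConcurrentHashMap<>();

    /** Called when a node reports that it now stores the package. */
    void reportStored(String packageId, String node) {
        nodesByPackage
            .computeIfAbsent(packageId, k -> ConcurrentHashMap.newKeySet())
            .add(node);
    }

    /** Called when a node reports that it no longer stores the package. */
    void reportRemoved(String packageId, String node) {
        nodesByPackage.computeIfPresent(packageId, (k, nodes) -> {
            nodes.remove(node);
            // Returning null drops the entry once no node holds the package.
            return nodes.isEmpty() ? null : nodes;
        });
    }

    /** Returns a sorted snapshot of the nodes known to hold the package. */
    Set<String> lookup(String packageId) {
        Set<String> nodes = nodesByPackage.get(packageId);
        if (nodes == null) {
            return Collections.emptySet();
        }
        return new TreeSet<>(nodes);
    }
}
```

Nothing here is persisted. That matches the idea of a cache rebuilt from node reports, but a real manager would also need to handle stale entries from nodes that vanish without reporting.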

Nodes

  • provide data storage
    • self-validate data stored in data stores
  • cache package metadata (ID, checksums, etc)
  • contains a plan manager to schedule data push to other nodes
    • plan fulfillments supplied by Manager
    • priority given to certain actions (first replica...)
  • Access servlet (url) to packages and items in a package
  • Allow for external challenge of a package's integrity
  • Activity history (purge, add, etc...) on packages

Package

  • Compound object made up of many files
  • Simple format, similar to tar (id,size,data,id,size,data....)
  • Contains plan, checksum, uuid.
  • package named with global uuid, but internal file ids are unique only within the package
  • 64-bit size
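
The (id,size,data,...) layout above could be written and read roughly as follows. This is a sketch under assumptions not stated on this page: file ids are 32-bit integers, sizes are 64-bit big-endian values, and the plan, checksum and uuid fields are left out. `PackageFormatSketch` and its methods are hypothetical names. The reader holds each file in memory, so it rejects records larger than the remaining input even though the size field is 64 bits wide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the (id,size,data,...) layout; field widths are assumptions.
final class PackageFormatSketch {
    private PackageFormatSketch() {}

    /** Serializes files as repeated records: int id, long size, raw data. */
    static byte[] write(Map<Integer, byte[]> files) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (Map.Entry<Integer, byte[]> file : files.entrySet()) {
                out.writeInt(file.getKey());            // package-unique file id
                out.writeLong(file.getValue().length);  // 64-bit size field
                out.write(file.getValue());             // file content
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Parses records until the input is exhausted, preserving record order. */
    static Map<Integer, byte[]> read(byte[] pkg) {
        Map<Integer, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(pkg))) {
            while (in.available() > 0) {
                int id = in.readInt();
                long size = in.readLong();
                // Reject sizes that cannot be satisfied by the remaining bytes.
                if (size < 0 || size > in.available()) {
                    throw new IllegalArgumentException(
                        "bad record size " + size + " for file id " + id);
                }
                byte[] data = new byte[(int) size];
                in.readFully(data);
                files.put(id, data);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }
}
```

A streaming reader that copies each record straight to the data store would be needed to use the full 64-bit size range.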

Items that will not be addressed

  • Creating metadata and access interfaces outside of demos (PAWN will publish to these)
  • Creating of an AIP and organizing metadata/data (it's just data in a package)
  • Access control, calls will be secured later with ws-security as time permits
  • Versioning and format migration of data.
Revision r1.1 - 01 Dec 2005 - 22:28 - MikeSmorul
Revision r1.12 - 03 Apr 2006 - 16:11 - MikeSmorul