
Difference between revisions of "SMILA/Component Requirements/Record Binary Storage Requirements"

(added namespace and cluster requirement)
 
(3 intermediate revisions by one other user not shown)
Line 1: Line 1:
- This page defines requirements posed on SMILA's components.
+ This page defines requirements posed on SMILA's record binary storage.

  == Overview ==
Line 16: Line 16:
  #The 'get' and 'set' methods should operate both with streams (in case very large documents - more than 2GB in size - need to be stored/processed) and byte arrays (for convenience reasons)
  #The client component must use different instances of the binary store fully transparently
+ # Namespaces/Collections: Bin Storage shall support the notion of a namespace or collection which serves as a separation mechanism of the data. The characteristic of a namespace is such that no two diff. files with the same ID may exist. Backups/restores shall be possible on namespace level.
+ # Clustering: fail-over clustering is not the primary needed use case currently. More important is the case of storing large amounts of data (e.g. Terabytes) in the same namespace, which requires client-transparent storing and retrieving from diff. nodes in the cluster.
  #Proposal for essential API:
  <source lang="java">
  void storeRecordAttachment(String attachmentId, InputStream attachmentStream)
- void updateRecordAttachment(String attachmentId, InputStream attachmentStream)
- InputStream getRecordAttachment(String attachmentId)
- InputStream removeRecordAttachment(String attachmentId)
- int getRecordAttachmentSize(String attachmentId)

  void storeRecordAttachment(String attachmentId, byte[] attachmentStream)
- void updateRecordAttachment(String attachmentId, byte[] attachmentStream)
- byte[] getRecordAttachment(String attachmentId)
- byte[] removeRecordAttachment(String attachmentId)
+ byte[] fetchRecordAttachmentAsByte(String attachmentId)
+ InputStream fetchRecordAttachmentAsStream(String attachmentId)
+ void removeRecordAttachment(String attachmentId)
+ int fetchRecordAttachmentSize(String attachmentId)
  </source>


Latest revision as of 09:24, 24 October 2008

This page defines the requirements for SMILA's record binary storage.

Overview

The purpose of the record binary storage is to store the binary data of documents. Usually this is the content of a binary document, referenced as one or more attachments in the record. The natural client component of this low-level service is the blackboard service.

Note: The content (record attachments) of XML documents should instead be stored in the record XML storage, since this enables the client component to run XQuery queries against the document content itself.

Requirements

  1. The record binary store has to offer an implementation-agnostic API. In particular, the client component should have no knowledge of the actual persistence technology being used (local file system, database, or distributed file system).
  2. Selecting a particular implementation of the binary storage service should be purely a matter of framework configuration.
  3. The essential API should be kept short (at most 10 methods).
  4. An extended API may contain batch operations.
  5. The 'get' and 'set' methods should work with both streams (in case very large documents, more than 2 GB in size, need to be stored or processed) and byte arrays (for convenience).
  6. The client component must be able to use different instances of the binary store fully transparently.
  7. Namespaces/Collections: The binary storage shall support the notion of a namespace or collection, which serves as a mechanism for separating data. Within a namespace, no two different files may have the same ID. Backups and restores shall be possible at namespace level.
  8. Clustering: Fail-over clustering is currently not the primary use case. More important is storing large amounts of data (e.g. terabytes) in the same namespace, which requires client-transparent storing and retrieving from different nodes in the cluster.
  9. Proposal for the essential API:
void storeRecordAttachment(String attachmentId, InputStream attachmentStream)
void storeRecordAttachment(String attachmentId, byte[] attachmentStream)
byte[] fetchRecordAttachmentAsByte(String attachmentId)
InputStream fetchRecordAttachmentAsStream(String attachmentId)
void removeRecordAttachment(String attachmentId)
int fetchRecordAttachmentSize(String attachmentId)
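
To make the proposal more concrete, the six methods above could be written as a Java interface with a trivial in-memory implementation. This is only a sketch under stated assumptions, not SMILA's actual code: the names `BinaryStorage`, `InMemoryBinaryStorage` and `BinaryStorageDemo` are hypothetical, and the behaviour for unknown IDs (throwing `NoSuchElementException`) and for I/O failures (wrapping in `UncheckedIOException`) is assumed, since the proposal does not specify either.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

/** Small usage example; class name is hypothetical. */
final class BinaryStorageDemo {
    public static void main(String[] args) {
        // The client only sees the interface, not the persistence technology.
        BinaryStorage storage = new InMemoryBinaryStorage();
        storage.storeRecordAttachment("doc-1", new byte[] {1, 2, 3});
        System.out.println("size = " + storage.fetchRecordAttachmentSize("doc-1"));
        storage.removeRecordAttachment("doc-1");
    }
}

/** Hypothetical interface name; the six signatures mirror the proposal. */
interface BinaryStorage {
    void storeRecordAttachment(String attachmentId, InputStream attachmentStream);
    void storeRecordAttachment(String attachmentId, byte[] attachmentStream);
    byte[] fetchRecordAttachmentAsByte(String attachmentId);
    InputStream fetchRecordAttachmentAsStream(String attachmentId);
    void removeRecordAttachment(String attachmentId);
    int fetchRecordAttachmentSize(String attachmentId);
}

/** Illustrative in-memory implementation; a real one would use a file system or DB. */
final class InMemoryBinaryStorage implements BinaryStorage {
    private final Map<String, byte[]> attachments = new ConcurrentHashMap<>();

    @Override
    public void storeRecordAttachment(String attachmentId, InputStream attachmentStream) {
        try {
            // Drain the stream in chunks; an in-memory store has to buffer everything.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int read;
            while ((read = attachmentStream.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            attachments.put(attachmentId, buffer.toByteArray());
        } catch (IOException e) {
            // Assumption: the proposal declares no checked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void storeRecordAttachment(String attachmentId, byte[] attachmentStream) {
        // Defensive copy so later changes by the caller do not affect stored data.
        attachments.put(attachmentId, attachmentStream.clone());
    }

    @Override
    public byte[] fetchRecordAttachmentAsByte(String attachmentId) {
        return require(attachmentId).clone();
    }

    @Override
    public InputStream fetchRecordAttachmentAsStream(String attachmentId) {
        return new ByteArrayInputStream(require(attachmentId));
    }

    @Override
    public void removeRecordAttachment(String attachmentId) {
        attachments.remove(attachmentId);
    }

    @Override
    public int fetchRecordAttachmentSize(String attachmentId) {
        return require(attachmentId).length;
    }

    /** Assumed behaviour: unknown IDs are reported with an exception. */
    private byte[] require(String attachmentId) {
        byte[] data = attachments.get(attachmentId);
        if (data == null) {
            throw new NoSuchElementException("No attachment with ID " + attachmentId);
        }
        return data;
    }
}
```

One point to keep in mind: a Java `int` cannot represent values above 2,147,483,647, so for attachments larger than 2 GB (requirement 5) the size method would probably need to return `long`; the sketch keeps `int` only to stay close to the proposal.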

Note: Because the size of the stored content can be queried first, the client component developer can decide which method (stream-oriented or byte-array-oriented) to use.
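
The size check described in the note might look roughly like the following on the client side. This is a hedged sketch: `AttachmentSource`, `AttachmentCopier`, `MapAttachmentSource` and `SizeBasedFetchDemo` are hypothetical names, and the 1 MiB threshold is an arbitrary assumption, not a value from the requirements.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Small usage example; class name is hypothetical. */
final class SizeBasedFetchDemo {
    public static void main(String[] args) {
        MapAttachmentSource source = new MapAttachmentSource();
        source.put("small", new byte[10]);
        source.put("large", new byte[AttachmentCopier.MAX_IN_MEMORY_BYTES + 1]);

        ByteArrayOutputStream target = new ByteArrayOutputStream();
        System.out.println("small via byte array: "
                + AttachmentCopier.copyTo(source, "small", target));
        System.out.println("large via byte array: "
                + AttachmentCopier.copyTo(source, "large", target));
    }
}

/** Hypothetical read-only view: the three fetch methods of the proposed API. */
interface AttachmentSource {
    byte[] fetchRecordAttachmentAsByte(String attachmentId);
    InputStream fetchRecordAttachmentAsStream(String attachmentId);
    int fetchRecordAttachmentSize(String attachmentId);
}

final class AttachmentCopier {
    /** Assumed cut-off for loading an attachment fully into memory. */
    static final int MAX_IN_MEMORY_BYTES = 1024 * 1024;

    /**
     * Copies an attachment to the target stream.
     * Returns true if the byte-array method was used, false if the stream method was used.
     */
    static boolean copyTo(AttachmentSource source, String attachmentId, OutputStream target) {
        try {
            // Ask for the size first, then pick the cheaper access method.
            if (source.fetchRecordAttachmentSize(attachmentId) <= MAX_IN_MEMORY_BYTES) {
                target.write(source.fetchRecordAttachmentAsByte(attachmentId));
                return true;
            }
            try (InputStream in = source.fetchRecordAttachmentAsStream(attachmentId)) {
                byte[] chunk = new byte[8192];
                int read;
                while ((read = in.read(chunk)) != -1) {
                    target.write(chunk, 0, read);
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Map-backed stand-in for a real binary store, used only to exercise the sketch. */
final class MapAttachmentSource implements AttachmentSource {
    private final Map<String, byte[]> data = new HashMap<>();

    void put(String attachmentId, byte[] content) {
        data.put(attachmentId, content);
    }

    @Override
    public byte[] fetchRecordAttachmentAsByte(String attachmentId) {
        return data.get(attachmentId).clone();
    }

    @Override
    public InputStream fetchRecordAttachmentAsStream(String attachmentId) {
        return new ByteArrayInputStream(data.get(attachmentId));
    }

    @Override
    public int fetchRecordAttachmentSize(String attachmentId) {
        return data.get(attachmentId).length;
    }
}
```

Small attachments take the convenient byte-array path, while anything above the assumed threshold is copied in chunks so that it never has to fit into memory at once.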