
Difference between revisions of "SMILA/Specifications/Partitioning Storages"


Revision as of 12:37, 13 November 2008

Use Case: Partitioning both Storages for Backup and Reuse/Recrawling

  1. Changes in XML and Bin storage.
    To support partitioning, both the XML and Bin storages must be able to store data in partitions. The XML and Bin storage APIs should therefore be extended so that a partition name is accepted as an additional parameter alongside the record Id when saving data. In the Bin storage, binary attachments can be quite large, so if an attachment has not changed from one partition to another, it is better to store only a reference to the existing attachment than to copy its data into each partition.
  2. Partitioning configuration and changes in Blackboard API.
    Partition information (the partition name) can be configured in the Listener Rule.
    There are two options for passing partition information:
    1. Partition information is passed as a record Id property.
    The Listener gets the record from the queue and sets the partition property on its Id.
    The Blackboard uses the partition information in load and commit operations.
    In this case no significant changes to the Blackboard API are required, because the partition information is encapsulated in the record Id.
    2. Partition information is passed separately from the record, as a JMS property.
    The Listener reads the JMS property from the queue and makes it available to other components that use the Blackboard (such as processing).
    In this case every method of the Blackboard API would have to be duplicated to accept the partition name as a second parameter.
    The first option seems more useful because it keeps the partition information directly in the record, so it can easily be passed between distributed components.
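
The following is a minimal sketch of how the first option and the reference-sharing idea for attachments could fit together. It is not the SMILA API: `PartitionedId`, `PartitionedBinStorage`, and all method names are hypothetical, the storage is a plain in-memory map, and "unchanged" is assumed to mean byte-for-byte equal content for the same record key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical record Id that carries the partition name as a property (option 1).
final class PartitionedId {
    final String key;        // source-specific record key
    final String partition;  // partition name, e.g. set by the Listener

    PartitionedId(String key, String partition) {
        this.key = Objects.requireNonNull(key);
        this.partition = Objects.requireNonNull(partition);
    }

    // Returns an Id for the same record in another partition.
    PartitionedId inPartition(String other) {
        return new PartitionedId(key, other);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PartitionedId)) {
            return false;
        }
        PartitionedId that = (PartitionedId) o;
        return key.equals(that.key) && partition.equals(that.partition);
    }

    @Override
    public int hashCode() {
        return Objects.hash(key, partition);
    }
}

// Hypothetical in-memory stand-in for a partition-aware Bin storage.
final class PartitionedBinStorage {
    // (record key, partition) -> handle of the stored attachment data
    private final Map<PartitionedId, Integer> refs = new HashMap<>();
    // handle -> attachment data; one entry may be shared by several partitions
    private final Map<Integer, byte[]> blobs = new HashMap<>();
    private int nextHandle = 0;

    // Stores an attachment; if the same record already has identical data in
    // any partition, only a reference to that data is recorded.
    void store(PartitionedId id, byte[] data) {
        Integer handle = null;
        for (Map.Entry<PartitionedId, Integer> e : refs.entrySet()) {
            boolean sameRecord = e.getKey().key.equals(id.key);
            if (sameRecord && Arrays.equals(blobs.get(e.getValue()), data)) {
                handle = e.getValue();
                break;
            }
        }
        if (handle == null) {
            handle = nextHandle++;
            blobs.put(handle, data.clone());
        }
        // Cleanup of data that is no longer referenced is omitted in this sketch.
        refs.put(id, handle);
    }

    // Loads the attachment for the record in the partition named by the Id.
    byte[] load(PartitionedId id) {
        Integer handle = refs.get(id);
        return handle == null ? null : blobs.get(handle).clone();
    }

    // Number of physically stored attachments (not references).
    int blobCount() {
        return blobs.size();
    }
}
```

Because the partition travels inside the Id, `store` and `load` keep a single-argument lookup; under the second option each of them would need an extra partition parameter.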