SMILA/Documentation/JdbcLoggingPipelet

Revision as of 10:30, 21 February 2013 by Andreas.weber.empolis.com

Bundle: org.eclipse.smila.jdbc

Description

For each processed record, the JdbcLoggingPipelet writes a log entry to a database via JDBC by executing a configured statement string as a PreparedStatement. The statement typically contains '?' parameters, which are filled with the values referenced by the valuePaths parameter. The valuePaths parameter is a list of strings, each of which is a path to a (sub-)attribute in the metadata of the record currently being processed.
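The mechanism described above can be illustrated with plain JDBC. The following is only a minimal sketch, not the pipelet's actual implementation: the class JdbcLoggingSketch and its methods toProperties, bind and log are hypothetical names, and it assumes that a JDBC driver matching dbUrl is available on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Hypothetical illustration, not part of SMILA. */
class JdbcLoggingSketch {

    /** Copies dbProps entries (e.g. 'user', 'password') into JDBC connection properties. */
    static Properties toProperties(Map<String, String> dbProps) {
        Properties props = new Properties();
        for (Map.Entry<String, String> entry : dbProps.entrySet()) {
            props.setProperty(entry.getKey(), entry.getValue());
        }
        return props;
    }

    /** Fills the '?' parameters in order; JDBC parameter indexes start at 1. */
    static void bind(PreparedStatement ps, List<Object> values) throws SQLException {
        for (int i = 0; i < values.size(); i++) {
            // Assumption: the driver accepts an untyped null here.
            // Some drivers need setNull(index, sqlType) instead.
            ps.setObject(i + 1, values.get(i));
        }
    }

    /** Opens a connection, binds the values and executes the statement once. */
    static int log(String dbUrl, Properties props, String stmt, List<Object> values)
            throws SQLException {
        try (Connection con = DriverManager.getConnection(dbUrl, props);
             PreparedStatement ps = con.prepareStatement(stmt)) {
            bind(ps, values);
            return ps.executeUpdate();
        }
    }
}
```

In this sketch, values would hold one entry per '?' in the statement, in the same order as the configured valuePaths.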

Configuration

Property | Type | Read Type | Required | Description
dbUrl | String | runtime | yes | The URL used to connect to the database. Its format depends on the JDBC driver.
dbProps | Map | runtime | yes | Database connection properties, e.g. 'user' and 'password'.
stmt | String | runtime | yes | The (Prepared)Statement that is logged to the database. It may contain '?' parameters.
valuePaths | String (multi) | runtime | no | List of paths pointing to the record's metadata (sub-)attributes that are used as parameter values in the logged statement. Path segments are separated by '/'.

Configuring a value path

The following should be taken into account when specifying a value path:

  • If a value path references a single value, this is used for the PreparedStatement.
  • If a value path references a sequence of values, (only) the first value of the sequence is used.
  • In any other case (for example, if the path does not exist in the record), the value is set to 'null'. Keep in mind that a 'null' value is valid for a PreparedStatement.
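The three rules above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the pipelet's code: the class ValuePathSketch and its methods resolve and sampleRecord are hypothetical, the record metadata is modelled as nested java.util.Map and java.util.List objects, and a sequence whose first element is itself a map or a sequence is assumed to yield null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration, not part of SMILA. */
class ValuePathSketch {

    /** Resolves a '/'-separated value path to the parameter value for one '?'. */
    static Object resolve(Map<String, Object> record, String valuePath) {
        Object current = record;
        for (String segment : valuePath.split("/")) {
            if (!(current instanceof Map)) {
                return null; // path is missing or runs past a leaf value
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        if (current instanceof List) {
            List<?> sequence = (List<?>) current;
            Object first = sequence.isEmpty() ? null : sequence.get(0);
            // Assumption: only a plain first value counts as usable.
            return (first instanceof Map || first instanceof List) ? null : first;
        }
        if (current instanceof Map) {
            return null; // neither a single value nor a sequence
        }
        return current; // single value, or null if the attribute is missing
    }

    /** Builds a small record with a single value, a nested map and a sequence. */
    static Map<String, Object> sampleRecord() {
        Map<String, Object> session = new LinkedHashMap<>();
        session.put("timestamp", "2012-10-12T14:00:00");
        session.put("id", 4711);
        Map<String, Object> parameters = new LinkedHashMap<>();
        parameters.put("maxCount", 100);
        parameters.put("session", session);
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("_recordid", "web:http://example.org");
        record.put("_parameters", parameters);
        record.put("Path", "http://example.org/index.html");
        // \u00fc is the escaped form of the umlaut in "Jürgen"
        record.put("Authors",
            Arrays.asList("Andreas Weber", "J\u00fcrgen Schumacher", "Andreas Schank"));
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> record = sampleRecord();
        List<Object> values = new ArrayList<>();
        for (String path : Arrays.asList("_recordid", "_parameters/session/id", "Authors", "Size")) {
            values.add(resolve(record, path));
        }
        // prints [web:http://example.org, 4711, Andreas Weber, null]
        System.out.println(values);
    }
}
```

The "Size" path resolves to null because the sample record has no such attribute, and "Authors" resolves to the first element of its sequence.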

Example

The following example shows a sample pipelet configuration and the resulting log statement when logging the given record.

Pipelet configuration:

 {
   "dbUrl":"crawlJdbcJob",
   "dbProps": {
     "user":"Andreas",
     "password":"topsecret"
   },
   "stmt":"INSERT INTO myTable VALUES (?, ?, 100, ?, ?)",
   "valuePaths": [
     "_recordid",
     "_parameters/session/id",
     "Authors",
     "Size"
   ]
 }

Sample record and resulting logged SQL statement:

 {
   "_recordid":"web:http://example.org",
   "_parameters": {
      "maxCount": 100,
      "session": {
         "timestamp": "2012-10-12T14:00:00",
         "id": 4711
      }
   },
   "Path": "http://example.org/index.html",
   "Authors": ["Andreas Weber", "Jürgen Schumacher", "Andreas Schank"]
 }

-> INSERT INTO myTable VALUES ('web:http://example.org', 4711, 100, 'Andreas Weber', null)