Was able to extract doc files (HTM* and XML) from the Sequoyah and Eclipse projects
Each project can specify its own plugin exclude patterns (used for both properties and doc file extraction)
May need a few tweaks to exclude some configuration XML files
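A rough sketch of how per-project exclude patterns could filter doc files, including configuration XML. The pattern syntax, the `EXCLUDE_PATTERNS` table, and the `is_doc_file` helper are all assumptions for illustration, not the actual Babel implementation:

```python
from fnmatch import fnmatchcase

# Hypothetical per-project exclude patterns (glob style). The real
# pattern format used by Babel is not given in these notes.
EXCLUDE_PATTERNS = {
    "sequoyah": ["*/plugin.xml", "*/build.xml", "*.tests/*"],
}

# Extensions treated as doc files (HTM* and XML).
DOC_EXTENSIONS = (".htm", ".html", ".xml")


def is_doc_file(project, path):
    """Return True if path looks like a doc file and is not excluded."""
    if not path.lower().endswith(DOC_EXTENSIONS):
        return False
    patterns = EXCLUDE_PATTERNS.get(project, [])
    # Note: in fnmatch globs, '*' also matches '/' characters.
    return not any(fnmatchcase(path, pat) for pat in patterns)


if __name__ == "__main__":
    for p in ["org.eclipse.sequoyah.doc/html/intro.html",
              "org.eclipse.sequoyah.core/plugin.xml"]:
        print(p, is_doc_file("sequoyah", p))
```

Configuration files such as `plugin.xml` would have to be listed explicitly, since they share the `.xml` extension with real doc files.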
How should we save the doc files? Thoughts from Denis:
Since there is no turnkey solution for doing this, here are a couple of solutions I can think of for Babel.
1. We use an external VCS as Antoine suggested. I think this is an entirely valid approach, although perhaps a bit more complex.
2. We do like MediaWiki -- save the entire file on each save. If it works for them, it will work for us. Except:
   - when a user wants to diff two versions, we could use one of the diff libraries Gabe pointed out. No need to go to shell or write our own.
   - older revisions could pass through the gzip lib and be stored in compressed format. If a user wants to diff to an older version, we unzip it first, then pass it through the diff engine.
   - one design consideration here would be to use many tables -- perhaps one translation table per language. Or, one table to contain the 'latest' of everything and one table per language for the gzipped older versions.
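A minimal sketch of option 2, assuming an in-memory stand-in for the tables: one mapping holds the latest text, another holds gzipped older revisions, and a diff decompresses the old revision before handing it to a diff library. `DocStore` and its method names are hypothetical, and Python's standard `difflib` stands in for whichever diff library would actually be chosen:

```python
import difflib
import gzip


class DocStore:
    """Whole-file storage: latest text kept plain, older revisions gzipped."""

    def __init__(self):
        self.latest = {}   # (file_id, lang) -> current text
        self.archive = {}  # (file_id, lang) -> list of gzipped older texts

    def save(self, file_id, lang, text):
        key = (file_id, lang)
        if key in self.latest:
            # Compress the outgoing revision before archiving it.
            old = self.latest[key].encode("utf-8")
            self.archive.setdefault(key, []).append(gzip.compress(old))
        self.latest[key] = text

    def revision(self, file_id, lang, index):
        # Unzip an archived revision back into text.
        blob = self.archive[(file_id, lang)][index]
        return gzip.decompress(blob).decode("utf-8")

    def diff_with_latest(self, file_id, lang, index):
        old = self.revision(file_id, lang, index).splitlines()
        new = self.latest[(file_id, lang)].splitlines()
        return list(difflib.unified_diff(
            old, new, fromfile="old", tofile="latest", lineterm=""))


if __name__ == "__main__":
    store = DocStore()
    store.save("intro.html", "fr", "<p>Bonjour</p>\n<p>Monde</p>")
    store.save("intro.html", "fr", "<p>Bonjour</p>\n<p>Le monde</p>")
    print("\n".join(store.diff_with_latest("intro.html", "fr", 0)))
```

Keying on `(file_id, lang)` only approximates the "one table per language" idea; in a real database the split into tables would be a schema decision.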
Thoughts from Kit:
A VCS may be overkill
The main concern I had was doc file size; saving the whole file after a one-line change may not be very efficient
We probably do not need a function to inspect the diff (we don't have that for properties files)
Zipping up old revisions (or even throwing them away) may be okay?