The PDOM architecture is composed of three main components: the PDOM database, the PDOM indexers, and the PDOM clients.
The PDOM database is a flat file that stores binary information. This file is memory-mapped into direct byte buffers in 4K chunks. An LRU cache is maintained to ensure that only a manageable number of chunks are kept in memory.
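The chunk cache described above can be sketched roughly as follows. This is a minimal illustration, not the actual PDOM database class: the name ChunkCache, its methods, and the eviction policy details (flushing a chunk before dropping it) are assumptions made for the example. It relies on the standard FileChannel.map call and on LinkedHashMap in access order to get LRU behaviour.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a chunked, memory-mapped file with an LRU cache.
class ChunkCache implements AutoCloseable {
    static final int CHUNK_SIZE = 4096; // 4K chunks, as in the text

    private final FileChannel channel;
    private final Map<Integer, MappedByteBuffer> chunks;

    ChunkCache(Path file, final int maxChunks) {
        try {
            channel = FileChannel.open(file, StandardOpenOption.CREATE,
                    StandardOpenOption.READ, StandardOpenOption.WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // accessOrder = true: the eldest entry is the least recently used one.
        chunks = new LinkedHashMap<Integer, MappedByteBuffer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, MappedByteBuffer> eldest) {
                if (size() > maxChunks) {
                    eldest.getValue().force(); // flush changes before dropping the chunk
                    return true;
                }
                return false;
            }
        };
    }

    // Returns the chunk with the given index, mapping it on a cache miss.
    MappedByteBuffer getChunk(int index) {
        MappedByteBuffer chunk = chunks.get(index);
        if (chunk == null) {
            try {
                // Mapping READ_WRITE past the end of the file grows the file.
                chunk = channel.map(FileChannel.MapMode.READ_WRITE,
                        (long) index * CHUNK_SIZE, CHUNK_SIZE);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            chunks.put(index, chunk);
        }
        return chunk;
    }

    int getInt(long offset) {
        return getChunk((int) (offset / CHUNK_SIZE)).getInt((int) (offset % CHUNK_SIZE));
    }

    void putInt(long offset, int value) {
        getChunk((int) (offset / CHUNK_SIZE)).putInt((int) (offset % CHUNK_SIZE), value);
    }

    int cachedChunkCount() {
        return chunks.size();
    }

    @Override
    public void close() {
        try {
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Callers address the file by absolute offset and never see chunk boundaries. With a limit of two chunks, touching a third chunk evicts the least recently used one, and a later read of the evicted chunk simply maps it again.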
The data stored in the PDOM database is organized into records. Records may contain any type of data, but generally store integers, strings, and offsets of other records in the database. To keep things fast and simple, records cannot span chunks, so they must be less than 4K in size. Smaller records are also recommended to reduce wasted space.
A fairly standard memory management technique is used to manage the allocation and freeing of records. Memory in the database is allocated in multiples of 16 bytes, which gives 4K/16 = 256 different block sizes. A linked list is maintained of all free blocks of the same size. Blocks are subdivided when no smaller block is available to fit the size of the request. When blocks are freed, they are placed at the front of the free list. Neighbouring free blocks are not recombined, although this is an optimization we can do in a later release.
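The allocation scheme can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class BlockAllocator and its method names are hypothetical, the free lists are kept in Java collections rather than in the database file, and the caller passes the block size to free(), whereas an on-disk layout would need to record each block's size somewhere.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of size-segregated free lists over 4K chunks.
class BlockAllocator {
    static final int CHUNK_SIZE = 4096;
    static final int BLOCK_UNIT = 16;
    static final int NUM_SIZES = CHUNK_SIZE / BLOCK_UNIT; // 256 block sizes

    // freeLists.get(u - 1) holds offsets of free blocks that are u units long.
    private final List<Deque<Long>> freeLists = new ArrayList<>();
    private long fileSize = 0;

    BlockAllocator() {
        for (int i = 0; i < NUM_SIZES; i++) {
            freeLists.add(new ArrayDeque<>());
        }
    }

    // Allocates a block of at least 'size' bytes and returns its offset.
    long malloc(int size) {
        if (size <= 0 || size > CHUNK_SIZE) {
            throw new IllegalArgumentException("size must be in 1.." + CHUNK_SIZE);
        }
        int units = (size + BLOCK_UNIT - 1) / BLOCK_UNIT; // round up to 16 bytes
        // Take the smallest free block that fits, subdividing if it is larger.
        for (int u = units; u <= NUM_SIZES; u++) {
            Deque<Long> list = freeLists.get(u - 1);
            if (!list.isEmpty()) {
                long offset = list.removeFirst();
                releaseRemainder(offset, u, units);
                return offset;
            }
        }
        // Nothing free: grow the file by one chunk and carve the block from it.
        long offset = fileSize;
        fileSize += CHUNK_SIZE;
        releaseRemainder(offset, NUM_SIZES, units);
        return offset;
    }

    // Puts the unused tail of a subdivided block on the matching free list.
    private void releaseRemainder(long offset, int blockUnits, int usedUnits) {
        int rest = blockUnits - usedUnits;
        if (rest > 0) {
            freeLists.get(rest - 1).addFirst(offset + (long) usedUnits * BLOCK_UNIT);
        }
    }

    // Frees a block; it goes to the front of its list and is not coalesced.
    void free(long offset, int size) {
        int units = (size + BLOCK_UNIT - 1) / BLOCK_UNIT;
        freeLists.get(units - 1).addFirst(offset);
    }
}
```

Because neighbouring free blocks are never merged, a workload that frees many small blocks can leave space that larger requests cannot reuse, which is the optimization the text defers to a later release.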
To help speed access to large lists of records, B-Tree indexes are available. The B-Tree features balanced inserts. For now, there is no delete from the B-Tree, since deletion is hard to do while maintaining balance. Interfaces for comparators and visitors are provided to manage the insertion and navigation of the B-Trees respectively.
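To show how a comparator can drive insertion and a visitor can drive navigation, here is a minimal in-memory B-Tree sketch. The names RecordComparator, RecordVisitor, and SimpleBTree are hypothetical and are not the PDOM interfaces; records are represented as plain int values standing in for record offsets, and the tree lives on the Java heap rather than in the database file. As in the text, there is insert but no delete.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical: orders two records, like a comparator for record offsets.
interface RecordComparator {
    int compare(int record1, int record2);
}

// Hypothetical: called for each record in order; return false to stop.
interface RecordVisitor {
    boolean visit(int record);
}

class SimpleBTree {
    private static final int T = 2; // minimum degree: at most 2T-1 = 3 keys per node

    private static final class Node {
        final List<Integer> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>();
        boolean isLeaf() { return children.isEmpty(); }
    }

    private final RecordComparator cmp;
    private Node root = new Node();

    SimpleBTree(RecordComparator cmp) {
        this.cmp = cmp;
    }

    // Inserts a record, splitting full nodes on the way down to stay balanced.
    void insert(int record) {
        if (root.keys.size() == 2 * T - 1) {
            Node newRoot = new Node();
            newRoot.children.add(root);
            splitChild(newRoot, 0);
            root = newRoot;
        }
        Node node = root;
        while (true) {
            int i = 0;
            while (i < node.keys.size() && cmp.compare(record, node.keys.get(i)) > 0) {
                i++;
            }
            if (node.isLeaf()) {
                node.keys.add(i, record);
                return;
            }
            if (node.children.get(i).keys.size() == 2 * T - 1) {
                splitChild(node, i);
                if (cmp.compare(record, node.keys.get(i)) > 0) {
                    i++;
                }
            }
            node = node.children.get(i);
        }
    }

    // Splits the full child at 'index', moving its median key up into 'parent'.
    private void splitChild(Node parent, int index) {
        Node full = parent.children.get(index);
        Node right = new Node();
        int median = full.keys.get(T - 1);
        right.keys.addAll(full.keys.subList(T, full.keys.size()));
        full.keys.subList(T - 1, full.keys.size()).clear();
        if (!full.isLeaf()) {
            right.children.addAll(full.children.subList(T, full.children.size()));
            full.children.subList(T, full.children.size()).clear();
        }
        parent.keys.add(index, median);
        parent.children.add(index + 1, right);
    }

    // Visits records in comparator order; returns false if the visitor stopped early.
    boolean accept(RecordVisitor visitor) {
        return accept(root, visitor);
    }

    private boolean accept(Node node, RecordVisitor visitor) {
        for (int i = 0; i < node.keys.size(); i++) {
            if (!node.isLeaf() && !accept(node.children.get(i), visitor)) {
                return false;
            }
            if (!visitor.visit(node.keys.get(i))) {
                return false;
            }
        }
        return node.isLeaf() || accept(node.children.get(node.keys.size()), visitor);
    }
}
```

Keeping the ordering in the comparator means the same tree code can index records by name, by file, or by any other key, and a visitor that returns false lets a client stop a scan as soon as it has what it needs.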
One database is maintained per project and is stored in the meta-data. This is the same as we've done in the past with the old index architecture. In the future we will allow these indexes to be shared through import/export mechanisms. For now, sharing is not supported.
Records in the PDOM exist for IBindings and IASTNames that are found in the DOM. They don't necessarily need to be derived from the DOM but must look like they do (e.g. ctags). These PDOM objects are often intermixed with regular DOM objects to make them as transparent to clients as possible.
One of the key things we've learned over the years in the indexing business is that not all projects are created equal. With the Full indexer, a small project can be indexed in a matter of seconds, whereas a large project can take hours. Faster indexers may store less content and less accurate information, which may be good enough for some users and not for others. As well, commercial vendors may want to plug in their own indexing technologies.
So the key is that we need to be able to plug in different indexers for different situations. To simplify the clients of indexing information, these indexing architectures must all be able to write their information to a PDOM database.
We will reuse the existing CIndexer extension point to allow ISVs, as well as ourselves, to plug in new indexers. The user will be able to select a single indexer for each project. The preference is also maintained as the default for new projects.
Indexers must now implement IPDOMIndexer to plug into the PDOM architecture. A PDOMManager will be responsible for passing on CElement Deltas and Indexing commands to the appropriate indexers for each project.
We'll have different sections in the design doc for each indexer.
PDOM Clients are essentially all features that require information from the index. This includes, but is not limited to, the C/C++ Search Page, the Search for Declarations/References actions, the Open Declaration/Definition/Reference actions, and Content Assist.
As much as possible, features such as IASTName.resolveBinding() will look up information from the PDOM when it is not found in the in-memory DOM. This will simplify features such as Open Declaration, which can make use of the PDOM pretty much without change.
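The fallback lookup can be pictured as a simple chain: ask the in-memory DOM first, and consult the persistent index only on a miss. The sketch below is illustrative only. BindingSource and FallbackResolver are hypothetical names, and bindings are modelled as plain strings rather than as IBinding objects.

```java
// Hypothetical: anything that can resolve a name to a binding, or null if unknown.
interface BindingSource {
    String findBinding(String name);
}

// Tries the in-memory DOM first and falls back to the persistent index.
class FallbackResolver implements BindingSource {
    private final BindingSource inMemoryDom;
    private final BindingSource persistentIndex;

    FallbackResolver(BindingSource inMemoryDom, BindingSource persistentIndex) {
        this.inMemoryDom = inMemoryDom;
        this.persistentIndex = persistentIndex;
    }

    @Override
    public String findBinding(String name) {
        String binding = inMemoryDom.findBinding(name);
        // Only a miss in the DOM triggers the (slower) index lookup.
        return binding != null ? binding : persistentIndex.findBinding(name);
    }
}
```

Because the resolver exposes the same interface as its two sources, a client such as Open Declaration does not need to know which one answered.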
Other features that search through the index for patterns, such as the search page and content assist, will use the PDOM search facility directly.