
Gyrex/Concepts/Listing Story

A note on history...
The Gyrex project initially started out developing an e-commerce system. Thus, this concept is more of a solution to a business case than a fundamental technical Gyrex concept. It has been implemented as "Content Distribution Service" (CDS) and will be kept around for some of the example applications.


Summary

A common concept in on-line shop systems is to have store items which represent products. Some shop systems even work with products directly. In the auction world this is similar. There are various reasons for decoupling the store items from the products, most notably design (separation of concerns) and performance.

The listing story presented here is based on the same basic principles. However, the concept proposed in this document goes beyond these principles and creates a full-fledged listing store which is suitable for serving auction sites as well as classic storefronts and can be extended for any other scenario.

It's also essential to note that the listing story is deeply integrated with search. Actually, search will be the only way to retrieve listings from the underlying store. Traditional browsing will be possible by filtering.

Following the CloudFree approach, the listing service provides a common interface for storefront/auction (i.e., web site) developers and hides the complexity from them.


The Listings

The listings are the core elements in this story. They can represent products for sale on a storefront or items for sale in an auction. However, the key is that they are not limited to those two possibilities. Basically, a listing can be anything that you want to present in some way to somebody. It's also possible that listings are digital goods which can be downloaded or simply texts which can be viewed on-line.

Therefore, listings do not provide a fixed structure. They are unstructured documents consisting of a bunch of simple name-value attributes. The attributes describe the listing further.


Base Listing Attributes

In order to provide a common infrastructure for working with listings, the following attributes will be defined as the base attributes.

  • id
  • name
  • title
  • description

id .. a machine-generated unique identifier of a listing. Its purpose is to locate a single specific listing if necessary.

name .. a human-readable name of a listing which is typically an identifier that makes sense in a given context (e.g. a product/SKU number)

title .. a human-readable listing title

description .. a human-readable description of the listing

All other listing attributes are specific to the context where the listing is being used. For example, a "price" attribute can be used for a listing in a classic storefront to represent the actual item price. However, in an auction site the "price" attribute can represent a "Buy Now" price.
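The idea of a listing as an unstructured bag of name-value attributes can be sketched as follows. This is only an illustration, not the CDS implementation: the `Listing` class, its `add` and `first` helpers, and the sample values are hypothetical, and multi-valued attributes are an assumption made so that the same shape also fits attributes that may occur more than once.

```python
from dataclasses import dataclass, field

# The four base attributes named in the text above.
BASE_ATTRIBUTES = ("id", "name", "title", "description")


@dataclass
class Listing:
    """An unstructured document: attribute name -> list of simple values."""
    attributes: dict = field(default_factory=dict)

    def add(self, name, value):
        # Assumption: an attribute may carry several values.
        self.attributes.setdefault(name, []).append(value)

    def first(self, name, default=None):
        values = self.attributes.get(name)
        return values[0] if values else default


shop_item = Listing()
shop_item.add("id", "a1b2c3")
shop_item.add("name", "SKU-4711")
shop_item.add("title", "Blue Mug")
shop_item.add("price", "9.99")  # storefront context: the actual item price

auction_item = Listing()
auction_item.add("id", "d4e5f6")
auction_item.add("title", "Vintage Mug")
auction_item.add("price", "25.00")  # auction context: a "Buy Now" price
```

Both listings share the same shape; only the interpretation of "price" differs by context.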


Navigational Attributes

It is assumed that listings can be navigated. In order to make the navigation as flexible as possible it will be based on the listing attributes as well. Therefore, the following attributes will be used for navigation.

  • path
  • uripath
  • tags
  • start
  • end

path .. the path a listing is contained in (e.g. "folder/sub/subsub"). Note that the complete path is used here in order to have an implicit assignment to the parent paths as well. A listing may be attached to multiple paths. In this case, multiple path attributes may be used. If a path for a listing cannot be determined, it will be listed in the root path (i.e. '/').
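One way to read the implicit parent assignment is that a store could expand every complete path into the path itself plus all of its parents. The sketch below shows that reading; `expand_paths` is a hypothetical helper and the expansion strategy is an assumption, not something the concept prescribes.

```python
def expand_paths(paths):
    """Return each path together with all of its parent paths, in order."""
    expanded = []
    for path in paths:
        segments = [segment for segment in path.split("/") if segment]
        for depth in range(1, len(segments) + 1):
            candidate = "/".join(segments[:depth])
            if candidate not in expanded:
                expanded.append(candidate)
    # No usable path could be determined: list in the root path.
    return expanded or ["/"]
```

For "folder/sub/subsub" this yields "folder", "folder/sub" and "folder/sub/subsub", so filtering on "folder" also finds the listing.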

uripath .. a URI path portion for direct lookup of listings. This is useful for building search engine friendly site URLs. The URI path does not need to be unique across the board but should be unique within the same lookup context (e.g. unique across all auction listings of a site). Thus, it will be interpreted relative to the context base. If the URI path is not provided, it will be computed based on the title attribute. If no title attribute is available, the name attribute will be used, with a last fall-back to the listing id. The URI path must not start or end with a slash ('/'). Note that the URI path is not encoded but contains the raw string as entered by a user.
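The fall-back chain for the URI path can be sketched like this. `resolve_uripath` is a hypothetical helper. Trimming leading and trailing slashes is an assumption about how the "must not start or end with a slash" rule might be enforced; the rest of the string is kept raw, as the text requires.

```python
def resolve_uripath(attributes):
    """Pick the URI path: uripath, else title, else name, else id."""
    for key in ("uripath", "title", "name", "id"):
        value = attributes.get(key)
        if value:
            # Assumption: enforce the slash rule by trimming; no encoding is applied.
            stripped = value.strip("/")
            if stripped:
                return stripped
    raise ValueError("listing has no id to fall back to")
```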

tags .. tags (a.k.a. labels) attached to a listing. This allows navigating listings through a tag cloud or filtering based on tags. Tags are optional.

start .. a timestamp which represents the time after which a listing should be visible (UTC; yyyyMMddHHmm, e.g. 200801311200). If the start timestamp cannot be determined, it will be set to "0" to make the listing visible as soon as possible.

end .. a timestamp which represents the time after which a listing should not be visible anymore (UTC; yyyyMMddHHmm, e.g. 201012312359). If the end timestamp cannot be determined, it will be set to "0", which will make the listing visible forever.
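A visibility check based on the start and end attributes might look like the following sketch. The function names are hypothetical, and treating a listing as visible exactly at its start and end minute is an assumption, since the text does not define the boundaries.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y%m%d%H%M"  # yyyyMMddHHmm, interpreted as UTC


def parse_timestamp(value):
    parsed = datetime.strptime(value, TIMESTAMP_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def is_visible(start, end, now):
    """Return True if `now` (timezone-aware) falls inside the visibility window."""
    # "0" as start means: visible as soon as possible.
    if start != "0" and now < parse_timestamp(start):
        return False
    # "0" as end means: visible forever.
    if end != "0" and now > parse_timestamp(end):
        return False
    return True
```

In a search-backed store the same rule would more likely be expressed as a range filter at query time than evaluated per listing.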


Performance

Performance is an important design goal. The most important thing to note is that under all circumstances listing display time must be cheap (read: flying fast). In order to achieve this we will aggregate as much data as possible in listing attributes and move any computing and aggregation into the listing generation process.

The general strategy is that listings will be generated from underlying data on specific events. Additionally, all listing attributes will be attached to a listing prior to making it available. We specifically trade freshness for performance, i.e. updates to the underlying data will be reflected on the listing only after an event has been triggered to update the listing as well, and only after the listing update process has finished. The reason is that the listing update process is considered to be expensive.

Even though the listing process is considered to be expensive, it still has to scale. Therefore, any implementation must be able to scale horizontally. The listing generation process will process items in parallel and be able to leverage a computing grid in order to scale.

It is important that all possible data is aggregated into the listing during listing generation. For example, if discounts are granted to a product during a promotion, the discount will be calculated during the listing generation process. Additionally, the applied discounts will be attached to the listing so that a storefront can simply read the info from the listing without issuing an additional lookup for discounts and without additional computations. Of course, this is only possible with product discounts but not with order-time discounts. However, following this philosophy it's also possible to aggregate promotional information (e.g. buy three get one free) into the listing, which would save additional lookups during basket calculation.
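The discount example can be sketched as a generation step that computes the discounted price once and attaches both the result and the applied discounts to the listing. Everything here is illustrative: `generate_listing`, the input dictionaries, and the "listprice" and "discounts" attribute names are assumptions, as is the rule that percentage discounts are taken from the base price.

```python
from decimal import Decimal


def generate_listing(product, discounts):
    """Build a listing with product discounts pre-computed and attached."""
    base_price = Decimal(product["price"])
    price = base_price
    applied = []
    for discount in discounts:
        if discount["sku"] == product["sku"]:
            # Assumption: each percentage is taken from the base price.
            price -= base_price * Decimal(discount["percent"]) / Decimal(100)
            applied.append(discount["code"])
    return {
        "id": product["id"],
        "name": product["sku"],
        "title": product["title"],
        "listprice": str(base_price),
        "price": str(price.quantize(Decimal("0.01"))),
        "discounts": applied,  # the storefront reads this without a lookup
    }
```

At display time the storefront only reads "price" and "discounts"; no discount lookup or calculation is needed.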


Real-Time Trade-Off

When real-time information needs to be displayed the listing story needs to make an exception to the performance goal. The underlying assumption is that any real-time service is performant and scalable to query. Very likely, the real-time service will use a short cache for handling peak loads. Based on this assumption it is further assumed that using the real-time service directly is more efficient than performing frequent invocations of the listing update process which is simply not designed to handle real-time updates.

One example is the auction pricing model. The auction pricing model requires real-time bid information to be present on an auction listing. Therefore, the auction pricing *will* perform an additional lookup for the bid information on a listing details page. Further performance investigations will show if it's eventually possible to avoid the additional lookups on listing browsing pages. However, the current assumption is that the bidding service will provide this information in a performant and scalable manner even for listing browsing pages, which would be more efficient than performing frequent invocations of the listing update process.

Another example is inventory availability in an on-line shop. It's a common functionality these days to indicate if an item is available and can be shipped immediately (i.e., "in stock"), if there is a delay, or if it is even sold out.

Of course, following the CloudFree approach, the site developer will not see the complexity behind this. A site developer will just query the listing service, and the listing service implementation will query the listing store and any real-time service if necessary.
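The division of labor described here can be sketched as a small facade. The `ListingService` class, its constructor arguments, and the "available" attribute are hypothetical; a plain dictionary stands in for the listing store and a lambda stands in for a real-time inventory service.

```python
class ListingService:
    """Facade: reads stored listings and adds real-time attributes on demand."""

    def __init__(self, store, realtime_lookups):
        self._store = store                        # listing id -> stored attributes
        self._realtime_lookups = realtime_lookups  # attribute name -> callable(listing id)

    def get(self, listing_id):
        listing = dict(self._store[listing_id])  # copy; the store stays untouched
        for name, lookup in self._realtime_lookups.items():
            listing[name] = lookup(listing_id)
        return listing


stock = {"a1": 0, "b2": 7}  # stand-in for a real-time inventory service
service = ListingService(
    store={"a1": {"title": "Blue Mug"}, "b2": {"title": "Red Mug"}},
    realtime_lookups={"available": lambda listing_id: stock[listing_id] > 0},
)
```

The site developer only calls `service.get(...)`; whether an attribute came from the store or from a real-time lookup is not visible in the result.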


Storage

The power of the listing story lies in the listing store. It is assumed that the listing store will be capable of hosting all listings and allows querying for listings very quickly. The listing store can be compared to an index. It needs to provide query and filter capabilities. The listing store shall also provide faceting capabilities. The listings returned by the listing store will contain all the stored attributes. The result has to be paged. The maximum number of listings per page shall be reasonably low.
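To make the required store capabilities concrete, here is a minimal in-memory sketch of filtering, faceting and paging. It is not a proposed store implementation: the `search` function, its parameters, the result layout and the page size cap of 50 are all assumptions, and listings are modeled as dictionaries mapping attribute names to lists of values.

```python
from collections import Counter

MAX_PAGE_SIZE = 50  # assumed cap; the text only asks for a reasonably low maximum


def search(listings, filters=None, facet_on=None, page=0, page_size=20):
    """Filter listings, count facet values, and return one page of results."""
    page_size = min(page_size, MAX_PAGE_SIZE)
    filters = filters or {}
    matches = [
        listing for listing in listings
        if all(value in listing.get(name, []) for name, value in filters.items())
    ]
    facets = Counter()
    if facet_on:
        for listing in matches:
            facets.update(listing.get(facet_on, []))
    first = page * page_size
    return {
        "total": len(matches),
        "facets": dict(facets),
        "listings": matches[first:first + page_size],  # all stored attributes included
    }
```

Traditional browsing then becomes a filtered search, for example `search(listings, filters={"path": "folder/sub"})`.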

The storage itself will be pluggable, supporting various client needs and future growth. One store might be built upon open source search technology like Apache Solr. Another store might be based on commercial offerings such as FAST or FACT-Finder.