The proxy cache code from Apurva Kumar of IBM India Research was committed to HEAD recently. The proxy cache provides the following features : 1. Semantic caching of positive conjunctive LDAP queries 2. Attribute level caching. 3. Consistency support 4. Support for multiple backend types 5. Support for caching multiple directories 6. Support for multiple database instances for a single cache directory tree. Please refer to the attached document and ITS#2062 for further information and history. Any comments and feedbacks to the developers' list are welcome. (See attached file: ldapcache.html) ------------------------ Jong Hyuk Choi IBM Thomas J. Watson Research Center - Enterprise Linux Group P. O. Box 218, Yorktown Heights, NY 10598 email: jongchoi@us.ibm.com (phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979Title: LDAP proxy cache
LDAP Proxy Cache extension for OpenLDAPFeatures
LDAP proxy caching: MotivationThere has been a growth of websites providing dynamic content on the web. Typically dynamic content is generated in response to a user request (query) evaluated against a database. Techniques used for traditional content caching are thus not useful for caching dynamic content. Directories are specialized databases which are capable of storing heterogeneous real world information in a single instance. A large number of dynamic content websites use directories at the backend to access personnel information, customer profiles, preferences, authentication information etc. In most of these applications even though the content provided to the user is dynamic, the directory data is relatively static. The solutions proposed for active query caching of database backed web applications are typically based on a semantic query caching engine implemented in a servlet running at a proxy. The servlet contacts the backend database for queries which can not be answered from the cache. Applied to the directory context, the servlet will contact the origin LDAP directory server for such queries. Since directory is a part of the LDAP protocol, performing directory caching outside the LDAP framework poses several problems. Firstly, the servlet will require some schema information. This is because the attribute syntaxes and matching rules are required for query containment. Secondly, the solution will have to rely on some non-standard (outside the LDAP framework) means of ensuring consistency. Thirdly, the security framework of LDAP has to be implemented at the servlet. Lastly, for supporting clients which access LDAP directly (through LDAP URLs or browser address books etc.) such a solution will still require to export an LDAP interface. Our approach is to extend an LDAP directory server to an LDAP proxy cache by implementing the query cache within the server. Implementing LDAP caching at the middleware (LDAP) level ensures that unlike the servlet based approach different applications and directories can use the solution. In this solution, schema information can be replicated from the backend server. Access control information can either be replicated or a different policy might be supported. Consistency issues can be addressed in a standard way, for eg. using the LDAP client update protocol (LCUP). LDAP proxy cache: ArchitectureA typical directory server has a front end which processes incoming client requests, calls a backend interface to perform the corresponding operation on the database, and passes the results back to the client. The backend is a database driver for LDAP. It implements an interface for each of the LDAP operations which performs the read and write operations corresponding to that operation on the database. The result of the operation is sent back to the front end.
Figure 1 shows the architecture of an LDAP proxy cache. The proxy cache backend supports the backend interfaces like any other backend. It also consists of a query cache which implements query containment, cache replacement, prefetching algorithms and stores semantic information for the cached queries. The following cache specific operations are peformed by the proxy cache backend.
Proxy Cache: Implementation in OpenLDAPCaching is based on templates. A template is completely defined in terms of a filter format and a set of projected attributes. For eg. queries Query 1: Filter: &(cn=Foo)(c=US) attrset: {mail,telephoneNumber}. Query 2: Filter: &(cn=Bar)(c=IN) attrset: {mail,telephoneNumber}. belong to the same template filter: &(cn=)(c=) attrset: {mail,telephoneNumber}. Typical LDAP applications use a finite number of such templates. The proxy cache allows specification of those templates and attribute sets for which the corresponding queries are to be cached. Each template is associated with a time to live (TTL). This is the maximum duration for which the cached queries corresponding to the template are considered valid. Cache replacement removes queries which are least recently used (LRU). Cache replacement is invoked when cache size reaches a threshold (hi_thresh). Queries are removed until the cache size reaches (lo_thresh). The query cache and cache management routines are implemented in back-meta while the cached entries can be stored in any backend (eg. back-ldbm, back-bdb etc). Back-meta extensionsThe meta backend performs basic LDAP proxying with respect to a set of remote LDAP servers, called "targets". The information contained in these servers can be presented as belonging to a single Directory Tree (DIT). Cache configuration specific directives: Three cache specific directives are added:
The struct metainfo is extended to contain a pointer to struct cache_manager. typedef struct cache_manager_s { unsigned long cache_size; /* current cache size (bytes) */ unsigned long thresh_hi; /* cache capacity (bytes) */ unsigned long thresh_lo; /* lower threshold for cache replacement */ unsigned long num_cached_queries;/* total # of cached queries */ unsigned long max_queries; /* upper bound on # of cached queries */ int numattrsets; /* number of attribute sets */ int numtemplates; /* number of cacheable templates */ int total_entries; /* total number of entries cached */ int num_entries_limit; /* max # of entries in a cacheable query */ int consistency_cycle_time; /* consistency cycle time (sec) */ int consistency_time; /* last consistency check (sec) */ query_manager* qm; /* query cache managed by the cache manager */ /* mutexes */ ... } cache_manager; struct query_manager represents the query cache. typedef struct query_manager_s { struct attr_set* attr_sets; /* possible sets of projected attributes */ QueryTemplate* templates; /* cacheable templates */ CachedQuery* lru_top; /* top and bottom of LRU list */ CachedQuery* lru_bottom; /* Query cache methods */ QCfunc qcfunc; /* Query containment*/ CRfunc crfunc; /* cache replacement */ AddQueryfunc addfunc; /* add query */ } query_manager; The above structs are defined in back-meta/cache.h. Routine for query containmentQuery containment is provided for substring, GE, LE and equality queries. An incoming query belonging to a cacheable template is checked for containment with the cached queries of that template or a more general template. The general query containment routine is in back-meta/query-cache.c while containment using substring queries is implemented in back-meta/substring.c. Handling search requestThe search backend API of back-meta is modified to handle an incoming search request at the proxy cache. If the incoming query is contained in any of the cached queries, it performs a local cache search and sends the result to the clients. Otherwise, it sends the query to the origin server. The entries returned are sent to the client and are merged in the cache database. Merge/Remove implementationThe merge and remove operations are implemented using the callback mechanism provided in OpenLDAP. We introduce a cache specific multi-valued operational attribute queryid which is required in all the cached entries. It represents the IDs of all the queries, the entry belongs to. Add/merge: To add/merge an entry e: {(a1,v1), (a2,v2)..(an,vn)} with name, dn, obtained as a result of query with ID qid, using the callback mechanism the following steps are performed:
Remove: To remove a query with search base dn, scope s, and ID qid.
The backglue backend is used to provide appearance of a single cache database while allowing multiple databases to serve a single cache directory tree. Support for weak consistency of the cached information. Support for weak consistency of the cached queries is implemented. Consistency check is carried out periodically in which stale queries with expired TTLs are removed. The interval between checks and TTL values can be specified during configuration using cache specific directives.Author: Apurva Kumar kapurva@in.ibm.com |