A High Performance Model for State and Caching

Putting objects in the right place can help performance in the short term, allowing your existing configuration to do more, as well as setting you up for larger horizontal scaling options with server farms.

Originally published at Internet.com


Every Web application has to deal with session state in one way or another. Most default to the state management built into the environment of the language they're using. It's simple, quick, and in many cases effective.

The problem comes in when you're trying to build a high volume site and the standard way of doing things just won't cut it. Dumping everything into session state just doesn't work; caching becomes a necessity. You're forced to start bringing more things into memory per user—while simultaneously keeping your per-user memory utilization down. It is when caching is added to the performance mix that you begin to really see how your fundamental views of session state and caching need to shift.

In this article, I present a model for creating different pools of information, tied both to how the information is used and to how expensive that information is to reproduce. I'll detail a method for improving performance through careful management of state and cache.

A Tale of Three Pools

In the Web application world, there are three basic pools of information related to state and caching. They are:

* Cache: Information that can be reproduced from persistent storage. Cache exists only because it improves performance. Cache, although a time-proven technique, has the problem of cache coherency, as discussed later. Cache information is necessarily memory based and not persisted. It's designed for speed, so the performance impact of reading it from a database is nearly always not worth it.
* Session: Information that the user has entered during the session or data specific to the current session. This information isn't available from other sources but doesn't need to be persisted for long. It is disposable once the user's session has ended. Sessions should have a short term persistence strategy, although many are still memory based and don't have a persistence strategy at all.
* User: User information is information that is related to the user, which may need to be accessed during a session but it's different in that it is persisted for long periods of time. User information including user attributes, histories of orders, settings, and so on is stored with the user for the life of their account.

The key in any effort to manage session state is to determine what kind of information is really being managed. This is true because often cache information and session information are intermingled. Some of the information traditionally stored in a session is really cached information that can be regenerated if necessary.
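As a rough illustration of that sorting exercise, the sketch below asks two questions about each piece of information: can it be regenerated from another source, and must it outlive the session? The `Pool` enum and the `choose_pool` helper are hypothetical names invented for this example, not part of any framework.

```python
from enum import Enum


class Pool(Enum):
    CACHE = "cache"      # reproducible from persistent storage; memory only
    SESSION = "session"  # specific to this session; short-term persistence
    USER = "user"        # tied to the account; persisted long term


def choose_pool(reproducible: bool, outlives_session: bool) -> Pool:
    """Pick a pool for one piece of information (hypothetical helper)."""
    if reproducible:
        # Anything that can be rebuilt from another source is cache, even
        # if it has traditionally been kept in the session.
        return Pool.CACHE
    if outlives_session:
        return Pool.USER
    return Pool.SESSION


# A product record can be reloaded from the database at any time.
product_pool = choose_pool(reproducible=True, outlives_session=True)
# Half-completed form input exists nowhere else and ends with the session.
form_pool = choose_pool(reproducible=False, outlives_session=False)
# A newly changed account setting must survive for the life of the account.
setting_pool = choose_pool(reproducible=False, outlives_session=True)
```

The order of the checks mirrors the point above: the "can it be regenerated?" question comes first, because that is the one most often skipped when data is dropped into the session.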

Load Balancers' Impact on Session

The most shocking experience that developers have when they first start to develop applications that need to be placed on a Web farm is the experience of a session shifting from one server to another server. Even though they often believe that they've carefully stored everything away in persistent storage, they often find places where seemingly innocuous things have caused problems. All of a sudden, a two-page form breaks because the user was shifted from one server to another in the middle of the form.

Of course, there are settings on load balancers that anchor a user's session to one server or another for the length of the session; however, these settings come at a cost of the load balancer not being able to truly balance the load well. Generally, it's accepted that it's a bad idea to lock a user's session to one server or the other from a performance standpoint. This, of course, ignores what happens to the sessions should the server servicing a user go down—the user's session is thrown away.

Load balancers, when set to distribute activity evenly across the servers, will quickly show places where session variables were not persisted and were accidentally shoved off into some sort of cache. Unfortunately, there isn't anything that will reveal the reverse so easily: it's difficult to find the places where you've added cached data to the persisted session storage.
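The broken two-page form can be reproduced in a few lines. The sketch below is a toy simulation, not a real load balancer: the `Server` class, the round-robin `cycle`, and the plain dictionary standing in for a shared session store are all assumptions made for illustration.

```python
from itertools import cycle


class Server:
    """A web server; sessions live in local memory unless a store is shared."""

    def __init__(self, name, shared_store=None):
        self.name = name
        # With no shared store, each server sees only its own sessions.
        self.sessions = shared_store if shared_store is not None else {}

    def save_page_one(self, session_id, form_data):
        self.sessions[session_id] = dict(form_data)

    def submit_page_two(self, session_id, more_data):
        page_one = self.sessions.get(session_id)
        if page_one is None:
            return None  # page one's answers are on some other server
        return {**page_one, **more_data}


def run_two_page_form(servers):
    """Send the two requests through a round-robin balancer."""
    balancer = cycle(servers)
    next(balancer).save_page_one("abc", {"name": "Pat"})
    return next(balancer).submit_page_two("abc", {"city": "Wayne"})


# In-memory sessions: the second request lands on a different server.
broken = run_two_page_form([Server("web1"), Server("web2")])

# Shared store (a dict standing in for a database or state server).
store = {}
working = run_two_page_form([Server("web1", store), Server("web2", store)])
```

Here `broken` comes back empty because the second server never saw page one, while `working` contains both pages' fields.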

Load balancers are tremendously effective at spreading load if you've been able to properly plan for the kinds of data that you store, and can be really difficult to get working if you don't.

Cache Characteristics

As mentioned above, cache is designed to be fast access to information that can be regenerated if necessary. That "if necessary" part may mean that the server becomes memory starved and needs the memory back for operations; it may mean the user has transitioned to a new server and the cached information doesn't exist, an application pool was reset, or a variety of other reasons.
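A minimal sketch of that "regenerate if necessary" behavior follows, assuming a small least-recently-used cache. The `RegeneratingCache` class, its `reset` method (standing in for an application pool reset), and the `load_from_storage` function are hypothetical names for this example only.

```python
from collections import OrderedDict


class RegeneratingCache:
    """Bounded cache that rebuilds any missing entry through a loader."""

    def __init__(self, loader, max_items):
        self._loader = loader
        self._max_items = max_items
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        value = self._loader(key)         # not cached: regenerate it
        self._items[key] = value
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict least recently used
        return value

    def reset(self):
        """Drop everything, as a process restart would."""
        self._items.clear()


loads = []  # records every trip to storage


def load_from_storage(key):
    loads.append(key)
    return f"value for {key}"


cache = RegeneratingCache(load_from_storage, max_items=2)
cache.get("a")
cache.get("b")
cache.get("a")   # served from cache
cache.get("c")   # cache is full, so "b" is evicted
cache.get("b")   # gone, so it is regenerated
cache.reset()
cache.get("a")   # regenerated after the reset
```

The caller never needs to know whether a value came from memory or was rebuilt, which is what makes it safe for the cache to lose entries at any time.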

To make effective use of cache, you need to make it an integral part of the way the objects work. One of the most effective ways to do this is to use a singleton-style pattern (strictly speaking, a static factory method) in which objects are obtained through a static method on the class rather than by 'new'ing an object. This is important because an object created with 'new' is necessarily a new object. However, the object that you are handed back from a static method may be new, or it may be an object from cache.

Take, for instance, a catalog where you have products. It's likely that the system will need a product several times. The object itself is mostly initialized from the database. During the execution, the user views the product, adds it to their cart, and checks out. In this most streamlined example, the user makes three state transitions (view, want to purchase, agree to purchase) and potentially visits many more pages as they put other items in their cart and browse around the site, coming back to what they've already put in their cart to compare it.

Rather than creating a product object each time, you can create one product object, store it in cache, and fetch it each time that the application needs it. The static "GetProduct" method on the catalog class can look it up in cache and, if found, return it. If the product isn't found in cache, the static method can create it and put it in the cache for next time.
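The lookup described above might look like the following sketch. It assumes a process-wide dictionary as the cache and a stub in place of the real database query; `Product`, `_load_from_database`, and the `database_reads` counter are hypothetical, and a Python class method stands in for the static "GetProduct" method.

```python
class Product:
    def __init__(self, product_id, name):
        self.product_id = product_id
        self.name = name


class Catalog:
    _cache = {}          # product_id -> Product, shared by the whole process
    database_reads = 0   # counter, only here to make the effect visible

    @classmethod
    def _load_from_database(cls, product_id):
        # Stand-in for a real query; assumed for illustration.
        cls.database_reads += 1
        return Product(product_id, f"Product {product_id}")

    @classmethod
    def get_product(cls, product_id):
        """Return the cached product, creating and caching it on a miss."""
        product = cls._cache.get(product_id)
        if product is None:
            product = cls._load_from_database(product_id)
            cls._cache[product_id] = product
        return product


# View, add to cart, check out: three requests, one database read.
viewed = Catalog.get_product(42)
in_cart = Catalog.get_product(42)
at_checkout = Catalog.get_product(42)
```

Because callers never construct a `Product` directly, the caching decision stays inside `Catalog` and can change without touching the pages that use it.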

This has the side benefit that, in this case, it's possible to share the product object among multiple users so there may be 100 people with the new widget in their cart, but there will only be one copy of the object in memory at any one time. Of course, when you put it in cache you can decide whether or not you want it to be shared across all users.
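One way to make the shared-or-not decision explicit is in the cache key. The sketch below is an assumption about how that could be done, not a description of any particular framework: `cache_key` and `get_cached` are hypothetical helpers, and including a user id in the key is what keeps an entry private to that user.

```python
cache = {}


def cache_key(product_id, user_id=None):
    """Shared entries omit the user; per-user entries include it."""
    if user_id is None:
        return ("product", product_id)
    return ("product", product_id, "user", user_id)


def get_cached(product_id, build, user_id=None):
    """Return the cached object for this scope, building it on a miss."""
    key = cache_key(product_id, user_id)
    if key not in cache:
        cache[key] = build()
    return cache[key]


# 100 carts reference the same product, but only one object is built.
carts = [get_cached(7, lambda: {"name": "New widget"}) for _ in range(100)]

# Data that differs per user (a hypothetical price quote) is not shared.
quote_a = get_cached(7, lambda: {"price": 9.50}, user_id="alice")
quote_b = get_cached(7, lambda: {"price": 8.75}, user_id="bob")
```

After this runs, the cache holds three entries: one shared product and one quote for each of the two users.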

This works really well for read-only objects, the kind that you don't update, or don't update often. However, there's a problem called cache coherency that occurs when the data can be updated: the cached copy of the object falls out of sync with the real source of the data.

There are a few ways to approach this problem. Three are listed below:

1. Ignore it: Although, on the surface, ignoring the problem may seem like a silly idea, it does have its place. Some problems aren't worth fixing because the probability that they will cause a problem is too low. For instance, if you're talking about updating the cache for the friendly name displayed on the site, there's a low chance of the data changing and little real harm if the data is out of sync. It may not be worth fixing. One of the techniques used to mitigate the problem is to reduce the cache time on cached objects so they don't stay in cache that long. Because they aren't in cache that long, the chance of a cache coherency problem is low. However, it limits the effectiveness of your cache as well because objects may expire before they can be used.
2. Session Based Clearing: If the cached data relates specifically to the user, write a date/time to the session. Any cached data created before that date and time is
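The shorter cache time mentioned under "Ignore it" can be sketched as a cache whose entries expire after a fixed lifetime. The `ExpiringCache` class, the 60-second lifetime, and the injectable `clock` (used so the example does not have to wait in real time) are assumptions made for this illustration.

```python
import time


class ExpiringCache:
    """Cache whose entries are reloaded once they are older than the TTL."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (value, time stored)

    def get(self, key):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]          # still within its lifetime
        value = self._loader(key)    # missing or expired: reload
        self._items[key] = (value, now)
        return value


# A dict stands in for the real data source; the names are made up.
source = {"site_name": "Widgets R Us"}
now = [0.0]  # fake clock, advanced by hand
cache = ExpiringCache(source.get, ttl_seconds=60, clock=lambda: now[0])

first = cache.get("site_name")
source["site_name"] = "Widget World"   # the real data changes
now[0] = 30.0
stale = cache.get("site_name")         # within 60 seconds: old value
now[0] = 61.0
fresh = cache.get("site_name")         # expired: reloaded from the source
```

The window during which `stale` is returned is the coherency risk being accepted; shrinking `ttl_seconds` narrows it at the cost of more reloads.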

Author: Robert Bogue

