Web Based Session Management

Best Practices in Managing HTTP Based Client Sessions

by Gunter Ollmann

Overview

The stateless nature of HTTP requires organisations and solution developers to find other methods of uniquely tracking a visitor through a web-based application. Various methods of managing a visitor's session have been proposed and used, but the most popular method is through the use of unique session IDs. Unfortunately, in too many cases organisations have incorrectly applied session ID management techniques that have left their "secure" application open to abuse and possible hijacking. This document reviews the common assumptions and flaws organisations have made and proposes methods to make their session management more secure and robust.

Understanding the Situation

Most organisations now have substantial investments in their online Internet presences. For major financial institutions and retailers, the Internet provides both a cost effective means of presenting their services and products to customers, and a method of delivering a personalised 24-7 presence. In almost all cases, the preferred method of delivering these services is over common HTTP. Due to the way this protocol works, there is no inbuilt facility to uniquely identify or track a particular customer (or session) within an application - thus the connection between the customer's web browser and the organisation's web service is referred to as stateless. Therefore, organisations have been forced to adopt custom methods of managing client sessions if they wish to maintain state.

The most common method of tracking a customer through a web site is by assigning a unique session ID and having this information transmitted back to the web server with every request. Unfortunately, should an attacker guess or steal this session ID information, it is normally a trivial exercise to hijack and manipulate another user's active session.

An important aspect of correctly managing state information through session IDs relates directly to authentication processes. While it is possible to insist that a client using an organisation's web application provide authentication information for each "restricted" page or data submission, it would soon become tedious and untenable. Thus session IDs are not only used to follow clients throughout the web application, they are also used to uniquely identify an authenticated user - thereby indirectly regulating access to site content or information.

The methods available to organisations for successfully managing sessions and preventing hijacking type attacks are largely dependent upon the answers to a number of critical questions:

Where and how often are legitimate clients expected to utilise the web-based application?
At what stage does the organisation really need to manage the state of a client's session?
What level of damage could be done to the legitimate client should an attacker be able to impersonate and hijack their account?
How much time is someone likely to invest in breaking the session management method?
How will the application identify or respond to potential or real hijacking attempts?
What is the significance to application usability should it be necessary to use an encrypted version of HTTP (HTTPS)?
What would be the cost to the organisation's reputation should information about a security flaw in any session management be made public?

Finding answers to these questions will enable the organisation to evaluate the likelihood and financial risk of an inappropriate or poorly implemented session management solution.

Maintaining State

Typically, the process of managing the state of a web-based client is through the use of session IDs. Session IDs are used by the application to uniquely identify a client browser, while background (server-side) processes are used to associate the session ID with a level of access. Thus, once a client has successfully authenticated to the web application, the session ID can be used as a stored authentication voucher so that the client does not have to retype their login information with each page request.

An organisation's application developers have three methods available to them to both allocate and receive session ID information:

Session ID information embedded in the URL, which is received by the application through HTTP GET requests when the client clicks on links embedded within a page.
Session ID information stored within the fields of a form and submitted to the application. Typically the session ID information would be embedded within the form as a hidden field and submitted with the HTTP POST command.
Through the use of cookies.

Each method has certain advantages and disadvantages, and one may be more appropriate than another. Selection of one method over another is largely dependent upon the type of service the web application is to deliver and the intended audience. Listed below is a more detailed analysis of the three methods. It is important that an organisation's system developers understand the limitations and security implications of each delivery mechanism.

URL Based Session IDs

Session ID information embedded in the URL, which is received by the application through HTTP GET requests when the client clicks on links.

Example: http://www.example.com/news.asp?article=27781;sessionid=IE60012219

Advantages:

Can be used even if the client web-browser has high security settings and has disabled the use of cookies.
Access to the information resource can be sent by the client to other users by providing them with a copy of the URL.
If the Session ID is to be permanently associated with the client browser and their computer, it is possible for the client to "Save as a favourite".
Depending upon the web browser type, URL information is commonly sent in the HTTP REFERER field. This information can be used to ensure a site visitor has followed a particular path within the web application, and subsequently used to identify some common forms of attack.

Disadvantages:

Any person using the same computer will be able to review the browser history file or stored favourites and follow the same URL.
URL information will be logged by intermediary systems such as firewalls and proxy servers. Thus anyone with access to these logs could observe the URL and possibly use the information in an attack.
It is a trivial exercise for anyone to modify the URL and associated session ID information within a standard web browser. Thus, the skills and equipment necessary to carry out the attack are minimal, resulting in more frequent attacks.
When a client navigates to a new web site, the URL containing the session information can be sent to the new site via the HTTP REFERER field.

Hidden Post Fields

Session ID information stored within the fields of a form and submitted to the application. Typically the session ID information would be embedded within the form as a hidden field and submitted with the HTTP POST command.

Example: Embedded within the HTML of a page as a hidden form field.

Advantages:

Not as obvious as URL embedded session information, and consequently requires a slightly higher skill level for an attacker to carry out any manipulation or hijacking.
Allows a client to safely store or transmit URL information relating to the site without providing access to their session information.
Can also be used even if the client web-browser has high security settings and has disabled the use of cookies.

Disadvantages:

While it requires a slightly higher skill level to perform, attacks can be carried out using commonly available tools such as Telnet or via personal proxy services.
The web application page content tends to be more complex, relying upon embedded form information, client-side scripting such as JavaScript, or session information embedded within active content such as Macromedia Flash. In addition, pages tend to be larger and take more time to download, so the client perceives the site as slower and less responsive.
Due to poor coding practices, a failure to check the submission type (i.e. GET or POST) at the server side may allow the POST content to be reformed into a URL that could be submitted via the HTTP GET method.

Cookies

Each time a client web browser accesses content from a particular domain or URL, if a cookie exists, the client browser is expected to submit any relevant cookie information as part of the HTTP request. Thus cookies can be used to preserve knowledge of the client browser across many pages and over periods of time. Cookies can be constructed to contain expiry information and may last beyond a single interactive session. Such cookies are referred to as "persistent cookies", and are stored on the client browser's hard drive in a location defined by the particular browser or operating system (e.g. c:\documents and settings\clientname\cookies for Internet Explorer on Windows XP). By omitting expiration information from a cookie, the client browser is expected to store the cookie only in memory. These "session cookies" should be erased when the browser is closed.

Example: Within the plain text of the HTTP server response:

Set-Cookie: sessionID="IE60012219"; path="/"; domain="www.example.com"; expires="2003-06-01 00:00:00GMT"; version=0
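The difference between a session cookie and a persistent cookie comes down to whether expiry information is supplied. The following is a minimal sketch using Python's standard http.cookies module; the function name build_cookie is hypothetical, and the attribute layout it emits follows the module's defaults rather than the exact header shown above.

```python
from http.cookies import SimpleCookie


def build_cookie(session_id, expires=None):
    """Return a Set-Cookie header line for the given session ID.

    With expires=None no expiry attribute is emitted, so the browser is
    expected to keep the cookie in memory only (a session cookie).
    """
    cookie = SimpleCookie()
    cookie["sessionID"] = session_id
    cookie["sessionID"]["path"] = "/"
    cookie["sessionID"]["domain"] = "www.example.com"
    if expires is not None:
        # Supplying an expiry date makes this a persistent cookie.
        cookie["sessionID"]["expires"] = expires
    return cookie.output()


print(build_cookie("IE60012219"))
print(build_cookie("IE60012219", "Sun, 01 Jun 2003 00:00:00 GMT"))
```

For session IDs that act as authentication vouchers, the session-cookie form (no expiry) is usually the safer default, since nothing is written to the client's disk.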

Advantages:

Careful use of persistent and session type cookies can be used to regulate access to the web application over time.
More options are available for controlling session ID timeouts.
Session information is unlikely to be recorded by intermediary devices.
Cookie functionality is built in to most browsers. Thus no special coding is required to ensure session ID information is embedded within the pages served to the client browser.

Disadvantages:

An increasingly common security precaution with web browsers is to disable cookie functionality. Thus web applications dependent upon the cookie function will not work for "security conscious" users.
As persistent cookies exist as text files on the client system, they can be easily copied and used on other systems. Depending on the host's file access permissions, other users of the host may steal this information and impersonate the user.
Cookies are limited in size, and are unsuitable for storing complex arrays of state information.
Cookies will be sent with every page and file requested by the browser within the domain defined by the SET-COOKIE.

The Session ID

An important aspect of managing state within the web application is the "strength" of the session ID itself. As the session ID is often used to track an authenticated user through the application, organisations must be aware that this session ID must fulfil a particular set of criteria if it is not to be compromised through predictive or brute-force type attacks. The two critical characteristics of a good session ID are randomness and length.

Session ID Randomness

It is important that the session ID is unpredictable and the application utilises a strong method of generating random IDs. It is vital that a cryptographically strong algorithm is used to generate a unique session ID for an authenticated user. Ideally the session ID should be a random value. Do not use linear algorithms based upon predictable variables such as date, time and client IP address.

To this end, the session ID should fulfil the following criteria:

It must look random - i.e. it should pass statistical tests of randomness.
It must be unpredictable - i.e. it must be infeasible to predict what the next random value will be, given complete knowledge of the computational algorithm or hardware generating the ID and all previous IDs.
It cannot be reliably reproduced - i.e. if the ID generator is used twice with exactly the same input criteria, the result will be an unrelated random ID.
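As an illustration of these criteria, the sketch below draws the session ID from the operating system's cryptographically strong random source via Python's standard secrets module. The function name new_session_id and the choice of 48 random bytes are assumptions made for this example, not values taken from this paper.

```python
import secrets


def new_session_id(num_bytes=48):
    """Return a URL-safe session ID built from num_bytes of OS randomness.

    No date, time, counter or client IP address is mixed in, so two calls
    with identical arguments yield unrelated values.
    """
    return secrets.token_urlsafe(num_bytes)


first = new_session_id()
second = new_session_id()
print(len(first), first != second)
```

With 48 random bytes the resulting ID is 64 characters long, which also satisfies a length target of more than 50 characters.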

Session ID Length

It is important that the session ID be of a sufficient length to make it infeasible that a brute force method could be used to successfully derive a valid ID within a usable timeframe. Given current processor and bandwidth limitations, session IDs of over 50 random characters in length are recommended - but make them longer if the opportunity exists.

The actual length of the session ID is dependent upon a number of factors:

Speed of connection - i.e. there is typically a big difference between Internet client, B2B and internal network connections. While an Internet client will typically have less than a 512 kbps connection speed, an internal user may be capable of connecting to the application server 200 times faster. Thus an internal user could potentially obtain a valid session ID in 1/200th of the time.
Complexity of the ID - i.e. what values and characters are used within the session ID? Moving from numeric values (0-9) to a case-sensitive alpha-numeric (a-z, A-Z, 0-9) range means that, for the same number of characters, the session ID becomes much more difficult to predict. For example, the numeric range 000000-999999 takes six digits, yet the same range of values fits within the four characters 0000-4C91 of a case-sensitive alpha-numeric character set (ordering the characters 0-9, A-Z, a-z).
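The effect of character-set size can be checked with a few lines of arithmetic. The sketch below assumes a 62-character alphabet ordered 0-9, A-Z, a-z (the ordering is an assumption; a different ordering gives different characters but the same length) and encodes the top of the six-digit numeric range.

```python
import string

# Assumed digit ordering for this example: 0-9, then A-Z, then a-z.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode_base62(number):
    """Encode a non-negative integer using the 62-character alphabet."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


top = encode_base62(999999)
print(top, len(top))       # the six-digit range fits in four characters
print(10 ** 6, 62 ** 6)    # IDs available at six characters in each set
```

Under this ordering the value 999999 encodes to four characters, so a six-character alpha-numeric ID covers a range tens of thousands of times larger than a six-digit numeric one.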

Session Hijacking

As session IDs are used to uniquely identify and track a web application user, any attacker who obtains this unique identifier is potentially able to submit the same information and impersonate someone else - this class of attack is commonly referred to as Session Hijacking. Given the inherent stateless nature of the HTTP (and HTTPS) protocol, the process of masquerading as an alternative user using a hijacked session ID is trivial.

An attacker has at his disposal three methods for gaining session ID information: observation, brute force and misdirection of trust.

Observation

By default all HTTP traffic crosses the wire in an unencrypted, plain text, mode. Thus, any device with access to the same wire or shared network devices is capable of "sniffing" the traffic and recording session ID information (not to mention user authentication information such as user names and passwords). In addition, many perimeter devices automatically log aspects of HTTP traffic - in particular the URL information.

A simple security measure to prevent "sniffing" or logging of confidential URL information is to use the encrypted form of HTTP: HTTPS.

Brute Force

If the session ID information is generated or presented in such a way as to be predictable, it is very easy for an attacker to repeatedly attempt to guess a valid ID. Depending upon the randomness and the length of the session ID, this process can take as little time as a few seconds.

In ideal circumstances, an attacker using a domestic DSL line can potentially make as many as 1000 session ID guesses per second. Thus it is very important to have a sufficiently complex and long session ID to ensure that any likely brute force attack will take many hundreds of hours to succeed.
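To put the guessing rate in perspective, the following sketch estimates how long it would take to try every possible ID once at 1000 guesses per second. The function name hours_to_exhaust and the sample ID formats are assumptions chosen for illustration; the calculation ignores the fact that several sessions may be valid at once, which shortens the time to a first hit.

```python
def hours_to_exhaust(alphabet_size, length, guesses_per_second=1000):
    """Hours needed to try every ID of the given length exactly once."""
    keyspace = alphabet_size ** length
    return keyspace / guesses_per_second / 3600


samples = [
    ("6 numeric digits", 10, 6),
    ("8 alpha-numeric characters", 62, 8),
    ("50 alpha-numeric characters", 62, 50),
]
for label, alphabet_size, length in samples:
    print(f"{label}: {hours_to_exhaust(alphabet_size, length):.3g} hours")
```

A six-digit numeric ID falls in well under an hour at this rate, which is why both the character set and the length matter.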

A paper by David Endler on the processes involved in brute forcing session IDs should be sought by readers requiring background information on this process.

Misdirected trust

In ideal circumstances, a client's web browser would only ever disclose confidential session ID information to a single, trusted site. Unfortunately, there are numerous instances when this is not the case. For example, the HTTP REFERER field will send the full URL, and in some applications this URL may contain session ID information.

Other popular methods, utilising common trust relationship flaws, are HTML embedding and Cross-site Scripting (CSS or sometimes XSS) attacks. Through clever embedding of HTML code or scripting elements, it is possible to steal session ID information - even if it is held within the URL, POST fields or cookies. Readers needing more information about this class of attack should review a copy of "HTML Code Injection and Cross-site scripting".

Common Failings

While web based session management is important for tracking users and their navigation throughout an application, the most critical use is to maintain the state information of an authenticated user as he carries out his allowed functions. For online banking and retail environments, using an appropriately strong session management method is crucial to the success of the organisation.

In the past, I have had the opportunity to investigate session handling techniques for many of my clients' business critical online applications. Based upon these investigations, this section details some of the most common failings and assumptions that have been made.

Predictable Session IDs

The most common flaw in session ID usage has always been predictability. As discussed earlier, the causes are a lack of randomness, a lack of length, or both.

Sequential allocation of Session IDs - Each visitor to the site is allocated a session ID in sequential order. Thus, by observing your own session ID information, the simple practice of replacing it with another value a few iterations up or down will allow the attacker to impersonate another user.
Session ID values are too short - The full range of valid session IDs could be covered during an automated attack before there is time for the session to expire.
Common hashing techniques - While many commercial web services have built in functions for calculating hashed information, these mechanisms are well known and available for reproduction. A hashing function will indeed create a session ID value that appears to be unique, but great care should be taken to ensure that predictable information is not used in the generation of the hash. For example, there have been cases where the "unique" hash was based upon the local system time and the IP address of the connecting host. Using the same hashing function, the attacker would be able to pre-calculate a large number of time dependent hashes for a popular internet portal or proxy service (e.g. AOL), and use them to brute force any existing session from that service.
Session Obfuscation - The use of a custom method of obscuring data and using it for session management. It is never a sound idea to include client or other confidential information within a session ID. For example, some organisations have even tried encoding the user's name and password within the session ID using a shifted Unicode and hexadecimal representation of the information.
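The weakness of hashing predictable inputs can be sketched as follows. The generator weak_session_id is a deliberately flawed, hypothetical example (a SHA-256 digest over the client IP address and the server time in seconds); the IP addresses, timestamps and ten-minute window are likewise assumptions chosen for illustration.

```python
import hashlib


def weak_session_id(client_ip, unix_time):
    """Flawed generator: hashes only predictable values (do not use)."""
    material = f"{client_ip}|{unix_time}".encode()
    return hashlib.sha256(material).hexdigest()


def candidate_ids(proxy_ips, start_time, window_seconds):
    """Pre-calculate every ID the flawed generator could have issued."""
    return {
        weak_session_id(ip, moment)
        for ip in proxy_ips
        for moment in range(start_time, start_time + window_seconds)
    }


# A victim logs in through a shared proxy at some point in the window.
victim_id = weak_session_id("192.0.2.10", 1_050_000_123)

# The attacker knows the proxy addresses and roughly when users log in.
guesses = candidate_ids(["192.0.2.10", "192.0.2.11"], 1_050_000_000, 600)
print(len(guesses), victim_id in guesses)
```

Although each ID is 64 hexadecimal characters long and looks random, the attacker only has to try about 1200 values, because the hash adds no unpredictability of its own.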

Insecure Transmission

For banking and retailing applications it is crucial that all confidential material and session information be transmitted securely and not vulnerable to observation or replay attacks. Unfortunately many commercial packages have failed in the past to secure the integrity of their session management due to insecure transmission.

Use encryption when sending session information - As mentioned earlier, there are many instances whereby a user's connection to the application server will be logged if not sent over an encrypted channel, such as HTTPS. This is particularly important for applications that require a high degree of confidentiality. If using the cookie method for managing session IDs, organisations should note that the client browser will submit the session ID with every request (this includes pages and graphics) and may even submit it to other servers within the same domain - which may or may not be done over a secure data channel.
Use different session IDs when shifting between secure and insecure application components - As a new user navigates the web application as a "guest", use a different session ID than what would be allocated in the secure part of the application. Never use the same session ID information in the authenticated and unauthenticated sections of the web application. Again, ensure that the session ID to be used in the secure part of the web application is not predictable and is not based on the previous ID.

Length of Session Validity

For secure applications all session information should be time limited and allow for client-side cancellation or server-side revocation.

Client Cancellation - Many web applications fail to allow for client-side cancellation such as "log-out". If the intention is to allow users to interact with the application from anywhere, including Internet Cafes, organisations need to be aware that other users can use the same machine and trawl through the "history" and cached page information. If the session has not been cancelled, it is a trivial exercise for the next user of the computer to "resume" the last connection.
Session Timeout - Again, when dealing with the possibility of shared client computers, it is extremely important that there is a limited lifetime (or period of inactivity) after which the session will automatically expire. The expiry time should be kept to a minimum period, and is dependent upon the nature of the application. Ideally the application should be capable of monitoring the period of inactivity for each session ID and be able to delete or revoke the session ID when a threshold has been reached.
Server Revocation - In some circumstances it may be necessary to cancel a session at the server-side. Likely events include when the user leaves the insecure part of the application and enters the secure part with a new session ID. Alternatively, should some kind of attack be recorded by the server, it would be advisable to revoke the session associated with the attacker's system.
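The three controls above (client cancellation, inactivity timeout and server revocation) can be sketched with a small in-memory store. The class name SessionStore, its method names and the 15 minute default are assumptions made for this example; a production system would normally keep this state in a shared database or cache.

```python
import secrets
import time


class SessionStore:
    """Tracks the last activity time of each live session ID."""

    def __init__(self, idle_timeout=900, clock=time.monotonic):
        self._idle_timeout = idle_timeout  # seconds of permitted inactivity
        self._clock = clock                # injectable for testing
        self._last_seen = {}

    def create(self):
        session_id = secrets.token_urlsafe(48)
        self._last_seen[session_id] = self._clock()
        return session_id

    def touch(self, session_id):
        """Return True and reset the idle timer if the session is valid."""
        last = self._last_seen.get(session_id)
        if last is None:
            return False  # unknown, logged out or revoked
        now = self._clock()
        if now - last > self._idle_timeout:
            del self._last_seen[session_id]  # expired through inactivity
            return False
        self._last_seen[session_id] = now
        return True

    def revoke(self, session_id):
        """Used for both client log-out and server-side revocation."""
        self._last_seen.pop(session_id, None)


store = SessionStore(idle_timeout=900)
sid = store.create()
print(store.touch(sid))   # valid while active
store.revoke(sid)
print(store.touch(sid))   # rejected after log-out or revocation
```

Calling touch on every request means the timeout measures inactivity rather than total session age; an absolute lifetime could be added alongside it if the application requires one.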

Session Verification

The processes for handling and manipulating session ID information must be robust and capable of correctly handling attacks targeting the content within.

Session ID Length - Ensure that the content of the session ID is of the expected size and type, and that the quality of the information is verified before processing. For instance, be capable of identifying over-sized session IDs that may constitute a buffer overflow type attack. Additionally, ensure that the content of the session ID does not contain unexpected information - for example, if the session ID will be used within the application's backend database, care should be taken that the session ID does not contain embedded data strings that may be interpreted as an extension to the 'Select' SQL query.
Source of the Session ID - When using the HTTP POST method for communicating session information, ensure that the application is capable of discerning whether the session ID was delivered to the application from the client browser through the HTTP POST method, and not through a manipulated GET request. Converting HTTP POST into a GET request is a common method of conducting cross-site scripting attacks and other distributed brute force attacks.
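Both checks can be applied before the session ID is handed to any other part of the application. The sketch below assumes session IDs are exactly 64 characters from the set A-Z, a-z, 0-9, "-" and "_", and that the hidden form field is named sessionid; the function name extract_session_id and these formats are assumptions for illustration only.

```python
import re

# Assumed format: exactly 64 URL-safe characters, nothing else.
SESSION_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{64}")


def extract_session_id(method, form_fields):
    """Return the session ID from a POST body, or None if it is suspect.

    method is the HTTP request method; form_fields holds only the fields
    parsed from the request body, never from the URL query string.
    """
    if method.upper() != "POST":
        return None  # a POST converted into a GET request is rejected
    value = form_fields.get("sessionid", "")
    if not SESSION_ID_PATTERN.fullmatch(value):
        return None  # wrong length, or characters such as quotes or spaces
    return value


good = "a" * 64
print(extract_session_id("POST", {"sessionid": good}) == good)
print(extract_session_id("GET", {"sessionid": good}))
print(extract_session_id("POST", {"sessionid": "x' OR '1'='1"}))
```

Rejecting anything outside a strict whitelist handles the over-sized and SQL-bearing cases together, although database access should still use parameterised queries rather than rely on this check alone.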

Good Session Management

Depending upon the application's purpose, various methods of implementing session handling are available to developers and some may be more applicable than others. For applications requiring the maximum level of session handling security, options are limited, and require a mix of methods described earlier in this document. The following example currently represents one of the most secure methods of handling sessions, but is complex and difficult to implement successfully. The method relies upon three sources of session ID information. This information is held within the URL, the HTTP REFERER field and cookies.

When a client initially connects to the application as a guest, they are assigned a unique personal identifier (ID1), and this information is then embedded within the URL that they are redirected to. Also contained within the URL is a random identifier for the viewed page (ID2). A third personal identifier (ID3) is delivered as a session cookie, with a lifetime of the open client browser (i.e. the session cookie is held in memory; if the browser window and any child windows are closed, the information is lost). If the application server registers no activity from the client browser, the session information of ID3 is revoked.

1. Client connects to the site www.example.com over HTTP. http://www.example.com/

2. The Client is automatically redirected through a server-side redirect to the home page with a URL containing the unique session information - ID1 (user = ID93x7HeT7P4a9) and ID2 (current page = 3789264).

http://www.example.com/page.jsp?user=ID93x7HeT7P4a9;cpage=3789264

3. Within the HTTP server response, a session cookie is delivered (user track = UT23dWT3nQi7n4).

Set-Cookie: UserTrack="UT23dWT3nQi7n4"; path="/"; domain="www.example.com"; expires="2000-01-01 00:00:00GMT"; version=0

Within the page presented to the client, there will be many hyperlinks to other content pages within the application. Each link has been dynamically generated to include the client ID1, and a randomly generated (but catalogued) page identifier. As the unauthenticated user moves throughout the site, the current page identifier will change while ID1 and ID3 remain static. ID3 will change when the user is successfully authenticated.
For pages containing user information submission areas, all HTML forms have hidden fields which include both ID1 and ID2. If the submitted information is likely to contain ANY confidential or personal information, the submission MUST be made securely over HTTPS.

4. Within the page, each hyperlink is uniquely addressed and contains an associated random identifier.

5. Within a page containing a user submission area, the form may look like the following (note that the ACTION specifies both HTTPS and the full URL):
<FORM METHOD=POST ACTION="https://www.example.com/post/page.asp">
<INPUT TYPE="hidden" NAME="user" VALUE="ID93x7HeT7P4a9">
<INPUT TYPE="hidden" NAME="cpage" VALUE="3789264">
<INPUT TYPE="text" NAME="data" MAXLENGTH="100">
<INPUT TYPE="submit" NAME="Send Data">
</FORM>

6. All pages or data submissions by the client browser will include the session cookie information (ID3).

7. The application must take each identifier (ID1, ID2 and ID3) and check to see if they are valid for the client request, and that they have not timed out or been revoked. If this information is NOT correct, the client is redirected to the application's first page with all new identifiers (ID1, ID2 and ID3) and all previous ID information is revoked.

8. When the client browser submits a request or follows a hyperlink, an HTTP REFERER value is included. This value represents the URL that was previously presented to the client browser. The application should verify that ID2 within the REFERER URL is the correct precursor to the newly requested page (npage=). If not, the client browser has not followed the correct path to request the new page, and may be indicative of an attack in progress.
For example, the correct sequence to reach page 2 from the initial page is by following "link 1". Therefore, the request for the page http://www.example.com/page.asp?user=ID93x7HeT7P4a9;npage=8777623 must contain http://www.example.com/page.jsp?user=ID93x7HeT7P4a9;cpage=3789264 in the HTTP REFERER field.

9. If the identifiers are valid and correct, a new page is presented. ID2 is updated (e.g. current page = 8777623), while ID1 and ID3 remain the same. http://www.example.com/page.jsp?user=ID93x7HeT7P4a9;cpage=8777623

10. The returned page contains new random identifiers for all hyperlinks. There should be a link to go "back" to the previous page. However, the previous page will have been assigned a new random identifier. The client browser's "Back" button will no longer work. For example:

Original Page 1 was http://www.example.com/page.jsp?user=ID93x7HeT7P4a9;cpage=3789264

Page 2 is http://www.example.com/page.jsp?user=ID93x7HeT7P4a9;cpage=8777623

To return to Page 1, the URL may be http://www.example.com/page.jsp?user=ID93x7HeT7P4a9;cpage=7322641
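The path check described in steps 7 to 10 can be sketched as below. The helper names query_fields and followed_expected_path are hypothetical, the URLs use the ";" separated query format from the examples above, and treating a missing REFERER as a failure is a policy assumption that each application would need to decide for itself.

```python
from urllib.parse import urlsplit


def query_fields(url):
    """Parse the ';' separated query string used in the example URLs."""
    query = urlsplit(url).query
    return dict(pair.split("=", 1) for pair in query.split(";") if "=" in pair)


def followed_expected_path(referer, user, current_page, requested_page, issued_links):
    """Check that a request came from the page last served to this user.

    current_page is the ID2 recorded for the user's last page, and
    issued_links is the set of one-time page IDs embedded in that page.
    """
    if not referer:
        return False  # assumed policy: no REFERER means no proof of path
    fields = query_fields(referer)
    return (
        fields.get("user") == user
        and fields.get("cpage") == current_page
        and requested_page in issued_links
    )


referer = "http://www.example.com/page.jsp?user=ID93x7HeT7P4a9;cpage=3789264"
print(followed_expected_path(referer, "ID93x7HeT7P4a9", "3789264", "8777623", {"8777623"}))
```

After a successful check the server would record the new page as current and discard the old set of link identifiers, which is what stops a previously issued URL from being replayed.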

When the application requires the user to authenticate, all data submission MUST be over an encrypted session such as HTTPS. If the user is successfully authenticated, a new session cookie (ID3) is issued, and the previous session cookie information is revoked at the server. All communication thereafter (until the user decides to "logout") must be over HTTPS.

11. If the user successfully authenticates with the application, the previous session cookie (ID3) is revoked and a new ID3 is issued through the now encrypted HTTPS session.

12. The application must be able to associate ID3 with the type of communication (i.e. HTTP or HTTPS), and immediately revoke all session information (ID1, ID2 and ID3) if the new ID3 is used to access non-secure application resources. The use of revoked or inappropriate session information should result in the client browser being redirected to the start page and issued with all new session identifiers as previously discussed.

13. Again, just like the unsecured parts of the application, all pages passed to the client in the authenticated and secure part of the application should have randomly generated page identifiers.

14. The user must have the facility to "logout" and cancel their session. Logging out results in the revocation of all session information and, if possible, the automatic closing of the client browser. In addition, it is a good practice to ensure that both the HTML Meta tags associated with caching and HTTP caching options are set to expire in the past so that no page content should be stored on the client system.
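Steps 11 and 12 can be sketched as a small registry that remembers whether each ID3 was issued for the authenticated, HTTPS-only part of the application. The class name TrackingCookies and its methods are hypothetical, and for brevity the sketch revokes only ID3, whereas the scheme above also revokes ID1 and ID2.

```python
import secrets


class TrackingCookies:
    """Remembers which ID3 values may only be presented over HTTPS."""

    def __init__(self):
        self._https_only = {}  # ID3 -> True if issued after authentication

    def issue(self, authenticated=False):
        id3 = secrets.token_urlsafe(48)
        self._https_only[id3] = authenticated
        return id3

    def promote(self, guest_id3):
        """On successful login, revoke the guest ID3 and issue a new one."""
        self._https_only.pop(guest_id3, None)
        return self.issue(authenticated=True)

    def check(self, id3, scheme):
        """Validate ID3 for a request made over 'http' or 'https'."""
        if id3 not in self._https_only:
            return False
        if self._https_only[id3] and scheme != "https":
            del self._https_only[id3]  # misuse on a non-secure resource
            return False
        return True


cookies = TrackingCookies()
guest = cookies.issue()
member = cookies.promote(guest)
print(cookies.check(guest, "http"))    # guest ID3 no longer valid
print(cookies.check(member, "https"))  # authenticated ID3 accepted
print(cookies.check(member, "http"))   # presented over HTTP: revoked
print(cookies.check(member, "https"))  # stays revoked afterwards
```

A False result would lead the application to redirect the browser to the start page and issue fresh identifiers, as described in step 12.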

It is important to note that when utilising session information in the URL, it becomes near impossible to conduct any kind of URL embedded cross-site scripting attack. By assigning unique random identifiers to each page and linking between pages with one-time identifiers, it is almost impossible for an attacker to conduct any brute force or repetitive attacks. However, as this session method relies upon the use of session cookies, it will not work with client browsers that have disabled cookies. In some cases, a client browser page request may not contain any data in the HTTP REFERER field.
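The caching advice given for the logout step can be expressed as a set of HTTP response headers. The sketch below is one common combination rather than a definitive list; the function name no_cache_headers is hypothetical, and the choice of the Unix epoch as the "expired in the past" date is an assumption.

```python
from email.utils import formatdate


def no_cache_headers():
    """Response headers asking browsers and proxies not to keep the page."""
    return {
        "Cache-Control": "no-store, no-cache, must-revalidate",
        "Pragma": "no-cache",
        # An expiry date in the past: the Unix epoch in HTTP date format.
        "Expires": formatdate(0, usegmt=True),
    }


for name, value in no_cache_headers().items():
    print(f"{name}: {value}")
```

The same headers would be attached to every page in the authenticated part of the application, not only to the logout response.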

Conclusions

The stateless nature of HTTP requires organisations to use their own custom method of managing state through the use of session specific information. While there are a number of ways of implementing a session management solution, there are benefits and restrictions to each implementation. It is vital that developers understand both the mechanisms available to them, as well as the limitations. For applications requiring an application user to authenticate to access resources, it is imperative that the session management process is implemented securely.

The likelihood of an attacker specifically targeting the session management process is growing on a daily basis. As security technologies strengthen the server host's perimeter defences, and good patching management is implemented, session handling often represents the weakest area of critical services.

While this paper has described the limitations of various session handling methods, developers must be aware that good session management is only one component of building a secure application. Good session management can be bypassed through other poorly coded and implemented application components, and should not be seen as a stand-alone security measure.

The Author asserts the moral right to be identified as the author of this work