Manipulating the data sent between the browser and the web application to an attacker's advantage has long been a simple but effective way to make applications do things the user should not be able to do. In a badly designed and developed web application, malicious users can modify things like prices in web carts, session tokens or values stored in cookies, and even HTTP headers.
No data sent to the browser can be relied upon to stay the same unless it is cryptographically protected at the application layer. Cryptographic protection in the transport layer (SSL) in no way protects against attacks like parameter manipulation, in which data is altered before it hits the wire. Parameter tampering can often be done with:
Cookies
HTTP Headers
Form Fields
URL Query Strings
Cookies are the preferred method to maintain state in the stateless HTTP protocol. They are, however, also used as a convenient mechanism to store user preferences and other data, including session tokens. Both persistent and non-persistent cookies, secure or insecure, can be modified by the client and sent to the server with URL requests. Therefore any malicious user can modify cookie content to his advantage. There is a popular misconception that non-persistent cookies cannot be modified, but this is not true; tools like Winhex are freely available. SSL also only protects the cookie in transit.
The extent of cookie manipulation depends on what the cookie is used for, but usually ranges from session tokens to arrays that make authorization decisions. (Many cookies are Base64 encoded; this is an encoding scheme and offers no cryptographic protection.)
The following example comes from a real travel web site, modified to protect the innocent (or stupid).
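To illustrate why Base64 offers no protection, the following Python sketch reverses the encoding with the standard library alone. The cookie contents are hypothetical and made up for illustration.

```python
import base64

# Hypothetical cookie value: the Base64 form of "ADMIN=no".
cookie_value = base64.b64encode(b"ADMIN=no").decode("ascii")

# Anyone holding the cookie can reverse the encoding without any key...
decoded = base64.b64decode(cookie_value).decode("ascii")

# ...change the contents, and re-encode a value to send back to the server.
tampered = base64.b64encode(
    decoded.replace("no", "yes").encode("ascii")
).decode("ascii")
```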
Cookie: lang=en-us; ADMIN=no; y=1 ; time=10:30GMT ;
The attacker can simply modify the cookie to:
Cookie: lang=en-us; ADMIN=yes; y=1 ; time=12:30GMT ;
One mitigation technique is to simply use one session token to reference properties stored in a server-side cache. This is by far the most reliable way to ensure that data is sane on return: simply do not trust user input for values that you already know. When an application needs to check a user property, it checks the userid against its session table and points to the user's data variables in the cache / database. This is by far the best way to architect a cookie-based preferences solution.
Another technique involves building intrusion detection hooks to evaluate the cookie for any infeasible or impossible combinations of values that would indicate tampering. For instance, the "administrator" flag might be set in a cookie whose userid value does not belong to anyone on the development team.
The final method is to cryptographically protect the cookie against tampering. There are several ways to do this, including hashing the cookie and comparing hashes when it is returned, or symmetric encryption. A server compromise will invalidate either approach, so the response to a penetration must include new key generation under this scheme.
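A minimal Python sketch of the session-token approach might look like this. The in-memory dictionary and the function names are assumptions for illustration; a real application would keep the table in a database or cache.

```python
import secrets

# Hypothetical in-memory session table (illustrative only).
SESSIONS = {}

def create_session(user_id, admin):
    """Store user properties server-side and return an opaque token."""
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = {"user_id": user_id, "admin": admin}
    return token

def is_admin(token):
    """Look the property up server-side; never read it from the cookie."""
    session = SESSIONS.get(token)
    return bool(session and session["admin"])
```

The cookie then carries only the token, so there is no ADMIN value for the client to edit; a forged or unknown token simply fails the lookup.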
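Such a hook could be as small as the following Python sketch. The cookie field names and the list of development team IDs are hypothetical.

```python
# Hypothetical set of user IDs that may legitimately carry the admin flag.
DEV_TEAM = {"dev01", "dev02"}

def cookie_looks_tampered(cookie):
    """Return True for value combinations that should be impossible."""
    claims_admin = cookie.get("ADMIN") == "yes"
    return claims_admin and cookie.get("userid") not in DEV_TEAM
```

A positive result would typically be logged and the request rejected. The check only detects this one infeasible combination and does not replace server-side storage of the flag.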
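As one possible reading of the hashing approach, the Python sketch below attaches a keyed hash (HMAC-SHA256, a concrete choice made here and not named in the text) to the cookie value and rejects any cookie whose tag no longer matches. The key and function names are assumptions.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it must never be sent to the client
# and must be regenerated after any server compromise.
SECRET_KEY = b"replace-with-a-random-server-side-key"

def _tag(value):
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def sign_cookie(value):
    """Return 'value|tag', where tag is an HMAC-SHA256 over the value."""
    return f"{value}|{_tag(value)}"

def verify_cookie(signed):
    """Return the original value, or None if the cookie was altered."""
    value, sep, tag = signed.rpartition("|")
    if not sep:
        return None
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(tag.encode("utf-8"), _tag(value).encode("utf-8")):
        return value
    return None
```

This detects modification but does not hide the cookie contents from the user; hiding them would need encryption as well.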
HTTP headers are control information passed from web clients to web servers on HTTP requests, and from web servers to web clients on HTTP responses. Each header normally consists of a single line of ASCII text with a name and a value. Sample headers from a POST request follow.
Host: www.someplace.org
Pragma: no-cache
Cache-Control: no-cache
User-Agent: Lynx/2.8.4dev.9 libwww-FM/2.14
Referer: http://www.someplace.org/login.php
Content-type: application/x-www-form-urlencoded
Content-length: 49
Often HTTP headers are used by the browser and the web server software only. Most web applications pay no attention to them. However some web developers choose to inspect incoming headers, and in those cases it is important to realize that request headers originate at the client side, and they may thus be altered by an attacker.
Normal web browsers do not allow header modification. An attacker will have to write his own program (about 15 lines of Perl code will do) to perform the HTTP request, or he may use one of several freely available proxies that allow easy modification of any data sent from the browser.
Example 1: The Referer header (note the spelling), which is sent by most browsers, normally contains the URL of the web page from which the request originated. Some web sites choose to check this header in order to make sure the request originated from a page generated by them, for example in the belief that it prevents attackers from saving web pages, modifying forms, and posting them from their own computer. This security mechanism will fail, as the attacker will be able to modify the Referer header to look like it came from the original site.
Example 2: The Accept-Language header indicates the preferred language(s) of the user. A web application doing internationalization (i18n) may pick up the language label from the HTTP header and pass it to a database in order to look up a text. If the content of the header is sent verbatim to the database, an attacker may be able to inject SQL commands (see SQL injection) by modifying the header. Likewise, if the header content is used to build a name of a file from which to look up the correct language text, an attacker may be able to launch a path traversal attack.
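One way to avoid both problems is to map the header onto a fixed list of known languages and to pass the result to the database only as a bound parameter. The Python sketch below assumes a hypothetical `texts` table with `lang`, `msg_id` and `body` columns and a hypothetical set of supported languages. It takes the first supported language in header order and ignores quality values.

```python
import sqlite3

# Hypothetical set of languages the application supports.
SUPPORTED_LANGUAGES = {"en", "de", "fr"}
DEFAULT_LANGUAGE = "en"

def pick_language(accept_language):
    """Map an untrusted Accept-Language header onto a known-good label."""
    for part in accept_language.split(","):
        # Drop any quality value (";q=0.8") and region suffix ("-US").
        tag = part.split(";")[0].strip().lower().split("-")[0]
        if tag in SUPPORTED_LANGUAGES:
            return tag
    return DEFAULT_LANGUAGE

def lookup_text(conn, msg_id, accept_language):
    """Fetch a translated string using a parameterized query."""
    lang = pick_language(accept_language)
    row = conn.execute(
        "SELECT body FROM texts WHERE lang = ? AND msg_id = ?", (lang, msg_id)
    ).fetchone()
    return row[0] if row else None
```

Because only values from the fixed list ever leave `pick_language`, the same label is also safe to use when building a file name.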
Simply put, headers cannot be relied upon without additional security measures. If a header originated server-side, such as a cookie, it can be cryptographically protected. If it originated client-side, such as a referer, it should not be used to make any security decisions.
When a user makes selections on an HTML page, the selection is typically stored as form field values and sent to the application as an HTTP request (GET or POST). HTML can also store field values as Hidden Fields, which are not rendered to the screen by the browser but are collected and submitted as parameters during form submissions.
Whether these form fields are pre-selected (drop down, check boxes etc.), free form or hidden, they can all be manipulated by the user to submit whatever values he/she chooses. In most cases this is as simple as saving the page using "view source", "save", editing the HTML and re-loading the page in the web browser.
As an example an application uses a simple form to submit a username and password to a CGI for authentication using HTTP over SSL. The username and password form fields look like this.
Some developers try to prevent the user from entering long usernames and passwords by setting the form field attribute maxlength=(an integer), in the belief that this will stop a malicious user from attempting buffer overflows with overly long parameters. However, the malicious user can simply save the page, remove the maxlength attribute and reload the page in his browser. Other interesting form field attributes include disabled, readonly and value. As discussed earlier, data (and code) sent to clients must not be relied upon when it comes back in requests until it is vetted for sanity and correctness. Code sent to browsers is merely a set of suggestions and has no security value.
Hidden Form Fields represent a convenient way for developers to store data in the browser and are one of the most common ways of carrying data between pages in wizard type applications. All of the same rules apply to hidden form fields as apply to regular form fields.
Example 2 - Take the same application. Behind the login form may have been the HTML tag:
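A server-side check along the following lines enforces the limit regardless of what the browser was told. The field names and length limits in this Python sketch are assumptions.

```python
# Hypothetical limits; the same numbers may appear as maxlength in the HTML,
# but only the server-side check below has any security value.
MAX_USERNAME_LENGTH = 32
MAX_PASSWORD_LENGTH = 128

def validate_login_fields(form):
    """Return a list of problems found in the submitted form fields."""
    errors = []
    username = form.get("username", "")
    password = form.get("password", "")
    if not 0 < len(username) <= MAX_USERNAME_LENGTH:
        errors.append("username length out of range")
    if not 0 < len(password) <= MAX_PASSWORD_LENGTH:
        errors.append("password length out of range")
    return errors
```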
<input name="masteraccess" type="hidden" value="N">
By changing the hidden value to Y, the user would have been logged in as an Administrator. Hidden form fields are used extensively in a variety of ways, and while the dangers are easy to understand, they are still frequently found to be vulnerable in the wild.
Instead of using hidden form fields, the application designer can simply use one session token to reference properties stored in a server-side cache. When an application needs to check a user property, it checks the session cookie against its session table and points to the user's data variables in the cache / database. This is by far the best way to architect a solution to this problem.
If the above technique of using a session variable instead of a hidden field cannot be implemented, a second approach is as follows.
The name/value pairs of the hidden fields in a form can be concatenated together into a single string. A secret key that never appears in the form is also appended to the string. This string is called the Outgoing Form Message. An MD5 digest or other one-way hash is generated for the Outgoing Form Message. This is called the Outgoing Form Digest and it is added to the form as an additional hidden field.
When the form is submitted, the incoming name/value pairs are again concatenated along with the secret key into an Incoming Form Message. An MD5 digest of the Incoming Form Message is computed. Then the Incoming Form Digest is compared to the Outgoing Form Digest (which is submitted along with the form) and if they do not match, then a hidden field has been altered. Note that for the digests to match, the name/value pairs in the Incoming and Outgoing Form Messages must be concatenated together in exactly the same order both times.
This same technique can be used to prevent tampering with parameters in a URL. An additional digest parameter can be added to the URL query string following the same technique described above.
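The digest scheme can be sketched in Python as follows. Two substitutions are made here and are not part of the description above: HMAC-SHA256 replaces a plain MD5 of the message with the key appended, and the pairs are sorted by name so the order is the same both times. The key and function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key; it never appears in the form or the URL.
SECRET_KEY = b"replace-with-a-random-server-side-key"

def form_digest(fields):
    """Compute a keyed digest over name/value pairs in a fixed order."""
    # Sorting by name guarantees the same order on the way out and back in.
    # Length prefixes keep ("a", "bc") and ("ab", "c") from colliding.
    message = "".join(
        f"{len(name)}:{name}{len(value)}:{value}"
        for name, value in sorted(fields.items())
    )
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()

def form_is_untampered(fields, submitted_digest):
    """Compare the submitted digest with one recomputed from the fields."""
    expected = form_digest(fields)
    return hmac.compare_digest(
        expected.encode("utf-8"), submitted_digest.encode("utf-8")
    )
```

The outgoing digest is added to the form as an extra hidden field, or to the query string as an extra parameter, and is excluded from `fields` when the digest is recomputed on return.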
URL Manipulation comes with all of the problems stated above about Hidden Form Fields, and creates some new problems as well.
HTML Forms may submit their results using one of two methods: GET or POST. If the method is GET, all form element names and their values will appear in the query string of the next URL the user sees. Tampering with hidden form fields is easy enough, but tampering with query strings is even easier. One need only look at the URL in the browser's address bar.
Take the following example: a web page allows the authenticated user to select one of his pre-populated accounts from a drop-down box and debit the account with a fixed unit amount. It's a common scenario. The user's choices are recorded when he presses the submit button. The page is actually storing the entries in form field values and submitting them using a form submit command. The command sends the following HTTP request.
A malicious user could construct his own account number and change the parameters as follows:
The new parameters would be sent to the application and be processed accordingly.
This seems remarkably obvious but has been the problem behind several well-publicized attacks, including one where hackers bought tickets from the US to Paris for $25 and flew to hold a hacking convention. Another well-known electronic invitation service allowed users to guess the account ID and log in as a specific user this way; a fun game for the terminally bored with voyeuristic tendencies.
Unfortunately, it isn't just HTML forms that present these problems. Almost all navigation done on the internet is through hyperlinks. When a user clicks on a hyperlink to navigate from one site to another, or within a single application, he is sending GET requests. Many of these requests will have a query string with parameters just like a form. And once again, a user can simply look in the "Address" window of his browser and change the parameter values.
Solving URL manipulation problems takes planning. Different techniques can be used in different situations. The best solution is to avoid putting parameters into a query string (or hidden form field).
When parameters need to be sent from a client to a server, they should be accompanied by a valid session token. The session token may also be a parameter, or a cookie. Session tokens have their own special security considerations described previously. In the example above, the application should not make changes to the account without first checking if the user associated with the session has permission to edit the account specified by the parameter "accountnumber". The script that processes a credit to an account cannot assume that access control decisions were made on previous application pages. Parameters should never be operated on unless the application can independently validate that they were intended for the current user and that the user is authorized to act on them.
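A per-request authorization check of this kind might look like the Python sketch below. The lookup tables, the exception type and the sample account numbers are all hypothetical; only the parameter name "accountnumber" comes from the example.

```python
# Hypothetical server-side data: which user owns each account, and which
# user each session token belongs to.
ACCOUNT_OWNERS = {"12345": "alice", "67890": "bob"}
SESSION_USERS = {"token-for-alice": "alice"}

class Forbidden(Exception):
    """Raised when the session user may not act on the account."""

def authorize_account_access(session_token, accountnumber):
    """Check ownership on every request, not only on earlier pages."""
    user = SESSION_USERS.get(session_token)
    if user is None or ACCOUNT_OWNERS.get(accountnumber) != user:
        raise Forbidden(f"access to account {accountnumber} denied")
    return user
```

The script that performs the credit would call this first and stop on `Forbidden`, whether or not earlier pages performed the same check.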
However, a second form of tampering is also evident in the example. Notice that the creditamount is increased from 1 to 999999999. Imagine that the user doesn't tamper with the accountnumber but only with the amount. He may be crediting his own account with a very large sum instead of $1. Clearly this is a parameter that should simply not be present in the URL.
There are two reasons why a parameter should not be in a URL (or in a form as a hidden field). The above example illustrates one reason: the parameter is one the user should not be able to set the value of. The second is that the parameter is one the user should not be able to see the value of. Passwords are a good example of the latter. Users should not even see their own passwords in a URL, because someone may be standing behind them and because browsers record URL histories. See Browser History Attack.
If a sensitive parameter cannot be removed from a URL, it must be cryptographically protected. Cryptographic protection can be implemented in one of two ways. The better method is to encrypt an entire query string (or all hidden form field values). This technique both prevents a user from setting the value and from seeing the value.
A second form of cryptographic protection is to add an additional parameter whose value is an MD5 digest of the URL query string (or hidden form fields). More details of this technique are described above in the section "HTML Form Field Manipulation". This method does not prevent a user from seeing a value, but it does prevent him from changing the value.