Tag Archives: xss

Properly Placing XSS Output Encoding

Cross-site scripting flaws, as well as other injection flaws, are pretty well understood. We know how they work and how to mitigate them. One of the key factors in mitigating these flaws is output encoding or escaping. For SQL, we do this by using parameters. For cross-site scripting, we use context-sensitive output encoding.

In this post, I don’t want to focus on the how of output encoding for cross-site scripting. Rather, I want to focus on when in the pipeline it should be done. Over the years I have had a lot of people ask if it is OK to encode the data before storing it in the database. I recently came across this insufficient solution again and thought it was a good time to address it. I will look at a few different cases that show why the solution is insufficient and then explain a better approach.

The Database Is Not Trusted

The database should not be considered a trusted resource. This is a common oversight in many organizations, most likely because the database is internal and lives in a protected portion of the production environment. While that is true, many sources have access to it. Even if it is just your application that uses the database today, there are still administrators and update scripts that most likely access it. We can’t rule out a rogue administrator, even if the chance of one is very slim.

We also may not know if other applications access our database. What if there is a mobile application that has access? We can’t guarantee that every source of data is going to properly encode the data before it gets sent to the database. This leaves us with a gap in coverage and a potential for cross-site scripting.
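To make the gap concrete, here is a minimal sketch using an in-memory SQLite table. The two writer functions and the table name are hypothetical; they stand in for a web application that encodes before storing and a second application that does not. The page trusts the database and outputs whatever it finds.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

def web_app_store(body):
    # Hypothetical web app: HTML-encodes BEFORE storing.
    conn.execute("INSERT INTO comments (body) VALUES (?)", (html.escape(body),))

def mobile_app_store(body):
    # Hypothetical second writer: stores the raw value.
    conn.execute("INSERT INTO comments (body) VALUES (?)", (body,))

def render_trusting_database():
    # Assumes every row was encoded on the way in, so it outputs rows as-is.
    rows = conn.execute("SELECT body FROM comments").fetchall()
    return "".join(f"<p>{body}</p>" for (body,) in rows)

payload = "<script>alert(1)</script>"
web_app_store(payload)
mobile_app_store(payload)
page = render_trusting_database()
```

The row written by the first function comes out encoded, but the row written by the second one reaches the page as a live script tag. The page has no way to tell the two apart.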

Input Validation May Not Be Enough

I always recommend that developers have good input validation. Of course, the term good is different for everyone. At the basic level, your input validation may limit a few characters. At the advanced level it may try to limit different types of attacks. In some cases, it is difficult to use input validation to eliminate all cross-site scripting payloads. Why? Due to the different contexts and the types of data that some forms accept, a payload may still squeak by. Are you protecting against all payloads across all the different contexts? How do you know what context the data will be used in?

In addition to potentially missing a payload, what about when another developer creates a function and forgets to include the input validation? Even more likely, what happens when a new application starts accessing your database and doesn’t perform the same input validation? It puts us back into the same scenario described in the section above: the database isn’t trusted.
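As an illustration of a payload squeaking by, here is a small sketch. The validator is a hypothetical "basic level" check that only rejects angle brackets, and the render function deliberately does no output encoding so that validation is the only defense.

```python
import re

# Hypothetical basic validator: rejects angle brackets and nothing else.
ANGLE_BRACKETS = re.compile(r"[<>]")

def is_valid(value):
    return ANGLE_BRACKETS.search(value) is None

def render_attribute_unencoded(value):
    # No output encoding here; this relies on input validation alone.
    return f'<input type="text" value="{value}">'

tag_payload = "<script>alert(1)</script>"
attr_payload = '" onmouseover="alert(1)'

tag_allowed = is_valid(tag_payload)     # the script tag is rejected
attr_allowed = is_valid(attr_payload)   # the attribute payload is accepted
markup = render_attribute_unencoded(attr_payload)
```

The second payload contains no angle brackets, so it passes validation, closes the value attribute and adds an event handler of its own. The validator was written with one context in mind and the data ended up in another.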

You Don’t Know the Context

When receiving and storing data, the chances are good that I don’t know where the data will be used. What is the context of the data when it is output? Is it going into a span tag, an attribute or even straight into JavaScript? Will it be used by a reporting engine? The context matters because it determines how we encode the data. Different contexts are concerned with different characters. Can I just throw data into the database with HTML encoding applied? If I do, I am transforming the data at a time when no transformation is required. Cross-site scripting doesn’t execute in my SQL column.
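Here is a rough sketch of how the same value needs different treatment in different contexts, using only the Python standard library. The function names are my own, and these are simplified encoders for illustration, not a replacement for a vetted encoding library.

```python
import html
import json
from urllib.parse import quote

def for_html_body(value):
    # Element content: &, < and > are the characters of concern.
    return html.escape(value, quote=False)

def for_html_attribute(value):
    # Quoted attribute value: the quote characters matter as well.
    return html.escape(value, quote=True)

def for_url_parameter(value):
    # Query string value: percent-encode everything that is not unreserved.
    return quote(value, safe="")

def for_script_block(value):
    # JavaScript string literal inside an inline script block (simplified):
    # json.dumps quotes the string, and "<" is escaped so the value cannot
    # spell out a closing script tag.
    return json.dumps(value).replace("<", "\\u003c")

raw = "Tom & Jerry's <b>"

# Encoding for HTML before storage, then using the value somewhere else:
premature = for_url_parameter(html.escape(raw))
```

With the raw value, each function produces something different: `Tom &amp; Jerry's &lt;b&gt;` for the body, `Tom &amp; Jerry&#x27;s &lt;b&gt;` for the attribute and `Tom%20%26%20Jerry%27s%20%3Cb%3E` for the URL. The `premature` value shows the cost of encoding too early: the HTML entities themselves get percent-encoded and the original text is no longer what the URL carries.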

A Better Approach

As you can see, the above techniques are useful, but they are insufficient for multiple reasons. I recommend performing the output encoding immediately before the data is actually used. By that, I mean encoding right before it is output to the client. This way it is very clear what the context is, and the appropriate encoding can be applied.

Don’t forget to perform input validation on your data, but remember it is typically not meant to stop all attack scenarios. Instead, it is there to help reduce them. We must be aware that other applications may access our data, and those applications may not follow the same procedures we do. Because of this, making our encoding decisions at the last moment provides the best coverage. Of course, this relies on the developers remembering to perform the output encoding.
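A minimal sketch of this approach, assuming an in-memory SQLite table and a page that places each value inside a paragraph element. The function and table names are hypothetical.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

def store_comment(body):
    # Store exactly what was received. The parameter handles the SQL side.
    conn.execute("INSERT INTO comments (body) VALUES (?)", (body,))

def render_comments():
    rows = conn.execute("SELECT body FROM comments").fetchall()
    # Encode here, at output time, where the context (element content)
    # is known. It does not matter which application wrote the row.
    return "".join(f"<p>{html.escape(body)}</p>" for (body,) in rows)

store_comment("<script>alert(1)</script>")
store_comment("Tom & Jerry")
page = render_comments()
```

The stored data stays in its original form, so a report or another application can apply whatever encoding its own context calls for.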

Amazon XSS: Thoughts and Takeaways

One of Amazon’s sites was recently found to be vulnerable to cross-site scripting, and Amazon was quick (2 days) to fix it. Cross-site scripting is a vulnerability that allows an attacker to control the output in the user’s browser. A more detailed look into cross-site scripting can be found on the OWASP site.


  • QA could have found this
  • Understand your input validation routines
  • Check to make sure the proper output encoding is in place in every location where user-supplied data is sent to the browser

Vulnerabilities like the one listed above are simple to detect. In fact, many can be detected by automated scanners. Unfortunately, we cannot rely on automated scanners to find every vulnerability. Automated scanning is a great first step in identifying flaws like cross-site scripting, but it is just as important for developers and QA analysts to be looking for these types of bugs. When we break it down, a cross-site scripting flaw is just a bug. It may be classified under “security”, but it is nonetheless a bug that affects the quality of the application.

We want to encourage developers and QA to start looking for these types of bugs to increase the quality of their applications. Quality is more than just whether the app works as expected. If the application has a bug that allows an attacker to send malicious code to another user of the application, that is still a quality issue.

If you are a developer, take a moment to think about what output you send to the client and whether you are properly encoding that data. It is not as simple as just encoding the less-than and greater-than characters. Context matters. Look for the delimiters and control characters that are relevant to where the output is going to determine the best course of action. It is also a good idea to standardize the delimiters you use for things like HTML attributes. Don’t use double quotes in some places, single quotes in others and then nothing in the rest. Pick one (double or single quotes) and stick to it everywhere.
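Here is a short sketch of why the attribute delimiter matters, using Python's `html.escape`. The payloads and the input tag are made up for illustration.

```python
import html

quote_payload = "x' onmouseover='alert(1)"
space_payload = "x onmouseover=alert(1)"

# Encoding only &, < and > (quote=False) inside a single-quoted attribute.
angle_only = html.escape(quote_payload, quote=False)
single_quoted_weak = f"<input value='{angle_only}'>"

# Same template, but the quote characters are encoded as well.
single_quoted_ok = f"<input value='{html.escape(quote_payload, quote=True)}'>"

# Unquoted attribute: a space ends the value, and html.escape leaves spaces alone.
unquoted = f"<input value={html.escape(space_payload, quote=True)}>"
```

In `single_quoted_weak` the payload closes the attribute and adds its own event handler, because the single quote was never a character the encoder cared about. In `unquoted` even the stricter encoding does not help, since the delimiter is whitespace. When every attribute uses the same quote character, there is only one delimiter to get right.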

If you are a QA analyst, understand what input is accepted by the application and where that input is later output. The first step is testing what data you can send to the server. Has any input validation been put in place? Input validation should limit the type and size of data in most fields. The next step is to verify that any special characters are encoded when they are returned to the browser. These are simple steps that can be performed by anyone. You could also start scripting these tests to make it easier in the future.
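A scripted check could look something like the sketch below. The probe string, the helper names and the two sample pages are all hypothetical. In a real test the response body would come from submitting the probe to the application and fetching the page that displays it.

```python
import html

# Hypothetical probe: a unique marker followed by the special characters.
PROBE = "qa7x<>\"'&"

def reflects_unencoded(response_body, probe=PROBE):
    # True if the probe comes back exactly as it was sent.
    return probe in response_body

def reflects_encoded(response_body, probe=PROBE):
    # True if the HTML-encoded form of the probe is present.
    return html.escape(probe, quote=True) in response_body

# Stand-ins for two pages returned by the application under test.
bad_page = f"<p>You searched for: {PROBE}</p>"
good_page = f"<p>You searched for: {html.escape(PROBE, quote=True)}</p>"

bad_result = (reflects_unencoded(bad_page), reflects_encoded(bad_page))
good_result = (reflects_unencoded(good_page), reflects_encoded(good_page))
```

A page that reflects the probe unencoded is worth a closer look. This check only covers HTML encoding, so values that land in JavaScript or URLs need a check that matches that context.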

It is our (dev, QA, BA, application owners) responsibility to create quality applications. Adding these types of checks does not add a lot of time to the cycle, and the more you do it, the fewer of these bugs you will start to see, which eases the pressure on your testing timelines. Bugs are everywhere, so be careful and test often.