Sanitizing untrusted input for HTML meta-characters is an important technique for preventing cross-site scripting attacks. Usually, this is done by escaping <, >, & and ". However, the context in which the sanitized value is used determines which characters need to be sanitized.

As a consequence, some programs only sanitize < and > since those are the most common dangerous characters. The lack of sanitization for " is problematic when an incompletely sanitized value is used as an HTML attribute value in a string that is later parsed as HTML.

Sanitize all relevant HTML meta-characters when constructing HTML dynamically, and pay special attention to where the sanitized value is used.

The following example code writes part of an HTTP request (which is controlled by the user) to an HTML attribute of the server response. The user-controlled value is, however, not sanitized for ". This leaves the website vulnerable to cross-site scripting since an attacker can use a string like " onclick="alert(42) to inject JavaScript code into the response.
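
A minimal sketch of this kind of flaw, written in Python for illustration. The function names strip_angle_brackets and render_user_div are hypothetical, and the string passed in stands in for a value taken from an HTTP request; no particular web framework is assumed.

```python
def strip_angle_brackets(value: str) -> str:
    """Incomplete sanitizer (hypothetical): removes < and > but leaves " untouched."""
    return value.replace("<", "").replace(">", "")


def render_user_div(user_id: str) -> str:
    """Places the partially sanitized value inside a double-quoted HTML attribute."""
    safe_id = strip_angle_brackets(user_id)  # BAD: " is not handled
    return f'<div data-id="{safe_id}">User</div>'


# Stand-in for a user-controlled request parameter.
payload = '" onclick="alert(42)'
vulnerable_html = render_user_div(payload)
print(vulnerable_html)
# prints: <div data-id="" onclick="alert(42)">User</div>
```

The " in the payload closes the data-id attribute early, so the rest of the input is parsed as a new onclick attribute.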

Sanitizing the user-controlled data for " helps prevent the vulnerability.
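
One way this could look, as a sketch in Python: the standard library function html.escape with quote=True escapes <, >, & and both quote characters. The function name render_user_div_safely is hypothetical, and the input string again stands in for a request parameter.

```python
import html


def render_user_div_safely(user_id: str) -> str:
    """Escapes HTML meta-characters, including ", before building the attribute."""
    safe_id = html.escape(user_id, quote=True)  # GOOD: " becomes &quot;
    return f'<div data-id="{safe_id}">User</div>'


# Same attack string as a stand-in for user-controlled input.
fixed_html = render_user_div_safely('" onclick="alert(42)')
print(fixed_html)
# prints: <div data-id="&quot; onclick=&quot;alert(42)">User</div>
```

With the quotes escaped, the whole input stays inside the data-id attribute value and no onclick attribute is created.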

  • OWASP: DOM based XSS Prevention Cheat Sheet.
  • OWASP: XSS (Cross Site Scripting) Prevention Cheat Sheet.
  • OWASP Types of Cross-Site.
  • Wikipedia: Cross-site scripting.