Preventing Contact Form Spam

Posted on Sep 9, 2012 in Code Chat

This is a problem which comes up time and time again. Every time you add a contact form to a website, you open yourself up to spam attacks. Solving it means striking the right balance between making life easy for the person filling in the form and protecting the person at the other end from spam.

Captcha

Adding a test that the user must complete before submitting the form is the most resilient approach. One of the most popular is reCAPTCHA, which uses imperfectly scanned books to test the user by presenting them with two scanned words. reCAPTCHA already knows what the first word should be, and the second is a word it is trying to identify from the scan. If enough users give the same answer for the second word, it is promoted to a known word.

Cleverly, this helps transcribe imperfect book scans as well as providing a robust check that the user is human.

However, this can be annoying for users and will reduce the chance of them completing the form. Only use it where security matters more than the volume of submissions you receive.

JavaScript

Spambots are automated scripts which crawl the web looking for forms to fill in. Since they usually work from server-side page scrapes rather than real browsers, JavaScript usually isn’t run before the form is submitted.

One solution is to have a field on the form which is filled in by JavaScript before the form is submitted. The downside is that the form will fail if JavaScript isn’t enabled, and it wouldn’t take much for a spammer to inspect the page manually and add the extra field to their spam.
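As a rough sketch of that idea in PHP, assuming a hidden field I have called js_check and an expected value of my own choosing (neither comes from any library), the first function below returns the empty field plus the script that fills it, and the second is the server-side check:

```php
<?php
// Hypothetical field name and expected value, chosen for this example only.
const JS_FIELD = 'js_check';
const JS_EXPECTED = 'filled-by-script';

// Returns the empty hidden field plus the inline script that fills it in.
function renderJsField(): string
{
    $name = htmlspecialchars(JS_FIELD, ENT_QUOTES);
    $value = json_encode(JS_EXPECTED);

    return '<input type="hidden" name="' . $name . '" id="' . $name . '" value="">'
        . '<script>document.getElementById("' . $name . '").value = ' . $value . ';</script>';
}

// True only when the submitted data carries the value the script should have written.
function passedJsCheck(array $post): bool
{
    return isset($post[JS_FIELD]) && $post[JS_FIELD] === JS_EXPECTED;
}
```

Echo renderJsField() inside the form, then call passedJsCheck($_POST) when the form is submitted and return a form error if it comes back false. A fixed value like this is easy to copy once someone has read the page source, so treat it as a filter for lazy bots only.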

You may want to combine client-side JavaScript form validation with spam checking. This provides a neater and quicker submission process, but unless you provide a non-JavaScript fallback you are harming accessibility and cutting off users who browse without JavaScript.

Cookies

In the same way that bots don’t execute JavaScript, they usually won’t carry site cookies around with them either. Using a cookie, set either by the server or by a client-side script, to make sure the user has visited the site before submitting the form could be quite a neat way of validating them. Once again though, some users may choose to turn off cookies in their browsers, and depending on how your validation script works, it may fail if the first page a user lands on is the contact page.
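A minimal sketch of the cookie check in PHP might look like this. The cookie name seen_site and both function names are my own inventions for illustration:

```php
<?php
// Hypothetical cookie name, chosen for this example only.
const VISIT_COOKIE = 'seen_site';

// Call on every page, before any output, so the browser is handed the cookie.
function markVisitor(): void
{
    // Expires in one day and applies to the whole site.
    setcookie(VISIT_COOKIE, '1', time() + 86400, '/');
}

// True when the request sent the cookie back; pass $_COOKIE in a real request.
function hasVisitedBefore(array $cookies): bool
{
    return isset($cookies[VISIT_COOKIE]) && $cookies[VISIT_COOKIE] === '1';
}
```

If markVisitor() also runs when the contact page itself is rendered, a real browser will send the cookie back with the form submission even when the contact page was the first page it landed on. A script posting straight to the form handler without keeping cookies will fail hasVisitedBefore($_COOKIE).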

Time Dependence

If a bot submits your form, it is likely that it will either load your page and submit the form immediately, or scrape the contents, store them somewhere and start sending submissions at some later date.

By creating an encoded token based on the time the page was created and sending it through with the form, you can get a good idea of whether the user is real or not. For example, I might create a field with the value of

time()/date('d')

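This is the Unix timestamp divided by the day of the month, which obscures the value and makes it harder for bots to spot a pattern. If you decode it by multiplying by the current day of the month, bear in mind that a submission which straddles midnight will decode to the wrong time.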

When the form is submitted, check that this field is passed through, and return a form error if it is missing, if the form was submitted less than, say, 5 seconds after the page loaded, or if more than 30 minutes have passed.
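Here is one way the timing check could be sketched in PHP. Rather than dividing by the day of the month, this version signs the timestamp with hash_hmac so it cannot be altered without the check noticing. That is my substitution, not the only way to do it. The secret and the function names are assumptions made up for the example, and the 5 second and 30 minute limits are the ones suggested above.

```php
<?php
// Hypothetical secret; in practice load it from configuration kept off the page.
const FORM_SECRET = 'change-me';

// Builds "timestamp.signature" for a hidden field when the page is rendered.
function makeTimeToken(int $now): string
{
    return $now . '.' . hash_hmac('sha256', (string) $now, FORM_SECRET);
}

// Accepts the token only if it is untampered and between 5 seconds and 30 minutes old.
function isTimeTokenValid(string $token, int $now, int $minAge = 5, int $maxAge = 1800): bool
{
    $parts = explode('.', $token, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return false; // Missing or malformed field.
    }

    [$issued, $signature] = $parts;
    $expected = hash_hmac('sha256', $issued, FORM_SECRET);
    if (!hash_equals($expected, $signature)) {
        return false; // Timestamp was changed after the page was rendered.
    }

    $age = $now - (int) $issued;

    return $age >= $minAge && $age <= $maxAge;
}
```

When rendering the page, put makeTimeToken(time()) in a hidden field. On submission, pass that field and time() to isTimeTokenValid() and return a form error if it comes back false.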

Conclusion

The best solution is probably to use a combination of the above techniques, depending on how high-profile the form is going to be and how much security you require.

Relying on the same technique for all your sites introduces the inherent danger that once one form is compromised, it’s likely that they all will be. Keeping the bots on their toes is the only way of staying ahead of them.
