Permissive user input validation

This entry was posted on Sunday, 27 January, 2013

A ux.stackexchange question prompted me to consider how one might implement a more permissive type of input validation. It’s not rare for a form to punish the user if they add an extra space before typing in a date, or accidentally use a comma instead of a period when typing in an IP address. After all, we employ strict validation to keep the data correct.

Garbage in, garbage out. It rings true, but taken too literally it pushes us toward strict validation and a no-exceptions policy for anyone who strays from the format. We punish a user for typing ‘12’ instead of the fully qualified ‘2012’. Why? Either it’s our thoughtlessness or it’s the very unlikely (depending on context) possibility that the user did in fact mean the year ‘1912’ or ‘1812’ or ‘1012’.

If we start down the road of permissive input validation then we also need to explore input correction. We can’t allow a rogue comma to slip in and not correct it. It’s probably best to correct it promptly, though not while the user is still typing (on blur, for instance), so that the data actually stored conforms to the correct format.
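To make the correct-on-blur idea concrete, here is a minimal sketch for the IP address case. The names `correctIpAddress` and `attachCorrection` are hypothetical, and treating every comma as a mistyped period is my assumption about what the user meant.

```javascript
// Hypothetical helper: tidy a leniently typed IPv4 address.
// Returns the corrected string, or null if it cannot be rescued.
function correctIpAddress(raw) {
  // Be lenient: drop surrounding whitespace, read commas as periods.
  var cleaned = String(raw).trim().replace(/,/g, '.');
  // Stay rigid about the shape: four groups of one to three digits.
  var match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(cleaned);
  if (!match) {
    return null;
  }
  var octets = match.slice(1).map(Number);
  var inRange = octets.every(function (octet) {
    return octet <= 255;
  });
  return inRange ? octets.join('.') : null;
}

// Hypothetical wiring: apply a correction when the field loses focus,
// so nothing shifts under the user while they are still typing.
function attachCorrection(input, correct) {
  input.addEventListener('blur', function () {
    var fixed = correct(input.value);
    if (fixed !== null) {
      input.value = fixed; // store the corrected form
    }
  });
}
```

In a page this could be hooked up with something like `attachCorrection(someInput, correctIpAddress)`. When the value cannot be rescued the field is left alone, so ordinary validation can still flag it.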

William Hudson executed a date survey in 2009 to discover all the various ways American users like to enter dates. The results show that users use a variety of formats. It makes perfect sense to accept all these variants and let the computer figure out what is what.

For the specific problem of entering dates, I would like to recommend Date.js, because it can successfully parse most of those variants. However, there is a big caveat when it comes to dates, especially on international forms. The American style of entering a date, MM/DD/YY, is technically impossible to differentiate from the other standard of DD/MM/YY, unless the DD portion happens to be above 12. For this reason it is probably best to pick the convention your local users expect and cater to them as well as you can.
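The ambiguity is easy to demonstrate. The sketch below is not part of Date.js; `classifyDayMonthOrder` is a hypothetical helper that only reports whether a slash-separated date can be read month-first, day-first, or both.

```javascript
// Hypothetical helper: which readings of N/N/YY (or N/N/YYYY) are possible?
// Returns 'month-first', 'day-first', 'ambiguous' or 'invalid'.
function classifyDayMonthOrder(input) {
  var pattern = /^\s*(\d{1,2})\s*\/\s*(\d{1,2})\s*\/\s*(\d{2}|\d{4})\s*$/;
  var match = pattern.exec(String(input));
  if (!match) {
    return 'invalid';
  }
  var first = Number(match[1]);
  var second = Number(match[2]);
  // A month must be 1..12; a day is only checked against 1..31 here.
  var couldBeMonthFirst = first >= 1 && first <= 12 && second >= 1 && second <= 31;
  var couldBeDayFirst = first >= 1 && first <= 31 && second >= 1 && second <= 12;
  if (couldBeMonthFirst && couldBeDayFirst) {
    return 'ambiguous'; // both numbers are 12 or below
  }
  if (couldBeMonthFirst) {
    return 'month-first';
  }
  if (couldBeDayFirst) {
    return 'day-first';
  }
  return 'invalid';
}
```

A form could accept the unambiguous cases silently and fall back to the locale’s convention only when the answer is `'ambiguous'`.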

An alternative is to retain rigidity in your validation but allow for some minor mistakes. For example, insist upon the ISO format of YYYY-MM-DD but don’t make a fuss if the user separates with a slash or a space (or heck, anything) instead of a dash.
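As a rough sketch of that alternative, the function below keeps the YYYY, MM, DD order and digit counts mandatory but accepts any run of non-digit characters as a separator. `normalizeIsoDate` and `daysInMonth` are hypothetical names, and the calendar check is my own addition.

```javascript
// Hypothetical helper: number of days in a given month (1..12).
function daysInMonth(year, month) {
  var isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  var lengths = [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
  return lengths[month - 1];
}

// Hypothetical helper: accept YYYY?MM?DD with any non-digit separator
// and return the canonical YYYY-MM-DD string, or null if it is not a date.
function normalizeIsoDate(input) {
  // Rigid about order and digit counts, lenient about what sits between.
  var match = /^\s*(\d{4})\D+(\d{2})\D+(\d{2})\s*$/.exec(String(input));
  if (!match) {
    return null;
  }
  var year = Number(match[1]);
  var month = Number(match[2]);
  var day = Number(match[3]);
  if (month < 1 || month > 12) {
    return null;
  }
  if (day < 1 || day > daysInMonth(year, month)) {
    return null;
  }
  // Store the strict form regardless of what the user typed.
  return match[1] + '-' + match[2] + '-' + match[3];
}
```

With this, `2013/01/27`, `2013 01 27` and `2013.01.27` all end up stored as `2013-01-27`, while something like `2013-02-30` is still rejected.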

My point is: Maybe formal validation with permissive aspects mixed in gives us the best of both worlds. We don’t punish the user for minor mistakes, and we don’t end up with ambiguous data.

In an attempt to practice this technique of mixing rigidity with leniency, I created vic.js.

Currently validation in JavaScript can be quite an ugly affair, plagued with remnants of DHTML and overly invasive input masks. It’s not uncommon to see stuff like this:

    someInput.onkeyup = function() {
      if (!this.value.match(/some rigid regex/)) {
        alert('Enter the right value, you fool');
      }
    };

Typically the rules are strict, the characters non-negotiable, the regular expression unyielding, and the presented invalidation UI annoying.

vic.js (a.k.a. Vic, VIC) allows you to define a lenient regular expression, and it expects you to extract your important data from the captured groups.

Vic’s signature goes something like this:

vic( LENIENT_PATTERN_WITH_CAPTURED_GROUPS, PER_GROUP_PROCESSOR, POST_PROCESSOR );

The simple example would be a ‘year’ field:

    var yearVic = vic(
      /^\s*(\d{1,4})\s*$/,
      function(year) {
        // Let's assume anything between 14 and 99 is from the 1900s:
        return vic.pad(year > 13 && year <= 99 ? '1900' : '2000')(year);
      },
      Number // cast full output to a Number
    );
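For anyone reading without vic.js to hand, the same year logic can be sketched in plain JavaScript. This is not the library’s code: `lenientYear` is a hypothetical name, and the padding rule (overwrite the tail of a century template with the typed digits) is my assumption about what `vic.pad` does.

```javascript
// Hypothetical stand-in for the year example; not vic.js itself.
// Returns a full year as a Number, or null if nothing usable was typed.
function lenientYear(input) {
  // Lenient pattern: optional whitespace around one to four digits.
  var match = /^\s*(\d{1,4})\s*$/.exec(String(input));
  if (!match) {
    return null;
  }
  var digits = match[1];
  var value = Number(digits);
  // Same pivot as the example: 14..99 is read as the 1900s,
  // everything else is padded against 2000.
  var template = value > 13 && value <= 99 ? '1900' : '2000';
  // Assumed padding rule: keep the template's leading characters and
  // replace its tail with the digits the user typed.
  var full = template.slice(0, template.length - digits.length) + digits;
  return Number(full); // cast the full output to a Number
}
```

Under these assumptions a user typing `12` gets 2012, `85` becomes 1985, and a fully typed `1812` passes through untouched.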

Read Full Post at James Padolsey