Why is data validation important?
Data validation ensures that incoming data is accurate, complete, and correct ("valid"). Incorrect or incomplete ("invalid") data could result in a flawed analysis or processing of that data. If some data is invalid, all analysis performed on the entire data set may be invalid.
- If a user makes a mistake when providing input, the software application should detect the error and ask the user to correct the mistake.
- If a user intentionally provides bad input, the fraudulent data should be rejected and never passed to the software's data processing routines.
- Data validation can also detect when data is unintentionally corrupted during storage or transit.
Types of data validation
The following are general types of data validation commonly performed by computer applications.
- Format validation — Ensures the data is submitted in the correct format. For example, an application may ask the user to input a date in the format MM-DD-YYYY (two-digit month, then a dash, two-digit day, another dash, and four-digit year). The user's input data should be checked to ensure the number of characters and the placement of the dashes match that format.
- Data type validation — Ensures the input is of the correct data type. For example, in the date example above, after verifying the format, the characters should be checked to make sure they're numeric (only numbers, not letters, whitespace, or other characters).
- Range validation — Ensures the values fall within a range limit. For example, in the date example above, all numeric values should be greater than zero. The month value should always be less than 13 (there are only 12 months in a year). The day value should always be less than 32, and less than 31 for the 30-day months (April, June, September, and November). If the month value is 2 (for "February"), the day should never be greater than 29, and it should not be greater than 28 unless the year is a leap year. In the Gregorian calendar, a leap year is a year divisible by 4, except that century years must also be divisible by 400 (2000 was a leap year, 1900 was not).
- Consistency validation — Ensures the data is logically consistent with what was requested and with other related data. For example, if an application asks for a date of birth, it should logically be a date in the past.
- Uniqueness validation — Ensures the data is unique, if necessary. For example, if you ask the user to choose a username, it should be unique, to avoid conflict with any existing usernames.
- Code validation — Ensures any encoded data is valid according to the code specification. This form of validation can apply to any coding scheme, regardless of whether it's simple or highly complex. If an application asks the user to input a postal code, the input should be compared to a lookup table of real-world postal codes to ensure validity. If an application accepts XML (extensible markup language) data as input, the application should validate the input according to the corresponding XML schema.
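The format, data type, range, and consistency checks described above can be sketched in Python for the MM-DD-YYYY date-of-birth example. This is a minimal illustration, not a definitive implementation: the function names (validate_birth_date, days_in_month, is_leap_year), the error messages, and the choice to return a list of errors are all assumptions made for this example. Uniqueness and code validation are left out because they depend on an external data source, such as a user database or a postal code table.

```python
import re
from datetime import date
from typing import List, Optional

# Format + data type: exactly two digits, dash, two digits, dash, four digits.
DATE_FORMAT = re.compile(r"[0-9]{2}-[0-9]{2}-[0-9]{4}")


def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, but century years need 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_month(month: int, year: int) -> int:
    """Number of days in the given month (1-12) of the given year."""
    if month == 2:
        return 29 if is_leap_year(year) else 28
    return 30 if month in (4, 6, 9, 11) else 31


def validate_birth_date(text: str, today: Optional[date] = None) -> List[str]:
    """Return a list of error messages; an empty list means the input is valid.

    `today` is a hypothetical parameter that lets callers (and tests)
    fix the reference date instead of relying on the system clock.
    """
    today = today or date.today()

    # Format and data type validation in a single pattern match.
    if not DATE_FORMAT.fullmatch(text):
        return ["expected MM-DD-YYYY: digits separated by dashes"]

    month, day, year = (int(part) for part in text.split("-"))
    errors = []

    # Range validation.
    if year < 1:
        errors.append("year must be greater than zero")
    if not 1 <= month <= 12:
        errors.append("month must be between 1 and 12")
    elif not 1 <= day <= days_in_month(month, year):
        errors.append("day is out of range for that month")

    # Consistency validation: a date of birth cannot be in the future.
    if not errors and date(year, month, day) > today:
        errors.append("date of birth must not be in the future")

    return errors
```

Calling validate_birth_date("02-29-2023") returns a one-item list reporting the out-of-range day, because 2023 is not a leap year. Returning messages instead of raising an exception makes it easy for an application to show the user what to correct, as described at the start of this page.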