Incorrect Behavior Order: Validate Before Canonicalize
Weakness ID: 180 (Weakness Base)Status: Draft
+ Description

Description Summary

The software validates input before it is canonicalized, which prevents the software from detecting data that becomes invalid after the canonicalization step.

Extended Description

This can be used by an attacker to bypass the validation and launch attacks that expose weaknesses that would otherwise be prevented, such as injection.

+ Time of Introduction
  • Implementation
+ Applicable Platforms

Languages

All

+ Demonstrative Examples

Example 1

The following code attempts to validate a given input path by checking it against a whitelist and then return the canonical path. In this specific case, the path is considered valid if it starts with the string "/safe_dir/".

(Bad Code)
Example Language: Java 
String path = getInputPath();
if (path.startsWith("/safe_dir/"))
{
File f = new File(path);
return f.getCanonicalPath();
}

The problem with the above code is that the validation step occurs before canonicalization occurs. An attacker could provide an input path of "/safe_dir/../" that would pass the validation step. However, the canonicalization process sees the double dot as a traversal to the parent directory and hence when canonicized the path would become just "/".

To avoid this problem, validation should occur after canonicalization takes place. In this case canonicalization occurs during the initialization of the File object. The code below fixes the issue.

(Good Code)
Example Language: Java 
String path = getInputPath();
File f = new File(path);
if (f.getCanonicalPath().startsWith("/safe_dir/"))
{
return f.getCanonicalPath();
}


+ Observed Examples
ReferenceDescription
CVE-2002-0433
CVE-2003-0332
CVE-2002-0802
CVE-2000-0191Overlaps "fakechild/../realchild"
CVE-2004-2363Product checks URI for "<" and other literal characters, but does it before hex decoding the URI, so "%3E" and other sequences are allowed.
+ Potential Mitigations

Inputs should be decoded and canonicalized to the application's current internal representation before being validated. Make sure that your application does not decode the same input twice. Such errors could be used to bypass whitelist schemes by introducing dangerous inputs after they have been checked.

+ Relationships
NatureTypeIDNameView(s) this relationship pertains toView(s)
ChildOfCategoryCategory171Cleansing, Canonicalization, and Comparison Errors
Development Concepts (primary)699
ChildOfWeakness BaseWeakness Base179Incorrect Behavior Order: Early Validation
Research Concepts (primary)1000
ChildOfCategoryCategory722OWASP Top Ten 2004 Category A1 - Unvalidated Input
Weaknesses in OWASP Top Ten (2004) (primary)711
+ Relationship Notes

This overlaps other categories.

+ Functional Areas
  • Non-specific
+ Taxonomy Mappings
Mapped Taxonomy NameNode IDFitMapped Node Name
PLOVERValidate-Before-Canonicalize
OWASP Top Ten 2004A1CWE More SpecificUnvalidated Input
+ Related Attack Patterns
CAPEC-IDAttack Pattern Name
(CAPEC Version: 1.4)
3Using Leading 'Ghost' Character Sequences to Bypass Input Filters
4Using Alternative IP Address Encodings
78Using Escaped Slashes in Alternate Encoding
79Using Slashes in Alternate Encoding
71Using Unicode Encoding to Bypass Validation Logic
80Using UTF-8 Encoding to Bypass Validation Logic
+ Content History
Submissions
Submission DateSubmitterOrganizationSource
PLOVERExternally Mined
Modifications
Modification DateModifierOrganizationSource
2008-07-01Eric DalciCigitalExternal
updated Potential Mitigations, Time of Introduction
2008-08-15VeracodeExternal
Suggested OWASP Top Ten 2004 mapping
2008-09-08CWE Content TeamMITREInternal
updated Relationships, Other Notes, Taxonomy Mappings, Type
2008-10-14CWE Content TeamMITREInternal
updated Description
2009-05-27CWE Content TeamMITREInternal
updated Other Notes, Relationship Notes
Previous Entry Names
Change DatePrevious Entry Name
2008-04-11Validate-Before-Canonicalize