CWE 138

Improper Sanitization of Special Elements

Weakness ID: 138 (Weakness Class)

Status: Draft

Description

Description Summary

The software receives input from an upstream component, but it does not sanitize or incorrectly sanitizes special elements that could be interpreted as control elements or syntactic markers when they are sent to a downstream component.

Extended Description

Most languages and protocols have their own special elements, such as characters and reserved words, and these elements can carry control implications. If software does not prevent external parties from controlling or influencing which special elements appear in its data, the control flow of the program may be altered from what was intended. For example, the command shells on both Unix and Windows interpret the symbol < ("less than") as meaning "read input from a file".
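The effect of the < symbol can be illustrated without running any command. The following is a minimal sketch, assuming a POSIX-style shell; the helper name and the sample "filename" are hypothetical. It uses Python's shlex module to approximate how a shell splits a command line into words, first with naive string concatenation and then with the untrusted value quoted.

```python
import shlex


def words_seen_by_shell(command_line):
    # shlex.split approximates POSIX shell word splitting.
    # It only tokenizes the string; nothing is executed.
    return shlex.split(command_line)


# Hypothetical attacker-supplied value that was expected to be a filename.
untrusted = "data.txt < /etc/passwd"

# Naive concatenation: "<" arrives as a separate word, which a real shell
# would treat as an input-redirection operator.
naive_words = words_seen_by_shell("sort " + untrusted)

# Quoting the value keeps it as one literal argument.
quoted_words = words_seen_by_shell("sort " + shlex.quote(untrusted))

print(naive_words)
print(quoted_words)
```

In the naive case the special element is visible to the downstream component as syntax; in the quoted case it is only data inside a single argument.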

Time of Introduction

Implementation

Applicable Platforms

Languages

Language-independent

Observed Examples

Reference	Description
CVE-2001-0677	Read arbitrary files from mail client by providing a special MIME header that is internally used to store pathnames for attachments.
CVE-2000-0703	Setuid program does not cleanse special escape sequence before sending data to a mail program, causing the mail program to process those sequences.
CVE-2003-0020	Multi-channel issue. Terminal escape sequences not filtered from log files.
CVE-2003-0083	Multi-channel issue. Terminal escape sequences not filtered from log files.
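
The log-file examples above involve terminal escape sequences reaching a log viewer unchanged. As a rough sketch only (the function name is hypothetical and this is not the fix applied in any of the listed CVEs), a logging layer could replace non-printable characters with visible escapes before writing untrusted data:

```python
def neutralize_for_log(text):
    # Replace non-printable characters (ESC, newline, NUL, ...) with visible
    # backslash escapes. The backslash itself is also escaped so that the
    # result stays unambiguous.
    return "".join(
        ch if ch.isprintable() and ch != "\\"
        else ch.encode("unicode_escape").decode("ascii")
        for ch in text
    )


# "\x1b[2J" is the ANSI sequence that clears the screen on many terminals.
print(neutralize_for_log("user=guest\x1b[2J"))
```

A terminal displaying the neutralized line shows the literal text \x1b[2J instead of acting on it.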

Potential Mitigations

Phase: Implementation

Developers should anticipate that special elements (e.g. delimiters, symbols) will be injected into the input vectors of their software system. One defense is to create a whitelist (e.g. a regular expression) that defines valid input according to the requirements specification, and to strictly filter any input that does not match the whitelist. Properly encode your output, and quote any elements that have special meaning to the component with which you are communicating.
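
The two defenses described above can be sketched as follows. This is an illustrative example, not a complete solution: the username rule, the function names, and the choice of HTML as the downstream format are all assumptions made for the sketch.

```python
import html
import re

# Assumed requirement: usernames are 1 to 32 ASCII letters, digits, or "_".
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")


def is_valid_username(value):
    # fullmatch requires the whole string to match, so a trailing newline
    # or any other extra character causes rejection.
    return USERNAME_PATTERN.fullmatch(value) is not None


def render_greeting(display_name):
    # Output encoding for an HTML consumer: <, >, &, and quotes are
    # converted to character references before being embedded in markup.
    return "<p>Hello, " + html.escape(display_name) + "</p>"


print(is_valid_username("alice_01"))
print(is_valid_username("alice; rm -rf /"))
print(render_greeting("<b>Bob</b>"))
```

The whitelist decides what is accepted at the input boundary, while the encoding step is chosen according to the component that will receive the output.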

Phases: Architecture and Design; Implementation

Assume all input is malicious. Use a standard input validation mechanism to validate all input for length, type, syntax, and business rules before accepting the data to be displayed or stored. Use an "accept known good" validation strategy.
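
As a sketch of what checking length, type, syntax, and business rules might look like for one field, consider a quantity submitted as text. The field, its limits, and the function name are hypothetical.

```python
import re

MAX_FIELD_LENGTH = 4                 # assumed length limit for the raw field
DIGITS_ONLY = re.compile(r"[0-9]+")  # ASCII digits only, no sign or spaces


def parse_quantity(raw):
    # Type: only text is accepted at this boundary.
    if not isinstance(raw, str):
        raise TypeError("quantity must be a string")
    # Length: refuse oversized input before any further processing.
    if len(raw) > MAX_FIELD_LENGTH:
        raise ValueError("quantity is too long")
    # Syntax: accept known good characters only.
    if DIGITS_ONLY.fullmatch(raw) is None:
        raise ValueError("quantity must consist of digits")
    value = int(raw)
    # Business rule (assumed): an order holds between 1 and 100 items.
    if not 1 <= value <= 100:
        raise ValueError("quantity is out of range")
    return value


print(parse_quantity("42"))
```

Input such as "1;2", "0", or "12345" raises ValueError instead of being passed on for display or storage.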

Phase: Implementation

Use and specify an appropriate output encoding to ensure that the special elements are well-defined. A normal byte sequence in one encoding could be a special element in another.
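
The following sketch shows how the same bytes can be harmless under one encoding and contain special elements under another. It uses UTF-7 as the second encoding; the variable names are illustrative.

```python
# The raw bytes contain no "<" or ">" byte at all.
raw = b"+ADw-script+AD4-"
has_angle_bracket_bytes = b"<" in raw or b">" in raw

# Interpreted as ASCII, the data is an odd but harmless string.
as_ascii = raw.decode("ascii")

# Interpreted as UTF-7, the same bytes decode to markup.
as_utf7 = raw.decode("utf-7")

print(has_angle_bracket_bytes)
print(as_ascii)
print(as_utf7)
```

If the producer does not declare the encoding, a consumer that guesses UTF-7 sees special elements that a byte-level check under an ASCII assumption never saw.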

Phase: Implementation

Do not rely exclusively on blacklist validation to detect malicious input or to encode output. There are too many ways to encode a character, so a blacklist is likely to miss some variants.
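
A small sketch of why this happens: the hypothetical blacklist below rejects literal angle brackets, yet several encoded spellings pass the check and still decode to the same characters once a later component interprets them.

```python
import html
from urllib.parse import unquote


def passes_blacklist(value):
    # Naive blacklist: reject only the literal characters "<" and ">".
    return "<" not in value and ">" not in value


# Four spellings of the same payload: URL encoding, a named HTML entity,
# a decimal character reference, and a hexadecimal character reference.
variants = [
    "%3Cscript%3E",
    "&lt;script&gt;",
    "&#60;script&#62;",
    "&#x3c;script&#x3e;",
]

all_pass = all(passes_blacklist(v) for v in variants)

# What a downstream component sees after URL and HTML decoding.
decoded = [html.unescape(unquote(v)) for v in variants]

print(all_pass)
print(decoded)
```

Every variant is accepted by the blacklist, and every one decodes to the blocked string.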

Phase: Implementation

Inputs should be decoded and canonicalized to the application's current internal representation before being validated. Make sure that your application does not decode the same input twice. Such errors could be used to bypass whitelist schemes by introducing dangerous inputs after they have been checked.
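
The double-decoding problem can be sketched with URL encoding. The path check and both handler functions are hypothetical; the point is only the ordering of decoding and validation.

```python
from urllib.parse import unquote


def is_safe_relative_path(path):
    # Accept only relative paths that contain no ".." component.
    return not path.startswith("/") and ".." not in path.split("/")


def handle_request_unsafe(raw):
    once = unquote(raw)
    if not is_safe_relative_path(once):
        return None
    # BUG: decoding again after validation reintroduces special elements.
    return unquote(once)


def handle_request(raw):
    # Decode exactly once, validate the canonical form, then use that form.
    canonical = unquote(raw)
    if not is_safe_relative_path(canonical):
        return None
    return canonical


attack = "%252e%252e/secret.txt"   # "%25" is the encoding of "%"
print(handle_request_unsafe(attack))
print(handle_request(attack))
```

The unsafe handler validates "%2e%2e/secret.txt" and then returns "../secret.txt". The second handler returns the validated string unchanged, so the percent signs remain literal characters.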

Weakness Ordinalities

Ordinality	Description
Primary	(where the weakness exists independent of other weaknesses)

Relationships

Nature	Type	ID	Name	View(s) this relationship pertains to
ChildOf	Weakness Class	74	Failure to Sanitize Data into a Different Plane ('Injection')	Development Concepts (primary) 699
ChildOf	Category	137	Representation Errors	Development Concepts 699
ChildOf	Weakness Class	707	Improper Enforcement of Message or Data Structure	Research Concepts (primary) 1000
ParentOf	Weakness Base	140	Failure to Sanitize Delimiters	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	147	Improper Sanitization of Input Terminators	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	148	Failure to Sanitize Input Leaders	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	149	Failure to Sanitize Quoting Syntax	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	150	Failure to Sanitize Escape, Meta, or Control Sequences	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	151	Improper Sanitization of Comment Delimiters	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	152	Improper Sanitization of Macro Symbols	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	153	Improper Sanitization of Substitution Characters	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	154	Improper Sanitization of Variable Name Delimiters	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	155	Improper Sanitization of Wildcards or Matching Symbols	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	156	Improper Sanitization of Whitespace	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	157	Failure to Sanitize Paired Delimiters	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Variant	158	Failure to Sanitize Null Byte or NUL Character	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Weakness Class	159	Failure to Sanitize Special Element	Development Concepts (primary) 699; Research Concepts (primary) 1000
ParentOf	Category	169	Technology-Specific Special Elements	Development Concepts (primary) 699
ParentOf	Weakness Base	464	Addition of Data Structure Sentinel	Research Concepts (primary) 1000
ParentOf	Weakness Class	790	Improper Filtering of Special Elements	Research Concepts (primary) 1000

Relationship Notes

This weakness can be related to interpretation conflicts or interaction errors in intermediaries (such as proxies or application firewalls) when the intermediary's model of an endpoint does not account for protocol-specific special elements.

See this entry's children for different types of special elements that have been observed at one point or another. However, it can be difficult to find suitable CVE examples. In an attempt to be complete, CWE includes some types that do not have any associated observed example.

Research Gaps

This weakness is probably under-studied for proprietary or custom formats. It is likely that these issues are fairly common in applications that use their own custom format for configuration files, logs, meta-data, messaging, etc. They would only be found by accident or with a focused effort based on an understanding of the format.

Taxonomy Mappings

Mapped Taxonomy Name	Node ID	Fit	Mapped Node Name
PLOVER			Special Elements (Characters or Reserved Words)
PLOVER			Custom Special Character Injection

Related Attack Patterns

CAPEC-ID	Attack Pattern Name	(CAPEC Version: 1.4)
15	Command Delimiters

Content History

Submissions
Submission Date	Submitter	Organization	Source
	PLOVER		Externally Mined

Modifications
Modification Date	Modifier	Organization	Source
2008-07-01	Eric Dalci	Cigital	External
2008-07-01	updated Description, Potential Mitigations, Time of Introduction
2008-09-08	CWE Content Team	MITRE	Internal
2008-09-08	updated Description, Relationships, Other Notes, Taxonomy Mappings
2009-03-10	CWE Content Team	MITRE	Internal
2009-03-10	updated Description, Name
2009-07-27	CWE Content Team	MITRE	Internal
2009-07-27	updated Applicable Platforms, Description, Observed Examples, Other Notes, Potential Mitigations, Relationship Notes, Relationships, Research Gaps, Taxonomy Mappings, Weakness Ordinalities
2009-12-28	CWE Content Team	MITRE	Internal
2009-12-28	updated Relationships
Previous Entry Names
Change Date	Previous Entry Name
2008-04-11	Special Elements (Characters or Reserved Words)
2009-03-10	Failure to Sanitize Special Elements
