Uncontrolled Format String |
Weakness ID: 134 (Weakness Base) | Status: Draft |
Description Summary
Languages
C: (Often)
C++: (Often)
Perl: (Rarely)
Languages that support format strings
The programmer rarely intends for a format string to be user-controlled at all. This weakness is frequently introduced in code that constructs log messsages, where a constant format string is omitted. |
In cases such as localization and internationalization, the language-specific message repositories could be an avenue for exploitation, but the format string issue would be resultant, since attacker control of those repositories would also allow modification of message length, format, and content. |
Scope | Effect |
---|---|
Confidentiality | Format string problems allow for information disclosure which can severely simplify exploitation of the program. |
Access Control | Format string problems can result in the execution of arbitrary code. |
Automated Static Analysis This weakness can often be detected using automated static analysis tools. Many modern tools use data flow analysis or constraint-based techniques to minimize the number of false positives. |
Black Box Since format strings often occur in rarely-occurring erroneous conditions (e.g. for error message logging), they can be difficult to detect using black box methods. It is highly likely that many latent issues exist in executables that do not have associated source code (or equivalent source. Effectiveness: Limited |
Example 1
The following example is exploitable, due to the printf() call in the printWrapper() function. Note: The stack buffer was added to make exploitation more simple.
Example 2
The following code copies a command line argument into a buffer using snprintf().
This code allows an attacker to view the contents of the stack and write to the stack using a command line argument containing a sequence of formatting directives. The attacker can read from the stack by providing more formatting directives, such as %x, than the function takes as arguments to be formatted. (In this example, the function takes no arguments to be formatted.) By using the %n formatting directive, the attacker can write to the stack, causing snprintf() to write the number of bytes output thus far to the specified argument (rather than reading a value from the argument, which is the intended behavior). A sophisticated version of this attack will use four staggered writes to completely control the value of a pointer on the stack.
Example 3
Certain implementations make more advanced attacks even easier by providing format directives that control the location in memory to read from or write to. An example of these directives is shown in the following code, written for glibc:
This code produces the following output: 5 9 5 5 It is also possible to use half-writes (%hn) to accurately control arbitrary DWORDS in memory, which greatly reduces the complexity needed to execute an attack that would otherwise require four staggered writes, such as the one mentioned in the first example.
Reference | Description |
---|---|
CVE-2002-1825 | format string in Perl program |
CVE-2001-0717 | format string in bad call to syslog function |
CVE-2002-0573 | format string in bad call to syslog function |
CVE-2002-1788 | format strings in NNTP server responses |
CVE-2007-2027 | Chain: untrusted search path enabling resultant format string by loading malicious internationalization messages |
Phase: Requirements Choose a language that is not subject to this flaw. |
Phase: Implementation Ensure that all format string functions are passed a static string which cannot be controlled by the user and that the proper number of arguments are always sent to that function as well. If at all possible, use functions that do not support the %n operator in format strings. |
Build: Heed the warnings of compilers and linkers, since they may alert you to improper usage. |
While Format String vulnerabilities typically fall under the Buffer Overflow category, technically they are not overflowed buffers. The Format String vulnerability is fairly new (circa 1999) and stems from the fact that there is no realistic way for a function that takes a variable number of arguments to determine just how many arguments were passed in. The most common functions that take a variable number of arguments, including C-runtime functions, are the printf() family of calls. The Format String problem appears in a number of ways. A *printf() call without a format specifier is dangerous and can be exploited. For example, printf(input); is exploitable, while printf(y, input); is not exploitable in that context. The result of the first call, used incorrectly, allows for an attacker to be able to peek at stack memory since the input string will be used as the format specifier. The attacker can stuff the input string with format specifiers and begin reading stack values, since the remaining parameters will be pulled from the stack. Worst case, this improper use may give away enough control to allow an arbitrary value (or values in the case of an exploit program) to be written into the memory of the running program. Frequently targeted entities are file names, process names, identifiers. Format string problems are a classic C/C++ issue that are now rare due to the ease of discovery. One main reason format string vulnerabilities can be exploited is due to the %n operator. The %n operator will write the number of characters, which have been printed by the format string therefore far, to the memory pointed to by its argument. Through skilled creation of a format string, a malicious user may use values on the stack to create a write-what-where condition. Once this is achieved, he can execute arbitrary code. Other operators can be used as well; for example, a %9999s operator could also trigger a buffer overflow, or when used in file-formatting functions like fprintf, it can generate a much larger output than intended. |
Ordinality | Description |
---|---|
Primary | (where the weakness exists independent of other weaknesses) |
Nature | Type | ID | Name | View(s) this relationship pertains to |
---|---|---|---|---|
ChildOf | Weakness Class | 20 | Improper Input Validation | Seven Pernicious Kingdoms (primary)700 |
ChildOf | Weakness Class | 74 | Failure to Sanitize Data into a Different Plane ('Injection') | Development Concepts (primary)699 Research Concepts (primary)1000 |
ChildOf | Category | 133 | String Errors | Development Concepts699 |
ChildOf | Category | 633 | Weaknesses that Affect Memory | Resource-specific Weaknesses (primary)631 |
ChildOf | Category | 726 | OWASP Top Ten 2004 Category A5 - Buffer Overflows | Weaknesses in OWASP Top Ten (2004) (primary)711 |
ChildOf | Category | 743 | CERT C Secure Coding Section 09 - Input Output (FIO) | Weaknesses Addressed by the CERT C Secure Coding Standard (primary)734 |
ChildOf | Category | 808 | 2010 Top 25 - Weaknesses On the Cusp | Weaknesses in the 2010 CWE/SANS Top 25 Most Dangerous Programming Errors (primary)800 |
PeerOf | Weakness Base | 123 | Write-what-where Condition | Research Concepts1000 |
MemberOf | View | 630 | Weaknesses Examined by SAMATE | Weaknesses Examined by SAMATE (primary)630 |
MemberOf | View | 635 | Weaknesses Used by NVD | Weaknesses Used by NVD (primary)635 |
Format string issues are under-studied for languages other than C. Memory or disk consumption, control flow or variable alteration, and data corruption may result from format string exploitation in applications written in other languages such as Perl, PHP, Python, etc. |
Mapped Taxonomy Name | Node ID | Fit | Mapped Node Name |
---|---|---|---|
PLOVER | Format string vulnerability | ||
7 Pernicious Kingdoms | Format String | ||
CLASP | Format string problem | ||
CERT C Secure Coding | FIO30-C | Exact | Exclude user input from format strings |
OWASP Top Ten 2004 | A1 | CWE More Specific | Unvalidated Input |
CERT C Secure Coding | FIO30-C | Exclude user input from format strings | |
WASC | 6 | Format String |
CAPEC-ID | Attack Pattern Name | (CAPEC Version: 1.4) |
---|---|---|
67 | String Format Overflow in syslog() |
A weakness where the code path has: 1. start statement that accepts input 2. end statement that passes a format string to format string function where a. the input data is part of the format string and b. the format string is undesirable Where "undesirable" is defined through the following scenarios: 1. not validated 2. incorrectly validated |
Steve Christey. "Format String Vulnerabilities in Perl Programs". <http://www.securityfocus.com/archive/1/418460/30/0/threaded>. |
Hal Burch and Robert C. Seacord. "Programming Language Format String Vulnerabilities". <http://www.ddj.com/dept/security/197002914>. |
Tim Newsham. "Format String Attacks". Guardent. September 2000. <http://www.lava.net/~newsham/format-string-attacks.pdf>. |
[REF-11] M. Howard and D. LeBlanc. "Writing Secure Code". Chapter 5, "Format String Bugs" Page 147. 2nd Edition. Microsoft. 2002. |
Submissions | ||||
---|---|---|---|---|
Submission Date | Submitter | Organization | Source | |
PLOVER | Externally Mined | |||
Modifications | ||||
Modification Date | Modifier | Organization | Source | |
2008-08-01 | KDM Analytics | External | ||
added/updated white box definitions | ||||
2008-09-08 | CWE Content Team | MITRE | Internal | |
updated Applicable Platforms, Common Consequences, Detection Factors, Modes of Introduction, Relationships, Other Notes, Research Gaps, Taxonomy Mappings, Weakness Ordinalities | ||||
2008-11-24 | CWE Content Team | MITRE | Internal | |
updated Relationships, Taxonomy Mappings | ||||
2009-03-10 | CWE Content Team | MITRE | Internal | |
updated Relationships | ||||
2009-05-27 | CWE Content Team | MITRE | Internal | |
updated Demonstrative Examples | ||||
2009-07-17 | KDM Analytics | External | ||
Improved the White Box Definition | ||||
2009-07-27 | CWE Content Team | MITRE | Internal | |
updated White Box Definitions |