Incorrect Calculation of Multi-Byte String Length
Weakness ID: 135 (Weakness Base)Status: Draft
+ Description

Description Summary

The software does not correctly calculate the length of strings that can contain wide or multi-byte characters.
+ Time of Introduction
  • Implementation
+ Applicable Platforms




+ Demonstrative Examples

Example 1

The following example would be exploitable if any of the commented incorrect malloc calls were used.

Example Language:
#include <stdio.h>
#include <strings.h>
#include <wchar.h>

int main() {

wchar_t wideString[] = L"The spazzy orange tiger jumped " \
"over the tawny jaguar.";
wchar_t *newString;

printf("Strlen() output: %d\nWcslen() output: %d\n",
strlen(wideString), wcslen(wideString));

/* Very wrong for obvious reasons //
newString = (wchar_t *) malloc(strlen(wideString));

/* Wrong because wide characters aren't 1 byte long! //
newString = (wchar_t *) malloc(wcslen(wideString));

/* Wrong because wcslen does not include the terminating null */
newString = (wchar_t *) malloc(wcslen(wideString) * sizeof(wchar_t));

/* correct! */
newString = (wchar_t *) malloc((wcslen(wideString) + 1) * sizeof(wchar_t));

/* ... */

The output from the printf() statement would be: Strlen() output: 0 Wcslen() output: 53

+ Potential Mitigations

Always verify the length of the string unit character.

Use length computing functions (e.g. strlen, wcslen, etc.) appropriately with their equivalent type (e.g.: byte, wchar_t, etc.)

+ Other Notes

There are several ways in which improper string length checking may result in an exploitable condition. All of these, however, involve the introduction of buffer overflow conditions in order to reach an exploitable state. The first of these issues takes place when the output of a wide or multi-byte character string, string-length function is used as a size for the allocation of memory. While this will result in an output of the number of characters in the string, note that the characters are most likely not a single byte, as they are with standard character strings. So, using the size returned as the size sent to new or malloc and copying the string to this newly allocated memory will result in a buffer overflow. Another common way these strings are misused involves the mixing of standard string and wide or multi-byte string functions on a single string. Invariably, this mismatched information will result in the creation of a possibly exploitable buffer overflow condition. Again, if a language subject to these flaws must be used, the most effective mitigation technique is to pay careful attention to the code at implementation time and ensure that these flaws do not occur.

+ Relationships
NatureTypeIDNameView(s) this relationship pertains toView(s)
ChildOfCategoryCategory133String Errors
Development Concepts (primary)699
ChildOfWeakness ClassWeakness Class682Incorrect Calculation
Research Concepts (primary)1000
ChildOfCategoryCategory741CERT C Secure Coding Section 07 - Characters and Strings (STR)
Weaknesses Addressed by the CERT C Secure Coding Standard (primary)734
+ Taxonomy Mappings
Mapped Taxonomy NameNode IDFitMapped Node Name
CLASPImproper string length checking
CERT C Secure CodingSTR33-CSize wide character strings correctly
+ References
[REF-11] M. Howard and D. LeBlanc. "Writing Secure Code". Chapter 5, "Unicode and ANSI Buffer Size Mismatches" Page 153. 2nd Edition. Microsoft. 2002.
+ Content History
Submission DateSubmitterOrganizationSource
CLASPExternally Mined
Contribution DateContributorOrganizationSource
2010-01-11Gregory PadgettUnitrendsFeedback
correction to Demonstrative Example
Modification DateModifierOrganizationSource
2008-07-01Eric DalciCigitalExternal
updated Potential Mitigations, Time of Introduction
2008-09-08CWE Content TeamMITREInternal
updated Applicable Platforms, Relationships, Other Notes, Taxonomy Mappings
2008-11-24CWE Content TeamMITREInternal
updated Relationships, Taxonomy Mappings
2009-05-27CWE Content TeamMITREInternal
updated Description
Previous Entry Names
Change DatePrevious Entry Name
2008-04-11Improper String Length Checking