Security Review 5 – juggling

July 8, 2019 - Security Review
By: jale

Security Review: Review the code and find the bugs that would otherwise pass a functional code review or tests, but have serious security flaws.

Here we have a script designed to accept a file upload and execute it if and only if it came from a trusted source. The source is deemed trustworthy by verifying the HMAC of the file.

$hash_alg = 'ripemd128';
$secret = getenv('SECRET');

header('Content-Type: text/plain');

// check to ensure all proper parameters have been passed
if (!isset($_POST['hmac']) || !is_string($_POST['hmac']) || !isset($_FILES['file']) || !isset($_FILES['file']['error']))
    exit('missing parameters');
// make sure the file was uploaded OK
if ($_FILES['file']['error'] != UPLOAD_ERR_OK)
    exit('bad file upload');
// ensure the hashing algo is supported
if (!in_array($hash_alg, hash_algos()))
    exit('hashing algorithm not supported');

$file = &$_FILES['file'];
$filepath = "./files/{$file['name']}";
if (!move_uploaded_file($file['tmp_name'], $filepath))
    exit('error handling uploaded file');

$contents = file_get_contents($filepath);
if (isset($_POST['nonce']) && is_string($_POST['nonce']))
    $contents .= $_POST['nonce'];
$hmac = hash_hmac($hash_alg, $contents, $secret);
if ($_POST['hmac'] == $hmac) {
    exec($filepath, $output, $retval);
    foreach ($output as $line)
        echo "$line\n";
    echo "----------\n";
    echo "{$file['name']} returned {$retval}\n";

This wouldn’t occur in a strongly typed language, and this one is specific to PHP.


The vulnerability here is exploited via loose comparison and PHPs type juggling. Variables in some weakly typed languages may be coerced to, or, interpreted as a different type depending on the operation being performed with them. The following is straight from the PHP docs on Type Juggling

PHP does not require (or support) explicit type definition in variable declaration; a variable’s type is determined by the context in which the variable is used.

The docs even give a simple example:

$foo = "1";  // $foo is string (ASCII 49)
$foo *= 2;   // $foo is now an integer (2)
$foo = $foo * 1.3;  // $foo is now a float (2.6)
$foo = 5 * "10 Little Piggies"; // $foo is integer (50)
$foo = 5 * "10 Small Pigs";     // $foo is integer (50)

Now with that as clear as mud we can get down to business. Even with type juggling explained (a little bit) the bug may not be obvious, but have a closer look at this line:

if ($_POST['hmac'] == $hmac) {

The double equal (==) indicates that the comparison be done loosely so type juggling may take effect. The first thing to determine would be if disparate types are being compared. Tracing through the script, line 7 does a type check to ensure $_POST['hmac'] is a string. $hmac is assigned the return value from hash_hmac(), for which the docs specify:

Returns a string containing the calculated message digest as lowercase hexits unless raw_output is set to true in which case the raw binary representation of the message digest is returned. Returns FALSE when algo is unknown or is a non-cryptographic hash function.

Rraw output is not being used and line 13 ensures the hash algorithm is supported so ($_POST['hmac'] == $hmac) should always compare a string to a string. What could possibly be the problem?

Here’s the thing, with loose comparison (via ==) and type juggling there’s still no guarantee that two strings will be compared as two strings. The values in the variables being compared may affect how they’re compared, and there’s a specific set of values that will cause problems. The following code snippet demonstrates this edge case:

$foo = "0";
$bar = "0e123";
($foo == $bar); // evaluates to TRUE

Two different strings are the same? wat?

Here’s the thing: during the comparison PHP sees “0e…” and decides that the string be evaluated as floating point number. 0e123 then becomes 0 x 10 ^ 123, which, mathematically evaluates to integer 0. The "0" in $foo is type juggled to an int and resulting comparison becomes (int) 0 == (int) 0. This is how the hmac check can potentially be bypassed.

$_POST['hmac'] is within direct control, so that can be passed in as 0. Now $hmac needs to be coerced to a magic hash value of the form “0e###…”

The script allows the usage of a nonce, and by iterating the nonce, the value of $hmac can be coerced. The ripemd128 hash algorithm returns a 128 bit hash. Trying to brute-force guess a 128 bit hash would take far too long (with 3.4028237e38 possibilities), but hash_hmac() returns a hexadecimal string, and some of those strings will match the magic hash form “0e###…” which significantly reduces the brute-force space. The following table shows magic hash occurrence based on the hash size.

BitsMagic Hash Occurence
161 in 655
321 in 4,295
641 in 184,467
1281 in 340,282,366
1601 in 14,615,016,373
1921 in 627,710,173,539
2561 in 1.1579209e+15
5121 in 1.3407808e+28

For 128 bits the odds of guessing a nonce that results in a magic hash is 1 in 340,282,366. That’s not trivial, but it’s no longer in the realm of taking the lifetime of the universe to brute-force, either. Moreover, the proportion of magic hashes diminishes as the size of the hash increases. Use a long enough hash, and the attack again becomes impractical.

Securing The Code

This one is easy, use triple equals (===) to perform strict comparison. No more type juggling.

There is however still room for improvement. Strict comparison will still technically leak timing information; the information will be about the hash comparison, which is useless, but it’s good habit not to leave anything that may potentially become a hole somewhere down the road. Using hash_equals() for the comparison would be much better, even over strict comparison, as it will not leak timing information.

Lessons Learned

Loose comparison be a harsh mistress.

A key understanding here is to know when using loose comparison will be okay, and when strict comparison would be better. Using strict comparison all the time would probably be ideal, but would require a lot of extra type checking or type converting code (ie. to make sure comparing 5 and "5" perform as expected) which somewhat defeats the purpose of using a weakly typed language.

Having an understand of what the output should be for operations with disparate types in weakly typed languages is also important, but it’s hard to blame a developer for not knowing the result of every case; there’s just too many possibilities. Proper testing should catch this edge case, but it still comes down the developer being aware of this particular form of type-juggling, otherwise it may not make it into the test suite. Honestly, I’ve been using PHP for quite a while and had no idea about this specific form of type juggling, but thankfully have never had it be an issue, either.

A lot of people will be happy to crap all over PHP for this one, and yes, it is a particularly strange idiosyncrasy of the language, but there are plenty of weakly typed languages that also do some really dumb things.

Further Reading