Menu

Security Review 3 – tempus edax rerum

May 31, 2019 - Security Review
By: jale

Security Review: Review the code and find the bugs that would otherwise pass a functional code review or tests, but have serious security flaws.

Here’s a snippet of code, it compares two strings, really not much to it. This bit of code probably occurs in some variation in almost every codebase, ever. But is it secure? In most cases, yes, it’s fine, in other specific instances, it is not.

/**
 * compares two strings
 * returns 0 if strings are equal, 1 otherwise.
 * str1 - string 1
 * str1len - length of string 1
 * str2 - string 2
 * str2len - length of string 2
 */
int str_equal(const char const * str1, int str1len, const char const * str2, int str2len) {
    if (str1len != str2len)
        return 1;

    for (int c = 0; c < str1len; c++) {
        if (str1[c] != str2[c])
            return 1;
    }

    return 0;
}

The return value of the function is not the only information that may be gleaned.

Vulnerability

“Tempus edax rerum” translation: “Time, devourer of all things”.

The issue with this function is that is leaks information; it is vulnerable to a particular side channel attack known as a timing attack.

Say, for instance, that this function was used to compare two passwords. An attacker could use the execution time of this function to determine the valid password. Depending on the ‘correctness’ of the password, the function will take longer to return when comparing them. For each matching consecutive character at the beginning of the strings, the function will iterate further through the for loop and thus take longer to return.

As a contrived example, let’s say the password is “secret”. The function is passed str1 as the secret password, and str2 as the password to verify. We’ll come back to how to determine the password length later, but for now lets say the attacker knows the password length, and thus can bypass the initial matching length check. To execute the timing attack an attacker would first try with the password “aaaaaa” and record the execution time. The function is then called, iterating out the first character of the string and recording execution times: “baaaaa”, “caaaaa”, “daaaaa”, … When “saaaaa” is reached, something slightly different happens. The function still returns 1, indicating the strings (passwords) are not equal, but it takes slightly longer to do so. Calling the function against “aaaaaa”, “baaaaa”, etc. results in the for loop iterating once. With “saaaaa” the for loop iterates twice: once for the “s”, and again for “a”, where it returns. The result is that the function takes very slight longer to execute against “saaaaaa” than the other invalid strings, leaking information about the ‘correctness’ of the string. The next character is iterated: “saaaaa”, “sbaaaa”, “scaaaa”, and so on. Graphing the execution times reveals that “seaaaa” took very slightly longer to execute, revealing the second character. The process is then repeated for each character and eventually the function returns 0 and the correct password has been determined.

The length check at the beginning of the function may also be defeated. The timing attack may still be exploited, however first all lengths must be run against the function. “a”, “b”, “c”, …, “aa”, “ba”, “ca”, …, “aaa”, “baa”, “caa”, … and eventually “saaaaa” will take slightly longer.

Securing The Code

The ideal situation is to have the str_equal function run in constant-time ie. it’s execution time is independent of it’s inputs. However, there are alternatives, which while not perfect, are known to be suitable.

One common method is to refactor the for loop as follows:

int result = 0;
for (int c = 0; c < str1len; c++) {
    result |= str1[c] ^ str2[c];
}
return (result == 0);

This ensures that the entire string is iterated regardless of it’s ‘correctness’. However, this method still leaks the length of the strings.

Another is to use a cryptographic hash function to hash the strings before comparison. While the function technically would still leak timing info, it would be about the hash, rather than the plain strings. To exploit that an attacker would essentially need to be able to reverse a hash function.

Lessons Learned

The concept to understand here is that code and algorithms which may be fine – even optimal – for general computing, may not be suitable for secure practices. When writing code that deals with secure information, be cognizant of the meta-information that may be leaked.

Side-channel vulnerabilities can be tricky to catch; the very nature of them is that they are not obvious. The Java MessageDigest.isEqual() was vulnerable to a timing attack for a while. Not to mention the slew of CPU vulnerabilities like Spectre and Meltdown that have roots in side-channel attacks.

Timing attacks are statistical attacks, and signal-to-noise ratio can be a large factor in successfully exploiting one. Signal being algorithm execution time, and noise being the time variances caused but anything else. Even under ideal circumstances it will likely take hundreds, even thousands of iterations to attain statistically relevant data. CPU optimizations, CPU core count, CPU varying clock speed, process priority, data locality, and machine load are all sources of noise, all which may widely vary the execution time of the same bit of code. Attempting to exploit a timing attack over a network adds in network latency and processing, device interrupts, etc. All which make the results noisier.

Timing attacks in many instances may just not be practical, but as it’s often an easy fix at a very minimal cost of extra execution time, it’s definitely worth securing against.