Breaking MSSQL's RAND() function

In this article I’m going to take a look at Microsoft SQL Servers RAND() implementation. We’ll reverse the relevant parts of SQL Server using windbg and Ghidra, replicate the random number generator in C and then look at some attacks and brute forcing methods. This project stemmed from a job I worked on recently where a stored procedure which called RAND() was used to generate session tokens within an API[1].

TLDR? An attacker who can obtain two consecutive outputs from RAND() can brute force the internal state variables and determine all previously returned values, and any future values for a specific connection. The solution is to use CRYPT_GEN_RANDOM for security sensitive code instead.

[1] - That wasn’t really how the app used RAND(), but it’s a close enough analogue to demonstrate the attacks.

Background

So earlier this month I worked on a job that involved attacking an API. The API generated session tokens (not really, but lets say session tokens for the sake of this article) using a stored procedure. The stored procedures used SQL Server’s RAND() method plus some funkiness to turn the returned float into a string. During the engagement, I was wondering how RAND() actually works internally, and couldn’t find much information concerning its internal workings. Some quick poking at SQL Server during the engagement indicated that the RAND() method called GetTickCount to seed itself somehow, but the specifics of the algorithm were a mystery. We reported this behavior to the client during testing, and collectively decided to prioritise testing the remaining API functionality rather than rummaging around SQL Server’s gross insides.

You’d think that would be that, but it turns out not knowing how a certain RNG is implemented really bugged me. I had no option but to dig into SQL Server in my own time. My hands were tied. I don’t make the rules.

The goal is to figure out how SQL Server’s RAND() method actually works, and whether any vulnerabilities lie therein. The approach will be a combination of debugging using windbg, static analysis with Ghidra and then a bunch of coding and messing around to build out some analogues of the RNG and try to find vulnerabilities. RAND() returns a random float between 0 and 1, so this was also an excuse to brush up on floating point assembly.

RAND() Details

Microsoft’s RAND() documentation provides some useful background on how the function works. A seed can be provided, and then RAND() will return the same value. Subsequent calls to RAND() on the same connection with the same initial seed will be deterministic. E.g. if you run SELECT RAND(16), RAND(), RAND() multiple times, you will receive the same three values every time.

This suggests that there is some kind of internal RNG state that’s persisted on a per-connection basis, but more on this later.

The following screenshot shows the RAND functions usual behavior:

Test Setup

The test setup was a bone stock instance of SQL Server 2017 Express Edition running on a windows 10 VM, along with windbg and Ghidra.

Finding and Reversing RAND()

You, me and windbg

The first step was to locate the bit of the SQL Server code that was responsible for implementing RAND(). This involved running windbg as an administrator, attaching to the sqlservr.exe process (remember to tick the Show processes from all users checkbox!). Next I decided to look for all symbols in sqllang.dll that contained the string rand. If this sounds lazy, it’s because it is. ^_^

Running x sqllang!*rand* in windbg to search the symbols (eventually) returned the following:

0:060> x sqllang!*rand*
00007fff`5ba30ee0 sqllang!CSECErrorRingBufferManager::GetLastErrorAndStoreRecord (<no parameter info>)
00007fff`5b3995a0 sqllang!CRandomMaskFunctionProperties::GetArgumentType (<no parameter info>)
00007fff`5c37ed70 sqllang!FFtCheckForFailedTriggerAndLog (<no parameter info>)
00007fff`5b399570 sqllang!CRandomMaskFunctionProperties::GetMaskingFunctionType (<no parameter info>)
00007fff`5b9da180 sqllang!SqlHostIntrinsics::Rand (<no parameter info>)
00007fff`5b2f35a0 sqllang!GetOsErrAndRaise (<no parameter info>)
{...snip...}

A large number of the returned entries obviously weren’t relevant to random numer generation. Things like GetLastErrorAndStoreRecord were caught by my half-brick-in-a-sock search for *rand*. Stripping out the cases which were unlikely to be relevant left the following:

00007fff`5b3995a0 sqllang!CRandomMaskFunctionProperties::GetArgumentType (<no parameter info>)
00007fff`5b399570 sqllang!CRandomMaskFunctionProperties::GetMaskingFunctionType (<no parameter info>)
00007fff`5b9da180 sqllang!SqlHostIntrinsics::Rand (<no parameter info>)
00007fff`5c9ef028 sqllang!CSECErrorAPI::wszBCryptGenRandom = <no type information>
00007fff`5d1f8bd0 sqllang!CFingerprintUtilStatic::m_randomTable = <no type information>
00007fff`5d1f72d0 sqllang!CMaskFunction::randomMaskingFunction = <no type information>
00007fff`5b399600 sqllang!CRandomMaskFunctionProperties::ValidateArgumentsSemantic (<no parameter info>)
00007fff`5cbb8608 sqllang!CRandomMaskFunctionProperties::`vftable' = <no type information>
00007fff`5d2f6080 sqllang!crc64RandomSeed = <no type information>
00007fff`5b9d4df0 sqllang!randES (<no parameter info>)
00007fff`5c008800 sqllang!CFingerprintUtilStatic::RandomTableInit (<no parameter info>)
00007fff`5b3995d0 sqllang!CRandomMaskFunctionProperties::IsValidForType (<no parameter info>)
00007fff`5bba5c40 sqllang!CryptGenerateRandom (<no parameter info>)
00007fff`5adc5740 sqllang!UllGetFingerprintRand (<no parameter info>)
00007fff`5d20d848 sqllang!CEncryptionMetadata::mdRandomized = <no type information>
00007fff`5b399580 sqllang!CRandomMaskFunctionProperties::GetMaskingFunctionName (<no parameter info>)
00007fff`5c7ccc68 sqllang!_imp_BCryptGenRandom = <no type information>
00007fff`5b399590 sqllang!CRandomMaskFunctionProperties::NumberOfArguments (<no parameter info>)
00007fff`5b9d4f00 sqllang!getRandES (<no parameter info>)
00007fff`5c7cf170 sqllang!_imp_rand = <no type information>
00007fff`5b9d5000 sqllang!R8Rand (<no parameter info>)

Marvelous. The plan from here is to break on one of the possible methods and then call RAND() using the SQL command line client. I decided to start with R8Rand. After setting the break point and running….

Success! First try! It’s never this easy, but for once one of my research projects decided to cut me a break. Disassembling the R8Rand method shows that it doesn’t really do much, but it does make a call to getRandES.

sqllang!R8Rand:
00007fff`5b9d5000 53                   push    rbx
00007fff`5b9d5001 4883ec20             sub     rsp, 20h
00007fff`5b9d5005 488bda               mov     rbx, rdx
00007fff`5b9d5008 4c8bc1               mov     r8, rcx
00007fff`5b9d500b 0fb612               movzx   edx, byte ptr [rdx]
00007fff`5b9d500e 80fa03               cmp     dl, 3
00007fff`5b9d5011 7420                 je      sqllang!R8Rand+0x34 (00007fff`5b9d5033)
00007fff`5b9d5013 33c0                 xor     eax, eax
00007fff`5b9d5015 448bcb               mov     r9d, ebx
00007fff`5b9d5018 80faff               cmp     dl, 0FFh
00007fff`5b9d501b 8b5308               mov     edx, dword ptr [rbx+8]
00007fff`5b9d501e 0f94c0               sete    al
00007fff`5b9d5021 85c0                 test    eax, eax
00007fff`5b9d5023 0f94c1               sete    cl
00007fff`5b9d5026 e8d5feffff           call    sqllang!getRandES (00007fff`5b9d4f00)
00007fff`5b9d502b f20f114308           movsd   mmword ptr [rbx+8], xmm0
00007fff`5b9d5030 c60300               mov     byte ptr [rbx], 0
00007fff`5b9d5033 4883c420             add     rsp, 20h
00007fff`5b9d5037 5b                   pop     rbx
00007fff`5b9d5038 c3                   ret

Similarly, getRandES makes a call to randES, after potentially calling GetTickCount:

    sqllang!getRandES:
{...snip...}
00007fff`5b9d4f63 8d41ff           lea     eax, [rcx-1]
00007fff`5b9d4f66 3da9ffff7f       cmp     eax, 7FFFFFA9h
00007fff`5b9d4f6b 762c             jbe     sqllang!getRandES+0x99 (00007fff`5b9d4f99)
00007fff`5b9d4f6d ff15ed76df00     call    qword ptr [sqllang!_imp_GetTickCount (00007fff`5c7cc660)]
00007fff`5b9d4f73 33c5             xor     eax, ebp
00007fff`5b9d4f75 33c6             xor     eax, esi
00007fff`5b9d4f77 99               cdq     
00007fff`5b9d4f78 c744242032090100 mov     dword ptr [rsp+20h], 10932h
00007fff`5b9d4f80 33c2             xor     eax, edx
00007fff`5b9d4f82 2bc2             sub     eax, edx
00007fff`5b9d4f84 ba39300000       mov     edx, 3039h
00007fff`5b9d4f89 8d48ff           lea     ecx, [rax-1]
00007fff`5b9d4f8c 81f9a9ffff7f     cmp     ecx, 7FFFFFA9h
00007fff`5b9d4f92 0f46d0           cmovbe  edx, eax
00007fff`5b9d4f95 89542450         mov     dword ptr [rsp+50h], edx
00007fff`5b9d4f99 488d542420       lea     rdx, [rsp+20h]
00007fff`5b9d4f9e 488d4c2450       lea     rcx, [rsp+50h]
00007fff`5b9d4fa3 e848feffff       call    sqllang!randES (00007fff`5b9d4df0)
{...snip...}

Looking at randES, it seemed to do a bunch of spooky math with the values it’s provided:

sqllang!randES:
00007fff`5b9d4df0 48895c2408       mov     qword ptr [rsp+8], rbx
00007fff`5b9d4df5 57               push    rdi
00007fff`5b9d4df6 4883ec30         sub     rsp, 30h
00007fff`5b9d4dfa 4c8bca           mov     r9, rdx
00007fff`5b9d4dfd 0f29742420       movaps  xmmword ptr [rsp+20h], xmm6
00007fff`5b9d4e02 4c8bd1           mov     r10, rcx
00007fff`5b9d4e05 b855c78913       mov     eax, 1389C755h
00007fff`5b9d4e0a f729             imul    dword ptr [rcx]
00007fff`5b9d4e0c 69094e9c0000     imul    ecx, dword ptr [rcx], 9C4Eh
00007fff`5b9d4e12 c1fa0c           sar     edx, 0Ch
00007fff`5b9d4e15 8bc2             mov     eax, edx
00007fff`5b9d4e17 c1e81f           shr     eax, 1Fh
00007fff`5b9d4e1a 03d0             add     edx, eax
00007fff`5b9d4e1c 69c2abffff7f     imul    eax, edx, 7FFFFFABh
{..snip...}

randES seemed to take two parameters, which are passed in RCX and RDX in Windows 64bit world. The next step was to check what these parameters are when RAND() is executed with and without a seed.

So RCX is a pointer to the original seed value (either provided by the user or sourced from GetTickCount) and RDX is always 0x10932 (or in decimal: 67890). RandES returns a float, which will be in xmm0 once the function returns. Doing some further debugging confirmed that RandES returns the value that is eventually passed back to the SQL client:

We’ll call the two integers passed to randES S1 and S2. Before moving on, I double checked that calling RAND() from within a stored procedure follows the same logic and doesn’t do anything silly like reseed from GetTickCount every time. That wasn’t the case, calling a stored procedure that used RAND() multiple times on the same connection behaved the same as calling select RAND() directly. Comfortable that I’d found the method that implements the RNG, I fired up Ghidra.

Ghidra vs sqllang.dll

Ghidra is a reverse engineering tool developed by the feds. It’s pretty awesome, mostly because it’s pseudo-code decompiler produces reasonably accurate C. It also includes useful things like a PDB parser.

The plan is to load sqllang.dll into Ghidra. The auto-analysis portion took about an hour and a bit on a 4 core virtual machine with 8GB of RAM. sqllang.dll is around 40MB, and the PDB file is another 64MB! Ghidra was clunky at times with such large files, but I didn’t experience any crashes or dead locks. Filtering symbols took a solid 10 minutes each time, but such is life.

After loading the file, I went to the getRandES method to try and understand how randES was called a little better:

The above code showed that if no seed was provided, then GetTickCount was XORed with two pointers to generate the initial seed value. If the RNG had already been used in the connection, then the parameters were retrieved from thread local storage instead. Manually providing a seed reinitialised the RNG with the user supplied value, regardless of whether RAND() had been called in that connection previously.

The interesting note here is that user seed values greater than 0x7FFFFFAA caused the RNG to default to 12345. Looking at the assembly, this is an unsigned comparison:

       101064f89 8d 48 ff        LEA        param_1,[RAX + -0x1]
       101064f8c 81 f9 a9        CMP        param_1,0x7fffffa9
                 ff ff 7f
       101064f92 0f 46 d0        CMOVBE     param_2,EAX

This is extremely useful info, as it’s defined the possible values that can be passed to randES. Specifically, unsigned integers from 0 to 0x7ffffffaa.

Moving onto randES itself:

The multiplication by zero didn’t make much sense, but it looks like this was Ghidra’s decompiler getting confused. Stepping through the assembly with windbg quickly confirmed the correct value.

Re-implementation in C

With the above information from Ghidra, re-implementing the RNG in C was relatively easy. I ended up with the following. Normally, S1 and S2 are passed as pointers and updated as part of the algorithm, in the example below I’ve broken this out into separate functions so that we get some granular control over the algorithm while messing around with it:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int32_t inc_s1(int32_t s){
    uint32_t uVar1;
    int32_t s1;

    uVar1 = s / 0xd1a4 + (s >> 0x1f);
    s1 = (s * 0x9c4e) + (uVar1 + (uVar1 >> 0x1f)) * -0x7fffffab;
    if (s1 < 0) {
        s1 = s1 + 0x7fffffab;
    }

    return s1;
}

int32_t inc_s2(int32_t s){
    uint32_t uVar1;
    int32_t s2;

    uVar1 = s / 0xce26 + (s >> 0x1f);
    s2 = s * 0x9ef4 + (uVar1 + (uVar1 >> 0x1f)) * -0x7fffff07;
    if (s2 < 0) {
        s2 = s2 + 0x7fffff07;
    }

    return s2;
}

double randES(int32_t s1, int32_t s2){
    int32_t iVar1;
    double dVar3;
  
    s1 = inc_s1(s1);
    s2 = inc_s2(s2);

    iVar1 = s1 - s2;
    if (iVar1 < 1) {
        iVar1 = iVar1 + 0x7fffffaa;
    }
    dVar3 = iVar1 * 4.656613e-010;

    return dVar3;
}

The full test code is available here.

Analysis

The plan is to figure out if we can determine S1 and S2 for a given float returned from RAND(). If we can, then we can figure out the next random value that will be returned for that connection, and potentially previous values. Using our session token example, it means an attacker can analyse their session token, figure out the state variables and then calculate the session tokens assigned for other users.

Seeds

As mentioned in the Ghidra section, the RNG needs its S1 and S2 seeds set. The S2 seed is always initially set to 67890, and S1 is either user provided or determined by the XORing the result of GetTickCount with two pointers. Looking at the GetTickCount implementation, the seed is generated using:

S1 = GetTickCount ^ stack_pointer (lower 32 bits) ^ heap_pointer (lower 32 bits)

The stack pointer was always the lower 32 bits of RSP+0x58. Some more digging would be required to figure out the origin of the heap pointer, but at this stage it’s not crucial for figuring out how the RNG works.

Attack Approach

Given that the initial S2 value is always static, we can brute force the first return from RAND() no problem. The trouble starts when we have things like applications that will re-use the same SQL connection. We have no way to know if the float we’ve obtained is the first, second, third, n-th random number returned for a specific connection. Initially, I figured I could just brute force every possible number for S1 with S2 set to 67890, and if the result isn’t found then move onto the next iteration of S2. This approach didn’t work out, as it turns out for any given S2 value, there is some S1 value that will generate the target float. If the attacker can obtain two consecutive random numbers, then it’s possible to correlate the two and figure out the real S1 and S2 values.

Bruteforcer Fail

As you can guess, my plan at this point involved number crunching. Basically writing something along the lines of for(i = 0; i < 0x7FFFFFAA; i++) and trying every possible permutation, seeing if the resulting state variables generated the next random number, and if not doing the whole thing all over again. I wrote brute forcers both using a handful of threads and even a CUDA kernel (I used this tutorial). I got a little carried away with the brute force concept, and forgot that math is a thing. I’m including this tidbit as a reminder that sometimes learning means doing something dumb for a while until you come up with something not-dumb. Oh well, a few evenings gone and at least I got to learn about CUDA development. ^_^

Anyway, as (rand * 4.656613e-010) = S1 - S2, and we can iterate over all of the possible values of S2, then finding candidate S1 values should be easy!

Bruteforcer Success

The complete brute forcer code is available here, however the following code snippet illustrates the general idea:

 58     double target_1 = atof(argv[1]);
 59     double target_2 = atof(argv[2]);
 60