Wednesday, January 28, 2009

Python: Simple URL extractor

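import re

def url_finder(data):
    # match http:// or https:// followed by URL characters or %XX escapes
    urls = re.findall(r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+", data)

    for i in urls:
        # drop any quote characters left around the URL
        outpt = i.strip('"').strip("'") + "\n"
        print(outpt)

inpt = "aaaaaaaaaaaaaa bbbbbbbbb ccccccccc dddd http://a.b/a/a/a/index.html"
url_finder(inpt)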


This code simply finds URLs using a regular expression and prints them, one per line.
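If the URLs need to be processed further instead of printed, a variant that returns them as a list may be handier. This is only a sketch: the names URL_RE and extract_urls are made up for illustration, the regular expression is the same one as above, and dropping duplicates is an extra assumption about what is wanted.

```python
import re

# Same pattern as in url_finder, compiled once.
URL_RE = re.compile(
    r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
)

def extract_urls(data):
    """Return the URLs found in data, in order of first appearance, without duplicates."""
    found = []
    for url in URL_RE.findall(data):
        url = url.strip('"').strip("'")  # drop quote characters around the URL
        if url not in found:
            found.append(url)
    return found

print(extract_urls("dddd http://a.b/a/a/a/index.html and href='https://a.b/x'"))
```

The strip() calls matter here because the character range $-_ in the pattern also covers the single quote, so a closing ' right after a URL ends up inside the match.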

Wednesday, January 21, 2009

Python: UCS2 shellcode to hex converter

When analyzing JavaScript that contains shellcode, I need a UCS2-to-hex converter before running the shellcode through libemu's sctest. The shellcode is in UCS2 format, so converting the hex directly to ASCII gives something meaningless. For example:

UCS2 : %u3341

If I remove the %u and convert 3341 directly to ASCII, it produces 3A. But that gives the wrong result when we run the shellcode, because the real byte order is 41 33. So, before we convert the UCS2 into hex, we need to remove the %u and swap the 33 and the 41. To make our life easier, here is Python code that automates the job:

import re

def ucs2hex(match):
    s = match.group(0)  # assumed to be the whole match, e.g. "%u3341"
    return "".join([s[4]+s[5], s[2]+s[3]])  # put the chars at index 4-5 before the chars at index 2-3

def find_word(data):
    p = re.compile(r'%u(\w{4})')  # regular expression to search for %u and the 4 chars after it
    return p.sub(ucs2hex, data)

ucs2_string = "%u3341"
hex_string = find_word(ucs2_string)

print(hex_string)

This code simply searches the string for %u and the 4 characters after it, then swaps the characters at positions 4 and 5 with those at positions 2 and 3 (counting from 0). For %u3341 it prints 4133.
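To feed the result to an emulator, the swapped hex usually has to become raw bytes. The following is a rough sketch of that step under a few assumptions: the function name ucs2_to_bytes is hypothetical, only %uXXXX escapes with hex digits are handled, and anything else in the input is ignored.

```python
import re
import binascii

def ucs2_to_bytes(data):
    """Turn every %uXXXX escape in data into two raw bytes, low byte first."""
    out = bytearray()
    for word in re.findall(r"%u([0-9a-fA-F]{4})", data):
        # "3341" -> "4133" -> bytes 0x41 0x33
        out += binascii.unhexlify(word[2:] + word[:2])
    return bytes(out)

print(ucs2_to_bytes("%u3341"))  # b'A3'
```

The bytes can then be written to a file opened in binary mode and handed to the emulator.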

Monday, January 19, 2009

Turning off GCC Stack Smashing Protection

When testing my code against stack smashing, I got stuck because the stack smashing protection kept getting in the way and terminating the program. That was really frustrating because I just wanted to learn about buffer overflow attacks. After some short research and googling, I wrote this short tutorial as a reminder for myself in case I forget it next time.

What is Stack Smashing Protection?

From .

It is a GCC (Gnu Compiler Collection) extension for protecting applications from stack-smashing attacks. Applications written in C will be protected by the method that automatically inserts protection code into an application at compilation time. The protection is realized by buffer overflow detection and the variable reordering feature to avoid the corruption of pointers. The basic idea of buffer overflow detection comes from StackGuard system.

How to Bypass SSP?

Let's say our program is named unprotect.c. To bypass the stack smashing protection, we just compile it with the -fno-stack-protector option.

For example:

user@user:~$ gcc -fno-stack-protector unprotect.c -o unprotect

So, when we test the code, the SSP does not kick in when we smash the stack.

For example:

user@user:~$ printf "%0516x" | ./unprotect
Segmentation fault

yahoo.. we did it..
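The same test can be scripted, which helps when comparing a build with the protector against one without it. This is a sketch under assumptions: build_payload and describe_exit are hypothetical helper names, the payload is taken to be 516 ASCII zeros (what printf "%0516x" prints when given no argument), and the signal checks rely on the POSIX convention that a negative return code means the process was killed by that signal.

```python
import os
import signal
import subprocess

def build_payload(length=516):
    """Return length ASCII zeros, assumed to match the printf payload above."""
    return b"0" * length

def describe_exit(returncode):
    """Explain a subprocess return code (negative means killed by a signal on POSIX)."""
    if returncode == -signal.SIGSEGV:
        return "segmentation fault (the overflow went through)"
    if returncode == -signal.SIGABRT:
        return "aborted (what a stack protector abort typically looks like)"
    return "exited with status %d" % returncode

# Only run the binary if it has actually been built in the current directory.
if os.path.exists("./unprotect"):
    result = subprocess.run(["./unprotect"], input=build_payload())
    print(describe_exit(result.returncode))
```

With the binary compiled as shown above, the expected report is the segmentation fault line.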

Thursday, January 8, 2009

Detect and Bypass Packer

Sometimes, when doing RCE (Reverse Code Engineering) with OllyDbg, we get a message telling us that the code is encrypted. Life gets even harder when the code is encrypted but OllyDbg does not alert us. In both cases the cause is that the code has been run through a "packer". A packer reduces the size of the file and at the same time compresses or encrypts the code. It is one of the anti-reverse-engineering methods. One of the most commonly used packers is UPX, but it is well known and easy to unpack.

In this post I will demonstrate how easily you can bypass the packer. For this example, I used UPX as the packer, packed calc.exe and renamed it to kalc.exe. A tutorial on how to pack with UPX is out of the scope of this post but trust me, there are lots of tuts on Google.
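Before opening the debugger, a quick script can hint at whether a file was packed with UPX, because UPX by default names the PE sections UPX0, UPX1 and so on. The sketch below parses just enough of the PE header to list the section names. The function names are hypothetical, and a renamed section would defeat the check, so treat a negative answer as "unknown" rather than "not packed".

```python
import struct

def pe_section_names(data):
    """Return the section names of the PE image held in data (bytes)."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = pe_offset + 4  # 20-byte COFF header follows the signature
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size  # section table follows the optional header
    names = []
    for i in range(num_sections):
        raw = data[table + i * 40:table + i * 40 + 8]  # 40-byte entries, name first
        names.append(raw.rstrip(b"\0").decode("ascii", "replace"))
    return names

def looks_upx_packed(data):
    """True if any section name starts with UPX."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

To try it on the file from this post, read kalc.exe in binary mode and pass the bytes to looks_upx_packed.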

The first thing, of course, is to load the code in OllyDbg. The first thing you see at the EP (entry point) is the PUSHAD instruction. PUSHAD PUSHes all the general-purpose registers (e.g. EAX, EBX ...) onto the stack. This backs up the register values before the unpacking process starts, so the unpacking code does not have to worry about changing them.

So, the second thing is to step over the instruction by pressing F8. This makes sure all the data has been PUSHed onto the stack.

After we step past the instruction, we can see that ESP in the registers pane on the right side has changed. ESP is the stack pointer and points to the top of the stack.

Then we right-click on ESP and choose Follow in Dump. We will see the contents of the hex dump at the bottom of OllyDbg change.

Then, we set a hardware breakpoint. Highlight the first dword value (that is, the first 4 pairs of hex digits) and then right-click > Breakpoint > Hardware, on access > Dword.
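Why watching that particular dword works can be shown with a toy model of the two instructions. This is only an illustration in Python, not real emulation: the register values and the starting ESP are made up, and the stack is just a dictionary. PUSHAD pushes EAX, ECX, EDX, EBX, the original ESP, EBP, ESI and EDI, so the dword at the new top of the stack holds EDI, and POPAD reads that same dword first when it restores the registers.

```python
PUSHAD_ORDER = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def pushad(regs, stack):
    """Model PUSHAD: push eight registers; the ESP pushed is its value before the instruction."""
    original_esp = regs["ESP"]
    for name in PUSHAD_ORDER:
        value = original_esp if name == "ESP" else regs[name]
        regs["ESP"] -= 4
        stack[regs["ESP"]] = value

def popad(regs, stack):
    """Model POPAD: pop in reverse order, skipping the saved ESP. Returns the addresses read."""
    read = []
    for name in reversed(PUSHAD_ORDER):
        if name != "ESP":
            read.append(regs["ESP"])
            regs[name] = stack[regs["ESP"]]
        regs["ESP"] += 4
    return read

# Made-up register values for the demonstration.
regs = {"EAX": 1, "ECX": 2, "EDX": 3, "EBX": 4, "ESP": 0x0007FFC4,
        "EBP": 5, "ESI": 6, "EDI": 7}
stack = {}
pushad(regs, stack)
watched = regs["ESP"]          # the dword we put the hardware breakpoint on
print(hex(watched))            # 0x7ffa4, which is 0x20 bytes below the old ESP
regs.update(EAX=0xDEAD, EDI=0xBEEF)  # the unpacking code clobbers registers
print(popad(regs, stack)[0] == watched)  # True: POPAD touches the watched dword first
```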

After that, run the code and it will stop at the hardware breakpoint that we set before. If you look, there is a POPAD instruction. This instruction POPs all the saved register values back off the stack. It is the opposite of PUSHAD. That means we are at the end of the unpacking process. But we still need to step a little further by pressing F8, and once we step past the JMP, we arrive at the start of the unpacked file, which we call the Original Entry Point (OEP).
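The address the final JMP lands on can also be worked out by hand from its bytes. The sketch below assumes the jump is encoded as a near JMP with a 32-bit relative displacement (opcode 0xE9), which is an assumption about this particular unpacking code. The function name and the addresses in the example are made up.

```python
import struct

def near_jmp_target(address, instruction):
    """Return the destination of a 5-byte JMP rel32 (opcode 0xE9) located at address."""
    if len(instruction) < 5 or instruction[0] != 0xE9:
        raise ValueError("not a JMP rel32 instruction")
    (rel32,) = struct.unpack_from("<i", instruction, 1)  # signed, little-endian
    # The displacement counts from the address of the next instruction.
    return (address + 5 + rel32) & 0xFFFFFFFF

# Hypothetical example: a JMP at 0x01020000 that goes 0x1000 bytes backwards.
jmp = b"\xE9" + struct.pack("<i", -0x1000)
print(hex(near_jmp_target(0x01020000, jmp)))  # 0x101f005
```

The value it returns should match the address OllyDbg shows after stepping past the JMP.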