
The Assembly Language Magazine 4

VOL 1 NUMBER 4
December 1989

Written by and for assembly language programmers.

Table of Contents

  • Editorial
  • Policy and Guide Lines
  • Beginners' Corner
  • Structure, Speed and Size
    • By Thomas J. Keller
    • Editorial Rebuttal

  • Accessing the Command Line Arguments By Thomas J. Keller
  • Original Vector Locator by Rick Engle
  • How to call DOS from within a TSR by David O'Riva
  • Environment Variable Processor by David O'Riva
  • Program Reviews
    • Multi-Edit ver 4.00a
    • SHEZ
    • 4DOS

  • Book Reviews
    • Assembly Language Quick Reference Reviewed by George A. Stanislav

  • GPFILT.ASM

Editorial

It has been much too long since the last issue of the Magazine was published. Much of this time was due to the lack of submissions but there has been enough to assemble since early November. I hope that it will not be as long till the next one is ready for distribution. You can help make that possible by writing up and sending in an article.

I'm trying out a new editor for this issue. That makes it four editors for 4 issues. There is a review of it in the review section.

There is a continuing and probably insoluble problem in formatting the 'Magazine'. The readability of the text portions is enhanced with wider margins and is more easily bound with a wide left margin. The difficulty arises when source code is included. 80 columns is little enough in which to fit the code and comments, allowing nothing for margins. So this time we'll try a 5 space margin on the left for the text portion. Further offset should be done with your printer.

A couple of quick notes here as I don't know where else to put them.

For the assembly programmer the principal difference in writing for DOS 4+ is that a disk may now be larger than 32 MB, in which case its sectors are addressed with 32 bit sector numbers (the FAT entries themselves are still 12 or 16 bits). This of course has no effect as long as you use only the DOS file calls for disk access, but if you are going to do direct disk editing this must be checked for.
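
A first step in such a check might be to ask DOS for its version. The sketch below is only that: the labels and messages are made up for illustration, and the version number alone does not tell you how any particular disk is laid out, so each disk still has to be examined before it is edited directly.

~
cseg    segment
        assume  cs:cseg,ds:cseg
        org     100h
start:  mov     ah,30h                  ; DOS function: get version
        int     21h                     ; returns AL = major, AH = minor
        mov     dx,offset msg_old
        cmp     al,4
        jb      show                    ; before 4.0: only the old layout
        mov     dx,offset msg_new       ; 4.0 or later: look at each disk
show:   mov     ah,9                    ; print '$' terminated string
        int     21h
        mov     ax,4c00h
        int     21h
msg_old db      'DOS before 4.0',13,10,'$'
msg_new db      'DOS 4.0 or later: check the disk format',13,10,'$'
cseg    ends
        end     start
~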

The occasional ~ is for the use of my spelling checker.

Policy and Guide Lines

The Assembly Language 'Magazine' is edited by Patrick and David O'Riva. We also operate the AsmLang and CFS BBS to distribute the 'Magazine' and to make available as much information as possible to the assembly language programmer. On FidoNet the address is 1:143/37. Address:

2726 Hostetter Rd
San Jose, CA 95132
408-259-2223

Most Shareware mentioned is available on the AsmLang board if local sources cannot be found.

Name and address must be included with all articles and files. Executable file size and percent of assembly code (when available) should be included when a program is mentioned and is required from an author or publisher. Any article of interest to Assembly language programmers will be considered for inclusion. Quality of writing will not be a factor, but I reserve the right to try and correct spelling errors and minor mistakes in grammar, and to remove sections.

Non-exclusive copyright must be given. No monetary compensation will be made.

Outlines of projects that might be undertaken jointly are welcome. For example: One person who is capable with hardware needs support from a user friendly programmer and a math whiz.

Advertisements as such are not acceptable. Authors and publishers wishing to contribute reviews of their own products will be considered and included as space and time permit. These must include executable file size, percent of assembly code and time comparisons.

Your editor would like information on math libraries, and reviews of such.

Articles must be submitted in pclone readable format or sent E-mail.

Money: Your editor has none. Therefore no compensation can be made for articles included. Subscription fees obviously don't exist. Publication costs I expect to be nil (NUL). Small contributions will be accepted to support the BBS where back issues are available as well as files and programs mentioned in articles (if PD or Shareware ONLY).

Shareware-- Many of the programs mentioned in the "Magazine" are Shareware. Most of the readers are prospective authors of programs that can be successfully marketed as Shareware. If you make significant use of these programs the author is entitled to his registration fee or donation. Please help Shareware to continue to be a viable marketing method for all of us by urging everyone to register and by helping to distribute quality programs.

Beginners' Corner

I finished up the last column by saying I would discuss more techniques this time. I have entirely forgotten what they were. So without dwelling on that we will just move on to the means of getting your program ready to run. The two formats (.com and .exe) are very different and so will be discussed separately.


COM Programs

On entry all of your segment registers are set to the same value, that of the start of the PSP. Your stack pointer is set to the top of the segment, and your instruction pointer is set to 100h. You need to make a generous estimate of the maximum amount of stack that your program can use (or count it exactly). Each level of Call uses 2 bytes (for the address of the next instruction). An INT uses 6 bytes (2 for the IP, 2 for the CS, and 2 for the Flags). Each push of course uses 2. So if your subroutines can go 4 levels deep and contain 7 pushes (without intervening pops) and the deepest contains an INT 21h, then you would need at least 28 bytes of stack (8 + 14 + 6). But stack space is cheap, and you might need to change things. So use a nice round number of 128 bytes. BIOS also uses YOUR stack in the earlier versions of DOS, and the guideline for that is at least 128 bytes. Result: 256 bytes is safe for a modest program. To implement this the following lines of code could be used at the start of the program:

~
        org     100h
        jmp     main
defstack db     32 dup('stack   ')      ; 32 x 8 characters = 256 bytes
stacktop label  byte

;other data

main:
        cli
        mov     sp,offset stacktop
        sti
~

The db statement is 32 times the string of 8 characters ('stack' padded with three blanks), totaling 256 bytes. It could just as well be db 256 dup(?), but it is kind of nice when looking at it with a debugger to see the stack area and how much has been used all nicely labeled. The cli and sti aren't really necessary here because it is only one instruction, but you are dealing with the stack, and it's well to remember that.
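
The stack estimate can also be written out as assemble-time arithmetic, so the numbers are on record in the source. This is only a sketch: the equate names are made up for illustration, and the counts are simply the ones from the example in the text.

~
; Stack budget for the example in the text (names are illustrative)
call_bytes equ  4*2                     ; 4 nested calls, 2 bytes each =  8
push_bytes equ  7*2                     ; 7 outstanding pushes         = 14
int_bytes  equ  6                       ; one INT: flags, CS, IP       =  6
min_stack  equ  call_bytes+push_bytes+int_bytes         ; = 28 bytes
own_stack  equ  128                     ; the generous round number
bios_stack equ  128                     ; allowance for BIOS use
stack_size equ  own_stack+bios_stack                    ; = 256 bytes

cseg    segment
        assume  cs:cseg,ds:cseg
        org     100h
start:  jmp     main
defstack db     stack_size/8 dup('stack   ')    ; 32 labelled 8 byte slots
stacktop label  byte
main:   cli
        mov     sp,offset stacktop
        sti
        mov     ax,4c00h                ; nothing else to do in this sketch
        int     21h
cseg    ends
        end     start
~

If the call depth or the number of pushes changes later, only the equates need to be touched.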

At the end of your program you need a label e.g.

~
progend label   byte
~

Then following your stack adjustment above:

~
        mov     bx,offset progend
        mov     cl,4
        shr     bx,cl
        inc     bx
~

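These instructions convert the offset of the end of the program into a count of paragraphs (16 bytes each): the shift divides by 16 and discards any remainder, and the inc adds one paragraph to cover the partial paragraph at the end. Because offsets in a .COM program are measured from the start of the PSP, the result is the total number of paragraphs that will be occupied by your program, PSP included. Then it is necessary to inform DOS of this information: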

~
        mov     ah,4ah
        int     21h
~

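4Ah is the DOS function to modify allocated memory. It needs the new number of paragraphs in BX (which is where it was put) and the segment of the block in ES, which in a .COM program still points to the PSP from entry. DOS returns with the carry flag set if the request fails.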

At this point, your program is in an orderly condition. Your data as well as that in the PSP is available with the DS and ES registers. The stack is large enough and well mannered, and all surplus memory is available to you or other programs.
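
Put together, the start-up steps might look like the skeleton below. Treat it as a sketch and not the only way to do it: the segment name, the labels start, ready and nomem, and the error message are all made up for illustration, and the carry test after function 4Ah is an addition that the fragments above do not show.

~
cseg    segment
        assume  cs:cseg,ds:cseg,es:cseg,ss:cseg
        org     100h
start:  jmp     main

defstack db     32 dup('stack   ')      ; 256 bytes of labelled stack
stacktop label  byte
nomem   db      'Cannot shrink memory block',13,10,'$'

main:   cli
        mov     sp,offset stacktop      ; move the stack inside the program
        sti
        mov     bx,offset progend       ; bytes from the start of the PSP
        mov     cl,4
        shr     bx,cl                   ; bytes -> whole paragraphs
        inc     bx                      ; one more for the partial paragraph
        mov     ah,4ah                  ; resize block; ES = PSP from entry
        int     21h
        jnc     ready                   ; carry clear means it worked
        mov     dx,offset nomem
        mov     ah,9
        int     21h
        mov     ax,4c01h                ; give up with errorlevel 1
        int     21h

ready:                                  ; the program proper starts here
        mov     ax,4c00h
        int     21h

progend label   byte                    ; must stay the last thing assembled
cseg    ends
        end     start
~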

Structure, Speed and Size as Elements of Programming Style

By Thomas J. Keller
P.O. Box 14069
Santa Rosa, CA, 95402


Let us examine the reasons for choosing to implement a given program in assembly language as opposed to some high level language. The reasons most commonly given are execution speed and memory image size.

Execution speed, except in certain highly critical realtime applications, or certain high resolution graphics applications, is probably not a realistic reason to opt for assembly language. For example, a good C compiler with optimization (which precludes use of Turbo or Quick C) produces code which suffers only a 10-15% speed penalty compared with typical hand crafted assembly language code. It is possible to write assembly language code which will run faster than this, but few programmers have the requisite skills.

In most applications, a 10-15% speed penalty is simply irrelevant. It is unlikely that the typical user would even notice such a difference. In particular, programs which are highly interactive, and thus spend far and away the greatest amount of time waiting for user input are highly insensitive to such speed penalties. Many people don't realize that even assuming that the user is typing at a rate of 100 wpm (approximately 500 keystrokes/minute), the CPU is still spending the bulk of its time idling, waiting for the next keystroke.

There are, of course, always exceptions to virtually any rule, and there are most certainly exceptions to this rule. Word processors, for example, while actually accepting text input, are not speed critical. When performing global search and replace, or spell checking, for example, even a 10% penalty can become expensive on large documents. So there is a tradeoff to be made.

Assembly language programs cost considerably more than 10-15% more to develop than high level programs. The minutiae involved in managing a massive assembly language programming effort are overwhelming. Assembly language programs take MUCH longer to complete, in almost all cases, than high level programs do, a major contributory factor in the overall cost of development. Finally, projects developed in high level programming languages are much more likely to be easily ported to platforms based on processors other than the platform on which the project is developed, a very important consideration for a major project. The ability to port a project easily to other platforms increases the market for a product, thereby not only increasing the profitability of the product, but also helping to reduce the sale price of the product (larger market generally translates to lower per unit cost).

So the vendor or developer must analyze the relative impact of a small improvement in execution speed vs a large increase in development time and cost, which consequently translates to higher selling prices, thereby reducing the anticipated market for their product. In many cases, the tradeoffs do not merit choosing assembly language.

Let us turn now to binary image size (memory size). The advantages of small programs are clear, when examining programs which are, in the DOS world, TSRs (The MAC and AMIGA worlds have similar cases, though I am not sufficiently familiar with them to know what they are called). These programs are loaded into memory, and remain there until explicitly removed, which means that the memory they use is NOT available for other uses. Device drivers similarly use memory, precluding its use for other programs, and therefore also clearly benefit from small size. In the multi-tasking world (DESQview or PC/MOS, in the PC clone market), small executables also have an advantage, permitting more programs to be run "simultaneously" in a given memory configuration, though running multi-taskers in severely restricted memory configurations probably qualifies as a technical error.

What of normal, single tasking, single user environments (such as DOS, the MAC and AMIGA environments)? Besides the ego boost of creating a very small, very tight utility or application, what benefit is there in generating very small programs?

They take less disk space to store, but realistically, at least under DOS, lots of very small utilities may actually not achieve a significant savings in disk space, due to granularity of storage allocation. They load a little faster, in most cases.

But once again, the economics of the issue comes back to haunt us. It is not clear that the effort and expense of writing most applications in assembly language due to size considerations is an economically rational decision. The same economic pressures and considerations apply as do to the execution speed issue discussed above.

On to structure. I must take issue with Patrick O'Riva regarding the purpose and nature of "structured programming." While much of his definition is true, it is incomplete, and appears to reflect a misunderstanding of certain aspects of the structured approach to programming.

Firstly, it is entirely possible (and not altogether a rare occurrence) to write thoroughly unstructured code in PASCAL or C. One must take care to recognize the difference between references to a "block structured" language, as PASCAL and C both are, and "structured programming," which is really a totally separate issue.

Structured programming is an approach to programming that is thoroughly applicable to whatever language a project is being implemented in. It implies firstly a step-wise refinement approach to defining the solution to a problem which the program is to address (in other words, determining the nature of the desired goal, and an at least rational approach to reaching said goal). Secondly, it involves determining, to the extent possible, the nature and structure of the data that is to be processed by the program. Finally, it involves a top-down approach to the actual coding process.

Just what is a top-down approach? Essentially, this means that we code the high level functionality of the program first, programming simple "do nothing" stubs for the lower levels of the program. As necessary to test the high level code, we implement lower level functions, again, if needed, programming still lower level stubs. Assuming that the structured design approach of step-wise refinement was used to begin with, the actual coding should really amount to translating the logic flow diagrams, or pseudo-code, or whatever means of recording the refinement process was used, into actual program code. In the ideal situation, the program almost literally codes itself at this point.
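
In assembly the stubs can be as small as a lone ret. The sketch below shows the idea. The routine names (get_args, process, report) are hypothetical, and only the last one has been filled in so far.

~
cseg    segment
        assume  cs:cseg,ds:cseg
        org     100h
start:  call    get_args                ; the high level flow comes first
        call    process
        call    report
        mov     ax,4c00h
        int     21h

get_args proc   near                    ; stub: to be written
        ret
get_args endp

process proc    near                    ; stub: to be written
        ret
process endp

report  proc    near                    ; first stub to be filled in
        mov     dx,offset done_msg
        mov     ah,9                    ; print '$' terminated string
        int     21h
        ret
report  endp

done_msg db     'Done',13,10,'$'
cseg    ends
        end     start
~

The program assembles and runs at every stage, so each routine can be tested as soon as it replaces its stub.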

There is a myth that "structured programming" means "goto-less" programming. In fact, this is not the case. This myth came into being through misunderstanding of the rather harsh criticism of the "go to" which occurred in the computer science journals beginning in approximately the mid to late sixties. This criticism was based primarily upon the typically excessive use of the "go to" in FORTRAN and BASIC programming at the time. Such indiscriminate use of "goto" led to what has been called "spaghetti" code, code which is virtually impossible to trace or analyze.

In fact, there are many cases in programming where the goto is the most structured solution available. Structured coding techniques are intended to clarify and make easier the process of analysis, design and implementation of computer programs, not to define rigid, strictly enforced rules in the face of all reason.

Structured programming is ALWAYS the best approach to ANY computer program. If the internal requirements of the program, as regards speed or memory utilization, dictate the use of goto's, then use them. A properly documented GOTO can be far more "structured" than an undocumented string of modular function calls.

So, back to assembly language programming. When is it appropriate to choose assembly language to implement a program? First, and most obviously, when the speed or memory utilization requirements of the application demand the capabilities that well crafted assembly language offers. Second, perhaps not so obviously, when it is necessary to work at the hardware level a great deal. High level languages, even C, do not generally manipulate hardware registers efficiently. So, if your program makes frequent or widespread use of direct hardware manipulation, it is a likely candidate for assembly language.

Finally, and probably the most gratifying reason of all to choose assembly language, is when you want the satisfaction of having tackled a project in assembly and pushed the bits around to suit your purpose. There is little I can imagine that is more satisfying than to reach down into the microprocessor chip and twiddle those bits. Just be sure that you don't let your ego cloud your judgment, when the economics of the project are important (e.g., when a project is to be distributed commercially, or there is an urgent need for speedy completion).

I believe that all PROGRAMMERS (as opposed to casual computer users) should learn the assembly language for the machines on which they work. Besides offering the flexibility of shifting to assembly to meet a specific goal, learning assembly intimately familiarizes the programmer with the hardware on which s/he is working. The more you know about your hardware environment, the better off you are.

Editorial Rebuttal

I thank Mr. Keller very much for his article and agree with many of the points he has made. However I must still argue the points of size and speed and justification.

Whenever a program is user limited and will not be used in a multi-tasking environment as is often the case with a word processor and certain drawing programs, there may be little to be gained in assembly programming. Also there are programs which are DOS limited and little speed increase is possible.

Mr. Keller uses a figure of 10 to 15 percent speed penalty. My experience indicates a value closer to 300 to 400 percent though direct comparisons are difficult to make because the same programs are usually not written in both assembly and in C. The size difference seems to be a factor of 5 to 10. The two prime examples I can offer are both by Microsoft, and it can be assumed they make use of an optimizing compiler. Their assembler is approximately 110k in size. A86 while not compatible in syntax has comparable features. Its size is 22k and it assembles code in about one eighth the time.

Microsoft's programmers' editor is roughly 250k. Qedit is about 50k and is a mix of high level and assembly. You can grow gray hairs waiting for the MS editor to do a search and replace, but if you blink you'll miss it with Qedit. A fully capable full screen editor without the extras that make it a pleasure to use can easily be written in less than 5k. Give another 5k for features. What has MS gained with the extra 240k of code?

David has recently completed (though they are still adding modules) a database and accounting program for a multi-office company. A much abbreviated version was threatening to overflow their 384k limit. Investigation of a Dbase implementation indicated in excess of 500k. Data base sorts used to take 10 hours. They now take 20 minutes. Savings in processing time and entry time plus increased functionality suggest a savings of $5000 to $10,000 per month PER OFFICE. Code size? 35k. Are they unhappy about the $15,000 they've been charged for a program that will get lost in a single floppy disk?

Given the above examples, I must maintain that the use of high level language, when there is significant processing to be done, and when it will be used on a regular and continuing basis, benefits only the software corporation, and is detrimental to the end user.

On Structured Programming I fully agree with Mr. Keller and hope that he clarified any misconceptions I left you with. I prefer a bottom up construction, but that is only preference and has no effect on the end product.

Dave's notes: Mr. Keller mentions that it is possible to get great size/speed reductions, but that few programmers have the requisite skills. But to a large extent, it isn't the skill that makes the program, it's the toolbox. The C language is extremely close to assembly - MSC does a very good job of optimizing - and it takes care of the minutiae for you. The problem with this is that the libraries supplied with the compilers were written to handle very general cases. The printf() function is an extreme example, but it typifies the problem: If you use printf once in your program to print "Hello", it adds 30K of code!
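
For scale, here is a sketch of the same job done directly through DOS as a .COM program. The segment and label names are only illustrative. Counting the opcodes by hand, the code and the string together come to roughly twenty bytes.

~
cseg    segment
        assume  cs:cseg,ds:cseg
        org     100h
start:  mov     dx,offset hello
        mov     ah,9                    ; DOS print string, '$' terminated
        int     21h
        mov     ax,4c00h                ; terminate, errorlevel 0
        int     21h
hello   db      'Hello',13,10,'$'
cseg    ends
        end     start
~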

Another concern is that many high-level-language programmers don't even realize that with a tweak here, using putc instead of printf there, they can get much(!) better performance from their programs. Familiarity with the quirks of the compiler being used is a necessity... And even that isn't enough to get good performance out of a large program. AND, it decreases portability. So you're right back into the twiddling usually associated only with assembly.

I've found that if I use C for anything except flow control and one-shot tools, my programs start to get huge and slow, relative to anything that I've banged out in assembly. The database is a great example - it's a very complicated application, with a completely separated data engine & OS interface. If it had been written in C, it would be working in multiple code segments on a 286 with 4 megs and STILL take hours to run a balance, instead of 35K of code on an XT network terminal with half-hour runs.

The database was indeed a massive effort, but at this point it would be possible to strip out the engine and write with ease (and macros - lots of macros) anything that could be done in C or Dbase, and do it much better. And average runtime is cut at least in half, size by 50-90%. With a reasonably solid and application-specific toolbox, the advantages TO THE CUSTOMER of assembly programming completely eclipse those of any other language and the disadvantages of assembly itself.

Portability is another issue entirely. If you NEED portability and fast development, and IF run time and general productivity are not a concern, then C probably makes more sense. There's this nagging feeling, though, that if the UNIX OS core had been written in assembly by a reasonably good programmer, and been ported to new systems in kind, that the university systems would be clipping instead of slogging.

As far as structured programming goes, I usually design as I go along, and end up with a functional (even rational) structure. Call it "random-access programming." This is probably because I find it difficult to call a routine until I've laid out the calling conventions for it, and while I'm doing that I'll remember another routine that should be written for another module... This is not the generally recommended method, I gather.

Accessing the Command Line Arguments in Assembly Language Programs

By Thomas J. Keller
P.O. Box 14069
Santa Rosa, CA, 95402


If you're like me, you program in several languages, under several different operating systems. Under DOS, one very useful feature is the capability to pass arguments to a program as part of the invocation command line. The use of command line arguments significantly increases the power and flexibility of your programs, as well as improving the "professional look." Many languages support this capability with intrinsic or library routines which facilitate access to these command line arguments. Assembly language, of course, does not. What is a programmer to do?

As it turns out, it is quite simple to access the command line arguments under DOS. DOS places the so-called "command tail" (the command line less the actual program name) into a buffer area reserved in the PSP (Program Segment Prefix). This same buffer area also serves as the default DTA (Disk Transfer Area).

It is extremely important that you parse the command tail, if you plan to do so at all, immediately upon entering your program, or at least copy it to a buffer of your own. DOS writes into this DTA buffer whenever a call uses the DTA (the FCB file functions and the find first/find next directory searches, for example), which will destroy the command tail information.

In a .COM format program, the PSP is the first 100h (256) bytes of the program memory image, making access quite straightforward. How do we locate the PSP in a .EXE format program, however?

Fortunately, DOS sets the ES segment register to point to the beginning of the PSP under both .COM and .EXE programs. It happens to be the case that DOS also sets all other segment registers to the same location for a .COM program, simply because .COM programs reside in one and only one segment. In an .EXE invocation, the DS and ES registers are both set to point to the segment whose first 100h bytes are the PSP. Most .EXE programs then load DS with their own data segment, so save the PSP segment (or leave it in ES) if you will need the command tail later.

The DTA begins at offset 80h (128d) from the beginning of the PSP. When it contains a command tail, the byte at 80h contains the count of the number of bytes actually in the command tail, and the command tail string begins at offset 81h (129d) from the beginning of the PSP. The first byte of this string is always a blank (20h), and the string is terminated with a <cr> (0dh).
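
As a sketch of how little code this takes, the .COM program below copies the command tail into its own buffer before anything can disturb it, then echoes the copy to the screen. The names tail_len and tail_buf are made up for illustration. It relies on DS and ES both pointing at the PSP, so an .EXE would need to set up the segment registers first.

~
cseg    segment
        assume  cs:cseg,ds:cseg,es:cseg
        org     100h
start:  jmp     main

tail_len dw     0
tail_buf db     128 dup(0)              ; up to 127 bytes plus a terminator

main:   cld
        mov     si,80h                  ; PSP:80h holds the length byte
        lodsb                           ; AL = length, SI now at 81h
        mov     cl,al
        xor     ch,ch                   ; CX = number of bytes in the tail
        mov     tail_len,cx
        mov     di,offset tail_buf
        rep     movsb                   ; copy before DOS reuses the area
        mov     byte ptr [di],0         ; make it a zero terminated string

        mov     cx,tail_len             ; echo the copy to show it worked
        jcxz    done                    ; nothing to show for an empty tail
        mov     dx,offset tail_buf
        mov     bx,1                    ; handle 1 = standard output
        mov     ah,40h                  ; write to file or device
        int     21h
done:   mov     ax,4c00h
        int     21h
cseg    ends
        end     start
~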

The exact means you use to parse the command line arguments is, of course, up to you. One possible approach is as follows:

  1. Use the data definition directives to set aside any memory you will need to store information about command line arguments (e.g., buffers for file names, byte or word values for flags and numeric arguments, etc.).
  2. Design a routine that starts scanning the command tail string for arguments. A 'first fit' (the shortest match possible) scheme is easiest to program. As each item is located and identified as to type and purpose, store the appropriate information in the data areas you have already set aside.
  3. Have a "usage" message defined, and a small routine to print it to the screen (a good idea is to print it to STDERR). Invoke this routine when the first argument on the command line is a '?,' or, if the program requires arguments, when it is invoked without them.
  4. You now have the switches, filenames, and other command line arguments available. Write your program to use them appropriately.
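
The front end of such a routine might be sketched as follows for a .COM program. It only covers skipping the leading blanks and the usage message. The program name in the message and all of the labels are hypothetical, and the argument handling itself is left as a comment.

~
cseg    segment
        assume  cs:cseg,ds:cseg
        org     100h
start:  jmp     main

use_msg db      'usage: myprog [?] file',13,10  ; hypothetical program name
use_len equ     $ - use_msg

main:   cld
        mov     si,81h                  ; first byte of the command tail
skip:   lodsb
        cmp     al,' '
        je      skip                    ; pass over leading blanks
        cmp     al,9
        je      skip                    ; and tabs
        cmp     al,0dh
        je      usage                   ; only the <cr>: no arguments given
        cmp     al,'?'
        je      usage                   ; first argument is a '?'
        dec     si                      ; SI -> first argument character
        ; ... identify and store the arguments here ...
        mov     ax,4c00h
        int     21h

usage:  mov     dx,offset use_msg
        mov     cx,use_len
        mov     bx,2                    ; handle 2 = STDERR
        mov     ah,40h                  ; write to file or device
        int     21h
        mov     ax,4c01h                ; errorlevel 1 after a usage message
        int     21h
cseg    ends
        end     start
~

Writing the message through handle 2 keeps it on the screen even when the program's normal output has been redirected.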

Included in this issue of Assembly Language Magazine is a source listing, GPFILT.ASM, which is a sample template for a general purpose assembly language filter. This program provides an excellent sample of command line argument parsing and one way of using these arguments (though the method used here is not the same as the one described above).

Original Vector Locator

by Rick Engle

November, 1989

INTTEST is a small assembly program which attempts to find the original address of the INT 21h function handler. This is valuable if you need to be able to make calls to the original INT 21h function even if a TSR or other program has that interrupt hooked or trapped. This gives your program secure control over the interrupt regardless of who is using it.

I did this prototype in an attempt to make certain programs somewhat immune to the effects of destructive viruses that may intercept INT 21h and use it for their own purposes. This technique could be used to find the original address of other MS-DOS interrupts. I wrote test programs to dump out the address of MS-DOS interrupts (such as INT 21h) and then disassembled portions of MS-DOS at those addresses to identify a stable signature of the interrupt. Then by following the chain to MS-DOS through the PSP (Program Segment Prefix) at offset 5h, I was able to find the segment:offset of the address of the handler for old CP/M calls.

This pointed to the correct segment in memory of MS-DOS and from there, after moving my offset backwards about 100h in memory, I scanned for my interrupt signature. Once I got a hit, I calculated the address of the interrupt and then could make calls to INT 21h at the segment:offset found. This program is a "brute-force" method of finding the original address. If anyone finds or has a better way, I'd be very interested in hearing about it.

NOTE: I have tested this program successfully on MS-DOS 2.11, 3.20, and 3.30.

~ 
; -----------------------------------------------------------------------
; INTTEST.ASM November, 1989 Rick Engle
;
; Finds the address of the INT 21h function dispatcher to
; allow the user to make INT 21h calls to the original
; interrupt regardless of who or what has INT 21h hooked.
;
; -----------------------------------------------------------------------
;
print macro print_parm
push ax
push dx
mov ah,9
mov dx,offset print_parm
int 21h
pop dx
pop ax
endm

; -----------------------------------------------------------------------
; - Start of program -
; -----------------------------------------------------------------------


cseg segment para public 'code'
assume cs:cseg,ds:cseg

org 100h

int_test proc far

print reboot_first

print int_address
mov cl,21h
mov ah,35h ; get interrupt vector
mov al,cl ; for interrupt in cl
int 21h ; do it

mov ax,es ; lets display the es
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print colon

mov ax,bx ; lets display the bx
mov di,offset out_byte
call conv_word
print out_byte
print crlf

print display_header2

mov ah,byte ptr cs:[05h] ; Get info from the PSP
mov al,byte ptr cs:[06h] ;
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print dash

mov ah,byte ptr cs:[07h] ;
mov al,byte ptr cs:[08h] ;
mov di,offset out_byte
call conv_word
print out_byte
print dash
mov ah,byte ptr cs:[09h] ;
mov al,byte ptr cs:[0ah] ;
mov di,offset out_byte
call conv_word
print out_byte
print crlf

print display_header

mov ah,byte ptr cs:[50h] ; Address of INT 21h opcode
mov al,byte ptr cs:[51h] ; in the PSP

push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print dash

mov ah,byte ptr cs:[52h] ;
mov al,byte ptr cs:[53h] ;
mov di,offset out_byte
call conv_word
print out_byte
print dash
mov ah,byte ptr cs:[54h] ;
mov al,byte ptr cs:[55h] ;
mov di,offset out_byte
call conv_word
print out_byte
print crlf

print far_address
mov ax,word ptr cs:[08h] ;
mov segm,ax
push cs ; set es = cs
pop es ;
mov di,offset out_byte
call conv_word
print out_byte
print colon

mov ax,word ptr cs:[06h] ;
mov off,ax
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print crlf

mov ax,segm
mov es,ax
mov di,off
inc di

print function_jmp
mov ax,word ptr es:[di+2] ;
mov segm2,ax
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print colon

mov ax,segm
mov es,ax
mov di,off

inc di
mov ax,word ptr es:[di] ;
mov off,ax ; save found offset of int 21h
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print crlf

;-----------------------------------------------------------------
;si = string di = string size es:bx = pointer to buffer to search
;ax = number of bytes in buffer to search. Zero flag set if found
;-----------------------------------------------------------------

mov ax,segm2
mov es,ax ;segment
mov bx,off ;offset
sub bx,0100h ;backup a bit to catch DOS
mov si,offset dos_sig ;start at modified byte
mov di,dos_sig_len ;enough of a match
mov ax,0300h ;# of bytes to search
call search ;use our search
jnz sig_not_found ;didn't find int 21h signature
mov START_SEGMENT,es ;set page
mov START_OFFSET,ax ;address of found string

print good_address
mov ax,START_SEGMENT ;
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print colon

mov ax,START_OFFSET ;
mov off,ax ; save found offset of int 21h
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print crlf

push cs ; set es = cs
pop es
mov bx,START_OFFSET
mov ax,START_SEGMENT
mov word ptr [OLDINT21], bx
mov word ptr [OLDINT21+2],ax

mov dx,offset test_message
mov ah,9
call dos_function
jmp terminate
sig_not_found:

print no_int21_found
terminate: mov ax,4c00h ; terminate process
int 21h ; and return to DOS

out_byte db 'XXXX'
db '$'
colon db ':$'
dash db '-$'
crlf db 10,13,'$'
reboot_first db 13,10,'INTTEST 1.0',13,10
db 'Reboot before running this, or',13,10
db 'make sure INT 21h is not hooked',13,10,13,10,'$'
display_header db 'HEX data at PSP address 50h is : $'
display_header2 db 'HEX data at PSP address 05h is : $'
int_address db 'Original INT 21h address is : $'
function_jmp db 'Jump address at DOS dispatcher : $'
far_address db 'Far address of DOS dispatcher : $'
good_address db 'Good INT 21h address found at : $'
test_message db 13,10,10,'This message is being printed using the INT '
db '21h Interrupt',13,10
db 'Found by Brute Force!!!!',13,10,10,'$'
no_int21_found db 13,10,'Int 21h address not found!$'
segm dw 0
segm2 dw 0
off dw 0
START_OFFSET dw 0 ;top addr shown on screen
START_SEGMENT dw 0
;dos_sig db 08Ah, 0E1h, 0EBh ; mov ah,cl
; ; jmp short label
dos_sig db 080h, 0FCh, 0F8h ; cmp ah,0F8h
dos_sig_len equ $ - dos_sig
OLDINT21 dd ? ; Old DOS function interrupt vector

int_test endp

; -----------------------------------------------------------------------
; - -
; - Subroutine to convert a word or byte to hex ASCII -
; - -
; - call with AX = binary value -
; - DI = address to store string -
; - -
; -----------------------------------------------------------------------

conv_word proc near

push ax
mov al,ah
call conv_byte ; convert upper byte
pop ax
call conv_byte ; convert lower byte
ret ; and return
conv_word endp

conv_byte proc near

push cx ; save cx

sub ah,ah ; clear upper byte
mov cl,16
div cl ; divide binary data by 16
call conv_ascii ; the quotient becomes the
stosb ; ASCII character
mov al,ah
call conv_ascii ; the remainder becomes the
stosb ; second ASCII character
pop cx ; restore cx
ret
conv_byte endp

conv_ascii proc near ; convert value 0-0Fh in al
add al,'0' ; into a "hex ascii" character
cmp al,'9'
jle conv_ascii_2 ; jump if in range 0-9
add al,'A'-'9'-1 ; offset it to range A-F
conv_ascii_2: ret ; return ASCII character in al
conv_ascii endp

;-----------------------------------------------------------------------
; This routine does a dos function by calling the old interrupt vector
;-----------------------------------------------------------------------
assume ds:nothing, es:nothing
dos_function proc

; mov cl,ah ;move our function # into cl
pushf ;These instructions simulate
;an interrupt
cli ;turn off interrupts
call CS:OLDINT21 ;Do the DOS function
sti ;enable interrupts

push cs
pop ds
push cs
pop es
ret

dos_function endp

;-----------------------------------------------------------------
;si = string di = string size es:bx = pointer to buffer to search
;ax = number of bytes in buffer to search. Zero flag set if found
;-----------------------------------------------------------------
SEARCH PROC NEAR ;si points at string
CLD ;make LODSB/SCASB/CMPSB scan upward
PUSH BX
PUSH DI
PUSH SI
XCHG BX,DI ;string size, ptr to data area
MOV CX,AX ;# chars in segment to search
BYTE_ADD:
LODSB ;char for first part of search
NEXT_SRCH:
REPNZ SCASB ;is first char in string in buffer
JNZ NOT_FOUND ;if not, no match
PUSH DI ;save against cmpsb

PUSH SI
PUSH CX
LEA CX,[BX-1] ;# chars in string - 1
JCXZ ONE_CHAR ;if one char search, we have found it
REP CMPSB ;otherwise compare rest of string
ONE_CHAR:
POP CX ;restore for next cmpsb
POP SI
POP DI
JNZ NEXT_SRCH ;if zr = 0 then string not found
NOT_FOUND:
LEA AX,[DI-1] ;ptr to last first character found
POP SI
POP DI
POP BX
RET ;that's all
SEARCH ENDP

cseg ends
end int_test

~

How to call DOS from within a TSR

by David O'Riva


Just a few ramblings on interactions between TSRs & DOS.

Cardinal rule: DON'T CALL DOS UNLESS YOU'RE SURE OF THE MACHINE STATE!!!

There are a few interrupt calls and memory locations you can play with to get this information; a list & explanation of sorts is below. You don't call DOS when you've interrupted the machine in the middle of a DOS call because:

  1. The stack is unstable as far as DOS is concerned, and you'll probably end up overwriting DOS data or going into the weeds.
  2. DOS only keeps one copy of certain crucial information as it processes a disk-related request, e.g. BPBs, current sectors, FAT memory images, fun stuff like that. If you interrupt it in the middle, ask for something different, then go back, you will probably destroy your disk, possibly beyond recall.
  3. DOS simply was not designed to be re-entrant. The first 9 or 10 function calls are cool most of the time, the rest are strictly single-processing-stream functions.

However, there is hope. And (extra bonus) the technique happens to be compatible with most true MS-DOS releases and many, many brand-name DOSes, as well as most clones.

What you need to do, after determining that the user wants to pop your program up, is set a few flags. One of them prevents your program from being popped up AGAIN while the current DOS call is completing, and the other tells a timer trap routine to start looking for DOS to finish its current process (usually a matter of split seconds). When the timer routine detects that DOS is no longer active, it grabs control of the system and runs your TSR.

At this point, all DOS calls are as safe as they are for a normal application.

What follows is an outline of the code necessary to activate a TSR that uses DOS calls. Depending on the TSR, other things may need to be done in these routines as well. Definitely make sure you understand the interactions of the various routines before TSRing your background disk formatter.

Okay, nitty-gritty time...


You need 5 main chunks of code to do this right:

a) a bit of extra initialization code

b) your TSR's main program

c) activation request server (usually a keypress trap)

d) timer tick inDOS monitor

e) DOS busy loop monitor

And here's what they do:

a) asks DOS for the location of the inDOS flag, and stores that away.

b) does whatever you want it to.

c) when the activation requirement is sensed (the user pressed the hot-key, the print buffer is empty, the modem is sending another packet, whatever) the following steps need to be taken:

  1. Have we already tried to activate, and are we waiting for DOS to finish? If so, ignore the activation request.
  2. Check the inDOS flag. If we're not in DOS, activate as usual.
  3. Otherwise, set a flag indicating that the TSR wants to activate, but can't right now.
  4. Return to DOS.

d) this is linked in AFTER interrupt 08 - that is, when this interrupt happens, call the original INT 08 handler, then run your checking code:

  1. Does the TSR want to run? If not, return from the interrupt.
  2. Check the inDOS flag. If DOS is idle, run your code as normal.
  3. Return from the interrupt.

* * NOTE: This code has to run FAST. If it's poorly coded, you may very well see degraded performance of the entire system.


e) link in to the DOS keyboard busy loop - INT 28. This interrupt is called when DOS is waiting for a keystroke via functions 1,3,7,8,0A, and 0C. If the TSR takes control from this loop, then DOS functions ABOVE 0C are safe to use. Functions 0 - 0C are NOT safe to use.

  1. Does the TSR want to run? If not, continue down the interrupt chain.
  2. Run the TSR as usual.
  3. Continue down the interrupt chain.
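
As a rough sketch of chunk (c), assume the hot-key trap calls a near routine once it has recognised the activation request. The names request_activation, tsr_busy, tsr_pending, indos_ptr and tsr_main are made up for illustration and do not come from a real program. indos_ptr is assumed to have been filled in at install time, and tsr_main stands for chunk (b) and is assumed to preserve every register it touches.

```asm
; Sketch only. All names here are hypothetical.
tsr_busy        db      0       ; non-zero while the TSR is waiting or running
tsr_pending     db      0       ; non-zero while waiting for DOS to finish
indos_ptr       dd      0       ; far pointer to the inDOS flag (set at install)

request_activation proc near
        push    ds
        push    bx
        cmp     cs:tsr_busy,0           ; step 1: already waiting or running?
        jne     ra_done                 ;   yes, ignore this request
        mov     cs:tsr_busy,1           ; lock out further requests
        lds     bx,cs:indos_ptr         ; DS:BX -> inDOS flag
        cmp     byte ptr [bx],0         ; step 2: is DOS idle?
        jne     ra_defer                ;   no, try again later
        call    tsr_main                ;   yes, activate as usual
        jmp     short ra_done
ra_defer:
        mov     cs:tsr_pending,1        ; step 3: ask the INT 08/INT 28 hooks
ra_done:
        pop     bx
        pop     ds
        ret                             ; step 4: back to the interrupted code
request_activation endp
```

Setting tsr_busy before testing inDOS means a second hot-key press arriving in the meantime is simply ignored.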

NOTES: The first action your main TSR code should take is to clear the flag that indicates the TSR is trying to run. If this is not done, your TSR will re-enter itself at least 18.2 times per second... i.e. a MESS.

The last action your main TSR code should take before leaving is to RESET the flags that prevent the TSR from being activated. If you forget to do this, your TSR will run once, then never again... I know from personal experience that this is frustrating to a dangerous degree.
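
A minimal sketch of that flag handling, using hypothetical names (tsr_main, tsr_pending, tsr_busy) and assuming CS addresses the TSR's own data:

```asm
; Sketch only. All names here are hypothetical.
tsr_busy        db      0       ; non-zero blocks new activation requests
tsr_pending     db      0       ; non-zero means "run me when DOS is idle"

tsr_main proc near
        mov     cs:tsr_pending,0        ; FIRST: stop the timer hook from
                                        ;  starting us again on the next tick
        ; The TSR's real work goes here. It must save and restore
        ; every register it uses.
        mov     cs:tsr_busy,0           ; LAST: allow the next activation
        ret
tsr_main endp
```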

Some of this code is really complicated, so don't get discouraged if it takes a few days of tweaking and hair-pulling to get it right.

All numbers in this text are in hex.

The timer tick routine is really touchy, at least the way I wrote it. Be very sure yours is reliable if you distribute a program with this structure.

The reason that functions 0-0C are separated from the rest of the DOS calls as far as re-entrancy is concerned is that they use an entirely separate stack. I believe this must have been done specifically for the purpose of helping TSR writers.

Does anyone know why the hell Microsoft built these neat functions into DOS and then refused to acknowledge their existence?

INTERRUPT & FUNCTION CALLS

INT 08
Timer tick interrupt. Called 18.2 times a second on IRQ 0.
The interrupt is triggered by timer 0.
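
One way the INT 08 hook from chunk (d) might look, as a sketch rather than a tested program. It uses the documented DOS functions 35 (get vector) and 25 (set vector). The names old_int08, new_int08, hook_int08, tsr_pending, indos_ptr and tsr_main are hypothetical. DS is assumed equal to CS during installation, as in a .COM program, and tsr_main is assumed to preserve all registers.

```asm
; Sketch only. All names here are hypothetical.
old_int08       dd      0       ; original timer tick handler
tsr_pending     db      0       ; non-zero when the TSR is waiting to run
indos_ptr       dd      0       ; far pointer to the inDOS flag

hook_int08 proc near                    ; call once at install; uses AX,BX,DX,ES
        mov     ax,3508h                ; get INT 08 vector -> ES:BX
        int     21h
        mov     word ptr cs:old_int08,bx
        mov     word ptr cs:old_int08+2,es
        mov     dx,offset new_int08     ; DS:DX -> new handler
        mov     ax,2508h                ; set INT 08 vector
        int     21h
        ret
hook_int08 endp

new_int08 proc far
        pushf                           ; fake an INT so the old handler's
        call    cs:old_int08            ;  IRET comes back here
        cmp     cs:tsr_pending,0        ; step 1: does the TSR want to run?
        je      t_exit
        push    ds
        push    bx
        lds     bx,cs:indos_ptr         ; DS:BX -> inDOS flag
        cmp     byte ptr [bx],0         ; step 2: is DOS idle?
        pop     bx                      ; POP leaves the flags alone
        pop     ds
        jne     t_exit                  ;   no, look again next tick
        call    tsr_main
t_exit: iret                            ; step 3
new_int08 endp
```

The common path (TSR not waiting) costs one compare and one jump after the original handler returns, which keeps the per-tick overhead small.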


INT 21, FUNCTION 34

inDOS flag address request. This function returns the address of the "inDOS flag" as a 32 bit pointer in ES:BX. The inDOS flag is a byte that is zero when DOS is not processing a function request, and is non-zero when DOS is in a function.

NOTE: This function is officially specified as RESERVED. Its use could change in future versions of DOS, and it can only be guaranteed to work in straight IBM PC-DOS or MS-DOS versions 2.0 to 3.30. Use at your own risk.
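
For illustration, the initialization code from chunk (a) could fetch and store the pointer roughly like this. The names get_indos and indos_ptr are hypothetical.

```asm
; Sketch only. All names here are hypothetical.
indos_ptr       dd      0       ; far pointer to the inDOS flag

get_indos proc near                     ; call once at install; destroys AX
        push    es
        push    bx
        mov     ah,34h                  ; reserved call: get inDOS flag address
        int     21h                     ; pointer comes back in ES:BX
        mov     word ptr cs:indos_ptr,bx
        mov     word ptr cs:indos_ptr+2,es
        pop     bx
        pop     es
        ret
get_indos endp
```

Later code can test the flag with LDS BX,CS:indos_ptr followed by CMP BYTE PTR [BX],0.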


INT 28

DOS keyboard busy loop. This interrupt is called when DOS is waiting for a keystroke in the console input functions. When this interrupt is issued, it is safe to use any DOS call ABOVE 0C. Calls to DOS functions 0 - 0C will trash the stack and do nasty things.

NOTE: This function is officially RESERVED. See the note for function 34 above.
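
A sketch of the INT 28 hook from chunk (e), again with hypothetical names (old_int28, new_int28, tsr_pending, tsr_main). It assumes old_int28 was saved at install time with function 35 and the vector set with function 25, and that tsr_main preserves all registers and only uses DOS functions above 0C.

```asm
; Sketch only. All names here are hypothetical.
old_int28       dd      0       ; previous INT 28 handler
tsr_pending     db      0       ; non-zero when the TSR is waiting to run

new_int28 proc far
        cmp     cs:tsr_pending,0        ; step 1: does the TSR want to run?
        je      i28_chain               ;   no, just pass the interrupt on
        call    tsr_main                ; step 2: run the TSR
i28_chain:
        jmp     cs:old_int28            ; step 3: continue down the chain; the
                                        ;  old handler's IRET ends the interrupt
new_int28 endp
```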


AUTHOR'S NOTE

First, the references I listed are really great. They've helped me out a lot over the past few years. Second, if your hard disk gets munched by your TSR, read the disclaimer.

CAVEAT PROGRAMMER & Disclaimer

The techniques described in here are, for the most part, UNDOCUMENTED by Microsoft or IBM. This means that you CAN NOT BE SURE that they will work on all IBM clones, and could even cause crashes on some! The timer tick interrupt provides some essential system services, and messing with it incautiously can wreak havoc.

The program outlines presented here are what worked for me on my system, and what should work on about 90% of the clones out there. However, I still suggest that you find a reference for all of the interrupts and functions described here. This file is meant to be a guideline and aid only.


REFERENCES

DOS Programmer's Reference, by Terry R. Dettmann.
$22.95, QUE Corporation

IBM DOS Technical Reference, version 3.30
$(?), International Business Machines Corp.

I can't remember how much it cost...

Environment Variable Processor

by David O'Riva

~ 
PAGE 60,132
TITLE Q43.ASM - editor prelude & display manager
;
;
COMMENT~***********************************************************************
* ---===> All code in this file is copyright 1989 by David O'Riva <===--- *
*******************************************************************************
* *
* The above line is there only to prevent people (or COMPANIES) from *
* claiming original authorship of this code and suing me for using it. *
* You're welcome to use it anyhow you care to. *
* *
*
* Environment Variable Finder & Processor - *
* *
* The "get_environment_variable" routine is complete in itself, and can *
* be extracted and used in anything else that needs one. Just copy the entire
* routine, from the header to the endp (don't forget the RADIX and DW).
* The routine currently uses 315 (decimal) bytes.
*
*
* This program's purpose is to invoke an editor (or any program, really)
* with a specific machine state depending on environment variables. (Yeah!!!)
* Currently it is set up to change my screen to one of various modes, with
* the variable ED_SCRMODE being set to:
* 100/75 = 100 columns by 75 lines
* 132/44 = 132 by 44
* 80/44 = 80 by 44
* ...and then to EXEC my editor (qedit) with that mode set. You could
* set the screen back to the standard 80x25 after the EXEC returns.
*
* Note: The 80/44 set code should work on most (or all?) EGAs. The
* other two high-res text modes use built-in extended BIOS modes in my
* Everex EV-657 EGA card (the 800x600 version) w/multisync monitor. If you've
* got one of those, you're in luck - no mods needed. It will also work on the
* EV-673 EVGA card w/appropriate monitor.
*
*
* Note to BEGINNERS: This is not an example of "good" asm code. This
* file is an example of what happens when you're up at 1:00am with too much
* coffee and a utility that needs to be fixed.
*
*
*
* This is a COM program, not an EXE. Remember to use EXE2BIN.
*
*
*
******************************************************************************~
;
;
TRUE EQU 0FFH
FALSE EQU 0
;
;******************************************************************************
;
CODE SEGMENT PARA PUBLIC 'CODE'
ASSUME CS:CODE,DS:CODE,ES:CODE,SS:CODE
;
MAIN PROC NEAR


ORG 100H
entry:
;------------------------------------------------------------------------------
; set the screen to the correct mode
;------------------------------------------------------------------------------

call set_screen_mode

;------------------------------------------------------------------------------
; check for pathname change in environment
;------------------------------------------------------------------------------

call set_exec_name

;------------------------------------------------------------------------------
; setup memory and run the program
;------------------------------------------------------------------------------
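;NOTE: SP still points near the top of the original memory block, well
; above ENDRESIDENT, so the stack sits in memory this call gives back.
; A sturdier version would move the stack below ENDRESIDENT first.
MOV BX,OFFSET ENDRESIDENT ;deallocate unnecessary memory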
MOV CL,4
SHR BX,CL
INC BX
MOV AH,04AH
INT 021H

MOV AX,CS ;exec the program
MOV INSERT_CS1,AX
MOV INSERT_CS2,AX
MOV INSERT_CS3,AX
MOV AX,04B00H
MOV BX,OFFSET EXECPARMS
MOV DX,OFFSET PROGNAME
INT 021H
;------------------------------------------------------------------------------
; clean up and leave
;------------------------------------------------------------------------------
MOV AH,04DH ;get return code from program
INT 021H

MOV AH,04CH ;leave
INT 021H
;
;******************************************************************************
;
; data
;
PROGNAME DB 'F:\UTILITY\MISC\Q.EXE',0
db 100 dup(' ')


EXECPARMS DW 0 ;use current environment
DW 080H ;use current command tail
INSERT_CS1 DW ?
DW 05CH ;use current FCB's
INSERT_CS2 DW ?
DW 06CH
INSERT_CS3 DW ?


ENDRESIDENT:

;******************************************************************************
; more data - used only for setup & checks
;
valid_modes db '80/44 '
db '132/44'
db '100/75'
screen_mode db ' '

mode_jump dw goto_43
dw goto_132
dw goto_100


ev_mode db 'ED_SCRMODE',0
ev_pathname db 'ED_PATH',0



PAGE
;******************************************************************************
; set_screen_mode -
;
;
; ENTRY:
;
; EXIT:
;
; DESTROYED:
;
;------------------------------------------------------------------------------
set_screen_mode:
MOV AH,012H ;check for presence of EGA/VGA
MOV BL,010H
INT 010H
CMP BL,010H ;BL changed? (an EGA/VGA returns its
; memory size code, 0-3, here)
JE ssm_no_ega ;This is no EGA!
;------------------------------------------------------------------------------
; check environment for correct mode set -
; don't set mode if none specified
;------------------------------------------------------------------------------
mov si,offset ev_mode
mov di,offset screen_mode
mov cx,6 ;accept 6 chars
mov ax,4 ;get fixed-length string
call get_environment_variable

and ax,0feh
jne ssm_no_env_mode
;------------------------------------------------------------------------------
; look up the variable's value in my mode table
;------------------------------------------------------------------------------
mov bx,0
mov di,offset valid_modes

ssm_check_mode: mov dx,di
mov si,offset screen_mode
mov cx,6
repe cmpsb
je ssm_found_mode
mov di,dx
add di,6
inc bx
cmp bx,3
jne ssm_check_mode
jmp ssm_bad_mode
;------------------------------------------------------------------------------
; set the correct screen mode
;------------------------------------------------------------------------------
ssm_found_mode: shl bx,1
jmp mode_jump[bx]

goto_100: mov ax,0070h
mov bx,8
int 010h
jmp ssm_leave

goto_132: mov ax,0070h
mov bx,0bh
int 010h
jmp ssm_leave


goto_43: MOV AX,3
INT 010H
MOV AX,01112H ;set to 8x8 chars (43/50 lines)
MOV BL,0
INT 010H
ssm_no_env_mode:
ssm_bad_mode:
ssm_no_ega:
ssm_leave:
ret



PAGE
;******************************************************************************
; set_exec_name -
;
;
; ENTRY:
;
; EXIT:
;
; DESTROYED:
;
;------------------------------------------------------------------------------
set_exec_name:
;
; If you want, write a chunk here that will read an alternate pathname
; for the editor to be executed from a different variable (like ED_PATH)
; I was going to do it, but ran out of time and need. (My editor never wanders
; around!)
;
ret









PAGE
;******************************************************************************
; Get_environment_variable -
;
;
;
;
; ENTRY: ds:[si] -> ASCIIZ environment variable name
; ds:[di] -> (up to) 129 byte buffer for string
; es = segment of program's PSP
; cx = maximum # of characters to accept
; al = variable return format
; 0 - return string in ASCIIZ format
; xxxxx 0 ........
;
; 1 - return string in DOS string ('$' terminated) format
; xxxxxxxx $ ........
;
; 2 - return string in DOS input buffer format
; maxchrs,numchrs,xxxxxxxx CR ............
;
; 3 - return string in command tail format
; numchrs,xxxxxxxxxxx CR ..........
;
; 4 - return string in fixed-length (CX chars) format
; xxxxxx
;
; EXIT: al = return codes:
; bit 0 - if set, string was longer than max, truncated
; 1 - if set, string did not exist
; 2 - if set, invalid return format requested
;
; DESTROYED: ah is undefined
;
;------------------------------------------------------------------------------
.RADIX 010h
gev_flags dw ?

Get_environment_variable:

push bx
push cx
push dx
push si

push di
push es

mov ah,0 ;format code arrives in AL only
mov cs:gev_flags,ax
mov es,es:[02c] ;es -> program's environment
;------------------------------------------------------------------------------
; make sure the environment has at least one variable in it
;------------------------------------------------------------------------------
mov ax,es:[0]
cmp ax,0
jne gev_exists
mov ax,2
jmp gev_leave
;------------------------------------------------------------------------------
; find length of search string
;------------------------------------------------------------------------------
gev_exists: push cx
push di
mov di,si
mov cx,0ffff
gev_sourcelen:
inc cx
mov al,[di]
inc di
cmp al,0
jne gev_sourcelen

cmp cx,0
jne gev_startfind

pop di
pop cx
mov ax,2
jmp gev_leave
;------------------------------------------------------------------------------
; find string
;------------------------------------------------------------------------------
gev_startfind: mov bx,cx
mov dx,si
mov di,0
gev_checknext:
mov cx,bx
mov si,dx
repe cmpsb
je gev_found?

gev_tonextvar: mov cx,0ffff
mov al,0
repne scasb

cmp es:[di],al
jne gev_checknext

mov ax,2
pop di
pop cx
jmp gev_leave

gev_found?: cmp byte ptr es:[di],'='
jne gev_tonextvar
;------------------------------------------------------------------------------
; found the string in the environment
;------------------------------------------------------------------------------
gev_found: inc di
mov si,di
pop di
pop cx
cmp cs:gev_flags,1
ja gev_ibufform
;------------------------------------------------------------------------------
; move normal string with 0 or $ terminator
;------------------------------------------------------------------------------

gev_nextchar0:  mov     al,es:[si] 
cmp al,0
je gev_setterm0
mov ds:[di],al
inc si
inc di
dec cx
jne gev_nextchar0
mov al,es:[si]
cmp al,0
je gev_setterm0
mov al,1

gev_setterm0: cmp cs:gev_flags,0
jne gev_setterm1
mov byte ptr ds:[di],0 ;ASCIIZ string
jmp gev_leave

gev_setterm1: mov byte ptr ds:[di],'$' ;DOS string
jmp gev_leave
;------------------------------------------------------------------------------
; move string into DOS input buffer format (int 21 function 0A)
;------------------------------------------------------------------------------
gev_ibufform: cmp cs:gev_flags,2
jne gev_ctailform

mov ds:[di],cl ;set max length
inc di
mov bx,di
inc di
mov dx,0

gev_nextchar2: mov al,es:[si]
cmp al,0
je gev_setterm2
mov ds:[di],al
inc si
inc di
inc dx
dec cx
jne gev_nextchar2
mov al,es:[si]
cmp al,0

je gev_setterm2
mov al,1

gev_setterm2: mov byte ptr ds:[di],0d ;add carriage return
mov ds:[bx],dl ;set actual # of chars
jmp gev_leave
;------------------------------------------------------------------------------
; move string into command tail format
;------------------------------------------------------------------------------
gev_ctailform: cmp cs:gev_flags,3
jne gev_fixedform

mov bx,di
inc di
mov dx,0

gev_nextchar3: mov al,es:[si]
cmp al,0
je gev_setterm3
mov ds:[di],al
inc si
inc di
inc dx
dec cx
jne gev_nextchar3
mov al,es:[si]
cmp al,0
je gev_setterm3
mov al,1

gev_setterm3: mov byte ptr ds:[di],0d ;set carriage return
mov ds:[bx],dl ;set # of bytes
jmp gev_leave
;------------------------------------------------------------------------------
; move string into fixed-length area (pad it out with spaces)
;------------------------------------------------------------------------------
gev_fixedform: cmp cs:gev_flags,4
jne gev_badform

gev_nextchar4: mov al,es:[si]
cmp al,0
je gev_padout4
mov ds:[di],al
inc si
inc di
dec cx
jne gev_nextchar4
mov al,es:[si]
cmp al,0
je gev_setterm4
mov al,1
jmp gev_setterm4

gev_padout4: mov byte ptr ds:[di],' '
inc di
dec cx
jne gev_padout4

mov al,0
gev_setterm4: jmp gev_leave


gev_badform: mov ax,4

gev_leave: pop es
pop di
pop si
pop dx
pop cx
pop bx
ret
.RADIX 00ah


MAIN ENDP
;
;******************************************************************************
;
CODE ENDS
;
;******************************************************************************
;
END ENTRY

~

Program Reviews

Multi-Edit ver 4.00a (demo version): Reviewed by Patrick O'Riva.

Multi-Edit is a high feature text editor with many word processor features. The demo version is completely functional though some of the reference material is not supplied and there are advertising screens. I consider this fully acceptable as shareware. The complete version with the macro reference library is available for $79.95 and an expanded version with a spelling checker, integrated communication terminal and phone book is $179.95.

I couldn't list all of its features here, but in addition to everything you have come to expect in a quality programming editor (multi meg files, programmable keyboard etc.) there are a number of powerful additions you might not expect. The word processor functions rival most of the specialty ones that I've tried. It won't compete with the major names for those of you who are addicted to them, but it does offer full printer support, preview file, table of contents generation, and extension keyed formatting. It will right or left justify, and supports headers and footers, and auto pagination.

It contains a calculator and an ASCII table.

Saving the best for last: The language support is very strong. It has built in templates for many common constructs, and the assembler/compiler is invoked from within the editor with a single key. It will read the error table generated by a variety of software and with successive key presses move you to each line where an error was found.

Something which I found unique is Multi-Edit's help system. It is a hypertext system, and is wonderfully context sensitive most everywhere in the system. From the Help menu it has a complete table of contents and index. It is also fully user extendible. I have integrated a database I have documenting the full set of interrupts that totals about 400k and the documentation on my spelling checker as well (which integrated into Multi-Edit almost seamlessly).

In many ways this is the best editor I've ever used, but it does have a few faults, some of which are very subtle and may not even be problems to most users. It is a 'tad' slower than what I'm used to with Qedit. This is seldom noticed except in the execution of complex macros. It is quite slow in paging through long files. There are some true bugs in this version, such as a crash of the program (but not the data or the system) when large deletes from large files are made. Multi-Edit's treatment of file windows, while very versatile, is slightly different and may take some time to get used to.

For all of its advantages, until putting this Magazine together I still found myself reverting to Qedit for the speed and ease of use. Multi-Edit is the first software that has made assembling the Magazine anything other than an exercise in frustration.


SHEZ

Just a quick mention because it isn't programming related. Shez is a compression shell along the lines of ArcMaster and Qfiler. It is a fine and versatile piece of programming, supporting all common compression types. The more recent versions have virus detection when used with the SCANV programs by John McAfee.


4DOS

This is a program that is an absolute joy to use. It is a complete and virtually 100% compatible replacement for Command.com. The code size is just slightly larger than MSDOS 3.3 command.com but the added and enhanced functions save many times that amount in TSR's you no longer need to install. Just to mention a few features: An alias command whereby you can assign whatever mnemonic you wish to a command or string of commands. Select is a screen interface that allows you to mark files for use with a command. Except will execute a command for a set of files excluding one or more. There is an environment editor, built in Help, command and filename completion, Global that will execute through the directory tree, A Timer to keep track of elapsed time, as well as many enhanced batch file commands. Additional features are too numerous to mention. The current version is 4.23 and is available as Shareware, but you should register after your first 10 minutes of use. You will be hooked forever.

The above 3 programs should all be available on your local BBS's. Please be sure and register programs you use.

Book Reviews

Assembly Language Quick Reference

by Allen L. Wyatt, Sr.
Reviewed by George A. Stanislav

This 1989 book published by QUE is a nice and handy reference for assembly language programmers.

Instruction sets for six microprocessors and numeric coprocessors are listed:

     8086/8088       8087
     80286           80287
     80386           80387

I could find no reference to the 80186 microprocessor, not even a suggestion that it uses the real-mode part of the 80286 instruction set but has no protected mode. Because the 80186 was the brain of the Tandy 2000, quite a popular computer in its own time, its omission from the book is surprising.

There is no division into chapters. This makes it somewhat hard to figure out where the instruction sets of individual processors start. Each higher processor set contains only the list of instructions that are new for the processor or that changed somewhat.

After a brief introduction, the book starts by listing, alphabetically, all 8086/8088 instructions. The listing itself is very well done. Each instruction stands out graphically from the rest of the text. For every code there is some classification, e.g. arithmetic, bit manipulation, data-transfer.

This is followed by a very brief description ending with a colon. Next, a more detailed explanation tells any assembly language programmer what the instruction does.

If applicable, the book lists flags affected by the instruction. Most instructions also contain some coding examples.

The 8086/8088 instruction set is followed by the 80286 set, or rather subset as it only contains the instructions new to this microprocessor. Similarly, the 80386 section contains only those instructions not found in the 8086/8088 and 80286 sections as well as those that changed somewhat.

I find it puzzling that among those instructions considered changed in the 80386 microprocessor we can find AND, NEG, POP - because they can be used as 32-bit instructions in addition to their original usage - but cannot find JE, JNE, and all other conditional jumps. These did indeed change in the 80386 processor inasmuch as they can be used either as SHORT or as NEAR, while on the older microprocessors they could only jump within the SHORT range.
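
To make the difference concrete, here is a small illustration that is not taken from the book. The label names are made up, and how the near form is requested varies between assemblers; NEAR PTR is one spelling, with 80386 instructions enabled by a directive such as .386.

```asm
; 8086/8088 and 80286: a conditional jump only reaches -128..+127 bytes,
; so a distant target needs the opposite condition plus a JMP.
        cmp     ax,bx
        jne     skip                    ; short hop over the long jump
        jmp     distant_label           ; near JMP reaches the whole segment
skip:

; 80386: the conditional jump itself can carry a full displacement.
        cmp     ax,bx
        je      near ptr distant_label  ; near form, opcode 0F 84
```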

The rest of the book contains instructions for the math coprocessors, the 8087, 80287 and 80387. This section is divided in the same way as the microprocessor part, i.e. describing first the 8087 set, then the one new instruction for the 80287, followed by the new 80387 instructions.

There are several possibilities of improvement QUE might consider for future editions of this book:

  • Make it easier to find the start of each section by color coding the side of the paper;
  • Include references to the instructions of the older processors within the listing for the new processors. Small print of the instruction with the page number where a more detailed description can be found would be a nice enhancement;
  • At least a brief mention of the 80186 microprocessor and perhaps the V-20 and V-30 would be useful.

Despite the possibility of improvement, this is an excellent reference for any assembly language programmer. Its small size makes it very handy to keep it next to the computer as well as to take it along when travelling.

The book costs $6.95 in USA and $8.95 in Canada.

GPFILT.ASM

~ 
page ,132
TITLE GPFILT
subttl General Purpose Filter Template
;
; GPFILT.ASM
; This file contains a template for a general-purpose assembly language
; filter program.
;
; Fill in the blanks for what you wish to do. The program is set up to
; accept a command line in the form:
; COMMAND [{-|/}options] [infile [outfile]]
;
; If infile is not specified, stdin is used.
; If outfile is not specified, stdout is used.
;
; To compile and link:
; MASM GPFILT ;
; LINK GPFILT ;
; EXE2BIN GPFILT GPFILT.COM
;
; Standard routines supplied in the general shell are:
;
; get_arg - returns the address of the next command line argument in
; DX. Since this is a .COM file, the routine assumes DS will
; be the same as the command line segment.
; The routine will return with Carry set when it reaches the end
; of the command line.
;
; err_msg - displays an ASCIIZ string on the STDERR device. Call with the
; address of the string in ES:DX.
;
; do_usage- displays the usage message on the STDERR device and exits
; with an error condition (errorlevel 1). This routine will
; never return.
;
; getch - returns the next character from the input stream in AL.
; It will return with carry set if an error occurs during read.
; It will return with the ZF set at end of file.
;
; putch - writes a character from AL to the output stream. Returns with
; carry set if a write error occurs.
;
cseg segment
assume cs:cseg, ds:cseg, es:cseg, ss:cseg

org 0100h ;for .COM files

start: jmp main ;jump around data area

;
; Equates and global data area.
;
; The following equates and data areas are required by the general filter
; routines. User data area follows.
;


STDIN equ 0
STDOUT equ 1
STDERR equ 2
STDPRN equ 3
cr equ 0dh
lf equ 0ah
space equ 32
tab equ 9

infile dw STDIN ;default input file is stdin
outfile dw STDOUT ;default output file is stdout
errfile dw STDERR ;default error file is stderr
prnfile dw STDPRN ;default print file is stdprn
cmd_ptr dw 0081h ;address of first byte of command tail
PSP_ENV equ 002ch ;The segment address of the environment
;block is stored here.

infile_err db cr, lf, 'Error opening input file', 0
outfile_err db cr, lf, 'Error opening output file', 0
aborted db 07, cr, lf, 'Program aborted', 0
usage db cr, lf, 'Usage: ', 0
crlf db cr, lf, 0

;************************************************************************
;* *
;* Buffer sizes for input and output files. The buffers need not be *
;* the same size. For example, a program that removes tabs from a text *
;* file will output more characters than it reads. Therefore, the *
;* output buffer should be slightly larger than the input buffer. In *
;* general, the larger the buffer, the faster the program will run. *
;* *
;* The only restriction here is that the combined size of the buffers *
;* plus the program code and data size cannot exceed 64K. *
;* *
;* The easiest way to determine maximum available buffer memory is to *
;* assemble the program with minimum buffer sizes and examine the value *
;* of the endcode variable at the end of the program. Subtracting this *
;* value from 65,536 will give you the total buffer memory available. *
;* *
;************************************************************************
;
INNBUF_SIZE equ 31 ;size of input buffer (in K)
OUTBUF_SIZE equ 31 ;size of output buffer (in K)

;
;************************************************************************
;* *
;* Data definitions for input and output buffers. DO NOT modify these *
;* definitions unless you know exactly what it is you're doing! *
;* *
;************************************************************************
;
; Input buffer
ibfsz equ 1024*INNBUF_SIZE ;input buffer size in bytes
inbuf equ endcode ;input buffer
ibfend equ inbuf + ibfsz ;end of input buffer
;
; ibfptr is initialized to point past end of input buffer so that the first
; call to getch will result in a read from the file.
;
ibfptr dw inbuf+ibfsz

; output buffer
obfsz equ 1024*OUTBUF_SIZE ;output buffer size in bytes
outbuf equ ibfend ;output buffer
obfend equ outbuf + obfsz ;end of output buffer
obfptr dw outbuf ;start at beginning of buffer

;************************************************************************
;* *
;* USER DATA AREA *
;* *
;* Insert any data declarations specific to your program here. *
;* *
;* NOTE: The prog_name, use_msg, and use_msg1 variables MUST be *
;* defined. *
;* *
;************************************************************************
;
; This is the program name. Under DOS 3.0 and later, this is not used
; because we can get the program name from the environment. Prior to 3.0,
; this information is not supplied by the OS.
;
prog_name db 'GPFILT', 0
;
; This is the usage message. The first two lines are required.
; The first line is the program's title line.
; Make sure to include the 0 at the end of the first line!!
; The second line shows the syntax of the program.
; Following lines (which are optional), are discussion of options, features,
; etc...
; The message MUST be terminated by a 0.
;
use_msg db ' - General Purpose FILTer program.', cr, lf, 0
use_msg1 label byte
db '[{-|/}options] [infile [outfile]]', cr, lf
db cr, lf
db 'If infile is not specified, STDIN is used', cr, lf
db 'If outfile is not specified, STDOUT is used', cr, lf
db 0
;
;************************************************************************
;* *
;* The main routine parses the command line arguments, opens files, and *
;* does other initialization tasks before calling the filter procedure *
;* to do the actual work. *
;* For a large number of filter programs, this routine will not need to *
;* be modified. Options are parsed in the get_options proc., and the *
;* filter proc. does all of the 'filter' work. *
;* *
;************************************************************************
;
main: cld
call get_options ;process options

jc gofilter ;carry indicates end of arg list
mov ah,3dh ;open file
mov al,0 ;read access
int 21h ;open the file
mov word ptr ds:[infile], ax ;save file handle
jnc main1 ;carry clear indicates success
mov dx,offset infile_err
jmp short err_exit
main1: call get_arg ;get cmd line arg in DX
jc gofilter ;carry indicates end of arg list
mov ah,3ch ;create file
mov cx,0 ;normal file
int 21h ;create the file
mov word ptr ds:[outfile],ax ;save file handle
jnc gofilter ;carry clear indicates success
mov dx,offset outfile_err
jmp short err_exit
gofilter:
call filter ;do the work
jc err_exit ;exit immediately on error
mov ah,3eh
mov bx,word ptr [infile]
int 21h ;close input file
mov ah,3eh
mov bx,word ptr [outfile]
int 21h ;close output file
mov ax,4c00h
int 21h ;exit with no error
err_exit:
call err_msg ;output error message
mov dx,offset aborted
call err_msg
mov ax,4c01h
int 21h ;and exit with error
;
;************************************************************************
;* *
;* get_options processes any command line options. Options are *
;* preceded by either - or /. There is a lot of flexibility here. *
;* Options can be specified separately, or as a group. For example, *
;* the command "GPFILT -x -y -z" is equivalent to "GPFILT -xyz". *
;* *
;* This routine MUST return the address of the next argument in DX *
;* with carry clear, or carry set if there are no more arguments. In *
;* other words, return what was returned by the last call to get_arg. *
;* *
;************************************************************************
;
get_options proc
call get_arg ;get command line arg
jnc opt1
; If at least one argument is required, use this line
; call do_usage ;displays usage msg and exits
; If there are no required args, use this line
ret ;if no args, just return
opt1: mov di, dx
mov al,byte ptr ds:[di]

cmp al,'-' ;if first character of arg is '-'
jz opt_parse
cmp al,'/' ;or '/', then get options
jz opt_parse
clc ;cmp sets carry if first char is below '/' (e.g. '.')
ret ;otherwise exit with carry clear
opt_parse:
inc di
mov al,byte ptr ds:[di]
or al,al ;if end of options string
jz nxt_opt ;get cmd. line arg
cmp al,'?' ;question means show usage info
jz do_usage
;
;************************************************************************
;* *
;* Code for processing other options goes here. The current option *
;* character is in AL, and the remainder of the option string is pointed*
;* to by DS:DI. *
;* *
;************************************************************************
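;
; A minimal sketch of one option handler, left commented out. The 'k'
; option letter and the keep_high flag are hypothetical and are not part
; of GPFILT; keep_high would be a db declared in the user data area.
; Any option letter that is not recognized simply falls through to the
; jmp below and is ignored.
;
; cmp al,'k' ;hypothetical -k option?
; jnz opt_other ;no, leave the flag alone
; mov byte ptr ds:[keep_high],1 ;yes, remember it for the filter proc
; opt_other: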
;
jmp short opt_parse

nxt_opt:
call get_arg ;get next command line arg
jnc opt1 ;if carry
vld_args: ;then validate arguments
;
;************************************************************************
;* *
;* Validate arguments. If some options are mutually exclusive/dependent*
;* use this area to validate them. Whatever the case, if you must *
;* abort the program, call the do_usage procedure to display the usage *
;* message and exit the program. *
;* *
;************************************************************************
;
ret ; no more options
;
;************************************************************************
;* *
;* Filter does all the work. Modify this routine to do what it is you *
;* need done. *
;* *
;************************************************************************
;
filter proc
call getch ;get a character from input into AL
jbe filt_done ;exit on error or EOF
and al, 7fh ;strip the high bit
call putch ;and output it
jc filt_ret ;exit on error
jmp short filter
filt_done:
jc filt_ret ;carry set is error
call write_buffer ;output what remains of the buffer
filt_ret:

ret
filter endp
;
;************************************************************************
;* *
;* Put any program-specific routines here *
;* *
;************************************************************************

;
;************************************************************************
;* *
;* For most programs, nothing beyond here should require modification. *
;* The routines that follow are standard routines used by almost every *
;* filter program. *
;* *
;************************************************************************
;
;************************************************************************
;* *
;* This routine outputs the usage message to the STDERR device and *
;* aborts the program with an error code. A little processing is done *
;* here to get the program name and format the output. *
;* *
;************************************************************************
;
do_usage:
mov dx, offset crlf
call err_msg ;output newline
mov ah,30h ;get DOS version number
int 21h
sub al,3 ;check for version 3.x
jc lt3 ;if carry, earlier than 3.0
;
; For DOS 3.0 and later the full pathname of the file used to load this
; program is stored at the end of the environment block. We first scan
; all of the environment strings in order to find the end of the env, then
; scan the load pathname looking for the file name.
;
push es
mov ax, word ptr ds:[PSP_ENV]
mov es, ax ;ES is environment segment address
mov di, 0
mov cx, 0ffffh ;this ought to be enough
xor ax, ax
getvar: scasb ;get char
jz end_env ;end of environment
gv1: repnz scasb ;look for end of variable
jmp short getvar ;and loop until end of environment
end_env:
inc di
inc di ;bump past word count
;
; ES:DI is now pointing to the beginning of the pathname used to load the
; program. We will now scan the filename looking for the last path specifier
; and use THAT address to output the program name. The program name is
; output WITHOUT the extension.

;
mov dx, di
fnloop: mov al, byte ptr es:[di]
or al, al ;if end of name
jz do30 ;then output it
inc di
cmp al, '\' ;if path specifier
jz updp ;then update path pointer
cmp al, '.' ;if '.'
jnz fnloop
mov byte ptr es:[di-1], 0 ;then place a 0 so we don't get ext
jmp short fnloop ; when outputting prog name
updp: mov dx, di ;store
jmp short fnloop
;
; ES:DX now points to the filename of the program loaded (without extension).
; Output the program name and then go on with rest of usage message.
;
do30: call err_msg ;output program name
pop es ;restore
jmp short gopt3
;
; We arrive here if the current DOS version is earlier than 3.0. Since the
; loaded program name is not available from the OS, we'll output the name
; entered in the 'prog_name' field above.
;
lt3: mov dx, offset prog_name
call err_msg ;output the program name
;
; After outputting program name, we arrive here to output the rest of the
; usage message. This code assumes that the usage message has been
; written as specified in the data area.
;
gopt3: mov dx, offset use_msg
call err_msg ;output the message
mov dx, offset usage
call err_msg
mov dx, offset use_msg1
call err_msg
mov ax,4c01h
int 21h ;and exit with error
get_options endp

;
;************************************************************************
;* *
;* Output a message (ASCIIZ string) to the standard error device. *
;* Call with address of error message in ES:DX. *
;* *
;************************************************************************
;
err_msg proc
cld
mov di,dx ;string address in di
mov cx,0ffffh
xor ax,ax
repnz scasb ;find end of string

xor cx,0ffffh
dec cx ;CX is string length
push ds
mov ax,es
mov ds,ax ;DS is segment address
mov ah,40h
mov bx,word ptr cs:[errfile]
int 21h ;output message
pop ds
ret
err_msg endp

;
;************************************************************************
;* *
;* getch returns the next character from the file in AL. *
;* Returns carry = 1 on error *
;* ZF = 1 on EOF *
;* Upon exit, if either Carry or ZF is set, the contents of AL is *
;* undefined. *
;* *
;************************************************************************
;
; Local variables used by the getch proc.
eof db 0 ;set to 1 when EOF reached in read
last_ch dw ibfend ;pointer just past last char in buffer

getch proc
mov si,word ptr ds:[ibfptr] ;get input buffer pointer
cmp si,word ptr ds:[last_ch];if at end of buffer
jz getch_eob ;go refill it
getch1: lodsb ;character in AL
mov word ptr ds:[ibfptr],si ;save buffer pointer
or ah,1 ;will clear Z flag
ret ;and done

getch_eob: ;end of buffer processing
cmp byte ptr ds:[eof], 1 ;end of file?
jnz getch_read ;nope, read file into buffer
getch_eof:
mov byte ptr ds:[eof], 1 ;remember EOF so we don't read again
xor ax, ax ;set Z (and clear carry) to indicate EOF
ret ;and return

getch_read: ; Read the next buffer full from the file.
mov ah,3fh ;read file function
mov bx,word ptr ds:[infile] ;input file handle
mov cx,ibfsz ;#characters to read
mov dx,offset inbuf ;read into here
int 21h ;DOS'll do it for us
jc read_err ;Carry means error
or ax,ax ;If AX is zero,
jz getch_eof ;we've reached end-of-file
add ax,offset inbuf
mov word ptr ds:[last_ch],ax;and save it
mov si,offset inbuf
jmp short getch1 ;and finish processing character

read_err: ;return with error and...
mov dx,offset read_err_msg ; DX pointing to error message string
ret
read_err_msg db 'Read error', cr, lf, 0
getch endp

;
;************************************************************************
;* *
;* putch writes the character passed in AL to the output file. *
;* Returns carry set on error. The character in AL is retained. *
;* *
;************************************************************************
;
putch proc
mov di,word ptr ds:[obfptr] ;get output buffer pointer
stosb ;save the character
mov word ptr ds:[obfptr],di ;and update buffer pointer
cmp di,offset obfend ;if buffer pointer == buff end
clc
jnz putch_ret
push ax
call write_buffer ;then we've got to write the buffer
pop ax
putch_ret:
ret
putch endp

;
;************************************************************************
;* *
;* write_buffer writes the output buffer to the output file. *
;* This routine should not be called except by the putch proc. and at *
;* the end of all processing (as demonstrated in the filter proc). *
;* *
;************************************************************************
;
write_buffer proc ;write buffer to output file
mov bx, word ptr ds:[outfile];output file handle
mov cx, word ptr ds:[obfptr]
sub cx, offset outbuf ;compute #bytes to write
jz wb_done ;buffer empty: skip the write, since a
;0-byte write returns AX=0 and would
;be mistaken for disk full
mov ah, 40h ;write to file function
mov dx, offset outbuf ;from this buffer
int 21h ;DOS'll do it
jc write_err ;carry is error
or ax,ax ;return value of zero
jz putch_full ;indicates disk full
mov word ptr ds:[obfptr],offset outbuf
wb_done:
clc
ret

putch_full: ;disk is full
mov dx,offset disk_full
stc ;exit with error
ret

write_err: ;error occurred during write

mov dx,offset write_err_msg
stc ;return with error
ret
write_err_msg db 'Write error', cr, lf, 0
disk_full db 'Disk full', cr, lf, 0

write_buffer endp

;
;************************************************************************
;* *
;* get_arg - Returns the address of the next command line argument in *
;* DX. The argument is in the form of an ASCIIZ string. *
;* Returns Carry = 1 if no more command line arguments. *
;* Upon exit, if Carry is set, the contents of DX is undefined. *
;* *
;************************************************************************
;
get_arg proc
mov si,word ptr [cmd_ptr]
skip_space: ;scan over leading spaces and commas
lodsb
cmp al,0 ;if we get a null
jz sk0
cmp al,cr ;or a CR,
jnz sk1
sk0: stc ;set carry to indicate failure
ret ;and exit
sk1: cmp al,space
jz skip_space ;loop until no more spaces
cmp al,','
jz skip_space ;or commas
cmp al,tab
jz skip_space ;or tabs

mov dx,si ;start of argument
dec dx
get_arg1:
lodsb ;get next character
cmp al,cr ;argument separators are CR,
jz get_arg2
cmp al,space ;space,
jz get_arg2
cmp al,',' ;comma,
jz get_arg2
cmp al,tab ;and tab
jnz get_arg1

get_arg2:
mov byte ptr ds:[si-1], 0 ;delimit argument with 0
cmp al, cr ;if char is CR then we've reached
jnz ga2 ; the end of the argument list
dec si
ga2: mov word ptr ds:[cmd_ptr], si ;save for next time 'round
clc ;success: the cmp above sets carry for a tab
ret ;and return
get_arg endp

endcode equ $

cseg ends
end start
