Assembly tutorial, The basics, enough to get started

Here is a basic rundown of how assembly language works, some of the registers, some of the commands, and enough to keep you (hopefully) occupied for a while.

In assembly you push and pop and otherwise move pieces of data between registers and memory, through various ports, and send them on to different routines. In the long run, it all comes together to process information, handle user input and output, and make the most of your clock cycles.


Registers are your bread and butter. Last time I checked there were about 30 of them, and I'm going to brush over the main ones here:

EAX, EBX, ECX, EDX - These are the general-purpose registers. They are each 32 bits wide (a DWORD) and can be broken down further, e.g. EAX is all 32 bits, AX is its 16 least significant bits, and AX in turn breaks down into AH and AL (H = high byte, L = low byte). This can be done with all four of these registers.
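To make the register breakdown concrete, here is a small C sketch that models EAX as a plain 32-bit value and picks out the parts that AX, AH and AL name. The function names are made up for illustration; nothing here is a real API, it is just bit masking.

```c
#include <stdint.h>

/* Model of how the 16-bit and 8-bit names alias parts of EAX.
   All function names are illustrative only. */
uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }      /* bits 0-15 */
uint8_t  reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }  /* bits 8-15 */
uint8_t  reg_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }         /* bits 0-7  */

/* Writing AL replaces only the lowest byte; the upper 24 bits of EAX are kept. */
uint32_t write_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}
```

So with EAX holding 0x12345678, AX reads as 0x5678, AH as 0x56 and AL as 0x78.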

Traditionally they have all had different uses, although this has been relaxed in later years.

EAX is the Accumulator, and holds the return value from a function (by convention, not automatically, so you need to put your return value here yourself). In 16-bit OS's, AX (usually AH) selects the interrupt subfunction. The bottom line is that when you call a function you can usually count on whatever value was in this register beforehand being destroyed, and naturally you can do whatever you want with it in your programs too. If you want to move data between memory and a register quickly, you can use LODSB/LODSW to load AL/AX from the memory pointed to by ESI, or STOSB/STOSW to store AL/AX to the location pointed to by EDI. EAX is also used implicitly by the MUL and DIV commands. If you are interested in saving space in your applications, some opcodes have shorter special-case EAX versions: for example, ADD EAX,value with a full 32-bit constant is one byte shorter than ADD EBX,value, and loading EAX from a fixed memory address is one byte shorter than loading EBX. MOV register,value itself is the same length for every register.

ECX is the Counter, and as a result has a variety of instructions built around counting. The most common in 16-bit days was the LOOP command, which decrements CX and jumps to the given label if CX is not zero afterwards. It is also used with JCXZ/JECXZ (jump if CX/ECX is zero) and the REP prefixes. ECX is also a "garbage" register - you cannot count on it being untouched after calling a function, and you can in turn do what you want with it.
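As a rough illustration of the LOOP behaviour, the following C sketch counts how many times a loop body would run for a given starting CX. It is only a model (the function name is invented), but it shows why a starting value of 0 is a trap: the decrement happens before the test, so CX wraps around to 0xFFFF and the body runs 65536 times. That is what JCXZ is usually there to guard against.

```c
#include <stdint.h>

/* Returns how many times a LOOP-terminated body runs for a starting CX.
   Hypothetical helper, modelling 16-bit LOOP only. */
unsigned long loop_iterations(uint16_t cx)
{
    unsigned long runs = 0;
    do {
        runs++;          /* the loop body executes here */
        cx--;            /* LOOP first decrements CX (0 wraps to 0xFFFF) */
    } while (cx != 0);   /* then jumps back while CX is non-zero */
    return runs;
}
```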

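EDX is the Data register. It is similar to EAX, but not nearly as widely used. The D-series registers pair up with the A-series in the MUL and DIV commands: MUL puts the upper half of the product in EDX, and DIV divides the combined value EDX:EAX, leaving the quotient in EAX and the remainder in EDX. EDX is also "garbage", so the same rules apply to its use as to ECX and EAX.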
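The way EAX and EDX pair up for 32-bit MUL and DIV can be sketched in C with a 64-bit intermediate. The struct and function names below are assumptions for the sake of the example. The real DIV raises a divide error if the divisor is 0 or the quotient does not fit in 32 bits; this sketch simply assumes neither happens.

```c
#include <stdint.h>

/* Illustrative pair of result registers. */
typedef struct { uint32_t eax, edx; } RegPair;

/* MUL src (unsigned): EDX:EAX = EAX * src. */
RegPair mul32(uint32_t eax, uint32_t src)
{
    uint64_t product = (uint64_t)eax * src;
    RegPair r = { (uint32_t)product, (uint32_t)(product >> 32) };
    return r;
}

/* DIV src (unsigned): EAX = EDX:EAX / src, EDX = EDX:EAX % src.
   Assumes src != 0 and that the quotient fits in 32 bits. */
RegPair div32(uint32_t edx, uint32_t eax, uint32_t src)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    RegPair r = { (uint32_t)(dividend / src), (uint32_t)(dividend % src) };
    return r;
}
```

This is also why you normally clear EDX (or sign-extend into it for signed division) before a DIV: whatever is left in EDX becomes the top half of the dividend.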

EBX is a weird register. On 16-bit processors it was the only one of the four general purpose registers above that could be used to reference memory (alongside SI, DI and BP), although on 32-bit processors any general purpose register can be used - EBX is conventional though. As a result, BX is sometimes passed to 16-bit interrupts as an offset pointer. EBX is NOT a garbage register, so if you intend to use it in your function be sure to save its original value and restore it before returning, because even in Windows 98 changing its value can crash your program. Because of this I try to avoid this register as much as possible, unless I really need the extra register.

ESI, EDI, ESP, EBP - Addressing registers. All of these can be referred to by their 16-bit names as well (SI, DI, SP, BP).

ESI is the Source register, and points to the data source in LODSB/W and MOVSB/W. It is not a garbage register: if you use it in your own function you should preserve its original value, and in return any data you leave in it should survive calls to other functions.

EDI is the Destination register, and points to the data destination in STOSB/W and MOVSB/W. Apart from this it is identical to ESI.
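Here is a hedged C model of how ESI, EDI and ECX cooperate in a REP MOVSB, assuming the direction flag is clear so the pointers move upwards. The function name and the use of C pointers in place of registers are my own stand-ins, not anything an assembler provides.

```c
#include <stdint.h>
#include <stddef.h>

/* REP MOVSB with the direction flag clear: copy ECX bytes from [ESI] to [EDI],
   advancing both pointers and counting ECX down to zero.
   REP tests ECX before each step, so ECX = 0 copies nothing. */
void rep_movsb(uint8_t **edi, const uint8_t **esi, size_t *ecx)
{
    while (*ecx != 0) {
        **edi = **esi;   /* MOVSB: one byte from source to destination */
        (*esi)++;        /* ESI advances past the byte just read */
        (*edi)++;        /* EDI advances past the byte just written */
        (*ecx)--;        /* REP decrements the counter */
    }
}
```

After the copy both pointers sit just past the last byte handled and ECX is zero, which is worth remembering if you reuse the registers straight afterwards.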

ESP is the Stack pointer. Every time you PUSH a piece of data, ESP is decremented by the size of that data (4 bytes for a dword) and your value is stored at the new ESP. In 16-bit code SP is used instead. DO NOT TOUCH THIS REGISTER.
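The order of operations in PUSH and POP is easy to get backwards, so here is a minimal C sketch of a downward-growing stack. The Stack struct, its 64-byte size and the function names are all invented for illustration, and there is no overflow or underflow checking.

```c
#include <stdint.h>
#include <string.h>

/* A tiny stand-in for stack memory; sizes and names are illustrative only. */
typedef struct {
    uint8_t  mem[64];   /* pretend stack memory */
    uint32_t esp;       /* offset into mem; starts at the top */
} Stack;

void stack_init(Stack *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->esp = sizeof s->mem;
}

void push32(Stack *s, uint32_t value)
{
    s->esp -= 4;                          /* ESP moves down first */
    memcpy(&s->mem[s->esp], &value, 4);   /* then the value is stored at the new ESP */
}

uint32_t pop32(Stack *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->esp], 4);   /* read the value at ESP */
    s->esp += 4;                          /* then ESP moves back up */
    return value;
}
```

Values come back off in the reverse of the order they went on, which is why saved registers are popped in the opposite order to the pushes.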

EBP should not be touched unless you know what you're doing. Many high level languages use stack-based local variables: the original ESP is stored in EBP, and ESP is then decremented by a certain amount to create space for the locals. As EBP may have contained useful data it is pushed onto the stack before any of this happens, so the entire process is reversible. (Side note: because of how these local variables are stored, and the fact that when a function is called the return address (the address of the instruction after the CALL) is stored on the stack, a local buffer that is not length-checked can be overflowed. An attacker can inject code there and then overwrite the return address with the address of the injected code.)
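The stack frame setup described above usually boils down to PUSH EBP / MOV EBP,ESP / SUB ESP,n on entry and MOV ESP,EBP / POP EBP on exit. The following C sketch models just those register updates; the Cpu struct, the 32-slot stack and the function names are assumptions made for the example, and it assumes the frame fits in the modelled memory.

```c
#include <stdint.h>

typedef struct {
    uint32_t esp, ebp;
    uint32_t stack[32];   /* one slot per dword; index = address / 4 */
} Cpu;

/* PUSH EBP / MOV EBP,ESP / SUB ESP,local_bytes */
void enter_frame(Cpu *c, uint32_t local_bytes)
{
    c->esp -= 4;
    c->stack[c->esp / 4] = c->ebp;  /* save the caller's EBP */
    c->ebp = c->esp;                /* EBP now marks the base of this frame */
    c->esp -= local_bytes;          /* reserve room for locals below EBP */
}

/* MOV ESP,EBP / POP EBP */
void leave_frame(Cpu *c)
{
    c->esp = c->ebp;                /* discard the locals */
    c->ebp = c->stack[c->esp / 4];  /* restore the caller's EBP */
    c->esp += 4;
}
```

Because EBP stays fixed for the life of the function, locals can be addressed at constant offsets below it no matter how much pushing and popping happens in between.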

CS, DS, ES, FS, GS, SS - Segment registers.
The original 16-bit processors had four of these (CS, DS, ES and SS); FS and GS arrived with the 386. In real mode the segments were simple - each segment value is multiplied by 0x10 (16) and added to the offset of any addressed data to find the physical memory address. However with 32-bit CPUs and protected mode, the segment registers now just refer to segment descriptors, which are structures containing information about the address of the segment, the size of the segment, and rights within the segment (readable, writeable, executable etc).

CS is the code segment, and CS:EIP (or IP in 16-bit) points to the current instruction. DS is the data segment, and is used with (E)SI, just as ES (Extra segment) is used with (E)DI, by the LODS/STOS/MOVS commands. It should be noted that DS is the default segment if you don't specify one when moving data around, e.g. MOV EAX,[EDI] will load EAX with the dword at DS:EDI (addresses based on EBP or ESP default to SS instead). ES, FS and GS are all "extra" segments, as in they can be used as placeholders for segments that aren't used particularly often.

There is no opcode to move a constant directly into a segment register, so you need to MOV it from a general purpose register or a memory location, or POP it off the stack. CS is the exception: it can only be changed by a far jump, call or return. It should be pointed out also that these are not Extended as the other registers are, but are 16 bits wide each. SS is used in conjunction with (E)SP for PUSHes and POPs, so should also not be touched AT ALL.
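The real-mode "multiply the segment by 16 and add the offset" rule mentioned above is simple enough to write down as a one-line C function. The function name is just something I picked for the sketch.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

One consequence worth noticing is that many different segment:offset pairs name the same byte: 0x1234:0x0010 and 0x1235:0x0000 both come out as physical address 0x12350.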

As well as these registers, which are the most commonly used (except for the segment registers, which user-mode 32-bit apps rarely touch directly), there are a few more special purpose registers such as CR0, a control register used for protected mode related stuff. There are also FPU registers which are manipulated by the arithmetic coprocessor, but there are tutorials on those specifically, including one which comes with the masm32 package. The ones I've mentioned should be the only ones that you really need to use.

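Assembly instructions are often loosely called "opcodes" (strictly, the names you type are mnemonics and the opcode is the numeric encoding each one assembles to). Unlike commands in other languages, they don't call different procedures; each one corresponds directly to a machine instruction that the CPU executes. Here are a few everyday ones to get you started: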

MOV - moves data from memory to register, register to register, constant to register or memory, and register to memory. It cannot be used to move from memory to memory directly. If you want to set a register to 0 it is conventional to use XOR register,register (e.g. XOR EAX,EAX sets EAX to zero) as it is smaller and slightly faster; just remember that unlike MOV it changes the flags.

Syntax: MOV destination,source

Both arguments must be of the same width, e.g. MOV AX,EBX will not work.

CALL - calls a subroutine
If you use masm, you'll find the INVOKE macro used often (nasm has no INVOKE built in, though macro packages can supply one). What this does is PUSH all the arguments onto the stack in reverse order (last argument first), and then CALL the function requested. Note that this is a macro, and when assembled it is converted into the PUSHes and CALL that make up the instruction.

Syntax: CALL label
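To see why INVOKE pushes arguments in reverse order, here is a small C model of the pushes it expands to. The function name and the array-as-stack representation are hypothetical, and it assumes the arguments fit in the array. The point is that pushing the last argument first leaves the first argument at the lowest address, i.e. on top of the stack just before the CALL.

```c
#include <stdint.h>
#include <stddef.h>

/* Push args right-to-left onto a descending stack of dwords.
   `stack` is an array of `slots` dwords; returns the new top-of-stack index.
   Assumes count <= slots. */
size_t push_args_reversed(uint32_t *stack, size_t slots,
                          const uint32_t *args, size_t count)
{
    size_t top = slots;                  /* empty stack: top is one past the end */
    for (size_t i = count; i > 0; i--)   /* last argument first */
        stack[--top] = args[i - 1];
    return top;
}
```

With arguments (10, 20, 30) the stack ends up with 10 on top, then 20, then 30, so the called function finds its arguments in their natural order at increasing addresses.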

RET - return from subroutine

Syntax: RET

PUSH - pushes a value, the contents of a memory location, or a register onto the stack. This is often used to preserve registers that aren't allowed to be destroyed, and also for passing arguments on to a function.

Syntax: PUSH value/register/memory reference

POP - pops a value off the stack into a memory location or register. The opposite of PUSH, and usually symmetrical to it: if a register is preserved with PUSH EBX, you can usually find a matching POP EBX later in the function.

Syntax: POP register/memory reference

CMP - compares two registers, a register or memory reference and a value, or a register and a memory reference (never two memory references)
This is what makes programs dynamic - the ability to do If and Else statements. It also drives loops if you're using unoptimised code. CMP subtracts the second operand from the first, throws the result away, and sets various flags, which can then be used with the conditional jump commands (or folded into arithmetic with commands like ADC and SBB). If you want to compare a register to 0 it is also conventional to use OR register,register or TEST register,register, which are shorter than CMP register,0.

Syntax: CMP value1,value2
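For the curious, this is roughly what CMP does to the four flags that the conditional jumps care about, written as a C model for 32-bit operands. The Flags struct and the cmp32 name are my own; the flag rules themselves are the standard ones for a subtraction.

```c
#include <stdint.h>

typedef struct { int zf, cf, sf, of; } Flags;

/* CMP a,b: compute a - b, keep only the flags. */
Flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;
    Flags f;
    f.zf = (diff == 0);                          /* zero: operands are equal */
    f.cf = (a < b);                              /* carry: unsigned borrow */
    f.sf = (int)(diff >> 31);                    /* sign: top bit of the result */
    f.of = (int)(((a ^ b) & (a ^ diff)) >> 31);  /* overflow: signed result out of range */
    return f;
}
```

For example, comparing 1 with 2 sets the carry and sign flags but not the zero flag, while comparing 0x80000000 with 1 sets the overflow flag because the signed subtraction does not fit in 32 bits.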

JMP - Jump to a label
In 32-bit flat-model programming you rarely need to care whether a jump is far or near, so plain JMP is used for practically all purposes. In 16-bit programming the difference matters a lot: a near jump only carries a 16-bit offset and so can only reach code within the current segment, so jumping to another segment requires a FAR JUMP, which includes the segment of the new routine as well as the offset.

Syntax: JMP label

JE, JNE, JG, JL, JGE, JLE, JZ, JNZ - Conditional jumps
JZ and JE are the same instruction, as are JNZ and JNE; they just read differently. JZ says "jump if the zero flag is set" and JE says "jump if the values are equal" (CMP sets the zero flag when its operands are equal, so they amount to the same thing). Which one you use is completely up to you, as they assemble to the same machine code.

Also to be noted are JG, JL, JGE and JLE, which are jump if greater, jump if less than, jump if greater than/equal to, and jump if less than/equal to. Note that these compare the first value of the CMP command to the second, so CMP v1,v2 followed by JG label is the same as if(v1>v2) goto label; They treat the values as signed numbers. Their unsigned counterparts are JA, JB, JAE and JBE (above/below), which assemble to DIFFERENT machine code because they test different flags. The results only agree while both values have the same sign bit, e.g. 0xFFFFFFFF is less than 1 as a signed number (it is -1) but above 1 as an unsigned number.

To make this more confusing, there are also negated names, e.g. JNB (jump if not below), JNG and JNGE, which assemble to the same machine code as their counterparts: JNB is the same as JAE, JNG the same as JLE, and JNGE the same as JL. There are also other commands that test individual flags (e.g. JO tests OF, the overflow flag, to see if an addition has overflowed), which are all fairly self-explanatory. These can all be near or short jumps, and the assembler needs to pick the right form depending on how far away the label is in your program.

Syntax: same as for JMP
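The signed versus unsigned distinction between the "greater/less" and "above/below" jumps is easiest to see side by side. This C sketch answers "would the jump be taken after CMP a,b?" by doing the equivalent C comparison, with the flag condition each jump really tests noted in the comments. The function names are made up, and the casts to int32_t assume the usual two's complement representation.

```c
#include <stdint.h>

/* After CMP a,b: would the jump be taken?
   Signed jumps reinterpret the same 32 bits as two's complement. */
int jg_taken(uint32_t a, uint32_t b) { return (int32_t)a > (int32_t)b; }  /* ZF=0 and SF=OF */
int jl_taken(uint32_t a, uint32_t b) { return (int32_t)a < (int32_t)b; }  /* SF!=OF */
int ja_taken(uint32_t a, uint32_t b) { return a > b; }                    /* CF=0 and ZF=0 */
int jb_taken(uint32_t a, uint32_t b) { return a < b; }                    /* CF=1 */
int je_taken(uint32_t a, uint32_t b) { return a == b; }                   /* ZF=1, same as JZ */
```

With a = 0xFFFFFFFF and b = 1, JL and JA are both taken: the value is -1 when read as signed, but the largest possible number when read as unsigned. Picking the wrong family is a classic source of bugs when comparing sizes or addresses.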

Assemblers come in different shapes and sizes, fulfil different purposes, and most of all use slightly different syntax (with the exception of mingw/dev-cpp inline asm, which is just weird: it uses GCC's AT&T-style syntax by default, with the source operand first). Nasm is a free open source assembler with simplified structure support and very well defined means for addressing; masm has better structure support, more macros, and is helpful when making the step between C and asm. Tasm is an older Borland assembler, and as far as I know is used more for console apps. Fasm is one I have heard much about, but never really played with much.

The bottom line here is that it doesn't matter what assembler you use, as long as you like it, and you know the differences between it and the other major assemblers for when you get code samples.


I'm hoping this is enough for now. Post here if there's anything I've missed or badly fucked up and I'll see to it that it gets fixed.

All for now


Article written by AUTHOR_NAME