Bkerndev
Bran's Kernel Development A tutorial on writing kernels Version 1.0 (Feb 6th, 2005)
- a 100% IBM Compatible PC with:
  - a Pentium II or K6 300MHz
  - 32MBytes of RAM
  - a VGA compatible videocard with monitor
  - a Keyboard
  - a Floppy drive
  - a Hard disk with enough space for all development tools and space for documents and source code
- Microsoft Windows, or a flavour of Unix (Linux, FreeBSD)
- an Internet connection to look up documents

(A mouse is highly recommended)
Toolset
Compilers
- The Gnu C Compiler (GCC) [Unix]
- DJGPP (GCC for DOS/Windows) [Windows]
Assemblers
- Netwide Assembler (NASM) [Unix/Windows]
Virtual Machines
- VMWare Workstation 4.0.5 [Linux/Windows NT/2000/XP]
- Microsoft VirtualPC [Windows NT/2000/XP]
- Bochs [Unix/Windows]
    dd MULTIBOOT_HEADER_MAGIC
    dd MULTIBOOT_HEADER_FLAGS
    dd MULTIBOOT_CHECKSUM

    ; AOUT kludge - must be physical addresses. Make a note of these:
    ; The linker script fills in the data for these ones!
    dd mboot
    dd code
    dd bss
    dd end
    dd start

; This is an endless loop here. Make a note of this: Later on, we
; will insert an 'extern _main', followed by 'call _main', right
; before the 'jmp $'.
stublet:
    jmp $

; Shortly we will add code for loading the GDT right here!

; In just a few pages in this tutorial, we will add our Interrupt
; Service Routines (ISRs) right here!

; Here is the definition of our BSS section. Right now, we'll use
; it just to store the stack. Remember that a stack actually grows
; downwards, so we declare the size of the data before declaring
; the identifier '_sys_stack'
SECTION .bss
    resb 8192               ; This reserves 8KBytes of memory here
_sys_stack:
The kernel's entry file: 'start.asm'
the list. We want the compiled version of 'start.asm' called 'start.o' to be the first object file linked, because that's where our kernel's entry point is. The next line is 'phys'. This is not a keyword, but a variable to be used in the linker script. In this case, we use it as a pointer to an address in memory: a pointer to 1MByte, which is where our binary is to be loaded to and run at.

The 3rd keyword is SECTIONS. If you study this linker script, you will see that it defines the 3 main sections: '.text', '.data', and '.bss'. There are also 4 variables defined: 'code', 'data', 'bss', and 'end'. Do not get confused by this: 'code', 'bss', and 'end' are the same names that our startup file, start.asm, references in its header, and the linker fills in their values. ALIGN(4096) ensures that each section starts on a 4096-byte boundary. In this case, that means that each section will start on a separate 'page' in memory.
OUTPUT_FORMAT("binary")
ENTRY(start)
phys = 0x00100000;
SECTIONS
{
  .text phys : AT(phys) {
    code = .;
    *(.text)
    . = ALIGN(4096);
  }
  .data : AT(phys + (data - code))
  {
    data = .;
    *(.data)
    . = ALIGN(4096);
  }
  .bss : AT(phys + (bss - code))
  {
    bss = .;
    *(.bss)
    . = ALIGN(4096);
  }
  end = .;
}
The Linker Script: 'link.ld'
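The effect of ALIGN(4096) can be sketched in plain C. This is only an illustration of the rounding arithmetic, not something the linker script needs; the function name 'align_up' is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Round 'addr' up to the next multiple of 'align'.
 * 'align' must be a power of two, such as the 4096 used in 'link.ld'.
 * 'align_up' is a hypothetical helper name, not part of the tutorial. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

An address that already sits on a page boundary, such as 0x00100000, is left unchanged, while 0x00100001 is rounded up to 0x00101000. This is why each section starts on its own page.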
that we need to link together and resolve in order to create kernel.bin. Lastly, the 'pause' command will display "Press a key to continue..." on the screen and wait for us to press a key so that we can see what our assembler or linker gives out onscreen in terms of syntax errors.
echo Now assembling, compiling, and linking your kernel:
nasm -f aout -o start.o start.asm
rem Remember this spot here: We will add 'gcc' commands here to compile C sources

rem This links all your files. Remember that as you add *.o files, you need to
rem add them after start.o. If you don't add them at all, they won't be in your kernel!
ld -T link.ld -o kernel.bin start.o
echo Done!
pause
Our builder batch file: 'build.bat'
    return rv;
}

/* We will use this to write to I/O ports to send bytes to devices. This
 * will be used in the next tutorial for changing the textmode cursor
 * position. Again, we use some inline assembly for the stuff that simply
 * cannot be done in C */
void outportb (unsigned short _port, unsigned char _data)
{
    __asm__ __volatile__ ("outb %1, %0" : : "dN" (_port), "a" (_data));
}

/* This is a very simple main() function. All it does is sit in an
 * infinite loop. This will be like our 'idle' loop */
void main()
{
    /* You would add commands after here */

    /* ...and leave this loop in. There is an endless loop in
     * 'start.asm' also, if you accidentally delete this next line */
    for (;;);
}
'main.c': Our kernel's small, yet important beginnings
Before compiling this, we need to add 2 lines into 'start.asm'. We need to let NASM know that main() is in an 'external' file and we need to call main() from the assembly file, also. Open 'start.asm', and look for the line that says 'stublet:'. Immediately after that line, add the lines:
    extern _main
    call _main
Now wait just a minute. Why are there leading underscores for '_main', when in C, we declared it as 'main'? The compiler gcc, as built for DJGPP and other a.out or COFF targets, will put an underscore in front of all of the function and variable names when it compiles (ELF builds of gcc typically do not). Therefore, to reference a function or variable from our assembly code, we must add an underscore to the function name if the function is in a C source file!

This is actually good enough to compile 'as is', however we are still missing our 'system.h'. Simply create a blank text file named 'system.h'. Add all the function prototypes for memcpy, memset, memsetw, strlen, inportb, and outportb to this file. It is wise to guard an include file, or 'header' file, against declaring things more than once by using the #ifndef, #define, and #endif preprocessor directives. We will include this file in each C source file in this tutorial. This will declare each function that you can use in your kernel. Feel free to expand upon this library with anything you think you will need. Observe:
#ifndef __SYSTEM_H
#define __SYSTEM_H

/* MAIN.C */
extern unsigned char *memcpy(unsigned char *dest, const unsigned char *src, int count);
extern unsigned char *memset(unsigned char *dest, unsigned char val, int count);
extern unsigned short *memsetw(unsigned short *dest, unsigned short val, int count);
extern int strlen(const char *str);
extern unsigned char inportb (unsigned short _port);
extern void outportb (unsigned short _port, unsigned char _data);

#endif
Our global include file: 'system.h'
Next, we need to find out how to compile this. Open your 'build.bat' from the previous section in this tutorial, and add the following line to compile your 'main.c'. Please note that this assumes that 'system.h' is in an 'include' directory in your kernel sources directory. This command executes the compiler 'gcc'. Among the various arguments passed in, there is '-Wall' which gives you warnings about your code. '-nostdinc' along with '-fno-builtin' means that we aren't using standard C library functions. '-I./include' tells the compiler that our headers are in the 'include' directory inside the current directory. '-c' tells gcc to compile only: No linking yet! As in the previous section of this tutorial, '-o main.o' names the output file that the compiler is to make, and the last argument, 'main.c', is the source file. In short, compile 'main.c' into 'main.o' with options best for kernels. Right click the batch file and select 'edit' to edit it!
gcc -Wall -O -fstrength-reduce -fomit-frame-pointer -finline-functions -nostdinc -fno-builtin -I./include -c -o main.o main.c
Add this line to 'build.bat'
Don't forget to follow the instructions we left in 'build.bat'! You need to add 'main.o' to the list of object files that need to be linked to create your kernel! Finally, if you are stuck creating our accessory functions like memcpy, a solution 'main.c' is shown here.
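As a rough guide to those accessory functions, here is a minimal sketch that follows the prototypes in 'system.h'. The 'k_' prefix is an assumption made here so that the example also compiles on a host machine next to the standard C library; in the kernel you would use the plain names. It is one straightforward byte-by-byte approach, not necessarily the one used in the solution file.

```c
/* Sketches of the helper routines declared in 'system.h'.
 * The 'k_' prefix is hypothetical: it only avoids clashing with the
 * host C library when trying this outside the kernel. */

/* Copy 'count' bytes from 'src' to 'dest', lowest address first */
unsigned char *k_memcpy(unsigned char *dest, const unsigned char *src, int count)
{
    int i;
    for (i = 0; i < count; i++)
        dest[i] = src[i];
    return dest;
}

/* Set 'count' bytes at 'dest' to 'val' */
unsigned char *k_memset(unsigned char *dest, unsigned char val, int count)
{
    int i;
    for (i = 0; i < count; i++)
        dest[i] = val;
    return dest;
}

/* Same as k_memset, but works on 16-bit values: 'count' is in words */
unsigned short *k_memsetw(unsigned short *dest, unsigned short val, int count)
{
    int i;
    for (i = 0; i < count; i++)
        dest[i] = val;
    return dest;
}

/* Count the characters before the terminating '\0' */
int k_strlen(const char *str)
{
    int len = 0;
    while (str[len] != '\0')
        len++;
    return len;
}
```

Note that 'count' for the 16-bit variant is a number of words, not bytes. That matters later when a line of 80 text cells is cleared.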
The upper 8 bits of each 16-bit text element are called the 'attribute byte', and the lower 8 bits are called the 'character byte'. As you can see from the above table, mapping out the parts of each 16-bit text element, the attribute byte gets broken up further into 2 different 4-bit chunks: 1 representing background color and 1 representing foreground color. Now, because only 4 bits define each color, there can only be a maximum of 16 different colors to choose from (2 ^ number of bits = 2^4 = 16). Below is a table of the default 16-color palette.

Value   Color          Value   Color
0       BLACK          8       DARK GREY
1       BLUE           9       LIGHT BLUE
2       GREEN          10      LIGHT GREEN
3       CYAN           11      LIGHT CYAN
4       RED            12      LIGHT RED
5       MAGENTA        13      LIGHT MAGENTA
6       BROWN          14      LIGHT BROWN
7       LIGHT GREY     15      WHITE
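To make the bit layout concrete, here is a small sketch that packs the two colors into an attribute byte and then combines that byte with a character. The function names 'make_attribute' and 'make_cell' are assumptions for this illustration only.

```c
#include <stdint.h>

/* Pack a 4-bit background and a 4-bit foreground color into one
 * attribute byte: background in the upper nibble, foreground in the
 * lower nibble. (Hypothetical helper name.) */
uint8_t make_attribute(uint8_t forecolor, uint8_t backcolor)
{
    return (uint8_t)(((backcolor & 0x0F) << 4) | (forecolor & 0x0F));
}

/* Combine a character byte with an attribute byte into one 16-bit
 * text element: attribute in the upper 8 bits, character in the
 * lower 8 bits. (Hypothetical helper name.) */
uint16_t make_cell(uint8_t character, uint8_t attribute)
{
    return (uint16_t)(character | ((uint16_t)attribute << 8));
}
```

For example, white (15) on black (0) gives the attribute byte 0x0F, and the letter 'A' in that color becomes the 16-bit value 0x0F41.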
Finally, to access a particular index in memory, there is an equation that we must use. The text mode memory is a simple 'linear' (or flat) area of memory, but the video controller makes it appear to be an 80x25 matrix of 16-bit values. Each line of text is sequential in memory; they follow each other. We therefore break up the screen into horizontal lines, using the following equation:

index = (y_value * width_of_screen) + x_value;
This equation shows that to access the index in the text memory array for, say, (3, 4), we would calculate 4 * 80 + 3, which is 323. This means that to draw to location (3, 4) on the screen, we need to do something similar to this:

unsigned short *where = (unsigned short *)0xB8000 + 323;
*where = character | (attribute << 8);

Following now is 'scrn.c', which is where all of our functions dealing with the screen will be. We include our 'system.h' file so that we can use outportb, memcpy, memset, memsetw, and strlen. The scrolling method that we use is rather interesting: We take a chunk of text memory starting at line 1 (NOT line 0), and copy it over top of line 0. This basically moves the entire screen up one line. To complete the scroll, we erase the last line of text by writing spaces with our attribute bytes.

The putch function is possibly the most complicated function in this file. It is also the largest, because it needs to handle any newlines ('\n'), carriage returns ('\r'), and backspaces ('\b'). Later, if you wish, you may handle the alarm character ('\a' - ASCII character 7), which is only supposed to do a short beep when it is encountered. I have included a function to set the screen colors also (settextcolor) if you wish.
#include <system.h>

/* These define our textpointer, our background and foreground
 * colors (attributes), and x and y cursor coordinates */
unsigned short *textmemptr;
int attrib = 0x0F;
int csr_x = 0, csr_y = 0;

/* Scrolls the screen */
void scroll(void)
{
    unsigned blank, temp;

    /* A blank is defined as a space... we need to give it
     * backcolor too */
    blank = 0x20 | (attrib << 8);

    /* Row 25 is the end, this means we need to scroll up */
    if(csr_y >= 25)
    {
        /* Move the current text chunk that makes up the screen
         * back in the buffer by a line */
        temp = csr_y - 25 + 1;
        memcpy ((unsigned char *)textmemptr,
                (unsigned char *)(textmemptr + temp * 80),
                (25 - temp) * 80 * 2);

        /* Finally, we set the chunk of memory that occupies
         * the last line of text to our 'blank' character */
        memsetw (textmemptr + (25 - temp) * 80, blank, 80);
        csr_y = 25 - 1;
    }
}

/* Updates the hardware cursor: the little blinking line
 * on the screen under the last character pressed! */
void move_csr(void)
{
    unsigned temp;

    /* The equation for finding the index in a linear
     * chunk of memory can be represented by:
     * Index = [(y * width) + x] */
    temp = csr_y * 80 + csr_x;

    /* This sends a command to indices 14 and 15 in the
     * CRT Control Register of the VGA controller. These
     * are the high and low bytes of the index that show
     * where the hardware cursor is to be 'blinking'. To
     * learn more, you should look up some VGA specific
     * programming documents. A great start to graphics:
     * http://www.brackeen.com/home/vga */
    outportb(0x3D4, 14);
    outportb(0x3D5, temp >> 8);
    outportb(0x3D4, 15);
    outportb(0x3D5, temp);
}

/* Clears the screen */
void cls()
{
    unsigned blank;
    int i;

    /* Again, we need the 'short' that will be used to
     * represent a space with color */
    blank = 0x20 | (attrib << 8);

    /* Sets the entire screen to spaces in our current
     * color */
    for(i = 0; i < 25; i++)
        memsetw (textmemptr + i * 80, blank, 80);

    /* Update our virtual cursor, and then move the
     * hardware cursor */
    csr_x = 0;
    csr_y = 0;
    move_csr();
}

/* Puts a single character on the screen */
void putch(unsigned char c)
{
    unsigned short *where;
    unsigned att = attrib << 8;

    /* Handle a backspace, by moving the cursor back one space */
    if(c == 0x08)
    {
        if(csr_x != 0) csr_x--;
    }
    /* Handles a tab by incrementing the cursor's x, but only
     * to a point that will make it divisible by 8 */
    else if(c == 0x09)
    {
        csr_x = (csr_x + 8) & ~(8 - 1);
    }
    /* Handles a 'Carriage Return', which simply brings the
     * cursor back to the margin */
    else if(c == '\r')
    {
        csr_x = 0;
    }
    /* We handle our newlines the way DOS and the BIOS do: we
     * treat it as if a 'CR' was also there, so we bring the
     * cursor to the margin and we increment the 'y' value */
    else if(c == '\n')
    {
        csr_x = 0;
        csr_y++;
    }
    /* Any character greater than and including a space, is a
     * printable character. The equation for finding the index
     * in a linear chunk of memory can be represented by:
     * Index = [(y * width) + x] */
    else if(c >= ' ')
    {
        where = textmemptr + (csr_y * 80 + csr_x);
        *where = c | att;   /* Character AND attributes: color */
        csr_x++;
    }

    /* If the cursor has reached the edge of the screen's width, we
     * insert a new line in there */
    if(csr_x >= 80)
    {
        csr_x = 0;
        csr_y++;
    }

    /* Scroll the screen if needed, and finally move the cursor */
    scroll();
    move_csr();
}

/* Uses the above routine to output a string... */
void puts(unsigned char *text)
{
    int i;

    for (i = 0; i < strlen((const char *)text); i++)
    {
        putch(text[i]);
    }
}

/* Sets the forecolor and backcolor that we will use */
void settextcolor(unsigned char forecolor, unsigned char backcolor)
{
    /* Top 4 bits are the background, bottom 4 bits
     * are the foreground color */
    attrib = (backcolor << 4) | (forecolor & 0x0F);
}

/* Sets our text-mode VGA pointer, then clears the screen for us */
void init_video(void)
{
    textmemptr = (unsigned short *)0xB8000;
    cls();
}
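The scrolling idea used in 'scrn.c' can be tried out on an ordinary array before running it against real video memory. The sketch below is a host-side illustration only: the names 'cell_index' and 'scroll_one_line' are assumptions, and it uses the standard library's memmove where the kernel uses its own memcpy.

```c
#include <stdint.h>
#include <string.h>

#define SCREEN_WIDTH  80
#define SCREEN_HEIGHT 25

/* Index = (y * width) + x, as used throughout 'scrn.c'.
 * (Hypothetical helper name.) */
int cell_index(int x, int y)
{
    return y * SCREEN_WIDTH + x;
}

/* Scroll a simulated 80x25 text buffer up by one line.
 * 'screen' stands in for the memory at 0xB8000; 'blank' is a space
 * combined with the current attribute. (Hypothetical helper name.) */
void scroll_one_line(uint16_t *screen, uint16_t blank)
{
    int x;

    /* Move rows 1..24 over rows 0..23. The regions overlap, so the
     * host library's memmove is used here instead of memcpy. */
    memmove(screen, screen + SCREEN_WIDTH,
            (SCREEN_HEIGHT - 1) * SCREEN_WIDTH * sizeof(uint16_t));

    /* Erase the last row by filling it with blanks */
    for (x = 0; x < SCREEN_WIDTH; x++)
        screen[cell_index(x, SCREEN_HEIGHT - 1)] = blank;
}
```

After one call, whatever was on row 1 is on row 0, and row 24 holds only blanks. The kernel's own memcpy is fine for this particular overlap as long as it copies from the lowest address upwards, because the destination is below the source.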
Next, we need to compile this into our kernel. To do that, you need to edit 'build.bat' in order to add a new gcc compile command. Simply copy the command in 'build.bat' that corresponds to 'main.c' and paste it right afterwards. In our newly pasted line, change 'main' to 'scrn'. Again, don't forget to add 'scrn.o' to the list of files that LD needs to link! Before we can use these in main, you must add the function prototypes for putch, puts, cls, init_video, and settextcolor into 'system.h'. Don't forget the 'extern' keyword and the semicolons as these are each function prototypes:
extern void cls();
extern void putch(unsigned char c);
extern void puts(unsigned char *str);
extern void settextcolor(unsigned char forecolor, unsigned char backcolor);
extern void init_video();
Add these to 'system.h' so we can call these from 'main.c'
Now, it's safe to use our new screen printing functions in our main function. Open 'main.c' and add a line that calls init_video(), and finally a line that calls puts, passing it a string: puts("Hello World!"); Finally, save all your changes, double click 'build.bat' to make your kernel, debugging any syntax errors. Copy your 'kernel.bin' to your GRUB floppy disk, and if all went well, you should now have a kernel that prints 'Hello World!' on a black screen in white text!
The GDT
A vital part of the 386's various protection measures is the Global Descriptor Table, otherwise called a GDT. The GDT defines base access privileges for certain parts of memory. We can use an entry in the GDT to generate segment violation exceptions that give the kernel an opportunity to end a process that is doing something it shouldn't. Most modern operating systems use a mode of memory called "Paging" to do this: It is a lot more versatile and allows for higher flexibility. The GDT can also define if a section in memory is executable or if it is, in fact, data. The GDT is also capable of defining what are called Task State Segments (TSSes). A TSS is used in hardware-based multitasking, and is not discussed here. Please note that a TSS is not the only way to enable multitasking.

Note that GRUB already installs a GDT for you, but if we overwrite the area of memory that GRUB was loaded to, we will trash the GDT and this will cause what is called a 'triple fault'. In short, it'll reset the machine. What we should do to prevent that problem is to set up our own GDT in a place in memory that we know and can access. This involves building our own GDT, telling the processor where it is, and finally loading the processor's CS, DS, ES, FS, and GS registers with our new entries. The CS register is also known as the Code Segment. The Code Segment tells the processor at which offset into the GDT it will find the access privileges under which to execute the current code. The DS register is the same idea, but it's not for code, it's the Data Segment and defines the access privileges for the current data. ES, FS, and GS are simply alternate DS registers, and are not important to us.

The GDT itself is a list of 64-bit long entries. These entries define where in memory the allowed region will start, as well as the limit of this region, and the access privileges associated with this entry.
One common rule is that the first entry in your GDT, entry 0, is known as the NULL descriptor. No segment register should be left set to 0: using a null selector to access memory causes a General Protection Fault, which is a protection feature of the processor. The General Protection Fault and several other types of 'exceptions' will be explained in detail under the section on Interrupt Service Routines (ISRs).

Each GDT entry also defines whether or not the current segment that the processor is running in is for System use (Ring 0) or for Application use (Ring 3). There are other ring types, but they are not important. Major operating systems today only use Ring 0 and Ring 3. As a basic rule, any application causes an exception when it tries to access system or Ring 0 data. This protection exists to prevent an application from causing the kernel to crash. As far as the GDT is concerned, the ring levels here tell the processor if it's allowed to execute special privileged instructions. Certain instructions are privileged, meaning that they can only be run at the more privileged ring levels (the lower ring numbers). Examples of this are 'cli' and 'sti' which disable and enable interrupts, respectively. If an application were allowed to use the assembly instructions 'cli' or 'sti', it could effectively stop your kernel from running. You will learn more about interrupts in later sections of this tutorial.

Each GDT entry's Access and Granularity fields can be defined as follows:
Access byte:

Bit(s)   7    6-5    4     3-0
Field    P    DPL    DT    Type

P - Segment is present? (1 = Yes)
DPL - Which Ring (0 to 3)
DT - Descriptor Type
Type - Which type?

Granularity byte:

Bit(s)   7    6    5    4    3-0
Field    G    D    0    A    Segment limit, bits 19:16

G - Granularity (0 = 1byte, 1 = 4kbyte)
D - Operand Size (0 = 16bit, 1 = 32-bit)
0 - Always 0
A - Available for System (Always set to 0)
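To see how these bit fields line up in practice, the following sketch pulls the individual fields out of an access byte and a granularity byte. The struct and function names are assumptions made for this example; the bit positions follow the tables above.

```c
#include <stdint.h>

/* Fields of a GDT access byte (hypothetical struct name) */
struct access_fields {
    unsigned present;   /* P:    bit 7     */
    unsigned dpl;       /* DPL:  bits 6-5  */
    unsigned dt;        /* DT:   bit 4     */
    unsigned type;      /* Type: bits 3-0  */
};

/* Fields of a GDT granularity byte (hypothetical struct name) */
struct gran_fields {
    unsigned g;          /* G: bit 7                      */
    unsigned d;          /* D: bit 6                      */
    unsigned avail;      /* A: bit 4                      */
    unsigned limit_high; /* limit bits 19:16, in bits 3-0 */
};

struct access_fields decode_access(uint8_t access)
{
    struct access_fields f;
    f.present = (access >> 7) & 0x1;
    f.dpl     = (access >> 5) & 0x3;
    f.dt      = (access >> 4) & 0x1;
    f.type    = access & 0xF;
    return f;
}

struct gran_fields decode_granularity(uint8_t gran)
{
    struct gran_fields f;
    f.g          = (gran >> 7) & 0x1;
    f.d          = (gran >> 6) & 0x1;
    f.avail      = (gran >> 4) & 0x1;
    f.limit_high = gran & 0xF;
    return f;
}
```

Decoding 0x9A gives P = 1, DPL = 0, DT = 1 and Type = 0xA, while 0x92 differs only in its Type of 0x2. Decoding 0xCF gives G = 1 (4KByte granularity), D = 1 (32-bit) and the upper limit bits set to 0xF.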
In our tutorial kernel, we will create a GDT with only 3 entries. Why 3? We need one 'dummy' descriptor in the beginning to act as our NULL segment for the processor's memory protection features. We need one entry for the Code Segment, and finally, we need one entry for the Data Segment registers. To tell the processor where our new GDT table is, we use the assembly opcode 'lgdt'. 'lgdt' needs to be given a pointer to a special 48-bit structure. This special 48-bit structure is made up of 16-bits for the limit of the GDT (again, needed for protection so the processor can immediately create a General Protection Fault if we want a segment whose offset doesn't exist in the GDT), and 32-bits for the address of the GDT itself. We can use a simple array of 3 entries to define our GDT. For our special GDT pointer, we only need one to be declared. We call it 'gp'. Create a new file, 'gdt.c'. Get gcc to compile your 'gdt.c' by adding a line to your 'build.bat' as outlined in previous sections of this tutorial. Once again, I remind you to add 'gdt.o' to the list of files that LD needs to link in order to create your kernel! Analyse the following code which makes up the first half of 'gdt.c':
#include <system.h>

/* Defines a GDT entry. We say packed, because it prevents the
 * compiler from inserting padding between the fields: the
 * processor expects this exact 8-byte layout */
struct gdt_entry
{
    unsigned short limit_low;
    unsigned short base_low;
    unsigned char base_middle;
    unsigned char access;
    unsigned char granularity;
    unsigned char base_high;
} __attribute__((packed));

/* Special pointer which includes the limit: The max bytes
 * taken up by the GDT, minus 1. Again, this NEEDS to be packed */
struct gdt_ptr
{
    unsigned short limit;
    unsigned int base;
} __attribute__((packed));

/* Our GDT, with 3 entries, and finally our special GDT pointer */
struct gdt_entry gdt[3];
struct gdt_ptr gp;

/* This will be a function in start.asm. We use this to properly
 * reload the new segment registers */
extern void gdt_flush();
You will notice that we added a declaration for a function that does not exist yet: gdt_flush(). gdt_flush() is the function that actually tells the processor where the new GDT exists, using our special pointer that includes a limit as seen above. We need to reload new segment registers, and finally do a far jump to reload our new code segment. Learn from this code, and add it to 'start.asm' right after the endless loop after 'stublet' in the blank space provided:
; This will set up our new segment registers. We need to do
; something special in order to set CS. We do what is called a
; far jump. A jump that includes a segment as well as an offset.
; This is declared in C as 'extern void gdt_flush();'
global _gdt_flush     ; Allows the C code to link to this
extern _gp            ; Says that '_gp' is in another file
_gdt_flush:
    lgdt [_gp]        ; Load the GDT with our '_gp' which is a special pointer
    mov ax, 0x10      ; 0x10 is the offset in the GDT to our data segment
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
    jmp 0x08:flush2   ; 0x08 is the offset to our code segment: Far jump!
flush2:
    ret               ; Returns back to the C code!
Add these lines to 'start.asm'
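The values 0x08 and 0x10 loaded above are segment selectors. On the x86, a selector holds the descriptor's index in its upper 13 bits, a table indicator in bit 2 (0 for the GDT), and a requested privilege level in bits 1-0. The sketch below shows that arithmetic; the function name 'make_selector' is an assumption for this illustration.

```c
#include <stdint.h>

/* Build a segment selector:
 *   selector = (index << 3) | (table indicator << 2) | RPL
 * 'use_ldt' is 0 for the GDT; 'rpl' is the requested ring (0 to 3).
 * (Hypothetical helper name.) */
uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((use_ldt & 1u) << 2) | (rpl & 3u));
}
```

With ring 0 and the GDT selected, entry 1 (our code segment) gives 0x08 and entry 2 (our data segment) gives 0x10. Because each entry is 8 bytes long, these values are also the byte offsets of the entries within the GDT.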
It's not enough to simply reserve space in memory for a GDT. We need to write values into each GDT entry, set the 'gp' GDT pointer, and then we need to call gdt_flush() to perform the update. There is a special function which follows, called 'gdt_set_gate()', which does all the shifts to set each field in the given GDT entry to the appropriate value using easy to use function arguments. You must add the prototypes for these 2 functions (at the very least we need 'gdt_install') into 'system.h' so that we can use them in 'main.c'. Analyse the following code - it makes up the rest of 'gdt.c':
/* Setup a descriptor in the Global Descriptor Table */
void gdt_set_gate(int num, unsigned long base, unsigned long limit, unsigned char access, unsigned char gran)
{
    /* Setup the descriptor base address */
    gdt[num].base_low = (base & 0xFFFF);
    gdt[num].base_middle = (base >> 16) & 0xFF;
    gdt[num].base_high = (base >> 24) & 0xFF;

    /* Setup the descriptor limits */
    gdt[num].limit_low = (limit & 0xFFFF);
    gdt[num].granularity = ((limit >> 16) & 0x0F);

    /* Finally, set up the granularity and access flags */
    gdt[num].granularity |= (gran & 0xF0);
    gdt[num].access = access;
}

/* Should be called by main. This will setup the special GDT
 * pointer, set up the first 3 entries in our GDT, and then
 * finally call gdt_flush() in our assembler file in order
 * to tell the processor where the new GDT is and update the
 * new segment registers */
void gdt_install()
{
    /* Setup the GDT pointer and limit. The cast makes the
     * pointer-to-integer conversion explicit */
    gp.limit = (sizeof(struct gdt_entry) * 3) - 1;
    gp.base = (unsigned int)&gdt;

    /* Our NULL descriptor */
    gdt_set_gate(0, 0, 0, 0, 0);

    /* The second entry is our Code Segment. The base address
     * is 0, the limit is 4GBytes, it uses 4KByte granularity,
     * uses 32-bit opcodes, and is a Code Segment descriptor.
     * Please check the table above in the tutorial in order
     * to see exactly what each value means */
    gdt_set_gate(1, 0, 0xFFFFFFFF, 0x9A, 0xCF);

    /* The third entry is our Data Segment. It's EXACTLY the
     * same as our code segment, but the descriptor type in
     * this entry's access byte says it's a Data Segment */
    gdt_set_gate(2, 0, 0xFFFFFFFF, 0x92, 0xCF);

    /* Flush out the old GDT and install the new changes! */
    gdt_flush();
}
Add this to 'gdt.c'. It does some of the dirty work relating to the GDT! Don't forget the prototypes in 'system.h'!
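The shifting done by 'gdt_set_gate' can be checked on a host machine before booting anything. The sketch below repeats the same packing on a local copy of the structure; the names 'gdt_entry_sketch' and 'pack_descriptor' are assumptions for this example, and fixed-width types from stdint.h are used in place of the tutorial's plain C types.

```c
#include <stdint.h>

/* Same layout as 'struct gdt_entry', under a hypothetical name */
struct gdt_entry_sketch {
    uint16_t limit_low;
    uint16_t base_low;
    uint8_t  base_middle;
    uint8_t  access;
    uint8_t  granularity;
    uint8_t  base_high;
} __attribute__((packed));

/* Each descriptor must be exactly 64 bits long */
_Static_assert(sizeof(struct gdt_entry_sketch) == 8, "descriptor must be 8 bytes");

/* Split 'base' and 'limit' across the descriptor fields, mirroring
 * the shifts in 'gdt_set_gate'. (Hypothetical helper name.) */
struct gdt_entry_sketch pack_descriptor(uint32_t base, uint32_t limit,
                                        uint8_t access, uint8_t gran)
{
    struct gdt_entry_sketch e;

    e.base_low    = (uint16_t)(base & 0xFFFF);
    e.base_middle = (uint8_t)((base >> 16) & 0xFF);
    e.base_high   = (uint8_t)((base >> 24) & 0xFF);

    e.limit_low   = (uint16_t)(limit & 0xFFFF);
    /* Low nibble: limit bits 19:16. High nibble: the G, D, 0, A flags */
    e.granularity = (uint8_t)(((limit >> 16) & 0x0F) | (gran & 0xF0));
    e.access      = access;
    return e;
}
```

Only 20 bits of the limit are stored, so passing 0xFFFFFFFF keeps 0xFFFFF. With the granularity bit set, that limit is counted in 4KByte units, which is how the code and data segments end up covering 4GBytes. With 3 entries of 8 bytes each, the limit stored in 'gp' works out to 23.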
Now that our GDT loading infrastructure is in place, and we compile and link it into our kernel, we need to call gdt_install() in order to actually do our work! Open 'main.c' and add 'gdt_install();' as the very first line in your main() function. The GDT needs to be one of the very first things that you initialize because as you learned from this section of the tutorial, it's very important. You can now compile, link, and send our kernel to our floppy disk to test it out. You won't see any visible changes on the screen: this is an internal change. Onto the Interrupt Descriptor Table (IDT)!
The IDT
The Interrupt Descriptor Table, or IDT, is used in order to show the processor what Interrupt Service Routine (ISR) to call to handle either an exception or an 'int' opcode (in assembly). IDT entries are also called by Interrupt Requests whenever a device has completed a request and needs to be serviced. Exceptions and ISRs are explained in greater detail in the next section of this tutorial.

Each IDT entry is similar to a GDT entry. Both hold a base address, both hold an access flag, and both are 64-bits long. The major difference between these two types of descriptors is in the meanings of these fields. In an IDT, the base address specified in the descriptor is actually the address of the Interrupt Service Routine that the processor should call when this interrupt is 'raised' (called). An IDT entry doesn't have a limit, instead it has a segment that you need to specify. The segment must be the same segment that the given ISR is located in. This allows the processor to give control to the kernel through an interrupt that has occurred when the processor is in a different ring (like when an application is running).

The access flags of an IDT entry are also similar to a GDT entry's. There is a field to say if the descriptor is actually present or not. There is a field for the Descriptor Privilege Level (DPL) to say which ring is the highest number that is allowed to use the given interrupt. The major difference is the rest of the access flag definition. The lower 5 bits of the access byte are always set to 01110 in binary. This is 14 in decimal. Here is a table to give you a better graphical representation of the access byte for an IDT entry.

Bit(s)   7    6-5    4-0
Field    P    DPL    Always 01110 (14)

P - Segment is present? (1 = Yes)
DPL - Which Ring (0 to 3)

Create a new file in your kernel directory called 'idt.c'. Edit your 'build.bat' file to add another line to make GCC also compile 'idt.c'. Finally, add 'idt.o' to the ever growing list of files that LD needs to link together to create your kernel. 'idt.c' will declare a packed structure that defines each IDT entry, the special IDT pointer structure needed to load the IDT (similar to loading a GDT, but a lot less work!), and also declare an array of 256 IDT entries: This will become our IDT.
#include <system.h>

/* Defines an IDT entry */
struct idt_entry
{
    unsigned short base_lo;
    unsigned short sel;        /* Our kernel segment goes here! */
    unsigned char always0;     /* This will ALWAYS be set to 0! */
    unsigned char flags;       /* Set using the above table! */
    unsigned short base_hi;
} __attribute__((packed));

struct idt_ptr
{
    unsigned short limit;
    unsigned int base;
} __attribute__((packed));

/* Declare an IDT of 256 entries. Although we will only use the
 * first 32 entries in this tutorial, the rest exists as a bit
 * of a trap. If any undefined IDT entry is hit, it normally
 * will cause an "Unhandled Interrupt" exception. Any descriptor
 * for which the 'presence' bit is cleared (0) will generate an
 * "Unhandled Interrupt" exception */
struct idt_entry idt[256];
struct idt_ptr idtp;

/* This exists in 'start.asm', and is used to load our IDT */
extern void idt_load();
This is the beginning half of 'idt.c'. Defines the vital data structures!
Again, like 'gdt.c', you will notice that there is a declaration of a function that physically exists in another file. 'idt_load' is written in assembly language just like 'gdt_flush'. All 'idt_load' does is call the 'lidt' assembly opcode using our special IDT pointer, which we create later in 'idt_install'. Open up 'start.asm', and add the following lines right after the 'ret' for '_gdt_flush':
; Loads the IDT defined in '_idtp' into the processor.
; This is declared in C as 'extern void idt_load();'
global _idt_load
extern _idtp
_idt_load:
    lidt [_idtp]
    ret
Add this to 'start.asm'
Setting up each IDT entry is a lot easier than building a GDT entry. We have an 'idt_set_gate' function which accepts the IDT entry number, the base address of our Interrupt Service Routine, our Kernel Code Segment, and the access flags as outlined in the table introduced above. Again, we have an 'idt_install' function which sets up our special IDT pointer and clears the whole IDT to a known default state of zeros. Finally, we load the IDT by calling 'idt_load'. Please note that you can add ISRs to your IDT at any time after the IDT is loaded. More about ISRs later.
/* Use this function to set an entry in the IDT. A lot simpler
 * than twiddling with the GDT ;) */
void idt_set_gate(unsigned char num, unsigned long base, unsigned short sel, unsigned char flags)
{
    /* We'll leave you to try and code this function: take the
     * argument 'base' and split it up into a high and low 16-bits,
     * storing them in idt[num].base_hi and base_lo. The rest of the
     * fields that you must set in idt[num] are fairly self-
     * explanatory when it comes to setup */
}

/* Installs the IDT */
void idt_install()
{
    /* Sets the special IDT pointer up, just like in 'gdt.c' */
    idtp.limit = (sizeof (struct idt_entry) * 256) - 1;
    idtp.base = (unsigned int)&idt;

    /* Clear out the entire IDT, initializing it to zeros */
    memset((unsigned char *)&idt, 0, sizeof(struct idt_entry) * 256);

    /* Add any new ISRs to the IDT here using idt_set_gate */

    /* Points the processor's internal register to the new IDT */
    idt_load();
}
The rest of 'idt.c'. Try to figure out 'idt_set_gate'. It's easy!
Finally, be sure to add 'idt_set_gate' and 'idt_install' as function prototypes in 'system.h'. Remember that we need to call these functions from other files, like 'main.c'. Call 'idt_install' from inside our 'main()' function, right after the call to 'gdt_install'. You should be able to compile your kernel without problems. Take some time to experiment a bit with your new kernel. If you try to do an illegal operation like dividing by zero, you will find that your machine will reset! We can catch these 'exceptions' by installing Interrupt Service Routines in our new IDT. If you got stuck writing 'idt_set_gate', you may find the solution to this section of the tutorial here.
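If you would like a hint before looking at the solution, here is one possible sketch of the field-splitting that 'idt_set_gate' has to do, written so that it can be tried on a host machine. The names ending in '_sketch' and the 'idt_flags' helper are assumptions for this example and are not part of the tutorial's sources.

```c
#include <stdint.h>

/* Same layout as 'struct idt_entry', under a hypothetical name */
struct idt_entry_sketch {
    uint16_t base_lo;
    uint16_t sel;
    uint8_t  always0;
    uint8_t  flags;
    uint16_t base_hi;
} __attribute__((packed));

/* A stand-in table so the example is self-contained */
struct idt_entry_sketch idt_sketch[256];

/* Build the access byte from the table above:
 * bit 7 = P, bits 6-5 = DPL, bits 4-0 = 01110.
 * (Hypothetical helper name.) */
uint8_t idt_flags(unsigned present, unsigned dpl)
{
    return (uint8_t)(((present & 1u) << 7) | ((dpl & 3u) << 5) | 0x0E);
}

/* Split the handler address into low and high 16-bit halves and
 * fill in the remaining fields of one entry */
void idt_set_gate_sketch(uint8_t num, uint32_t base, uint16_t sel, uint8_t flags)
{
    idt_sketch[num].base_lo = (uint16_t)(base & 0xFFFF);
    idt_sketch[num].base_hi = (uint16_t)((base >> 16) & 0xFFFF);
    idt_sketch[num].sel     = sel;
    idt_sketch[num].always0 = 0;
    idt_sketch[num].flags   = flags;
}
```

A present, ring 0 entry gets the flags value 0x8E, which is the value used for the exception handlers later on. A present entry that ring 3 code may invoke would get 0xEE.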
As mentioned earlier, some exceptions push an error code onto the stack. To decrease the complexity, we handle this by pushing a dummy error code of 0 onto the stack for any ISR that doesn't push an error code already. This keeps a uniform stack frame. To track which exception is firing, we also push the interrupt number on the stack. We use the assembler opcode 'cli' to disable interrupts and prevent an IRQ from firing, which could possibly otherwise cause conflicts in our kernel. To save space in the kernel and make a smaller binary output file, we get each ISR stub to jump to a common 'isr_common_stub'. The 'isr_common_stub' will save the processor state on the stack, push the current stack address onto the stack (gives our C handler the stack), call our C 'fault_handler' function, and finally restore the saved processor state from the stack. Add this code to 'start.asm' in the provided space, filling out all 32 ISRs:
; In just a few pages in this tutorial, we will add our Interrupt
; Service Routines (ISRs) right here!
global _isr0
global _isr1
global _isr2
...                ; Fill in the rest here!
global _isr30
global _isr31

;  0: Divide By Zero Exception
_isr0:
    cli
    push byte 0    ; A normal ISR stub that pushes a dummy error code to keep a
                   ; uniform stack frame
    push byte 0
    jmp isr_common_stub

;  1: Debug Exception
_isr1:
    cli
    push byte 0
    push byte 1
    jmp isr_common_stub

...                ; Fill in from 2 to 7 here!

;  8: Double Fault Exception (With Error Code!)
_isr8:
    cli
    push byte 8    ; Note that we DON'T push a value on the stack in this one!
                   ; The processor pushes one already! Use this type of stub
                   ; for exceptions that push error codes!
    jmp isr_common_stub

...                ; You should fill in from _isr9 to _isr31 here. Remember to
                   ; use the correct stubs to handle error codes and push dummies!
; We call a C function in here. We need to let the assembler know
; that '_fault_handler' exists in another file
extern _fault_handler

; This is our common ISR stub. It saves the processor state, sets
; up for kernel mode segments, calls the C-level fault handler,
; and finally restores the stack frame.
isr_common_stub:
    pusha
    push ds
    push es
    push fs
    push gs
    mov ax, 0x10   ; Load the Kernel Data Segment descriptor!
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov eax, esp   ; Push a pointer to the stack frame for the C handler
    push eax
    mov eax, _fault_handler
    call eax       ; A special call, preserves the 'eip' register
    pop eax
    pop gs
    pop fs
    pop es
    pop ds
    popa
    add esp, 8     ; Cleans up the pushed error code and pushed ISR number
    iret           ; pops 5 things at once: CS, EIP, EFLAGS, SS, and ESP!
Add this to 'start.asm' in the spot we indicated in "The Basic Kernel"
Create yourself a new file called 'isrs.c'. Once again, remember to add the appropriate line to 'build.bat' to get GCC to compile the file. Add the file 'isrs.o' to LD's list of files so that it gets linked into the kernel. 'isrs.c' is rather straightforward: declare our regular #include line, declare the prototypes of each of the ISRs from inside 'start.asm', point each IDT entry to the correct ISR, and finally, create an interrupt handler in C to service all of our exceptions generically. I'll leave it up to you to fill in the holes here:
#include <system.h>

/* These are function prototypes for all of the exception
*  handlers: The first 32 entries in the IDT are reserved
*  by Intel, and are designed to service exceptions! */
extern void isr0();
extern void isr1();
extern void isr2();

...                             /* Fill in the rest of the ISR prototypes here */

extern void isr29();
extern void isr30();
extern void isr31();

/* This is a very repetitive function... it's not hard, it's
*  just annoying. As you can see, we set the first 32 entries
*  in the IDT to the first 32 ISRs. We can't use a for loop
*  for this, because there is no way to get the function names
*  that correspond to that given entry. We set the access
*  flags to 0x8E. This means that the entry is present, is
*  running in ring 0 (kernel level), and has the lower 5 bits
*  set to the required '14', which is represented by 'E' in
*  hex. */
void isrs_install()
{
    idt_set_gate(0, (unsigned)isr0, 0x08, 0x8E);
    idt_set_gate(1, (unsigned)isr1, 0x08, 0x8E);
    idt_set_gate(2, (unsigned)isr2, 0x08, 0x8E);
    idt_set_gate(3, (unsigned)isr3, 0x08, 0x8E);

    ...                         /* Fill in the rest of these ISRs here */

    idt_set_gate(30, (unsigned)isr30, 0x08, 0x8E);
    idt_set_gate(31, (unsigned)isr31, 0x08, 0x8E);
}

/* This is a simple string array. It contains the message that
*  corresponds to each and every exception. We get the correct
*  message by accessing like:
*  exception_messages[interrupt_number] */
unsigned char *exception_messages[] =
{
    "Division By Zero",
    "Debug",
    "Non Maskable Interrupt",

    ...                         /* Fill in the rest here from our Exceptions table */

    "Reserved",
    "Reserved"
};

/* All of our Exception handling Interrupt Service Routines will
*  point to this function. This will tell us what exception has
*  happened! Right now, we simply halt the system by hitting an
*  endless loop. All ISRs disable interrupts while they are being
*  serviced as a 'locking' mechanism to prevent an IRQ from
*  happening and messing up kernel data structures */
void fault_handler(struct regs *r)
{
    /* Is this a fault whose number is from 0 to 31? */
    if (r->int_no < 32)
    {
        /* Display the description for the Exception that occurred.
        *  In this tutorial, we will simply halt the system using an
        *  infinite loop */
        puts(exception_messages[r->int_no]);
        puts(" Exception. System Halted!\n");
        for (;;);
    }
}
The contents of 'isrs.c'
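The comment in 'isrs_install' says that a for loop can't be used because the stub names can't be generated from an index. If the repetition bothers you, one common workaround is to list the stubs once in a table of function pointers and loop over that table instead. The sketch below is only an illustration of that idea, not part of the tutorial's code: it uses four empty stand-in functions instead of the 32 real assembly stubs, and a mock 'idt_set_gate' that records its arguments (with a 'uintptr_t' base) so that it can be compiled and run on an ordinary PC. The names 'isr_stub_t', 'isr_stubs', 'recorded' and 'isrs_install_loop' are made up for this example.

```c
#include <stdint.h>

/* Stand-ins for the assembly stubs. In the kernel, these would be
*  the '_isrN' labels defined in 'start.asm' */
void isr0(void) {}
void isr1(void) {}
void isr2(void) {}
void isr3(void) {}

typedef void (*isr_stub_t)(void);

/* The stub names are written out once, here, in IDT order */
static const isr_stub_t isr_stubs[] = { isr0, isr1, isr2, isr3 };

#define NUM_STUBS (sizeof(isr_stubs) / sizeof(isr_stubs[0]))

/* Mock of 'idt_set_gate' that only remembers what it was given */
struct recorded_gate
{
    uintptr_t base;
    unsigned short sel;
    unsigned char flags;
};

struct recorded_gate recorded[256];

void idt_set_gate(unsigned char num, uintptr_t base,
                  unsigned short sel, unsigned char flags)
{
    recorded[num].base = base;
    recorded[num].sel = sel;
    recorded[num].flags = flags;
}

/* Installs every stub in the table with the same selector and flags */
void isrs_install_loop(void)
{
    unsigned int i;

    for (i = 0; i < NUM_STUBS; i++)
    {
        idt_set_gate((unsigned char)i, (uintptr_t)isr_stubs[i], 0x08, 0x8E);
    }
}
```

You still have to type each stub name once, but the selector and flags now appear in a single place, which makes a typo in entry 17 of 32 less likely.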
Wait, we have a new structure here as an argument to 'fault_handler': struct 'regs'. In this case, 'regs' is a way of showing the C code what the stack frame looks like. Remember that in 'start.asm' we push a pointer to the stack onto the stack itself: this is so that the handlers themselves can retrieve any error codes and interrupt numbers. This design is what allows us to use the same C handler for each different ISR and still be able to tell which exception or interrupt actually happened.
/* This defines what the stack looks like after an ISR was running */
struct regs
{
    unsigned int gs, fs, es, ds;      /* pushed the segs last */
    unsigned int edi, esi, ebp, esp, ebx, edx, ecx, eax;  /* pushed by 'pusha' */
    unsigned int int_no, err_code;    /* our 'push byte #' and ecodes do this */
    unsigned int eip, cs, eflags, useresp, ss;   /* pushed by the processor automatically */
};
Defines a stack frame pointer argument. Add this to 'system.h'
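If the field order in 'regs' looks backwards, it helps to replay the pushes by hand. The following sketch does that on an ordinary PC: it pushes made-up values onto a pretend downward-growing stack in the same order as the processor, an ISR stub, and 'isr_common_stub', then reads the result back through a struct. Everything here ('regs32', 'fake_stack', 'push', 'build_frame') is hypothetical and exists only for this demonstration. Note one simplification: a real processor only pushes 'useresp' and 'ss' when the interrupt arrives from a lower privilege level, while the sketch always pushes them so that it matches the struct.

```c
#include <stdint.h>
#include <string.h>

/* Same field order as 'struct regs', but with fixed-width fields so
*  that the sketch also behaves on a 64-bit host */
struct regs32
{
    uint32_t gs, fs, es, ds;
    uint32_t edi, esi, ebp, esp, ebx, edx, ecx, eax;
    uint32_t int_no, err_code;
    uint32_t eip, cs, eflags, useresp, ss;
};

/* A pretend stack that grows downwards, like the real one */
static uint32_t fake_stack[64];
static uint32_t *fake_esp;

static void push(uint32_t value)
{
    *--fake_esp = value;
}

/* Replays the pushes and returns the resulting frame as a struct */
struct regs32 build_frame(uint32_t int_no, uint32_t err_code)
{
    struct regs32 frame;
    uint32_t i;

    fake_esp = fake_stack + 64;

    /* Pushed by the processor (made-up values) */
    push(0x10);       /* ss */
    push(0x90000);    /* useresp */
    push(0x202);      /* eflags */
    push(0x08);       /* cs */
    push(0x100123);   /* eip */

    /* The error code (real or dummy), then the ISR number */
    push(err_code);
    push(int_no);

    /* 'pusha' order: eax, ecx, edx, ebx, esp, ebp, esi, edi */
    for (i = 1; i <= 8; i++)
    {
        push(0xA0 + i);
    }

    /* push ds, es, fs, gs */
    for (i = 0; i < 4; i++)
    {
        push(0x10);
    }

    /* 'mov eax, esp' and 'push eax' hand this address to the C code */
    memcpy(&frame, fake_esp, sizeof(frame));
    return frame;
}
```

Whatever is pushed last ends up at the lowest address, which is why the struct starts with 'gs' and ends with 'ss'.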
Open 'system.h' and add the definition of struct 'regs' as well as the function prototype for 'isrs_install' so that we can call it from 'main.c'. Finally, call 'isrs_install' from our 'main' function, right after we install our new IDT. It would be a good idea to test out the exception handlers in our kernel now. OPTIONAL: In 'main', add some tester code that will divide a number by zero. As soon as the processor encounters this, it will generate a "Divide By Zero" Exception, and you will see that appear on the screen! When you have tested that and it works, you can delete your exception-causing code (the 'putch(myvar / 0);' line, or whatever you decide to write). You may find the complete solution to 'start.asm' here, and the complete solution to 'isrs.c' here.
                   ; Note that these don't push an error code on the stack:
                   ; We need to push a dummy error code
    push byte 32
    jmp irq_common_stub

...                ; You need to fill in the rest!

; 47: IRQ15
_irq15:
    cli
    push byte 0
    push byte 47
    jmp irq_common_stub

extern _irq_handler

; This is a stub that we have created for IRQ based ISRs. This calls
; '_irq_handler' in our C code. We need to create this in an 'irq.c'
irq_common_stub:
    pusha
    push ds
    push es
    push fs
    push gs
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov eax, esp
    push eax
    mov eax, _irq_handler
    call eax
    pop eax
    pop gs
    pop fs
    pop es
    pop ds
    popa
    add esp, 8
    iret
Add this chunk of code to 'start.asm'
Just like each section of this tutorial before this one, we need to create a new file, this time called 'irq.c'. Edit 'build.bat' to add the appropriate line to get GCC to compile the source, and also remember to add the new object file so that LD links it into our kernel.
#include <system.h>

/* These are our own ISRs that point to our special IRQ handler
*  instead of the regular 'fault_handler' function */
extern void irq0();

...                             /* Add the rest of the entries here to complete the declarations */

extern void irq15();

/* This array is actually an array of function pointers. We use
*  this to handle custom IRQ handlers for a given IRQ */
void *irq_routines[16] =
{
    0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0
};

/* This installs a custom IRQ handler for the given IRQ */
void irq_install_handler(int irq, void (*handler)(struct regs *r))
{
    irq_routines[irq] = handler;
}

/* This clears the handler for a given IRQ */
void irq_uninstall_handler(int irq)
{
    irq_routines[irq] = 0;
}

/* Normally, IRQs 0 to 7 are mapped to entries 8 to 15. This
*  is a problem in protected mode, because IDT entry 8 is a
*  Double Fault! Without remapping, every time IRQ0 fires,
*  you get a Double Fault Exception, which is NOT actually
*  what's happening. We send commands to the Programmable
*  Interrupt Controller (PICs - also called the 8259's) in
*  order to make IRQ0 to 15 be remapped to IDT entries 32 to
*  47 */
void irq_remap(void)
{
    outportb(0x20, 0x11);
    outportb(0xA0, 0x11);
    outportb(0x21, 0x20);
    outportb(0xA1, 0x28);
    outportb(0x21, 0x04);
    outportb(0xA1, 0x02);
    outportb(0x21, 0x01);
    outportb(0xA1, 0x01);
    outportb(0x21, 0x0);
    outportb(0xA1, 0x0);
}

/* We first remap the interrupt controllers, and then we install
*  the appropriate ISRs to the correct entries in the IDT. This
*  is just like installing the exception handlers */
void irq_install()
{
    irq_remap();

    idt_set_gate(32, (unsigned)irq0, 0x08, 0x8E);

    ...                         /* You need to add the rest! */

    idt_set_gate(47, (unsigned)irq15, 0x08, 0x8E);
}

/* Each of the IRQ ISRs point to this function, rather than
*  the 'fault_handler' in 'isrs.c'. The IRQ Controllers need
*  to be told when you are done servicing them, so you need
*  to send them an "End of Interrupt" command (0x20). There
*  are two 8259 chips: The first exists at 0x20, the second
*  exists at 0xA0. If the second controller (an IRQ from 8 to
*  15) gets an interrupt, you need to acknowledge the
*  interrupt at BOTH controllers, otherwise, you only send
*  an EOI command to the first controller. If you don't send
*  an EOI, you won't raise any more IRQs */
void irq_handler(struct regs *r)
{
    /* This is a blank function pointer */
    void (*handler)(struct regs *r);

    /* Find out if we have a custom handler to run for this
    *  IRQ, and then finally, run it */
    handler = irq_routines[r->int_no - 32];
    if (handler)
    {
        handler(r);
    }

    /* If the IDT entry that was invoked was 40 or greater
    *  (meaning IRQ8 - 15), then we need to send an EOI to
    *  the slave controller */
    if (r->int_no >= 40)
    {
        outportb(0xA0, 0x20);
    }

    /* In either case, we need to send an EOI to the master
    *  interrupt controller too */
    outportb(0x20, 0x20);
}
The contents of 'irq.c'
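The bytes written by 'irq_remap' are easier to follow once they have names. The sketch below repeats the same ten port writes with named constants and a comment for each initialisation step of the 8259. The constant names, the 'pic_remap' function with its two offset parameters, and the mock 'outportb' that logs writes instead of touching hardware are all my own inventions for this illustration, so that it can be compiled and run outside the kernel.

```c
/* Hypothetical mock of 'outportb': it records each write in a log
*  instead of talking to real hardware */
struct port_write
{
    unsigned short port;
    unsigned char value;
};

struct port_write port_log[16];
int port_log_len = 0;

void outportb(unsigned short port, unsigned char value)
{
    if (port_log_len < 16)
    {
        port_log[port_log_len].port = port;
        port_log[port_log_len].value = value;
        port_log_len++;
    }
}

#define PIC1_COMMAND   0x20   /* Master PIC command port */
#define PIC1_DATA      0x21   /* Master PIC data port */
#define PIC2_COMMAND   0xA0   /* Slave PIC command port */
#define PIC2_DATA      0xA1   /* Slave PIC data port */

#define ICW1_INIT_ICW4 0x11   /* Start initialisation, ICW4 will follow */
#define ICW4_8086      0x01   /* 8086/88 mode */

void pic_remap(unsigned char master_offset, unsigned char slave_offset)
{
    outportb(PIC1_COMMAND, ICW1_INIT_ICW4);  /* ICW1: begin init sequence */
    outportb(PIC2_COMMAND, ICW1_INIT_ICW4);
    outportb(PIC1_DATA, master_offset);      /* ICW2: IDT entry for IRQ0 */
    outportb(PIC2_DATA, slave_offset);       /* ICW2: IDT entry for IRQ8 */
    outportb(PIC1_DATA, 0x04);               /* ICW3: slave sits on IRQ2 (bit 2) */
    outportb(PIC2_DATA, 0x02);               /* ICW3: slave's cascade identity */
    outportb(PIC1_DATA, ICW4_8086);          /* ICW4: operating mode */
    outportb(PIC2_DATA, ICW4_8086);
    outportb(PIC1_DATA, 0x00);               /* Mask register: enable all IRQs */
    outportb(PIC2_DATA, 0x00);
}
```

Calling 'pic_remap(0x20, 0x28)' produces the same sequence of writes as 'irq_remap' above: IRQ0 to 7 land on IDT entries 32 to 39, and IRQ8 to 15 on entries 40 to 47.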
In order to actually install the IRQ handling ISRs, we need to call 'irq_install' from inside the 'main' function in 'main.c'. Before you add the call, you need to add function prototypes to 'system.h' for 'irq_install', 'irq_install_handler', and 'irq_uninstall_handler'. 'irq_install_handler' lets us install a custom IRQ sub-handler for a device under a given IRQ. In a later section, we will use it to install custom IRQ handlers for both the System Clock (the PIT - IRQ0) and the Keyboard (IRQ1). Add 'irq_install' to the 'main' function in 'main.c', right after we install our exception ISRs. Immediately following that line, it's safe to allow IRQs to happen. Add the line: __asm__ __volatile__ ("sti");

Congratulations, you have now followed, step by step, how to create a simple kernel that is capable of handling IRQs and Exceptions. An IDT is installed, along with a custom GDT to replace the original one loaded by GRUB. If you have understood all that is mentioned up until this point, you have passed one of the biggest hurdles associated with Operating System development. Most hobbyist OS developers do not successfully get past installing ISRs and an IDT. Next, we will learn about the simplest device to use an IRQ: The Programmable Interval Timer (PIT).
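Before moving on, it may help to watch 'irq_install_handler' and 'irq_handler' cooperate without any hardware involved. The following is a cut-down sketch that can be run on an ordinary PC. It assumes a trimmed 'struct regs' with only 'int_no', a mock 'outportb' that just remembers which port received an End of Interrupt, and a made-up 'demo_keyboard_handler'. It also stores the handlers in a typed function pointer array ('irq_handler_t', my own name) instead of the 'void *' array used in 'irq.c', and adds a range check that the tutorial code does not have.

```c
/* Trimmed down: only the field that this sketch needs */
struct regs
{
    unsigned int int_no;
};

typedef void (*irq_handler_t)(struct regs *r);

static irq_handler_t irq_routines[16];

/* Hypothetical mock of 'outportb': remembers where each EOI went */
unsigned short eoi_ports[8];
int eoi_count = 0;

void outportb(unsigned short port, unsigned char value)
{
    if (value == 0x20 && eoi_count < 8)
    {
        eoi_ports[eoi_count++] = port;
    }
}

void irq_install_handler(int irq, irq_handler_t handler)
{
    if (irq >= 0 && irq < 16)
    {
        irq_routines[irq] = handler;
    }
}

void irq_handler(struct regs *r)
{
    /* IDT entries 32 to 47 correspond to IRQ0 to IRQ15 */
    irq_handler_t handler = irq_routines[r->int_no - 32];

    if (handler)
    {
        handler(r);
    }

    /* IRQ8 to 15 come through the slave, so it needs an EOI too */
    if (r->int_no >= 40)
    {
        outportb(0xA0, 0x20);
    }

    /* The master always gets an EOI */
    outportb(0x20, 0x20);
}

/* A made-up device handler that just counts how often it ran */
int keyboard_hits = 0;

void demo_keyboard_handler(struct regs *r)
{
    (void)r;
    keyboard_hits++;
}
```

Installing 'demo_keyboard_handler' on IRQ1 and then calling 'irq_handler' with 'int_no' set to 33 runs the handler once and sends a single EOI to port 0x20. With 'int_no' set to 44 and no handler installed, nothing is called, but both controllers still receive their EOI.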
this function. Use it if you wish; we won't use it in this tutorial, to keep things simple. For accurate and easy timekeeping, I recommend setting the timer to 100Hz in a real kernel.
void timer_phase(int hz)
{
    int divisor = 1193180 / hz;       /* Calculate our divisor */
    outportb(0x43, 0x36);             /* Set our command byte 0x36 */
    outportb(0x40, divisor & 0xFF);   /* Set low byte of divisor */
    outportb(0x40, divisor >> 8);     /* Set high byte of divisor */
}
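To get a feel for the numbers that 'timer_phase' sends, here is a small sketch that performs only the arithmetic and can be run on any PC. The struct 'pit_divisor' and both function names are made up for this example; the 1193180 constant is the one used by 'timer_phase'. Because the divisor is an integer, the rate you actually get is usually a tiny bit off from the one you asked for, which the second function shows in thousandths of a Hz.

```c
#define PIT_INPUT_HZ 1193180   /* The input clock value used by 'timer_phase' */

/* The two bytes that get written to port 0x40, low byte first */
struct pit_divisor
{
    unsigned char low;
    unsigned char high;
};

struct pit_divisor pit_divisor_for(int hz)
{
    struct pit_divisor d;
    int divisor = PIT_INPUT_HZ / hz;

    d.low = (unsigned char)(divisor & 0xFF);
    d.high = (unsigned char)((divisor >> 8) & 0xFF);
    return d;
}

/* The rate that the integer divisor really produces, in
*  thousandths of a Hz */
long pit_actual_millihz(int hz)
{
    long divisor = PIT_INPUT_HZ / hz;

    return (PIT_INPUT_HZ * 1000L) / divisor;
}
```

For 100Hz, the divisor is 11931 (0x2E9B), so the low byte is 0x9B and the high byte is 0x2E. Keep in mind that only two bytes are sent, so the divisor has to fit in 16 bits: with this input clock, that means asking for 19Hz or more.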
Create a file called 'timer.c', and add it to your 'build.bat' as you've been shown in the previous sections of this tutorial. As you analyse the following code, you will see that we keep track of the number of ticks that the timer has fired. This can be used as a 'system uptime counter' as your kernel gets more complicated. The timer handler here relies on the default rate of roughly 18.2Hz to display a simple "One second has passed" message about once every second. If you decide to use the 'timer_phase' function in your code to set 100Hz, you should change the 'timer_ticks % 18 == 0' line in 'timer_handler' to 'timer_ticks % 100 == 0' instead. You could set the timer phase from any function in the kernel; however, I recommend setting it in 'timer_install' if anywhere, to keep things organized.
#include <system.h>

/* This will keep track of how many ticks that the system
*  has been running for. It is 'volatile' because it changes
*  inside the interrupt handler while other code may be
*  polling it */
volatile int timer_ticks = 0;

/* Handles the timer. In this case, it's very simple: We
*  increment the 'timer_ticks' variable every time the
*  timer fires. By default, the timer fires roughly 18.2 times
*  per second. Why 18.2Hz? Some engineer at IBM must've
*  been smoking something funky */
void timer_handler(struct regs *r)
{
    /* Increment our 'tick count' */
    timer_ticks++;

    /* Every 18 clocks (approximately 1 second), we will
    *  display a message on the screen */
    if (timer_ticks % 18 == 0)
    {
        puts("One second has passed\n");
    }
}

/* Sets up the system clock by installing the timer handler
*  into IRQ0 */
void timer_install()
{
    /* Installs 'timer_handler' to IRQ0 */
    irq_install_handler(0, timer_handler);
}
Example of using the system timer: 'timer.c'
Remember to add a call to 'timer_install' in the 'main' function in 'main.c'. Having trouble? Remember to add a function prototype of 'timer_install' to 'system.h'! The next bit of code is more of a demonstration of what you can do with the system timer. If you look carefully, this simple function waits in a loop until the given number of 'ticks' or timer phases has gone by. This is similar to the 'delay' function that DOS compilers such as DJGPP provide, except that the length of the wait depends on the timer phase that you set:
/* This will continuously loop until the given time has
*  been reached */
void timer_wait(int ticks)
{
    unsigned long eticks;

    eticks = timer_ticks + ticks;
    while(timer_ticks < eticks);
}
If you wish, add this to 'timer.c' and a prototype to 'system.h'
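Two details of a wait loop like this are worth a closer look. First, the tick counter has to be declared 'volatile', or an optimising compiler may read it once and spin forever. Second, comparing against an end value stops working when the counter wraps around, whereas measuring the elapsed ticks with unsigned subtraction keeps working. The sketch below shows both, together with a helper to convert milliseconds to ticks. It is meant to be run on an ordinary PC, so 'fake_timer_irq' stands in for the real IRQ0 handler; that function, 'timer_wait_safe', 'ms_to_ticks', and the use of 'uint32_t' are all my own assumptions for this example.

```c
#include <stdint.h>

/* 'volatile' tells the compiler that this value can change behind
*  its back (in a kernel: from the IRQ0 handler), so the wait loop
*  must re-read it on every pass */
volatile uint32_t timer_ticks = 0;

/* Hypothetical stand-in for the timer interrupt */
static void fake_timer_irq(void)
{
    timer_ticks++;
}

/* Converts milliseconds to ticks at the given timer rate, rounding
*  up so that we never wait for less than was asked. Note that
*  'ms * hz' must fit in 32 bits */
uint32_t ms_to_ticks(uint32_t ms, uint32_t hz)
{
    return (ms * hz + 999) / 1000;
}

/* Waits by measuring how many ticks have elapsed since the start.
*  Unsigned subtraction gives the right answer even if the counter
*  wraps around to zero in the middle of the wait */
void timer_wait_safe(uint32_t ticks)
{
    uint32_t start = timer_ticks;

    while ((uint32_t)(timer_ticks - start) < ticks)
    {
        fake_timer_irq();   /* In a kernel, this loop body would be empty */
    }
}
```

With a 100Hz timer, 'timer_wait_safe(ms_to_ticks(500, 100))' would wait for 50 ticks, or about half a second.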
Next, we will discuss how to use the keyboard. This involves installing a custom IRQ handler just as we did for the timer, with hardware I/O on each interrupt.
The Keyboard
A keyboard is the most common way for a user to give a computer input, therefore it is vital that you create a driver of some sort for handling and managing the keyboard. When you get down to it, getting the basics of the keyboard isn't too bad. Here we will show the basics: how to get a key when it is pressed, and how to convert what's called a 'scancode' to standard ASCII characters that we can understand properly.

A scancode is simply a key number. The keyboard assigns a number to each key on the keyboard; this is your scancode. The scancodes are numbered generally from top to bottom and left to right, with some minor exceptions to keep layouts backwards compatible with older keyboards. You must use a lookup table (an array of values) and use the scancode as the index into this table. The lookup table is called a keymap, and will be used to translate scancodes into ASCII values rather quickly and painlessly. One last note about a scancode before we head into code: if bit 7 is set (test with 'scancode & 0x80'), then this is the keyboard's way of telling us that a key was just released. Create yourself a 'kb.c' and do all your standard procedures like adding a line for GCC and adding a file to LD's command line.
/* KBDUS means US Keyboard Layout. This is a scancode table
*  used to layout a standard US keyboard. I have left some
*  comments in to give you an idea of what key is what, even
*  though I set its array index to 0. You can change that to
*  whatever you want using a macro, if you wish! */
unsigned char kbdus[128] =
{
    0,  27, '1', '2', '3', '4', '5', '6', '7', '8',     /* 9 */
  '9', '0', '-', '=', '\b',     /* Backspace */
  '\t',                         /* Tab */
  'q', 'w', 'e', 'r',           /* 19 */
  't', 'y', 'u', 'i', 'o', 'p', '[', ']', '\n', /* Enter key */
    0,                          /* 29   - Control */
  'a', 's', 'd', 'f', 'g', 'h', 'j', 'k', 'l', ';',     /* 39 */
 '\'', '`',   0,                /* Left shift */
 '\\', 'z', 'x', 'c', 'v', 'b', 'n',                    /* 49 */
  'm', ',', '.', '/',   0,      /* Right shift */
  '*',
    0,  /* Alt */
  ' ',  /* Space bar */
    0,  /* Caps lock */
    0,  /* 59 - F1 key ... > */
    0,   0,   0,   0,   0,   0,   0,   0,
    0,  /* < ... F10 */
    0,  /* 69 - Num lock*/
    0,  /* Scroll Lock */
    0,  /* Home key */
    0,  /* Up Arrow */
    0,  /* Page Up */
  '-',
    0,  /* Left Arrow */
    0,
    0,  /* Right Arrow */
  '+',
    0,  /* 79 - End key*/
    0,  /* Down Arrow */
    0,  /* Page Down */
    0,  /* Insert Key */
    0,  /* Delete Key */
    0,   0,   0,
    0,  /* F11 Key */
    0,  /* F12 Key */
    0,  /* All other keys are undefined */
};
Sample keymap. Add this array to your 'kb.c'
Converting a scancode to an ASCII value is easy with this: mychar = kbdus[scancode]; Note that although we leave comments for the function keys and shift/control/alt, we leave them as 0's in the array. If you want to trap these keys, pick values for them that you normally wouldn't use, such as otherwise unused ASCII codes. I'll leave this up to you, but you should keep a global variable to be used as a key status variable. This keystatus variable will have one bit set for ALT, one for CONTROL, and one for SHIFT. It's also a good idea to have one for CAPSLOCK, NUMLOCK, and SCROLLLOCK. This tutorial will explain how to set the keyboard lights, but we will leave it up to you to actually write the code for it.

The keyboard is attached to the computer through a special microcontroller chip on your mainboard. This keyboard controller chip has 2 channels: one for the keyboard, and one for the mouse. Also note that it is through this keyboard controller chip that you would enable the A20 address line on the processor to allow you to access memory past the 1MByte mark (GRUB enables this, you don't need to worry about it). The keyboard controller, being a device accessible by the system, has an address on the I/O bus that we can use for access and control. The keyboard controller has 2 main registers: a Data register at 0x60, and a Control register at 0x64. Anything that the keyboard wants to send the computer is stored into the Data register. The keyboard will raise IRQ1 whenever it has data for us to read. Observe:
/* Handles the keyboard interrupt */
void keyboard_handler(struct regs *r)
{
    unsigned char scancode;

    /* Read from the keyboard's data buffer */
    scancode = inportb(0x60);

    /* If the top bit of the byte we read from the keyboard is
    *  set, that means that a key has just been released */
    if (scancode & 0x80)
    {
        /* You can use this one to see if the user released the
        *  shift, alt, or control keys... */
    }
    else
    {
        /* Here, a key was just pressed. Please note that if you
        *  hold a key down, you will get repeated key press
        *  interrupts. */

        /* Just to show you how this works, we simply translate
        *  the keyboard scancode into an ASCII value, and then
        *  display it to the screen. You can get creative and
        *  use some flags to see if a shift is pressed and use a
        *  different layout, or you can add another 128 entries
        *  to the above layout to correspond to 'shift' being
        *  held. If shift is held using the larger lookup table,
        *  you would add 128 to the scancode when you look for it */
        putch(kbdus[scancode]);
    }
}
This might look intimidating, but it's 80% comments ;) Add to 'kb.c'
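The comments above hint at tracking shift with a flag. Here is one way that could look, as a sketch that runs on an ordinary PC instead of in the kernel. It uses a 'key_status' variable with a single shift bit and two small lookup tables (a variation on the "larger table" idea mentioned in the comments), with only three keys filled in to keep it short. All of the names ('translate_scancode', 'map_plain', 'map_shift', 'KS_SHIFT' and so on) are made up for this example. The shift scancodes 42 and 54 are the ones marked in the keymap above. As a simplification, releasing either shift key clears the flag, even if the other one is still held.

```c
#include <stdint.h>

#define SC_RELEASED 0x80   /* Bit 7: the key was released */
#define SC_LSHIFT   42     /* Left shift in the keymap above */
#define SC_RSHIFT   54     /* Right shift in the keymap above */

#define KS_SHIFT    0x01   /* Bit 0 of the key status variable */

static uint8_t key_status = 0;

/* Tiny demo keymaps: all other entries default to 0 */
static const unsigned char map_plain[128] =
{
    [2] = '1', [16] = 'q', [30] = 'a'
};

static const unsigned char map_shift[128] =
{
    [2] = '!', [16] = 'Q', [30] = 'A'
};

/* Returns the character for a key press, or 0 if this scancode
*  does not produce one (releases, shift keys, unmapped keys) */
unsigned char translate_scancode(uint8_t scancode)
{
    uint8_t key = scancode & 0x7F;
    int released = (scancode & SC_RELEASED) != 0;

    if (key == SC_LSHIFT || key == SC_RSHIFT)
    {
        if (released)
        {
            key_status &= (uint8_t)~KS_SHIFT;
        }
        else
        {
            key_status |= KS_SHIFT;
        }
        return 0;
    }

    if (released)
    {
        return 0;
    }

    return (key_status & KS_SHIFT) ? map_shift[key] : map_plain[key];
}
```

In the kernel, 'keyboard_handler' would call something like this with the byte read from port 0x60, and only call 'putch' when the result is not 0.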
As you can see, the keyboard will generate an IRQ1 telling us that it has data ready for us to grab. The keyboard's data register exists at 0x60. When the IRQ happens, we call this handler which reads from port 0x60. This data that we read is the keyboard's scancode. For this example, we check if the key was pressed or released. If it was just pressed, we translate the scancode to ASCII, and print that character out with one line. Write a 'keyboard_install' function that calls 'irq_install_handler' to install 'keyboard_handler' as the custom handler for IRQ1. Be sure to make a call to 'keyboard_install' from inside 'main'.

In order to set the lights on your keyboard, you must send the keyboard controller a command. There is a specific procedure for sending the keyboard a command. You must first wait for the keyboard controller to let you know when it's not busy. To do this, you read from the Control register (when you read from it, it's called a Status register) in a loop, breaking out when the keyboard isn't busy: if ((inportb(0x64) & 2) == 0) break; After that loop, you may write the command byte to the Data register. You don't write to the control register itself except in special cases. To set the lights on the keyboard, you first send the command byte 0xED using the described method, then you send the byte that says which lights are to be on or off. This byte has the following format: Bit0 is Scroll lock, Bit1 is Num lock, and Bit2 is Caps lock.

Now that you have basic keyboard support, you may wish to expand upon the code. This section on the keyboard was more to show you how to do the basics rather than give an extremely detailed overview of all of the keyboard controller's functions. Note that you use the keyboard controller to enable and handle the PS/2 mouse port. The auxiliary channel on the keyboard controller manages the PS/2 mouse.
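The tutorial leaves the keyboard lights up to you, but the procedure described above is short enough to sketch. The version below runs on an ordinary PC by replacing 'inportb' and 'outportb' with mocks: the mock status register reports "busy" for a few reads and the mock data port records what was sent. The function names 'kbd_wait_ready' and 'kbd_set_leds' and all of the constants are my own, not from the tutorial. On real hardware, the keyboard also answers each byte with an acknowledge byte (0xFA), which this sketch ignores.

```c
#define KBD_DATA          0x60   /* Data register */
#define KBD_STATUS        0x64   /* Control register, Status when read */
#define KBD_STATUS_BUSY   0x02   /* Bit 1: controller is not ready for input */
#define KBD_CMD_SET_LEDS  0xED

#define LED_SCROLL        0x01   /* Bit0 */
#define LED_NUM           0x02   /* Bit1 */
#define LED_CAPS          0x04   /* Bit2 */

/* Hypothetical mocks: pretend the controller is busy for a few
*  status reads, and record every byte written to the data port */
static int busy_reads_left = 3;
unsigned char sent[8];
int sent_count = 0;

unsigned char inportb(unsigned short port)
{
    if (port == KBD_STATUS && busy_reads_left > 0)
    {
        busy_reads_left--;
        return KBD_STATUS_BUSY;
    }
    return 0;
}

void outportb(unsigned short port, unsigned char value)
{
    if (port == KBD_DATA && sent_count < 8)
    {
        sent[sent_count++] = value;
    }
    busy_reads_left = 2;   /* Busy again for a while after each write */
}

/* Loops until the controller says that it can accept a byte */
static void kbd_wait_ready(void)
{
    for (;;)
    {
        if ((inportb(KBD_STATUS) & KBD_STATUS_BUSY) == 0)
            break;
    }
}

/* Sends the 'set lights' command, followed by the lights byte */
void kbd_set_leds(unsigned char leds)
{
    kbd_wait_ready();
    outportb(KBD_DATA, KBD_CMD_SET_LEDS);
    kbd_wait_ready();
    outportb(KBD_DATA, leds & 0x07);
}
```

For example, 'kbd_set_leds(LED_NUM | LED_CAPS)' sends 0xED followed by 0x06. In a real driver, you would typically call this whenever your key status variable changes one of the lock states.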
Up to this point we have a kernel that can draw to the screen, handle exceptions, handle IRQs, handle the timer, and handle the keyboard. Read on to find out what's next in store for your kernel development.
What's Left
What you do next to your kernel is completely up to you. The next thing you should think of writing is a memory manager. A memory manager will allow you to grab chunks of memory so that you can dynamically allocate and free memory as you need it. Using a memory manager, you can use more complicated data structures such as linked lists and binary trees to allow for more efficient storage and manipulation of data. It's also a way of preventing applications from writing to kernel pages, which is a feature of protection.

It's possible to write a VGA driver, also. Using a VGA driver, you can set up different graphics modes in your kernel, allowing higher resolutions and graphical display options such as buttons and images. If you want to go further, you could eventually look into VESA video modes for high color and higher resolutions.

You could eventually write a device interface which would allow you to load or unload kernel 'modules' as you need them. Add support for filesystems and disk drives so that you can access files off disks and open applications. You could also add multitasking support and design scheduling algorithms that give certain tasks higher priority and more time to run, according to what each application needs. The multitasking system relies closely on your memory manager to give each task a separate space in memory.
I hope that this tutorial has given you a more thorough understanding of some of the various low-level items involved in creating a kernel: a driver for your processor and memory.