May the source be with you, but remember the KISS principle ;-)
Skepticism and critical thinking are not a panacea, but they can help you understand the world better

Unix/Linux Internals


At the center of the UNIX onion is a program called the kernel. Although you are unlikely to deal with the kernel directly, it is absolutely crucial to the operation of the UNIX system.

The kernel provides the essential services that make up the heart of UNIX systems; it allocates memory, keeps track of the physical location of files on the computer's hard disks, loads and executes binary programs such as shells, and schedules the task swapping without which UNIX systems would be incapable of doing more than one thing at a time. The kernel accomplishes all these tasks by providing an interface between the other programs running under its control and the physical hardware of the computer; this interface, the system call interface, effectively insulates the other programs on the UNIX system from the complexities of the computer. For example, when a running program needs access to a file, it cannot simply open the file; instead it issues a system call which asks the kernel to open the file. The kernel takes over and handles the request, then notifies the program whether the request succeeded or failed. To read data in from the file takes another system call; the kernel determines whether or not the request is valid, and if it is, the kernel reads the required block of data and passes it back to the program. Unlike DOS (and some other operating systems), UNIX system programs do not have access to the physical hardware of the computer. All they see are the kernel services, provided by the system call interface.
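The open/read sequence described above can be made concrete with a small sketch. This is a minimal illustration (not code from the original text) using Python's os module, whose os.open, os.read, and os.close functions are thin wrappers around the open(2), read(2), and close(2) system calls:

```python
import os

# Minimal sketch of the system-call sequence described above: the program
# cannot touch the disk itself; it asks the kernel to open the file, read
# one block, and close the descriptor, and the kernel reports the outcome.

def read_first_block(path, block_size=4096):
    """Return the first block of a file, or None if the kernel refuses."""
    try:
        fd = os.open(path, os.O_RDONLY)    # open(2): kernel validates the request
    except OSError:
        return None                        # the request failed
    try:
        return os.read(fd, block_size)     # read(2): kernel copies data to us
    finally:
        os.close(fd)                       # close(2): release the descriptor
```

Every step goes through the system call interface; at no point does the program see the disk hardware itself.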

The critical part of the kernel is usually loaded into a protected area of memory, which prevents it from being overwritten by other parts of the operating system or, worse, by applications (the very possibility of overwriting kernel memory in Unix is a side effect of using C, which does not check pointers). The kernel performs its tasks, such as executing processes and handling interrupts, in kernel space, whereas other processes operate in user space. This separation prevents user data and kernel data from interfering with each other, which could degrade performance or make the system unstable (and possibly crash it).

The system call interface is an example of an API, or application programming interface. An API is a set of calls with strictly defined parameters that allows an application (or other program) to request access to a service; it literally acts as an interface. In this sense the kernel is not that different from other large applications. For example, a database system also provides an API that allows programmers to write external programs that request services from the database.

1978 IBM poster explaining virtual memory



Old News ;-)

[Oct 08, 2019] How does converting from raid 5 to 6 work? (on the back end)

Oct 08, 2019


4 points · 13 hours ago

RAID 5 stripes the data over N disks with an additional stripe containing the parity, basically the XOR of all the other disks. RAID 6 uses the same parity as RAID 5 but also stores a different type of parity on an extra disk. So RAID 5 requires N+1 disks and RAID 6 requires N+2 disks. In theory you can just add another disk and fill it with the second parity and you have a RAID 6; however, it is not that simple. The parity on both RAID 5 and 6 rotates for each stripe: if the parity is stored on disk 1 for the first stripe, it is stored on disk 2 for the second, and so forth. So if you add an additional disk, all the stripes need to be rewritten in the new scheme. Some RAID controllers have this functionality. The tricky thing is that you need to track how far you have gone so that in the case of a power failure you can still retrieve the data. In any case it does require another disk.
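The XOR parity rule this comment describes can be sketched in a few lines. This toy example (mine, not the commenter's) covers only the single RAID 5-style parity; RAID 6's second parity is normally a Reed-Solomon code, not a plain XOR:

```python
from functools import reduce

# Toy model of the parity described above: the RAID 5 parity stripe is the
# byte-wise XOR of the data stripes, so any one lost stripe is recoverable
# by XOR-ing the survivors. (RAID 6's second parity is more involved.)

def xor_stripes(stripes):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*stripes))

data = [b"AAAA", b"BBBB", b"CCCC"]           # stripes on three data disks
parity = xor_stripes(data)                   # the (rotating) parity stripe

# Disk 1 dies: rebuild its stripe from the surviving disks plus parity.
rebuilt = xor_stripes([data[0], data[2], parity])
assert rebuilt == data[1]
```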

OnARedditDiet Windows Admin 4 points · 14 hours ago

You're going to put your RAID in degraded mode so you're basically causing it to be in a one drive failed scenario and then asking it to rewrite every disk. Is that something you want to do?

Dry_Soda 6 points · 12 hours ago

What could possibly go wrong? #YOLO

25cmshlong OCP DBA 12c, OCE 12c, OCP Solaris 11, RHCE, NCSE ONTAP, CCNA R&S 1 point · 9 hours ago

Not much. It is adding another parity disk, so worst case the array will be left in its initial state: single parity (RAID 5).

(Ofc the truly worst case is that reading all the drives will overload the power supply and fry the whole disk subsystem. But it is better not to think about it, since RAID 6 will not help there either :)

OnARedditDiet Windows Admin 1 point · 9 hours ago

It's very well known that a full read/write pass, such as the one that comes from rebuilding a degraded RAID, can crash the RAID by exposing existing hard drive issues. In this case nothing has failed, but you could crash the RAID by fixing a non-fault situation.

25cmshlong OCP DBA 12c, OCE 12c, OCP Solaris 11, RHCE, NCSE ONTAP, CCNA R&S 1 point · 8 hours ago
· edited 5 hours ago

EDIT: Oops, I remembered that in most implementations of RAID (i.e., not ZFS & WAFL) there are no dedicated parity/dparity drives, but rather rotating parity. So there will definitely be reading and rewriting on all disks of the array.

So text below is incorrect for most RAID subsystems

That's true, but not a concern while adding a parity disk. If some latent bad stripes appear, they can be recovered using the original parity. All writes during conversion will go to the new parity disk; the original data on the drives stays intact.

drbluetongue Drunk while on-call 1 point · 5 hours ago

I don't know why you were downvoted - the most likely time you will get a disk failure is during a rebuild of an array. I've had one fail during rebuild that was from the same batch as the already failed disk in a RAID 6, thank god it was RAID 6...

Nowadays, at least at my old job, we made sure to ask the vendor for disks for the SANs to be randomised

[Aug 31, 2019] The Linux Programming Interface

Aug 31, 2019

"Michael Kerrisk has been the maintainer of the Linux Man Pages collection (man 7) for more than five years now, and it is safe to say that he has contributed more to the Linux documentation available in the online manual than any other author before. For this reason, a few years back he received a Linux Foundation fellowship meant to allow him to devote his full time to furthering this endeavor. His book is entirely focused on the system interface and environment that Linux (and, to some extent, any *NIX system) provides to a programmer. My most obvious choice for a comparison of the same caliber is Michael K. Johnson and Eric W. Troan's venerable Linux Application Development, the second edition of which was released in 2004 and is somewhat in need of a refresh, lamentably, because it is an awesome book that belongs on any programmer's shelf. While Johnson and Troan have introduced a whole lot of programmers to the pleasure of coding to Linux's APIs, their approach is that of a nicely flowing tutorial, not necessarily complete, but unusually captivating and very suitable to academic use. Michael's book is a different kind of beast: while the older tome selects exquisite material, it is nowhere near as complete as his -- everything relating to the subject that I could reasonably think of is in the book, in a very thorough and maniacally complete yet enjoyably readable way -- I did find one humorous exception, more on that later." Keep reading for the rest of Federico's review.

The Linux Programming Interface
author Michael Kerrisk
pages 1552
publisher No Starch Press
rating 8/10
reviewer Federico Lucifredi
ISBN 9781593272203
summary The definitive guide to the Linux and UNIX programming interface
This book is an unusual, if not altogether unique, entry into the Linux programming library: for one, it is a work of encyclopedic breadth and depth, spanning in great detail concepts usually spread across a multitude of medium-sized books, but by this yardstick the book is actually rather concise, as it is neatly segmented into 64 nearly self-contained chapters that work very nicely as short, deep-dive technical guides. I have collected an extremely complete technical library over the years, and pretty much any book of significance that came out of the Linux and Bell Labs communities is in it -- it is about 4 shelves, and it is far from portable. It is very nice to be able to reach out and pick the definitive work on IPC, POSIX threads, or one of several socket programming guides -- not least because, having read them, I know what and where to pick from them. But for those out there who have not invested so much time, money, and sweat moving so many books around, Kerrisk's work is priceless: any subject, be it timers, UNIX signals, memory allocation, or the most classical of topics (file I/O), gets its deserved 15-30 page treatment, and you can pick just what you need, in any order.

Weighing in at 1552 pages, this book is second only to Charles Kozierok's mighty TCP/IP Guide in length in the No Starch Press catalog. Anyone who has heard me comment about books knows I usually look askance at anything beyond the 500-page mark, regarding it as something defective in structure that fails the "I have no time to read all that" test. In the case of Kerrisk's work, however, just as in the case of Kozierok's, actually, I am happy to waive my own rule, as these heavyweights in the publisher's catalog are really encyclopedias, and despite my bigger library I would like to keep this single tome within easy reach of my desk to avoid having to fetch the other tomes for quick lookups -- yes, I still have lazy programmer blood in my veins.

There is another perspective to this: while writing, I took a break and while wandering around I found myself in Miguel's office (don't tell him ;-), and there spotted a Bell Labs book lying on his shelf that (incredibly) I have never heard of. After a quick visit to AbeBooks to take care of this embarrassing matter, I am back here writing to use this incident as a valuable example: the classic system programming books, albeit timeless in their own way, show their rust when it comes to newer and more esoteric Linux system calls (mmap and inotify are fair examples) and even entire subsystems in some cases -- and that's another place where this book shines: it is not only very complete, it is really up to date, a combination I cannot think of a credible alternative to in today's available book offerings.

One more specialized but particularly unique property of this book is that it can be quite helpful in navigating what belongs to what standard, be it POSIX, X/Open, SUS, LSB, FHS, and what not. Perhaps it is not entirely complete in this, but it is more helpful than anything else I have seen released since Donald Lewine's ancient POSIX Programmers Guide (O'Reilly). Standards conformance is a painful topic, but one you inevitably stumble into when writing code meant to compile and run not only on Linux but to cross over to the BSDs or farther yet to other *NIX variants. If you have to deal with that kind of divine punishment, this book, together with the Glibc documentation, is a helpful palliative as it will let you know what is not available on other platforms, and sometimes even what alternatives you may have, for example, on the BSDs.

If you are considering the purchase, head over to Amazon and check out the table of contents; you will be impressed. The Linux Programming Encyclopedia would have been a perfectly adequate title for it, in my opinion. In closing, I mentioned that after thinking for a good while I found one thing to be missing in this book: next to the appendixes on tracing, casting the null pointer, parsing command-line options, and building a kernel configuration, a tutorial on writing man pages was sorely and direly missing! Michael, what were you thinking?

Federico Lucifredi is the maintainer of man (1) and a Product Manager for the SUSE Linux Enterprise and openSUSE distributions.

You can purchase The Linux Programming Interface from . Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines , then visit the submission page .

Anatomy of Linux process management by M. Tim Jones

Dec 20, 2008 | developerWorks

Linux is a very dynamic system with constantly changing computing needs. The representation of the computational needs of Linux centers around the common abstraction of the process. Processes can be short-lived (a command executed from the command line) or long-lived (a network service). For this reason, the general management of processes and their scheduling is very important.

From user-space, processes are represented by process identifiers (PIDs). From the user's perspective, a PID is a numeric value that uniquely identifies the process. A PID doesn't change during the life of a process, but PIDs can be reused after a process dies, so it's not always ideal to cache them.

In user-space, you can create processes in any of several ways. You can execute a program (which results in the creation of a new process) or, within a program, you can invoke a fork or exec system call. The fork call results in the creation of a new child process, while an exec call replaces the current process context with the new program. I discuss each of these methods so you can understand how they work.

For this article, I build the description of processes by first showing the kernel representation of processes and how they're managed in the kernel, then review the various means by which processes are created and scheduled on one or more processors, and finally, what happens if they die.
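As a rough illustration of the fork/exec distinction described above (this sketch is not from the article), here is the classic pattern via Python's wrappers around fork(2), the execvp family, and waitpid(2); it is POSIX-only:

```python
import os

# Rough sketch of the two creation paths described above: fork(2) clones
# the caller into a child process; exec then replaces the child's image
# with a new program; the parent collects the exit status with waitpid(2).

def run_in_child(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:                      # child: same program, new PID
        try:
            os.execvp(argv[0], argv)  # replace this image; never returns on success
        except OSError:
            os._exit(127)             # conventional "exec failed" status
    _, status = os.waitpid(pid, 0)    # parent: reap the child
    return os.WEXITSTATUS(status)
```

This fork-then-exec pair is exactly what a shell does each time you run an external command.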

Herd Mentality

That's a very questionable approach. Standardization is the most powerful thing in computing. Actually, Apple is indirectly subsidized by Microsoft, as it uses the same Intel-based architecture.
23 October 2009 | Daring Fireball

Conformity is a powerful instinct. There's safety in numbers. You have to be different to be better, but different is scary.

So of course there's some degree of herd mentality in every industry. But I think it's more pronounced, to a pathological degree, in the PC hardware industry. It was at the root of long-standing punditry holding that Apple should license the Mac OS to other PC makers, or that Apple should dump Mac OS and make Windows PCs. On the surface, those two old canards seem contradictory - one arguing that Apple should be a hardware company, the other arguing that it should be a software company. But at their root they're the same argument: that Apple should stop being different, and either act just like other PC makers (and sell computers running Windows) or else act just like Microsoft (and sell licenses to its OS).

No one argues those two points any more. But it's the same herd mentality that led to the rash of Apple needs to get in the "netbook" game punditry that I claim-checked earlier this week. I could have linked to a dozen others. The argument, though, is the same: everyone else is making netbooks, so Apple should, too. Why? Because everyone else is.

I think there's a simple reason why the herd mentality is worse in the PC industry: Microsoft. In fact, I think it used to be worse. A decade ago the entire computing industry - all facets of it - was dominated by a herd mentality that boiled down to Get behind Microsoft and follow their lead, or else you'll get stomped. That's no longer true in application software. The web, and Google in particular, have put an end to that.

But the one area where Microsoft still reigns supreme is in PC operating systems. PC hardware makers are crippled. They can't stand apart from the herd even if they want to. Their OS choices are: (a) the same version of Windows that every other PC maker includes; or (b) the same open source Linux distributions that every other PC maker could include but which no customers want to buy.1

Apple's ability to produce innovative hardware is inextricably intertwined with its ability to produce innovative software. The iPhone is an even better example than the Mac.

It's not just that Apple is different among computer makers. It's that Apple is the only one that even can be different, because it's the only one that has its own OS. Part of the industry-wide herd mentality is an assumption that no one else can make a computer OS - that anyone can make a computer but only Microsoft can make an OS. It should be embarrassing to companies like Dell and Sony, with deep pockets and strong brand names, that they're stuck selling computers with the same copy of Windows installed as the no-name brands.

And then there's HP, a company with one of the best names and proudest histories in the industry. Apple made news this week for the design and tech specs of its all-new iMacs, which start at $1199. HP made news this week for unveiling a Windows 7 launch bundle at Best Buy that includes a desktop PC and two laptops, all for $1199. That might be great for Microsoft, but how is it good for HP that their brand now stands for bargain basement prices?

Operating systems aren't mere components like RAM or CPUs; they're the single most important part of the computing experience. Other than Apple, there's not a single PC maker that controls the most important aspect of its computers. Imagine how much better the industry would be if there were more than one computer maker trying to move the state of the art forward.

  1. And, perhaps soon, the same version of Google Chrome OS that's available to every other PC maker. Chrome OS might help PC makers break free of Microsoft, but it won't help them break free from each other.

[Jul 22, 2008] UNDELETED by Ralf Spenneberg

Linux Magazine Online

Modern filesystems make forensic file recovery much more difficult. Tools like Foremost and Scalpel identify data structures and carve files from a hard disk image.

IT experts and investigators have many reasons for reconstructing deleted files. Whether an intruder has deleted a log to conceal an attack or a user has destroyed a digital photo collection with an accidental rm -rf, you might someday face the need to recover deleted data. In the past, recovery experts could easily retrieve a lost file because an earlier generation of filesystems simply deleted the directory entry. The meta information that described the physical location of the data on the disk was preserved, and tools like The Coroner's Toolkit (TCT [1]) and The Sleuth Kit (TSK [2]) could uncover the information necessary for restoring the file. Today, many filesystems delete the full set of meta information, leaving only the data blocks. Putting these pieces together correctly is called file carving – forensic experts carve the raw data off the disk and reconstruct the files from it. The more fragmented the filesystem, the harder this task becomes.
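The header/footer carving idea can be sketched in toy form. This is a hypothetical illustration, not Foremost or Scalpel code, and it ignores fragmentation, which real carvers must cope with:

```python
# Toy header/footer carver in the spirit of Foremost/Scalpel: scan a raw
# disk image for JPEG magic numbers and cut out the bytes between each
# start-of-image marker and the next end-of-image marker. Real carvers
# additionally validate file structure and handle fragmented files.

JPEG_HEADER = b"\xff\xd8\xff"   # JPEG start-of-image marker
JPEG_FOOTER = b"\xff\xd9"       # JPEG end-of-image marker

def carve_jpegs(raw):
    """Return candidate JPEG byte strings found in a raw image."""
    carved, pos = [], 0
    while True:
        start = raw.find(JPEG_HEADER, pos)
        if start == -1:
            break
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break
        carved.append(raw[start:end + len(JPEG_FOOTER)])
        pos = end + len(JPEG_FOOTER)
    return carved
```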

[Apr 03, 2007] Speaking UNIX, Part 8 UNIX processes

On UNIX® systems, each system and end-user task is contained within a process. The system creates new processes all the time and processes die when a task finishes or something unexpected happens. Here, learn how to control processes and use a number of commands to peer into your system.

At a recent street fair, I was mesmerized by the one-man band. Yes, I am easily amused, but I was impressed nonetheless. Combining harmonica, banjo, cymbals, and a kick drum -- at mouth, lap, knees, and foot, respectively -- the veritable solo symphony gave a rousing performance of the Led Zeppelin classic "Stairway to Heaven" and a moving interpretation of Beethoven's Fifth Symphony. By comparison, I'm lucky if I can pat my head and rub my tummy in tandem. (Or is it pat my tummy and rub my head?)

Lucky for you, the UNIX® operating system is much more like the one-man band than your clumsy columnist. UNIX is exceptional at juggling many tasks at once, all the while orchestrating access to the system's finite resources (memory, devices, and CPUs). In lay terms, UNIX can readily walk and chew gum at the same time.

This month, let's probe a little deeper than usual to examine how UNIX manages to do so many things simultaneously. While spelunking, let's also glimpse the internals of your shell to see how job-control commands, such as Control-C (terminate) and Control-Z (suspend), are implemented. Headlamps on! To the bat cave!

[Nov 14, 2006] A Comparison of Solaris, Linux, and FreeBSD Kernel

Re:wishfull thinking(Score:5, Informative)

by TheNetAvenger (624455) on Monday October 17, @02:08AM (#13807631)

Win32 subsystem is TOO much tied to NT kernel and closely coupled to achieve the performance it has today.
That is why NT 3.51/3.53 was more robust than NT 4.0, which moved major parts of the UI code to kernel mode.

Please actually read Inside Windows NT 3.51 by Helen Custer and THEN read Inside Windows NT 4.0 to know the difference.

Sorry, hun, read both and even had this discussion with a key kernel developer at Microsoft a few years ago. (1997 in fact, as we were starting to work with Beta 1 of Windows 2000)

NT 4.0 ONLY moved video to a lower ring. It had NOTHING to do with moving the Win32 subsystem INTO NT - that did not happen.

That is why Windows NT Embedded exists, and also why even the WinCE is a version of the NT kernel with NO Win32 ties.

Microsoft can STILL produce NT without any Win32, and just throw a *nix subsystem on it if they wanted to, but yet have the robustness of NT. Win32 is the just the default interface because of the common API and success of Windows applications.

I think you are confusing Ring dropping of the video driver with something completely different.

NT is a client/server kernel... Go look up what that means, please for the love of God.

Win32 is a subsystem, plain and simple. Yes it is a subsystem that has tools to control the NT kernel under it, but that is just because that is the default subsystem interface. You could build these control tools in any subsystem you want to stack on NT. PERIOD.

[Nov 9, 2005] 'Unix beats Windows' - says Microsoft! Paul Murphy

This is just a discussion. You need to read the report first; it contains a lot of interesting information.

Microsoft Research has released part of a report on the "Singularity" kernel they've been working on as part of their planned shift to network computing. The report includes some performance comparisons that show Singularity beating everything else on a 1.8GHz AMD Athlon-based machine.

What's noteworthy about it is that Microsoft compared Singularity to FreeBSD and Linux as well as Windows/XP - and almost every result shows Windows losing to the two Unix variants.

For example, they show the number of CPU cycles needed to "create and start a process" as 1,032,000 for FreeBSD, 719,000 for Linux, and 5,376,000 for Windows/XP. Similarly they provide four graphs comparing raw disk I/O and show the Unix variants beating Windows/XP in three (and a half) of the four cases.

Oddly, however, it's the cases in which they report Windows/XP as beating Unix that are the most interesting. There are three examples of this: one in which they count the CPU cycles needed for a "thread yield" as 911 for FreeBSD, 906 for Linux, and 753 for Windows XP; one in which they count CPU cycles for a "2 thread wait-set ping pong" as 4,707 for FreeBSD, 4,041 for Linux, and 1,658 for Windows/XP; and, one in which they report that "for the sequential read operations, Windows XP performed significantly better than the other systems for block sizes less than 8 kilobytes."

So how did they get these results?

The sequential tests read or wrote 512MB of data from the same portion of the hard disk. The random read and write tests performed 1000 operations on the same sequences of blocks on the disk. The tests were single threaded and performed synchronous raw I/O. Each test was run seven times and the results averaged.


The Unix thread tests ran on user-space scheduled pthreads. Kernel scheduled threads performed significantly worse. The "wait-set ping pong" test measured the cost of switching between two threads in the same process through a synchronization object. The "2 message ping pong" measured the cost of sending a 1-byte message from one process to another and then back to the original process. On Unix, we used sockets, on Windows, a named pipe, and on Singularity, a channel.
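As a rough illustration of the "ping pong" measurement in the quoted methodology (this is my sketch, not the Singularity team's harness), the following bounces a 1-byte message between two processes over a Unix socket pair and reports the average round-trip time:

```python
import os, socket, time

# Rough sketch of the "2 message ping pong" test described above: bounce a
# 1-byte message between a parent and a forked child over a Unix socket
# pair and report the average round-trip time. Illustrative only; it does
# not reproduce the report's methodology or count CPU cycles.

def ping_pong(rounds=1000):
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:                                  # child: echo every byte back
        parent_end.close()
        for _ in range(rounds):
            child_end.sendall(child_end.recv(1))
        os._exit(0)
    child_end.close()
    start = time.perf_counter()
    for _ in range(rounds):
        parent_end.sendall(b"x")                  # 1-byte message out...
        parent_end.recv(1)                        # ...and back
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed / rounds                       # seconds per round trip
```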

So why is this interesting? Because their test methods reflect Windows internals, not Unix kernel design. There are better, faster ways of doing these things in Unix, but these guys - among the best and brightest programmers working at Microsoft - either didn't know or didn't care.

[Jan 3, 2005] Has UNIX Programming Changed in 20 Years? By Marc Rochkind.

If all the basics are the same, what has changed? Well, these things:

More System Calls

The number of system calls has quadrupled, more or less, depending on what you mean by "system call." The first edition of Advanced UNIX Programming focused on only about 70 genuine kernel system calls-for example, open, read, and write; but not library calls like fopen, fread, and fwrite. The second edition includes about 300. (There are about 1,100 standard function calls in all, but many of those are part of the Standard C Library or are obviously not kernel facilities.) Today's UNIX has threads, real-time signals, asynchronous I/O, and new interprocess-communication features (POSIX IPC), none of which existed 20 years ago. This has caused, or been caused by, the evolution of UNIX from an educational and research system to a universal operating system. It shows up in embedded systems (parking meters, digital video recorders); inside Macintoshes; on a few million web servers; and is even becoming a desktop system for the masses. All of these uses were unanticipated in 1984.
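One of the facilities mentioned above, signal handling, can be illustrated briefly. This sketch (mine, not from the book) uses Python's signal module, a wrapper over sigaction(2); note it uses a classic signal, SIGUSR1, rather than the queued real-time signals (SIGRTMIN and up) the text also mentions:

```python
import os, signal

# Brief illustration of one facility mentioned above: install a handler
# for SIGUSR1 via signal.signal() (a wrapper over sigaction(2)), then
# deliver the signal to the current process with kill(2). Real-time
# signals (SIGRTMIN and up) additionally queue instead of coalescing.

received = []

def on_usr1(signum, frame):
    received.append(signum)            # runs asynchronously on delivery

signal.signal(signal.SIGUSR1, on_usr1) # install the handler
os.kill(os.getpid(), signal.SIGUSR1)   # send ourselves the signal
assert received == [signal.SIGUSR1]    # the handler has run by now
```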

More Languages

In 1984, UNIX applications were usually programmed in C, occasionally mixed with shell scripts, Awk, and Fortran. C++ was just emerging; it was implemented as a front end to the C compiler. Today, C is no longer the principal UNIX application language, although it's still important for low-level programming and as a reference language. (All the examples in both books are written in C.) C++ is efficient enough to have replaced C when the application requirements justify the extra effort, but many projects use Java instead, and I've never met a programmer who didn't prefer it over C++. Computers are fast enough so that interpretive scripting languages have become important, too, led by Perl and Python. Then there are the web languages: HTML, JavaScript, and the various XML languages, such as XSLT.

Even if you're working in one of these modern languages, though, you still need to know what's going on "down below," because UNIX still defines-and, to a degree, limits-what the higher-level languages can do. This is a challenge for many students who want to learn UNIX but don't want to learn C. And for their teachers, who tire of debugging memory problems and explaining the distinction between declarations and definitions.


To enable students to learn UNIX without first learning C, I developed a Java-to-UNIX system-call interface that I call Jtux. It allows almost all of the UNIX system calls to be executed from Java, using the same arguments and datatypes as the official C calls. You can find out more about Jtux and download its source code from

More Subsystems

The third area of change is that UNIX is both more visible than ever (sold by Wal-Mart!) and more hidden, underneath subsystems like J2EE and web servers, Apache, Oracle, and desktops such as KDE or GNOME. Many application programmers are programming for these subsystems, rather than for UNIX directly. What's more, the subsystems themselves are usually insulated from UNIX by a thin portability layer that has different implementations for different operating systems. Thus, many UNIX system programmers these days are working on middleware, rather than on the end-user applications that are several layers higher up.

More Portability

The fourth change is the requirement for portability between UNIX systems, including Linux and the BSD-derivatives, one of which is the Macintosh OS X kernel (Darwin). Portability was of some interest in 1984, but today it's essential. No developer wants to be locked into a commercial version of UNIX without the possibility of moving to Linux or BSD, and no Linux developer wants to be locked into only one distribution. Platforms like Java help a lot, but only serious attention to the kernel APIs, along with careful testing, will ensure that the code is really portable. Indeed, you almost never hear a developer say that he or she is writing for XYZ's UNIX. It's much more common to hear "UNIX and Linux," implying that the vendor choice will be made later. (The three biggest proprietary UNIX hardware companies-Sun, HP, and IBM-are all strong supporters of Linux.)

More Complete Standards

The requirement for portability is connected with the fifth area of change, the role of standards. In 1984, a UNIX standards effort was just starting. The IEEE's POSIX group hadn't yet been formed. Its first standard, which emerged in 1988, was a tremendous effort of exceptional quality and rigor, but it was of very little use to real-world developers because it left out too many APIs, such as those for interprocess communication and networking. That minimalist approach to standards changed dramatically when The Open Group was formed from the merger of X/Open and the Open Software Foundation in 1996. Its objective was to include all the APIs that the important applications were using, and to specify them as well as time allowed-which meant less precisely than POSIX did. They even named one of their standards Spec 1170, the number being the total of 926 APIs, 70 headers, and 174 commands. Quantity over quality, maybe, but the result meant that for the first time programmers would find in the standard the APIs they really needed. Today, The Open Group's Single UNIX Specification is the best guide for UNIX programmers who need to write portably.

[Aug 20, 2004] Manipulating Files And Directories In Unix Copyright (c) 1998-2002 by guy keren.

The following tutorial describes various common methods for reading and writing files and directories on a Unix system. Part of the information is common C knowledge, and is repeated here for completeness. Other information is Unix-specific, although DOS programmers will find some of it similar to what they saw in various DOS compilers. If you are a proficient C programmer, and know everything about the standard I/O functions, its buffering operations, and know functions such as fseek() or fread(), you may skip the standard C library I/O functions section. If in doubt, at least skim through this section, to catch up on things you might not be familiar with, and at least look at the standard C library examples.
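As a taste of what such a tutorial covers, the directory-reading pattern (opendir/readdir in C) can be sketched through Python's os.scandir wrapper; this example is illustrative and not taken from the tutorial itself:

```python
import os

# Sketch of the directory-reading pattern the tutorial covers (opendir(3)/
# readdir(3) in C), here through Python's os.scandir wrapper: iterate a
# directory's entries and separate plain files from subdirectories.

def list_dir(path):
    """Return (files, dirs) found directly under path, each sorted."""
    files, dirs = [], []
    with os.scandir(path) as entries:      # readdir(3)-style iteration
        for entry in entries:
            (dirs if entry.is_dir() else files).append(entry.name)
    return sorted(files), sorted(dirs)
```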

  • This document is copyright (c) 1998-2002 by guy keren.

    The material in this document is provided AS IS, without any expressed or implied warranty, or claim of fitness for a particular purpose. Neither the author nor any contributors shall be liable for any damages incurred directly or indirectly by using the material contained in this document.

    permission to copy this document (electronically or on paper, for personal or organization internal use) or publish it on-line is hereby granted, provided that the document is copied as-is, this copyright notice is preserved, and a link to the original document is written in the document's body, or in the page linking to the copy of this document.

    Permission to make translations of this document is also granted, under these terms - assuming the translation preserves the meaning of the text, the copyright notice is preserved as-is, and a link to the original document is written in the document's body, or in the page linking to the copy of this document.

    For any questions about the document and its license, please contact the author.

  • [July 28, 2004] FreeBSD System Programming by Nathan Boeger and Mana Tominaga

    Copyright (C) 2001,2002,2003,2004 Nathan Boeger and Mana Tominaga


    Server Operating Systems Technical Comparison

    This web site compares and contrasts operating systems. It originally started out on a small server in the engineering department of Ohio State University to answer a single question: "On technical considerations only, how does Rhapsody (also known as Mac OS X Server) stack up as a server operating system (especially in comparison to Windows NT)?" The web site now compares and contrasts server operating systems and will in the near future expand to compare other kinds of operating systems.

    For non-technical persons: A general overview of operating systems for non-technical people is located at: kinds of operating systems. Brief summaries of operating systems are located at: summaries of operating systems. There is an entire section of pages on individual operating systems, all formatted in the same order for easy comparison. The holistic area looks at operating systems from a holistic point of view and particular subjects in that presentation may be useful for comparison. Some of the charts and tables may also be useful for specific comparisons.

    For technical persons: The system components area goes into detail about the inner workings of an operating system and the individual operating systems pages provide some technical information.

    This site is organized as an unbalanced tree structure, with cyclic graph hyperlinks and a sequential traversal path through the tree.

    [Oct 15, 2001] Usenix/Login Teaching Operating Systems with Source Code UNIX

    A long time ago, my undergraduate operating-systems class required that we cross-compile a small, standalone system and upload it to a PDP-11 minicomputer. We could do some limited debugging at the console if the program didn't crash. The development environment was poor; it was painful and time-consuming to get things working, but the experience was an overall confidence builder. I feel there is a huge advantage for a student to control the operations of a computer directly.

Another approach for teaching operating systems is to provide a controlled runtime and development environment using a simulator. Several universities teach operating-system concepts using the Nachos simulator. The advantage is that the instructor can easily control much of the environment for assignments, and the students don't waste time with crashes, kernel builds, and rebooting. These kinds of systems can be very simplistic and lack realism.

    As a private pilot, I know that aviation simulation goes only so far. You need to spend some time in the sky, in the air-traffic-control system, in the weather, and with the attendant dangers, to absorb and appreciate the training fully. A two-hour actual flight lesson is often fatiguing and draining; but the same amount of time in a simulator is more like a classroom experience. Similarly, students sense the difference between working in a safe simulator environment and working on a real kernel. Lessons with the latter seem more dramatic.

    [Jun 15, 2001] "Operating Systems Handbook"

    is now available for free as a collection of Acrobat files.

    [Apr 04, 2001] LinuxPlanet: New HOWTO: The Linux Kernel HOWTO.

    This is a detailed guide to kernel configuration, compilation, upgrades, and troubleshooting for ix86-based systems.

    [Feb 23, 2001] Computer Operating Systems

    Nice tutorial

    Understanding the Linux Kernel Chapter 10 Process Scheduling

    Like any time-sharing system, Linux achieves the magical effect of an apparent simultaneous execution of multiple processes by switching from one process to another in a very short time frame. Process switch itself was discussed in Chapter 3, Processes; this chapter deals with scheduling, which is concerned with when to switch and which process to choose.

    The chapter consists of three parts. The section "Scheduling Policy" introduces the choices made by Linux to schedule processes in the abstract. The section "The Scheduling Algorithm" discusses the data structures used to implement scheduling and the corresponding algorithm. Finally, the section "System Calls Related to Scheduling" describes the system calls that affect process scheduling.
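The scheduling-related system calls described in that last section can be exercised from an ordinary userland program. A minimal sketch (the helper names are ours, not from the chapter) that queries the caller's policy, its nice value, and the static priority range of the real-time SCHED_FIFO policy:

```c
#include <sched.h>
#include <sys/resource.h>

/* Scheduling policy of the calling process (pid 0 = self); ordinary
   time-sharing processes run under SCHED_OTHER. */
int current_policy(void)
{
    return sched_getscheduler(0);
}

/* Nice value of the calling process: -20 (strongest) .. 19 (weakest). */
int current_nice(void)
{
    return getpriority(PRIO_PROCESS, 0);
}

/* Width of the static priority range of the real-time SCHED_FIFO
   policy (on Linux the range is 1..99). */
int fifo_priority_span(void)
{
    return sched_get_priority_max(SCHED_FIFO)
         - sched_get_priority_min(SCHED_FIFO);
}
```

The corresponding setter, sched_setscheduler(), takes the same policy constants but requires appropriate privileges for the real-time policies.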

    [Nov 14, 2000] File Systems

    Scalable Linux Scheduling on a Symmetric Multi-Processor

    Class Notes for Operating Systems

    [Sep 30, 2000] Inside VMware

VMware enables you to run a Virtual Machine, which is VMware's version of an emulated state of Windows, Linux, or FreeBSD. You heard me right: on VMware, not only can you run Windows, but also Linux and FreeBSD. That means if you need to test out that new version of Linux, but you don't want to format your drive just to test it, VMware can just create a virtual drive and you're on your way to seeing what the latest version of your favorite distribution has to offer.

To date, VMware has been pretty much a development product, but thanks to demand for a stable, versatile operating environment, VMware has upped the ante and created their best version of VMware yet: 2.0.2.

    If you've used a package like Connectix VirtualPC for the Macintosh PowerPC, you'll notice many likenesses it shares with VMware. The website may really hype VMware up and make it sound like there is no loss of performance, but the simple fact is that you do lose clock speed, RAM and hard disk speed, just like you would with any piece of emulation software.

In fact, you can tell both VMware and VirtualPC are designed along the same lines. The configuration is much the same, except one is obviously more PC-fied, while the other is more Mac-centric.

What it comes down to, though, is compatibility. VMware does a much better job of emulating x86 hardware, probably because it is operating on top of x86 hardware. That's a logical assumption, right? Enough with guesswork; let's take a look at what's really going on.

    Here we see how it really works. A typical PC works like we see on the left. I think the diagram oversimplifies things in a way, but it will do the job.

Essentially, VMware interfaces directly with most of your system hardware, which is one way it achieves pretty good performance even on low-end machines. Don't get me wrong, you still won't get the full speed of your PC out of VMware; that's because things like hard disk access (where it looks to be hurting the most) are still done through the operating system.

This is how it all happens. This diagram shows the devices VMware still needs to access through the host OS: disk, memory, and CPU.

    Once again, VMware has a few tricks up its sleeve. One great thing about VMware is that you can utilize your local network to get access to your Windows or Linux filesystem. In fact, you can even use a regular network along with your local network at the same time, so you don't need to sacrifice anything with the networking setup.

    [Aug 20, 2000] Using sysctl the Command and the Subroutine

Coming from a hybrid Sys V and BSD system, the first time I began maintaining a BSD system I was immediately plunged into making system-level changes and finding out very specific information about the system. There is a tool for just such a task: sysctl. Along with that, however, I had come across an unusual program that needed access to such information as well. The program needed the information "hard coded", something I did not like. Luckily, the sysctl calls are easily accessible (and extraordinarily well documented) via a simple system subroutine. This article will cover two aspects of sysctl:

    1. Some examples using the sysctl command.
    2. Examples with sample code on using the sysctl subroutines.

    Note: Examples were drawn from all three free BSDs (I have run all three of them at one time or another): NetBSD, FreeBSD and OpenBSD.

    The sysctl Command (Facility)

    It might be more correct to call sysctl a facility or utility rather than just a command. The official short definition is:

    sysctl -- get or set kernel state

In reality (typical of BSD design - which is a good thing), sysctl has been extended to do a great many things and to show all sorts of great information. I say this because, judging by the short definition, one would think all you can do with it is examine kernel parameters and perhaps modify others.

Really, the well-documented man page (man 8 sysctl) has all the information you need. Let us take a look at some sample usages:

    First, how about the OS type:

       $ sysctl kern.ostype
       kern.ostype = NetBSD

    Here is a sample looking at the clockrate:

   $ sysctl kern.clockrate
       kern.clockrate = tick = 10000, tickadj = 40, hz = 100, profhz = 100, stathz = 100

A very important (and often modified) parameter on systems: ye olde IP forwarding (where 1 is on and 0 is off):

       $ sysctl net.inet.ip.forwarding
       net.inet.ip.forwarding = 0

Now some real quick hardware-gathering examples that show us the following information respectively:

    1. machine type
    2. specific model information
    3. number of processors
   $ sysctl hw.machine
   hw.machine = sparc
   $ sysctl hw.model
       hw.model = SUNW,SPARCstation-5, MB86904 @ 110 MHz, on-chip FPU
       $ sysctl hw.ncpu
       hw.ncpu = 1

    Another quick note: all of the examples were done in userland.

    We have seen the ease of use of the sysctl command, but the subroutine offers great access at a low level to even more information.

    Using the sysctl Subroutine(s)

Note: The next section requires a basic understanding of the C programming language.

The sysctl function allows programmatic access to a wide array of information about the system itself, the kernel, and the network; in this respect it is very similar in nature to its command counterpart. It should also be quite obvious that this is in fact the function the sysctl command itself primarily uses. Why is this important to know or understand? The name of the game is understanding, and seeing how to access the sysctl function directly is one of the many steps toward systems programming enlightenment: in short, it shows what it is you are doing when you use the sysctl command. Additionally, using the function can help you develop or extend utilities. The reason sysctl is so well suited to this is how closely it is tied to the core operating system. Again, I must reiterate the BSD philosophy of extension versus reinvention: it is better to extend a pre-existing piece of software than to encourage the development of a completely new one. Nevertheless, the sysctl function could be useful for building new utilities (and is no doubt employed in many existing programs).

Well, let us get to it, shall we? For the sake of simplicity, the code examples will follow some of the examples shown in the command section of this article. The best way to illustrate a usage is a case study, so let us create one. For posterity, we will acknowledge the great coders who came before us by using the example that comes from the BSD Programmer's Manual, plus an additional one that does not:

    We have a program that, for some odd reason, needs to know the following information:

• the number of processes allowed on the system (the one from the manual)
• the number of CPUs (perhaps for 3rd-party licensing software :) )

    Getting the Number of Processes

    One thing I believe in is paying due respect, and as such, we will peruse one of the examples in the BSD documentation, how to snag the number of processes allowed on the system:

    #include <sys/types.h>
    #include <sys/sysctl.h>

    int get_processes_max(void) {
    	int mib[2], maxproc;
    	size_t len;

    	mib[0] = CTL_KERN;
    	mib[1] = KERN_MAXPROC;
    	len = sizeof(maxproc);
    	if (sysctl(mib, 2, &maxproc, &len, NULL, 0) == -1)
    		return -1;	/* on failure, errno says why */
    	return maxproc;
    }
    It is important, at this point, to understand what it is we are accessing and how it is done. To think in C terms, we are looking at this (again, noted in the man page):

int sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp, size_t newlen);

If you look carefully across the function prototype for sysctl, you will see how each of the arguments we specified satisfies the prototype.

    Again, for the next value, our function really would not have to look much different:

    . . .
    . . .
    int get_ncpu(void) {
    	int mib[2], num_cpu;
    	size_t len;

    	mib[0] = CTL_HW;
    	mib[1] = HW_NCPU;
    	len = sizeof(num_cpu);
    	if (sysctl(mib, 2, &num_cpu, &len, NULL, 0) == -1)
    		return -1;
    	return num_cpu;
    }

Basically, what we are looking at is access to data structures, nothing more really. The great thing about it is the ease of access: it is far simpler than writing endless routines for direct file-level access. Instead, using this function, we can get a great deal of information about the system with a minimal and safe level of exertion.

    This Is Just The Beginning

Doubtless, if this article was something new to you, then the door that lies before you is a great one indeed. BSD presents an unparalleled opportunity to delve into the inner workings of BSD and UNIX itself. Continue on and look to programming guides and documentation to lead the way; you will not be disappointed. As for my material, I will also open the door, and we shall see in the long run what lies on the other side.

    What About sysctl for Linux

To the best of my knowledge, the system parameters associated with sysctl can be viewed and modified under /proc/sys (for the most part) on Linux systems. When programmatic access is required, it is recommended to use /proc as well.
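A minimal sketch of that approach (the helper name read_ostype() is ours, for illustration): on Linux, reading the equivalent of sysctl kern.ostype is plain file I/O on /proc/sys/kernel/ostype.

```c
#include <stdio.h>
#include <string.h>

/* On Linux, kernel parameters live under /proc/sys, so reading one is
   ordinary file I/O: this fetches the OS type string ("Linux"). */
int read_ostype(char *buf, size_t buflen)
{
    FILE *fp = fopen("/proc/sys/kernel/ostype", "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)buflen, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

Writing a parameter works the same way: open the corresponding /proc/sys file for writing (with sufficient privileges) and write the new value.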

    [Aug 7, 2000] Daemon News Design Elements of the FreeBSD VM System

    [Aug 6, 2000] IBM developerWorks: POSIX threads explained - A simple and nimble tool for memory sharing

    "POSIX (Portable Operating System Interface) threads are a great way to increase the responsiveness and performance of your code."

    Operating System Structures

    Operating Systems II -- nice slides

    [Jan 20, 2000] Daemon News Design Elements of the FreeBSD VM System By Matthew Dillon

The title is really just a fancy way of saying that I am going to attempt to describe the whole VM enchilada, hopefully in a way that everyone can follow. For the last year I have concentrated on a number of major kernel subsystems within FreeBSD, with the VM and Swap subsystems being the most interesting and NFS being 'a necessary chore'. I rewrote only small portions of the code. In the VM arena the only major rewrite I have done is to the swap subsystem. Most of my work was cleanup and maintenance, with only moderate code rewriting and no major algorithmic adjustments within the VM subsystem. The bulk of the VM subsystem's theoretical base remains unchanged and a lot of the credit for the modernization effort in the last few years belongs to John Dyson and David Greenman. Not being a historian like Kirk, I will not attempt to tag all the various features with people's names, since I will invariably get it wrong.

    Before moving along to the actual design let's spend a little time on the necessity of maintaining and modernizing any long-living codebase. In the programming world, algorithms tend to be more important than code and it is precisely due to BSD's academic roots that a great deal of attention was paid to algorithm design from the beginning. More attention paid to the design generally leads to a clean and flexible codebase that can be fairly easily modified, extended, or replaced over time. While BSD is considered an 'old' operating system by some people, those of us who work on it tend to view it more as a 'mature' codebase which has various components modified, extended, or replaced with modern code. It has evolved, and FreeBSD is at the bleeding edge no matter how old some of the code might be. This is an important distinction to make and one that is unfortunately lost to many people. The biggest error a programmer can make is to not learn from history, and this is precisely the error that many other modern operating systems have made. NT is the best example of this, and the consequences have been dire. Linux also makes this mistake to some degree -- enough that we BSD folk can make small jokes about it every once in a while, anyway (grin). Linux's problem is simply one of a lack of experience and history to compare ideas against, a problem that is easily and rapidly being addressed by the Linux community in the same way it has been addressed in the BSD community -- by continuous code development. The NT folk, on the other hand, repeatedly make the same mistakes solved by UNIX decades ago and then spend years fixing them. Over and over again. They have a severe case of 'not designed here' and 'we are always right because our marketing department says so'. I have little tolerance for anyone who cannot learn from history.

Much of the apparent complexity of the FreeBSD design, especially in the VM/Swap subsystem, is a direct result of having to solve serious performance issues that occur under various conditions. These issues are not due to bad algorithmic design but instead arise from environmental factors. In any direct comparison between platforms, these issues become most apparent when system resources begin to get stressed. As I describe FreeBSD's VM/Swap subsystem the reader should always keep two points in mind. First, the most important aspect of performance design is what is known as "Optimizing the Critical Path". It is often the case that performance optimizations add a little bloat to the code in order to make the critical path perform better. Second, a solid, generalized design outperforms a heavily-optimized design over the long run. While a generalized design may end up being slower than a heavily-optimized design when they are first implemented, the generalized design tends to be easier to adapt to changing conditions and the heavily-optimized design winds up having to be thrown away. Any codebase that will survive and be maintainable for years must therefore be designed properly from the beginning even if it costs some performance. Twenty years ago people were still arguing that programming in assembly was better than programming in a high-level language because it produced code that was ten times as fast. Today, the fallibility of that argument is obvious -- as are the parallels to algorithmic design and code generalization.

    [Jan 3, 2000] BYTE Column - Process Scheduling In Linux, Moshe Bar

    In This Article
    Process Scheduling In Linux

    Tangled In The Threads

    Two Paths

    Kernel Pre-emption And User Pre-emption

Last month, we started a new series on Linux kernel internals. In that first part, we looked at how Linux manages processes and why in many ways Linux is better at creating and maintaining processes than many commercial Unixes.

    This series on Linux internals is by the way the fruit of a tight collaboration with some of the most experienced kernel hackers in the Linux project. Without the contribution of people like Andrea Arcangeli in Italy (VM contributor and SuSE employee), Ingo Molnar (scheduler contributor) and many others, this series wouldn't be possible. Many thanks to all of them, but especially to Andrea Arcangeli who has shown a lot of patience in answering many of my questions.

    [Dec 12, 1999] A nice site targeting OS design:

    "Kernel Development" page can be useful too. Archives are here.

    [Nov. 30, 1999] BYTE Column - The Linux Process Model, Moshe Bar

    [Nov. 20, 1999] Build a useful five-headed penguin

    VMware's system emulator lets you run up to five OSs on one box simultaneously

    Rawn Shah checks out VMware's latest system emulator, version 1.1. It promises to let you run a Linux host OS, then switch -- without rebooting -- among up to four other guest OSs that operate inside virtual hardware created by VMware. (2,100 words)

    [Aug 11, 1999] The Programmer's File Format Collection

[July 25, 1999] Welcome to VMware Inc. - Virtual Platform Technology VMware software initially comes in two flavors, depending on the user's host operating system: VMware for Linux, and VMware for Windows NT. VMware for Linux (time-limited demo) -- run DOS, FreeBSD, Windows 3.x, 9x, and NT 4.0 applications easily under Linux. VMware is included in the SuSE distribution:

One user, with a Celeron 450 MHz and 256 MB RAM who gave the virtual machine 64 MB, reported a quite positive experience. He used the SVGA driver for NT from VMware, and after that it worked with the screen noticeably faster and supported modes above 800x600. Visio works satisfactorily (redrawing of the screen is a little slow in non-full-screen mode), but generally it is OK. The fact that it is now possible to work on a single computer instead of two outweighs the small inconveniences described.

    [May 27, 1999] Linux Memory Management subsystem; main page

    [March 2,1999] Linux Kernel Mailing List, Archive by Week by thread

    [Feb.12,1999] -- the ultimate OS

    Uniform Driver Interface (UDI)

    Universal Serial Bus (USB)

Kernel Traffic -- information on new kernel developments

    See Also

    Recommended Links

    Google matched content

    Softpanorama Recommended

    Top articles


    General info:

    Selected Topics


    University Courses

    Algorithms and data structures

    Educational OSes







    C programming






    Unix security


    Selected Topics

    Introduction History Architecture




    Interprocess Communication

    Process synchronization


    Memory management

    Linkers and Loaders

    Virtual memory


    Introduction to networking


    Linux Modules

    Tutorials and E-books

    See also University Courses

    Linux Documentation Project Guides(see also Linux Guides):

The Linux Kernel Hackers' Guide, a freely redistributable collection of documents; version 0.7 by Michael K. Johnson is available in HTML and HTML (tarred and gzipped).

The Linux Kernel, a freely redistributable book by David A. Rusling. Version 0.8-2 is available in HTML, HTML (tarred and gzipped), DVI, LaTeX source, PDF, and PostScript.

The Linux Programmer's Guide, version 0.4 by B. Scott Burkett, Sven Goldt, John D. Harper, Sven van der Meer and Matt Welsh, is available in HTML, HTML (tarred and gzipped), LaTeX source, PDF and PostScript.

    Linux Kernel Glossary

Operating Systems -- introduction to OS by Sharon Heimansohn; see also other modules from the Department of Computer and Information Science of IUPUI (Indiana University / Purdue University Indianapolis):


The Mythical Man-Month: Essays on Software Engineering by Frederick Brooks Jr., Anniversary Edition. Contains a fascinating account of the creation of OS/360 -- a real classic.

    A Quarter Century of Unix
    Peter H. Salus / Paperback / Published 1994
    Casting the Net : From Arpanet to Internet and Beyond (Unix and Open Systems Series)
    Peter H. Salus / Paperback / Published 1995
    Hard Drive : Bill Gates and the Making of the Microsoft Empire
    James Wallace, et al / Paperback /
    Overdrive : Bill Gates and the Race to Control Cyberspace
James Wallace / Paperback / Published 1998 -- not as good as the previous one but still interesting




    Interprocess Communication

    Process synchronization

    Other synchronization primitives

    Ada Tasking


    Atomic Transactions


    Bankers algorithm

    Dining Philosophers

    Lecture Notes

    Distributed case and databases


    Deadlock... The Deadly Embrace (Millersville University) Dr. Roger W. Webster (contains the picture from SG)

    Memory management

    Linkers and Loaders

    Virtual memory

    Paging vs. Segmentation, Multilevel Page Tables, Paging Along with Segmentation

    Capability Addressing, Protection Capabilities, Single Virtual Address Space, & Protection Rings

    Distributed Shared Memory, & The Mach VM

    Memory Consistency, & Consistency Models Requiring & Not Requiring Synchronization Operations

    NUMA vs NORMA, Replication Of Memory, Achieving Sequential Consistency, & Synchronization in DSM Systems

    Management of Available Storage, Swapping and Paging, & Inverted Page Tables

    Performance of Demand Paging, Replacement Strategies, Stack Algorithms and Priority Lists, Approximations to LRU Replacement, Page vs. Segment Replacement, & Page Replacement in DSM Systems

    Locality of Reference, User-Level Memory Managers,The Working Set Model, Load Control in UNIX, & Performance of Paging Algorithms


    SunWorld Online - January - CacheFS and Solstice AutoClient

    Linux Modules



    Blox Data AB

    Educational OSes


    Floppy-based version of linux



    JOS - Java VM-based OS

    Real time OSes

    Advanced systems programming and realtime systems Realtime operating systems and device programming

    Unix vs NT

    Windows NT Architecture, Part 1

    Sample Chapter from Inside Windows NT®, Second Edition by David A. Solomon, based on the original edition by Helen Custer.

    Inside the Windows 2000 Kernel

Windows NT File System Internals: A Developer's Guide, Chapter 4: The NT I/O Manager


    Random Findings


The Last but not Least: Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand ~ Archibald Putt, Ph.D.

Copyright © 1996-2018 by Dr. Nikolai Bezroukov. was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) in the author's free time and without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

    FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

You can use PayPal to make a contribution, supporting development of this site and speeding up access. In case is down you can use the at


The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

The site uses AdSense so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable Javascript for this site. This site is perfectly usable without Javascript.

    Last modified: November 02, 2019