EECS 441: Resource Virtualization, Winter 2015

Instructor: Peter Dinda (Office Hours: Thursdays, 2-5pm, or by appointment, Tech L463)
Undergrad Assistant: Chunxiao Diao (Office Hours: Tuesdays, 2-5pm, or by appointment, Wilkinson Lab)
Class Time: Winter 2015, Tuesdays and Thursdays, 9:30-11:00am
Class Location: Tech LG62 (may move to a conference room)
Course number: EECS 441

The bulk of the time in this class is spent examining a virtual machine monitor (VMM) in depth, at the source code level. The course explains the hardware/software interface of a modern x86 computer in detail. A VMM is an operating system that is implemented directly on top of the hardware interface, and itself presents a hardware interface to higher-level software.

Both Computer Science students and Computer Engineering students can benefit from EECS 441, as it focuses on the hardware/software interface. Students will also acquire valuable kernel development skills by working on projects related to virtualization. In short, students will learn how a real modern machine and operating system work, and how to extend them.

We will examine the implementation of the Palacios VMM from my V3VEE Research Project. In particular, we will take a look at the "bleeding edge" of the devel branch. Furthermore, the class will share a repository so that every student or group can contribute as a core developer and see what's going on. Palacios is an embeddable VMM, and we will consider its embedding into Linux as a kernel module. We may also use two other kernels: the Kitten lightweight kernel from Sandia National Labs and the Nautilus kernel under development here at Northwestern.

Within the undergraduate CS major, EECS 441 counts for breadth or depth credit in the systems area. Undergraduates are welcome.

For graduate students, EECS 441 counts as a graduate course.

The testbed hardware used in this course was generously donated by Shea Lutton and the Murphy Society. We gratefully acknowledge their contributions to the success of this course and to experimental computer systems education at Northwestern.

Prerequisites

Coming into this course, you must have a basic familiarity with systems, specifically 32- or 64-bit x86 systems, to the level of EECS 213 or EECS 205, and be familiar with the C programming language and the Unix development environment. Familiarity with basic operating systems (EECS 343 or similar) is a prerequisite for the course. Familiarity with computer architecture (EECS 361 or similar) is useful. Because there are many ways to achieve this basic background, the instructor has interviewed students before the class. If you're in the course, it's because the instructor thinks you have the background to do well. If you'd like to join the class, talk to the instructor about a permission number.

Books

For the most part, we will be examining and discussing real code on a real machine in the class. There is no required textbook. This makes it essential that you attend class and use office hours. Also, this is a learn-by-doing class, so it is essential that you get your feet wet quickly.

The following reference book is a good explanation of virtual machines in general:

  • J. Smith, R. Nair, Virtual Machines: Versatile Platforms for Systems and Processes, Morgan Kaufmann, 2005.

The Linux kernel is a powerful, practical operating systems codebase that is free for anyone to download and use. In addition to the code itself, you will find the following book to be very helpful in that it explains the structure and theory of operation of Linux in a high-quality way.

  • D. Bovet, M. Cesati, Understanding the Linux Kernel, Third Edition, O'Reilly, 2005.

Whatever Linux books you read, for the purposes of this course, make sure that they are about version 2.6 of the kernel, which is substantially different from prior versions. Books about versions 3.0 and later are also appropriate, although we will use 2.6 here.

Palacios is described in considerable detail in its technical report:

  • J. Lange, P. Dinda, K. Hale, L. Xia, An Introduction to the Palacios Virtual Machine Monitor---Version 1.3, Technical Report NWU-EECS-11-10, Department of Electrical Engineering and Computer Science, Northwestern University, November 2011.

Xen is another widely used open-source VMM. The following book is an excellent introduction to it for kernel developers:

  • D. Chisnall, The Definitive Guide to the Xen Hypervisor, Prentice Hall, 2007.
You may find it helpful to have general introductory books on systems, operating systems, and architecture available for reference. I would recommend these:

  • R. Bryant, D. O'Hallaron, Computer Systems: A Programmer's Perspective (2nd Edition), Addison Wesley, 2010. (The first edition is fine too for this course.)
    This is the book used for EECS 213.
  • A. Silberschatz, P. Galvin, G. Gagne, Operating System Concepts (8th Edition), Wiley, 2008. (Earlier editions are also fine for this course.)
    This is the book used for EECS 343.
  • J. Hennessy, D. Patterson, Computer Architecture: A Quantitative Approach (3rd Edition), Morgan Kaufmann, 2002. (Any edition is fine for this course.)

Unfortunately, I am not aware of a good single book covering the modern x86 architecture from an OS perspective. We will use the Intel and AMD architecture manuals as needed (links given below).

I may also provide links to internal materials on Palacios and Kitten during the class. Note that you can now examine the codebase of Palacios online. The codebase of Linux can be examined online too.

Grading

The components of the class will break down as follows:

  • In-class discussion: 30%
  • Project: 50% (including weekly progress reports)
  • Project paper and presentation: 20%

Note that a very substantial portion of the grade is project-related. It is important that you dive into the code soon! Because this is a small class, I will have plenty of time to work one-on-one with project groups and individual students.

Communication

For discussions and announcements this quarter, we will use Piazza: EECS 441 Piazza Site. We will enroll you. Directing your questions to Piazza will likely produce the fastest response, and everyone else in the class will also benefit. The historic discussions from this class (about six years' worth) are also available in a Google group, EECS 441 Resource Virtualization. You can request access by subscribing to that group.

There is nothing on Blackboard or Canvas.

Development Environment

A core part of this course is active OS kernel development on physical hardware.

The following is a short summary of the development environment that will be available to every student.

  • A shared git repository for the class to which all students will have full push privileges. This includes a gitweb interface so that students can easily see what's been freshly pushed in a web browser. We will also push commits coming from the main Palacios devel branch.
  • A dedicated set of teaching lab machines on which all students will have full root access. These machines run Fedora and Red Hat and are set up to support the Palacios VMM.
  • Help in setting up development on other machines. The codebase will also run under recent commercial virtualization tools such as VMware.

You will be given access credentials to the testbed in class. Once you can log into the machines, you can find information about how to check out, configure, build, and test Palacios on them in the file /441/GettingStarted.

What You Will Learn

This is a course in operating systems (OS) design and implementation where the example OS is a VMM. OSes operate very differently from application programs, and the development process is also markedly different. In part, this is because OSes interact directly with the hardware interface provided by the processor and system architecture. A VMM is a particularly interesting kind of OS to learn about because it also has to implement what looks like a hardware interface. By studying a VMM, you will be exposed to both sides of the hardware/software interface. This class will do this by considering a real VMM running on top of real hardware. Some specific examples of what you will learn include:

  • The hardware interface of Intel and AMD x86 and x86_64 ("x64") processors from an OS perspective. These processors underlie almost all modern PCs, Macs, laptops, workstations, and servers.
    Modes and privilege levels, exceptions and interrupts, address translation, control registers, IPI, etc.
  • The basic PC systems architecture. This architecture underlies almost all modern PCs, Macs, workstations, and servers.
    PIC/APIC/IOAPIC, PIT, PCI, NVRAM, BIOS, etc.
  • Multicore x86 architecture (SMP model, the Intel Multiprocessor Specification)
  • Modern kernel development.
    Version control (git), compilers, assemblers, binutils, image compilation and linking, emulator (qemu), serial debugging, PXE, kgdb, etc.
  • Interrupts and I/O models.
  • Virtual memory.
  • Devices and device drivers.
  • The boot process.
  • Synchronization in an OS kernel.
  • Implementation of basic OS abstractions, such as kernel threads.
  • Hardware virtualization interface (focusing on AMD SVM with some discussion of Intel VT's differences).
  • Whole system virtualization versus paravirtualization.
  • Virtualizing machine modes.
  • Virtualizing virtual memory with shadow and nested paging.
  • Virtual devices - programs that emulate hardware devices.
  • Multicore issues in virtualization.
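
As a rough illustration of the "virtual devices" item above, the following is a minimal sketch in C of a program that emulates a hardware device. It assumes a toy serial-like device that the guest reaches through two I/O ports, and it assumes the VMM calls one handler when the guest executes an OUT instruction and another when it executes an IN instruction. Every name here (toy_serial, toy_serial_out, toy_serial_in, and the TOY_ constants) is hypothetical and chosen for illustration only; this is not the Palacios device interface. The port numbers and status bit are loosely modeled on the legacy PC COM1 serial port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy device for illustration; NOT the Palacios device API. */
#define TOY_PORT_DATA    0x3f8u  /* guest writes one byte here to "transmit" */
#define TOY_PORT_STATUS  0x3fdu  /* guest reads here to poll for readiness   */
#define TOY_STATUS_READY 0x20u   /* status bit: device can accept a byte     */
#define TOY_BUF_SIZE     64u

/* All device state lives in software; there is no real hardware behind it. */
struct toy_serial {
    uint8_t buf[TOY_BUF_SIZE];   /* bytes the guest has written so far */
    size_t  len;                 /* number of valid bytes in buf       */
};

/*
 * Assumed to be called by the VMM when the guest executes OUT on a port.
 * Returns 0 if this device handled the access, -1 if the port is not ours.
 */
int toy_serial_out(struct toy_serial *dev, uint16_t port, uint8_t val)
{
    assert(dev != NULL);
    if (port != TOY_PORT_DATA) {
        return -1;
    }
    /* Record the byte; silently drop it once the buffer is full. */
    if (dev->len < TOY_BUF_SIZE) {
        dev->buf[dev->len++] = val;
    }
    return 0;
}

/*
 * Assumed to be called by the VMM when the guest executes IN on a port.
 * Stores the value the guest should see in *val. Returns 0 or -1 as above.
 */
int toy_serial_in(const struct toy_serial *dev, uint16_t port, uint8_t *val)
{
    assert(dev != NULL && val != NULL);
    if (port != TOY_PORT_STATUS) {
        return -1;
    }
    /* The emulated device is always ready to accept another byte. */
    *val = TOY_STATUS_READY;
    return 0;
}
```

A caller standing in for the VMM would zero-initialize a struct toy_serial, pass each intercepted guest port access to the matching handler, and treat a return value of -1 as "not this device". The guest never sees the difference between this program and real hardware as long as the values returned on reads match what the hardware would produce.
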

Project

Over the course of the quarter, you will apply what you're learning in a project, and then document your project in a high-quality paper and open presentation. Project topics will be chosen in consultation with me, and will primarily focus on the development of extensions or components for Palacios. Such projects will give you the opportunity to enhance your kernel development skills, and create something that can ship (and certainly be part of a portfolio). Exceptional projects can also lead to publications.

Projects can be done in groups. We will discuss potential projects in detail a week or two into the course. I will expect weekly project reports. All projects will be presented at a public colloquium held on the day and at the time of the final exam.

Resources

  • We will be studying the Palacios Virtual Machine Monitor in depth. You will have access to an internal git repository on a machine set up to support development. Again, there is more information in /441/GettingStarted on the testbed machines.
  • You should have Tlab and Wilkinson Lab Linux accounts. The Wilkinson Lab is quite a nice place to work as a group.
  • If you haven't used Linux or Unix remotely before, you will want to read Using Unix Remotely Without the Excruciating Pain.
  • You will want to have the Intel Architecture Manuals and the AMD Architecture Manuals handy.
  • We use the Palacios/Linux embedding. You can find all the Linux code online. Palacios compiles into a kernel module that can be inserted into a running kernel provided it has certain features enabled. The testbed has both Fedora and Red Hat setups. We also provide some guest images. The Palacios kernel module should work with other appropriately configured host environments.
  • Palacios can also be embedded into Sandia National Labs' Kitten Lightweight Kernel, which has its own useful documentation and description. It is much, much smaller than Linux.

Peter Dinda
Last modified: Thu Jan 8 11:27:17 CST 2015