Computerworld
Fair user scheduling for Linux
A new set of Linux scheduler features would allocate CPU time fairly among the users on the system.
Jonathan Corbet (LinuxWorld)  25 October, 2007 10:24

The Completely Fair Scheduler (CFS) was merged for the 2.6.23 kernel. One CFS feature which did not get in, though, was the group scheduling facility. Group scheduling makes the CFS fairness algorithm operate in a hierarchical fashion: processes are divided into groups, and, within each group, processes are scheduled fairly against one another. At the higher level, each group as a whole is given a fair share of the processor. The grouping of processes is done in user space in a highly flexible manner; the control groups (formerly "process containers") mechanism allows a management daemon to classify processes according to almost any policy.

One of the reasons why group scheduling did not get into 2.6.23 is that the control groups patch was not ready for merging. The author had expected control groups to go in for 2.6.24, but, as of this writing, it is looking like that patch might still be under too much active development to get into the mainline. The group scheduling feature is not waiting, though; it has been merged for the 2.6.24 release. In the absence of control groups, the general group scheduling mechanism will not be available. Over the last few months, though, the group scheduler has evolved a new feature which will allow it to be used without control groups, and which implements what is likely to be the most common use case.

That feature is per-user scheduling: creating a separate group for each user running on the system and using those groups to give each user a fair share of the processor. Since the groups are created implicitly by the scheduler, there is no separate need for the control groups interface. Instead, if the "fair user" configuration option is selected, the per-user group scheduling will go into effect with no further intervention by the administrator required.
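The effect of per-user grouping can be seen with a little arithmetic. The sketch below is not scheduler code. It assumes a single CPU and processes that are always runnable, and the user names and process counts are made up for illustration. It compares the CPU time each user receives when every process competes equally with what each user receives when each user's group first gets an equal slice.

```python
from collections import Counter
from fractions import Fraction


def flat_shares(owners):
    """Per-process share when every process competes equally."""
    n = len(owners)
    return [Fraction(1, n) for _ in owners]


def grouped_shares(owners):
    """Per-process share when each user's group first gets an equal slice,
    which is then split equally among that user's processes."""
    per_user = Counter(owners)
    group_slice = Fraction(1, len(per_user))
    return [group_slice / per_user[owner] for owner in owners]


def user_totals(owners, shares):
    """Sum the per-process shares for each user."""
    totals = {}
    for owner, share in zip(owners, shares):
        totals[owner] = totals.get(owner, 0) + share
    return totals


# Hypothetical workload: alice runs one process, bob runs nine.
owners = ["alice"] + ["bob"] * 9
flat = user_totals(owners, flat_shares(owners))
grouped = user_totals(owners, grouped_shares(owners))
print("flat:   ", {user: str(share) for user, share in flat.items()})
print("grouped:", {user: str(share) for user, share in grouped.items()})
```

Without grouping, bob's nine processes take nine-tenths of the CPU; with per-user groups, each user gets half no matter how many processes they start.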

Of course, once the system provides fair per-user scheduling, administrators will immediately want to make it unfair by arranging for some users to get more CPU time than others. The age-old technique of raising the priority of that crucial administrative wesnoth process still works, but it is a crude and transparent tool. It would be much nicer to be able to tweak the scheduler so that certain users get a higher share of the CPU for the running of their crucial video diagnostic tools.

To achieve such ends with the 2.6.24 scheduler, it will only be necessary to go to the new sysfs directory /sys/kernel/uids. There will be a subdirectory there for every active user ID on the system, and each subdirectory will contain a file called cpu_share. The integer value found in that file defaults to 1024. For the purposes of adjusting scheduling, all that really matters is the ratio between the cpu_share values of any two users. If one user's cpu_share is set to 2048, that user will get twice as much CPU time as any user whose value remains at the default 1024. The end result is that adjusting the division of CPU time between users is quite easy for the administrator to do.
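As a rough illustration of how those ratios work out, the following sketch turns a set of cpu_share values into expected fractions of CPU time. It assumes every listed user keeps the CPU busy. The helper names and the UIDs 1000 and 1001 are hypothetical; only the /sys/kernel/uids/<uid>/cpu_share layout comes from the description above, and writing to those files would normally require root.

```python
from fractions import Fraction
from pathlib import Path

UIDS_ROOT = "/sys/kernel/uids"  # sysfs directory described in the article


def cpu_fractions(cpu_shares):
    """Map each UID to its expected fraction of CPU time.

    Only the ratios between cpu_share values matter, so each user's
    fraction is that user's value divided by the sum of all values.
    """
    total = sum(cpu_shares.values())
    return {uid: Fraction(share, total) for uid, share in cpu_shares.items()}


def read_cpu_shares(root=UIDS_ROOT):
    """Read the cpu_share value for every per-UID subdirectory under root."""
    shares = {}
    for entry in Path(root).iterdir():
        share_file = entry / "cpu_share"
        if entry.name.isdigit() and share_file.is_file():
            shares[int(entry.name)] = int(share_file.read_text().strip())
    return shares


def set_cpu_share(uid, share, root=UIDS_ROOT):
    """Write a new cpu_share value for one user (hypothetical helper)."""
    (Path(root) / str(uid) / "cpu_share").write_text(f"{share}\n")


# Hypothetical UIDs: 1000 is raised to 2048, 1001 stays at the default 1024.
example = cpu_fractions({1000: 2048, 1001: 1024})
print({uid: str(frac) for uid, frac in example.items()})
```

With these two values, user 1000 is expected to receive two-thirds of the CPU and user 1001 one-third, which is the two-to-one ratio described above.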

A rather large number of other patches were also merged for 2.6.24. Most of those are cleanups and small improvements. Some of the math within the scheduler has been made less intensive, and fairness has been improved in a number of ways. There is also a new facility for performing guest CPU accounting for virtualized systems running under KVM. It's a lot of patches, but the rate of change in the core CPU scheduler should be beginning to slow down again.

There are some other scheduler-related patches in the works, though. A couple of them address the problem of getting realtime tasks onto a CPU promptly. Normally, the CPU scheduler will make a significant effort to avoid moving processes between CPUs because the cost of that migration (resulting from lost memory cache contents) is high. If a realtime process wants to run, though, the system is obligated to give it a processor even if there is a price to be paid in terms of overall throughput. The current CPU scheduler, however, will cause a realtime process to languish if a higher-priority process is running on the same CPU, even if other processors are available in the system.

Fixing this problem involves a couple of different patches. One, from Steven Rostedt, addresses the situation where the scheduling of one realtime task causes a lower-priority (but still realtime) task to be pushed out of the CPU. Rather than leave that luckless task in the run queue, Steven's patch searches through the other processors on the system to find the one running the lowest-priority process. If a processor running a sufficiently low-priority process is found, the displaced realtime process is moved over to that processor.

Gregory Haskins has posted a similar patch which addresses a slightly different situation: a realtime task has just been awakened, but the CPU it is on is already running a higher-priority process. Once again, a search of the system to find the lowest-priority CPU is performed, with the realtime process being moved if a suitable home is found. In either case, the moved process will suffer a small performance hit as it finds a completely cold cache waiting for it. But it will still be able to respond much more quickly to the real world than it would if it were sitting on a run queue somewhere; that, of course, is what realtime scheduling is all about.
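Both patches rely on the same basic search: find the CPU running the lowest-priority process and move the waiting realtime task there if it outranks that process. The sketch below illustrates that idea only. It is not the code from either patch, which is written in C inside the kernel and must also deal with locking, CPU affinity, and similar details. The Cpu class, the function name, and the convention that a larger number means a higher priority are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Cpu:
    cpu_id: int
    running_prio: int  # assumption for this sketch: larger number = higher priority


def find_lowest_prio_cpu(
    cpus: Sequence[Cpu], task_prio: int, exclude: Optional[int] = None
) -> Optional[Cpu]:
    """Return the CPU running the lowest-priority process, provided the
    waiting task has a strictly higher priority; otherwise return None."""
    candidates = [cpu for cpu in cpus if cpu.cpu_id != exclude]
    if not candidates:
        return None
    target = min(candidates, key=lambda cpu: cpu.running_prio)
    return target if target.running_prio < task_prio else None


# Hypothetical system: CPU 0 runs a priority-90 task, CPU 1 priority 10,
# CPU 2 priority 40. A priority-50 realtime task is stuck on CPU 0.
cpus = [Cpu(0, 90), Cpu(1, 10), Cpu(2, 40)]
target = find_lowest_prio_cpu(cpus, task_prio=50, exclude=0)
print("migrate to CPU", target.cpu_id if target else None)

# A priority-5 task outranks nothing currently running, so it stays put.
print(find_lowest_prio_cpu(cpus, task_prio=5, exclude=0))
```

In this example the priority-50 task would be moved to CPU 1, accepting a cold cache there in exchange for running immediately.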
