Skip to content

Commit

Permalink
Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug…
Browse files Browse the repository at this point in the history
…', 'ec', 'misc', 'printk' and 'processor' into release
  • Loading branch information
lenb committed Feb 7, 2009
8 parents 2b25c9f + 0a3db1c + 9fdd54f + 5ec5d38 + 4312495 + 370154b + 4d93915 + 62663ea commit 2d29c6a
Show file tree
Hide file tree
Showing 1,744 changed files with 32,279 additions and 15,883 deletions.
4 changes: 3 additions & 1 deletion .mailmap
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,7 @@ Rudolf Marek <[email protected]>
Rui Saraiva <[email protected]>
Sachin P Sant <[email protected]>
Sam Ravnborg <[email protected]>
Sascha Hauer <[email protected]>
S.Çağlar Onur <[email protected]>
Simon Kelley <[email protected]>
Stéphane Witzmann <[email protected]>
Expand All @@ -100,6 +101,7 @@ Tejun Heo <[email protected]>
Thomas Graf <[email protected]>
Tony Luck <[email protected]>
Tsuneo Yoshioka <[email protected]>
Uwe Kleine-König <[email protected]>
Uwe Kleine-König <[email protected]>
Uwe Kleine-König <[email protected]>
Uwe Kleine-König <[email protected]>
Valdis Kletnieks <[email protected]>
4 changes: 3 additions & 1 deletion Documentation/Changes
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,12 @@ o Gnu make 3.79.1 # make --version
o binutils 2.12 # ld -v
o util-linux 2.10o # fdformat --version
o module-init-tools 0.9.10 # depmod -V
o e2fsprogs 1.29 # tune2fs
o e2fsprogs 1.41.4 # e2fsck -V
o jfsutils 1.1.3 # fsck.jfs -V
o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs
o xfsprogs 2.6.0 # xfs_db -V
o squashfs-tools 4.0 # mksquashfs -version
o btrfs-progs 0.18 # btrfsck
o pcmciautils 004 # pccardctl -V
o quota-tools 3.09 # quota -V
o PPP 2.4.0 # pppd --version
Expand Down
18 changes: 13 additions & 5 deletions Documentation/CodingStyle
Original file line number Diff line number Diff line change
Expand Up @@ -483,17 +483,25 @@ values. To do the latter, you can stick the following in your .emacs file:
(* (max steps 1)
c-basic-offset)))

(add-hook 'c-mode-common-hook
(lambda ()
;; Add kernel style
(c-add-style
"linux-tabs-only"
'("linux" (c-offsets-alist
(arglist-cont-nonempty
c-lineup-gcc-asm-reg
c-lineup-arglist-tabs-only))))))

(add-hook 'c-mode-hook
(lambda ()
(let ((filename (buffer-file-name)))
;; Enable kernel mode for the appropriate files
(when (and filename
(string-match "~/src/linux-trees" filename))
(string-match (expand-file-name "~/src/linux-trees")
filename))
(setq indent-tabs-mode t)
(c-set-style "linux")
(c-set-offset 'arglist-cont-nonempty
'(c-lineup-gcc-asm-reg
c-lineup-arglist-tabs-only))))))
(c-set-style "linux-tabs-only")))))

This will make emacs go better with the kernel coding style for C
files below ~/src/linux-trees.
Expand Down
2 changes: 1 addition & 1 deletion Documentation/DMA-API.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@

This document describes the DMA API. For a more gentle introduction
phrased in terms of the pci_ equivalents (and actual examples) see
DMA-mapping.txt
Documentation/PCI/PCI-DMA-mapping.txt.

This API is split into two pieces. Part I describes the API and the
corresponding pci_ API. Part II describes the extensions to the API
Expand Down
88 changes: 88 additions & 0 deletions Documentation/DocBook/uio-howto.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,12 @@ GPL version 2.
</abstract>

<revhistory>
<revision>
<revnumber>0.7</revnumber>
<date>2008-12-23</date>
<authorinitials>hjk</authorinitials>
<revremark>Added generic platform drivers and offset attribute.</revremark>
</revision>
<revision>
<revnumber>0.6</revnumber>
<date>2008-12-05</date>
Expand Down Expand Up @@ -312,6 +318,16 @@ interested in translating it, please email me
pointed to by addr.
</para>
</listitem>
<listitem>
<para>
<filename>offset</filename>: The offset, in bytes, that has to be
added to the pointer returned by <function>mmap()</function> to get
to the actual device memory. This is important if the device's memory
is not page aligned. Remember that pointers returned by
<function>mmap()</function> are always page aligned, so it is good
style to always add this offset.
</para>
</listitem>
</itemizedlist>

<para>
Expand Down Expand Up @@ -594,6 +610,78 @@ framework to set up sysfs files for this region. Simply leave it alone.
</para>
</sect1>

<sect1 id="using_uio_pdrv">
<title>Using uio_pdrv for platform devices</title>
<para>
In many cases, UIO drivers for platform devices can be handled in a
generic way. In the same place where you define your
<varname>struct platform_device</varname>, you simply also implement
your interrupt handler and fill your
<varname>struct uio_info</varname>. A pointer to this
<varname>struct uio_info</varname> is then used as
<varname>platform_data</varname> for your platform device.
</para>
<para>
You also need to set up an array of <varname>struct resource</varname>
containing addresses and sizes of your memory mappings. This
information is passed to the driver using the
<varname>.resource</varname> and <varname>.num_resources</varname>
elements of <varname>struct platform_device</varname>.
</para>
<para>
You now have to set the <varname>.name</varname> element of
<varname>struct platform_device</varname> to
<varname>"uio_pdrv"</varname> to use the generic UIO platform device
driver. This driver will fill the <varname>mem[]</varname> array
according to the resources given, and register the device.
</para>
<para>
The advantage of this approach is that you only have to edit a file
you need to edit anyway. You do not have to create an extra driver.
</para>
</sect1>

<sect1 id="using_uio_pdrv_genirq">
<title>Using uio_pdrv_genirq for platform devices</title>
<para>
Especially in embedded devices, you frequently find chips where the
irq pin is tied to its own dedicated interrupt line. In such cases,
where you can be really sure the interrupt is not shared, we can take
the concept of <varname>uio_pdrv</varname> one step further and use a
generic interrupt handler. That's what
<varname>uio_pdrv_genirq</varname> does.
</para>
<para>
The setup for this driver is the same as described above for
<varname>uio_pdrv</varname>, except that you do not implement an
interrupt handler. The <varname>.handler</varname> element of
<varname>struct uio_info</varname> must remain
<varname>NULL</varname>. The <varname>.irq_flags</varname> element
must not contain <varname>IRQF_SHARED</varname>.
</para>
<para>
You will set the <varname>.name</varname> element of
<varname>struct platform_device</varname> to
<varname>"uio_pdrv_genirq"</varname> to use this driver.
</para>
<para>
The generic interrupt handler of <varname>uio_pdrv_genirq</varname>
will simply disable the interrupt line using
<function>disable_irq_nosync()</function>. After doing its work,
userspace can reenable the interrupt by writing 0x00000001 to the UIO
device file. The driver already implements an
<function>irq_control()</function> to make this possible, you must not
implement your own.
</para>
<para>
Using <varname>uio_pdrv_genirq</varname> not only saves a few lines of
interrupt handler code. You also do not need to know anything about
the chip's internal registers to create the kernel part of the driver.
All you need to know is the irq number of the pin the chip is
connected to.
</para>
</sect1>

</chapter>

<chapter id="userspace_driver" xreflabel="Writing a driver in user space">
Expand Down
4 changes: 2 additions & 2 deletions Documentation/IO-mapping.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[ NOTE: The virt_to_bus() and bus_to_virt() functions have been
superseded by the functionality provided by the PCI DMA
interface (see Documentation/DMA-mapping.txt). They continue
superseded by the functionality provided by the PCI DMA interface
(see Documentation/PCI/PCI-DMA-mapping.txt). They continue
to be documented below for historical purposes, but new code
must not use them. --davidm 00/12/12 ]

Expand Down
11 changes: 6 additions & 5 deletions Documentation/block/biodoc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -186,8 +186,9 @@ a virtual address mapping (unlike the earlier scheme of virtual address
do not have a corresponding kernel virtual address space mapping) and
low-memory pages.

Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA
aspects and mapping of scatter gather lists, and support for 64 bit PCI.
Note: Please refer to Documentation/PCI/PCI-DMA-mapping.txt for a discussion
on PCI high mem DMA aspects and mapping of scatter gather lists, and support
for 64 bit PCI.

Special handling is required only for cases where i/o needs to happen on
pages at physical memory addresses beyond what the device can support. In these
Expand Down Expand Up @@ -953,14 +954,14 @@ elevator_allow_merge_fn called whenever the block layer determines
results in some sort of conflict internally,
this hook allows it to do that.

elevator_dispatch_fn fills the dispatch queue with ready requests.
elevator_dispatch_fn* fills the dispatch queue with ready requests.
I/O schedulers are free to postpone requests by
not filling the dispatch queue unless @force
is non-zero. Once dispatched, I/O schedulers
are not allowed to manipulate the requests -
they belong to generic dispatch queue.

elevator_add_req_fn called to add a new request into the scheduler
elevator_add_req_fn* called to add a new request into the scheduler

elevator_queue_empty_fn returns true if the merge queue is empty.
Drivers shouldn't use this, but rather check
Expand Down Expand Up @@ -990,7 +991,7 @@ elevator_activate_req_fn Called when device driver first sees a request.
elevator_deactivate_req_fn Called when device driver decides to delay
a request by requeueing it.

elevator_init_fn
elevator_init_fn*
elevator_exit_fn Allocate and free any elevator specific storage
for a queue.

Expand Down
63 changes: 63 additions & 0 deletions Documentation/block/queue-sysfs.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
Queue sysfs files
=================

This text file will detail the queue files that are located in the sysfs tree
for each block device. Note that stacked devices typically do not export
any settings, since their queue merely functions are a remapping target.
These files are the ones found in the /sys/block/xxx/queue/ directory.

Files denoted with a RO postfix are readonly and the RW postfix means
read-write.

hw_sector_size (RO)
-------------------
This is the hardware sector size of the device, in bytes.

max_hw_sectors_kb (RO)
----------------------
This is the maximum number of kilobytes supported in a single data transfer.

max_sectors_kb (RW)
-------------------
This is the maximum number of kilobytes that the block layer will allow
for a filesystem request. Must be smaller than or equal to the maximum
size allowed by the hardware.

nomerges (RW)
-------------
This enables the user to disable the lookup logic involved with IO merging
requests in the block layer. Merging may still occur through a direct
1-hit cache, since that comes for (almost) free. The IO scheduler will not
waste cycles doing tree/hash lookups for merges if nomerges is 1. Defaults
to 0, enabling all merges.

nr_requests (RW)
----------------
This controls how many requests may be allocated in the block layer for
read or write requests. Note that the total allocated number may be twice
this amount, since it applies only to reads or writes (not the accumulated
sum).

read_ahead_kb (RW)
------------------
Maximum number of kilobytes to read-ahead for filesystems on this block
device.

rq_affinity (RW)
----------------
If this option is enabled, the block layer will migrate request completions
to the CPU that originally submitted the request. For some workloads
this provides a significant reduction in CPU cycles due to caching effects.

scheduler (RW)
--------------
When read, this file will display the current and available IO schedulers
for this block device. The currently active IO scheduler will be enclosed
in [] brackets. Writing an IO scheduler name to this file will switch
control of this block device to that new IO scheduler. Note that writing
an IO scheduler name to this file will attempt to load that IO scheduler
module, if it isn't already present in the system.



Jens Axboe <[email protected]>, February 2009
24 changes: 22 additions & 2 deletions Documentation/cgroups/memcg_test.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Memory Resource Controller(Memcg) Implementation Memo.
Last Updated: 2008/12/15
Base Kernel Version: based on 2.6.28-rc8-mm.
Last Updated: 2009/1/19
Base Kernel Version: based on 2.6.29-rc2.

Because VM is getting complex (one of reasons is memcg...), memcg's behavior
is complex. This is a document for memcg's internal behavior.
Expand Down Expand Up @@ -340,3 +340,23 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
# mount -t cgroup none /cgroup -t cpuset,memory,cpu,devices

and do task move, mkdir, rmdir etc...under this.

9.7 swapoff.
Besides management of swap is one of complicated parts of memcg,
call path of swap-in at swapoff is not same as usual swap-in path..
It's worth to be tested explicitly.

For example, test like following is good.
(Shell-A)
# mount -t cgroup none /cgroup -t memory
# mkdir /cgroup/test
# echo 40M > /cgroup/test/memory.limit_in_bytes
# echo 0 > /cgroup/test/tasks
Run malloc(100M) program under this. You'll see 60M of swaps.
(Shell-B)
# move all tasks in /cgroup/test to /cgroup
# /sbin/swapoff -a
# rmdir /test/cgroup
# kill malloc task.

Of course, tmpfs v.s. swapoff test should be tested, too.
16 changes: 0 additions & 16 deletions Documentation/cpu-freq/user-guide.txt
Original file line number Diff line number Diff line change
Expand Up @@ -195,19 +195,3 @@ scaling_setspeed. By "echoing" a new frequency into this
you can change the speed of the CPU,
but only within the limits of
scaling_min_freq and scaling_max_freq.


3.2 Deprecated Interfaces
-------------------------

Depending on your kernel configuration, you might find the following
cpufreq-related files:
/proc/cpufreq
/proc/sys/cpu/*/speed
/proc/sys/cpu/*/speed-min
/proc/sys/cpu/*/speed-max

These are files for deprecated interfaces to cpufreq, which offer far
less functionality. Because of this, these interfaces aren't described
here.

4 changes: 2 additions & 2 deletions Documentation/filesystems/nfs-rdma.txt
Original file line number Diff line number Diff line change
Expand Up @@ -251,7 +251,7 @@ NFS/RDMA Setup

Instruct the server to listen on the RDMA transport:

$ echo rdma 2050 > /proc/fs/nfsd/portlist
$ echo rdma 20049 > /proc/fs/nfsd/portlist

- On the client system

Expand All @@ -263,7 +263,7 @@ NFS/RDMA Setup
Regardless of how the client was built (module or built-in), use this
command to mount the NFS/RDMA server:

$ mount -o rdma,port=2050 <IPoIB-server-name-or-address>:/<export> /mnt
$ mount -o rdma,port=20049 <IPoIB-server-name-or-address>:/<export> /mnt

To verify that the mount is using RDMA, run "cat /proc/mounts" and check
the "proto" field for the given mount.
Expand Down
28 changes: 28 additions & 0 deletions Documentation/filesystems/proc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2027,6 +2027,34 @@ increase the likelihood of this process being killed by the oom-killer. Valid
values are in the range -16 to +15, plus the special value -17, which disables
oom-killing altogether for this process.

The process to be killed in an out-of-memory situation is selected among all others
based on its badness score. This value equals the original memory size of the process
and is then updated according to its CPU time (utime + stime) and the
run time (uptime - start time). The longer it runs the smaller is the score.
Badness score is divided by the square root of the CPU time and then by
the double square root of the run time.

Swapped out tasks are killed first. Half of each child's memory size is added to
the parent's score if they do not share the same memory. Thus forking servers
are the prime candidates to be killed. Having only one 'hungry' child will make
parent less preferable than the child.

/proc/<pid>/oom_score shows process' current badness score.

The following heuristics are then applied:
* if the task was reniced, its score doubles
* superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE
or CAP_SYS_RAWIO) have their score divided by 4
* if oom condition happened in one cpuset and checked task does not belong
to it, its score is divided by 8
* the resulting score is multiplied by two to the power of oom_adj, i.e.
points <<= oom_adj when it is positive and
points >>= -(oom_adj) otherwise

The task with the highest badness score is then selected and its children
are killed, process itself will be killed in an OOM situation when it does
not have children or some of them disabled oom like described above.

2.13 /proc/<pid>/oom_score - Display current oom-killer score
-------------------------------------------------------------

Expand Down
Loading

0 comments on commit 2d29c6a

Please sign in to comment.