kernel weekly news – 16.07.2011

Posted: July 16, 2011 in kernel

-Hello! A new week has started, so let’s do this! :)

-Chris Mason has btrfs fixes (git pull), Liam Girdwood has regulator fixes,
Roland Dreier has a patch fixing crashes in the block subtree,
Kukjin Kim has Samsung fixes, Greg Kroah-Hartman announces 2.6.39.3,
Jesse Barnes has a PCI fix, Takashi Iwai has a pull request with small
fixes and Guenter Roeck has hwmon fixes (pull request).

-Stefan Richter has several firewire updates (pull req.),
Frederic Weisbecker has cgroup updates in a 7-piece
patchset, Gustavo Padovan has updates to the bluetooth
subtree (“This is probably my last pull request to 3.1.
Here Andre Guedes adds support to enable/disable LE in a controller.
Joe Perches did a great clean up in our log system.
Mat Martineau reworked that Local Busy condition for the
Enhanced Retransmission Mode in L2CAP. And Vinicius Gomes improved the
Security Manager Protocol support, adding support to the phase 3 of its pairing
process and support to communicate keys with the userspace. The rest are just
fixes and clean up. Please pull. Thanks!”) and Steve French has cifs fixes
(pull request).

-Linus Torvalds announces the awaited -rc7 of 3.0:

 I think I said -rc6 might be the last -rc. I lied.

Things have been pretty quiet, but there's enough new stuff here that
I wanted to do another -rc, and we still have some issues with the RCU
changes causing problems when RCU events happen before the scheduler
has been fully initialized etc. So -rc7 is out there, although it
might not have mirrored out to the public sites quite yet.

I also ended up re-generating the -rc6 files (fat-fingered the release
script), so the -rc6 patches and tar-balls look all brand spanking new
too! Two releases for the price of one!

There's not a whole lot to say about it - the appended shortlog gives
a reasonable overview. Random drivers (we're back to the usual "two
thirds drivers" statistics), some media and cifs updates, and some
vmscan corner case improvements.

                Linus 

-Stefan Hajnoczi, in a mail titled “[PATCH 0/3] ACPI / Battery: fix NULL pointer
dereference from battery”, explains what it’s all about (long and technical):

 The following oops happens non-deterministically when resuming from suspend on
a recent Acer Aspire laptop.  The issue seems to be unregistering a led trigger
that was never registered.  The led_trigger->next_trig list_head fields contain
zeroes, hence the NULL pointer dereference in list_del().

Due to the non-deterministic nature of the bug I am not sure whether the
patches eliminate the oops.  However, the patches do address real problems in
drivers/acpi/battery.c.

There is a lack of error handling in battery.c so when power_supply_register()
fails the power_supply will continue to be used as if there was no error.  The
reason for this is that power_supply_register() leaves a stale
power_supply->dev pointer set when it returns an error and battery.c simply
uses that field to decide whether the power_supply is registered or not.  This
is fixed in the first patch.

The second and third patches clean up the error handling in acpi_battery_add()
to prevent a use-after-free and properly release resources.
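
A minimal userspace sketch of the stale-pointer problem described above, not the actual patch. The types and the functions `register_buggy()`, `register_fixed()` and `looks_registered()` are simplified, hypothetical stand-ins, and clearing the pointer on the error path is only one assumed way to fix it:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real API. */
struct device { int id; };

struct power_supply {
    struct device *dev;   /* the caller treats non-NULL as "registered" */
};

static struct device fake_dev;

/* Buggy variant: sets psy->dev early and leaves it set on failure. */
static int register_buggy(struct power_supply *psy, bool fail)
{
    psy->dev = &fake_dev;
    if (fail)
        return -1;        /* stale psy->dev survives the error */
    return 0;
}

/* Assumed fix: the pointer is cleared again on the error path. */
static int register_fixed(struct power_supply *psy, bool fail)
{
    psy->dev = &fake_dev;
    if (fail) {
        psy->dev = NULL;
        return -1;
    }
    return 0;
}

/* The caller's check: the dev field alone decides "registered or not". */
static bool looks_registered(const struct power_supply *psy)
{
    return psy->dev != NULL;
}
```

With the buggy variant, a failed registration still passes the `looks_registered()` check, which is how a later unregister can end up working on something that was never set up.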

Here is the oops from a Debian 2.6.39-2-amd64 kernel.  linux-2.6/master
contains no relevant fixes AFAICT.  Any thoughts on what is going on here
appreciated:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [] led_trigger_unregister+0x36/0xb4
PGD 1b3d13067 PUD 118545067 PMD 0 
Oops: 0002 [#1] SMP 
last sysfs file: /sys/devices/virtual/vc/vcsa63/uevent
CPU 0 
Modules linked in: parport_pc ppdev lp parport rfcomm bnep bluetooth crc16 acpi_cpufreq mperf cpufreq_conservative cpufreq_stats cpufreq_userspace cpufreq_powersave uinput fuse ext2 loop snd_hda_codec_hdmi snd_hda_codec_realtek joydev arc4 ecb ath9k snd_hda_intel mac80211 snd_hda_codec snd_hwdep snd_pcm ath9k_common ath9k_hw snd_seq snd_timer i915 ath snd_seq_device snd uvcvideo videodev drm_kms_helper soundcore cfg80211 drm snd_page_alloc ac sparse_keymap media wmi battery rfkill v4l2_compat_ioctl32 intel_ips pcspkr evdev i2c_i801 power_supply i2c_algo_bit i2c_core psmouse video processor button serio_raw ext3 jbd mbcache sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod sd_mod crc_t10dif ahci libahci libata scsi_mod ehci_hcd usbcore atl1c thermal thermal_sys [last unloaded: scsi_wait_scan]

Pid: 29129, comm: pm-suspend Not tainted 2.6.39-2-amd64 #1 Acer Aspire 3820/JM31_CP
RIP: 0010:[]  [] led_trigger_unregister+0x36/0xb4
RSP: 0018:ffff8801187fbd68  EFLAGS: 00010246
RAX: 0000000000000000 RBX: dead000000200200 RCX: 0000000000000004
RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8164ede0
     ^                     ^
rdx gets dereferenced      LIST_POISON constants (haven't been used yet)

RBP: ffff8801b321aec0 R08: 0000000000000200 R09: ffffffff81683390
R10: ffff880100000010 R11: ffff8801b0e3f800 R12: 00000000ffffffff
R13: ffff8801b3d93378 R14: 0000000000000003 R15: ffffffffffffffff
FS:  00007f83b6797700(0000) GS:ffff8801bbc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 00000001b21cc000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process pm-suspend (pid: 29129, threadinfo ffff8801187fa000, task ffff8801b2880e40)
Stack:
 ffff8801b321aec0 0000000000000004 00000000ffffffff ffffffff81263fa6
 ffff8801b0e3f820 ffffffffa01716b2 ffff8801b0e3f820 ffffffffa0171035
 ffff8801b0e3f800 ffffffffa018628f ffff8801b0e3f800 ffffffffa01863ae
Call Trace:
 [] ? led_trigger_unregister_simple+0xe/0x17
 [] ? power_supply_remove_triggers+0x16/0x80 [power_supply]
 [] ? power_supply_unregister+0x15/0x1f [power_supply]
 [] ? sysfs_remove_battery+0x25/0x32 [battery]
 [] ? battery_notify+0x16/0x22 [battery]
 [] ? notifier_call_chain+0x2e/0x5b
 [] ? __blocking_notifier_call_chain+0x4c/0x63
 [] ? pm_notifier_call_chain+0x15/0x2a
 [] ? enter_state+0x10c/0x12b
 [] ? state_store+0xb1/0xce
 [] ? sysfs_write_file+0xe0/0x11c
 [] ? vfs_write+0xa4/0xff
 [] ? sys_write+0x45/0x6e
 [] ? system_call_fastpath+0x16/0x1b
Code: 64 81 53 48 bb 00 02 20 00 00 00 ad de e8 fc e2 0c 00 48 8b 55 30 48 8b 45 38 48 be 00 01 10 00 00 00 ad de 48 c7 c7 e0 ed 64 81 
 89 42 08 48 89 10 48 89 75 30 48 89 5d 38 e8 60 ec df ff 48 
RIP  [] led_trigger_unregister+0x36/0xb4
 RSP 
CR2: 0000000000000008
---[ end trace 099c095de50533f3 ]---

Disassembly of led_trigger_unregister:
ffffffff81263ee4 <led_trigger_unregister>:
ffffffff81263ee4:       41 54                   push   %r12
ffffffff81263ee6:       55                      push   %rbp
ffffffff81263ee7:       48 89 fd                mov    %rdi,%rbp
ffffffff81263eea:       48 c7 c7 e0 ed 64 81    mov    $0xffffffff8164ede0,%rdi
ffffffff81263ef1:       53                      push   %rbx
ffffffff81263ef2:       48 bb 00 02 20 00 00    movabs $0xdead000000200200,%rbx
ffffffff81263ef9:       00 ad de 
ffffffff81263efc:       e8 fc e2 0c 00          callq  ffffffff813321fd 
ffffffff81263f01:       48 8b 55 30             mov    0x30(%rbp),%rdx
ffffffff81263f05:       48 8b 45 38             mov    0x38(%rbp),%rax
ffffffff81263f09:       48 be 00 01 10 00 00    movabs $0xdead000000100100,%rsi
ffffffff81263f10:       00 ad de 
ffffffff81263f13:       48 c7 c7 e0 ed 64 81    mov    $0xffffffff8164ede0,%rdi
ffffffff81263f1a:       48 89 42 08             mov    %rax,0x8(%rdx)
                                                ^--- boom! 
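
To see why the fault address in the oops is 0x8, here is a small sketch that assumes the usual two-pointer `list_head` layout. The helper `list_del_first_store_addr()` is hypothetical: it only computes the address that the first store of a list deletion (`entry->next->prev = entry->prev`) would touch, and does not perform the store:

```c
#include <stddef.h>
#include <stdint.h>

/* Two-pointer doubly linked list node, as used by the kernel's lists. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/*
 * A list deletion starts with "entry->next->prev = entry->prev".
 * Return the address that store would write to, without writing.
 */
static uintptr_t list_del_first_store_addr(const struct list_head *entry)
{
    return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}
```

For a zero-filled node on a 64-bit machine the result is 0x8, which lines up with CR2 in the oops and with the `mov %rax,0x8(%rdx)` store while RDX is zero.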

-Tejun Heo has a memblock-related patchset (x86/mm):

 Hello,

There are multiple ways to represent memory configuration during boot.
Even with the latest incarnation - nobootmem, the configuration isn't
centralized or easy to use.  NUMA information lives in
early_node_map[] while memory config and alloc/reservation live in
memblock.  This leads to ugly code pieces which try to combine the two
separate information sources both in generic and arch specific parts.

This patchset extends memblock such that it can also host node
information and allows an arch to do away with early_node_map[] and
use memblock as the sole early memory config / allocation mechanism.

For short term, this adds yet another config option -
HAVE_MEMBLOCK_NODE_MAP in this area.  Longer term goal is removing
early_node_map[] completely and convert everyone over to memblock.  As
early_node_map[] usage is limited only to NUMA archs, this should be
easier than bootmem allocator conversion.  In the end, memblock will
be the only early mem mechanism.

Note that this patchset still leaves good amount of code which can be
removed / cleaned up in not too distant future.  For example, memblock
non-NUMA alloc code can simply be degenerate case of NUMA aware alloc,
which can also be implemented in simpler and more efficient way with
reverse free area iterator.

This patchset first extends memblock so that it can contain node
information and then replaces x86 specific memblock code with the
generic one.

 0001-memblock-Remove-memblock_memory_can_coalesce.patch
 0002-memblock-Reimplement-memblock_add_region.patch
 0003-memblock-Add-optional-region-nid.patch
 0004-x86-Use-HAVE_MEMBLOCK_NODE_MAP.patch
 0005-x86-Use-__memblock_alloc_base-in-early_reserve_e820.patch
 0006-memblock-Implement-for_each_free_mem_range.patch
 0007-x86-Replace-memblock_x86_find_in_range_size-with-for.patch
 0008-memblock-x86-Make-free_all_memory_core_early-explici.patch
 0009-memblock-x86-Replace-__get_free_all_memory_range-wit.patch
 0010-memblock-x86-Reimplement-memblock_find_dma_reserve-u.patch
 0011-x86-Use-absent_pages_in_range-instead-of-memblock_x8.patch
 0012-memblock-x86-Make-ARCH_DISCARD_MEMBLOCK-a-config-opt.patch
 0013-memblock-x86-Replace-memblock_x86_reserve-free_range.patch

0001-0004 implement HAVE_MEMBLOCK_NODE_MAP and use it in x86.

0005-0013 add generic memblock free area iterator and gradually
replaces x86 specific memblock mechanism with generic one.

This patchset is on top of

  x86/urgent (5da0ef9a85 "x86: Disable AMD_NUMA for 32bit for now")
+ pfn->nid granularity check patches [1]
+ "memblock, x86: Misc cleanups" patchset [2]
+ "memblock, x86: Implement for_each_mem_pfn_range() and use it to improve memblock allocator" patchset [3]

and available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-x86-mm-memblock 
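
The idea behind a free area iterator can be sketched in plain C: free ranges are what is left of the memory ranges once the reserved ranges are subtracted. This is only an illustration under simplifying assumptions, not the memblock code. `struct range` and `free_ranges()` are hypothetical names, both inputs are assumed sorted and non-overlapping, and NUMA node information is left out:

```c
#include <stddef.h>
#include <stdint.h>

struct range { uint64_t start, end; };   /* half-open: [start, end) */

/*
 * Compute free ranges = memory minus reserved.
 * Both input arrays must be sorted by start and non-overlapping.
 * Writes at most max_out ranges to out and returns the number written.
 */
static size_t free_ranges(const struct range *mem, size_t nmem,
                          const struct range *rsv, size_t nrsv,
                          struct range *out, size_t max_out)
{
    size_t n = 0, r = 0;

    for (size_t m = 0; m < nmem; m++) {
        uint64_t cur = mem[m].start;

        /* Skip reserved ranges that end before this memory range starts. */
        while (r < nrsv && rsv[r].end <= cur)
            r++;

        /* Walk the reserved ranges that overlap this memory range. */
        for (size_t k = r; k < nrsv && rsv[k].start < mem[m].end; k++) {
            if (rsv[k].start > cur && n < max_out)
                out[n++] = (struct range){ cur, rsv[k].start };
            if (rsv[k].end > cur)
                cur = rsv[k].end;
        }

        /* Whatever remains after the last reservation is free. */
        if (cur < mem[m].end && n < max_out)
            out[n++] = (struct range){ cur, mem[m].end };
    }
    return n;
}
```

A reservation that spans a hole between two memory ranges is handled because the inner walk does not consume it; the next memory range sees it again.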

-Rob Herring has a patch fixing an ARM include problem
(“Currently, all ARM platforms must have a mach/hardware.h include. This is
because it is ultimately included by linux/pci.h which is included in many
places even for !CONFIG_PCI.

This could be fixed simply with an ifdef around the include of mach/hardware.h
in asm/pci.h. However, in the interest of fixing this for single kernel
binary builds, this series removes the include of mach/hardware.h outside of
mach-* and plat-*. What’s used from hardware.h is a couple of PCI defines.
Converting them to variables allows each platform to set the values as needed.

This does not address the inclusion of mach/hardware.h under drivers/*. This
appears to be mostly older platforms. There could also be some indirect
inclusions from other mach/* headers.

I’ve compile tested on most affected ARM platforms.

Changes from v2:
– Incorporated compile fixes for microblaze from Michal Simek.
– Added conversion of powerpc to generic pci flag functions.
– Combined powerpc and microblaze conversion to use
asm-generic/pci-bridge.h into one commit. Renaming of powerpc pci
flags functions is separate commit.
– Changed defaults for PCIBIOS_MIN_IO and PCIBIOS_MIN_MEM to 0x1000 and
0x1000000, respectively.
– Dropped commit moving ARCH_HAS_DMA_SET_COHERENT_MASK defines into
memory.h. This conflicts with other clean-up work by Nicolas Pitre.

Changes from v1:
– Added patch 2 to move ARCH_HAS_DMA_SET_COHERENT_MASK defines into memory.h.
– Separated VGA changes and renamed to vga_base.
– Reverted mach/hardware.h removal from ecard.c. It’s getting implicitly
included anyway.”), Keith Packard has drm-intel fixes (git)
(“What have we got here:

* A list of DP fixes from Jesse to make the code conform more closely
to the specification.

* Making Ivybridge use the Sandybridge GPU reset path.

* Recover from i915 load failure without causing a later panic
when the shrinker ran.

* Revert the RC6 enable patch — there are at least two machines which
mysteriously fail with RC6 enabled. We found lots of possible causes,
none of which appear to help these last few hold-outs.

* Fix an obvious typo — the GPU idling code was using the wrong variable
for the size of the ring. This may well cause spurious suspend
failures as the GPU wouldn’t have been reliably idled.

The following changes since commit fe0d42203cb5616eeff68b14576a0f7e2dd56625:

Linux 3.0-rc6 (2011-07-04 15:56:24 -0700)”) and
Trond Myklebust has nfs client fixes in a pull request.
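
Rob Herring's point about converting the PCI defines to variables can be pictured roughly like this. It is a sketch, not code from the series: the lowercase variable names, the platform hook `example_platform_pci_init()` and its values, and the helper `io_addr_allowed()` are assumptions for illustration. Only the defaults 0x1000 and 0x1000000 come from the changelog:

```c
/* Defaults named in the changelog; a platform may override them at init. */
unsigned long pcibios_min_io  = 0x1000;
unsigned long pcibios_min_mem = 0x1000000;

/* Hypothetical platform init hook with made-up values. */
static void example_platform_pci_init(void)
{
    pcibios_min_io  = 0x6000;
    pcibios_min_mem = 0x50000000;
}

/* Generic code reads the variable instead of a per-platform #define. */
static int io_addr_allowed(unsigned long addr)
{
    return addr >= pcibios_min_io;
}
```

Because the limits are read at run time, one kernel binary could serve platforms with different values, which a compile-time define from mach/hardware.h cannot do.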

-Rajiv Andrade announces tpm fixes (pull request), Greg Kroah-Hartman
announces kernels 2.6.32.43 and 2.6.33.16, Dave Airlie has drm fixes
(minor, git pull), Paul E. McKenney has a pull request for rcu/urgent
(“This pull request is an update from https://lkml.org/lkml/2011/7/11/248.
It fixes a rare but real boot-time hang that is caused by RCU callbacks
being registered during early boot whose callback functions depend on
the scheduler being fully initialized. This commit therefore defers
callback invocation until after the scheduler has spawned the first task.
In contrast, the earlier patch deferred only until the scheduler was
ready to spawn the first task, in particular, before the init task had
first entered schedule(). The former pull request fixed Ravi’s hang,
but not Julie’s. This pull request addresses both hangs.

There was another hang from Konrad, but this hang turned out to be
unrelated. Konrad’s hang has been solved: It was fixed by a patch from
Peter (https://lkml.org/lkml/2011/7/12/150).

I have Tested-by responses from all three (Julie, Ravi, and Konrad).

This commit is available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/urgent

I believe that this commit (and Peter’s patch, for that matter) should
be included in v3.0.”) and David Miller has a networking pull request:

 1) SLIP config ifdefs surround wrong bits of code, fix from
   Matvejchikov Ilya.

2) Natsemi does DMA unmaps using wrong length, fix from Jim Cromie.

3) Two SCTP fixes from Thomas Graf.  Do not deadlock on graceful
   shutdown when data chunks exist in the retransmit queue.  Also,
   behave like TCP if the receiver closes with data still queued up
   on receive, by emitting an ABORT.

4) Fix use after free in HSO driver, from Octavian Purdila.

5) Natsemi module parms permissions are busted, from Jean Delvare.

6) Fix memory leak in XFRM state code, from Tushar Gohad.

7) Fix vulnerability in mac80211 TKIP replay handling, from Johannes
   Berg.

8) Several bluetooth fixes:
	Buffer overflow in l2cap from Dan Rosenberg
	HIDP disconnect deadlocks from Peter Hurley
	Incoming L2CAP regression fix from Gustavo F. Padovan
	Memory leak in hci_conn from Tomas Targownik

9) ath5k driver stores "ieee80211_hw" pointer in drvdata but then
   tries to use it as a "ath5k_softc" pointer, fix from Pavel Roskin.

10) Fix deadlock in rfkill/sched_scan code of cfg80211 by using a new
    mutex, fix from Luciano Coelho.

Please pull, thanks a lot! 
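
Going back to the rcu/urgent pull request quoted earlier, the deferral it describes (callbacks may be registered early, but are not invoked until the scheduler is usable) can be modelled with a tiny queue. This is a userspace toy, not RCU. The `scheduler_ready` flag and the functions `register_callback()`, `invoke_callbacks()`, `mark_scheduler_ready()` and `count_call()` are all hypothetical names:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

typedef void (*callback_fn)(void *arg);

struct pending { callback_fn fn; void *arg; };

static struct pending queue[MAX_PENDING];
static size_t queued;
static bool scheduler_ready;   /* hypothetical "first task spawned" flag */

/* Registration is always allowed; it only queues the callback. */
static int register_callback(callback_fn fn, void *arg)
{
    if (queued == MAX_PENDING)
        return -1;
    queue[queued++] = (struct pending){ fn, arg };
    return 0;
}

/* Invoke queued callbacks only once the scheduler is ready.
 * Returns the number of callbacks invoked. */
static size_t invoke_callbacks(void)
{
    size_t n = 0;

    if (!scheduler_ready)
        return 0;              /* too early: keep everything queued */
    for (; n < queued; n++)
        queue[n].fn(queue[n].arg);
    queued = 0;
    return n;
}

static void mark_scheduler_ready(void)
{
    scheduler_ready = true;
}

/* Sample callback: bumps the counter it is handed. */
static void count_call(void *arg)
{
    ++*(int *)arg;
}
```

Calling `invoke_callbacks()` before `mark_scheduler_ready()` does nothing, so a callback that depends on the scheduler cannot run during early boot in this model.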

-Frederic Weisbecker has a pull request on random-tracing
(hw_breakpoints), Steven Whitehouse has a pull request
for gfs2 and Steven Rostedt has a linux-trace pull request.

-That’s it, enjoy your weekend!
