kernel weekly news – 30.07.2011

Posted: July 30, 2011 in kernel

Howdy y’all! Let’s start this week’s news!

-Steven Whitehouse has a pull request with gfs2 (merge window),
Ingo Molnar has core/iommu changes for 3.1 (pull request),
Ingo also has core/printk, perf/core, sched/core and rcu
changes, Stefan Richter has firewire updates, Tejun Heo has
workqueue and percpu changes for 3.1, going back to Ingo Molnar,
he announces timers/{cleanups, core, rtc}, x86/{apic,asm,cleanups,
efi,mce,microcode,mtrr,numa,signal,uv,vdso} also for 3.1 and David
Miller has networking fixes :

 A bit less going on than in the past few releases, most notable this
time is:

1) There are currently 3 or 4 ways to add VLAN support for a driver,
   which is just crazy.  Jiri Pirko is trying to consolidate things so
   we have less of a mess here.

2) The Neighbour layer has been simplified and sped up.  It had complexity
   purely for the sake of allowing situations that simply never happen.
   This removed some indirect calls in the fast path.

   It even had a method pointer that everyone assigned to the same global
   routine. :-)

   There will be more activity in this area in the future.

3) New driver for rtl8192de wireless chipset.

4) PowerPC 64-bit now has a BPF JIT too.

5) Get more drivers supporting 64-bit device stats, thanks to Stephen
   Hemminger.

6) Sometimes a config change can happen mid-dump in netlink, we can now
   detect this situation using sequence numbers and decide to rescan
   if we want to.

7) SKB zero-copy buffer support for virtualization from Shirley Ma.

8) Improve scalability of inetpeer table by removing the explicit
   unused list and killing off some false sharing.  From Eric Dumazet.

9) AF_PACKET sockets now support a "fanout" facility, whereby you can
   distribute packet capture amongst a group of sockets.  This will be
   used by userland traffic analysis tools such as suricata.

Please pull, thanks a lot! 
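
As a rough illustration of item 9, here is a minimal sketch of how a userland tool might join an AF_PACKET fanout group. The numeric constants are copied by hand from the kernel headers (linux/if_packet.h and linux/socket.h) rather than taken from Python's socket module, and the helper names fanout_arg and open_fanout_socket are made up for this example. Opening the socket needs CAP_NET_RAW, so nothing is opened unless you call the second helper yourself.

```python
import socket

# Values copied by hand from <linux/if_packet.h> and <linux/socket.h>;
# this sketch does not assume Python's socket module exports them.
SOL_PACKET = 263
PACKET_FANOUT = 18
PACKET_FANOUT_HASH = 0  # spread by flow hash, so one flow stays on one socket
PACKET_FANOUT_LB = 1    # plain round-robin between the sockets
PACKET_FANOUT_CPU = 2   # choose the socket by the CPU that received the packet
ETH_P_ALL = 0x0003


def fanout_arg(group_id, mode):
    """Pack the group id (low 16 bits) and the mode (high 16 bits)."""
    if not 0 <= group_id <= 0xFFFF:
        raise ValueError("group id must fit in 16 bits")
    return group_id | (mode << 16)


def open_fanout_socket(ifname, group_id, mode=PACKET_FANOUT_HASH):
    """Open a capture socket on ifname and join a fanout group.

    Hypothetical helper for illustration; requires CAP_NET_RAW.
    """
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_ALL))
    sock.bind((ifname, 0))
    sock.setsockopt(SOL_PACKET, PACKET_FANOUT, fanout_arg(group_id, mode))
    return sock
```

A capture tool would call open_fanout_socket("eth0", 42) once per worker thread; every socket that joins group 42 then receives its own share of the traffic instead of a full copy.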

-Christoph Hellwig has hfsplus updates for 3.1 (all these are pull requests
unless otherwise noted), Grant Likely has devicetree, spi and gpio updates,
David Teigland has dlm updates for 3.1, Alex Elder has xfs updates for 3.0-rc1,
Oleg Nesterov has ptrace changes for 3.1 and Roland Dreier has infiniband
changes (3.1 merge window) .

-Jiri Kosina has HID updates (also 3.1 merge window), Jeremy Fitzhardinge
has xen-tracing updates, Rafael J. Wysocki has power management updates,
Al Viro has vfs updates for 3.1 (“The big ones in that are
* ->permission() API change (death to separate boolean argument, death to
generic_permission() check_acl callback, death to file_permission(), death
to exec_permission())
* nfs4 mknod() fixes (and end to pointless carrying vfsmount pointers in the
guts of nfs)
* further reduction of struct nameidata exposure, including the LOOKUP_…
flags use
* dchinner: per-sb shrinkers
* hch: death to ->i_alloc_sem
* jbacik: SEEK_HOLE/SEEK_DATA, ->fsync() API change.

Also by jbacik: DCACHE_NEED_LOOKUP, which is going to be the basis for
atomic_open-done-right (aside of originally intended uses in btrfs).
Unfortunately, ->d_lock/->d_parent shite had sidetracked me in the last
couple of weeks of last cycle, so atomic_open will have to wait for -rc2
or -rc3; *PLEASE* hold any unionfs or overlayfs merges until then.

Other than that, there’s a moderate bunch of assorted patches; I’ll push
more of that in the next pull request (i.e. folks who don’t see their
patches in the shortlog below, please wait for after the next vfs pull
request before complaining)”) and Mark Brown has a regmap-related
pull request, explained as follows :

 This pull request adds regmap, a library which abstracts out some widely
reimplemented code for accessing the register maps of devices on slow
buses like I2C and SPI.  As well as factoring out code this makes
it much easier to factor out higher level code for this class of
devices.  We've been using equivalent code in ASoC to great effect, the
idea here is to move that functionality so that it's usable in other
subsystems.

I'm not 100% happy with the implementation at present (to a large extent
due to keeping it simple for initial review, though there's some stuff
I'm just not happy with) so I expect a bit of internal churn but the
external interface should be solid and allow other code to start making
use of the code.

Only one driver is converted to the API in this pull request (the
tps65023), some other drivers have been converted including the generic
cache code in ASoC but various cross tree dependencies mean they can't
be applied until everything settles down in the merge window.

The code is in a separate directory because it is expected that the
addition of register cache support and diagnostic features like trace
and debugfs access to the register maps will mean that the code will
get larger. 
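
To make the idea behind regmap a bit more concrete, here is a toy model in Python. It is not the kernel API: FakeI2CBus and RegisterMap are names invented for this sketch, and the real library is C code with its own interface. It only shows the pattern being factored out, which is that drivers talk to a register map while the bus access and the read-modify-write logic live in one shared place.

```python
class FakeI2CBus:
    """Stand-in for a slow bus; counts transfers so their cost is visible."""

    def __init__(self):
        self.regs = {}
        self.transfers = 0

    def read(self, reg):
        self.transfers += 1
        return self.regs.get(reg, 0)

    def write(self, reg, value):
        self.transfers += 1
        self.regs[reg] = value


class RegisterMap:
    """Bus-independent register access shared by every driver that uses it."""

    def __init__(self, bus, val_bits=8):
        self.bus = bus
        self.val_mask = (1 << val_bits) - 1

    def read(self, reg):
        return self.bus.read(reg) & self.val_mask

    def write(self, reg, value):
        self.bus.write(reg, value & self.val_mask)

    def update_bits(self, reg, mask, value):
        """Read-modify-write: change only the bits selected by mask.

        Returns True if the register actually changed.
        """
        old = self.read(reg)
        new = (old & ~mask) | (value & mask)
        if new != old:  # skip the bus write when nothing changes
            self.write(reg, new)
        return new != old
```

A driver written against RegisterMap never touches the bus directly, so the same driver code could sit on top of an I2C or an SPI backend, and later additions such as a register cache would only have to be written once.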

-Eric Van Hensbergen has changes for the 9p tree, Takashi Iwai updates
the sound tree, James Bottomley has a first round of scsi updates,
Nicholas Bellinger updates iscsi-target for 3.1-rc1 merge,
Martin Schwidefsky has s390 updates for 3.0+ and Jonas Bonn has
updates for the project porting Linux to the OpenRISC arch
(openrisc.net) .

-Avi Kivity has kvm updates for the 3.1 merge window, Jeff Garzik
has libata updates for 3.1, John W. Linville has wireless updates,
Dave Jones has some cpufreq updates, Jens Axboe has linux-block
updates for 3.1, Anton Blanchard has ppc64 scheduler fixes (patch),
Jean Delvare has i2c and hwmon fixes for 3.1 and Grant Likely has
patches that are needed for the arm architecture in 3.1 .

-Speaking of arm, Arnd Bergmann has arm-soc fixes for 3.1 (merge
window), Michal Marek has kbuild-* fixes, Steven Rostedt updates
ktest, Phillip Lougher has squashfs updates for 3.1, Greg Ungerer
has m68knommu updates and Benjamin Herrenschmidt has powerpc
updates.

-Greg Kroah-Hartman has updates/patches for driver-core, usb and
serial, all for 3.1, Al Viro updates vfs (“Regression fixes (devtmpfs
race, cifs ->d_revalidate() breakage, fsync fallout), ACL stuff
(me/Linus/Christoph), fixes for bugs found while wading through
the mode_t handling. There’ll be another pull request with assorted
stuff from over the last cycle.”), John Stultz has RTC fixes, Wu Fengguang
has writeback changes for 3.1 (“Besides the less interesting
tweaks on writeback queues and dirty balancing, there are two
remarkable changes:

– split the global inode_wb_list_lock (which is highly contended in
small file writeback) into per-bdi list_lock, which helps JBOD setup

– write bandwidth estimation, based on which the writeback chunk size is
increased from 4MB to 0.5 seconds worth of data. Note that ext4 already
uses 128MB write chunk size internally, so won’t easily see the difference

There will be merge conflicts with commit 12ad3ab661 (“superblock:
move pin_sb_for_writeback() to fs/super.c”). If you want it, I can
resolve the conflicts either by a rebase, or by doing a local merge
into a new local branch:

git checkout master
git pull linus master

git branch for-linus-clean
git checkout for-linus-clean

git merge for-linus # and resolve conflicts by myself

I feel sorry for the last minute changes to the last two commits. They
are for doing a trivial variable rename.”) and Ingo Molnar has
olpc changes for 3.1 .
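
To put the writeback chunk size change quoted above into numbers, here is a small sketch. The function name is made up, and using the old 4MB value as a lower bound is an assumption of this example rather than something the pull request states.

```python
MiB = 1024 * 1024
OLD_CHUNK_BYTES = 4 * MiB  # the fixed chunk size the quote says was replaced


def writeback_chunk_bytes(bandwidth_bytes_per_sec, seconds=0.5):
    """Chunk size as `seconds` worth of data at the estimated bandwidth.

    Treating the old 4MB chunk as a floor is an assumption of this sketch.
    """
    return max(OLD_CHUNK_BYTES, int(bandwidth_bytes_per_sec * seconds))
```

With these assumptions a device estimated at 100MB/s would get a 50MB chunk, and it would take an estimate of 256MB/s before the chunk reaches the 128MB that ext4 is said to use internally.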

-Sage Weil has ceph updates for 3.1-rc1 :

 Lots of different things here.  There is some code cleanup with the file 
flags, a few snapshot metadata writeback fixes, a fix for unnecessary 
timeouts on heavily loaded clusters, and a fix for dentry leases.  There 
are many fixes for bugs Al turned up: a fix and cleanup in the open intent 
code, and several fixes for d_parent use without the appropriate locks (a 
full code audit caught a few more).  RBD devices now clean up on the 
server when they are unmapped, rbd request sizes are now larger, and the 
ceph readahead window is set up properly.

-Jan Kara has ext3, jbd, ext2, and quota fixes for 3.1-rc1, Dmitry Torokhov
has input subsystem updates for -rc0, Michal Simek has microblaze fixes for
-rc1, Dave Airlie has a drm pull request for -rc1 and Arnd Bergmann has arm-soc
changes for -rc1 :

 This adds support for two new platforms and a new OMAP SOC in the
ARM architecture. The new platforms, zynq and prima2, are about as
good as it gets in following the latest set of guidelines for how
to do a platform in ARM. We are working on improving that further,
getting rid of the need for platform specific include/mach/*.h files
one by one, and moving the clock and timer code into drivers.

The infrastructure required for those changes is not there yet, and
I see no reason to keep the new platforms out while waiting for it.
If everything goes well, they will become even cleaner in 3.2 and can
serve as examples for the other platforms in the meantime, showing
everyone where we are headed with the move to device trees instead
of board files.

The OMAP4460 is the current platform from TI, it still largely follows
the same basic pattern as other OMAP, which I think is the best
we can do for that right now. There is a small conflict in
cm-regbits-44xx.h, my solution was

  +/* Renamed from DELTAMSTEP Used by CM_SSC_DELTAMSTEP_DPLL_USB */
  +#define OMAP4460_DELTAMSTEP_0_20_SHIFT                                0
  +#define OMAP4460_DELTAMSTEP_0_20_MASK                         (0x1fffff << 0)
  +
 - /* Used by CM_SHADOW_FREQ_CONFIG1, CM_SHADOW_FREQ_CONFIG1_RESTORE */
 - #define OMAP4430_DLL_OVERRIDE_SHIFT                           2
 - #define OMAP4430_DLL_OVERRIDE_MASK                            (1 << 2)
 + /* Used by CM_DLL_CTRL */
 + #define OMAP4430_DLL_OVERRIDE_SHIFT                           0
 + #define OMAP4430_DLL_OVERRIDE_MASK                            (1 << 0)
  
 - /* Renamed from DLL_OVERRIDE Used by CM_DLL_CTRL */
 - #define OMAP4430_DLL_OVERRIDE_0_0_SHIFT                               0
 - #define OMAP4430_DLL_OVERRIDE_0_0_MASK                                (1 << 0)
 + /* Renamed from DLL_OVERRIDE Used by CM_SHADOW_FREQ_CONFIG1 */
 + #define OMAP4430_DLL_OVERRIDE_2_2_SHIFT                               2
 + #define OMAP4430_DLL_OVERRIDE_2_2_MASK                                (1 << 2)

All the branches I'm sending today are also merged in the for-next
branch of arm-soc, so you can also pull that one instead if you get
bored by the conflicts and just want to have it all.

The diffstat below is the one I generated post-merge, for reasons
I still need to understand better, git-request-pull would otherwise
add the diffstat for the omap/cleanup branch that you have already
pulled.
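
For readers who have not met the _SHIFT/_MASK convention used in the conflict resolution quoted above, here is a short sketch of how such pairs are typically applied to a register value. The four constants are copied from the quoted diff; get_field and set_field are hypothetical helpers written for this example, not functions from the OMAP code.

```python
# Constants copied from the conflict resolution quoted above.
OMAP4460_DELTAMSTEP_0_20_SHIFT = 0
OMAP4460_DELTAMSTEP_0_20_MASK = 0x1FFFFF << 0
OMAP4430_DLL_OVERRIDE_2_2_SHIFT = 2
OMAP4430_DLL_OVERRIDE_2_2_MASK = 1 << 2


def get_field(reg, mask, shift):
    """Extract a bit field: keep the masked bits, then move them to bit 0."""
    return (reg & mask) >> shift


def set_field(reg, mask, shift, value):
    """Return reg with the field replaced by value; other bits untouched."""
    return (reg & ~mask) | ((value << shift) & mask)
```

The suffixes in the renamed macros (0_20, 2_2, 0_0) spell out which bits the field occupies, which is what lets two fields with the same name in different registers coexist.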

-Trond Myklebust has nfs client updates, Alex Elder has xfs updates
for -rc1, Chris Mason has btrfs updates (“Hi everyone,

The for-linus branch of the btrfs-unstable repo is ready for pulling:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git for-linus

This started off as a larger pull, but I had to pull out a number of
cleanups from Fujitsu, Novell and a few others (sorry guys) while
hunting a crash during stress.sh. It looks like it is unrelated to
those commits, but I had to pull out a bunch of them until I can be sure
I understand the bug.

I’ll have another pull request with the fully tested cleanups on Monday.
Depending on how Linus does rc1, they might end up as the start of my
3.2 branch.

This pull has the commits that I’ve been able to run through extensive
testing. The biggest change here is switching the btrfs tree locks to a
reader/writer lock. This has been one of our biggest bottlenecks for
some time, and it was consistently at the top of profiles on large
machines.

The new locks do away with all the adaptive spinning inside of btrfs and
rely on the spinning/blocking hints in the code to decide when it must
block.

The reader/writer locks break the code I had in place to use kmap on
metadata buffers, so all of our metadata is now in lowmem. I did test
this on a 32 bit VM, but x86-32 users will want to poke gently.

I also adapted Tejun’s lockdep fixes for the btrfs locks, and so far I
haven’t seen any lockdep warnings.

Josef has a series of enospc fixes and tweaks here as well. His bigger
patch to start reworking the enospc reservations seems to be causing the
corruptions during stress.sh, so it will wait for 3.2.”), Luis R. Rodriguez
announces the release of compat-wireless for Linux 3.0 (“Linus flushed out
Linux 3.0, the respective backport of that release
for the 802.11, Bluetooth and Networking subsystems is available now
[1]. Thanks for all the contributions, below are the compat.git and
compat-wireless.git contributions, for more details please refer to
the complete ChangeLog [2], and the stable compat-wireless page
[3]. I’ve compile tested this against 2.6.38 and loaded iwlagn
successfully.

To discuss if we want to expand this framework to include other
subsystems we can talk about it in person at the 2011 Linux Plumbers
conference [4], should the BoF proposal get accepted. Farewell 2.6.x
days.

===============================================
ChangeLog for compat-wireless for linux-3.0
===============================================

This is the ChangeLog for the Linux kernel project compat-wireless.
It provides a backport of a few Linux kernel subsystems down to
older kernels:

* 802.11
* Bluetooth
* Ethernet

For more details refer to the home page:

http://wireless.kernel.org/en/users/Download/stable/

The compat-wireless project consists of code from three projects:

* The Linux kernel: linux-2.6-allstable.git
* Compat-wireless: compat-wireless.git
* Compat: compat.git

The compat-wireless stable releases incorporate code from
each of these git trees for the respective upstream Linux kernel
stable release. A branch called linux-2.6.3x.y exists for each
stable release. Below we provide the ChangeLog of changes from
the previous branched release to the new branched release.”) and
James Morris has security-testing updates for 3.1 .

-David Miller has a pull request for sparc, enabling support for
Niagara-T3 processors, he also has a networking pull request :

1) GRO fragment handling fix from Herbert Xu.

2) Gratuitous ARP only gets emitted for first address on interface,
   we should emit them for all of them.  From Zoltan Kiss.

3) ipv6 /127 prefix handling needs more checking, from YOSHIFUJI Hideaki.

4) Fix VLAN regressions in gianfar and forcedeth, from Sebastian Pöhn
   and Jiri Pirko.

5) Fix various corruption bugs in B43 BCMA support, which can now be
   marked non-BROKEN.  From Pavel Roskin and Rafał Miłecki.

6) Not all device types can handle transmitting a shared SKB, as
   pktgen does in certain modes.  Track this capability with a
   flag and check it in pktgen.  Fix from Neil Horman.

7) tg3 driver 5719 4K RDMA limit workaround from Matt Carlson.

8) If cdc-phonet is the only USB net driver enabled, the build won't
   actually traverse down into drivers/net/usb due to a missing
   Makefile line.  Fix from Chris Clayton.

9) Bonding string parsing fix, plus quiet a less-than-useful noisy
   warning log message.  From Andy Gospodarek.

Please pull, thanks a lot! 

-Last minute news :
-Pekka Enberg – lockless SLUB slowpaths for -rc1 :

 This pull request has patches to make SLUB slowpaths lockless like we already did for the fastpaths. They have been sitting in linux-next for a while now and should be fine. David Rientjes reports improved performance:

  I ran slub/lockless through some stress testing and it seems to be quite
  stable on my testing cluster.  There is about a 2.3% performance
  improvement with the lockless slowpath on the netperf benchmark with
  various thread counts on my 16-core 64GB Opterons, so I'd recommend it to
  be merged into 3.1.

One possible gotcha, though, is that page struct gets bigger on x86_64. Hugh
Dickins writes:

  By the way, if you're thinking of lining up a pull request to Linus
  for 3.1, please make it very clear in that request that these changes
  enlarge the x86_64 struct page from 56 to 64 bytes, for slub alone.

  I remain very uneasy about that (love the cache alignment but...),
  the commit comment is rather vague about it, and I'm not sure that
  anyone else has noticed yet (akpm?).

  Given that Linus wouldn't let Kosaki add 4 bytes to the 32-bit
  vm_area_struct in 3.0, telling him about this upfront does not
  improve your chances that he will pull ;) but does protect you
  from his wrath when he'd later find it sneaked in.

We haven't come up with a solution to keep struct page size the same but I think it's a reasonable trade-off.

                        Pekka 
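
To get a feel for what growing struct page from 56 to 64 bytes costs, here is a back-of-the-envelope calculation. It assumes 4KB pages and one struct page per page of RAM; the function name is made up for this sketch, and the 64GB figure is simply the machine size from David Rientjes' report above.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2


def struct_page_overhead(ram_bytes, struct_size, page_size=4096):
    """Bytes of struct page metadata needed to describe ram_bytes of memory."""
    return (ram_bytes // page_size) * struct_size


before = struct_page_overhead(64 * GiB, 56)
after = struct_page_overhead(64 * GiB, 64)
print(before // MiB, after // MiB, (after - before) // MiB)
```

Under those assumptions a 64GB box goes from 896MB to 1024MB of page metadata, i.e. 128MB more, or roughly 0.2% of RAM.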

-Jesse Barnes – PCI changes
-Guenter Roeck – hwmon updates for 3.1

-This is it, boys and gals, enjoy!
