kernel news – 06.04.2013

Posted: May 6, 2013 in kernel

-Gleb Natapov announces KVM updates for the 3.10 merge window:

Highlights of the updates are:

– new emulated device API
– legacy device assignment is now optional
– irqfd interface is more generic and can be shared between arches
– VMCS shadow support and other nested VMX improvements
– APIC virtualization and Posted Interrupt hardware support
– Optimize mmio spte zapping
– BookE: in-kernel MPIC emulation with irqfd support
– Book3S: in-kernel XICS emulation (incomplete)
– Book3S: HV: migration fixes
– BookE: more debug support preparation
– BookE: e6500 support
– reworking of Hyp idmaps
– ioeventfd for virtio-ccw

And many other bug fixes, cleanups and improvements.

-There is also an MFD pull request from Samuel Ortiz targeting 3.10:

Hi Linus,

This is the MFD pull request for the 3.10 merge window. There is one merge
conflict with your tree, and I fixed it for reference in my mfd-3.10-merge

For 3.10 we have a few new MFD drivers for:

– The ChromeOS embedded controller which provides keyboard, battery and power
management services. This controller is accessible through i2c or SPI.

– Silicon Laboratories 476x controller, providing access to their FM chipset
and their audio codec.

– Realtek’s RTS5249, a memory stick, MMC and SD/SDIO PCI based reader.

– Nokia’s Tahvo power button and watchdog device. This device is very similar
to Retu and is thus supported by the same code base.

– STMicroelectronics STMPE1801, a keyboard and GPIO controller supported by
the stmpe driver.

– ST-Ericsson AB8540 and AB8505 power management and voltage converter
controllers through the existing ab8500 code.

Some other drivers got cleaned up or improved. In particular:

– The Linaro/STE guys got the ab8500 driver in sync with their internal code
through a series of optimizations, fixes and improvements.

– The AS3711 and OMAP USB drivers now have DT support.

– The arizona clock and interrupt handling code got improved.

– The wm5102 register patch and boot mechanism also got improved.

-Joerg Roedel has IOMMU updates:

The updates are mostly about the x86 IOMMUs this time. Exceptions are
the groundwork for the PAMU IOMMU from Freescale (for a PPC platform)
and an extension to the IOMMU group interface. On the x86 side this
includes a workaround for VT-d to disable interrupt remapping on broken
chipsets. On the AMD-Vi side the most important new feature is a kernel
command-line interface to override broken information in IVRS ACPI
tables and get interrupt remapping working this way. Besides that there
are small fixes all over the place.

-Ingo Molnar has timers-nohz updates:

This tree from Frederic Weisbecker adds a new, (exciting!) core kernel
feature to the timer and scheduler subsystems: ‘full dynticks’, or

This feature extends the nohz variable-size timer tick feature from idle
to busy CPUs (running at most one task) as well, potentially reducing the
number of timer interrupts significantly.

This feature got motivated by real-time folks and the -rt tree, but the
general utility and motivation of full-dynticks runs wider than that:

– HPC workloads get faster: CPUs running a single task should be able to
utilize a maximum amount of CPU power. A periodic timer tick at HZ=1000
can cause a constant overhead of up to 1.0%. This feature removes that
overhead – and speeds up the system by 0.5%-1.0% on typical distro
configs even on modern systems.

– Real-time workload latency reduction: CPUs running critical tasks
should experience as little jitter as possible. The last remaining
source of kernel-related jitter was the periodic timer tick.

– A single task executing on a CPU is a pretty common situation,
especially with an increasing number of cores/CPUs, so this feature
helps desktop and mobile workloads as well.
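
As a rough sanity check of the overhead figure quoted above, here is a minimal Python sketch of the arithmetic. The per-tick cost of 10 microseconds is purely an assumption picked for illustration (the announcement gives no per-tick number); with it, HZ=1000 works out to the 1.0% upper bound mentioned in the text.

```python
def tick_overhead_percent(hz: int, tick_cost_us: float) -> float:
    """Percentage of one CPU-second spent servicing timer ticks."""
    # hz ticks per second, each costing tick_cost_us microseconds,
    # out of 1_000_000 microseconds available per second.
    return hz * tick_cost_us * 100 / 1_000_000


# Assumption for illustration only, not a measured value.
ASSUMED_TICK_COST_US = 10.0

periodic = tick_overhead_percent(1000, ASSUMED_TICK_COST_US)
residual = tick_overhead_percent(1, ASSUMED_TICK_COST_US)

print(f"HZ=1000 periodic tick: {periodic:.3f}% of CPU time")
print(f"1 Hz residual tick:    {residual:.3f}% of CPU time")
```

The real cost per tick depends on the hardware and on what the tick handler has to do, so treat the numbers as an order-of-magnitude illustration only.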

The cost of the feature is mainly related to increased timer-reprogramming
overhead when a CPU switches its tick period, and thus slightly longer
to-idle and from-idle latency.

Configuration-wise a third mode of operation is added to the existing two
NOHZ kconfig modes:

– CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named as a
config option. This is the traditional Linux periodic tick design:
there’s a HZ tick going on all the time, regardless of whether a CPU is
idle or not.

– CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the
periodic tick when a CPU enters idle mode.

– CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the tick
when a CPU is idle, also slows the tick down to 1 Hz (one timer
interrupt per second) when only a single task is running on a CPU.
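
The three modes above can be summarized as a small decision function. This is only a toy model of the description in the announcement, not kernel code: the names `NohzMode` and `tick_rate_hz` are made up for this sketch, and a return value of 0 simply stands for "periodic tick turned off".

```python
from enum import Enum


class NohzMode(Enum):
    HZ_PERIODIC = "CONFIG_HZ_PERIODIC"
    NO_HZ_IDLE = "CONFIG_NO_HZ_IDLE"
    NO_HZ_FULL = "CONFIG_NO_HZ_FULL"


def tick_rate_hz(mode: NohzMode, nr_running: int, hz: int = 1000) -> int:
    """Approximate periodic tick rate on one CPU, per the text above."""
    if mode is NohzMode.HZ_PERIODIC:
        # Traditional design: the tick runs whether the CPU is idle or not.
        return hz
    if nr_running == 0:
        # Both NO_HZ modes turn the periodic tick off on an idle CPU.
        return 0
    if mode is NohzMode.NO_HZ_FULL and nr_running == 1:
        # Full dynticks: a single busy task gets a 1 Hz tick.
        return 1
    # Busy CPU in NO_HZ_IDLE, or two or more tasks in NO_HZ_FULL.
    return hz


for mode in NohzMode:
    rates = [tick_rate_hz(mode, n) for n in (0, 1, 2)]
    print(mode.value, rates)
```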

The .config behavior is compatible: existing !CONFIG_NO_HZ and
CONFIG_NO_HZ=y settings get translated to the new values, without the user
having to configure anything. CONFIG_NO_HZ_FULL is turned off by default.
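
The compatibility mapping described above can be sketched as follows. In the kernel this translation is handled by the Kconfig machinery itself; the helper name `translate_legacy_nohz` and the sample .config snippets are hypothetical and only mirror the mapping stated in the text.

```python
def translate_legacy_nohz(config_text: str) -> str:
    """Map an old-style CONFIG_NO_HZ setting to the new mode name."""
    for line in config_text.splitlines():
        # Old configs with the option enabled carry exactly this line.
        if line.strip() == "CONFIG_NO_HZ=y":
            return "CONFIG_NO_HZ_IDLE"
    # Option absent or "# CONFIG_NO_HZ is not set": the periodic tick.
    return "CONFIG_HZ_PERIODIC"


old_enabled = "CONFIG_SMP=y\nCONFIG_NO_HZ=y\n"
old_disabled = "CONFIG_SMP=y\n# CONFIG_NO_HZ is not set\n"

print(translate_legacy_nohz(old_enabled))
print(translate_legacy_nohz(old_disabled))
```

Note that neither input maps to CONFIG_NO_HZ_FULL, which matches the statement that the new mode is off by default.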

This feature is based on a lot of infrastructure work that has been
steadily going upstream in the last 2-3 cycles: related RCU support and
non-periodic cputime support in particular is upstream already.

This tree adds the final pieces and activates the feature. The pull
request is marked RFC because:

– it’s marked 64-bit only at the moment – the 32-bit support patch is
small but did not get ready in time.

– it has a number of fresh commits that came in after the merge window.
The overwhelming majority of commits are from before the merge window,
but still some aspects of the tree are fresh and so I marked it RFC.

– it’s a pretty wide-reaching feature with lots of effects – and while
the components have been in testing for some time, the full combination
is still not very widely used. That it’s default-off should reduce its
regression abilities and obviously there are no known regressions with
CONFIG_NO_HZ_FULL=y enabled either.

– the feature is not completely idempotent: there is no 100% equivalent
replacement for a periodic scheduler/timer tick. In particular there’s
ongoing work to map out and reduce its effects on scheduler
load-balancing and statistics. This should not impact correctness
though, there are no known regressions related to this feature at this
point.

– it’s a pretty ambitious feature that with time will likely be enabled
by most Linux distros, and we’d like you to make input on its
design/implementation, if you dislike some aspect we missed. Without
flaming us to crisp!

Future plans:

– there’s ongoing work to reduce 1Hz to 0Hz, to essentially shut
off the periodic tick altogether when there’s a single busy task on a
CPU. We’d first like 1 Hz to be exposed more widely before we go for
the 0 Hz target though.

– once we reach 0 Hz we can remove the periodic tick assumption from
nr_running>=2 as well, by essentially interrupting busy tasks only as
frequently as the sched_latency constraints require us to do – once
every 4-40 msecs, depending on nr_running.

I am personally leaning towards biting the bullet and doing this in v3.10,
like the -rt tree this effort has been going on for too long – but the
final word is up to you as usual.

More technical details can be found in Documentation/timers/NO_HZ.txt.

-pwm updates for -rc1 from Thierry Reding:

Nothing very exciting this time around. A couple of bug fixes and a lot
of cleanup across the board. The DaVinci 8xx family of SoCs now use the
same driver as the AM33xx family.

Many thanks to Axel Lin and Jingoo Han who have done a great job fixing
various bugs and inconsistencies.

