path: root/Documentation
diff options
authorLinus Torvalds <torvalds@linux-foundation.org>2014-08-05 17:46:42 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2014-08-05 17:46:42 -0700
commite7fda6c4c3c1a7d6996dd75fd84670fa0b5d448f (patch)
treedaa51c16462c318b890acf7f01fba5827275dd74 /Documentation
parent08d69a25714429850cf9ef71f22d8cdc9189d93f (diff)
parent953dec21aed4038464fec02f96a2f1b8701a5bce (diff)
Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and time updates from Thomas Gleixner: "A rather large update of timers, timekeeping & co - Core timekeeping code is year-2038 safe now for 32bit machines. Now we just need to fix all in kernel users and the gazillion of user space interfaces which rely on timespec/timeval :) - Better cache layout for the timekeeping internal data structures. - Proper nanosecond based interfaces for in kernel users. - Tree wide cleanup of code which wants nanoseconds but does hoops and loops to convert back and forth from timespecs. Some of it definitely belongs into the ugly code museum. - Consolidation of the timekeeping interface zoo. - A fast NMI safe accessor to clock monotonic for tracing. This is a long standing request to support correlated user/kernel space traces. With proper NTP frequency correction it's also suitable for correlation of traces accross separate machines. - Checkpoint/restart support for timerfd. - A few NOHZ[_FULL] improvements in the [hr]timer code. - Code move from kernel to kernel/time of all time* related code. - New clocksource/event drivers from the ARM universe. I'm really impressed that despite an architected timer in the newer chips SoC manufacturers insist on inventing new and differently broken SoC specific timers. [ Ed. "Impressed"? I don't think that word means what you think it means ] - Another round of code move from arch to drivers. Looks like most of the legacy mess in ARM regarding timers is sorted out except for a few obnoxious strongholds. - The usual updates and fixlets all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) timekeeping: Fixup typo in update_vsyscall_old definition clocksource: document some basic timekeeping concepts timekeeping: Use cached ntp_tick_length when accumulating error timekeeping: Rework frequency adjustments to work better w/ nohz timekeeping: Minor fixup for timespec64->timespec assignment ftrace: Provide trace clocks monotonic timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC seqcount: Add raw_write_seqcount_latch() seqcount: Provide raw_read_seqcount() timekeeping: Use tk_read_base as argument for timekeeping_get_ns() timekeeping: Create struct tk_read_base and use it in struct timekeeper timekeeping: Restructure the timekeeper some more clocksource: Get rid of cycle_last clocksource: Move cycle_last validation to core code clocksource: Make delta calculation a function wireless: ath9k: Get rid of timespec conversions drm: vmwgfx: Use nsec based interfaces drm: i915: Use nsec based interfaces timekeeping: Provide ktime_get_raw() hangcheck-timer: Use ktime_get_ns() ...
Diffstat (limited to 'Documentation')
10 files changed, 374 insertions, 2 deletions
diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index dd3f278faa8a..f2130586ef5d 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -54,7 +54,7 @@
<sect1><title>Wait queues and Wake events</title>
@@ -63,7 +63,7 @@
<sect1><title>High-resolution timers</title>
<sect1><title>Workqueues and Kevents</title>
diff --git a/Documentation/devicetree/bindings/timer/cirrus,clps711x-timer.txt b/Documentation/devicetree/bindings/timer/cirrus,clps711x-timer.txt
new file mode 100644
index 000000000000..cd55b52548e4
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/cirrus,clps711x-timer.txt
@@ -0,0 +1,29 @@
+* Cirrus Logic CLPS711X Timer Counter
+Required properties:
+- compatible: Shall contain "cirrus,clps711x-timer".
+- reg : Address and length of the register set.
+- interrupts: The interrupt number of the timer.
+- clocks : phandle of timer reference clock.
+Note: Each timer should have an alias correctly numbered in "aliases" node.
+ aliases {
+ timer0 = &timer1;
+ timer1 = &timer2;
+ };
+ timer1: timer@80000300 {
+ compatible = "cirrus,ep7312-timer", "cirrus,clps711x-timer";
+ reg = <0x80000300 0x4>;
+ interrupts = <8>;
+ clocks = <&clks 5>;
+ };
+ timer2: timer@80000340 {
+ compatible = "cirrus,ep7312-timer", "cirrus,clps711x-timer";
+ reg = <0x80000340 0x4>;
+ interrupts = <9>;
+ clocks = <&clks 6>;
+ };
diff --git a/Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt b/Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt
new file mode 100644
index 000000000000..7c4408ff4b83
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt
@@ -0,0 +1,17 @@
+Mediatek MT6577, MT6572 and MT6589 Timers
+Required properties:
+- compatible: Should be "mediatek,mt6577-timer"
+- reg: Should contain location and length for timers register.
+- clocks: Clocks driving the timer hardware. This list should include two
+ clocks. The order is system clock and as second clock the RTC clock.
+ timer@10008000 {
+ compatible = "mediatek,mt6577-timer";
+ reg = <0x10008000 0x80>;
+ interrupts = <GIC_SPI 113 IRQ_TYPE_LEVEL_LOW>;
+ clocks = <&system_clk>, <&rtc_clk>;
+ };
diff --git a/Documentation/devicetree/bindings/timer/renesas,cmt.txt b/Documentation/devicetree/bindings/timer/renesas,cmt.txt
new file mode 100644
index 000000000000..a17418b0ece3
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/renesas,cmt.txt
@@ -0,0 +1,47 @@
+* Renesas R-Car Compare Match Timer (CMT)
+The CMT is a multi-channel 16/32/48-bit timer/counter with configurable clock
+inputs and programmable compare match.
+Channels share hardware resources but their counter and compare match value
+are independent. A particular CMT instance can implement only a subset of the
+channels supported by the CMT model. Channel indices represent the hardware
+position of the channel in the CMT and don't match the channel numbers in the
+Required Properties:
+ - compatible: must contain one of the following.
+ - "renesas,cmt-32" for the 32-bit CMT
+ (CMT0 on sh7372, sh73a0 and r8a7740)
+ - "renesas,cmt-32-fast" for the 32-bit CMT with fast clock support
+ (CMT[234] on sh7372, sh73a0 and r8a7740)
+ - "renesas,cmt-48" for the 48-bit CMT
+ (CMT1 on sh7372, sh73a0 and r8a7740)
+ - "renesas,cmt-48-gen2" for the second generation 48-bit CMT
+ (CMT[01] on r8a73a4, r8a7790 and r8a7791)
+ - reg: base address and length of the registers block for the timer module.
+ - interrupts: interrupt-specifier for the timer, one per channel.
+ - clocks: a list of phandle + clock-specifier pairs, one for each entry
+ in clock-names.
+ - clock-names: must contain "fck" for the functional clock.
+ - renesas,channels-mask: bitmask of the available channels.
+Example: R8A7790 (R-Car H2) CMT0 node
+ CMT0 on R8A7790 implements hardware channels 5 and 6 only and names
+ them channels 0 and 1 in the documentation.
+ cmt0: timer@ffca0000 {
+ compatible = "renesas,cmt-48-gen2";
+ reg = <0 0xffca0000 0 0x1004>;
+ interrupts = <0 142 IRQ_TYPE_LEVEL_HIGH>,
+ clocks = <&mstp1_clks R8A7790_CLK_CMT0>;
+ clock-names = "fck";
+ renesas,channels-mask = <0x60>;
+ };
diff --git a/Documentation/devicetree/bindings/timer/renesas,mtu2.txt b/Documentation/devicetree/bindings/timer/renesas,mtu2.txt
new file mode 100644
index 000000000000..917453f826bc
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/renesas,mtu2.txt
@@ -0,0 +1,39 @@
+* Renesas R-Car Multi-Function Timer Pulse Unit 2 (MTU2)
+The MTU2 is a multi-purpose, multi-channel timer/counter with configurable
+clock inputs and programmable compare match.
+Channels share hardware resources but their counter and compare match value
+are independent. The MTU2 hardware supports five channels indexed from 0 to 4.
+Required Properties:
+ - compatible: must contain "renesas,mtu2"
+ - reg: base address and length of the registers block for the timer module.
+ - interrupts: interrupt specifiers for the timer, one for each entry in
+ interrupt-names.
+ - interrupt-names: must contain one entry named "tgi?a" for each enabled
+ channel, where "?" is the channel index expressed as one digit from "0" to
+ "4".
+ - clocks: a list of phandle + clock-specifier pairs, one for each entry
+ in clock-names.
+ - clock-names: must contain "fck" for the functional clock.
+Example: R7S72100 (RZ/A1H) MTU2 node
+ mtu2: timer@fcff0000 {
+ compatible = "renesas,mtu2";
+ reg = <0xfcff0000 0x400>;
+ interrupts = <0 139 IRQ_TYPE_LEVEL_HIGH>,
+ interrupt-names = "tgi0a", "tgi1a", "tgi2a", "tgi3a", "tgi4a";
+ clocks = <&mstp3_clks R7S72100_CLK_MTU2>;
+ clock-names = "fck";
+ };
diff --git a/Documentation/devicetree/bindings/timer/renesas,tmu.txt b/Documentation/devicetree/bindings/timer/renesas,tmu.txt
new file mode 100644
index 000000000000..425d0c5f4aee
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/renesas,tmu.txt
@@ -0,0 +1,39 @@
+* Renesas R-Car Timer Unit (TMU)
+The TMU is a 32-bit timer/counter with configurable clock inputs and
+programmable compare match.
+Channels share hardware resources but their counter and compare match value
+are independent. The TMU hardware supports up to three channels.
+Required Properties:
+ - compatible: must contain "renesas,tmu"
+ - reg: base address and length of the registers block for the timer module.
+ - interrupts: interrupt-specifier for the timer, one per channel.
+ - clocks: a list of phandle + clock-specifier pairs, one for each entry
+ in clock-names.
+ - clock-names: must contain "fck" for the functional clock.
+Optional Properties:
+ - #renesas,channels: number of channels implemented by the timer, must be 2
+ or 3 (if not specified the value defaults to 3).
+Example: R8A7779 (R-Car H1) TMU0 node
+ tmu0: timer@ffd80000 {
+ compatible = "renesas,tmu";
+ reg = <0xffd80000 0x30>;
+ interrupts = <0 32 IRQ_TYPE_LEVEL_HIGH>,
+ clocks = <&mstp0_clks R8A7779_CLK_TMU0>;
+ clock-names = "fck";
+ #renesas,channels = <3>;
+ };
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 97c9c06132c4..d415b38ec8ca 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -78,6 +78,7 @@ lsi LSI Corp. (LSI Logic)
lltc Linear Technology Corporation
marvell Marvell Technology Group Ltd.
maxim Maxim Integrated Products
+mediatek MediaTek Inc.
micrel Micrel Inc.
microchip Microchip Technology Inc.
mosaixtech Mosaix Technologies, Inc.
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index ddc531a74d04..eb8a10e22f7c 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1743,6 +1743,25 @@ pair provide additional information particular to the objects they represent.
While the first three lines are mandatory and always printed, the rest is
optional and may be omitted if no marks created yet.
+ Timerfd files
+ ~~~~~~~~~~~~~
+ pos: 0
+ flags: 02
+ mnt_id: 9
+ clockid: 0
+ ticks: 0
+ settime flags: 01
+ it_value: (0, 49406829)
+ it_interval: (1, 0)
+ where 'clockid' is the clock type and 'ticks' is the number of the timer expirations
+ that have occurred [see timerfd_create(2) for details]. 'settime flags' are
+ flags in octal form been used to setup the timer [see timerfd_settime(2) for
+ details]. 'it_value' is remaining time until the timer exiration.
+ 'it_interval' is the interval for the timer. Note the timer might be set up
+ with TIMER_ABSTIME option which will be shown in 'settime flags', but 'it_value'
+ still exhibits timer's remaining time.
Configuring procfs
diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
index 6d042dc1cce0..ee212a27772f 100644
--- a/Documentation/timers/00-INDEX
+++ b/Documentation/timers/00-INDEX
@@ -12,6 +12,8 @@ Makefile
- Build and link hpet_example
- Summary of the different methods for the scheduler clock-interrupts management.
+ - Clock sources, clock events, sched_clock() and delay timer notes
- how to insert delays in the kernel the right (tm) way.
diff --git a/Documentation/timers/timekeeping.txt b/Documentation/timers/timekeeping.txt
new file mode 100644
index 000000000000..f3a8cf28f802
--- /dev/null
+++ b/Documentation/timers/timekeeping.txt
@@ -0,0 +1,179 @@
+Clock sources, Clock events, sched_clock() and delay timers
+This document tries to briefly explain some basic kernel timekeeping
+abstractions. It partly pertains to the drivers usually found in
+drivers/clocksource in the kernel tree, but the code may be spread out
+across the kernel.
+If you grep through the kernel source you will find a number of architecture-
+specific implementations of clock sources, clockevents and several likewise
+architecture-specific overrides of the sched_clock() function and some
+delay timers.
+To provide timekeeping for your platform, the clock source provides
+the basic timeline, whereas clock events shoot interrupts on certain points
+on this timeline, providing facilities such as high-resolution timers.
+sched_clock() is used for scheduling and timestamping, and delay timers
+provide an accurate delay source using hardware counters.
+Clock sources
+The purpose of the clock source is to provide a timeline for the system that
+tells you where you are in time. For example issuing the command 'date' on
+a Linux system will eventually read the clock source to determine exactly
+what time it is.
+Typically the clock source is a monotonic, atomic counter which will provide
+n bits which count from 0 to 2^(n-1) and then wraps around to 0 and start over.
+It will ideally NEVER stop ticking as long as the system is running. It
+may stop during system suspend.
+The clock source shall have as high resolution as possible, and the frequency
+shall be as stable and correct as possible as compared to a real-world wall
+clock. It should not move unpredictably back and forth in time or miss a few
+cycles here and there.
+It must be immune to the kind of effects that occur in hardware where e.g.
+the counter register is read in two phases on the bus lowest 16 bits first
+and the higher 16 bits in a second bus cycle with the counter bits
+potentially being updated in between leading to the risk of very strange
+values from the counter.
+When the wall-clock accuracy of the clock source isn't satisfactory, there
+are various quirks and layers in the timekeeping code for e.g. synchronizing
+the user-visible time to RTC clocks in the system or against networked time
+servers using NTP, but all they do basically is update an offset against
+the clock source, which provides the fundamental timeline for the system.
+These measures does not affect the clock source per se, they only adapt the
+system to the shortcomings of it.
+The clock source struct shall provide means to translate the provided counter
+into a nanosecond value as an unsigned long long (unsigned 64 bit) number.
+Since this operation may be invoked very often, doing this in a strict
+mathematical sense is not desirable: instead the number is taken as close as
+possible to a nanosecond value using only the arithmetic operations
+multiply and shift, so in clocksource_cyc2ns() you find:
+ ns ~= (clocksource * mult) >> shift
+You will find a number of helper functions in the clock source code intended
+to aid in providing these mult and shift values, such as
+clocksource_khz2mult(), clocksource_hz2mult() that help determine the
+mult factor from a fixed shift, and clocksource_register_hz() and
+clocksource_register_khz() which will help out assigning both shift and mult
+factors using the frequency of the clock source as the only input.
+For real simple clock sources accessed from a single I/O memory location
+there is nowadays even clocksource_mmio_init() which will take a memory
+location, bit width, a parameter telling whether the counter in the
+register counts up or down, and the timer clock rate, and then conjure all
+necessary parameters.
+Since a 32-bit counter at say 100 MHz will wrap around to zero after some 43
+seconds, the code handling the clock source will have to compensate for this.
+That is the reason why the clock source struct also contains a 'mask'
+member telling how many bits of the source are valid. This way the timekeeping
+code knows when the counter will wrap around and can insert the necessary
+compensation code on both sides of the wrap point so that the system timeline
+remains monotonic.
+Clock events
+Clock events are the conceptual reverse of clock sources: they take a
+desired time specification value and calculate the values to poke into
+hardware timer registers.
+Clock events are orthogonal to clock sources. The same hardware
+and register range may be used for the clock event, but it is essentially
+a different thing. The hardware driving clock events has to be able to
+fire interrupts, so as to trigger events on the system timeline. On an SMP
+system, it is ideal (and customary) to have one such event driving timer per
+CPU core, so that each core can trigger events independently of any other
+You will notice that the clock event device code is based on the same basic
+idea about translating counters to nanoseconds using mult and shift
+arithmetic, and you find the same family of helper functions again for
+assigning these values. The clock event driver does not need a 'mask'
+attribute however: the system will not try to plan events beyond the time
+horizon of the clock event.
+In addition to the clock sources and clock events there is a special weak
+function in the kernel called sched_clock(). This function shall return the
+number of nanoseconds since the system was started. An architecture may or
+may not provide an implementation of sched_clock() on its own. If a local
+implementation is not provided, the system jiffy counter will be used as
+As the name suggests, sched_clock() is used for scheduling the system,
+determining the absolute timeslice for a certain process in the CFS scheduler
+for example. It is also used for printk timestamps when you have selected to
+include time information in printk for things like bootcharts.
+Compared to clock sources, sched_clock() has to be very fast: it is called
+much more often, especially by the scheduler. If you have to do trade-offs
+between accuracy compared to the clock source, you may sacrifice accuracy
+for speed in sched_clock(). It however requires some of the same basic
+characteristics as the clock source, i.e. it should be monotonic.
+The sched_clock() function may wrap only on unsigned long long boundaries,
+i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
+after circa 585 years. (For most practical systems this means "never".)
+If an architecture does not provide its own implementation of this function,
+it will fall back to using jiffies, making its maximum resolution 1/HZ of the
+jiffy frequency for the architecture. This will affect scheduling accuracy
+and will likely show up in system benchmarks.
+The clock driving sched_clock() may stop or reset to zero during system
+suspend/sleep. This does not matter to the function it serves of scheduling
+events on the system. However it may result in interesting timestamps in
+The sched_clock() function should be callable in any context, IRQ- and
+NMI-safe and return a sane value in any context.
+Some architectures may have a limited set of time sources and lack a nice
+counter to derive a 64-bit nanosecond value, so for example on the ARM
+architecture, special helper functions have been created to provide a
+sched_clock() nanosecond base from a 16- or 32-bit counter. Sometimes the
+same counter that is also used as clock source is used for this purpose.
+On SMP systems, it is crucial for performance that sched_clock() can be called
+independently on each CPU without any synchronization performance hits.
+Some hardware (such as the x86 TSC) will cause the sched_clock() function to
+drift between the CPUs on the system. The kernel can work around this by
+enabling the CONFIG_HAVE_UNSTABLE_SCHED_CLOCK option. This is another aspect
+that makes sched_clock() different from the ordinary clock source.
+Delay timers (some architectures only)
+On systems with variable CPU frequency, the various kernel delay() functions
+will sometimes behave strangely. Basically these delays usually use a hard
+loop to delay a certain number of jiffy fractions using a "lpj" (loops per
+jiffy) value, calibrated on boot.
+Let's hope that your system is running on maximum frequency when this value
+is calibrated: as an effect when the frequency is geared down to half the
+full frequency, any delay() will be twice as long. Usually this does not
+hurt, as you're commonly requesting that amount of delay *or more*. But
+basically the semantics are quite unpredictable on such systems.
+Enter timer-based delays. Using these, a timer read may be used instead of
+a hard-coded loop for providing the desired delay.
+This is done by declaring a struct delay_timer and assigning the appropriate
+function pointers and rate settings for this delay timer.
+This is available on some architectures like OpenRISC or ARM.

Privacy Policy