fiq-engine 1.2
This package shows how to use the FIQ (“fast irq”) mechanism offered by the ARM processor to run a custom task with minimal jitter and delay in activation time.
The timing source for the user task is any interrupt source for the target platform. This package includes support for the AT91SAM926x internal timer and the PXA270 internal timer. You can change the code to use another interrupt source by changing the cpu-specific code.
The included kernel code has been tested on linux-2.6.26 and linux-2.6.27 (although the PXA port has only been tested on 2.6.20). No other operating system kernels are supported, as this code is Linux-specific, but support for other kernel versions may be added if needed.
The included S50fiq script is an example of what I use in my
embedded systems to run this package. The script includes the
following commands, which you can copy and paste into your ARM shell:
insmod ./bprintk.ko
insmod ./sysctl-stamp.ko
insmod ./fiq-misc.ko
mknod /dev/fiqmisc c 10 68
insmod ./fiq-engine.ko fiq=1
insmod ./fiq-task.ko period=100
The fast irq, or FIQ, behaves somewhat like a non-maskable interrupt; it is part of the ARM CPU core and is thus available in all processors built around that architecture. More correctly, the core defines two different interrupts, the IRQ and the FIQ, which are independently maskable; however, as long as nobody masks the FIQ (and nothing in the Linux kernel does) it acts like a non-maskable interrupt, while still allowing protection of critical sections if the programmer needs it.
Most interrupt controllers (actually, all of them as far as I know) allow routing each interrupt source to either the normal or the fast IRQ. It is therefore possible to use this feature with whatever event source you want: a UART device, a CPU timer, an external signal.
Since the FIQ is not masked during normal system activity, the FIQ handler is not delayed by other interrupts, nor is it disabled during critical sections of the kernel, as spinlocks and the like only disable normal interrupts. This environment allows low latency and low jitter, but imposes serious limitations on what FIQ code can do. Basically, the FIQ handler can't share data with other code, nor can it call kernel functions (as they might access kernel data), because the FIQ can fire while the kernel is inside a critical section with interrupts disabled. Moreover, page faults must not happen in the FIQ handler, as the kernel is not ready to handle them. In practice, the situation is not unlike what happens when a real-time kernel (such as RTAI) coexists with the Linux kernel.
This package offers a few support modules and the glue code to fire a user-defined procedure based on a FIQ interrupt source. Support code includes a printk-like function, a time-stamping mechanism to keep track of your delays and jitters, and a shared-memory area for communication. The main code deals with the hairy details of installing the vector within the Linux interrupt management and register saving/restoring, calling a user-defined function when the FIQ triggers.
Before you load the modules you must patch the kernel, in order to
export access to the FIQ vector.
The patch is applied using “patch -p1” from the top-level
directory of the kernel sources. It is provided for 2.6.26 through 2.6.29, but the
2.6.26 patch applies as far back as linux-2.6.20, with only some offsets
in line numbers. This is an example of its use:
cd linux-2.6.27
patch -p1 < /path/to/fiq-engine-1.2/patches/linux-2.6.27-fiq.patch
make uImage
If you have git, the 2.6.28 and 2.6.29 patches have been created with git,
so you can “git am” them to your branch.
The patches modify the following files:
arch/arm/kernel/entry-armv.S: ... fiq_userptr symbol.
arch/arm/kernel/armksyms.c: ... fiq_userptr.
mm/vmalloc.c
To compile the code you should just run make, with the
LINUX variable pointing to your kernel sources, patched
as described. As usual, the
CROSS_COMPILE and ARCH variables should be set as well
(unless you compile natively). All variables can be set in the
environment or on the command line of make. Example:
export LINUX=/usr/src/linux-at91
make ARCH=arm CROSS_COMPILE=/opt/eldk-4.2/usr/bin/arm-linux-
No “make install” target is provided; you can load the modules from where they were compiled or move them by hand to another place of your choice.
Please note that the convention for platform-specific header files
changed over time, from “#include <asm/arch/...>” to
“#include <mach/...>”. In this package, at91 uses
the new convention while pxa uses the old one. You may need to fix
this yourself for your kernel version.
The package is made up of a few kernel modules, described one by one below.
The bprintk.ko module exports the bprintk function. You don't need to load it unless you use that function (most likely you'll use it at least during debugging).
The function is a printing function just like printk, but unlike printk it doesn't send anything to the console or to the kernel buffer. Instead, it uses its own buffer to host the strings until a safe time. A kernel timer is used to send such text to the real printk function.
If you need to use a printing function from within a FIQ handler, you can't call printk directly, as explained earlier, so this module offers a workaround to help debugging. Note, however, that if your system blocks in the FIQ handler, no bprintk can help, as the accumulated data won't have a way to reach the console.
Please note that calling printk from FIQ isn't guaranteed to fail: it might even work a number of times before failing or before triggering bugs that may be very difficult to track.
All you need to do with bprintk.ko is load it, or have it loaded automatically by modprobe if you arrange for that.
The sysctl-stamp.ko module offers a timestamping API, for use in your own modules.
Its stamping function is safe to call from a FIQ context. The timing source
being used is up to the caller; the module only offers the interface
to user space through /proc/sys/dev.
Use of this module is exemplified in fiq-task, where you can see the details in action. Moreover, the header file is well commented.
The module exports the following functions:
struct scstamp_table *scstamp_register(char *name,
struct scstamp_item *items, int nitems,
int num, int den, int avgweight);
void scstamp_one(struct scstamp_table *table, int offset, unsigned long count);
void scstamp_unregister(struct scstamp_table *table);
The first function registers a new directory in /proc/sys/dev,
under the name given by the name argument. The items and nitems arguments
define a list of files, each used to stamp a different event.
The function must be called from Linux context (not FIQ context).
For each file in the array, timestamps are reported as count values passed to scstamp_one. The num and den arguments are used to convert such values into human-readable stamps (for example, microseconds), while avgweight controls the running average. Each file in the array can declare its own values in the data structure passed to scstamp_register, whereas values passed as explicit arguments are used as defaults for the items that don't specify them. Conversion is performed by an integer multiplication by num followed by an integer division by den.
In each file four integer numbers can be read at any time by a user space program: the minimum, the average, the maximum and the last stamp reported. All numbers are converted as explained, and the average reported is a running average using avgweight as scale factor.
The scstamp_one function can be called from any context and adds a new timestamp (count) to the file (“item”) in position offset of the list registered with scstamp_register. The count is converted as explained and contributes to the average according to the avgweight in use.
When you are done with timestamping (for example, when your own module is unloaded), you can remove the files from /proc/sys/dev by calling scstamp_unregister from Linux context.
As noted above, fiq-task.c shows how to use the functions in practice.
The fiq-misc.ko module registers a misc device driver that exports a shared-memory area to user space. This memory area is meant to be accessed by both FIQ code and normal Linux code.
The module receives two optional parameters:
verbose= (default 0): activates more printk messages.
size= (default 64): the size in kilobytes of the memory area.
The device is created as a miscdevice with 68 as minor number
(actually, FIQMISC_MINOR in fiq-misc.h). If you are not
running udev you should create the device yourself:
mknod /dev/fiq-misc c 10 68
On the device node, the following file operations are supported:
The fiq-engine.ko module is the core of the FIQ implementation. It is made up of three object files: an assembly file with the register-saving part, the generic C source and a CPU-specific part.
The module receives two parameters:
verbose= (default 0): currently not used
fiq= (default 0): chooses whether to activate the FIQ
handler or a normal interrupt handler. The default is using
a normal interrupt handler to ease tracking bugs.
This module exports two symbols to other modules:
int fiq_register(void (*handler)(void), int irq);
int fiq_unregister(void (*handler)(void), int irq);
The handler being registered is invoked whenever the interrupt event
occurs. Depending on the fiq= argument, the event is delivered as a fast
interrupt or as a normal interrupt.
The CPU-specific engine is currently implemented for the AT91SAM92 family of processors (9260, 9261, 9263) and for the PXA270 CPU. To port it to a different CPU you need to provide the following low-level functions (either inline or real functions), using the existing code as a reference:
int __fiq_register(void (*handler)(void), int irq);
void __fiq_unregister(void (*handler)(void), int irq);
... fiq_use_fiq.
void __fiq_ack(void);
void __fiq_sched_next_irq(int usec);
unsigned long __GETSTAMP(void);
The fiq-task.ko example is released to the public domain. You are expected to write your own FIQ task based on this one.
The example, as released, fires a periodic task with a period of 250 microseconds (on either AT91SAM92 or PXA) and toggles a GPIO bit at each run; you thus get a 2kHz square wave on the pin. Note that while this example task is periodic, the FIQ module is designed to allow aperiodic tasks, as each invocation specifies how far in the future the next interrupt should fire.
The integer module parameter period= can be used to specify
the period, so you can change the default value of 250 usec. I used
it successfully down to 20 usec at 200MHz or 10 usec at 400MHz.
The module parameter bit= specifies which GPIO pin to toggle.
The default value is defined in fiq-at91sam92.h and
fiq-pxa.h with the macro FIQ_BITNR. For the numeric
definition of the bits on at91 see the relevant header files; on
the PXA the integer directly represents the GPIO number.
The example, in addition, registers four sysctl files that report the entry and exit times of the task, expressed both as raw timer counts and as microseconds.
The following hardware resources are used:
AT91SAM92: ... pin= is specified). The code has been used on
the 9260EK/9261EK boards and is tailored to their frequencies. The code
can use TC1 or TC2 if so configured, since the CPU has three
different interrupt sources for the three timers. Like above,
time lapses are calculated assuming a 3MHz counter speed.
PXA: ... pin= is specified)
and uses OSMR1 as a timer source. The code can work with
OSMR2 or OSMR3 (according to fiq-pxa.h),
while OSMR0 is used by Linux.
Other timers found in PXA270 and later are not supported.
The pin being toggled can be observed with an oscilloscope. Moreover, the
timestamps of task activation (and task completion) can be observed under
/proc/sys/dev:
# ls /proc/sys/dev/task/
entry--us entry-raw exit--us exit-raw
As described in section sysctl-stamp.ko, the files report minimum, average, maximum and last reported values. By writing to one of the files, the user can reset all counts.
The following example (which has been re-indented for readability)
shows how to read the stats, reset them, and read them again. As a
reminder, the entry files report the delay measured in the
activation time of the task, as both raw counts and microseconds, while
the exit files show the stamp when the task is done. On the PXA
processor, however, the exit time is not the time elapsed since the
event but rather the time remaining before the next event. This
was measured on a 9260 running 2.6.26 (“exit” time is elapsed time):
# cd /proc/sys/dev/task
# grep . *
entry--us: 1 1 4 2
exit--us: 2 2 9 5
entry-raw: 2 2 13 6
exit-raw: 6 6 29 15
# echo 0 > entry-raw; grep . *
entry--us: 1 1 4 1
exit--us: 3 3 9 5
entry-raw: 2 3 12 7
exit-raw: 8 9 28 17
These data points show how both task-activation time and task-completion time can vary by a factor of 3 to 4. This can be tracked down to a cache effect: if the task is not in cache when the FIQ triggers, all of the code takes more time than it does when it is already in cache. Activation time remains within a few microseconds of the timer trigger in any case, even with a high interrupt load, but please note that the FIQ is acknowledged before the time stamp is taken, and this eats up most of the measured time (see section fiq-empty.ko, later on).
As a comparison, this is what happens when the task uses the normal
interrupt (the first data set is taken on an idle machine, the second
under a ping flood, “ping -s 1024 -f”):
entry--us: 3 4 36 5
exit--us: 5 5 40 9
entry-raw: 10 10 113 14
exit-raw: 15 16 123 20
entry--us: 2 3 89 5
exit--us: 4 4 92 9
entry-raw: 7 8 275 32
exit-raw: 11 12 285 23
In this case, the delay in entering the task is on average a few microseconds more than in the FIQ case, but the jitter is definitely worse, with a measured delay of up to 36 microseconds on an unloaded system and up to 89 microseconds under ping flood. This activation jitter is the result of interference between different interrupt sources. Such interference doesn't happen with the fast irq, as a FIQ handler fires even while another interrupt is being serviced.
The fiq-empty.ko module was added in version 1.1 of this package. Together
with busy.ko it can be used to benchmark FIQ execution.
This module registers a periodic task just like fiq-task, and
it takes the same arguments: period= (default 250) and
pin= (default defined in header file as documented in section
fiq-task.ko). But unlike fiq-task, the pin is raised as soon as
the task is entered and lowered just before the task is exited. This
makes it possible to measure, using a scope, the time needed to acknowledge
the interrupt and schedule the next one. On the PXA I measured 6
microseconds (I hoped it would be much less).
The busy.ko module (added in version 1.1 of the package) is a simple Linux
module (not a fiq-related one) that continuously toggles a pin,
without ever releasing the CPU. Since interrupts (and FIQs) are not
disabled, normal device activity goes on but no process is scheduled.
Loading the module returns -EIO, so you don't need to rmmod it before
making another run.
The module takes two parameters: bit=, to select the GPIO bit
to toggle, and duration= (default: 5) to choose how many
seconds it should run.
By using a scope with two probes, you can look at both this GPIO bit
and the one toggled by fiq-task or fiq-empty and see the
overall overhead of the FIQ event, since the bit will not be switching
while the interrupt is being served. With fiq-task you'll see a
quiet period of a few microseconds before and after the bit change
(for sysctl stamping, acknowledging the event, scheduling the next
one). With fiq-empty you'll see almost no quiet period before
its own bit is raised and after it is lowered. This is because the CPU
executes very few instructions between the time it takes the FIQ event and
the time your own module gets control.
I have a few photographs; feel free to ask if you can't wire up your own scope and would like to better understand the idea. I might include them in a later release, but I have no time to clean them up right now.
The FIQ has been used in a few projects of mine, and code evolved while being supported by my clients. Most of them currently prefer to remain anonymous although they all agreed generic code would go public over time.
Initial work and the Samsung code were sponsored by Dataprocess Europe. I haven't yet cleaned up the Samsung code for publication.
Cleanup and documentation were sponsored by cori.it.
The PXA port was sponsored by BTicino SpA.