ID: 0022607
Project: FPC
Category: Compiler
View Status: public
Date Submitted: 2012-08-08 21:12
Last Update: 2013-11-10 13:17
Reporter: Thomas Schatzl
Assigned To: Thomas Schatzl
Priority: normal
Severity: crash
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: arm-linux (armhf)
OS: linux
OS Version: 3.6.0-rc1
Product Version: 2.7.1
Product Build:
Target Version: 2.7.1
Fixed in Version:
Summary: 0022607: arm-linux armhf crashes when initializing threads
Description: Any arm-linux armhf program that uses threads crashes immediately, apparently at thread initialization time. That is, as soon as threads are used, the program either hits an access violation or is killed by the kernel for using too much memory.
Steps To Reproduce: Compile arm-linux with ARMHF support, e.g. run

make zipinstall FPC=path_to_fpc OPT="-dFPC_ARMHF -Cvfpv3 -Cparmv7" FPMAKEOPT="-T 4"

and fpmake crashes almost immediately when compiling the packages because it uses multiple threads.

Alternatively, if you already have an armhf starting compiler (e.g. built by running the above command line without FPMAKEOPT), building with OPT=-CaEABIHF shows the same problems.

Testsuite programs which use threads (e.g. theapthread) have the same issue.
Tags: No tags attached.
FPCOldBugId: 0
Fixed in Revision: 22062
Attached Files:
strace.log (25,745 bytes) 2012-08-09 22:15
gdb3.log (6,424 bytes) 2012-08-09 22:16
gdb4.log (412 bytes) 2012-08-09 22:16
gdb5.log (4,740 bytes) 2012-08-09 22:16
0001-Test-kuser_cmpxchg-for-all-ARM-CPUs.patch (3,451 bytes) 2012-08-10 00:56
fpmake-multiple-threads.log (3,375 bytes) 2012-08-10 11:04
single-cpu-fpmake.log (3,090 bytes) 2012-08-10 11:44

-  Notes
(0061527)
Florian (administrator)
2012-08-08 22:05

On my Raspbian system, theapthread works fine. Could you provide an strace log?
(0061535)
Thomas Schatzl (developer)
2012-08-09 16:16

I'm trying on an SMP ARM machine, i.e. quad-core armv7.

It seems to be a memory corruption issue; I will provide strace and backtrace later.
(0061539)
Thomas Schatzl (developer)
2012-08-09 22:19

Seems to be a threading-related race condition which corrupts memory.

The attached strace.log shows a strace run of theapthread.
gdb3.log shows a gdb session of theapthread.
gdb4.log shows the output of theapthread run with the C memory manager (cmem).
gdb5.log shows a gdb session of theapthread with the C memory manager.

Note that this issue is 100% reproducible with any FPC program that uses threads, as far as I can tell; so it might not be directly related to armhf, but rather to threading.
(0061541)
Thomas Schatzl (developer)
2012-08-09 22:46

These logs are from an RTL and a theapthread program compiled with -O- -gl; the backtraces still look bogus.

"uname -a" gives

Linux linaro-ubuntu-desktop 3.6.0-rc1 0000005 SMP PREEMPT Tue Aug 7 13:39:13 KST 2012 armv7l armv7l armvx

btw.
(0061542)
Nico Erfurth (developer)
2012-08-10 00:58

I'm wondering if this could be a problem of missing memory barriers. Please try "0001-Test-kuser_cmpxchg-for-all-ARM-CPUs.patch". It changes the Interlocked* functions to always call the kernel user helpers, which use memory barriers. This is just for testing the theory.
(0061546)
Nico Erfurth (developer)
2012-08-10 09:34

If the patch doesn't change anything, please try disabling SMP support (nosmp or maxcpus=0 on your kernel command line). I currently don't have access to any ARM-MP device for testing.
(0061549)
Thomas Schatzl (developer)
2012-08-10 11:04

Does not change a thing. E.g. fpmake still crashes when compiling with multiple threads.

fpmake-multiple-threads.log shows the threads' backtraces.
(0061550)
Thomas Schatzl (developer)
2012-08-10 11:13

I believe that using the kernel cmpxchg methods instead of FPC's internal ones will not change anything if necessary memory barriers have been forgotten elsewhere in the FPC code.

I do not think they are defined to e.g. flush the write buffers in addition to doing the cmpxchg.

I also saw that you forgot to change a few of the Interlocked* functions; I am currently retrying with those changed as well.
(0061551)
Thomas Schatzl (developer)
2012-08-10 11:20

As expected, no change after also making InterlockedDecrement and InterlockedExchange "safe". Will try disabling SMP.
(0061552)
Nico Erfurth (developer)
2012-08-10 11:33

The kernel will issue a proper memory barrier when calling kuser_cmpxchg. See http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=0f82098c9bfe3618115dcc4e8b2d74d95c7ddcf6;hb=HEAD lines 948-956 and lines 884-886.

If that had helped, we could have implemented the necessary mnemonics in FPC, as it currently does not understand dmb & co.
(0061553)
Thomas Schatzl (developer)
2012-08-10 11:43

It still crashes on a single CPU too, as I'd expect. See single-cpu-fpmake.log.
(0061554)
Thomas Schatzl (developer)
2012-08-10 11:57

To complete the evaluation of whether this is an Interlocked* issue or not, can you provide another patch that also changes InterlockedDecrement and InterlockedExchange to kernel calls?

Although I think I changed them correctly, I'd be more confident if I could just patch and compile instead of hacking it myself (especially for InterlockedExchange).

Thx.
(0061556)
Nico Erfurth (developer)
2012-08-10 12:37

If it crashes even on a UP system, then the Interlocked* functions are most likely not to blame. Memory barriers make sense in 2 cases.

1.) Accessing other components on the bus.
2.) Sharing data with another processor (in fact just a special case of 1)

I'll try to get my ARMv7 UP System up and running over the weekend to run some tests.
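
To illustrate case 2, here is a minimal C11 sketch of sharing data between two processors. It is not taken from the FPC RTL; the names `payload`, `ready`, `publish` and `try_consume` are made up for this example. The release store and acquire load are where a compiler targeting ARMv7 would typically emit a barrier such as dmb; on a UP system the ordering problem they guard against does not arise in the same way.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: plain data plus a flag that publishes it. */
static int payload;
static atomic_bool ready = false;

/* Writer side: store the data, then set the flag with release ordering,
 * so the payload write cannot become visible after the flag. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: read the flag with acquire ordering; only if it is set
 * is it safe to read the payload. Returns false if nothing is published. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Without the release/acquire pair, another core could observe `ready` as set while still seeing a stale `payload`.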
(0061565)
Thomas Schatzl (developer)
2012-08-10 21:56
edited on: 2012-08-10 22:53

It seems much simpler than that: there seems to be some bug in the compiler/rtl for armv7 builds.

I.e. overriding the CPU type to armv6 with "-dFPC_ARMHF -Cparmv6" produces a non-crashing build. (The original issue, as I first noticed it, was that fpmake crashes when started with multiple threads.)

Edit: note that in this build I was using a newer fpc revision; actually the build also works with armv7 now. Seems to have been fixed by something else in the meantime. I will try to walk the svn revisions to reproduce the issue again.

(0061566)
Florian (administrator)
2012-08-10 23:11
edited on: 2012-08-10 23:12

Didn't you say on IRC that you tested something like armv6 already?

Edit: This further supports my suspicion that InterlockedExchange using swp is broken on armv7, e.g. because of broken swp emulation.

(0061567)
Jonas Maebe (manager)
2012-08-10 23:49

Regarding the earlier discussion about memory barriers: interlocked* is not supposed to cause a memory barrier. It doesn't hurt if it does cause a memory barrier (except for performance), but it's definitely not required/expected.
(0061568)
Thomas Schatzl (developer)
2012-08-11 00:25

@Florian: I agree with you that this may be due to a broken swp emulation, or to broken swp assembling for armv7, or to something completely different. I do not know.

If I manually add -Cparmv6 everything is fine (swp is not used). If I use -Cparmv5 it is broken again (swp is used). -Cparmv7 fails as well.

What I did yesterday was that I changed the four

"{$if defined(cpuarmv6) or defined(cpuarmv7m) or defined(cpucortexm3)} " lines

to

"{$if defined(cpuarmv6) or defined(cpuarmv7) or defined(cpuarmv7m) or defined(cpucortexm3)}"

in rtl/arm/arm.inc and compiled with -dFPC_ARMHF only (i.e. enabled -Cparmv7).

This did not work; however, I may have missed one of these clauses, i.e. made a copy&paste error or similar. I do not have the old changes anymore because I swapped OSes in the meantime, deleting everything in the process.

Anyway, these defines are getting way too unwieldy imo; that's why I was suggesting the CPUARM_HAS_LDREX define for such things on IRC.

I will investigate a little more. Maybe the swp emulation does not like swp instructions where the source and destination registers are the same. Or it does not like swp at all, who knows.

Note that the issue has definitely nothing to do with armhf: the same issue occurs on an armel chroot (on the same kernel of course).

@Jonas: that's also what I remember about the Interlocked* functions, hence my remark.
(0061574)
Thomas Schatzl (developer)
2012-08-11 21:36

Fixed in r22062.

The InterlockedExchange implementation was not synchronized with the other Interlocked* functions: it used a raw "swp".

This caused failures on the test machine, due to extreme timing differences because of the swp emulation on that armv7 machine.

It now uses kuser_cmpxchg if available, and otherwise the global system mutex, like the implementations of the other Interlocked* functions.

Set to "feedback" because maybe somebody has time and hardware to check the changes.
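
The shape of such a fix can be sketched in portable C11. This is not the code from r22062: `cmpxchg_helper` is a hypothetical stand-in that only mimics the kernel helper's convention of returning 0 on success, and `global_lock`, `have_cmpxchg_helper` and `interlocked_exchange` are names invented for this example. The point is that the exchange goes through the same compare-and-swap primitive (or the same lock) as the other operations instead of using its own raw instruction.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for kuser_cmpxchg: returns 0 if *ptr equalled
 * oldval and was replaced by newval, non-zero otherwise. */
static int cmpxchg_helper(int oldval, int newval, _Atomic int *ptr)
{
    return atomic_compare_exchange_strong(ptr, &oldval, newval) ? 0 : 1;
}

/* Hypothetical global lock used when no cmpxchg helper is available. */
static atomic_flag global_lock = ATOMIC_FLAG_INIT;

/* Assumption for this sketch: availability is decided once at startup. */
static bool have_cmpxchg_helper = true;

/* Atomically stores source into *target and returns the previous value. */
int interlocked_exchange(_Atomic int *target, int source)
{
    int old;

    if (have_cmpxchg_helper) {
        /* Retry until the value we read is the value we replaced. */
        do {
            old = atomic_load(target);
        } while (cmpxchg_helper(old, source, target) != 0);
        return old;
    }

    /* Fallback: serialize through the same lock as the other operations. */
    while (atomic_flag_test_and_set(&global_lock)) {
        /* spin until the lock is free */
    }
    old = atomic_load(target);
    atomic_store(target, source);
    atomic_flag_clear(&global_lock);
    return old;
}
```

Both paths return the old value, so callers cannot tell which one was taken.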
(0071243)
Jonas Maebe (manager)
2013-11-10 13:17

I did a successful "make all" and testsuite run on ARM hardfloat a while ago.

- Issue History
Date Modified Username Field Change
2012-08-08 21:12 Thomas Schatzl New Issue
2012-08-08 21:12 Thomas Schatzl FPCOldBugId => 0
2012-08-08 22:05 Florian Note Added: 0061527
2012-08-09 16:16 Thomas Schatzl Note Added: 0061535
2012-08-09 22:15 Thomas Schatzl File Added: strace.log
2012-08-09 22:16 Thomas Schatzl File Added: gdb3.log
2012-08-09 22:16 Thomas Schatzl File Added: gdb4.log
2012-08-09 22:16 Thomas Schatzl File Added: gdb5.log
2012-08-09 22:19 Thomas Schatzl Note Added: 0061539
2012-08-09 22:46 Thomas Schatzl Note Added: 0061541
2012-08-10 00:56 Nico Erfurth File Added: 0001-Test-kuser_cmpxchg-for-all-ARM-CPUs.patch
2012-08-10 00:58 Nico Erfurth Note Added: 0061542
2012-08-10 09:34 Nico Erfurth Note Added: 0061546
2012-08-10 11:04 Thomas Schatzl Note Added: 0061549
2012-08-10 11:04 Thomas Schatzl File Added: fpmake-multiple-threads.log
2012-08-10 11:13 Thomas Schatzl Note Added: 0061550
2012-08-10 11:20 Thomas Schatzl Note Added: 0061551
2012-08-10 11:33 Nico Erfurth Note Added: 0061552
2012-08-10 11:43 Thomas Schatzl Note Added: 0061553
2012-08-10 11:44 Thomas Schatzl File Added: single-cpu-fpmake.log
2012-08-10 11:47 Thomas Schatzl File Added: complete_kuser.patch
2012-08-10 11:57 Thomas Schatzl Note Added: 0061554
2012-08-10 11:58 Thomas Schatzl File Deleted: complete_kuser.patch
2012-08-10 12:37 Nico Erfurth Note Added: 0061556
2012-08-10 21:56 Thomas Schatzl Note Added: 0061565
2012-08-10 22:53 Thomas Schatzl Note Edited: 0061565
2012-08-10 23:11 Florian Note Added: 0061566
2012-08-10 23:12 Florian Note Edited: 0061566
2012-08-10 23:49 Jonas Maebe Note Added: 0061567
2012-08-11 00:25 Thomas Schatzl Note Added: 0061568
2012-08-11 21:32 Thomas Schatzl Status new => assigned
2012-08-11 21:32 Thomas Schatzl Assigned To => Thomas Schatzl
2012-08-11 21:36 Thomas Schatzl Note Added: 0061574
2012-08-11 21:36 Thomas Schatzl Status assigned => feedback
2012-08-11 21:36 Thomas Schatzl Target Version => 2.7.1
2013-11-10 13:17 Jonas Maebe Fixed in Revision => 22062
2013-11-10 13:17 Jonas Maebe Note Added: 0071243
2013-11-10 13:17 Jonas Maebe Status feedback => resolved
2013-11-10 13:17 Jonas Maebe Resolution open => fixed