Jebu Ittiachen | 12 Jun 2012 07:21

Scheduler thread spins in futex_wait and sched_yield

Hi,

  I seem to have hit a weird bug in the Erlang scheduler. I'm running R15B01 on 64-bit Linux, with Erlang compiled with HiPE disabled. Erlang starts up with 4 scheduler threads and everything is OK for a while. After a period of time the CPU usage on the machine drops and things start to slow down. top -H shows 2 of the 4 threads running at around 15% and the other 2 at 95%. Normally all 4 threads show roughly the same CPU utilization. strace on the process shows the two sluggish threads alternating between calls to futex (FUTEX_WAIT_PRIVATE) and sched_yield, while the other two are doing a lot of other work.

  Here is a sample of the output from strace -f -p <pid> | grep <thread id>:

20292 sched_yield( <unfinished ...>
20292 <... sched_yield resumed> )       = 0
20292 sched_yield( <unfinished ...>
20292 <... sched_yield resumed> )       = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 futex(0x1bf2220, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
20292 <... futex resumed> )             = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
20292 sched_yield()                     = 0
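
  To compare threads in a longer capture, the calls can be tallied per thread ID. The following is only a rough sketch and not part of the original capture: the helper name count_syscalls and the regular expression are assumptions about the line format shown above, and the sample list just repeats a few lines from the trace.

```python
import re
from collections import Counter

# Matches lines where a call starts, such as
#   "20292 sched_yield()                     = 0"
#   "20292 futex(0x1bf2220, FUTEX_WAIT_PRIVATE, ... <unfinished ...>"
# Lines of the form "20292 <... sched_yield resumed> ) = 0" do not match,
# so an interrupted call is counted once, at the point where it starts.
CALL_RE = re.compile(r"^(\d+)\s+([a-z_0-9]+)\(")


def count_syscalls(lines):
    """Hypothetical helper: return {thread_id: Counter({syscall: count})}."""
    counts = {}
    for line in lines:
        match = CALL_RE.match(line)
        if match:
            tid, name = match.groups()
            counts.setdefault(tid, Counter())[name] += 1
    return counts


# A few lines copied from the trace above.
sample = [
    "20292 sched_yield( <unfinished ...>",
    "20292 <... sched_yield resumed> )       = 0",
    "20292 sched_yield()                     = 0",
    "20292 futex(0x1bf2220, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>",
    "20292 <... futex resumed> )             = 0",
]

for tid, calls in count_syscalls(sample).items():
    print(tid, dict(calls))
```

  Running this over the unfiltered strace -f output, rather than the grep'd sample, would give one tally per thread, which may make the difference between the sluggish and the busy scheduler threads easier to see.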

  My only way out of this at the moment is to restart the node, after which it runs happily for a while before the scheduler threads start dropping off again. I'd be happy to provide any further dumps or information that may be needed to get to the bottom of this.

Thanks
_______________________________________________
erlang-bugs mailing list
erlang-bugs <at> erlang.org
http://erlang.org/mailman/listinfo/erlang-bugs
