12 Jun 2012 07:21
Scheduler thread spins in futex_wait and sched_yield
Jebu Ittiachen <jebu.ittiachen <at> gmail.com>
2012-06-12 05:21:50 GMT
Hi,
I seem to have hit a weird bug in the Erlang scheduler. I'm running R15B01 on 64-bit Linux, with Erlang compiled with HiPE disabled. Erlang starts up with 4 scheduler threads and everything is fine for a while. After some time, CPU usage on the machine drops and things slow down. top -H shows 2 of the 4 threads running at around 15% and the other 2 at 95%; normally all 4 threads show roughly the same CPU utilization. strace on the process shows the two sluggish threads alternating between calls to futex (FUTEX_WAIT_PRIVATE) and sched_yield, while the other two are doing a lot of other work.
Here is a sample of strace -f -p <pid> |grep <thread id> for one of the sluggish threads:
20292 sched_yield( <unfinished ...>
20292 <... sched_yield resumed> ) = 0
20292 sched_yield( <unfinished ...>
20292 <... sched_yield resumed> ) = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 futex(0x1bf2220, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
20292 <... futex resumed> ) = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
20292 sched_yield() = 0
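In case it helps to put numbers on the pattern above, here is a minimal sketch (not part of my original capture) that tallies syscalls per thread from strace -f output. The script name tally.py and the function name tally_syscalls are just illustrative, and it assumes the "<tid> <syscall>(..." line layout shown in the sample.

```python
import re
import sys
from collections import Counter, defaultdict

# Matches lines such as "20292 sched_yield() = 0" or
# "20292 futex(0x1bf2220, FUTEX_WAIT_PRIVATE, ...".
# Lines like "20292 <... sched_yield resumed> ) = 0" do not match, so a call
# split into an unfinished/resumed pair is counted only once.
CALL_RE = re.compile(r"^(\d+)\s+([a-z_0-9]+)\(")


def tally_syscalls(lines):
    """Return {thread_id: Counter({syscall_name: count})} for strace -f output."""
    counts = defaultdict(Counter)
    for line in lines:
        match = CALL_RE.match(line.strip())
        if match:
            tid, name = match.groups()
            counts[tid][name] += 1
    return counts


if __name__ == "__main__":
    # Read strace output from stdin and print one summary line per thread.
    for tid, per_call in sorted(tally_syscalls(sys.stdin).items()):
        summary = ", ".join(f"{name}={n}" for name, n in per_call.most_common())
        print(f"{tid}: {summary}")
```

Since strace writes its trace to stderr, it could be fed with something like strace -f -p <pid> 2>&1 | python tally.py. A thread whose tally is almost entirely sched_yield with the occasional futex would match what I am seeing on the two sluggish schedulers.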
My only way out of this at the moment is to restart the node, after which it runs happily for a while before scheduler threads start dropping off again. I'd be happy to provide any further dumps or info that may be needed to get to the bottom of this.
Thanks
_______________________________________________ erlang-bugs mailing list erlang-bugs <at> erlang.org http://erlang.org/mailman/listinfo/erlang-bugs