SMP segmentation faults
Posted: Sun Jul 08, 2007 11:52 am
My last two work units terminated with similar errors at 67%. Here's what I copied from the terminal:
[13:06:32] Writing local files
[13:06:33] Completed 335000 out of 500000 steps (67 percent)
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
[13:23:53] CoreStatus = 0 (0)
[13:23:53] Client-core communications error: ERROR 0x0
[13:23:53] Deleting current work unit & continuing...
[0]0:Return code = 0, signaled with Quit
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 18
[13:28:21] - Preparing to get new work unit...
[13:28:21] + Attempting to get work packet
[13:28:21] - Connecting to assignment server
[13:28:21] - Successful: assigned to (171.64.65.56).
[13:28:21] + News From Folding@Home: Welcome to Folding@Home
[13:28:21] Loaded queue successfully.
[13:28:33] + Closed connections
[13:28:38]
[13:28:38] + Processing work unit
[13:28:38] Core required: FahCore_a1.exe
[13:28:38] Core found.
[13:28:38] Working on Unit 05 [July 8 13:28:38]
[13:28:38] + Working ...
[13:28:38]
[13:28:38] *------------------------------*
[13:28:38] Folding@Home Gromacs SMP Core
[13:28:38] Version 1.73 (November 27, 2006)
[13:28:38]
[13:28:38] Preparing to commence simulation
[13:28:38] - Ensuring status. Please wait.
[13:28:55] - Looking at optimizations...
[13:28:55] - Working with standard loops on this execution.
[13:28:55] - Previous termination of core was improper.
[13:28:55] - Going to use standard loops.
[13:28:55] - Files status OK
[13:28:57] (decompressed 537.8 percent)
[13:28:57] - Starting from initial work pa- Starting from initial work packet
[13:28:57]
[13:28:57] Project: 2608 (Run 0, Clone 40, Gen 30)
The client appears to keep running the same work unit and crashing at the same point (67%) each time. I'm going to stop the client, delete this work unit, and try to get a different one.
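For anyone who wants to check whether their own crashes land on the same step, here is a rough sketch of how the log could be scanned. It assumes the log lines look like the ones pasted above ("Completed N out of M steps" followed by "signaled with Segmentation fault"); the names `crash_points` and `STEP_RE` are made up for this example and are not part of the client.

```python
import re

# Matches progress lines such as:
# [13:06:33] Completed 335000 out of 500000 steps (67 percent)
STEP_RE = re.compile(r"Completed (\d+) out of (\d+) steps")

def crash_points(log_lines):
    """Return the last reported step count before each burst of segfault lines.

    Assumption: a crash shows up as one or more consecutive lines containing
    'Segmentation fault', as in the log pasted in this post.
    """
    points = []
    last_step = None      # most recent "Completed N" value seen so far
    in_crash = False      # True while we are inside a burst of segfault lines
    for line in log_lines:
        match = STEP_RE.search(line)
        if match:
            last_step = int(match.group(1))
            in_crash = False
        elif "Segmentation fault" in line:
            if not in_crash:
                points.append(last_step)  # record once per burst
                in_crash = True
        else:
            in_crash = False
    return points

sample = [
    "[13:06:32] Writing local files",
    "[13:06:33] Completed 335000 out of 500000 steps (67 percent)",
    "[0]0:Return code = 0, signaled with Segmentation fault",
    "[0]1:Return code = 0, signaled with Segmentation fault",
    "[13:23:53] CoreStatus = 0 (0)",
]
print(crash_points(sample))
```

If the same step count shows up for every crash, that would at least be consistent with the work unit failing at a fixed point, though it does not by itself rule out a hardware problem.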
Has this happened to anyone else? Is it my machine or a bad work unit?
Thanks