Xoroshigo2 v1.1 Issues Megathread
Author | Message |
---|---|
Joined: 15 Jun 20 Posts: 58 Credit: 123,550,555 RAC: 1,386,402 |
Hi everyone, If you encounter a problem with xoroshigo2 (v1.1), please chime in on this thread. Helpful info for us to have: 1. Example task where the issue occurred 2. The contents of stderr_file (can be found in the slot directory running the app) 3. The contents of stdout_file (can be found in the slot directory running the app) Please include as much of this as possible as this gives us the greatest chance of diagnosing your particular issue and publishing a fix. Thank you! |
Joined: 25 Jun 20 Posts: 11 Credit: 204,801,691 RAC: 851,299 |
Sent: 4 Apr 2025, 15:06:59 UTC; Reported: 4 Apr 2025, 23:44:40 UTC; Status: Error while computing; Run time (sec): 2,193.89; CPU time (sec): 2,078.34; Credit: ---; Application: Xoroshiro128++ Guessing Order Optimization v1.1 v1.02; Stderr output |
Joined: 24 Jun 20 Posts: 29 Credit: 1,023,279,541 RAC: 10,506,233 |
I have nine Raspberry Pis or Pi clones. Eight of the nine are completing tasks just fine, but this one gets nothing but errors: https://minecraftathome.com/minecrafthome/show_host_detail.php?hostid=28336 Here is a sample of the Stderr output: <core_client_version>7.18.1</core_client_version> <![CDATA[ <message> process exited with code 195 (0xc3, -61)</message> <stderr_txt> 2025-04-05 01:04:50 (14416): wrapper (8.1.26018): starting 2025-04-05 01:04:50 (14416): wrapper: running ./client-linux-arm64.bin ( config-004-hixorlo-fullinfo-rank080.npz 30000000 3720 input.npz) 2025-04-05 01:04:50 (14416): wrapper: created child process 14418 2025-04-05 01:04:51 (14416): ./client-linux-arm64.bin exited; CPU time 0.428000 2025-04-05 01:04:51 (14416): app exit status: 0x4 2025-04-05 01:04:51 (14416): called boinc_finish(195) </stderr_txt> ]]> Any ideas about what the problem is? |
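A side note on reading that log line: the three numbers in `process exited with code 195 (0xc3, -61)` are the same value shown as decimal, hex, and a signed 8-bit integer. A minimal Python sketch of that arithmetic (the helper name `describe_exit_code` is made up for illustration and is not part of BOINC):

```python
def describe_exit_code(code):
    """Return the decimal, hex, and signed 8-bit views of an exit code."""
    unsigned = code & 0xFF  # exit codes are reported as a single byte
    # Values 128..255 wrap around to negative numbers when read as signed.
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return unsigned, hex(unsigned), signed


print(describe_exit_code(195))  # (195, '0xc3', -61)
```

As far as I know, BOINC's `error_numbers.h` names 195 `EXIT_CHILD_FAILED`, which fits the log: the child process exited with status 0x4 and the wrapper then called `boinc_finish(195)`. So the useful detail is whatever the child itself wrote to stderr, not the wrapper's exit code.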
Joined: 15 Jun 20 Posts: 58 Credit: 123,550,555 RAC: 1,386,402 |
It's hard to say what's wrong without the contents of stderr_file in a slot directory that failed. Can you find that and share it here? |
Joined: 13 Jan 25 Posts: 4 Credit: 1,912,500 RAC: 153,295 |
"It's hard to say what's wrong without the contents of stderr_file in a slot directory that failed." It can be difficult to get files from the slot directory after a failure, since they are all deleted the instant the task finishes. It might be easier to instruct users how to run the app offline, so that the files aren't deleted by BOINC when the task finishes. |
Joined: 21 Jul 20 Posts: 16 Credit: 5,767,364 RAC: 11,581 |
"... Can you find that and share it here?" I have had the same experience as Ian: the contents of the slots directory disappear almost instantly after a crash. Detailed instructions for running the app manually OUTSIDE OF BOINC would be appreciated so we can help you out. |
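Until offline instructions exist, one workaround is to copy the files out of the slot directories while the task is still running. A minimal sketch in Python, under several assumptions: the slots path is the default for the Debian/Ubuntu `boinc-client` package (other installs differ), the backup location and function names are made up for illustration, and the account running it can read the slot directories.

```python
import shutil
import time
from pathlib import Path

# Assumption: default data directory of the Debian/Ubuntu boinc-client package.
SLOTS_DIR = Path("/var/lib/boinc-client/slots")
# Hypothetical backup location; change to taste.
BACKUP_DIR = Path.home() / "boinc_slot_backup"
# File names requested in the first post of this thread.
WANTED = ("stderr_file", "stdout_file")


def snapshot_slots(slots_dir, backup_dir, wanted=WANTED):
    """Copy the wanted files from every slot into backup_dir/<slot>/.

    Returns the list of destination paths written on this pass.
    """
    copied = []
    if not slots_dir.is_dir():
        return copied
    for slot in sorted(slots_dir.iterdir()):
        if not slot.is_dir():
            continue
        for name in wanted:
            src = slot / name
            try:
                if not src.is_file():
                    continue
                dest_dir = backup_dir / slot.name
                dest_dir.mkdir(parents=True, exist_ok=True)
                dest = dest_dir / name
                shutil.copy2(src, dest)
                copied.append(dest)
            except OSError:
                # BOINC may clean the slot mid-copy; skip and retry next pass.
                continue
    return copied


def watch(interval=1.0):
    """Poll forever, keeping the latest copy of each wanted file."""
    while True:
        snapshot_slots(SLOTS_DIR, BACKUP_DIR)
        time.sleep(interval)
```

Calling `watch()` from a terminal keeps the most recent copy of each file per slot. Slots are reused, so the next task in the same slot overwrites the backup; stop the loop and grab the files as soon as a task errors out. A task that dies in under a second (like the 0.428 s one above) can still slip between polls, so a shorter `interval` may be needed.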
Joined: 24 Jun 20 Posts: 9 Credit: 118,033,409 RAC: 1,552,712 |
expect an app update that fixes this "soon" |
Joined: 8 Mar 21 Posts: 69 Credit: 772,295,473 RAC: 9,510,264 |
I've tried Ian's test package with the stock binary on my Jetson TX2-NX SoC, but it fails instantly and doesn't leave any stderr.txt file in the test directory for debugging. It just leaves the downloaded numpy support directory. I'm now testing his fixed test binary, which is currently running, and will report back when it finishes. |
Joined: 24 Jun 20 Posts: 9 Credit: 118,033,409 RAC: 1,552,712 |
my odroids are now running version 1.03 of the v1.1 app and it looks good |
Joined: 8 Mar 21 Posts: 69 Credit: 772,295,473 RAC: 9,510,264 |
My test task using the new test application just finished after 2 hours 20 minutes on my Nvidia Jetson TX2-NX. Contents of output.txt: keith@tx2-nx:~/Downloads/test$ cat output.txt 68.023209 806778377d6b82ff04d0009d24e24481020f813500158f2e2931911780329016 config-003-hxlreg-fullinfo-rank100.npz 30000000 8840 127 30191000 |
Joined: 15 Jun 20 Posts: 58 Credit: 123,550,555 RAC: 1,386,402 |
The issue with select ARM-based devices should now be resolved. Thanks for everyone's help with that! :) |
Joined: 24 Jun 20 Posts: 29 Credit: 1,023,279,541 RAC: 10,506,233 |
"The issue with select ARM-based devices should now be resolved." Thanks! |
Joined: 25 Jun 20 Posts: 13 Credit: 401,891,845 RAC: 6,968,659 |
Did ya just cancel all the work? |
Joined: 15 Jun 20 Posts: 58 Credit: 123,550,555 RAC: 1,386,402 |
I regenerated a bunch of work to interweave config files, rather than waiting until a config file is done to run the next one. |
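To illustrate what interweaving means here, a rough sketch in Python. This is not the project's actual work generator: the function name and the `(config, index)` work-unit labels are hypothetical, and only the two config file names come from logs earlier in the thread.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for configs that run out of work early


def interleave(per_config_work):
    """Round-robin across configs: take one unit from each in turn."""
    ordered = []
    for batch in zip_longest(*per_config_work, fillvalue=_MISSING):
        ordered.extend(unit for unit in batch if unit is not _MISSING)
    return ordered


configs = [
    "config-003-hxlreg-fullinfo-rank100.npz",
    "config-004-hixorlo-fullinfo-rank080.npz",
]
# Hypothetical: three work units per config, identified by (config, index).
work = [[(name, i) for i in range(3)] for name in configs]

for name, i in interleave(work):
    print(name, i)
```

With sequential generation, every unit for the first config would be issued before any unit for the second. With the round-robin order, both configs make progress at the same time.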
Joined: 8 Mar 21 Posts: 69 Credit: 772,295,473 RAC: 9,510,264 |
Wondered why I saw all the new app 1.03 work was cancelled. |
Joined: 3 Sep 20 Posts: 2 Credit: 420,452,549 RAC: 6,779,113 |
"I regenerated a bunch of work to interweave config files, rather than waiting until a config file is done to run the next one." Sure, but canceling our running WUs while they were actually being processed on our machines is a waste of our resources and not cool. |
New member Joined: 31 Mar 25 Posts: 2 Credit: 1,360,000 RAC: 100,442 |
Also, tasks that had already run but were still waiting on a wingman were wasted as "can't validate". Paul. |
Joined: 25 Jun 20 Posts: 13 Credit: 401,891,845 RAC: 6,968,659 |
"Wondered why I saw all the new app 1.03 work was cancelled." 1.03 came out earlier, basically the day before. |
Joined: 21 Jul 20 Posts: 16 Credit: 5,767,364 RAC: 11,581 |
Any updates on why the tasks keep failing on Windows 7? Error tasks for computer 2591 |
Joined: 15 Jun 20 Posts: 58 Credit: 123,550,555 RAC: 1,386,402 |
I'm not sure. I don't run Windows 7 anymore (haven't in a long time, really), but I can install a VM and test it. Without the contents of stderr_file I can't say why it fails. It's also possible we can't fix it. Windows 7 has been out of support for 5 years now, and it should be expected that applications will start making breaking changes as they move away from paradigms that existed in the Windows 7 era but are deprecated or completely removed in later versions. After I test it, I'll update the thread so you know what I find. |