Xoroshigo2 v1.1 Issues Megathread

boysanic
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 58
Credit: 123,550,555
RAC: 1,386,402
Message 1078 - Posted: 4 Apr 2025, 8:37:58 UTC

Hi everyone,

If you encounter a problem with xoroshigo2 (v1.1), please chime in on this thread.

Helpful info for us to have:
1. Example task where the issue occurred
2. The contents of stderr_file (can be found in the slot directory running the app)
3. The contents of stdout_file (can be found in the slot directory running the app)

Please include as much of this as possible; it gives us the best chance of diagnosing your particular issue and publishing a fix.

Thank you!
ID: 1078
fzs600
Joined: 25 Jun 20
Posts: 11
Credit: 204,801,691
RAC: 851,299
Message 1091 - Posted: 5 Apr 2025, 2:10:20 UTC - in response to Message 1078.  

Sent: 4 Apr 2025, 15:06:59 UTC
Reported: 4 Apr 2025, 23:44:40 UTC
Status: Error while computing
Run time (sec): 2,193.89
CPU time (sec): 2,078.34
Credit: ---
Application: Xoroshiro128++ Guessing Order Optimization v1.1 v1.02 windows_x86_64

Stderr output

<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
The operating system cannot run (null).
(0xc3) - exit code 195 (0xc3)</message>
<stderr_txt>
2025-04-05 01:06:24 (16868): wrapper: running .\client-windows.exe ( config-003-hxlreg-fullinfo-rank100.npz 30000000 459 input.npz)
2025-04-05 01:06:25 (16868): wrapper: created child process 15068
2025-04-05 01:42:51 (16868): .\client-windows.exe exited; CPU time 2078.343750
2025-04-05 01:42:51 (16868): app exit status: 0xc2
2025-04-05 01:42:51 (16868): called boinc_finish(195)

</stderr_txt>
]]>
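
For anyone triaging reports like this one, the wrapper lines carry most of what a bug report needs: the exact command, the CPU time, the child's own exit status, and the code passed to boinc_finish. Below is a minimal Python sketch that pulls those fields out of pasted stderr text. The function name and the dictionary keys are hypothetical, and the regular expressions only assume the line shapes visible in the output above.

```python
import re

# Sample wrapper lines copied from the stderr output above.
SAMPLE_LOG = """\
2025-04-05 01:06:24 (16868): wrapper: running .\\client-windows.exe ( config-003-hxlreg-fullinfo-rank100.npz 30000000 459 input.npz)
2025-04-05 01:06:25 (16868): wrapper: created child process 15068
2025-04-05 01:42:51 (16868): .\\client-windows.exe exited; CPU time 2078.343750
2025-04-05 01:42:51 (16868): app exit status: 0xc2
2025-04-05 01:42:51 (16868): called boinc_finish(195)
"""

RUNNING_RE = re.compile(r"wrapper: running (\S+) \((.*)\)")
CPU_RE = re.compile(r"exited; CPU time ([0-9.]+)")
STATUS_RE = re.compile(r"app exit status: (0x[0-9a-fA-F]+)")
FINISH_RE = re.compile(r"called boinc_finish\((-?\d+)\)")


def summarize_wrapper_log(text):
    """Collect the fields useful for a bug report from wrapper output.

    Keys are only present when the matching line was found in the text.
    """
    summary = {}
    m = RUNNING_RE.search(text)
    if m:
        summary["binary"] = m.group(1)
        summary["args"] = m.group(2).split()
    m = CPU_RE.search(text)
    if m:
        summary["cpu_seconds"] = float(m.group(1))
    m = STATUS_RE.search(text)
    if m:
        # The child's own status, printed by the wrapper in hex.
        summary["app_exit_status"] = int(m.group(1), 16)
    m = FINISH_RE.search(text)
    if m:
        # The code the wrapper itself reports to the BOINC client.
        summary["boinc_finish_code"] = int(m.group(1))
    return summary


print(summarize_wrapper_log(SAMPLE_LOG))
```

In this sample the child status (0xc2) and the boinc_finish code (195) are different numbers, so the "app exit status" line is worth quoting in a report alongside the task's overall exit code.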
ID: 1091
zombie67 [MM]
Joined: 24 Jun 20
Posts: 29
Credit: 1,023,264,541
RAC: 10,518,228
Message 1094 - Posted: 5 Apr 2025, 3:16:26 UTC

I have nine Raspberry Pis or Pi clones. Eight of the nine are completing tasks just fine, but this one gets nothing but errors:

https://minecraftathome.com/minecrafthome/show_host_detail.php?hostid=28336

Here is a sample of the Stderr output:

<core_client_version>7.18.1</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
2025-04-05 01:04:50 (14416): wrapper (8.1.26018): starting
2025-04-05 01:04:50 (14416): wrapper: running ./client-linux-arm64.bin ( config-004-hixorlo-fullinfo-rank080.npz 30000000 3720 input.npz)
2025-04-05 01:04:50 (14416): wrapper: created child process 14418
2025-04-05 01:04:51 (14416): ./client-linux-arm64.bin exited; CPU time 0.428000
2025-04-05 01:04:51 (14416): app exit status: 0x4
2025-04-05 01:04:51 (14416): called boinc_finish(195)

</stderr_txt>
]]>
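
The three numbers in "process exited with code 195 (0xc3, -61)" look like the same value written three ways. A small Python sketch of that arithmetic, assuming the last number is simply the low byte read as a signed 8-bit integer (195 - 256 = -61); the function names are made up for illustration:

```python
def as_signed_byte(code):
    """Interpret the low 8 bits of an exit code as a signed value."""
    low = code & 0xFF
    return low - 256 if low >= 128 else low


def describe_exit_code(code):
    """Format an exit code as: decimal (hex of low byte, signed low byte)."""
    return f"{code} ({code & 0xFF:#x}, {as_signed_byte(code)})"


print(describe_exit_code(195))  # 195 (0xc3, -61)
```

So the -61 carries no extra information; the line to look at is the separate "app exit status", which here is 0x4.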


Any ideas about what the problem is?
ID: 1094
boysanic
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 58
Credit: 123,550,555
RAC: 1,386,402
Message 1099 - Posted: 5 Apr 2025, 5:53:40 UTC - in response to Message 1094.  

It's hard to say what's wrong without the contents of stderr_file in a slot directory that failed.

Can you find that and share it here?
ID: 1099
Ian&Steve C.
Joined: 13 Jan 25
Posts: 4
Credit: 1,902,500
RAC: 153,189
Message 1100 - Posted: 5 Apr 2025, 14:32:49 UTC - in response to Message 1099.  
Last modified: 5 Apr 2025, 14:37:09 UTC

It's hard to say what's wrong without the contents of stderr_file in a slot directory that failed.

Can you find that and share it here?


It can be difficult to get files from the slot directory after a failure, since they are all deleted the instant the task finishes. It might be easier to instruct users how to run the app offline, so that the files are not deleted by BOINC when the task finishes.
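
One possible workaround while the task is still running is to keep copying the log files out of the slot directories, so that the last copy survives the cleanup. The following is only a sketch under assumptions: the file names stderr_file and stdout_file come from the first post, while the function names, the polling interval, and the destination layout are made up here.

```python
import shutil
import time
from pathlib import Path

# File names as given in the first post of this thread.
LOG_NAMES = ("stderr_file", "stdout_file")


def snapshot_slot_logs(slots_root, dest_root):
    """Copy the log files from every slot directory into dest_root/<slot>/.

    Returns the list of destination paths written in this pass.
    """
    slots_root, dest_root = Path(slots_root), Path(dest_root)
    copied = []
    for slot in sorted(p for p in slots_root.iterdir() if p.is_dir()):
        for name in LOG_NAMES:
            src = slot / name
            if not src.is_file():
                continue
            target_dir = dest_root / slot.name
            target_dir.mkdir(parents=True, exist_ok=True)
            try:
                shutil.copy2(src, target_dir / name)
            except OSError:
                # The task may have finished and the file vanished mid-copy;
                # the copy from an earlier pass is kept in that case.
                continue
            copied.append(target_dir / name)
    return copied


def watch(slots_root, dest_root, interval_s=2.0, passes=None):
    """Repeat snapshot_slot_logs every interval_s seconds.

    Runs until interrupted when passes is None, otherwise for that many passes.
    """
    done = 0
    while passes is None or done < passes:
        snapshot_slot_logs(slots_root, dest_root)
        done += 1
        if passes is None or done < passes:
            time.sleep(interval_s)
```

Calling watch() with the path of the BOINC slots directory and a destination folder would keep the most recent copy of each log. The slots directory location depends on the install (for example /var/lib/boinc-client/slots with some Linux packages, or C:\ProgramData\BOINC\slots on Windows), so treat those paths as something to check rather than a given. A task that fails in under a second, like the ARM example above, can still slip between two polls.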
ID: 1100
Dr Who Fan
Joined: 21 Jul 20
Posts: 16
Credit: 5,767,364
RAC: 11,581
Message 1101 - Posted: 5 Apr 2025, 19:58:59 UTC - in response to Message 1100.  

... Can you find that and share it here?


It can be difficult to get files from the slot directory after a failure, since they are all deleted the instant the task finishes. It might be easier to instruct users how to run the app offline, so that the files are not deleted by BOINC when the task finishes.

I have had the same experience as Ian: the contents of the slot directory disappear almost instantly after a crash.

Detailed instructions for running the app manually OUTSIDE OF BOINC would be appreciated, so we can help you out.
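
Until official instructions exist, here is a rough Python sketch of what a manual run could look like. It assumes the binary and its input files have already been copied into one working directory, and that the command line matches the "wrapper: running ..." line from a failed task's stderr. The function name and the output file names (stdout.txt, stderr.txt, exit_code.txt) are made up for illustration and are not part of the project's tooling.

```python
import subprocess
from pathlib import Path


def run_offline(command, workdir):
    """Run the app outside BOINC and keep its output on disk.

    command: argument list, e.g. the binary followed by the arguments shown
             on the "wrapper: running ..." line of a task's stderr output.
    workdir: directory assumed to already hold the binary and input files.
    Returns the process exit code.
    """
    workdir = Path(workdir)
    with open(workdir / "stdout.txt", "wb") as out, \
         open(workdir / "stderr.txt", "wb") as err:
        # Nothing deletes these files afterwards, unlike a BOINC slot directory.
        proc = subprocess.run(command, cwd=workdir, stdout=out, stderr=err)
    (workdir / "exit_code.txt").write_text(f"{proc.returncode}\n")
    return proc.returncode
```

For the ARM task quoted earlier, the call would be something like run_offline(["./client-linux-arm64.bin", "config-004-hixorlo-fullinfo-rank080.npz", "30000000", "3720", "input.npz"], "test"), with "test" standing in for wherever the files were copied. Whether the app needs anything else from the slot directory is unknown, so this may not reproduce every failure.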
ID: 1101
Vato
Joined: 24 Jun 20
Posts: 9
Credit: 117,988,409
RAC: 1,549,887
Message 1102 - Posted: 5 Apr 2025, 21:26:56 UTC - in response to Message 1101.  

expect an app update that fixes this "soon"
ID: 1102
Keith Myers
Joined: 8 Mar 21
Posts: 69
Credit: 772,185,473
RAC: 9,509,379
Message 1103 - Posted: 5 Apr 2025, 23:37:54 UTC
Last modified: 5 Apr 2025, 23:38:17 UTC

I've tried Ian's test package with the stock binary on my Jetson TX2-NX SoC, but it fails instantly and doesn't leave any stderr.txt file in the test directory for debugging. It leaves only the downloaded numpy support directory.

I'm currently running his fixed test binary and will report back when it finishes.
ID: 1103
Vato
Joined: 24 Jun 20
Posts: 9
Credit: 117,988,409
RAC: 1,549,887
Message 1104 - Posted: 6 Apr 2025, 1:21:40 UTC - in response to Message 1102.  

my odroids are now running version 1.03 of the v1.1 app and it looks good
ID: 1104
Keith Myers
Joined: 8 Mar 21
Posts: 69
Credit: 772,185,473
RAC: 9,509,379
Message 1105 - Posted: 6 Apr 2025, 1:52:28 UTC

My test task using the new test application just finished after 2 hours 20 minutes on my Nvidia Jetson TX2-NX.
Contents of output.txt:
keith@tx2-nx:~/Downloads/test$ cat output.txt
68.023209
806778377d6b82ff04d0009d24e24481020f813500158f2e2931911780329016
config-003-hxlreg-fullinfo-rank100.npz 30000000 8840
127 30191000
ID: 1105
boysanic
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 58
Credit: 123,550,555
RAC: 1,386,402
Message 1106 - Posted: 6 Apr 2025, 2:09:32 UTC

The issue with select ARM-based devices should now be resolved.

Thanks for everyone's help with that! :)
ID: 1106
zombie67 [MM]
Joined: 24 Jun 20
Posts: 29
Credit: 1,023,264,541
RAC: 10,518,228
Message 1107 - Posted: 6 Apr 2025, 11:38:43 UTC - in response to Message 1106.  

The issue with select ARM-based devices should now be resolved.

Thanks for everyone's help with that! :)

Thanks!
ID: 1107
mmonnin
Joined: 25 Jun 20
Posts: 13
Credit: 401,781,845
RAC: 6,964,893
Message 1108 - Posted: 6 Apr 2025, 22:36:04 UTC

Did ya just cancel all the work?
ID: 1108
boysanic
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 58
Credit: 123,550,555
RAC: 1,386,402
Message 1109 - Posted: 6 Apr 2025, 23:36:44 UTC

I regenerated a bunch of work to interweave config files, rather than waiting until a config file is done to run the next one.
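
To illustrate the difference between the two orderings, here is a toy Python sketch of round-robin interleaving across per-config work queues. It is not the project's work generator: the work-unit tuples and indices are invented, and only the config file names are taken from the logs in this thread.

```python
from itertools import chain, zip_longest

_MISSING = object()  # sentinel so that None can still be a real work item


def interleave(*queues):
    """Round-robin over several work queues, skipping queues that run out."""
    mixed = chain.from_iterable(zip_longest(*queues, fillvalue=_MISSING))
    return [item for item in mixed if item is not _MISSING]


# Hypothetical work units: (config file, index within that config).
config_003 = [("config-003-hxlreg-fullinfo-rank100.npz", i) for i in range(3)]
config_004 = [("config-004-hixorlo-fullinfo-rank080.npz", i) for i in range(2)]

for unit in interleave(config_003, config_004):
    print(unit)
```

With sequential ordering every config-003 unit would go out before any config-004 unit; interleaved, the two alternate until the shorter queue is exhausted.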
ID: 1109
Keith Myers
Joined: 8 Mar 21
Posts: 69
Credit: 772,185,473
RAC: 9,509,379
Message 1110 - Posted: 7 Apr 2025, 2:18:55 UTC - in response to Message 1109.  

Wondered why I saw all the new app 1.03 work was cancelled.
ID: 1110
bluestang
Joined: 3 Sep 20
Posts: 2
Credit: 420,377,549
RAC: 6,779,591
Message 1111 - Posted: 7 Apr 2025, 3:26:53 UTC - in response to Message 1109.  

I regenerated a bunch of work to interweave config files, rather than waiting until a config file is done to run the next one.


Sure, but canceling our running WUs while they were actually being processed on our machines is a waste of our resources and not cool.
ID: 1111
Paul
New member
Joined: 31 Mar 25
Posts: 2
Credit: 1,360,000
RAC: 100,442
Message 1113 - Posted: 7 Apr 2025, 6:46:37 UTC - in response to Message 1111.  

Also, tasks that had already run but were still waiting on a wingman were wasted as "can't validate".
Paul.
ID: 1113
mmonnin
Joined: 25 Jun 20
Posts: 13
Credit: 401,781,845
RAC: 6,964,893
Message 1114 - Posted: 7 Apr 2025, 8:59:46 UTC - in response to Message 1110.  

Wondered why I saw all the new app 1.03 work was cancelled.


1.03 came out earlier, basically the day before.
ID: 1114
Dr Who Fan
Joined: 21 Jul 20
Posts: 16
Credit: 5,767,364
RAC: 11,581
Message 1115 - Posted: 7 Apr 2025, 16:12:22 UTC

Any updates on why the tasks keep failing on Windows 7?
Error tasks for computer 2591
ID: 1115
boysanic
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 58
Credit: 123,550,555
RAC: 1,386,402
Message 1117 - Posted: 7 Apr 2025, 19:21:00 UTC - in response to Message 1115.  

I'm not sure. I don't run Windows 7 anymore (and haven't in a long time), but I can set up a VM and test it.

Without the contents of stderr_file I can't say one way or the other why it fails.

It's also possible that we can't fix it. Windows 7 has been out of support for five years now, so it should be expected that applications will make breaking changes as they move away from paradigms that existed in Windows 7 but are deprecated or removed entirely in later versions.

After I test it, I'll update the thread so you know what I find.
ID: 1117