Stronghold Bookfinder - High Memory Usage
Author | Message |
---|---|
Joined: 25 Jun 20 Posts: 19 Credit: 1,006,106,845 RAC: 5,278,472 |
Warning: high memory usage. I wanted to run a few tasks during the Pent, and I came back to find only about half as many tasks running as I expected. I ran out of memory with 2 GB per thread, since every task was using at least 1 GB. Usage ranged from 1.3 GB to 4.7 GB, with most tasks over 3 GB. |
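For anyone sizing a host for these tasks, here is a rough sketch of the arithmetic behind the figures above. The 32 GB host (16 threads at 2 GB each) and the 2 GB reserve for the OS are assumptions for illustration, and `max_concurrent_tasks` is a hypothetical helper, not part of BOINC.

```python
def max_concurrent_tasks(total_ram_gb: float, per_task_gb: float,
                         reserve_gb: float = 2.0) -> int:
    """Return how many tasks fit in RAM at per_task_gb each,
    keeping reserve_gb free for the OS (the reserve is an assumption)."""
    usable = total_ram_gb - reserve_gb
    if usable <= 0 or per_task_gb <= 0:
        return 0
    return int(usable // per_task_gb)


if __name__ == "__main__":
    # Hypothetical host: 16 threads with 2 GB per thread = 32 GB total.
    total_ram_gb = 32.0
    # Low, typical, and peak per-task usage reported in the post.
    for per_task_gb in (1.3, 3.0, 4.7):
        fit = max_concurrent_tasks(total_ram_gb, per_task_gb)
        print(f"{per_task_gb} GB per task: {fit} tasks fit")
```

Planning around the peak figure rather than the typical one leaves headroom, at the cost of idle threads when tasks happen to use less.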
Joined: 25 Jun 20 Posts: 19 Credit: 1,006,106,845 RAC: 5,278,472 |
And all junk: 4 for 4 failed, so I aborted the rest. Admin, just run a SINGLE task to see if things work before releasing. It's not that hard. Output file shbookfinder_6307_1746471151.026602_0_r1897758062_0 for task shbookfinder_6307_1746471151.026602_0 absent <core_client_version>8.0.4</core_client_version> <![CDATA[ <stderr_txt> 2025-05-05 16:57:04 (590962): wrapper (8.1.26018): starting 2025-05-05 16:57:05 (590962): wrapper: running ./java/bin/java (-jar shbookfinder.jar) 2025-05-05 16:57:05 (590962): wrapper: created child process 590966 2025-05-05 18:35:54 (590962): ./java/bin/java exited; CPU time 5938.346281 2025-05-05 18:35:54 (590962): called boinc_finish(0) </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>shbookfinder_6370_1746471151.342935_0_r477949539_0</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> |
Joined: 8 Mar 21 Posts: 89 Credit: 1,367,652,973 RAC: 11,055,324 |
Same here. All Stronghold Bookfinder tasks are erroring out with file missing errors. 95 failures so far. No successful attempts. |
Joined: 15 Jun 20 Posts: 96 Credit: 221,870,555 RAC: 1,658,616 |
Hi, I ran a task locally before I released it, and it did work. But clearly I missed something in the boinc config. I also did not anticipate the memory usage across these tasks, so I'll bring that to the original developer who wrote the app and see what can be done if anything. Lastly, I'm just one person. Nobody else within the Minecraft@Home community manages this boinc project anymore. I apologize for the inconvenience, and I strive to do better, but it's a little demotivating when my mistakes are met with ridicule. If I'm reading more into your message than you intended, that's my fault. Regardless, tasks for that app are disabled. Scheduler and feeder should be back up shortly. |
Joined: 8 Mar 21 Posts: 89 Credit: 1,367,652,973 RAC: 11,055,324 |
Did you just run the task locally on a host? Or did you run it through an actual production Boinc environment? I never saw a beta release for the app where you set up a very limited run of tasks sent to a few hosts before sending it into production. I just saw the app in the applications list with no tasks available and then overnight, there were thousands of the tasks generated and ready to send and all my hosts were fully loaded with these faulty tasks. If there was a beta period, it was staggeringly short. We appreciate your work running the site. Didn't know you were a one-man show though. I had assumed from your Discord server there were more of you, admins and developers. |
Joined: 15 Jun 20 Posts: 96 Credit: 221,870,555 RAC: 1,658,616 |
As far as the boinc administration and configuration stuff goes, yeah, I've been solo in managing it since I took it over in 2024. Prior to that, at least the last time we had work, others had been administering it and I simply got apps ready to run on boinc. Other folks primarily develop the apps but didn't know how to get them running on boinc. So I fill that gap and handle the day-to-day operations of the boinc environment. There is one other person who manages the virtual hardware that the boinc environment runs on, but he doesn't do much hands-on work with the boinc environment itself. That said, it used to be part of my process back in like 2020-2021 to spin up a separate boinc environment locally to run a test task and see if it succeeds, but I kind of dropped that part of the deployment process once it was just me managing it. You're right that I should be doing that, or at the very least marking apps as beta at the beginning to limit the spread of potentially bad deployments. I'll mark shbookfinder as beta for when we relaunch it at some point. |
Joined: 25 Jun 20 Posts: 9 Credit: 252,492,028 RAC: 4,950,073 |
Thanks so much for your work, boysanic; it is appreciated. My tasks all run to 100% and then sit there a while before they error out. They do run, so the problem seems to occur when the job finalizes and reports. Conan |
Joined: 28 Jun 20 Posts: 21 Credit: 440,000,580 RAC: 9,619,818 |
Thanks so much for your work, boysanic; it is appreciated. Mine behave the same way. Here's a Linux PC's stderr file: Stderr output <core_client_version>8.0.4</core_client_version> <![CDATA[ <stderr_txt> 2025-05-06 01:10:49 (48644): wrapper (8.1.26018): starting 2025-05-06 01:10:51 (48644): wrapper: running ./java/bin/java (-jar shbookfinder.jar) 2025-05-06 01:10:51 (48644): wrapper: created child process 48650 2025-05-06 04:23:47 (48644): ./java/bin/java exited; CPU time 10586.799567 2025-05-06 04:23:47 (48644): called boinc_finish(0) </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>shbookfinder_18297_1746471219.344162_0_r1519664204_0</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> And here's a Windows PC's stderr file for comparison: Stderr output <core_client_version>8.0.4</core_client_version> <![CDATA[ <stderr_txt> 2025-05-06 00:35:09 (13188): wrapper: running .\java\bin\java.exe (-jar shbookfinder.jar) 2025-05-06 00:35:09 (13188): wrapper: created child process 4128 2025-05-06 02:40:54 (13188): .\java\bin\java.exe exited; CPU time 5709.468750 2025-05-06 02:40:54 (13188): called boinc_finish(0) </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>shbookfinder_2462_1746471130.201506_3_r1929507740_0</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> ]]> |
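Both logs show the Java process exiting and `boinc_finish(0)` being called, followed by an upload failure for an output file that does not exist. For anyone triaging many such results, here is a small sketch that pulls the file name and error code out of a pasted stderr dump. The regular expression is written against the two samples in this thread only; `parse_upload_failures` is a hypothetical helper, not a BOINC tool.

```python
import re

# Matches one <file_xfer_error> payload as it appears in the logs above.
XFER_ERROR = re.compile(
    r"<file_name>(?P<file>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>[^<]*)\)</error_code>"
)


def parse_upload_failures(stderr_text: str) -> list:
    """Return one dict per upload failure found in stderr_text."""
    return [
        {"file": m["file"], "code": int(m["code"]), "reason": m["reason"]}
        for m in XFER_ERROR.finditer(stderr_text)
    ]


if __name__ == "__main__":
    # Excerpts copied from the Linux and Windows logs in this thread.
    linux = ("<file_xfer_error> <file_name>shbookfinder_18297_1746471219."
             "344162_0_r1519664204_0</file_name> "
             "<error_code>-161 (not found)</error_code> </file_xfer_error>")
    windows = ("<file_xfer_error> <file_name>shbookfinder_2462_1746471130."
               "201506_3_r1929507740_0</file_name> "
               "<error_code>-240 (stat() failed)</error_code> </file_xfer_error>")
    for failure in parse_upload_failures(linux + " " + windows):
        print(failure)
```

The two error codes differ by platform, but both report the same underlying condition: the expected output file was never found on disk.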
Joined: 8 Mar 21 Posts: 89 Credit: 1,367,652,973 RAC: 11,055,324 |
My recommendation for beta apps is to also produce very small sized runs first to see how they fare among the crunching populace. No point in generating a ton of faulty work which wastes bandwidth and is going to have to be reissued anyway. When the return rate is above 90% successful, then you can go for full deployment and remove beta status. |
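The promotion rule suggested here can be written down as a simple gate. This is only a sketch: the 90% threshold comes from the post, while the minimum sample size of 50 results and the name `ready_for_full_release` are assumptions added for illustration.

```python
def ready_for_full_release(successes: int, failures: int,
                           min_results: int = 50,
                           threshold: float = 0.90) -> bool:
    """Return True once the beta run has returned enough results and the
    success rate is strictly above threshold.

    min_results is an assumed guard so that a handful of lucky results
    does not trigger promotion; the post only specifies the 90% figure.
    """
    total = successes + failures
    if total < min_results:
        return False
    return successes / total > threshold


if __name__ == "__main__":
    # The failure count reported earlier in the thread: 95 failures, 0 successes.
    print(ready_for_full_release(successes=0, failures=95))
    # A hypothetical healthy beta run.
    print(ready_for_full_release(successes=95, failures=5))
```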
Joined: 3 Sep 20 Posts: 11 Credit: 1,189,625,049 RAC: 20,741,730 |
Over 400 tasks errored out after running for a lengthy amount of time. Ridiculous amount of wasted time and resources, let alone points lost. I second what Keith said about beta apps...send test runs out first before opening the flood gates. |
Joined: 11 Mar 21 Posts: 8 Credit: 649,549,901 RAC: 12,255,709 |
Hi boysanic, I would like to thank you for your effort in running Minecraft@Home on the BOINC side. It was a long wait until you restarted it, and now we have a constant flow of new WUs. Excellent! Some hiccups may occur, but I am sure you will sort them out with our help. Keep up the good work! klepel |
Joined: 13 Oct 20 Posts: 15 Credit: 204,776,591 RAC: 2,005,856 |
Any news about this app? |