New kaktwoos 2.12 beta application has issues

Keith Myers
Joined: 8 Mar 21
Posts: 53
Credit: 243,832,973
RAC: 0
Message 547 - Posted: 29 Apr 2021, 0:10:45 UTC

It looks like the newly deployed beta application has a lot of issues.
Most of my wingmen have errored out on their tasks.
The cause is either a failed download of the .cl file or the application being unable to enumerate the GPU properly.
Also, all my finished tasks had wildly incorrect estimated runtimes of multiple days, even though they completed in the normal timeframe.
This caused the minecraftathome tasks to monopolize my GPUs and force my other GPU work off the cards.
Keith Myers
Message 548 - Posted: 29 Apr 2021, 16:53:00 UTC

Devs-Admins, any estimate when you will be sending out new work again?
Hy
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 550 - Posted: 29 Apr 2021, 18:29:08 UTC - in response to Message 548.  
Last modified: 29 Apr 2021, 18:29:40 UTC

From what I saw, your computers were working on Kaktwoos-cl 2.12 tasks yesterday without errors. Our admin, Neil, saw and mentioned to us some sort of error while making uploads/updates to the BOINC backend, hence moving 2.12 to beta and reverting to 2.11 to continue work. We have made him aware of the issue, and when he is available he should ensure work is sent out to clients who had swapped between versions and/or are having issues.

I haven't personally seen this new error myself, so if you can post an example of a task or computer that failed with 2.12 but worked well previously, I'd appreciate it. Sometimes tasks are cancelled by the server, which may appear as an error but resolves itself quickly. Other errors persist until a reboot, despite configuration changes being submitted to our servers.
Keith Myers
Message 552 - Posted: 29 Apr 2021, 19:34:27 UTC - in response to Message 550.  
Last modified: 29 Apr 2021, 19:43:52 UTC

I'll provide a couple of task links for wingmen that errored out on their tasks. I've seen two basic wingman errors: either the application could not download the required 2.12 .cl file, or the application for some reason has trouble with OpenCL enumerating the GPU in the host.

HostID 785 with previous valid work. https://minecraftathome.com/minecrafthome/results.php?hostid=785&offset=0&show_names=0&state=4&appid=

https://minecraftathome.com/minecrafthome/result.php?resultid=4616045 OpenCL enumeration error

<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
Received work unit: 221390854994837
Data: n1: 986, n2: 970, n3: 458, di: 1, ch: 12, f: 70
Error: boinc_get_opencl_ids() failed with error -1
Error: clGetPlatformIDs() failed with error -1001

</stderr_txt>
]]>
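
For anyone sorting through a pile of these results, here is a rough sketch of how the "failed with error" lines in the stderr above could be pulled out and labelled. The helper name `explain_errors` and the small lookup table are my own illustration, not part of kaktwoos or BOINC. The one mapping that matters here is -1001, which is CL_PLATFORM_NOT_FOUND_KHR in the OpenCL ICD extension, meaning the loader found no OpenCL platform at all. The -1 from boinc_get_opencl_ids() is a BOINC-side return value, so the sketch deliberately does not translate it with the OpenCL table.

```python
import re

# Small lookup of OpenCL status codes (illustrative subset, not a full table).
OPENCL_ERRORS = {
    -1: "CL_DEVICE_NOT_FOUND",
    -1001: "CL_PLATFORM_NOT_FOUND_KHR",
}

# Matches lines such as: Error: clGetPlatformIDs() failed with error -1001
ERROR_LINE = re.compile(r"Error: (\w+)\(\) failed with error (-?\d+)")


def explain_errors(stderr_text):
    """Return (function, code, meaning) for each 'failed with error' line."""
    found = []
    for func, code in ERROR_LINE.findall(stderr_text):
        code = int(code)
        if func.startswith("cl"):
            # Only OpenCL API calls are looked up in the OpenCL table.
            meaning = OPENCL_ERRORS.get(code, "unknown OpenCL code")
        else:
            meaning = "not an OpenCL code"
        found.append((func, code, meaning))
    return found


sample = """Error: boinc_get_opencl_ids() failed with error -1
Error: clGetPlatformIDs() failed with error -1001"""

for row in explain_errors(sample):
    print(row)
```

Run against the two error lines from result 4616045, this flags the clGetPlatformIDs() failure as a missing platform rather than a missing device.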


HostID 12234 with previous valid work. https://minecraftathome.com/minecrafthome/results.php?hostid=12234&offset=0&show_names=0&state=4&appid=

https://minecraftathome.com/minecrafthome/result.php?resultid=4595919 .cl file download error

<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
  <file_name>kaktwoos_2.12_opencl_intel_gpu.cl</file_name>
  <error_code>-200 (wrong size)</error_code>
</file_xfer_error>
</message>
]]>
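
The download failures can be summarised the same way. Below is a minimal sketch, assuming the `<file_xfer_error>` element is copied out of the result page as shown above; `parse_xfer_error` is a hypothetical helper name, and it only uses Python's standard XML parser. It splits the error code from its description so results can be grouped by file and code.

```python
import re
import xml.etree.ElementTree as ET


def parse_xfer_error(fragment):
    """Extract file name, numeric code and reason from a <file_xfer_error> element."""
    node = ET.fromstring(fragment)
    name = node.findtext("file_name")
    raw = node.findtext("error_code")  # e.g. "-200 (wrong size)"
    match = re.match(r"\s*(-?\d+)\s*(?:\((.*)\))?", raw)
    return {"file": name, "code": int(match.group(1)), "reason": match.group(2)}


fragment = """<file_xfer_error>
  <file_name>kaktwoos_2.12_opencl_intel_gpu.cl</file_name>
  <error_code>-200 (wrong size)</error_code>
</file_xfer_error>"""

print(parse_xfer_error(fragment))
```

A "wrong size" reason on the same file name across many hosts would point at the copy on the server rather than at the individual machines.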


Most of the errored wingman tasks were OpenCL enumeration errors, predominantly on older Nvidia drivers, but I found other examples with the latest 460.73 drivers like the ones I am using. The Intel iGPU hosts look to be predominantly hitting the download errors, and I think Neil.rs said he really messed up the copy of that .cl file.
Keith Myers
Message 554 - Posted: 29 Apr 2021, 23:14:28 UTC

Reviewing some of the requirements for running gpu tasks on Nvidia, I was reminded of this statement in the requirements thread.

After installing the Nvidia drivers on Ubuntu, you *must* install the nvidia-opencl-dev packages or the related OpenCL Nvidia headers on your distribution. This has caused major errors due to the OpenCL driver being installed by the package manager, but *not* what helps kaktwoos actually interact with it

I wonder if the people that had the OpenCL enumeration issue ever installed those headers.
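
A related quick check, separate from the question of headers: on Linux the standard OpenCL ICD loader discovers vendor drivers through the .icd files in /etc/OpenCL/vendors, and an empty or missing directory is one common way to end up with the -1001 "platform not found" result. The sketch below just lists what is registered there. The function name `list_opencl_icds` is made up for illustration, and the default path is an assumption that fits typical Linux installs, not something the project has confirmed for these hosts.

```python
from pathlib import Path


def list_opencl_icds(vendor_dir="/etc/OpenCL/vendors"):
    """Return {icd file name: library named inside it} for a vendors directory."""
    directory = Path(vendor_dir)
    if not directory.is_dir():
        return {}
    # Each .icd file normally contains the name or path of the vendor library.
    return {p.name: p.read_text().strip() for p in sorted(directory.glob("*.icd"))}


if __name__ == "__main__":
    icds = list_opencl_icds()
    if icds:
        for name, library in icds.items():
            print(f"{name}: {library}")
    else:
        print("No OpenCL ICD files found in the assumed vendors directory")
```

If nothing is listed on an affected host, the OpenCL part of the driver is probably not installed or not registered, whatever the state of the headers.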
Keith Myers
Message 555 - Posted: 30 Apr 2021, 2:10:06 UTC

Just an update. The admins/devs have restarted kaktwoos gpu task deployment again in the last hour. Getting both 2.11 and 2.12 species.

Main thing is that they have normal estimated runtimes. The previous issue of incorrectly calculated estimated runtimes is resolved.

Also, the main reason for the new beta application, fixing the endless running of a stopped and restarted task, is resolved. Task properties do not show any checkpoints, but I have successfully stopped and restarted Minecraft GPU tasks, and they do in fact resume from the point where they were stopped.

The kaktpoint.txt file in the slot is being updated.
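
Since task properties show no checkpoints, the file's modification time is the easiest thing to watch. Here is a minimal sketch of that check. The helper `checkpoint_age_seconds` is hypothetical, and the slot path in the example is only an assumption (the usual location for a packaged Linux client), so substitute your own BOINC data directory and slot number.

```python
import os
import time


def checkpoint_age_seconds(slot_dir, name="kaktpoint.txt", now=None):
    """Return seconds since the checkpoint file was last modified, or None if it cannot be read."""
    path = os.path.join(slot_dir, name)
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # Missing file, missing slot directory, or no permission to look.
        return None
    current = time.time() if now is None else now
    return current - mtime


if __name__ == "__main__":
    # Assumed slot location; adjust to the slot the task is actually running in.
    age = checkpoint_age_seconds("/var/lib/boinc-client/slots/0")
    if age is None:
        print("No readable kaktpoint.txt in that slot")
    else:
        print(f"kaktpoint.txt last updated {age:.0f} seconds ago")
```

An age that keeps resetting to a small value while the task runs suggests checkpoints are being written even though the manager does not report them.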
Keith Myers
Message 556 - Posted: 30 Apr 2021, 5:30:11 UTC

Looks like they are still having issues with the Intel iGPU application. All of those tasks are hitting the download error. No problems with the AMD or Nvidia tasks.