Kaktwoos 2.13 announcement / testing

Message boards : Number crunching : Kaktwoos 2.13 announcement / testing

Hy
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 558 - Posted: 2 May 2021, 21:09:21 UTC
Last modified: 4 May 2021, 18:14:32 UTC

Hello BOINCers!

As we've received reports from a few concerned users over the past week regarding issues with kaktwoos-cl, I've put together a new update that also finalizes some existing features. I'd consider this to be Kaktwoos-cl "2.15" rather than 2.13, but to keep things simple we just bumped the version up.

First though, if you are on an Nvidia GPU under Linux, please make sure you have installed the "nvidia-opencl-dev" package, or whichever package includes the Nvidia OpenCL headers! Your package manager should supply this on request (apt-get install), along with the latest or most compatible/stable proprietary driver, which you likely already have. The majority of the OpenCL errors and failed tasks we have received are due to users not checking their Tasks once started (and so missing this error) and not ensuring the OpenCL headers are installed prior to running BOINC.

Now, this update introduces a fallback (or, well, swapped) kernel for Nvidia GPUs. On detection of an RTX or GTX 16XX series GPU, the existing optimizations brought in with Kaktwoos 2.11 will be enabled. For any other, older Nvidia GPU, these 'optimizations' will be disabled and the old but stable kernel will be used, bringing 3-5% back to those afflicted with the reported regressions or on weaker cards.

Internally, the AMD GPU detection for optimizing is more generic, and if it detects it is being run on an RDNA 1 or 2 GPU (RX 5000/RX 6000) then it will fall back to the generic kernel. The majority of AMD GPUs will use the new AMD kernel as before, but this just ensures that future architectural changes and new product releases are not missed by Kaktwoos-cl.
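The selection rules described above can be sketched in a few lines. This is only an illustration and not the actual Kaktwoos-cl detection code: the function name `pick_kernel`, the kernel labels, the matching on marketing names, and the generic fallback for unknown vendors are all assumptions made for the example.

```python
import re


def pick_kernel(vendor: str, device_name: str) -> str:
    """Return a label for the kernel variant to use (labels are hypothetical)."""
    name = device_name.upper()
    if vendor == "nvidia":
        # RTX cards and the GTX 16XX series keep the 2.11 optimizations.
        if "RTX" in name or re.search(r"GTX 16\d\d", name):
            return "nvidia-optimized"
        # Any other, older Nvidia GPU gets the old but stable kernel.
        return "nvidia-stable"
    if vendor == "amd":
        # RDNA 1/2 cards (RX 5000 / RX 6000) fall back to the generic kernel.
        if re.search(r"RX ?[56]\d00", name):
            return "generic"
        # The majority of AMD GPUs keep using the AMD kernel.
        return "amd"
    # Assumption: anything unrecognized uses the generic kernel.
    return "generic"


for vendor, name in [
    ("nvidia", "GeForce GTX 1650"),
    ("nvidia", "GeForce GTX 1060 3GB"),
    ("amd", "Radeon RX 5700 XT"),
    ("amd", "Radeon RX Vega 56"),
]:
    print(name, "->", pick_kernel(vendor, name))
```

Real OpenCL device strings often report codenames rather than marketing names, so a production check would likely need a different matching rule.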

The other changes to this code are mostly on the "C" plaintext side, with Neil's reformatting of the code to hopefully make the varying coding styles of the past year mesh together better visually.

I'm still considering a further safety check to improve the checkpointing system, for users who have an unrecoverable host crash and have a corrupted checkpoint (we say only "0.5%" of users ever experience this, which causes the "infinite kaktwoos" bug). Also, to aid in debugging or user interest, we will now print the seed search range for your current task into your log. And yes, there will be somebody who has a task from 000000000 to XXXXXXXX!
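One common way to guard against a corrupted checkpoint is to store a checksum with the state and to write the file atomically. The sketch below shows that general idea only. The safety check actually being considered for Kaktwoos is not described in this post, and the file layout and the field names `next_seed` and `found` are hypothetical.

```python
import json
import os
import zlib


def save_checkpoint(path: str, state: dict) -> None:
    """Write state plus a CRC32 to a temp file, then atomically swap it in."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    record = {"crc32": zlib.crc32(payload), "state": state}
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before the rename
    os.replace(tmp_path, path)  # readers see the old or the new file, never half


def load_checkpoint(path: str):
    """Return the saved state, or None if the file is missing or damaged."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            record = json.load(f)
        payload = json.dumps(record["state"], sort_keys=True).encode("utf-8")
        if zlib.crc32(payload) != record["crc32"]:
            return None
        return record["state"]
    except (OSError, ValueError, KeyError, TypeError):
        return None
```

A task that gets `None` back could restart its seed range from the beginning instead of resuming from bad data.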

Anyways, soon we plan to begin a new (possibly short-term) project involving making another "impossible" possible, which involves finding many "11-eye" dungeons, and any/all in a Single Chunk. Some of you on the Discord server may remember this from December/January, but some recent interest has led to a BOINC-able set of computations and a rather large dataset to work through! This shall be our first official CPU application, if all goes well.

Imagine what could happen if you got a village, emerald veins, and an 11-Eye End Portal all in the same chunk...
ID: 558
Keith Myers
Joined: 8 Mar 21
Posts: 53
Credit: 245,502,973
RAC: 1,626
Message 560 - Posted: 3 May 2021, 17:30:03 UTC - in response to Message 558.  

Thanks for the update, Hy. Interesting times ahead. Optimizations appreciated, and the new CPU app announcement is exciting.
Happy to see the devs so active.
ID: 560
marmot

Joined: 9 Dec 20
Posts: 7
Credit: 1,582,604
RAC: 0
Message 571 - Posted: 6 May 2021, 22:44:29 UTC

Completed 8 of the test WU's on boincboi_testbed_mch.

They all ended invalid and all wingunits ended invalid and all other users I checked showed no valids.

When running 8x of version 2.12 on my GTX 1060 on the Windows 10 (2017) box, the OS is responsive, the interrupts hovered at 1.2% CPU and system (kernel) using about 2% CPU.

When running 8X of version 2.13 on the same box, MSI Afterburner's GUI refuses to respond when I try to interact with the 1060 GPU.
The process manager shows interrupts using 4% CPU and the kernel using 12% CPU, and the entire OS is a bit sluggish.

Is there an issue in the driver interaction that would make the interrupts so intense?
ID: 571
Hy
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 572 - Posted: 6 May 2021, 22:54:08 UTC - in response to Message 571.  
Last modified: 6 May 2021, 22:55:24 UTC

Uh, that's not a server anyone should really be using for anything, outside of MC@Home devs logging in to run tasks for like 2hrs to make sure nothing's broken and us then forgetting to kill it, so people don't have... issues or complaints.

Also the version of 2.13 on the server has a few coding issues that are resolved, but literally my only advice is to please get any wuprop people off the testbed and onto either our real MC@Home server or another project if they'd like to run tests or get hours racked in.

In general the Nvidia driver's interaction/usage of OpenCL is a nightmare as it happily locks a whole thread and does nothing inside the locked thread aside from syncing stuff. AMD and Intel GPUs don't have this problem. At some point in 2.13 I also swapped the kernel out, so your GTX 1060 should be utilized like 5-10% more than before, because reasons. RTX GPUs and GTX 1600 series cards will use the same Nvidia kernel found in 2.11/2.12, so everyone gets boosted speed (or a higher average speed of seed calculations) without weaker or older cards missing out. Note that Kaktwoos is only designed to run one instance per GPU, so if you mean "8X" to be "8 kaktwoos-cl tasks on 1 GTX 1060", then there's no guarantee of any good outcomes / usable systems / efficient calculations from that. It's a resource issue and more isn't better, unlike some other GPU projects.

Aside from that, there's no support for the boincboi_testbed; pretend it doesn't exist (unless you want to heat your house up, I suppose).
ID: 572
Hy
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 573 - Posted: 6 May 2021, 23:00:12 UTC
Last modified: 6 May 2021, 23:01:11 UTC

For the actual release of kaktwoos-2.13, there are about two or so GitHub pull requests we need to verify and close (I submitted them; other devs need to make sure they're alright), and then we shall send that to the testbed again for a BOINC sanity check. I've already done local, BOINC-free runs of the kernels on my Vega 56, GTX 1650 Mobile and i5-5600U (Linux-OpenCL) GPUs without any issues. Benchmark seeds came back fine and there were no crashes in the usual environments, so things should be alright.

I'd say give it another week/past the weekend before we are done, as we have had a surprise on a previous shelved project and some more Minecraft Alpha seed-announcements are likely within the next week!

Progress on the OneChunk project is also slowed by the above event, but once Kaktwoos' code changes are done, focus can be put onto that.
ID: 573
marmot

Joined: 9 Dec 20
Posts: 7
Credit: 1,582,604
RAC: 0
Message 574 - Posted: 7 May 2021, 9:38:44 UTC - in response to Message 572.  

Hy wrote:
"Uh, that's not a server anyone should really be using for anything, outside of MC@Home devs logging in to run tasks for like 2hrs to make sure nothing's broken and us then forgetting to kill it, so people don't have... issues or complaints

"Also the version of 2.13 on the server has a few coding issues that are resolved, but literally my only advice is to please get any wuprop people off the testbed and onto either our real MC@Home server or another project if they'd like to run tests or get hours racked in."

WUprops users include other project managers and developers.
We have extensive skills in beta testing and stress testing and a wide variety of configurations and you could benefit from our test runs.

My report was for your benefit, to assist in your troubleshooting. There was no complaint.
Those particular WU's were behaving oddly (glad you caught the issues). I'm not concerned that they ended in an invalid state; it was a possible clue for the beta testing process.
For me it's about helping projects beta test WU's and getting project hours as a badge of honor.

Hy wrote:
"Aside from that, there's no support for the boincboi_testbed, pretend it doesn't exist (unless you want to heat your house up, I suppose)"

I wasn't expecting support; just providing any clues from my test runs that would be helpful to you.

Yes, my entire house is heated by computers in the winter/fall/spring, and only a laptop and the downclocked 2700x will be on this summer. Prolly not even a single GPU project now that the 550 died (5 watts usage on that Lexa mobile GPU mounted on a card by Dell. So sad).
ID: 574
boysanic
Project administrator
Project developer

Joined: 15 Jun 20
Posts: 10
Credit: 95,783,055
RAC: 43,343
Message 576 - Posted: 7 May 2021, 14:50:45 UTC - in response to Message 574.  

Thanks for the report on that particular issue.

The invalids are simply because I forgot to set permissions on the validator binary on my test VM.
I didn't intend for a lot of people to continue to use it for days after we were finished with that round of testing, so I'm likely to just shut down the VM until it's time to test something else.

In particular, I was actually testing older software, nothing in "beta" or "new". It's often useful for me to see how the older software behaved in a separate environment.

Anyways, I'll shut that down for now. Sorry for any confusion.
ID: 576
Keith Myers
Joined: 8 Mar 21
Posts: 53
Credit: 245,502,973
RAC: 1,626
Message 577 - Posted: 7 May 2021, 21:12:35 UTC - in response to Message 572.  

Hy wrote:
"In general the Nvidia driver's interaction/usage of OpenCL is a nightmare as it happily locks a whole thread and does nothing inside the locked thread aside from syncing stuff. AMD and Intel GPUs don't have this problem."

Just need to implement non-blocking sync in the application so that the kernel doesn't spin on the wait loop. It's in the OpenCL specification. Works well for CUDA applications. Worked quite well for GPUGrid and Seti CUDA apps.
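One reading of "non-blocking sync" is to poll for completion and sleep between polls, rather than sitting in a blocking wait that keeps a CPU thread busy. The sketch below shows that pattern in plain Python with a stand-in callable. In an OpenCL host program the check would typically be a `clGetEventInfo` query for `CL_EVENT_COMMAND_EXECUTION_STATUS` compared against `CL_COMPLETE`, issued after `clFlush`. The function name, poll interval and timeout are assumptions for illustration, and nothing in this thread confirms how Kaktwoos-cl synchronizes.

```python
import time


def wait_without_spinning(is_complete, poll_interval_s=0.001, timeout_s=10.0):
    """Poll is_complete() and sleep between polls; return the number of sleeps."""
    deadline = time.monotonic() + timeout_s
    polls = 0
    while not is_complete():
        if time.monotonic() >= deadline:
            raise TimeoutError("GPU work did not finish in time")
        time.sleep(poll_interval_s)  # give the CPU back instead of busy-waiting
        polls += 1
    return polls


# Stand-in for a GPU event that reports "complete" on the third check.
status = iter([False, False, True])
print(wait_without_spinning(lambda: next(status)))
```

The trade-off is latency: a longer poll interval frees more CPU time but delays noticing that the GPU has finished.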
ID: 577
marmot

Joined: 9 Dec 20
Posts: 7
Credit: 1,582,604
RAC: 0
Message 579 - Posted: 9 May 2021, 6:55:40 UTC - in response to Message 577.  
Last modified: 9 May 2021, 7:23:32 UTC

Keith Myers wrote (quoting Hy):
"In general the Nvidia driver's interaction/usage of OpenCL is a nightmare as it happily locks a whole thread and does nothing inside the locked thread aside from syncing stuff. AMD and Intel GPUs don't have this problem."

Keith Myers wrote:
"Just need to implement non-blocking sync in the application so that the kernel doesn't spin on the wait loop. It's in the OpenCL specification. Works well for CUDA applications. Worked quite well for GPUGrid and Seti CUDA apps."

You would think NVidia coders would have gotten that feedback and implemented it after many years of OpenCL implementation.

But maybe not. How do you get that suggestion to them in a format they will actually notice and implement?
ID: 579
marmot

Joined: 9 Dec 20
Posts: 7
Credit: 1,582,604
RAC: 0
Message 580 - Posted: 9 May 2021, 7:02:45 UTC - in response to Message 572.  

Hy wrote:
"Note that Kaktwoos is only designed to run one instance per GPU, so if you mean "8X" to be "8 kaktwoos-cl tasks on 1 GTX 1060", then there's no guarantee of any good outcomes / usable systems / efficient calculations from that"


The data collected shows that running 8x WU's is decidedly more efficient than 2 WU for my 1060 3GB.
Concurrent WU's   credit/sec   CPU/GPU   credit/(GPU+CPU s)
8                 1.143        1.00000   0.572
2                 0.656        0.99723   0.328



I'll complete the 6 WU, 4 WU and the 1 WU scenarios. Likely 6 WU will be the optimal for my setup.
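The figures above can be cross-checked with a little arithmetic. The sketch assumes that "credit/sec" means credit per GPU second and that "CPU/GPU" is the ratio of CPU time to GPU time. Neither definition is stated in the post, but together they reproduce the third column to within rounding. The function name is hypothetical.

```python
def credit_per_combined_second(credit_per_gpu_s: float, cpu_gpu_ratio: float) -> float:
    """Credit per (GPU + CPU) second: each GPU second also costs `ratio` CPU seconds."""
    return credit_per_gpu_s / (1.0 + cpu_gpu_ratio)


# (credit/sec, CPU/GPU) for each number of concurrent WU's, copied from the post.
runs = {8: (1.143, 1.00000), 2: (0.656, 0.99723)}

for n_wu, (credit_rate, ratio) in runs.items():
    combined = credit_per_combined_second(credit_rate, ratio)
    print(f"{n_wu} WU's: {combined:.4f} credit/(GPU+CPU s)")

# How much more credit per second the 8-WU run earned than the 2-WU run.
speedup = runs[8][0] / runs[2][0]
print(f"8 WU's vs 2 WU's: {speedup:.2f}x credit/sec")
```

On these numbers the 8-WU run earns roughly 1.74 times the credit per second of the 2-WU run, with an almost identical CPU/GPU ratio.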
ID: 580
