Posts by marmot

1) Message boards : Number crunching : AMD, Intel and Nvidia GPU requirements (Message 581)
Posted 9 May 2021 by marmot
Post:
Also, the Intel/AMD OpenCL 1.2 support, is that a hardware issue or a driver issue?
OpenCL is hardware. You cannot update it with drivers on most systems. Some AMD FirePro can be updated from 1.2 to 2.0 under Linux with specific drivers, but that's because the hardware was already 2.0 capable.


My R9 280x, under Windows 10, had OpenCL 2.0 compliance under an older driver. The latest driver forced it back to 1.2.

I'll have to rediscover which of the many downloaded driver versions (18.x.x -> 19.x.x) gave me the OpenCL 2.0 capability, or whether I needed to do something special to get it turned on.

I know Linux users had to resort to special drivers to get 2.0 for that series but some of the Windows versions had it.
2) Message boards : Number crunching : Kaktwoos 2.13 announcement / testing (Message 580)
Posted 9 May 2021 by marmot
Post:
Note that Kaktwoos is only designed to run one instance per GPU, so if you mean "8X" to be "8 kaktwoos-cl tasks on 1 GTX 1060", then there's no guarantee of any good outcomes / usable systems / efficient calculations from that


The data collected shows that running 8x WU's is decidedly more efficient than 2 WU for my 1060 3GB.
Scenario   credit/sec   CPU/GPU   credit/(GPU+CPU s)
8 WU's     1.143        1.00000   0.572
2 WU's     0.656        0.99723   0.328
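A rough sketch of how the two rows above relate to each other. It assumes (from the column labels only, not from any stated formula) that CPU/GPU is CPU seconds divided by GPU seconds and that the last column is credit divided by the sum of GPU and CPU seconds. The function name `combined_rate` is made up for illustration.

```python
def combined_rate(credit_per_sec, cpu_gpu_ratio):
    """Credit per (GPU + CPU) second, assuming credit_per_sec is credit
    per GPU second and cpu_gpu_ratio is CPU seconds per GPU second."""
    return credit_per_sec / (1.0 + cpu_gpu_ratio)

# Figures copied from the post: (credit/sec, CPU/GPU)
scenarios = {"8 WU's": (1.143, 1.00000), "2 WU's": (0.656, 0.99723)}

for name, (rate, ratio) in scenarios.items():
    print(f"{name}: credit/(GPU+CPU s) = {combined_rate(rate, ratio):.4f}")

# Relative throughput of the 8 WU run compared with the 2 WU run
speedup = scenarios["8 WU's"][0] / scenarios["2 WU's"][0]
print(f"8 WU's vs 2 WU's credit/sec ratio: {speedup:.3f}")
```

Under those assumptions the recomputed last column agrees with the posted values to within rounding, and the 8 WU run earns roughly 1.74 times the credit per second of the 2 WU run.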



I'll complete the 6 WU, 4 WU and 1 WU scenarios. Likely 6 WU will be optimal for my setup.
3) Message boards : Number crunching : Kaktwoos 2.13 announcement / testing (Message 579)
Posted 9 May 2021 by marmot
Post:
In general the Nvidia driver's interaction/usage of OpenCL is a nightmare as it happily locks a whole thread and does nothing inside the locked thread aside from syncing stuff. AMD and Intel GPUs don't have this problem.

The application just needs to implement non-blocking sync so that the host thread doesn't spin in the wait loop. It's in the OpenCL specification. Works well for CUDA applications. Worked quite well for GPUGrid and Seti CUDA apps.
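As an illustration of the idea only: in OpenCL host code this would typically mean flushing the queue and then either polling the event status with clGetEventInfo between short sleeps, or registering a completion callback with clSetEventCallback, instead of sitting in a blocking wait. The Python sketch below is an analogy, not OpenCL code. A `threading.Event` set by a timer stands in for the GPU work, and `wait_with_sleep` is a hypothetical helper name. Whether the Nvidia driver stops spinning internally when an application does this is not something the thread confirms.

```python
import threading
import time

def wait_with_sleep(is_done, poll_interval_s=0.005, timeout_s=5.0):
    """Poll is_done() and sleep between polls instead of busy-waiting.
    Returns the number of polls that were needed."""
    deadline = time.monotonic() + timeout_s
    polls = 0
    while True:
        polls += 1
        if is_done():
            return polls
        if time.monotonic() >= deadline:
            raise TimeoutError("work did not finish in time")
        time.sleep(poll_interval_s)  # yields the CPU to other processes

# Stand-in for a GPU kernel: an event that a timer sets after 0.2 s
finished = threading.Event()
timer = threading.Timer(0.2, finished.set)
timer.start()

cpu_before = time.process_time()
polls = wait_with_sleep(finished.is_set)
cpu_used = time.process_time() - cpu_before
timer.join()

print(f"polls={polls}, CPU seconds used while waiting={cpu_used:.4f}")
```

The trade-off is latency: completion is noticed up to one poll interval late, in exchange for the waiting thread using almost no CPU time.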



You would think NVidia coders would have gotten that feedback and implemented it after many years of OpenCL implementation.

But maybe not. How do you get that suggestion to them in a format they will actually notice and implement?
4) Message boards : Number crunching : Why so much CPU usage? (Message 575)
Posted 7 May 2021 by marmot
Post:
Answer from dev Hy on another thread:

In general the Nvidia driver's interaction/usage of OpenCL is a nightmare as it happily locks a whole thread and does nothing inside the locked thread aside from syncing stuff. AMD and Intel GPUs don't have this problem.

Any way known to tame this behavior with custom drivers, particular driver versions, or other techniques?
BOINC's CPU usage setting in the app_config.xml is completely ignored.
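For reference, a minimal sketch that generates the kind of app_config.xml being discussed, using BOINC's documented `gpu_versions` block with `gpu_usage` and `cpu_usage`. The app name "kaktwoos" is a placeholder assumption (the real short name is whatever the project lists in client_state.xml), and `build_app_config` is a hypothetical helper. As far as I understand it, `cpu_usage` only tells the BOINC client how much CPU to budget per task when scheduling; it does not throttle the running process, which would be consistent with the setting appearing to be ignored.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name, tasks_per_gpu, cpu_per_task):
    """Return app_config.xml text that asks BOINC to run tasks_per_gpu
    tasks on each GPU and to budget cpu_per_task CPUs for each task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name  # placeholder app name
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage is the fraction of a GPU each task claims: 1/8 -> 8 tasks per GPU
    ET.SubElement(gpu, "gpu_usage").text = str(1 / tasks_per_gpu)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_per_task)
    return ET.tostring(root, encoding="unicode")

xml_text = build_app_config("kaktwoos", tasks_per_gpu=8, cpu_per_task=1.0)
print(xml_text)
```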

The only solution I currently have is to starve Kaktwoos of CPU by using Process Hacker to lower the Kaktwoos process to IDLE priority and raise the competing CPU project's WU process to NORMAL or BELOW NORMAL, giving it precedence.
I'll have to run tests (probably with Asteroids@Home CPU in competition) to see how much of a performance hit the NVidia WU's take and let you know my test results.


At some point in 2.13 also I swapped the kernel out, so your GTX 1060 should be utilized like 5-10% more than before, because reasons. RTX GPUs and GTX 1600 series cards will use the same Nvidia kernel found in 2.11/2.12 so everyone can get boosted speed, or higher avg speed of seed calculations without weaker or older cards missing out. Note that Kaktwoos is only designed to run one instance per GPU, so if you mean "8X" to be "8 kaktwoos-cl tasks on 1 GTX 1060", then there's no guarantee of any good outcomes / usable systems / efficient calculations from that. It's a resource issue and more isn't better, unlike some other GPU projects.

Yes, I was attempting 8 WU's at once and was dropping down 2 at a time as a performance/efficiency test.
From what you say then either 1 or possibly 2 WU's would be most efficient.

I'll skip the tests for 6 WU and 4 WU then (8x WU is already completed and makes a baseline), thanks.
5) Message boards : Number crunching : Kaktwoos 2.13 announcement / testing (Message 574)
Posted 7 May 2021 by marmot
Post:
Uh, that's not a server anyone should really be using for anything, outside of MC@Home devs logging in to run tasks for like 2hrs to make sure nothing's broken and us then forgetting to kill it, so people don't have... issues or complaints

Also the version of 2.13 on the server has a few coding issues that are resolved, but literally my only advice is to please get any wuprop people off the testbed and onto either our real MC@Home server or another project if they'd like to run tests or get hours racked in.

WUProp users include other project managers and developers.
We have extensive skills in beta testing and stress testing and a wide variety of configurations and you could benefit from our test runs.

My report was for your benefit, to assist in your troubleshooting needs. There was no complaint.
Those particular WU's were behaving oddly (glad you caught the issues). I'm not concerned that they ended in an invalid state; it was a possible clue for the beta testing process.
For me it's about helping projects beta test WU's and getting project hours as a badge of honor.

Aside from that, there's no support for the boincboi_testbed, pretend it doesn't exist (unless you want to heat your house up, I suppose)

I wasn't expecting support; just providing any clues from my test runs that would be helpful to you.

Yes, my entire house is heated by computers in the winter/fall/spring, and only a laptop and the downclocked 2700x will be on this summer. Prolly not even a single GPU project now that the 550 died (5 watts usage on that Lexa mobile GPU mounted on a card by Dell; so sad).
6) Message boards : Number crunching : Kaktwoos 2.13 announcement / testing (Message 571)
Posted 6 May 2021 by marmot
Post:
Completed 8 of the test WU's on boincboi_testbed_mch.

They all ended invalid and all wingunits ended invalid and all other users I checked showed no valids.

When running 8x of version 2.12 on my GTX 1060 on the Windows 10 (2017) box, the OS is responsive, interrupts hover at 1.2% CPU, and the system (kernel) uses about 2% CPU.

When running 8x of version 2.13 on the same box, the MSI Afterburner GUI refuses to respond when trying to interact with the 1060 GPU.
The process manager shows interrupts using 4% CPU and the kernel using 12% CPU, and the entire OS is a bit sluggish.

Is there an issue in the driver interaction that makes the interrupts so intense?
7) Message boards : Number crunching : Why so much CPU usage? (Message 570)
Posted 6 May 2021 by marmot
Post:
This is my RX 550 running at 25% power rating, calculating v2.10 OpenCL, 1 WU at a time:
Task: 3826495
WU: 1969904
2 Feb 2021, 14:35:04 UTC
3 Feb 2021, 22:40:15 UTC
Completed and validated
Run: 20716.73
CPU: 7.25
Credit: 2939.68
Multichunk population seed cactus stacking v2.10 (opencl_amd)
windows_x86_64


This is my NVidia 1060 3GB running at 40% power usage calculating on ver 2.12 OpenCL 8 WU's at a time.
Task: 4582705
WU: 2321899
6 May 2021, 10:33:03 UTC
6 May 2021, 18:50:02 UTC
Completed and validated
Run: 20835.89
CPU: 20835.89
Credit: 3753.25
Multichunk population seed cactus stacking v2.12 (opencl_nvidia)
windows_x86_64



I do not remember other projects' NVidia OpenCL WU's requiring 100% of a CPU thread.
CUDA WU's never use this much CPU.
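To put numbers on the gap between the two task records above, here is a small sketch that computes the CPU share of run time and the CPU seconds spent per credit. It assumes "Run" is elapsed seconds and "CPU" is CPU seconds for that one task; `cpu_cost` is a hypothetical helper name. It looks only at CPU cost per task and does not try to account for the 8 tasks sharing the 1060.

```python
def cpu_cost(run_s, cpu_s, credit):
    """Return (CPU seconds per run second, CPU seconds per credit)."""
    return cpu_s / run_s, cpu_s / credit

# Values copied from the two task records in the post
tasks = {
    "RX 550, v2.10 opencl_amd": (20716.73, 7.25, 2939.68),
    "GTX 1060, v2.12 opencl_nvidia": (20835.89, 20835.89, 3753.25),
}

results = {name: cpu_cost(*values) for name, values in tasks.items()}
for name, (cpu_share, cpu_per_credit) in results.items():
    print(f"{name}: CPU share {cpu_share:.2%}, "
          f"{cpu_per_credit:.4f} CPU s per credit")
```

On these two records the AMD task uses about 0.03% of a CPU thread while the Nvidia task uses a full thread, which works out to roughly 0.0025 versus 5.55 CPU seconds per credit.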

The CPU's are needed for other projects, and the AMD in this box is 1:3 FP64, so it is needed for MilkyWay@Home.
(The RX 550 died.... energy efficient, cheap and worked its ass off. I will miss you buddy.)