Message boards :
News :
We are live
Author | Message |
---|---|
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
Meanwhile, my AMD RX 5700 XT finished its first task. Run time: 40 min 36 sec (2,436.47 seconds). |
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
Just tested in preparation for installing the 20.5.1 drivers. The current task was at 27 minutes. Stopped BOINC, waited a minute, started BOINC again, and the task started from the beginning. It would be nice to have checkpointing, especially for those with slower GPUs. |
Joined: 24 Jun 20 Posts: 25 Credit: 448,784,541 RAC: 24,476 |
My Windows machines seem to be working fine now. However, not so much with my Linux machines. I have two identical machines: hardware, GPUs, and OS are all the same. One can get tasks, but they never complete. The other can't get tasks at all. It asks for tasks, but nothing is sent. There is nothing in the event log explaining why. When I look at other Linux machines attached to this project, I am not seeing much success either. Edit: It looks like one of my Linux machines does not have OpenCL installed. I assume that is why it can't get work. I will do that now. Reno, NV Team: SETI.USA |
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
"The other can't get tasks at all. It asks for tasks, but nothing is sent. There is nothing in the event log explaining why." Do you have sched_op_debug enabled for more information? Nice to see you're still around. :) Edit: "Edit: It looks like one of my Linux machines does not have OpenCL installed. I assume that is why it can't get work. I will do that now." That would probably do it. |
Joined: 24 Jun 20 Posts: 25 Credit: 448,784,541 RAC: 24,476 |
Hi Jord! Yeah, sched_op_debug is enabled. For whatever reason, it doesn't say anything when the issue is the app needs OpenCL, but it's not installed. In any case, that fixed the work fetch problem with the one machine. For the issue of not completing tasks, maybe that was fixed after all. I should know in an hour or two. Reno, NV Team: SETI.USA |
Joined: 24 Jun 20 Posts: 25 Credit: 448,784,541 RAC: 24,476 |
|
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
"For whatever reason, it doesn't say anything when the issue is the app needs OpenCL, but it's not installed." I don't think it's up to BOINC to tell you about that. You do send information about your whole system to the project with the sched_request*.xml file, so it would be nice if they told you that you missed something required. But they'll have to program such a response. And I gather these guys just started with BOINC, so baby steps. ;-) "For the issue of not completing tasks, maybe that was fixed after all. I should know in an hour or two." Too bad you can't use that VII, it would be interesting to see how it compares to a 5700 XT. One wingman I was paired against (but who apparently detached and reattached, so his task is abandoned) runs an NVIDIA Tesla P100-PCIE-16GB (4095MB) (looks like he has 32-bit driver problems), which does tasks in half the time my 5700 XT does. |
Joined: 24 Jun 20 Posts: 25 Credit: 448,784,541 RAC: 24,476 |
It looks like this project has a problem similar to the one SRBase has. On a machine with multiple GPUs, all the tasks run at the same time on just one of the GPUs, and the rest of the GPUs are idle. I am seeing this on all my machines, both Windows and Linux. Reno, NV Team: SETI.USA |
Joined: 26 Jun 20 Posts: 25 Credit: 123,735,290 RAC: 182 |
Ditto, except I have two types of GPUs in my machine (2 NVIDIA, 1 AMD). I can get NVIDIA and AMD WUs to run, but the NVIDIA WUs only seem to run on one of the cards. Also, the project doesn't correctly identify the machine resources: it thinks I have 2 GTX 1060s, when in reality I have 1 GTX 1060 and 1 GTX 1070 (plus the Radeon WX5100). |
Joined: 24 Jun 20 Posts: 25 Credit: 448,784,541 RAC: 24,476 |
Until the app gets fixed, for those with multiple GPU systems, add this to your cc_config.xml. That way you can free up the rest of your GPUs to run other projects. You will need to quit/restart BOINC for the changes to take effect: <exclude_gpu> <url>project_URL</url> [<device_num>N</device_num>] [<type>NVIDIA|ATI|intel_gpu</type>] [<app>appname</app>] </exclude_gpu> It runs on GPU "0" of each type by default. So exclude GPUs 1+ for each type of additional GPUs you have in your machine. For example, here are the entries I have on a machine with three Nvidia GPUs: <exclude_gpu> <url>https://minecraftathome.com/minecrafthome/</url> <type>NVIDIA</type> <device_num>1</device_num> <app>kaktwoos</app> </exclude_gpu> <exclude_gpu> <url>https://minecraftathome.com/minecrafthome/</url> <type>NVIDIA</type> <device_num>2</device_num> <app>kaktwoos</app> </exclude_gpu> FYI, you can see which GPU is # 0, 1, 2, etc., by looking at your event log at start up. This is what mine shows for this same machine: 6/25/2020 6:34:42 PM CUDA: NVIDIA GPU 0: GeForce GTX 1660 Ti (driver version 440.10, CUDA version 10.2, compute capability 7.5, 4096MB, 3972MB available, 11336 GFLOPS peak) 6/25/2020 6:34:42 PM CUDA: NVIDIA GPU 1: GeForce GTX 1660 Ti (driver version 440.10, CUDA version 10.2, compute capability 7.5, 4096MB, 3972MB available, 11336 GFLOPS peak) 6/25/2020 6:34:42 PM CUDA: NVIDIA GPU 2: Quadro K2000 (driver version 440.10, CUDA version 10.2, compute capability 3.0, 2000MB, 1973MB available, 733 GFLOPS peak) 6/25/2020 6:34:42 PM OpenCL: NVIDIA GPU 0: GeForce GTX 1660 Ti (driver version 440.100, device version OpenCL 1.2 CUDA, 5942MB, 3972MB available, 11336 GFLOPS peak) 6/25/2020 6:34:42 PM OpenCL: NVIDIA GPU 1: GeForce GTX 1660 Ti (driver version 440.100, device version OpenCL 1.2 CUDA, 5945MB, 3972MB available, 11336 GFLOPS peak) 6/25/2020 6:34:42 PM OpenCL: NVIDIA GPU 2: Quadro K2000 (driver version 440.100, device version OpenCL 1.2 CUDA, 2000MB, 1973MB available, 733 GFLOPS peak) Reno, NV Team: SETI.USA |
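For machines with many GPUs, writing one `<exclude_gpu>` entry per device by hand gets tedious. The following is only a rough sketch of how the entries described above could be generated; the helper names `exclude_gpu_entries` and `render` are made up for illustration, and it assumes the app really does run only on device 0 of each type, as reported in this thread.

```python
import xml.etree.ElementTree as ET


def exclude_gpu_entries(project_url, gpu_type, gpu_count, app=None):
    """Build one <exclude_gpu> element for every device except device 0.

    Hypothetical helper: the tag names come from the post above; the
    function itself is not part of BOINC.
    """
    entries = []
    for device_num in range(1, gpu_count):
        entry = ET.Element("exclude_gpu")
        ET.SubElement(entry, "url").text = project_url
        ET.SubElement(entry, "type").text = gpu_type
        ET.SubElement(entry, "device_num").text = str(device_num)
        if app is not None:
            ET.SubElement(entry, "app").text = app
        entries.append(entry)
    return entries


def render(entries):
    """Serialize the entries, one per line, for pasting inside <options>."""
    return "\n".join(ET.tostring(e, encoding="unicode") for e in entries)


if __name__ == "__main__":
    # Three NVIDIA GPUs: devices 1 and 2 are excluded, device 0 stays usable.
    print(render(exclude_gpu_entries(
        "https://minecraftathome.com/minecrafthome/", "NVIDIA", 3, "kaktwoos")))
```

The printed entries still have to be pasted into cc_config.xml by hand, and BOINC has to be restarted afterwards.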
Joined: 24 Jun 20 Posts: 25 Credit: 448,784,541 RAC: 24,476 |
"Too bad you can't use that VII, it would be interesting to see how it compares to a 5700 XT." FWIW, now that I have only one task running on my 2080 Ti at a time, it takes 1960 seconds. Admin: It would be better if you turned on the "number crunching" sub-forum, so that things like this could be discussed without clogging up the news sub-forum. Reno, NV Team: SETI.USA |
Joined: 24 Jun 20 Posts: 9 Credit: 47,348,036 RAC: 0 |
"Until the app gets fixed, for those with multiple GPU systems, add this to your cc_config.xml. That way you can free up the rest of your GPUs to run other projects. You will need to quit/restart BOINC for the changes to take effect:" Does that actually work? This should, but doesn't: <ignore_nvidia_dev>0</ignore_nvidia_dev> This, from init_data in the slot folder: <gpu_type>NVIDIA</gpu_type> <gpu_device_num>1</gpu_device_num> <gpu_opencl_dev_index>1</gpu_opencl_dev_index> <gpu_usage>1.000000</gpu_usage> <ncpus>0.997799</ncpus> seems to indicate it should be running on the 2nd GPU, but it isn't. |
Joined: 25 Jun 20 Posts: 2 Credit: 9,583,202 RAC: 0 |
It seems the BM shows the correct progress and remaining time now. Great work, thanks! And I agree, @Zombie67, a Number Crunching forum would be good for discussing such things ;-) |
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
"Admin: It would be better if you turned on the 'number crunching' sub-forum, so that things like this could be discussed without clogging up the news sub-forum." In case he wonders how: rerun the html/ops/create_forums.php script after enabling catid 2: https://github.com/BOINC/boinc/blob/master/html/ops/create_forums.php And yes, I know chip's asking for BOINC experts to come forward via Discord, but I'm not so much of a talker. More of a typer. :) |
Joined: 26 Jun 20 Posts: 25 Credit: 123,735,290 RAC: 182 |
well zombie67, i thought that would work and had already put that in my cc_config, but it didn't. message in log at start up: 6/26/2020 4:53:08 AM | | Unrecognized tag in cc_config.xml: <exclude_gpu> my cc_config: <cc_config> <exclude_gpu> <url>https://minecraftathome.com/minecrafthome/</url> <type>NVIDIA</type> <device_num>1</device_num> <app>kaktwoos</app> </exclude_gpu> <options> <use_all_gpus>1</use_all_gpus> <skip_cpu_benchmarks>1</skip_cpu_benchmarks> <report_results_immediately>1</report_results_immediately> </options> </cc_config> What did i mess up? :) (Running version 7.16.5 of BOINC) |
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
Ah I see it... your cc_config is set up wrong. Try this: <cc_config> <options> <exclude_gpu> <url>https://minecraftathome.com/minecrafthome/</url> <type>NVIDIA</type> <device_num>1</device_num> <app>kaktwoos</app> </exclude_gpu> <use_all_gpus>1</use_all_gpus> <skip_cpu_benchmarks>1</skip_cpu_benchmarks> <report_results_immediately>1</report_results_immediately> </options> </cc_config> For reference: https://boinc.berkeley.edu/wiki/Client_configuration#Options, exclude_gpu is an option, so should be inside the <options></options> tags in cc_config.xml And btw, for this project report results immediately isn't necessary as the deadline is 24 hours so any tasks done will be reported immediately automatically. |
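The placement mistake above is easy to make, so a quick check before restarting BOINC can help. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `check_exclude_gpu_placement` is made up for illustration, and it looks only for the one problem discussed here (an `<exclude_gpu>` sitting directly under `<cc_config>` instead of inside `<options>`).

```python
import xml.etree.ElementTree as ET


def check_exclude_gpu_placement(config_text):
    """Return (correct, misplaced) counts of <exclude_gpu> elements.

    correct   = entries nested inside <options>
    misplaced = entries that are direct children of <cc_config>
    """
    root = ET.fromstring(config_text.strip())
    correct = len(root.findall("options/exclude_gpu"))
    misplaced = len(root.findall("exclude_gpu"))
    return correct, misplaced


if __name__ == "__main__":
    # Assumes cc_config.xml is in the current directory (an assumption;
    # the real file lives in the BOINC data directory).
    with open("cc_config.xml", encoding="utf-8") as handle:
        ok, bad = check_exclude_gpu_placement(handle.read())
    print(f"inside <options>: {ok}, outside <options>: {bad}")
```

Any non-zero "outside" count corresponds to the "Unrecognized tag in cc_config.xml" message quoted earlier in the thread.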
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
First task to validate against another user's Nvidia GT 1030 fetched me a validate error: https://minecraftathome.com/minecrafthome/workunit.php?wuid=1259872. So let's see what it will do against an RX 580 later today: https://minecraftathome.com/minecrafthome/workunit.php?wuid=1260373 I hope that validate errors aren't a precedent for the rest of them. It also doesn't help that when the wingman returns an error, or detaches/reattaches and thus abandons work, the remaining tasks stay unsent: https://minecraftathome.com/minecrafthome/workunit.php?wuid=1259732 I think I will only get validate errors, because that GT 1030 had a lot more info in its stderr.txt than my RX 5700 XT had: <core_client_version>7.14.2</core_client_version> <![CDATA[ <stderr_txt> 19:09:40 (10408): wrapper (7.5.26014): starting 19:09:40 (10408): wrapper: running ../../projects/minecraftathome.com_minecrafthome/kaktwoos_1.12_opencl ( --start 11400000000000 --end 11500000000000 --chunkseed 9567961692053 --neighbor1 856 --neighbor2 344 --neighbor3 840 --diagonalindex 0 --cactusheight 12) Received work unit: 9567961692053 Data: n1: 856, n2: 344, n3: 840, di: 0, ch: 12 Found seed: 5861617559634779173, 182644612078629, height: 20 5861617559634779173 Found seed: 6001229513563318277, 183010092132357, height: 20 6001229513563318277 Found seed: 6149848403176123013, 183112001710725, height: 21 6149848403176123013 Found seed: 5861618302365947205, 183387343246661, height: 20 5861618302365947205 Found seed: 6149848715775767557, 183424601355269, height: 21 6149848715775767557 Speed: 5.95m/s Done Processed 100000000000 seeds in 16798.995819 seconds Found seeds: 5861617559634779173 6001229513563318277 6149848403176123013 5861618302365947205 6149848715775767557 23:51:02 (10408): client exited; CPU time 13158.495873 23:51:02 (10408): called boinc_finish(0) </stderr_txt> ]]> vs <core_client_version>7.16.7</core_client_version> <![CDATA[ <stderr_txt> 00:18:42 (2832): wrapper (7.7.26016): starting 00:18:42 (2832): wrapper: running ../../projects/minecraftathome.com_minecrafthome/kaktwoos_1.12_opencl_amd.exe ( --start 11400000000000 --end 11500000000000 --chunkseed 9567961692053 --neighbor1 856 --neighbor2 344 --neighbor3 840 --diagonalindex 0 --cactusheight 12) 00:46:25 (9924): wrapper (7.7.26016): starting 00:46:25 (9924): wrapper: running ../../projects/minecraftathome.com_minecrafthome/kaktwoos_1.12_opencl_amd.exe ( --start 11400000000000 --end 11500000000000 --chunkseed 9567961692053 --neighbor1 856 --neighbor2 344 --neighbor3 840 --diagonalindex 0 --cactusheight 12) 00:56:59 (11396): wrapper (7.7.26016): starting 00:56:59 (11396): wrapper: running ../../projects/minecraftathome.com_minecrafthome/kaktwoos_1.12_opencl_amd.exe ( --start 11400000000000 --end 11500000000000 --chunkseed 9567961692053 --neighbor1 856 --neighbor2 344 --neighbor3 840 --diagonalindex 0 --cactusheight 12) 01:36:57 (11396): client exited; CPU time 10.531250 01:36:57 (11396): called boinc_finish(0) </stderr_txt> ]]> So just because my tasks end well doesn't mean it's doing any useful work. I also see for Nvidia users that their CPU time is high, whereas mine is 10 seconds. The kaktwoos application doesn't use any CPU in Task Manager Details. So it would seem that the application doesn't run correctly on my system. It does on that RX 580 I pointed out earlier, and there also the CPU time is high. So isn't the OpenCL app optimized for Navi GPUs? |
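The difference between the two stderr outputs above can be checked mechanically. The sketch below is an assumption-laden illustration: the function name `summarize_stderr` is invented, and the regular expressions rely on the exact "Found seed:" and "Processed N seeds in S seconds" wording seen in the GT 1030 output, which may change between app versions.

```python
import re

# Patterns copied from the wording in the stderr output quoted above.
FOUND = re.compile(r"Found seed: (\d+)")
PROCESSED = re.compile(r"Processed (\d+) seeds in ([\d.]+) seconds")


def summarize_stderr(text):
    """Extract the found seeds and the throughput from a task's stderr text.

    seeds_per_second is None when there is no "Processed ..." line, which
    is what the RX 5700 XT output in this thread looks like.
    """
    seeds = FOUND.findall(text)
    match = PROCESSED.search(text)
    rate = None
    if match:
        rate = int(match.group(1)) / float(match.group(2))
    return {"found": seeds, "seeds_per_second": rate}


if __name__ == "__main__":
    sample = ("Found seed: 5861617559634779173, 182644612078629, height: 20 "
              "Speed: 5.95m/s Done "
              "Processed 100000000000 seeds in 16798.995819 seconds")
    summary = summarize_stderr(sample)
    print(len(summary["found"]), "seed(s),",
          round(summary["seeds_per_second"] / 1e6, 2), "million seeds/s")
```

For the GT 1030 task, 100000000000 seeds in 16798.995819 seconds works out to roughly 5.95 million seeds per second, which agrees with the "Speed: 5.95m/s" line in its log. The RX 5700 XT output contains neither line, so the summary comes back empty.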
Joined: 26 Jun 20 Posts: 25 Credit: 123,735,290 RAC: 182 |
Thank you, Jord. Figures that's what it was. Thought about trying that after seeing the message in the log, but I like "talking" to people, so... :) |
Joined: 24 Jun 20 Posts: 85 Credit: 207,156 RAC: 0 |
Yup, I have validate errors for all tasks so far, so that means this cannot be run on an RX 5700 XT. Would be nice to have someone else with such a card, or at least a 5000-series card, chime in to see if it's my system or not. |
Joined: 26 Jun 20 Posts: 4 Credit: 1,530,361 RAC: 0 |
Anyone else running a GTX 1080 and not getting any tasks? Event log says 0 tasks sent. |