Minecraft NVIDIA GPU client engages on already occupied GPU

Michael H.W. Weber
Joined: 27 Jun 20
Posts: 9
Credit: 788,006
RAC: 0
Message 123 - Posted: 1 Jul 2020, 16:38:36 UTC
Last modified: 1 Jul 2020, 16:39:34 UTC

I was baffled to find that the Minecraft@Home NVIDIA GPU client appears to add itself as a second task on the SAME NVIDIA GPU on which a GPUGRID distributed computing project task is already running.
Please correct this, as it might cause serious crashes (depending on the RAM requirements of the paired projects) and delays tasks.

In this case, both tasks ran in parallel, and I aborted the Minecraft one.

Michael.
President of Rechenkraft.net - This world's first and largest distributed computing organization. We make those things possible that supercomputers don't.
ID: 123 · Report as offensive
Profile nenym

Joined: 27 Jun 20
Posts: 2
Credit: 10,037,530
RAC: 0
Message 124 - Posted: 1 Jul 2020, 17:08:03 UTC

The application is set as NCI (non-CPU-intensive); please correct it. Setting it as CPU-intensive via app_config.xml has no effect.
ID: 124 · Report as offensive
Profile Hy
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 136 - Posted: 1 Jul 2020, 22:04:23 UTC - in response to Message 124.  
Last modified: 1 Jul 2020, 22:51:30 UTC

Chip, the project lead, has made these changes.

Additionally, I have coded up fixes to kaktwoos-cl so that it respects each individual GPU / vendor better.

So, Nvidia GPU tasks will only run on Nvidia GPUs 1, 2, 3, 4, etc. The same goes for AMD and Intel.

BOINC's documentation is very confusing, and there are contradictions between the source code examples they reference and what (the many) wikis say to do to implement integration properly.

All of this will be pushed in "kaktwoos-2.0.1" or so in a day if I have enough systems running multi-vendor tasks.
ID: 136 · Report as offensive
Michael H.W. Weber
Joined: 27 Jun 20
Posts: 9
Credit: 788,006
RAC: 0
Message 139 - Posted: 1 Jul 2020, 23:28:36 UTC - in response to Message 136.  

BOINC's documentation is very confusing and there are contradictions between the source code examples they reference, and what (the many) wikis say to do to implement integration properly.

...please report this to David Anderson so that others won't stumble over the same issues in the future.

Michael.
President of Rechenkraft.net - This world's first and largest distributed computing organization. We make those things possible that supercomputers don't.
ID: 139 · Report as offensive
Penguin

Joined: 24 Jun 20
Posts: 5
Credit: 13,163,758
RAC: 0
Message 141 - Posted: 2 Jul 2020, 0:31:31 UTC

Was this corrected? I still have simultaneous tasks running on device 0 from minecraftathome and other projects...


It also appears I cannot get tasks for devices 1 or 2 on my 3-GPU boxes running Linux.

The client requests work, but 0 tasks are sent for GPUs other than device 0.
ID: 141 · Report as offensive
Profile Hy
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 143 - Posted: 2 Jul 2020, 2:42:34 UTC - in response to Message 139.  
Last modified: 2 Jul 2020, 2:49:28 UTC

Yeah, just for an example:

On the topic of selecting GPU devices (their device #) to run a task on, there are "three" sources of information on how to do it (plus the OpenCL app source code):

https://boinc.berkeley.edu/trac/wiki/GPUApp This page references "boinc_get_opencl_ids()" with no link, code, or documentation for it.


While searching for boinc_get_opencl_ids() I also found:

https://boinc.berkeley.edu/trac/wiki/AppCoprocessor But this page refers to "boinc_get_init_data() to get an APP_INIT_DATA structure".

Of course, the app_init_data structure was removed from BOINC 7.0+, so the page finally refers to


https://boinc.berkeley.edu/trac/wiki/OpenclApps which has what we *may* be looking for:

int boinc_get_opencl_ids(int argc, char** argv, int type, cl_device_id* device, cl_platform_id* platform);

with the documentation that:

With BOINC Clients version 7.0.12 or later, the first 3 arguments will be ignored and all data will be taken from the init_data.xml file in the slot directory. The first 3 arguments allow this to work with older BOINC Clients. If your OpenCL app can use OpenCL-capable GPUs from any vendor, you can pass 0 for the third argument (type); if you pass a type value of 0, the type will be taken from the gpu_type field of the init_data.xml file on newer clients, but will return an error code of CL_INVALID_DEVICE_TYPE on older clients

Alright, so how about the official OpenCL app example? Well...

// IMPORTANT NOTE: production applications should always specify
// the GPU type (vendor) in the call to boinc_get_opencl_ids as
// the third argument: it must be either PROC_TYPE_NVIDIA_GPU,
// PROC_TYPE_AMD_GPU or PROC_TYPE_INTEL_GPU. This is to support
// older versions of the BOINC client which do not include the
// field in the init_data.xml file.

Note that these arguments will of course not work, as the third argument of boinc_get_opencl_ids is an *int*, meaning you must read the wiki, not the OpenCL app sample code, to properly set each GPU per vendor. Additionally:

// This sample passes -1 for the type argument to allow using
// just one sample for any GPU vendor (AMD, NVIDIA or Intel.)
// As a result, the init_data.xml file for this sample must
// specify the GPU type (vendor) and either gpu_device_num (the
// GPU's index from that vendor) or gpu_opencl_dev_index (the
// GPU's index among OpenCL-capable devices from that vendor.)

This references using -1 to get data from init_data.xml, despite the OpenCL app wiki only referencing an argument of 0, or completely ignoring the first three arguments on newer clients.
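
For anyone debugging which GPU a task was assigned to, here is a minimal sketch (not BOINC API code) that reads the three fields named in the quoted comments from an init_data.xml file. Assumptions: the sample XML and its values are hypothetical and far smaller than a real file, the file is taken to parse as well-formed XML, the helper names read_gpu_assignment and pick_device_index are made up for illustration, and preferring gpu_opencl_dev_index over gpu_device_num is my reading of the sample comments rather than confirmed client behaviour.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample. A real init_data.xml in a slot
# directory contains many more fields than the three shown here.
SAMPLE_INIT_DATA = """<app_init_data>
  <gpu_type>NVIDIA</gpu_type>
  <gpu_device_num>1</gpu_device_num>
  <gpu_opencl_dev_index>1</gpu_opencl_dev_index>
</app_init_data>"""


def read_gpu_assignment(xml_text):
    """Return the GPU-related fields as a dict; missing fields become None."""
    root = ET.fromstring(xml_text)

    def int_field(tag):
        text = root.findtext(tag)
        if text is None or not text.strip():
            return None
        return int(text)

    gpu_type = root.findtext("gpu_type")
    return {
        "gpu_type": gpu_type.strip() if gpu_type and gpu_type.strip() else None,
        "gpu_device_num": int_field("gpu_device_num"),
        "gpu_opencl_dev_index": int_field("gpu_opencl_dev_index"),
    }


def pick_device_index(assignment):
    """Pick the per-vendor device index (assumed rule: prefer the OpenCL index,
    fall back to gpu_device_num when it is missing or negative)."""
    index = assignment["gpu_opencl_dev_index"]
    if index is None or index < 0:
        index = assignment["gpu_device_num"]
    if index is None or index < 0:
        raise ValueError("init_data.xml does not name a GPU device")
    return index


if __name__ == "__main__":
    assignment = read_gpu_assignment(SAMPLE_INIT_DATA)
    print(assignment["gpu_type"], pick_device_index(assignment))
```

Running this against the init_data.xml of two slots that are active at the same time would show whether both tasks were really handed the same vendor and device index.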

In my opinion, it is extremely hard to get from "GPU App" to where the information on OpenCL is, which virtually every modern GPU and platform supports. Also, googling the get_opencl_ids code does not link or refer to that third wiki page, making it much more difficult to find.
ID: 143 · Report as offensive
Profile chip

Joined: 14 Jun 20
Posts: 78
Credit: 1,321,619
RAC: 0
Message 152 - Posted: 2 Jul 2020, 13:33:52 UTC - in response to Message 143.  

I’ll push this new change out today
ID: 152 · Report as offensive
Jord
Volunteer moderator
Help desk expert
Joined: 24 Jun 20
Posts: 85
Credit: 207,156
RAC: 0
Message 156 - Posted: 2 Jul 2020, 14:18:04 UTC - in response to Message 143.  
Last modified: 3 Jul 2020, 18:55:17 UTC

Best report this as an issue on the BOINC GitHub at https://github.com/BOINC/boinc/issues, because otherwise it'll never be changed.
I co-write and update the user-side documentation and only check the project-side documentation for language problems. But as long as no one knows about this problem with the documentation, no one will fix it.
ID: 156 · Report as offensive
Profile chip

Joined: 14 Jun 20
Posts: 78
Credit: 1,321,619
RAC: 0
Message 169 - Posted: 3 Jul 2020, 17:04:21 UTC - in response to Message 156.  

This change has been pushed. If there are any further issues, please open a new thread. Thanks.
ID: 169 · Report as offensive
