deadline is too short

Message boards : Number crunching : deadline is too short

Vitmalok

Joined: 2 Jul 20
Posts: 1
Credit: 409,979
RAC: 0
Message 194 - Posted: 5 Jul 2020, 13:20:26 UTC

My hardware is slow, so one task takes ~10 hours to complete. Because the server sends several tasks at a time, the total execution time grows to several tens of hours. However, the deadline is only 24 hours after the tasks are received, so I cannot return the results in time. One possible solution is to split the tasks into smaller ones; another is to increase the deadline. Is either of these realistic?

P.S. I use a translation program for English, so the text may sound unnatural.
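
The arithmetic in the opening post can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the batch of 4 tasks is an assumed stand-in for "several", and the host is assumed to crunch around the clock.

```python
def tasks_finished_by_deadline(task_hours, tasks_received, deadline_hours, concurrent=1):
    """Count tasks that finish before the deadline.

    Assumes the host runs `concurrent` tasks at a time, 24 hours a day,
    and that every task takes the same `task_hours` to complete.
    """
    finished = 0
    for i in range(tasks_received):
        # Task i finishes at the end of the batch it runs in.
        finish_time = (i // concurrent + 1) * task_hours
        if finish_time <= deadline_hours:
            finished += 1
    return finished


# ~10 hours per task (from the post), 4 tasks received (assumed).
print(tasks_finished_by_deadline(10, 4, 24))  # 2 of 4 make a 24-hour deadline
print(tasks_finished_by_deadline(10, 4, 72))  # all 4 make a 72-hour deadline
```

Under these assumptions, the third and fourth tasks finish at 30 and 40 hours, which is why either smaller tasks or a longer deadline would help.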
ID: 194 · Report as offensive
Odd-Rod

Joined: 25 Jun 20
Posts: 4
Credit: 3,501,110
RAC: 0
Message 195 - Posted: 5 Jul 2020, 14:36:37 UTC

I also have a slow host that doesn't return work in time.
Not meant to be a complaint, just hoping this info will be useful in changing some settings.
ID: 195 · Report as offensive
Jord
Volunteer moderator
Help desk expert
Joined: 24 Jun 20
Posts: 85
Credit: 207,156
RAC: 0
Message 196 - Posted: 5 Jul 2020, 15:33:14 UTC - in response to Message 194.  

Due to the fact that the server sends several tasks at a time...
The server sends as much work as your client asks for (and a little over). So if that's too much, you'll have to adjust your work queue request values.
But it'll fix itself as well, because BOINC will learn: when enough tasks go over the deadline, BOINC will automatically ask for less work from this project. That can take several tens or hundreds of tasks though.

My BOINC is set to ask for 0 days of work and 0.1 days additional. This results in it getting 1 task at a time. Because of the speedy connection on both sides, the downtime between one task being uploaded and the next being downloaded is mere seconds.
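
As a rough illustration of why a small work buffer yields one task at a time, here is a deliberately simplified estimate. The function name and the model are assumptions: the real BOINC client also weighs runtime estimates, resource share, and how often the host is on, so treat this as a sketch and not the client's actual work-fetch logic. The 10-hour task length is borrowed from the opening post.

```python
import math

SECONDS_PER_DAY = 86400


def estimated_tasks_requested(min_buffer_days, extra_buffer_days, task_seconds, devices=1):
    """Estimate how many tasks are needed to fill the work buffer (simplified model)."""
    buffer_seconds = (min_buffer_days + extra_buffer_days) * SECONDS_PER_DAY * devices
    # Even a tiny buffer needs at least one task to keep the device busy.
    return max(1, math.ceil(buffer_seconds / task_seconds))


# 0 days of work plus 0.1 days additional, with ~10-hour (36,000 s) tasks.
print(estimated_tasks_requested(0, 0.1, 36000))    # 1
# A hypothetical larger buffer: 1 day plus 0.5 days additional.
print(estimated_tasks_requested(1.0, 0.5, 36000))  # 4
```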
ID: 196 · Report as offensive
Odd-Rod

Joined: 25 Jun 20
Posts: 4
Credit: 3,501,110
RAC: 0
Message 197 - Posted: 5 Jul 2020, 15:52:17 UTC - in response to Message 196.  

This results in it getting 1 task at a time.


My situation is the same. It is a single work unit that doesn't complete in time.
ID: 197 · Report as offensive
Jord
Volunteer moderator
Help desk expert
Joined: 24 Jun 20
Posts: 85
Credit: 207,156
RAC: 0
Message 198 - Posted: 5 Jul 2020, 16:32:26 UTC - in response to Message 197.  
Last modified: 5 Jul 2020, 16:34:26 UTC

But not always: that GPU did complete tasks in time with the older v1.12 and v2.00 applications, just no longer with the nvidia_opencl v2.01 application.
So that's something in the application that changed. I'll leave that to Hy to explain.

The older applications allowed for the task to run way longer as well. Now you're capped at 49,470 seconds, while before it finished without problems in 55,981 and 80,314 seconds.
ID: 198 · Report as offensive
Odd-Rod

Joined: 25 Jun 20
Posts: 4
Credit: 3,501,110
RAC: 0
Message 200 - Posted: 5 Jul 2020, 16:42:56 UTC - in response to Message 198.  

The older applications allowed for the task to run way longer as well. Now you're capped at 49,470 seconds, while before it finished without problems in 55,981 and 80,314 seconds.


Indeed, I was about to paste such info when I saw your post.

From Stderr output:
<message>exceeded elapsed time limit 49469.66 (2000000.00G/40.43G)</message>
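
The numbers in that message appear to fit together: dividing the first value in the parentheses by the second gives roughly the reported limit. The sketch below parses the line and checks this. Reading the pair as "operation bound in units of 10^9 over estimated device speed in GFLOPS" is an assumption based on that ratio, and the function name is hypothetical.

```python
import re

MESSAGE = "<message>exceeded elapsed time limit 49469.66 (2000000.00G/40.43G)</message>"


def parse_time_limit(message):
    """Extract (limit_seconds, bound_g, speed_g) from the stderr line."""
    m = re.search(
        r"exceeded elapsed time limit ([\d.]+) \(([\d.]+)G/([\d.]+)G\)", message
    )
    if m is None:
        raise ValueError("not an elapsed-time-limit message")
    limit_seconds, bound_g, speed_g = (float(g) for g in m.groups())
    return limit_seconds, bound_g, speed_g


limit, bound, speed = parse_time_limit(MESSAGE)
recomputed = bound / speed  # differs slightly because 40.43 is a rounded value
print(f"reported limit: {limit:.0f} s, bound/speed: {recomputed:.0f} s")
print(f"limit in hours: {limit / 3600:.1f}")
```

The limit works out to about 13.7 hours, so the earlier runtimes of 55,981 and 80,314 seconds quoted above would both be cut off under it.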
ID: 200 · Report as offensive
Profile chip
Project administrator

Joined: 14 Jun 20
Posts: 78
Credit: 1,321,619
RAC: 0
Message 201 - Posted: 5 Jul 2020, 19:03:56 UTC

Yeah, I think we're going to experiment with multi-size, those Intel GPU jobs take their sweet time.
Either that, or I'll just simply bump the deadlines up to 72 hours or something.
ID: 201 · Report as offensive
Jord
Volunteer moderator
Help desk expert
Joined: 24 Jun 20
Posts: 85
Credit: 207,156
RAC: 0
Message 202 - Posted: 5 Jul 2020, 19:11:20 UTC - in response to Message 201.  
Last modified: 5 Jul 2020, 19:12:05 UTC

Odd-Rod's troubles are with an Nvidia GT 740M GPU and the opencl_nvidia v2.01 application.

Edit: Ah you meant the thread-starter.
ID: 202 · Report as offensive
Profile Hy
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 203 - Posted: 5 Jul 2020, 19:14:31 UTC - in response to Message 200.  
Last modified: 5 Jul 2020, 19:27:22 UTC

I mostly know why it's failing now. The main thing is that we actually fixed multi-vendor support and specifically disallowed your older Intel iGPUs because they would give major errors, hence the task aborts and timeouts now.

If the code doesn't give progress updates to BOINC now (it stalls/crashes), then there can be situations where this timeout happens.

On Kaktwoos 1.12 you were only running the code on your Nvidia GPU (one task)
On Kaktwoos 2.00 you had intel tasks, but because of the multi-vendor bug you had *Both* tasks running on your compatible Nvidia gpu (hence 80,000 seconds) at once rather than only Nvidia. Your Intel Graphics OpenCL is too old to run, and so we suggest that you reset the project in BOINC and re-get new tasks with the new enforced compatibility and fixes.

Sadly, your Intel GPU will no longer be used, Odd-Rod. Just your Nvidia GPU. It never ran code in the first place from the tests I've seen.

https://minecraftathome.com/minecrafthome/result.php?resultid=2476780 < Example of how your Nvidia GPU should be running

https://minecraftathome.com/minecrafthome/result.php?resultid=2717247 < It says "Intel", but curiously it's the exact same speed as your Nvidia GPU

https://minecraftathome.com/minecrafthome/result.php?resultid=2714458 < This Nvidia task's average time is much longer and slower than 1.12, because it had to run two tasks at once occasionally.

https://minecraftathome.com/minecrafthome/result.php?resultid=2704553 < The fixed code, *but* our enforced compatibility wasn't set correctly on your system and so it ran old jobs on the right, but incompatible GPU (Intel). See all the IntelOpenCL errors in the log

Hopefully that clears things up some. We do have some good news though. We are testing a new universal (all systems) optimization that should increase speed by 10-20% in kaktwoos.
ID: 203 · Report as offensive
ChelseaOilman

Joined: 27 Jun 20
Posts: 3
Credit: 87,272,308
RAC: 0
Message 204 - Posted: 5 Jul 2020, 19:43:10 UTC - in response to Message 203.  

Another way to decrease how many tasks your client caches is to set the weight/priority in preferences for the project. You can set it anywhere from 0 to 10,000. Default is 100. Set it to zero and the client cache to zero and you should only get 1 task at a time. I do this for PrimeGrid.
ID: 204 · Report as offensive
Odd-Rod

Joined: 25 Jun 20
Posts: 4
Credit: 3,501,110
RAC: 0
Message 205 - Posted: 5 Jul 2020, 20:18:12 UTC - in response to Message 203.  

On Kaktwoos 1.12 you were only running the code on your Nvidia GPU (one task)
On Kaktwoos 2.00 you had intel tasks, but because of the multi-vendor bug you had *Both* tasks running on your compatible Nvidia gpu (hence 80,000 seconds) at once rather than only Nvidia. Your Intel Graphics OpenCL is too old to run, and so we suggest that you reset the project in BOINC and re-get new tasks with the new enforced compatibility and fixes.

Sadly, your Intel GPU will no longer be used, Odd-Rod. Just your Nvidia GPU. It never ran code in the first place from the tests I've seen.

Thanks for the great explanation!
I had gathered that the IntelGPU wasn't going to work, but it's interesting about the 2 WUs on the NVidia. No wonder it was slow.
I have now disabled IntelGPU on that host (Darn, I was looking forward to having another project to run on it).
Since I'm about to go to bed, and I have a full day's work tomorrow (thank goodness, with our lockdown in South Africa!) I've got the host on No New Work until I'm able to monitor it.
I'll check the forum again tomorrow evening.
ID: 205 · Report as offensive
Jord
Volunteer moderator
Help desk expert
Joined: 24 Jun 20
Posts: 85
Credit: 207,156
RAC: 0
Message 206 - Posted: 5 Jul 2020, 20:40:36 UTC - in response to Message 205.  

I have now disabled IntelGPU on that host (Darn, I was looking forward to having another project to run on it).

But you can do that without having to disable the iGPU. All you have to do at this project is go to your account, then on to the Project preferences, edit those, uncheck Use Intel GPU, and save those preferences. Then you can still use the iGPU at another project that supports it, like Collatz or Einstein@Home.
ID: 206 · Report as offensive
Profile Hy
Project administrator
Project developer
Joined: 15 Jun 20
Posts: 74
Credit: 19,537,761
RAC: 0
Message 208 - Posted: 5 Jul 2020, 22:20:36 UTC

Yeah, the iGPU work disabling is on BOINC's side. The hardware can be kept active and used for other projects if you want.
ID: 208 · Report as offensive
Profile chip
Project administrator

Joined: 14 Jun 20
Posts: 78
Credit: 1,321,619
RAC: 0
Message 210 - Posted: 6 Jul 2020, 16:08:14 UTC - in response to Message 208.  
Last modified: 6 Jul 2020, 16:09:22 UTC

Yeah, the iGPU work disabling is on BOINC's side. The hardware can be kept active and used for other projects if you want


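You can exclude specific GPUs in your cc_config.xml like this (the exclude_gpu element goes inside the options section):
<cc_config>
   <options>
      <exclude_gpu>
         <url>https://minecraftathome.com/minecrafthome/</url>
         <type>intel_gpu</type>
      </exclude_gpu>
   </options>
</cc_config>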


However, I have increased the deadline on all jobs by 3x, to 72 hours. Reliable hosts, which normally turn work around in under 4 hours, will receive priority work (duplicate results, usually retries or errors from incomplete workunits) with a 36-hour deadline.
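
The deadline rule described above can be written out as a small decision function. This is not the project's server code, only a sketch of the stated policy; the function and parameter names are hypothetical, and giving retries on non-reliable hosts the normal 72 hours is an assumption.

```python
def deadline_hours(avg_turnaround_hours, is_priority_retry, reliable_threshold_hours=4.0):
    """Return the deadline in hours for a job under the described policy."""
    # "Reliable" here means the host normally returns work in under 4 hours.
    reliable = avg_turnaround_hours < reliable_threshold_hours
    if is_priority_retry and reliable:
        return 36  # priority work sent to reliable hosts
    return 72      # normal deadline, 3x the old 24 hours


print(deadline_hours(3.5, True))    # 36
print(deadline_hours(10.0, False))  # 72
```

A host like the thread starter's, at ~10 hours per task, would not count as reliable under this reading and would always see the 72-hour deadline.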
ID: 210 · Report as offensive
