Message boards :
Number crunching :
Are the size estimates accurate?
Joined: 13 Jul 20 | Posts: 3 | Credit: 128,216 | RAC: 0
Doing the math of size estimate / time to complete, I come back with a result of roughly 18 TFLOP/s or 18 TIOP/s per GPU. 1080 Tis don't run that fast. Is the size estimate accurate? And is the project doing floating-point or integer math?
Joined: 24 Jun 20 | Posts: 85 | Credit: 207,156 | RAC: 0
How do you do the math for the size estimate or time? You can see the FLOPs estimate of a task by checking the properties of a running task in BOINC Manager's Advanced view, Tasks tab. The latest I've seen was <rsc_fpops_est>30000000000000000.000000</rsc_fpops_est>. And you can see how BOINC calculates the run time for tasks by enabling rr_simulation in the Event Log options... menu.
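If you'd rather not click through the Manager, the same estimates also appear in the client's state file. A minimal sketch, assuming a `client_state.xml` in the current directory (adjust the path to your actual BOINC data directory):

```python
# Illustrative sketch: pull <rsc_fpops_est> values out of a BOINC
# client_state.xml. The default path below is an assumption -- point it
# at wherever your BOINC data directory actually lives.
import xml.etree.ElementTree as ET

def fpops_estimates(client_state_path="client_state.xml"):
    """Return {workunit_name: rsc_fpops_est} for all workunits in the file."""
    root = ET.parse(client_state_path).getroot()
    estimates = {}
    for wu in root.iter("workunit"):
        name = wu.findtext("name")
        est = wu.findtext("rsc_fpops_est")
        if name and est:
            estimates[name] = float(est)
    return estimates
```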
Joined: 13 Jul 20 | Posts: 3 | Credit: 128,216 | RAC: 0
I don't think you understood my question. The size of the computation is being reported as 3x10^16 operations. The run time for one of the work units I completed last night was 1703.89 seconds. 3x10^16 / 1703.89 = 17,606.77 G-op/s, i.e. ~17.6 T-op/s. My graphics cards are physically incapable of running that fast (2 IPC * 1.911 GHz max boost * 3584 CUDA cores = 13,698.048 G-op/s, ~13.7 T-op/s theoretical maximum), so I am wondering how accurate the estimate of the size of the work units is.
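The comparison above can be written out explicitly, with units attached. This just redoes the arithmetic from the post (the estimate and run time come from the thread; the 1080 Ti peak is ops-per-clock times boost clock times CUDA cores):

```python
# Implied throughput: server's operation estimate / wall-clock run time.
# Theoretical peak: 2 ops/clock * 1.911 GHz boost * 3584 CUDA cores (1080 Ti).
fpops_est = 3e16        # <rsc_fpops_est> reported for the task
run_time = 1703.89      # seconds for one completed work unit

implied = fpops_est / run_time    # operations per second
peak = 2 * 1.911e9 * 3584         # theoretical maximum, operations per second

print(f"implied: {implied / 1e12:.2f} T-op/s")  # ~17.61 T-op/s
print(f"peak:    {peak / 1e12:.2f} T-op/s")     # ~13.70 T-op/s
print(f"ratio:   {implied / peak:.2f}x")        # ~1.29x -- above theoretical peak
```

The implied rate coming out above the card's theoretical peak is exactly why the estimate looks suspicious.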
Joined: 12 Jul 20 | Posts: 2 | Credit: 167,441 | RAC: 0
Just so you know, 1.911 GHz is 1'911'000'000 Hz, so the actual result is 2 * 1'911'000'000 * 3'584 = 13'698'048'000'000, which is ~13.7 T-op/s. But when taking into account things like warps, it can quite easily reach an effective 18 TFLOPS or TIOPS, with the biggest bottleneck being how fast you can feed the operations and data through VRAM.
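For the record, multiplying that product out exactly:

```python
# Worked check of the 1080 Ti peak-throughput product:
# 2 ops/clock * 1.911 GHz boost * 3584 CUDA cores.
peak_ops = 2 * 1_911_000_000 * 3_584
print(f"{peak_ops:,} ops/s")  # 13,698,048,000,000 -> ~13.7 T-op/s
```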
Joined: 15 Jun 20 | Posts: 74 | Credit: 19,537,761 | RAC: 0
Absolutely, we're VRAM-latency and core-clock bound. Bandwidth doesn't matter as much, from what I saw very early on (I was surprised to see the same speed going from 480 GB/s down to 78 GB/s on HBM2). GDDR6 GPUs with worse compute than the previous generation (AMD RX 5700 vs. my RX Vega 56 OC) perform the same or 25-40% faster, despite similar gaming performance and, in some situations, worse compute performance than the older cards. So we have to deal with NVIDIA Turing (GDDR6), Pascal (GDDR5/X), and even Maxwell (GDDR5), plus AMD RDNA Navi (GDDR6), GCN Vega (HBM2), and GCN Polaris (GDDR5), all affecting our time/TFLOPS-per-task estimates. There's also some in-task variance for some sets of seeds/inputs due to cactus placement optimization.
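The compute-bound vs. memory-bound distinction being discussed here can be made concrete with a roofline-style break-even calculation. The numbers below are rough 1080 Ti spec-sheet values chosen for illustration, not measurements from this project:

```python
# Roofline-style break-even: a kernel is memory-bandwidth-bound when its
# arithmetic intensity (ops per byte moved) falls below peak_ops / bandwidth.
# Illustrative spec-sheet values for a 1080 Ti, not project measurements.
peak_ops = 13.7e12    # theoretical peak, ops/s
bandwidth = 484e9     # GDDR5X bandwidth, bytes/s

break_even = peak_ops / bandwidth  # ops per byte needed to be compute-bound
print(f"{break_even:.1f} ops/byte")  # ~28.3 -- below this, memory is the limit
```

Note this only captures bandwidth limits; latency-bound workloads (scattered reads, poor locality) can run well below both roofs.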
Joined: 12 Jul 20 | Posts: 2 | Credit: 167,441 | RAC: 0
When I said "putting into VRAM" I meant locality, not bandwidth.
Joined: 13 Jul 20 | Posts: 3 | Credit: 128,216 | RAC: 0
If you want to pretend I'm an idiot and explain computer architecture and third-grade mathematics to me, the least you could do is get your arithmetic right. I'm done here.