momaka wrote on Today, 11:20:
Not arbitrary.
But with safety margin included, like you said.
So is there statistical data available, with a significant number of samples, which shows failure rates at different temperatures and that, for example, the rate at 65C is significantly higher than at 55C? I suspect not...
Why does the margin have to be 15C? Why not 5C, 7C, 10C, whatever?
Is there any data at all showing that lower temperature even helps defective chips, assuming Tg is not reached in either case?
momaka wrote on Today, 11:20:
Overkill cooling IS good. The only issue you may get from it is if you put a really heavy cooler and don't properly support the card afterwards. On that note, even stock coolers on higher-end cards can be heavy enough to warp the card and break the BGA over time, so nothing new here.
Sufficient cooling is good. Overkill anything is bad. A waste of effort at best, a potential cause of a different failure at worst. Any modification can have unintended consequences which are hard to foresee, and I've seen plenty of hardware murdered by attempts at "overkill cooling".
Overall my opinion:
- All hardware manufactured this way is doomed to fail at some point. The design itself implies that; a certain amount of wear happens no matter what. Just like any ball bearing will fail eventually.
- Hardware from the bumpgate era is doomed to fail faster. There are multiple reasons, not just the underfill. This is unavoidable.
- Nothing but logical conclusions based on unverified assumptions supports the idea that lower temperature at any cost = better. Multiple factors are omitted from these conclusions, like the rate of temperature change (up or down), mounting pressure/mechanical stress, etc.
- There is good reason to be careful with any modification to the cooling system, as such modifications can have unintended consequences and it is highly arrogant to think you can foresee everything. Simple example - someone does not like stuff idling at 55C with fans off and disables fan stop. Now it idles at 35C with fans on. Nice? Seems so. Except the temperature swing between idle and load has just increased dramatically: instead of 55->65C you now get, say, 35->65C. How would that affect longevity?
- So there is no reason to get obsessed with keeping temperature below some arbitrary value, or even "as low as possible", on the assumption that it somehow helps. Making sure it does not reach the underfill Tg (70C) on bumpgate-affected HW is sufficient. So set the fan curve to max out at 65C; if that is too loud or cannot even be maintained, then there is a reason to consider swapping the cooler. But on most cards it should be fine, and simply decreasing ambient (improving case airflow) should provide reasonably good results.
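To put rough numbers on the last two points, here is a minimal Python sketch of a fan curve that reaches 100% at 65C, plus the idle-to-load swing from the fan-stop example. Only the 65C ceiling and the 70C Tg figure come from the points above; the 45C ramp start, the 30% minimum duty, the linear shape and the function names are my own assumptions for illustration, not values from any particular card or tool. It only quantifies the swing; it says nothing about how much that swing actually matters for longevity.

```python
# Values taken from the post: fan maxes out at 65C, underfill Tg assumed at 70C.
TG_C = 70.0
FAN_MAX_TEMP_C = 65.0

# Hypothetical values chosen only for illustration.
FAN_START_TEMP_C = 45.0  # temperature where the ramp begins (assumption)
MIN_DUTY = 30.0          # fan duty in percent below the ramp (assumption)


def fan_duty(temp_c: float) -> float:
    """Linear ramp from MIN_DUTY at FAN_START_TEMP_C to 100% at FAN_MAX_TEMP_C."""
    if temp_c <= FAN_START_TEMP_C:
        return MIN_DUTY
    if temp_c >= FAN_MAX_TEMP_C:
        return 100.0
    span = FAN_MAX_TEMP_C - FAN_START_TEMP_C
    return MIN_DUTY + (100.0 - MIN_DUTY) * (temp_c - FAN_START_TEMP_C) / span


def thermal_swing(idle_c: float, load_c: float) -> float:
    """Size of the idle-to-load temperature swing in degrees C."""
    return load_c - idle_c


if __name__ == "__main__":
    # The fan should already be flat out before the chip gets near Tg.
    for temp in (40, 55, 60, 65, 68):
        print(f"{temp}C -> {fan_duty(temp):.0f}% duty")

    # Fan stop enabled (idle 55C) versus fan stop disabled (idle 35C).
    print("swing with fan stop:   ", thermal_swing(55, 65), "C")
    print("swing without fan stop:", thermal_swing(35, 65), "C")
```

With these assumed numbers the curve leaves 5C of headroom below Tg at full fan speed, and disabling fan stop triples the idle-to-load swing (10C to 30C) while the peak temperature stays exactly the same.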
And just to be clear - this is nothing more than my opinion. I just highly dislike the idea of "overkill cool everything" that some people get obsessed with, because in my experience I have never seen anything come out of it other than wasted money and dead hardware. One funny example of how "overkill cooling" can spectacularly backfire is SSDs and the whole story with water blocks for them...