Intel reports it has a fix for instability and degradation in 13th and 14th generation Core processors for desktop
Shortly after Intel officially commented on reports that laptop CPUs are also affected by the instability issues of Raptor Lake processors (stating that this is not true), the company has finally come out with new information on the core issue itself: the widely reported problem of 13th and 14th generation Core desktop CPUs being unstable in games and, worse, showing symptoms of degradation. That problem turns out to be very real.
The Raptor Lake processor problems, which became more widely known around February this year, have become much louder and more widely discussed in recent days, especially since popular YouTubers started to cover the topic and present various investigations or rumors. It should be added, though, that the problems have also been confirmed by many game development companies and, indirectly, by Nvidia, which also pointed to Intel (because CPU instability often ended in an error seemingly implicating GPU drivers).
For a long time Intel had very little to say about the problem and only confirmed that it was aware of it. However, the company has worked with board manufacturers to develop BIOS updates that will introduce a new feature where by default processors will obey recommended settings such as power limits – which until now has not been the case, and virtually all boards have been setting higher limits (which Intel didn’t mind at all, until the problems began).
- Read more: Raptor Lake is unstable in games. Too high clock speeds or PL2
- Read more: Unstable Raptor Lake CPUs on the rise, Intel analyzes the issue
- Read more: Unstable Intel CPUs: performance drop with new BIOSes will be smaller
- Read more: Unstable Intel processors have TVB bug, but still no solution
- Read more: Stability issues and crashes affect large part of Raptor Lake CPUs
Then, in June, Intel officially confirmed information that it had found a related issue in the functioning of Thermal Velocity Boost technology that may have contributed to the problems (this issue will be fixed by a microcode update), but the important part was that this discovered flaw was not actually the root cause of the overall instability issues.
Thirdly, Intel issued a statement last week saying that, as far as the company knows, the problem only affects desktop processors and laptop chips are not affected. This was in response to recent reports of problems allegedly seen on mobile CPUs as well.
New response: Final solution or will the problems continue?
Intel has now issued a fourth statement, which announces the discovery of an issue that is said to have caused the processors to be exposed to higher voltages than the power delivery was supposed to supply. Fixing the issue is expected to address processor instability.
However, doubts quickly surfaced that this may not be the ultimate solution to all the observed issues. The wording of the statement specifically talks about solving certain problems with excessive voltage, but nowhere does it say that these faults are the root cause that is behind all of the problems that have been discussed over the last six months. Like the Thermal Velocity Boost fix mentioned, it may be another sub-factor, but not necessarily the last.
Thus, for now, it can be said that this patch should definitely improve the situation for users who are lucky enough to have one of the 13th and 14th generation Core desktop processors. But this may not be the complete end to the problems, and we will likely have to follow this case further. We’ll be happy if we’re wrong about this. Hopefully, we’ll see some additional information and clarification in the coming days.
Intel’s wording seems rather evasive and deliberately vague, but it’s entirely possible that Intel does really hope they have found a solution that will successfully fix the problem permanently – it’s just that caution tells them not to explicitly promise this yet, until the fix is deployed en masse and confirmed to actually eliminate problems everywhere. The patch is apparently still in the active testing phase.
Intel’s statement, verbatim:
Based on extensive analysis of Intel Core 13th/14th Gen desktop processors returned to us due to instability issues, we have determined that elevated operating voltage is causing instability issues in some 13th/14th Gen desktop processors. Our analysis of returned processors confirms that the elevated operating voltage is stemming from a microcode algorithm resulting in incorrect voltage requests to the processor.
[Note: The CPU typically communicates to the board’s power circuits through a feedback channel how much voltage to supply to the chip; what this means is that the excessive voltage was not caused by the motherboards, but by erroneous requests from the CPU, which is thus responsible.]
Intel is delivering a microcode patch which addresses the root cause of exposure to elevated voltages. We are continuing validation to ensure that scenarios of instability reported to Intel regarding its Core 13th/14th Gen desktop processors are addressed. Intel is currently targeting mid-August for patch release to partners following full validation.
Intel is committed to making this right with our customers, and we continue asking any customers currently experiencing instability issues on their Intel Core 13th/14th Gen desktop processors reach out to Intel Customer Support for further assistance.
The fix will come in mid-August
So, unfortunately, we now learn that the fix will be available in three weeks at the earliest, which is unpleasant. Delivering the fix to partners means that at that point, the microcode will be given to motherboard manufacturers, who will then have to incorporate it into BIOSes for individual board models and release them. In practice, this can add days or weeks to the date when the patch for your particular motherboard will appear, although it is possible that in this case the board manufacturers will try to move as quickly as possible.
It’s a bit of a question what to do in the meantime, since you now know that your computer’s processor is potentially practising self-harm, but you’ll only be able to prevent it from doing so in a month. The solution is probably to manually lower the voltage (undervolt), and ideally also to underclock the processor to maintain stability (to a maximum of 5.0–5.3 GHz, as has been occasionally recommended). However, it is not clear how exactly this incorrect voltage control works. It’s possible that the error doesn’t lead to some fixed offset, and thus manually lowering the voltage may not be guaranteed to help, depending on how the mechanism of the error works.
However, the fact that lower-end processor models with more conservative clock speeds have always displayed these problems less frequently (but apparently they still occurred with the 125W Core i5) could indicate that undervolting and underclocking may be beneficial. You certainly shouldn’t go wrong by doing this, so we recommend that owners of 13th and 14th generation Core desktop processors apply this measure temporarily.
Degradation is confirmed, a significant number of processors are probably physically damaged
There’s a second piece of bad news, too. You may be asking how increased voltage can even lead to processor instability, given that raising the voltage generally does the opposite: it enhances stability and adds extra margin (at the expense of power consumption). The answer seems to be that this exposure to higher voltages has in fact been (and continues to be) damaging to processors. Above a certain level of voltage, which depends on various factors and thus no exact number can be given, physical damage to silicon circuits occurs.
In this case, the instability observed by the user probably manifests only after a certain period of time and is not a direct sign of the abnormally increased voltage, but a sign that the processor has already been damaged to such an extent that it is no longer capable of stable operation, at least not at the default clock speeds. In theory, it may still be possible to get it to function stably by lowering the clock speeds, thus gaining extra margin (the question is whether it is really stable then, or whether you are just not able to test the chip thoroughly enough to detect potentially faulty behavior).
This information from Intel means, in other words, that the degradation that has been suspected and feared from the beginning as a possible mechanism behind the observed problems is unfortunately real. It also means that processors that have experienced instability are already physically damaged. This damage is unfortunately irreversible. Intel has even explicitly confirmed to Tom’s Hardware that applying the August microcode patch to already degraded units will not help. Even processors that don’t suffer from instability yet may in some cases be very close to showing issues. Factors like natural slow aging, random power fluctuations leading to voltage spikes, aging capacitors on the board changing the VRM’s characteristics, or a user’s attempt at overclocking could thus theoretically push such processors over the edge in the future, even after the fixed microcode is introduced.
Intel is not saying that it will be recalling processors across the board or launching a special replacement program, so the matter will be handled through traditional RMA mechanisms. However, Intel is urging customers to be sure to use them if problems arise.
What about the reports of via oxidation problem?
Along with this statement on Intel’s community site, the same text was also published on Reddit, where, however, more interesting information was added by the company’s representative in charge of communication on the r/intel subreddit. In particular, this post comments on reports that a manufacturing anomaly during the production of the silicon itself is responsible for the failure of Intel processors, where allegedly the metal wire layers (or more precisely the vias interconnecting between them) were supposed to be contaminated with compounds causing oxidation and thus causing deterioration of the functioning and quality of the wiring in the chip.
This information was published by the GamersNexus YouTube channel, which may have received it as a tip from certain Intel insiders or partners. It seems that this is not the real cause of the problems, although there is a grain of truth to the information. Intel has confirmed on Reddit that an oxidation incident during production did actually occur at its factories, but it affected early batches of 13th-generation Core processors for desktop, probably mainly those made in 2022 (quite likely not all of them, though). According to Intel, the procedures were improved during 2023, which eliminated these problems.
According to Intel, the company has also analyzed this problem as a possible cause, but has ruled out that this oxidation flaw could be the root cause of the current instability problems – the problem with oxidation in manufacturing was reportedly found in only a small number of unstable processors the company’s teams analyzed.
For one thing, it is likely that the oxidation-affected wafers were already identified at the time, and it’s possible processors made from them have often not been able to even pass validation, so they may not even have been sold in large numbers. In particular, however, this would probably not explain the problems with the newer 14th generation Core processors, which were not released until late 2023.
Also, a problem of this kind would probably not be most pronounced on chips using high voltages and clock speeds and least pronounced on those with lower clock speeds. Such a failure distribution, on the other hand, is consistent with the elevated voltage problem that Intel has now admitted to. Faster processors use higher voltages, so if a bug in the algorithms raises the voltage even higher, they will more easily go over the edge and get damaged than a slower processor that always runs at generally lower voltage, where the voltage “overshoot” would have to be significantly larger to cause the same damage.
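The headroom argument above can be illustrated with a toy calculation. This is only a sketch: the damage threshold, the operating voltages and the size of the overshoot are all invented numbers chosen for illustration (as noted earlier, no exact damaging voltage can be given), and the function names are hypothetical.

```python
# Toy model of the "headroom" argument. All voltages are in millivolts
# and are INVENTED for illustration; they are not Intel figures.

def headroom_mv(nominal_mv: int, damage_threshold_mv: int) -> int:
    """Extra voltage a chip could see before reaching the assumed threshold."""
    return damage_threshold_mv - nominal_mv


def exceeds_threshold(nominal_mv: int, overshoot_mv: int, damage_threshold_mv: int) -> bool:
    """True if nominal voltage plus an erroneous overshoot crosses the threshold."""
    return nominal_mv + overshoot_mv > damage_threshold_mv


ASSUMED_THRESHOLD_MV = 1500   # hypothetical damage threshold
FAST_CHIP_MV = 1450           # hypothetical high-clocked model
SLOW_CHIP_MV = 1250           # hypothetical lower-clocked model
OVERSHOOT_MV = 100            # hypothetical erroneous extra voltage

if __name__ == "__main__":
    for name, nominal in (("fast", FAST_CHIP_MV), ("slow", SLOW_CHIP_MV)):
        print(
            name,
            "headroom:", headroom_mv(nominal, ASSUMED_THRESHOLD_MV), "mV,",
            "over threshold:", exceeds_threshold(nominal, OVERSHOOT_MV, ASSUMED_THRESHOLD_MV),
        )
```

With these made-up values, the same overshoot pushes the faster chip past the assumed threshold while the slower chip stays well below it, which is the pattern of failures described above.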
Is it safe to buy Intel processors now?
This is a complicated question. At the very least, we would advise you to wait until a BIOS for your board is actually available that integrates the promised microcode update fixing the error behind the dangerous voltage. Ideally, you should probably use the board’s BIOS flashback feature and load the BIOS update from a USB drive before the processor is even installed in the socket, to save it from the potentially harmful effect that the old microcode might have before you manage to update it.
Still, it’s probably true that not all 13th- and 14th-generation Core desktop processors are equally prone to problems. The biggest issues are seen with models with high clock speeds, with the Core i9-13900K and i9-14900K having the most reported problems, followed by the Core i7-13700K and i7-14700K. However, 65W Core i7 and i9 and 125W Core i5-13600K and Core i5-14600K models can also have problems. Newly acquired, previously unused processors should hopefully be fine once you have a board with the patched microcode, but we recommend you avoid buying second-hand processors (of the types that are known to suffer these issues) from now on. Any used CPU may well have a large amount of accumulated damage in it, and you have no way of knowing how degraded the CPU you are buying is. For example, the previous owner may be selling it precisely because they encountered the first sign of instability.
On the other hand, the 65W 13th and 14th generation Core i5 models are probably pretty safe. If excessive voltages occur with them at all, it’s probably to a small degree and there’s probably not much risk of them failing in the future. We still strongly recommend installing the BIOS update when it’s available, though. Older 12th generation Core processors seem to be safe overall. Of course, another obvious solution to this situation is to choose an AMD processor instead…
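For readers on Linux, one way to see which microcode revision the processor is currently running is to read the "microcode" field that x86 kernels expose in /proc/cpuinfo. The sketch below only reports the revision; the function name is our own, and since Intel’s statement does not give a revision number for the fixed microcode, the value to compare against has to come from your board vendor’s BIOS release notes once they are published.

```python
# Minimal sketch: list the microcode revision(s) reported by the Linux kernel.
# Assumes an x86 Linux system whose /proc/cpuinfo contains "microcode" lines.
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set:
    """Return the set of microcode revisions (as integers) found in cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # The kernel prints the revision in hexadecimal, e.g. "0x100".
            revisions.add(int(value.strip(), 16))
    return revisions


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        found = microcode_revisions(cpuinfo.read_text())
        print("Microcode revision(s):", ", ".join(hex(r) for r in sorted(found)) or "none reported")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux.")
```

A new revision showing up here after a BIOS update only confirms that newer microcode is loaded, not that it is the specific fix Intel describes.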
Trust in Intel has been significantly undermined
But all this is assuming that after this patch, Intel will really have the problem under control and its causes eliminated. This is not yet certain, and Intel’s statement does not rule out the possibility that other problems leading to instability and degradation will persist after the currently discussed one is fixed. It is possible that the problem will be mitigated to a large degree, but will continue to persist to some, perhaps lesser, extent.
However, the fact that it took Intel five months from the period the problem was widely reported in the media to find and publish the root cause (if it is indeed the root cause) and it will take six months before a fix is available does not make a good impression. Some scepticism and concern about whether Intel has sufficient verification and testing processes in place for its processors, their architecture and physical manufacturing processes is now understandable. Thus, after this issue, there may be lingering doubts as to whether Raptor Lake processors are really verified and validated properly in other respects – including whether they may hide additional factors leading to gradual degradation and future problems other than the one now discovered and addressed. Indeed, other rarer factors causing degradation might only become apparent when the current fix filters out some of the more common and obvious cases…
Is a performance drop possible?
It’s also theoretically possible that microcode patches and fixes to address the abnormally elevated voltage issues will lead to some (perhaps only minor) performance degradation in some situations. In particular, this would probably affect single-threaded performance at the maximum 1T boost.
Again, this possibility is something to consider. But unfortunately, we won’t know if there is a performance degradation until the fixes are available, probably in a month’s time. Intel has reportedly told Tom’s Hardware that no significant performance impact is expected, but testing is still ongoing, so this is probably not yet certain.
Sources: Intel, Intel (Reddit), HardwareLuxx, Tom’s Hardware
English translation and edit by Jozef Dudáš