Would 3D V-Cache Help Intel CPUs? 14th-gen Cores vs. Cache

Thing is: the whole X3D line is derived from the original EPYC CPUs. AMD had some customers who required CPUs with more cache, so they essentially ended up rebranding the (failed) EPYC variants into Ryzen consumer CPUs.

It's all explained in a Gamers Nexus video, where the hardware team was presented with an EPYC chip that didn't meet requirements and was told, "Do something with it." They learned that the extra cache worked extremely well in games especially, and thus X3D was born.

Huge advantages for workloads that can benefit from the extra cache; other than that, it's extremely expensive, it takes up die space, and it draws a fair amount of power (because of the always-on state).

Downside: AMD's implementation prohibits overclocking or voltage selection. The cache is tied to the CPU's VCORE, which explains why the whole voltage thing is locked. And the few who managed to raise the voltage(s) quickly learned that the X3D cache died really, really fast.

Intel can't just slap extra cache onto its chips. I mean it could, but the costs would be significant. The way AMD designs its chips versus the way Intel does is completely different.
 
"Oddly, an increased cache capacity often helped the most when fewer cores were active, though that wasn't always the case."

What I think is happening:

*Four cores were sometimes bottlenecked to the point where cache didn't move the needle, although occasionally there was a significant performance boost based on cache

*Six cores weren't as bottlenecked as four cores, and would sometimes be sensitive to cache performance, but typically the cache made little difference

*Eight cores weren't bottlenecked and cache wasn't a notable performance factor
 
@TechSpot. Your method is unsound:

- There is, on average, a 10% performance difference at 1080p with an RTX 4090 between the 7600X (32MiB L3 cache) and the 8600G (16MiB L3 cache)

- You didn't test any Intel CPU with 16MiB L3 cache or smaller (despite the fact that Intel CPUs with smaller L3 cache sizes are available), yet you "inferred" that lower L3 cache size wouldn't affect Intel

- Please add i3-14100 (12MiB L3 cache, 4 P-cores), compare it to 4 cores on i9-14900K at the same clock frequency in games that do NOT benefit from more than 4 cores, and then REPUBLISH both the video and the article

- There is little performance difference between the 5800X (1×32MiB L3 cache) and the 5900X (2×32MiB L3 cache), while there is a large performance difference between the 5800X3D (1×96MiB L3 cache) and the 5800X in half of the games, and the 5800X3D can outperform the 5900X in a quarter of the games. This implies that the split 64MiB cache in the 5900X does NOT matter for gaming performance in that quarter of games, but the non-split 96MiB in the 5800X3D DOES matter in half of the games

- You didn't simulate any Intel CPU with non-split 96MiB L3 cache, yet you "inferred" that a larger (64MiB or larger) non-split L3 cache wouldn't affect Intel gaming performance

- As far as I know you didn't use the 5800X3D (or the 7800X3D, or even the 8600G) to divide the games into two categories BEFORE running the Intel benchmarks: games that do benefit from the large L3 cache of the 5800X3D, and games that don't. Please use 5800X3D/7800X3D vs 5800X/7700X results as a precursor to the Intel cache-sensitivity benchmarks to bin the games into these two categories, with a sufficient number of games in each bin, and then REPUBLISH both the video and the article about how Intel cache size affects gaming performance (a sketch of this binning step is at the end of this post)

What you did ISN'T the SCIENTIFIC METHOD. Please redo all the measurements.
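
Here is a minimal sketch of the binning step I mean. The FPS numbers are made up purely to illustrate the procedure; swap in real 5800X vs 5800X3D measurements:

```c
/* Hypothetical sketch of the game-binning step (all FPS numbers invented):
 * classify each game as cache-sensitive or not from the uplift a big-cache
 * CPU (e.g. 5800X3D) shows over a similar-speed small-cache CPU (5800X),
 * then run the Intel cache-scaling tests on both bins separately. */
#include <stdio.h>

struct game { const char *name; double fps_small_cache, fps_big_cache; };

int main(void) {
    /* Placeholder 1080p averages -- replace with real measurements. */
    struct game games[] = {
        { "Game A", 142.0, 198.0 },
        { "Game B", 210.0, 214.0 },
        { "Game C",  95.0, 131.0 },
        { "Game D", 167.0, 171.0 },
    };
    const double threshold = 0.10;  /* >10% uplift => cache-sensitive */

    for (size_t i = 0; i < sizeof games / sizeof games[0]; i++) {
        double uplift = games[i].fps_big_cache / games[i].fps_small_cache - 1.0;
        printf("%-8s %+6.1f%%  %s\n", games[i].name, uplift * 100.0,
               uplift > threshold ? "-> cache-sensitive bin"
                                  : "-> cache-insensitive bin");
    }
    return 0;
}
```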
 
Huge advantages for workloads that can benefit from the extra cache; other than that, it's extremely expensive, it takes up die space, and it draws a fair amount of power (because of the always-on state).
It's another die altogether, so not really any die space used at all. Although I guess the connecting vias do take some space.

The power-efficiency improvement is significant for many uses. Memory accesses are reduced even when there are no apparent performance gains.

Downside: AMD's implementation prohibits overclocking or voltage selection. The cache is tied to the CPU's VCORE, which explains why the whole voltage thing is locked. And the few who managed to raise the voltage(s) quickly learned that the X3D cache died really, really fast.
Power efficiency is improved again.

Intel can't just slap extra cache onto its chips. I mean it could, but the costs would be significant. The way AMD designs its chips versus the way Intel does is completely different.
It would just mean a respin. If Intel wanted to, it could.

The failure with the EPYCs is the reason the 7950X3D is stuck with only one V-Cache die stacked on one chiplet: when two stacks exist, the traffic between chiplets skyrockets, ruining performance. Some have blamed latency alone, but it's probably a combination of latency and saturation.
 
The failure with the EPYCs is the reason the 7950X3D is stuck with only one V-Cache die stacked on one chiplet: when two stacks exist, the traffic between chiplets skyrockets, ruining performance. Some have blamed latency alone, but it's probably a combination of latency and saturation.
What AMD probably needs to do now is implement a virtual L4 cache mechanism that alters the utilisation priorities of distant L3 caches.
 
@techspot IMHO, the L3 miss rate that goes out to DDR memory is what determines the performance difference. For example, if AMD's 96MB L3 allows a 99% hit rate for an imaginary game, then Intel going from 24MB to 36MB might be going from a 25% to a 38% hit rate (not quite true, but go with it), which is not going to show any performance benefit. In other words, for this article, you have to search for games where a 36MB L3 allows a high hit rate while 24MB results in noticeably more misses, to see a large performance uplift. That is much harder than finding games that straddle the gap between 36MB and 96MB of L3, unfortunately, which is why the article shows little performance difference.
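
To put rough numbers on that (the latencies are purely illustrative assumptions, roughly 12 ns for an L3 hit and 80 ns for DRAM), here is the average-memory-access-time version of the same argument:

```c
/* Average memory access time (AMAT) for an L3 with a given hit rate.
 * Latency figures are illustrative assumptions, not measurements. */
#include <stdio.h>

static double amat_ns(double l3_hit_rate) {
    const double l3_hit_ns = 12.0;   /* assumed L3 hit latency */
    const double dram_ns   = 80.0;   /* assumed DRAM latency   */
    return l3_hit_rate * l3_hit_ns + (1.0 - l3_hit_rate) * dram_ns;
}

int main(void) {
    /* 24MB -> 36MB on a cache-hungry game: still mostly misses. */
    printf("25%% -> 38%% hit rate: %.1f ns -> %.1f ns AMAT\n",
           amat_ns(0.25), amat_ns(0.38));
    /* A game whose hot data fits in 96MB but not in 36MB. */
    printf("90%% -> 99%% hit rate: %.1f ns -> %.1f ns AMAT\n",
           amat_ns(0.90), amat_ns(0.99));
    return 0;
}
```

The first jump shaves a few nanoseconds off on paper, but most accesses still go to DRAM (and memory stalls are only part of frame time); the second removes most of the remaining DRAM trips.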
 
What resolution is this? I couldn't find it in the article.

We know 4K is still almost entirely limited by the GPU, so it seems like a bad idea to test that. They should have tested fewer games with more parameters.
 
Interesting. I suppose what happens is that the cache is big enough to hold the "working set" for the actual game code, so a larger cache doesn't benefit there. And the "working set" for the game data will be far larger than would fit in even a 100MB cache, so having a somewhat larger cache doesn't help there either.
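
As a rough illustration of that working-set cliff (my own quick sketch, nothing to do with the article's testing): random pointer-chasing over arrays of increasing size shows the access time sitting near cache latency until the working set no longer fits, then jumping toward DRAM latency.

```c
/* Random pointer-chasing latency vs working-set size (build: gcc -O2 chase.c).
 * Sizes and step count are arbitrary; this is only meant to show the jump in
 * access time once the working set exceeds the L3 capacity. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void) {
    for (size_t mib = 1; mib <= 256; mib *= 2) {
        size_t n = mib * 1024 * 1024 / sizeof(size_t);
        size_t *next = malloc(n * sizeof *next);
        if (!next) return 1;

        /* Sattolo's algorithm: one random cycle over all n slots, so the
         * hardware prefetcher cannot guess the next address. */
        for (size_t i = 0; i < n; i++) next[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (((size_t)rand() << 16) | (size_t)rand()) % i;
            size_t t = next[i]; next[i] = next[j]; next[j] = t;
        }

        /* Each load depends on the previous one, so this measures latency. */
        const size_t steps = 10 * 1000 * 1000;
        size_t p = 0;
        double t0 = now_sec();
        for (size_t s = 0; s < steps; s++) p = next[p];
        double ns = (now_sec() - t0) / (double)steps * 1e9;

        printf("%4zu MiB working set: %6.1f ns/access (p=%zu)\n", mib, ns, p);
        free(next);
    }
    return 0;
}
```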

Back in the old days, John Carmack etc. would get absolutely massive speedups by making sure bits of code fit within the ~64KB caches of the chips of that era.

I doubt there's still code hand-optimized to that degree (partially because caches are simply so much larger now that it's more likely the code will fit anyway). But gcc, LLVM/clang, Visual C++, etc. all have an assumed cache size (unless you compile with something like gcc's "-march=native", where it WILL use the cache sizes of your specific CPU), and will make some effort to optimize code to fit within the cache when possible.

I could be wrong, but I suppose you'd get a nice large boost in speeds if you had multiple unrelated items running. Like (silly example I know but) a triple-head with 3 games running. Or several VMs running (and doing something CPU-intensive). Maybe even multiple copies of video encoders. CPU benchmarks would not work well for this, since they usually are running a rather small amount of code in a loop.

The point remains, this shows the cache is not beneficial for games. The Intel CPUs are hot as hell, so slapping on V-Cache but having to clock them down would likely reduce performance. (I imagine even in the "multiple unrelated items" case where I think you'd get a boost, the loss from the lower clock speed could EASILY outweigh the gain from a higher cache hit ratio.)
 
@TechSpot. Your method is unsound: [...] What you did ISN'T the SCIENTIFIC METHOD. Please redo all the measurements.
Do you, for even one second, really think TS is going to redo "all the measurements" based on the swirling thoughts in your head? Start a blog and publish your own results, Demand Boy. 🤔 :joy:
 
@TechSpot. Your method is unsound:

- There is, on average, a 10% performance difference at 1080p with an RTX 4090 between the 7600X (32MiB L3 cache) and the 8600G (16MiB L3 cache)
Yes.
- You didn't test any Intel CPU with 16MiB L3 cache or smaller (despite the fact that Intel CPUs with smaller L3 cache sizes are available), yet you "inferred" that lower L3 cache size wouldn't affect Intel
They don't really infer that. They refer to the 10th-gen review, where the L3 cache size (between the 12, 16, and 20MB caches on those parts) did make a difference; they're just saying that the smaller 24/33/36MB caches, compared to AMD's larger ones, are not making a big difference. It's true, though, that if there were a 13th/14th-gen part that can hit 5GHz with a 16MB or smaller cache, it would have been a nice data point to have.

- There is little performance difference between the 5800X (1×32MiB L3 cache) and the 5900X (2×32MiB L3 cache), while there is a large performance difference between the 5800X3D (1×96MiB L3 cache) and the 5800X in half of the games, and the 5800X3D can outperform the 5900X in a quarter of the games. This implies that the split 64MiB cache in the 5900X does NOT matter for gaming performance in that quarter of games, but the non-split 96MiB in the 5800X3D DOES matter in half of the games

- You didn't simulate any Intel CPU with non-split 96MiB L3 cache, yet you "inferred" that a larger (64MiB or larger) non-split L3 cache wouldn't affect Intel gaming performance
As far as I know, the only Intel CPUs with that much cache are a couple of Xeon models with something like 56-64 cores. I don't know that it'd be particularly comparable to take a CPU like that, with quad memory controllers etc., and just turn off 95% of its cores.
- As far as I know you didn't use the 5800X3D (or the 7800X3D, or even the 8600G) to divide the games into two categories BEFORE running the Intel benchmarks ...
So? They could have tested fewer games than they did. That doesn't change the results.

I do think you make a few good points. It'd be really interesting if there were some way to take something like that Xeon and progressively reduce the L3 cache.
 
Nice job, Steve. IMHO, L3 cache size matters, but only once it passes a certain size that lets most of the reusable instructions and data fit in it with minimal misses. AMD seems to know that well; that's why they package both the normal Ryzen and the 3D V-Cache Ryzen with specific cache sizes. There seems to be "science" behind each one, a most effective size (hit performance vs. drawbacks/cost/power).
 
Maybe 33 -> 36 MB of L3 cache is too small an upgrade, and even 24 -> 36 MB isn't a great jump either? If a 2X/3X L3 cache enlargement were tested, we'd probably see a 10-30% FPS gain?
 
Maybe the cache difference is too small to have a noticeable effect, and the larger L2 cache in Raptor Lake makes the difference in L3 cache matter less (compared with Comet Lake).

Here are the ratios of the total cache (L2+L3) of some CPUs (see the sketch below for how the totals are counted):
- 7800X3D/7700X: 104MB/40MB = 2.6X
- 14900K (6 cores)/14600K: 48MB/36MB = 1.33X
- 10900K (6 cores)/10600K: 21.5MB/13.5MB = 1.59X
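
For transparency, here's how I'm counting those totals: the full shared L3 plus the per-core L2 of only the active P-cores (E-core caches ignored). A tiny sketch that reproduces the numbers above:

```c
/* Total effective cache = shared L3 + L2 of the active cores (sizes in MB). */
#include <stdio.h>

static double total_mb(double l3, double l2_per_core, int cores) {
    return l3 + l2_per_core * cores;
}

int main(void) {
    /* 7800X3D vs 7700X: 96/32 MB L3, 1 MB L2 per core, 8 cores. */
    printf("7800X3D/7700X:         %.0f / %.0f MB = %.1fX\n",
           total_mb(96, 1.0, 8), total_mb(32, 1.0, 8),
           total_mb(96, 1.0, 8) / total_mb(32, 1.0, 8));

    /* 14900K on 6 P-cores vs the 14600K's 6 P-cores: 36/24 MB L3, 2 MB L2/core. */
    printf("14900K(6P)/14600K(6P): %.0f / %.0f MB = %.2fX\n",
           total_mb(36, 2.0, 6), total_mb(24, 2.0, 6),
           total_mb(36, 2.0, 6) / total_mb(24, 2.0, 6));

    /* 10900K on 6 cores vs 10600K: 20/12 MB L3, 0.25 MB L2 per core. */
    printf("10900K(6)/10600K:      %.1f / %.1f MB = %.2fX\n",
           total_mb(20, 0.25, 6), total_mb(12, 0.25, 6),
           total_mb(20, 0.25, 6) / total_mb(12, 0.25, 6));
    return 0;
}
```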
 
Do you, for even one second, really think TS is going to redo "all the measurements"
I don't think in absolute terms, I think in probabilities. Thus, the question of whether "I really think TS will redo all the measurements" is nonsensical; such a question has no answer in my mind.
 
They could have tested fewer games than they did. That doesn't change the results.
No. With such a small sample size as in the article (11 games), and with the authors having NO IDEA whether the 11 selected games are large-cache sensitive or insensitive in the first place, they might well have selected the wrong set of games, and thus the measurements might be invalid.

PS: I own just 2 games out of the 11 tested (I also own other games, which weren't tested by the article), which actually means that I cannot learn anything from the article because 2 is a ridiculously small sample size for me to make any useful conclusion. The question is: How many games do you own out of the 11 tested?
 
No. With such a small sample size as in the article (11 games), and with the authors having NO IDEA whether the 11 selected games are large-cache sensitive or insensitive in the first place
...

From the article about the 5800X3D, I can only tell that 3-4 games here are cache-sensitive:

BG3
Hogwarts Legacy
Assassin's Creed Mirage
Cyberpunk? Not the same settings as in the previous article

Spider-Man here is non-RT, so we cannot tell. I guess we compare the 1080p results in both articles (it's not stated which resolution was used in this article).

 
Zen suffers from higher RAM access latency, so having a larger L3 cache helps to alleviate this bottleneck.
I think you hit the nail on the head.
X3D is kind of a workaround for RAM access latency in the Zen architecture.
What could be done to recreate a similar environment for Intel is to artificially limit the Intel memory controller to a low clock and high timings. Maybe in such a setting the cache size in Intel CPUs would matter more.
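
A tiny back-of-the-envelope sketch of why that should work, using AMAT = hit_rate × L3_latency + (1 − hit_rate) × DRAM_latency with made-up latency and hit-rate figures: the slower the memory behind the cache, the more a higher hit rate pays off.

```c
/* AMAT with a fixed assumed L3 hit latency (~12 ns) and two assumed DRAM
 * latencies, standing in for a fast IMC vs a deliberately slowed-down one. */
#include <stdio.h>

static double amat_ns(double hit_rate, double dram_ns) {
    const double l3_hit_ns = 12.0;   /* assumed L3 hit latency */
    return hit_rate * l3_hit_ns + (1.0 - hit_rate) * dram_ns;
}

int main(void) {
    const double dram_ns[] = { 65.0, 95.0 };   /* fast vs crippled memory */
    for (int i = 0; i < 2; i++) {
        double small_cache = amat_ns(0.80, dram_ns[i]);   /* lower hit rate  */
        double big_cache   = amat_ns(0.95, dram_ns[i]);   /* higher hit rate */
        printf("DRAM %3.0f ns: 80%% -> 95%% hit rate cuts AMAT %.1f -> %.1f ns (%.0f%% faster)\n",
               dram_ns[i], small_cache, big_cache,
               100.0 * (small_cache / big_cache - 1.0));
    }
    return 0;
}
```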
 