Today, Apple has unveiled their brand-new MacBook line-up. This isn’t an ordinary release – if anything, the move that Apple is making today is something that hasn’t happened in 15 years: The start of a CPU architecture transition across their whole consumer Mac line-up.

Thanks to the company’s vertical integration across hardware and software, this is a monumental change that nobody but Apple can so swiftly usher in. The last time Apple ventured into such an undertaking, in 2006, the company ditched IBM’s PowerPC ISA and processors in favor of Intel x86 designs. Today, Intel is being ditched in favor of the company’s own in-house processors and CPU microarchitectures, built upon the Arm ISA.

The new processor is called the Apple M1, the company’s first SoC designed with Macs in mind. With four large performance cores, four efficiency cores, and an 8-core GPU, it features 16 billion transistors on a 5nm process node. Apple is starting a new SoC naming scheme for this new family of processors, but at least on paper it looks a lot like an A14X.

Today’s event contained a ton of new official announcements, but in typical Apple fashion it was also lacking in detail. Today, we’re going to be dissecting the new Apple M1 news, as well as doing a microarchitectural deep dive based on the already-released Apple A14 SoC.

The Apple M1 SoC: An A14X for Macs

The new Apple M1 is really the start of a new major journey for Apple. During Apple’s presentation the company didn’t really divulge much in the way of details for the design; however, there was one slide that told us a lot about the chip’s packaging and architecture.

This packaging style with DRAM embedded within the organic packaging isn't new for Apple; they've been using it since the A12. However, it's something that's only sparingly used. When it comes to higher-end chips, Apple likes to use this kind of packaging instead of your usual smartphone PoP (package on package) because these chips are designed with higher TDPs in mind. Keeping the DRAM off to the side of the compute die rather than on top of it helps to ensure that these chips can still be efficiently cooled.

What this also means is that we’re almost certainly looking at a 128-bit DRAM bus on the new chip, much like that of previous generation A-X chips.
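To put the 128-bit figure into perspective, peak theoretical bandwidth is simply the bus width in bytes multiplied by the transfer rate. The sketch below is only a back-of-the-envelope illustration: Apple did not disclose the memory type or speed, so the 4266 MT/s data rate (typical of LPDDR4X) and the helper name `peak_dram_bandwidth_gbs` are our own assumptions rather than anything shown in the presentation.

```python
def peak_dram_bandwidth_gbs(bus_width_bits: int, data_rate_mts: float) -> float:
    """Peak theoretical DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    bus_width_bits: total width of the memory interface in bits.
    data_rate_mts:  transfer rate in mega-transfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mts * 1e6 / 1e9


# ASSUMPTION: LPDDR4X-class memory at 4266 MT/s; not confirmed by Apple.
ASSUMED_DATA_RATE_MTS = 4266

for width in (64, 128):
    bandwidth = peak_dram_bandwidth_gbs(width, ASSUMED_DATA_RATE_MTS)
    print(f"{width}-bit bus: {bandwidth:.1f} GB/s")
# 64-bit bus: 34.1 GB/s
# 128-bit bus: 68.3 GB/s
```

Under that assumed data rate, a 128-bit bus would offer twice the peak bandwidth of a 64-bit smartphone-style interface, which is the practical reason the wider bus matters for a chip with more CPU and GPU cores to feed.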

On the very same slide, Apple also seems to have used an actual die shot of the new M1 chip. It perfectly matches Apple’s described characteristics of the chip, and it looks like a real photograph of the die. Cue what's probably the quickest die annotation I’ve ever made:

We can see the M1’s four Firestorm high-performance CPU cores on the left side. Notice the large amount of cache – the 12MB cache was one of the surprise reveals of the event, as the A14 still only featured 8MB of L2 cache. The new cache here looks to be partitioned into 3 larger blocks, which makes sense given Apple’s transition from 8MB to 12MB for this new configuration; it is, after all, now being used by 4 cores instead of 2.

Meanwhile the 4 Icestorm efficiency cores are found near the center of the SoC, above which we find the SoC’s system level cache, which is shared across all IP blocks.

Finally, the 8-core GPU takes up a significant amount of die space and is found in the upper part of this die shot.

What’s most interesting about the M1 here is how it compares to other CPU designs by Intel and AMD. All the aforementioned blocks still only cover up part of the whole die, with a significant amount of auxiliary IP. Apple made mention that the M1 is a true SoC, including the functionality of what previously was several discrete chips inside of Mac laptops, such as I/O controllers and Apple's SSD and security controllers.

The new CPU core is what Apple claims to be the world’s fastest. This is going to be a centre-point of today’s article as we dive deeper into the microarchitecture of the Firestorm cores, as well as look at the performance figures of the very similar Apple A14 SoC.

With its additional cache, we expect the Firestorm cores used in the M1 to be even faster than what we’re going to be dissecting today with the A14, so Apple’s claim of having the fastest CPU core in the world seems extremely plausible.

The whole SoC features a massive 16 billion transistors, which is 35% more than the A14 inside of the newest iPhones. If Apple was able to keep the transistor density between the two chips similar, we should expect a die size of around 120mm². This would be considerably smaller than past generations of Intel chips inside of Apple's MacBooks.
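The die size estimate is straightforward scaling, and can be sketched as follows. Note that the A14 reference figures used here (11.8 billion transistors and a die of roughly 88mm²) are not stated in this section; they are assumptions taken from Apple's A14 announcement and third-party die measurements, and the function name is purely illustrative.

```python
# ASSUMPTION: A14 reference figures, not given in the text above.
A14_TRANSISTORS = 11.8e9    # Apple's stated A14 transistor count
A14_DIE_AREA_MM2 = 88.0     # approximate third-party die measurement

M1_TRANSISTORS = 16e9       # figure announced by Apple for the M1


def estimate_die_area_mm2(transistors: float,
                          ref_transistors: float,
                          ref_area_mm2: float) -> float:
    """Scale die area from a reference chip, assuming equal transistor density."""
    density_per_mm2 = ref_transistors / ref_area_mm2
    return transistors / density_per_mm2


increase = M1_TRANSISTORS / A14_TRANSISTORS - 1
m1_area = estimate_die_area_mm2(M1_TRANSISTORS, A14_TRANSISTORS, A14_DIE_AREA_MM2)

print(f"Transistor increase over A14: {increase:.1%}")
print(f"Estimated M1 die area: {m1_area:.0f} mm²")
# Transistor increase over A14: 35.6%
# Estimated M1 die area: 119 mm²
```

The result only holds if density really is similar between the two designs; a different mix of cache, logic and I/O on the M1 would shift the real figure in either direction.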

Road To Arm: Second Verse, Same As The First

Section by Ryan Smith

The fact that Apple can even pull off a major architectural transition so seamlessly is a small miracle, and one that Apple has quite a bit of experience in accomplishing. After all, this is not Apple’s first time switching CPU architectures for their Mac computers.

The long-time PowerPC company came to a crossroads around the middle of the 2000s when the Apple-IBM-Motorola (AIM) alliance, responsible for PowerPC development, increasingly struggled with further chip development. IBM’s PowerPC 970 (G5) chip put up respectable performance numbers in desktops, but its power consumption was significant. This left the chip non-viable for use in the growing laptop segment, where Apple was still using Motorola’s PowerPC 7400 series (G4) chips, which did have better power consumption, but not the performance needed to rival what Intel would eventually achieve with its Core series of processors.

And thus, Apple played a card that they held in reserve: Project Marklar. Leveraging the flexibility of Mac OS X and its underlying Darwin kernel, which like other Unixes is designed to be portable, Apple had been maintaining an x86 version of Mac OS X. Though largely considered to initially have been an exercise in good coding practices – making sure Apple was writing OS code that wasn’t unnecessarily bound to PowerPC and its big-endian memory model – Marklar became Apple’s exit strategy from a stagnating PowerPC ecosystem. The company would switch to x86 processors – specifically, Intel’s x86 processors – upending its software ecosystem, but also opening the door to much better performance and new customer opportunities.

The switch to x86 was by all metrics a big win for Apple. Intel’s processors delivered better performance-per-watt than the PowerPC processors that Apple left behind, and especially once Intel launched the Core 2 (Conroe) series of processors in late 2006, Intel firmly established itself as the dominant force for PC processors. This ultimately set up Apple’s trajectory over the coming years, allowing them to become a laptop-focused company with proto-ultrabooks (MacBook Air) and their incredibly popular MacBook Pros. Similarly, x86 brought with it Windows compatibility, introducing the ability to directly boot Windows, or alternatively run it in a very low overhead virtual machine.

The cost of this transition, however, came on the software side of matters. Developers would need to start using Apple’s newest toolchains to produce universal binaries that could work on PPC and x86 Macs – and not all of Apple’s previous APIs would make the jump to x86. Developers of course made the jump, but it was a transition without a true precedent.

Bridging the gap, at least for a bit, was Rosetta, Apple’s PowerPC translation layer for x86. Rosetta would allow most PPC Mac OS X applications to run on the x86 Macs, and though performance was a bit hit-and-miss (PPC on x86 isn’t the easiest thing), the higher performance of the Intel CPUs helped to carry things for most non-intensive applications. Ultimately Rosetta was a band-aid for Apple, and one Apple ripped off relatively quickly; Apple had already dropped Rosetta by the time of Mac OS X 10.7 (Lion) in 2011. So even with Rosetta, Apple made it clear to developers that they expected them to update their applications for x86 if they wanted to keep selling them and to keep users happy.

Ultimately, the PowerPC to x86 transition set the tone for the modern, agile Apple. Since then, Apple has created a whole development philosophy around going fast and changing things as they see fit, with only limited regard to backwards compatibility. This has given users and developers few options but to enjoy the ride and keep up with Apple’s development trends. But it has also given Apple the ability to introduce new technologies early, and if necessary, break old applications so that new features aren’t held back by backwards compatibility woes.

All of this has happened before, and it will all happen again starting next week, when Apple launches their first Apple M1-based Macs. Universal binaries are back, Rosetta is back, and Apple’s push to developers to get their applications up and running on Arm is in full force. The PPC to x86 transition created the template for Apple for an ISA change, and following that successful transition, they are going to do it all over again over the next few years as Apple becomes their own chip supplier.

A Microarchitectural Deep Dive & Benchmarks

On the following page we’ll be investigating the A14’s Firestorm cores, which will be used in the M1 as well, and also doing some extensive benchmarking on the iPhone chip, setting the stage as the minimum of what to expect from the M1:

Apple's Humongous CPU Microarchitecture

  • KarlKastor - Monday, November 16, 2020

    @techconc
    Do you think a 4-core Zen 2 was different from an 8-core Zen 2? Yes, it is much different. Zen 1 was even more inhomogeneous with increasing core count.

    Apple can't just put 8 big cores in it and be finished. All cores with one unified L2 cache? The core interconnect will be different for sure. The cache system too, I bet.

    M1 and A14 will be very similar, yes.
    But you can't extrapolate from a single-thread benchmark to a multi-thread practical case. It can work, but it doesn't have to.
    The cache system, core interconnect, memory subsystem, all are much more important with many cores working at the same time.
  • Kangal - Thursday, November 12, 2020

    Hi Andrei,
    I'm very disappointed with this article. It is not very professional nor up to Anandtech standards. Whilst I don't doubt the Apple A14/A14X/M1 is a very capable chipset, we shouldn't take Apple's claims at face-value. I feel like you've just added more fuel to the fire, that which is hype.

    I've read the whole thing, and you've left me thinking that this ARM chipset is supposedly similar to the 5W TDP we have on iPhones/iPads, and able to compete with 150W Desktop x86 chipsets. While that is possible, it doesn't pass the sniff test. And even more convoluted is that this chipset is supposed to extend the battery life notably (from 10hrs up to 17hrs or 20hrs) by a x1.7-x2.0 factor, yet the difference in the TDP is far greater (from 5W compared to 28W), a x4.5-x6.0 difference. So this is losing efficiency somewhere, otherwise we should've seen battery life estimates like 45hrs to 60hrs. Both laptops have the same battery size.

    Apple has not earned the benefit of the doubt, instead they have a track-record of lying (or "exaggerating"). I think these performance claims, and estimates by you, really needed to be downplayed. And we should be comparing ACTUAL performance when that data is available. And by that I mean running it within proper thermal limits (ie 10-30min runtime), with more rounded benchmarking tools (CineBench r23 ?), to deduce the performance deficits and improvements we are likely to experience in real-world conditions (medium duration single-core, thermal throttling multi-thread, GPU gaming, and power drain differences). Then we can compare that to other chipsets like the 15W Macbook Air, the 28W MacBook Pro, and Hackintosh Desktops with Core i9-9900k or r9-5950x chipsets. And if the Apple M1 passes with flying colours, great, hype away! But if they fail abysmally, then condemn. Or if it is very mixed, then only give a lukewarm reception.

    So please follow up this article, with a more accurate and comprehensive study, and revert back to the professional standards that allow us readers to continue sharing your site with others. Thank you for reading my concerns.
  • Kangal - Thursday, November 12, 2020

    I just want to add that during the recent announcement by Nvidia, we were led to believe that the RTX 3080 has a +100% performance uplift over the RTX 2080. Now that tests have been conducted by trustworthy, professional, independent reviewers, well, it is actually more like a +45% performance uplift. To get to the +70% -to- +90% performance uplift requires us to do some careful cherry-picking of data.

    My fear is that a similar case has happened with the Apple M1. With your help, they've made this look like it is as fast as an Intel Core i9-9900k. I suspect it will be much much much much slower, when looking at non-cherry picked data. And I suspect it will still be a modest improvement over the Intel 28W Laptop chipsets. But that is a far cry from the expectations that have been setup. Just like the case was with the RTX-3000 hype launch.
  • Spunjji - Thursday, November 12, 2020

    @Kangal - Personally, I'm very disappointed in various commenters' tendency to blame the article authors for their own errors in reading the article.

    Firstly, it's basically impossible to read the whole thing and come away with the idea that M1 will have a 5W TDP. It has double the GPU and large-core CPU resources of A14 - what was measured here - so logically it should start at somewhere around 10W TDP and move up from there.

    To your battery life qualms - throwing in some really simple estimates to account for other power draw in the system (storage, display, etc.) would get you to an understanding of why the battery life is "only" 1.7X to 2X their Intel models.

    As for Apple's estimates being "downplayed" - sure, only they provide *actual test data* in here that appears to validate their claims. I don't know why you think CineBench is more "rounded" than SPEC - the opposite is actually true; CineBench does lots of one thing that's easily parallelized, whereas SPEC tests a number of different features of a CPU based on a large range of workloads.

    In summary: your desire for this not to be as good as it *objectively* appears to be is what's informing your comment. The article was thoroughly professional. In case you're wondering, I generally despise Apple and their products - but I can see a well-designed CPU when the evidence is placed directly in front of me.
  • Kangal - Friday, November 13, 2020

    @Spunjji

    First of all, you are objectively wrong. It is not debatable, it is a fact that this article CAN be (could be, would be, has been) read and understood in a manner different to yours. So you can't just use a blanket statement like "you're holding it wrong" or "it's the reader's fault" when clearly there are things that can be done to mitigate the issue, and that was my qualm. This article glorifies Apple, when it should be cautioning consumers. I'm not opposed to glorifying things, credit where due.

    The fact is Andrei, who is representing Anandtech, is assuming a lot of the data points. He's taking Apple's word at face value. Imagine the embarrassment if they take a stance such as this, only to be proven wrong a few weeks later. What should have been done is that more effort and more emphasis should have been placed on comparisons to x86 systems. My point still stands, that there's a huge discrepancy between "User Interface fluidity", "Synthetic Benchmarks", "Real-world Applications", and "Legacy programs". And also there's the entire point of power-draw limitations, heat dissipation, and multi-threaded processing.

    Based on this article, people will see the ~6W* Apple A14 chipset is only 5%-to-10% slower than the ~230W (or 105W TDP) AMD r9-5950x that just released and topped all the charts. So if the Apple Silicon M1 is supposed to be orders of magnitude faster, (6W vs 12W or maybe even more), then you can make the logical conclusion that the Apple M1 is +80% -to- +290% faster when compared to the r9-5950x. That's insane. Yet it could be plausible. So the sensible thing to do is to be skeptical. As for CineBench, I think it is a more rounded test. I am not alone in this claim, many other users, reviewers, testers, and experts also vouch for it. Now, I'm not prepared to die on this hill, so I'll leave it at that.

    I realised the answer to the battery life question as I was typing it. And I do think a +50% to +100% increase is revolutionary (if tested/substantiated). However, the point was that Andrei was supposed to look into little details like that, and not leave readers thinking. I know that Apple would extend the TDP of the chip, that much is obvious to me even before reading anything, the issue is that this point itself was never actually addressed.

    Your summary is wrong. You assume that I have a desire, to see Apple's products to be lower than claimed. I do not. I am very unbiased, and want the data as clean as possible. Better competition breeds better progress. In fact, despite my reservations against the company, this very comment is being typed on an Early-2015 MacBook Pro Retina 13inch. The evidence that's placed in front of you isn't real, it is a guesstimate at best. There's many red-flags seeing their keynote and reading this article. Personally, I will have to wait for the devices to release, people to start reviewing them thoroughly, and I will have to think twice about digesting the Anandtech version when released. However, I'm not petty enough to boycott something because of subjective reasons, and will likely give Anandtech the benefit of the doubt. I hope I have satisfied some of your concerns.

    *based on a previous test by Anandtech.
  • Spunjji - Friday, November 13, 2020

    @Kangal - The fact that a reader *can* get through the whole thing whilst imposing their own misguided interpretations on it doesn't mean it's the author's fault for them doing so. Writers can't spend their time reinventing the wheel for the benefit of people who didn't do basic background reading that the article itself links to and/or acknowledge the article's stated limitations.

    Your "holding it wrong" comparison is a funny one. You've been trying to chastise the article's author for not explicitly preventing people from wilfully misinterpreting the data therein, which imposes an absurd burden on the author. To refer back to the "holding it wrong" analogy, you've tried to chew on the phone and are now blaming the phone company for failing to tell people not to chew on it. It's not a defensible position.

    As it stands, he assumes nothing - nothing is taken at face value with regard to the conclusions drawn. He literally puts their claims to the test in the only manner currently available to him at this point in time. The only other option is for him to not do this at all, which would just leave you with Apple's claims and nothing else.

    As it is, the article indicates that the architecture inside the A14 chip is capable of single-core results comparable to AMD and Intel's best. It tells us nothing about how M1 will perform in its complete form in full applications compared with said chips, and the article acknowledges that. The sensible thing to do is /interpret the results according to their stated limitations/, not "be sceptical" in some generic and uncomprehending way.

    I think this best sums up the problem with your responses here: "The evidence that's placed in front of you isn't real, it is a guesstimate at best". Being an estimate doesn't make something not real. The data is real, the conclusions drawn from it are the estimates. Those are separate things. The fact that you're conflating them - even though the article is clear about its intent - indicates that the problem is with how you're thinking about and responding to the article, not the article itself. That's why I assumed you were working from a position of personal bias - regardless of that, you're definitely engaged in multiple layers of flawed reasoning.
  • Kangal - Friday, November 13, 2020

    @Spunjji

    I agree, it is not the writer's fault for having readers misinterpret some things. However, you continue to fail to acknowledge that a writer actually has the means and opportunity to greatly limit such things. It is not about re-inventing the wheel, that's a fallacy. This is not about making misguided people change their minds, it is about allowing neutral readers to be informed with either tangible facts, or putting disclaimers on claims or estimates. I even made things simple, said that Andrei simply needed to address that the figures are estimates so that the x86 comparisons aren't absurd.

    "You're holding it wrong" is an apt analogy. I'm not chewing on the phone, nor the company. I've already stated my reservations (they've lied before, and aren't afraid of exaggerating things). So you're misguided here, if you actually think I was even defending such a position. I actually think you need to increase your reading comprehension, something that you actually have grilled me on. Ironic.

    I have repeated myself several times, there are some key points that need to be addressed (e.g. legacy program performance, real-world applications, multi-threaded, synthetic benchmarks, and user experience). None of these have been addressed. You said the article acknowledges this, yet you haven't quoted anything. Besides, my point was that this needed to be stressed in the article multiple times, not just as an off-hand remark (and even that wasn't made).

    Being an estimate doesn't make something not real. Well, sure it does. I can make estimates about a certain satellite's trajectory, yet it could all be bogus. I'm not conflating the issue, you have. I've displayed how the information presented could be misinterpreted. This is not flawed reasoning, this is giving you an example of how loosely this article has been written. I never stated that I've misinterpreted it, because I'm a skeptical individual and prefer to dive deeper; read back my comments and you can see I've been consistent on this point. Most other readers can and would make that mistake. And you know what, a quick look on other sites and YouTube, well it shows that is exactly what has happened (there are people thinking the MBA is faster than almost all high-end desktops).

    I actually do believe that some meaningful insights can be gathered by guesstimates. Partial information can be powerful, but it is not the complete information. Hence, estimates need to be taken with a pinch of salt, sometimes a little, other times a lot. Any professional who's worth their salt (pun intended) will make these disclaimers. Even when writing Scientific Articles, we're taught to always put disclaimers when making interpretations. Seeing the quality of writing drop on Anandtech begs one to not defend them, but to pressure them instead to improve.
  • varase - Wednesday, November 11, 2020

    Remember that M1 has a higher number of Firestorm cores which produce more heat - though not as much as x86 cores.

    There may be some throttling going on - especially on the fanless laptop (MacBook Air?).

    Jeez ... think of those compute numbers on a fanless design. Boggles the mind.

    Whenever you compare computers in the x86 world at any performance level at all, the discussion inevitably devolves into, "How good is the cooling?" Now imagine a competitor who can get by with passive heat pipe/case radiation cooling - and still sustain impressive compute numbers. Just the mechanical fan energy savings alone can go a good way to preserving battery life, not to mention a compute unit with such a lower TDP.
  • hecksagon - Tuesday, November 10, 2020

    These benchmarks don't show time as an axis. Yes the A14 can compete with an i7 laptop in bursty workloads. Once the iPhone gets heat soaked, performance starts to tank pretty quickly. This throttling isn't represented in the charts because these are short benchmarks and the performance isn't plotted over time.
  • Zerrohero - Wednesday, November 11, 2020

    Do you think that the M1 performance with active cooling will “tank” like A14 performance does in an iPhone enclosure?

    Do you understand how ridiculous your point is?
