Intel Roadmap Introduction

We've skipped a lot of the corporate and enterprise discussion for most of our recent Intel roadmaps. There's a good reason for the omission: very little had changed from the previous roadmaps. Every time we see a new Intel roadmap, the amount of information compressed into the 60 to 80 slides is simply staggering. Desktop, Mobile, Server, Enterprise, and Internet Appliance plans are included, and those are just the broad categories. Within each of those you find information on the chipsets, motherboards, and sometimes cases and other details. With few exceptions, there is almost always enough content for a couple of articles, though you often need to dig a little deeper to find the interesting bits. We've got several pieces coming out of our latest Intel roadmap, with this article focusing primarily on the corporate sector.

Besides the information density of their roadmaps, devoting large portions to the needs of businesses and corporations is one of the things that really sets Intel apart from AMD right now. Sure, AMD has the faster desktop parts, and there are many AMD adherents who feel that should be enough for anyone looking to purchase a new computer. For home users and enthusiasts, that attitude makes a lot of sense. Applied to everyone, though, it is, simply put, a distorted view of the world. As much as I like my AMD systems, if I were to start a business that had 25 or more computers, there's a good chance I'd be running Intel systems, and there's a good chance they would come from Dell (or some other large OEM). How can that be!? Am I not an enthusiast? That's near-blasphemy! Before any of you begin leveling charges of Intel favoritism, let me explain.

If I were running a medium-sized or larger business, as much as I would like to have faster systems, hand-building that many PCs is simply not a good use of time. Businesses normally want identical PCs (in order to simplify support), they want better warranties, they want one point of contact, and they want all of the systems assembled and delivered in a relatively short time frame. Higher performance that might enable employees to play games would actually be a bad thing, so sticking with an integrated graphics solution unless something faster is required would be a good idea. Finally, businesses don't want some fly-by-night shop to disappear after building the systems, leaving them to deal with problems on their own.

Until AMD can get partners that focus on bringing out Corporate/Enterprise desktop systems - not just "Small or Medium Business" systems - with AMD processors, most companies won't consider switching. (Incidentally, we're actually testing some AMD SMB systems right now, and they've left a good impression. It's unfortunate that they aren't billed as Large Business systems, though.)

Before we continue, it's worth noting that the end of the roadmap includes some information that can serve as a helpful glossary and/or technology primer. Intel throws around code names, acronyms, and technical jargon with wild abandon in their roadmaps, and we tend to follow suit. (We would guess that there are at least 50 code names listed in any given roadmap!) We'll use quite a few of these terms throughout many of our roadmap articles, so it's only fair to give you a quick cheat sheet.

Intel Technology Glossary

Feature: Description
Hyper-Threading Technology (HTT): Improves CPU utilization by processing two software threads on one core.
64-bit computing / Intel EM64T: 64-bit computing and related instructions.
Demand Based Switching (DBS) with EIST: Enables server/workstation platform to go into reduced power state during periods of low use.
PCI Express: Next generation serial I/O technology offering scalable bandwidth up to 8 Gigabits/Second.
DDR2 Memory: Enables faster memory and increased memory bandwidth at lower power compared to DDR.
Dual Core: Improves processor throughput by increasing CPU resources.
Intel I/O Acceleration Technology (I/OAT): Platform level I/O acceleration based on improvements in the Processor; MCH and LAN (ESB2 or NIC).
FBD (Fully Buffered DIMM) Memory: Next generation memory technology that uses DDR2 DRAMS in a serial point-to-point interconnect.
Intel Active Management Technology (IAMT): System state-independent access to management functions and asset data.
Intel Virtualization Technology (VT): Hardware enhancements to the processor enabling Improved virtualization solutions.
Pellston: Certain cache errors can be handled without restarting the system.
Foxton: Enables CPU to operate at increased frequency when CPU power is below specified max levels.

You know things are complex when the simplified definitions of terms include cross references and even self-references. Virtualization Technology enables improved virtualization solutions? Who would have guessed? If you'd like additional explanations of what some of the terms mean, feel free to ask and we'll do our best to answer. Several of the above features that are summarized with a single sentence could easily be the topic of a lengthy article.
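As one small illustration of how much a single-sentence definition can hide, here is a rough sketch of the behavior the Foxton entry describes: raise the clock while measured power is below a limit, and back off otherwise. This is a toy model and not Intel's implementation. The class name, the 50 MHz step, the 100 W limit, and the 1800 to 2000 MHz range are all hypothetical values chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoostController:
    """Toy model of a power-limited frequency boost (NOT Intel's actual design)."""

    base_mhz: int = 1800           # hypothetical rated frequency
    max_mhz: int = 2000            # hypothetical boost ceiling
    step_mhz: int = 50             # hypothetical adjustment step
    power_limit_w: float = 100.0   # hypothetical power ceiling

    def next_frequency(self, current_mhz: int, measured_power_w: float) -> int:
        """Return the frequency to use for the next sampling interval."""
        if measured_power_w < self.power_limit_w:
            # Headroom available: step up, but never beyond the boost ceiling.
            return min(current_mhz + self.step_mhz, self.max_mhz)
        # At or over the limit: step down, but never below the rated frequency
        # (clamping at the rated frequency is an assumption of this sketch).
        return max(current_mhz - self.step_mhz, self.base_mhz)


def simulate(controller: BoostController, power_samples_w, start_mhz: int):
    """Feed a series of power readings through the controller; return the frequencies chosen."""
    freq = start_mhz
    trace = []
    for power in power_samples_w:
        freq = controller.next_frequency(freq, power)
        trace.append(freq)
    return trace


if __name__ == "__main__":
    ctrl = BoostController()
    # A light workload leaves headroom, so the clock climbs to the ceiling and stays there.
    print(simulate(ctrl, [60, 60, 60, 60, 60], start_mhz=1800))
    # A heavy workload exceeds the limit, so the clock falls back toward the rated speed.
    print(simulate(ctrl, [120, 120], start_mhz=1900))
```

A real implementation would presumably also weigh temperature and voltage, and would react far faster than a simple loop like this suggests; the sketch only captures the "more frequency when under the power limit" idea from the glossary.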

View All Comments

  • IntelUser2000 - Monday, September 12, 2005 - link

    Itanium either supports hardware emulation OR software translation. The difference between emulation and translation may seem to be minimal, but translation has much better performance than emulation. While the hardware emulation just emulates instructions, the software translator dynamically optimizes the code on the fly to improve performance.

    Hardware emulation is NOT present on Montecito, in favor of IA-32EL (software translation).
  • IntelUser2000 - Monday, September 12, 2005 - link

    The MAJOR difference between Foxton and *OTHER* dynamic overclocking is that Foxton is implemented in HARDWARE, while other dynamic overclocking is based on SOFTWARE.

    I guess you guys may be referring to dynamic overclocking like MSI's D.O.T. or the one in the ATI Catalyst driver. But they are software based. 30 million of the LOGIC transistors are dedicated to JUST Foxton technology.

    Foxton isn't just dynamic overclocking. If the power consumption exceeds the set threshold, it clocks the CPU down until it's equal to or under the threshold point. Unlike conventional overclocking, Foxton FINDS the right point where it won't damage the CPU, while providing the maximum clockspeed the design can provide.

    OCing Prescott to 6GHz is not a safe point, BTW.

    Foxton responds extremely fast to demand and power consumption. The hardware support for Foxton's power management is extensive, basing its decisions on power consumption, temperature, and workload.
  • JarredWalton - Monday, September 12, 2005 - link

    Good points, and obviously I wasn't trying to get into the deep details of Itanium. I have a question for you, though, as you seem to know plenty about Itanium: Intel currently has IA-32EL; is there an IA-EM64T-EL in the works? (It might be called something else, but basically EM64T emulation for Itanium?)

    Even though Foxton is hardware based, we still don't know how it actually performs in practice - at least, I don't. (I probably never will, as I haven't even used an Itanium system other than to poke around a bit at some tradeshows.) 955 can run as high as 2.0 GHz under load - in practice, can you actually reach that speed most of the time, or is it more like 1.80 GHz for a bit, then 2.0 GHz for a bit, and maybe 1.90 GHz in between?

    Also, are you sure about the "30 million transistors" part? That's larger than the entire Itanium Merced core (not counting the L3 cache). I suppose if you're talking about all the debugging and monitoring transistors, 30 million might be possible, but I didn't think all of that was lumped under "Foxton"?
  • IntelUser2000 - Monday, September 12, 2005 - link

    I think there is a plan for an EM64T extension to IA-32EL. I heard from the Inquirer that Montvale may have that, but either I misunderstood it or it's a rumor. It's just software support, so I guess Intel can add it whenever they want to.

    For Foxton speeds, it depends. From what I understand, there is a thing called a power virus (a power virus is a malicious computer program that executes a specific instruction mix in order to establish the maximum power rating for a given CPU). If the number for a power virus is 1.0 (meaning 100% of maximum power), then for Linpack it's 0.8, specfp2k is 0.7, specint2k is 0.65, and TpmC is 0.6. Since TpmC is furthest away from the power virus figure, it would reach maximum speed all the time; for 9055, that is 2.0GHz. For speccpu2k, it may be 1.9GHz, and for Linpack it may be 1.8GHz. So for some programs, there may be no benefit AT ALL, while others may get the maximum.

    Foxton can sample every 8 µs to change voltage and frequency.

    Yes, I am sure about the Foxton hardware transistor count part. It uses a custom 32-bit DSP with its own RAM to process the data necessary for Foxton. I was sort of surprised but yeah, around 30 million. Sorry I couldn't give the link; I'll send it to you somehow if you tell me how, but I do remember clearly. Merced has 25 million transistors including 96KB L2; without it that's around 20 million I guess. McKinley is actually simpler and has fewer logic transistors than Merced, which according to some is around 15-17 million transistors.

    Montecito has 64 million transistors NOT including L2. 64-30=34 million/2=17 million transistors, which is right on mark for
  • IntelUser2000 - Wednesday, September 14, 2005 - link

    Well, I was KINDA right.


    Hewlett-Packard declared. 30 million transistors, as many as are in a Pentium II, are responsible solely for power management

    Though, yes that doesn't mean they are all for Foxton. Maybe, I don't know.

    Itanium Merced has 25.4 million transistors. ~6 million of that is dedicated to the x86 hardware emulator, which leaves 19.4 million transistors. Without the 96KB L2, it would be around 14-15 million transistors for Merced core logic.

  • IntelUser2000 - Wednesday, September 14, 2005 - link

    OTOH, I think the site could be wrong. It doesn't make sense with other Montecito papers saying it consumes less than 0.5W and takes less than 0.5% die size. I give up haha.
  • Jimw18600 - Monday, September 12, 2005 - link

    Your definition of HTT is a little skewed. It doesn't enable processing multiple threads; that was always there, whether they were earmarked or not. What it does do is this: instead of flushing the instruction buffer back to the missed branch, it restarts the broken thread and continues the rest forward. Broken threads are simply tossed out and resources are reclaimed in the last stage in the pipeline; completed threads are retired. And by the way, the reason Intel was forced to go to HTT was that they were heading for 31-stage pipelines. If you were still back at 12-15 stages, HTT didn't have that much to offer.
  • JarredWalton - Monday, September 12, 2005 - link

    My definition of HTT was actually taken directly from the roadmap. That's how Intel describes it, and obviously a 1 sentence summary leaves out a lot of details. HTT does allow the concurrent execution of more than one thread, but resource contention makes it difficult to say exactly how HTT will affect performance.

    One interesting point about SMT in general is that POWER5 doesn't have 20 to 31 pipeline stages and yet it still benefits from the IBM SMT design. This is purely a hunch on my part, but I wouldn't be at all surprised to see some form of HT come out for Conroe/Woodcrest in the future. Trouble filling all four issue slots from one thread? SMT could help out. We'll see if Intel does that or not in the future.

    Note: HTT was actually present (but disabled) since Northwood for sure. Some people suspect that it was actually present in an early form in Willamette. Just because Conroe doesn't currently show any HT support doesn't mean there aren't some deactivated features awaiting further testing. :)
  • IntelUser2000 - Monday, September 12, 2005 - link

    From what I understand, modern single-thread processors like the early Northwood P4s can execute multiple threads, but not ALL simultaneously. Since today's processors are fast enough anyway, it SEEMS like multi-tasking. The OS decides how to divide the time on the CPUs, I guess.

    HT makes use of the otherwise idle units, since it basically doubles the demand on the CPU. No single thread can take full advantage of the CPU (say it uses 15%), but a second thread makes it more efficient by raising that to 20% of the CPU, which is 33% better throughput. It is more complex than that, but I think that explanation is enough.

    Power 4/5 issue rate is 5-wide, which is quite a lot. It also has a 17-stage pipeline, which is close to Pentium 4 Willamette/Northwood. Wide and deep, with lots of bandwidth and enough execution units, it's perfect for SMT.
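
The "33% better throughput" figure in the comment above follows from simple arithmetic if you assume, as the comment does, that throughput scales directly with how much of the CPU is kept busy. The 15% and 20% values are the commenter's illustrative numbers, not measurements, and the function name below is just a hypothetical helper for this sketch.

```python
def relative_throughput_gain(util_before: float, util_after: float) -> float:
    """Fractional gain when utilization rises, assuming throughput is proportional to utilization."""
    if util_before <= 0:
        raise ValueError("util_before must be positive")
    return util_after / util_before - 1.0


# The commenter's example: one thread keeps 15% of the CPU busy, two threads keep 20% busy.
gain = relative_throughput_gain(0.15, 0.20)
print(f"{gain:.0%}")  # prints "33%"
```

Real Hyper-Threading gains depend on resource contention between the two threads, so a proportional model like this is only a first approximation.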
  • coomar - Monday, September 12, 2005 - link

    kind of difficult to read the confidential

    virtualization sounds interesting
