Chip Architecture: Intel Still Winning?
Several points, some minor, some major...
The cost of manufacture of the wafers is a constant. It costs the same amount to make a wafer with 100 ICs on it or to make a wafer with 1000 ICs on it.
This is a gross oversimplification. Over the life of an older process, the (economic) cost per wafer started high, then dropped to a plateau. In practice a full node step, like 90 nm to 65 nm, cut the area of equivalent parts roughly in half. So once the new process became stable, the cost per piece for equivalent products dropped by about half. This has been the driver behind technological innovation in chipmaking for decades.
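To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes an ideal linear shrink and a wafer cost that has already settled to the same plateau on both nodes; the wafer cost and die count are made-up placeholders, not real foundry figures.

```python
def relative_area(old_nm: float, new_nm: float) -> float:
    """Area of the same design after an ideal linear shrink, relative to the old node."""
    return (new_nm / old_nm) ** 2


def cost_per_die(wafer_cost: float, dies_per_wafer: int) -> float:
    """Cost per die, ignoring yield and edge losses."""
    return wafer_cost / dies_per_wafer


WAFER_COST = 5000.0  # hypothetical cost per wafer, same for both nodes
OLD_DIES = 500       # hypothetical dies per wafer at 90 nm

area_ratio = relative_area(90, 65)     # about 0.52
new_dies = int(OLD_DIES / area_ratio)  # dies per wafer after the shrink

old_cost = cost_per_die(WAFER_COST, OLD_DIES)
new_cost = cost_per_die(WAFER_COST, new_dies)

print(f"area ratio 90 -> 65 nm: {area_ratio:.3f}")
print(f"dies per wafer: {OLD_DIES} -> {new_dies}")
print(f"cost per die:   {old_cost:.2f} -> {new_cost:.2f}")
```

Real parts never shrink this cleanly (pads, analog blocks and interconnect do not scale with the transistors), so treat the factor of two as an upper bound.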
Around the 130 nm node, which was a long time ago in semiconductor terms, the resistance and reactance of the interconnects between transistors started growing faster than the shrink reduced the interconnect lengths. Also, chips are three dimensional. Signals go up and down, in addition to forward, back, left and right. The thickness of the gate oxide has decreased about as much as possible, but the thickness of resist and interconnect layers has held relatively constant. (There is actually some advantage to be gained by making the uppermost layers thicker.) All this means that for a given CPU core design, more transistors are needed to amplify signals which travel any distance on the chip.
Let me say a bit more about half-pitch. At a half-pitch of 45 nm you can draw lines 90 nm apart. The line can be 30 nm wide or 60, with the rest of the pitch spent on the gaps between the lines. How do you choose narrow or fat lines? You can underexpose or overexpose the whole wafer, or change the width of the lines in the mask. That doesn't change the half-pitch, but it does change the total amount of light that gets through. Up until recently, the smallest transistors have been about one pitch, or two half-pitches, square. Now Intel is using FinFETs, or tri-gate transistors, where the channel is a raised fin between source and drain and the gate wraps around three sides of it. If you are very careful in creating this more complex transistor, you can make it not much larger than the transistors it replaces. Does this wipe out the advantage of 22 nm production? Not completely, but add in...
There has been no progress towards higher frequency lasers in the photolithography parts of the chipmaking process in a long time. Using new and more complex equipment has allowed ArF (193 nm) lasers to be extended further and further. Rayleigh equation: CD = k1 · λ / NA, where λ is the wavelength of (ultraviolet) light used, NA is the numerical aperture of the optics, and CD is the critical dimension, the smallest dimension you can create in theory. Water immersion lithography has allowed NA to be pushed past 1, but getting to 1.4 will require switching to something other than water. Since λ has been fixed for almost a decade, how do you get finer features? You can "push" k1 by using a resist with a high amplification factor, but those resists also make images fuzzier, and there is a limit. The solution? Expose a wafer more than once. Make the traces thin, and interleave them. This way, with double patterning, you can get twice as many lines in a given width. Two sets of lines with a 45 nm half-pitch and, say, a 12 nm line width result in a 22 nm effective half-pitch, with gaps of roughly 33 nm between neighboring lines. (Very important where you have parallel traces, and in all the cache memory areas on chip.) But you just doubled the largest cost (and the most capital intensive) step in the process.
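As a quick sanity check on the Rayleigh equation and the double patterning arithmetic, here is a small sketch. The k1 and NA values are illustrative assumptions on my part, not numbers from any particular scanner, and the interleaving function assumes perfectly even overlay between exposures.

```python
def critical_dimension(k1: float, wavelength_nm: float, na: float) -> float:
    """Rayleigh criterion: CD = k1 * wavelength / NA, in nm."""
    return k1 * wavelength_nm / na


def interleaved_half_pitch(single_half_pitch_nm: float, exposures: int) -> float:
    """Effective half-pitch when `exposures` identical line sets are evenly interleaved."""
    return single_half_pitch_nm / exposures


ARF_WAVELENGTH_NM = 193.0  # ArF excimer laser, as in the text

K1 = 0.30            # assumed process factor, for illustration only
NA_DRY = 0.93        # assumed NA for a dry (non-immersion) tool
NA_IMMERSION = 1.35  # assumed NA for a water immersion tool

cd_dry = critical_dimension(K1, ARF_WAVELENGTH_NM, NA_DRY)
cd_immersion = critical_dimension(K1, ARF_WAVELENGTH_NM, NA_IMMERSION)
print(f"CD, dry:       {cd_dry:.1f} nm")
print(f"CD, immersion: {cd_immersion:.1f} nm")

# Double patterning: two exposures, each at a 45 nm half-pitch.
for exposures in (1, 2, 3):
    hp = interleaved_half_pitch(45.0, exposures)
    print(f"{exposures} exposure(s): half-pitch {hp:.1f} nm, litho passes x{exposures}")
```

The last loop is the whole point: each halving (or thirding) of the pitch is paid for with another full pass through the most expensive tool in the fab.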
So why go to 22 nm at all? If you think it will lead to 15 nm and below, it is a no-brainer. But for now it means that the full step from 32 nm doesn't buy much and the half-step from 28 nm buys nothing at all. Don't get me wrong. Intel's processor designs will improve from 32 nm, and the FinFET is a much more effective transistor, as long as you have to build it. But the FinFET was invented about a decade ago. Everybody has been holding it in readiness for when other solutions to the leakage problem fell short. (Leakage is, in effect, the current through a turned off CMOS transistor pair. When you have billions of transistors on a chip, even a little leakage creates a lot of extra heat.)
Anyway, even if Intel goes to triple patterning for the 15 nm node, they will still reduce costs per CPU core. The various foundries are also using double patterning at the 28 nm half-step. (Physics is the same for everyone.) TSMC's 28 nm process is very nice, but suffers right now from a lack of capacity. Immersion wafer scanners and steppers are extremely expensive, and there are only three companies I know of that make them: ASML in the Netherlands, Canon, and Nikon. AFAIK Ultratech doesn't make immersion steppers.
TSMC will get past the low yields at 28 nm. (Which are not significantly low by traditional standards for this early in the process node.) It's just that demand is so high, since AMD and nVidia are using it for their high-end GPU production. In fact, watching the price of AMD's 7870, 7850, and 7770 cards will give you great visibility into TSMC's yields. Same for nVidia when they bring out performance and mid-range members of their new Kepler line. (The highest of high-end parts are not sensitive to the die cost.) Will AMD use TSMC for 28 nm CPUs? Sure, in their Bobcat series of chips. The drop in leakage (partially from design) is already impressive in their 28 nm GPUs.
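Why do card prices track yields so closely? A rough sketch, using the simple Poisson yield model (yield = exp(-die area x defect density)), which is a textbook approximation and my choice here, not something TSMC publishes. The wafer cost, die area, die count and defect densities below are all invented for illustration.

```python
import math


def poisson_yield(die_area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of good dies under the simple Poisson defect model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


def cost_per_good_die(wafer_cost: float, gross_dies: int, yield_fraction: float) -> float:
    """Wafer cost spread over only the dies that actually work."""
    return wafer_cost / (gross_dies * yield_fraction)


WAFER_COST = 5000.0  # hypothetical
GROSS_DIES = 250     # hypothetical dies per wafer for a mid-size GPU
DIE_AREA_CM2 = 2.0   # hypothetical die area

# Hypothetical defect densities for an early and a mature process.
for label, d0 in (("early", 0.5), ("mature", 0.1)):
    y = poisson_yield(DIE_AREA_CM2, d0)
    cost = cost_per_good_die(WAFER_COST, GROSS_DIES, y)
    print(f"{label:>6}: yield {y:.1%}, cost per good die {cost:.2f}")
```

With these placeholder numbers the cost per good die more than halves as the process matures, which is the kind of movement that shows up in mid-range card pricing.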
The next step for AMD and nVidia at TSMC will be the 20 nm process node, which is one (full) step down from 28 nm. What about Global Foundries? In addition to competing for the bulk semiconductor merchant fab business, they are still making SOI chips for AMD at 32 nm. The Bulldozer family has been sort of underwhelming, but the issue has been leakage. Moving Bulldozer (actually its successors) to FinFETs should allow extremely high clock speeds. (Actually if you know what you are doing, you can push Bulldozer chips to about 6 GHz without much trouble.*) The other AMD chip fabbed at 32 nm at GloFlo is the Llano "APU", which has been pretty successful since it has "decent enough" graphics built in. (Better than Sandy Bridge, although Sandy Bridge has a much more powerful CPU.) The latest family of APUs from AMD, Trinity, will be out shortly, also using GloFlo's 32 nm process.
It is possible that GloFlo's new fab in Malta, New York will use EUV at 20, 22, or 28 nm. (Place your bets.) If some of the potential problems with EUV can be worked around, GloFlo will have a facility that can expose wafers at a much lower price than Intel. The big problem right now is in the resist area. EUV right now needs a high amplification factor, but if the amplification factor is too high it smears out the features. The fix may be a very fine emulsion, where one photon will expose an entire bubble, but not spread to neighboring bubbles.
At one time I thought that EUV was the only way to get past 20 nm. Right now it looks as if, even if EUV is practical in the 15 to 32 nm range, it certainly won't get to 10 nm. Besides, before the technology gets there, better interconnects than copper are needed. My guess is that the first step in this direction will be (conducting) carbon nanotubes embedded in copper. Eventually nanotubes grown in place will be needed.
Is Intel winning the race today? Sure. Will they be winning two years from now? Almost certainly. Beyond that we get into an area where Moore's Law is likely to break down (for a while) as new and very different technologies are developed to get below 15 to 20 nm. Am I calling the end to Moore's Law? Nope. Just saying that at some point you have to shift from tuning today's best processes to new processes that will have different problems. EUV is one example. It may not be a win when it first enters production, but as the learning curve sets in, EUV has a lot more headroom than ArF litho. It also has new sets of problems. :-( Same with e-beam writing, and nano-imprinting, but one of them will win out. (Well maybe self-assembly can be combined with current interconnect technologies. Very small transistors that build themselves. ;-)
* The first thing you do is turn the voltage down as much as possible. Good overclocking chips are those that can still run at a lower voltage. You may end up somewhere between the original voltage and the lowest possible voltage at your highest overclock. I bought a six-core Bulldozer rated at 95 watts, so when I push the clocks, and my chip, to 125 watts, my motherboard is still within spec. ;-)
The other problem/oddity of Bulldozer is an old AMD trait going back to the Athlon days. The best memory is the lowest latency memory. I have DDR3 1866 memory which actually works better if I turn the speed down a bit (and voltage up a bit) to shave (memory) clock cycles from tCAS, tRCD, tRP, and tRAS.
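To see why a slower memory clock with tighter timings can win, convert the timings from clock cycles to nanoseconds. This is only a sketch: the CAS values below are hypothetical examples, not the settings on my modules, and it looks at tCAS alone rather than the full access sequence.

```python
def latency_ns(cycles: int, data_rate_mts: float) -> float:
    """Convert a timing given in memory clock cycles to nanoseconds.

    DDR memory transfers data twice per clock, so the memory clock in MHz
    is half the advertised data rate (e.g. DDR3-1600 runs an 800 MHz clock).
    """
    clock_mhz = data_rate_mts / 2
    return cycles * 1000.0 / clock_mhz


# Hypothetical settings: stock speed with loose timings vs. slower speed, tighter timings.
stock = latency_ns(cycles=10, data_rate_mts=1866)  # DDR3-1866, tCAS 10
tuned = latency_ns(cycles=7, data_rate_mts=1600)   # DDR3-1600, tCAS 7

print(f"DDR3-1866 CL10: {stock:.2f} ns")
print(f"DDR3-1600 CL7:  {tuned:.2f} ns")
```

The slower setting gives up some peak bandwidth, but each access comes back sooner, and that is the trade that has tended to favor AMD chips.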