This afternoon I was feeling bored, so I decided to get some practical IA64 practice in: I exported a makefile for the project I was working on, changed a couple of things (notably switching /MACHINE from I386 to IA64) and turned on /FAs, which makes the compiler emit an assembly listing alongside the source. This lets me read the assembly of code I'm already familiar with. I don't have an IA64 machine, so I have to be my own processor while reading it. My brain is getting into quite a tangle, because the compiler moves many operations up the instruction stream from where they were requested in the source (which C++ permits so long as the program behaves, as seen from outside, as if the operations happened in source order). It also generates quite a bit of speculative code, computing results that may never be needed.
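To make that reordering concrete, here is a minimal sketch written in C++ rather than IA64 assembly. It is not taken from the project or from any real compiler output; the function names as_written and as_scheduled are made up for illustration. The second function shows by hand the kind of rearrangement a scheduling compiler may effectively perform: both candidate results are computed up front, and one is thrown away.

```cpp
#include <cassert>

// As written in the source: the multiply is only requested when the flag is set.
// (Hypothetical example function, not from the original project.)
unsigned as_written(bool use_scaled, unsigned value, unsigned scale) {
    if (use_scaled) {
        return value * scale;
    }
    return value + 1u;
}

// Hand-written picture of what a scheduler might do: both results are
// computed before the decision is known, then one of them is selected.
// Unsigned arithmetic is used so that the extra, possibly unneeded,
// computation cannot introduce undefined behaviour.
unsigned as_scheduled(bool use_scaled, unsigned value, unsigned scale) {
    unsigned scaled = value * scale;   // speculative: may never be needed
    unsigned bumped = value + 1u;      // speculative: may never be needed
    return use_scaled ? scaled : bumped;
}
```

Calling both with the same arguments, for example as_written(true, 3u, 4u) and as_scheduled(true, 3u, 4u), gives the same answer, which is all the language requires. The cost is only to the person reading the listing, who sees work appear well before the test that decides whether it matters.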
Of course, this is different from modern x86 processors, which take the x86 instruction stream, convert it into smaller RISC-like operations, schedule those out of order with speculative execution, then work out how to recover the x86 state from the results. The price is that a large proportion of such a processor is made up of the translators, out-of-order schedulers, and instruction-retirement logic - larger than the portion that actually performs the computations. IA64 offloads all this work onto the compiler - the core itself is a simple in-order execution engine. The instruction stream explicitly tells the processor which instructions can execute in parallel, and which depend upon each other.
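As a rough illustration of what "explicitly tells the processor" means, here is a toy model in C++ of the grouping decision the compiler has to make. It is a sketch under simplifying assumptions, not a description of a real IA64 scheduler: the Op struct and place_stops function are hypothetical, every instruction has exactly one destination and two source registers, and the only rule modelled is that an instruction may not share a group with an earlier instruction whose result it reads or overwrites.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy instruction: one destination register, two sources.
struct Op {
    int dest;
    int src1;
    int src2;
};

// Returns the indices of the ops after which a group boundary is placed.
// Everything between two boundaries is assumed safe to issue in parallel.
std::vector<std::size_t> place_stops(const std::vector<Op>& ops) {
    std::vector<std::size_t> stops;
    std::set<int> written;  // registers written so far in the current group
    for (std::size_t i = 0; i < ops.size(); ++i) {
        const Op& op = ops[i];
        // Conflict if this op reads or rewrites a register that an earlier
        // op in the same group has already written.
        bool conflict = written.count(op.src1) != 0 ||
                        written.count(op.src2) != 0 ||
                        written.count(op.dest) != 0;
        if (conflict) {
            stops.push_back(i - 1);  // i > 0 here, since written is non-empty
            written.clear();
        }
        written.insert(op.dest);
    }
    return stops;
}
```

Given two independent operations followed by one that consumes both results, the model puts the first two in one group and the consumer in the next. On an out-of-order x86 the hardware rediscovers this kind of dependency information at run time; here it is worked out once, at compile time, and recorded in the instruction stream.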