
Itanium Montecito stuff

 
Yousuf Khan1

External


Since: Dec 13, 2003
Posts: 214



(Msg. 1) Posted: Sun Nov 16, 2003 7:50 pm
Post subject: Itanium Montecito stuff
Archived from groups: comp.sys.ibm.pc.hardware.chips, others

Multicore, simultaneous multi-threading, and 24MB of cache. Looks like this one
was designed with help from the Alpha team that Intel just bought out
recently from HPaq.

Yousuf Khan

http://www.theinquirer.net/?article=12686

Robert Myers

External


Since: Oct 06, 2003
Posts: 156



(Msg. 2) Posted: Sun Nov 16, 2003 7:50 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

On Sun, 16 Nov 2003 16:50:50 GMT, "Yousuf Khan"
<removethisspam.bjsk90.removethispam.TakeThisOut@hotmail.com> wrote:

 >Multicore, simultaneous multi-threading, and 24MB of cache. Looks like this one
 >was designed with help from the Alpha team that Intel just bought out
 >recently from HPaq.
 >
 > Yousuf Khan
 >
 >http://www.theinquirer.net/?article=12686
 >

SMT was always aimed at Itanium. You can achieve most of the benefits
of OoO execution without actually going OoO by using SMT helper
threads. If you're supporting two cores with four threads each, the
huge cache is inevitable.

RM
Bill Todd

External


Since: Jul 19, 2004
Posts: 15



(Msg. 3) Posted: Sun Nov 16, 2003 7:50 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

"Robert Myers" <rmyers.RemoveThis@rustuck.com> wrote in message
news:bsgfrvcg4lfs92524p2r2i2tnqe3hhbs36@4ax.com...
 > On Sun, 16 Nov 2003 16:50:50 GMT, "Yousuf Khan"
 > <removethisspam.bjsk90.removethispam.RemoveThis@hotmail.com> wrote:
 >
  > >Multicore, simultaneous multi-threading, and 24MB of cache. Looks like this one
  > >was designed with help from the Alpha team that Intel just bought out
  > >recently from HPaq.

I kind of doubt that. Those people are reportedly all working on
Tanglewood, and any Itanic SMT effort aimed at shipping in 2005 would have had
to start at least a bit before the first of them settled in at Intel.
They may have offered comments, but I suspect that whatever SMT
mechanism may be incorporated into Itanic (I'm still a bit skeptical of this
report, though it does seem to be pretty widespread) differs sufficiently at a
very basic level from what they were working on for EV8 that their
experience may not have been directly transferable.

  > >
  > > Yousuf Khan
  > >
  > >http://www.theinquirer.net/?article=12686
  > >
 >
 > SMT was always aimed at Itanium.

Really? My impression is that the Itanic architecture was largely
established somewhat before SMT appeared on the horizon, that most of the
coordination by the University of Washington researchers was with DEC and
Alpha, and that SMT is particularly amenable to leveraging existing
mechanisms for out-of-order execution (e.g., in Alpha) that are
conspicuously absent in Itanic.

Intel may later have investigated ways to make use of SMT in Itanic, but I
think it was definitely a retrofit.

 > You can achieve most of the benefits
 > of OoO execution without actually going OoO by using SMT helper
 > threads.

Maybe. But without doubt one of the things that you sacrifice is power
efficiency (not that Itanic appears to worry about this much), since without
the OoO hardware facilities you don't have a clue whether the extra work
you're doing will be useful (and even if it is useful in preloading the
caches, when the *real* code path reaches that point the instructions still
get executed a second time anyway).

Such helper threads are also a lot more expensive in use of execution units
than OoO SMT mechanisms are (again, because of the redundant or useless
execution activity noted above), so you need more EUs (and thus more core
area, which starts to limit clock rates unless you go asynchronous) than
you'd need in an OoO SMT implementation to perform as well.
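The accounting behind this argument can be sketched with a toy model. Everything here is an assumption made for illustration: the function name, the parameters `slice_fraction` and `useless_fraction`, and the sample numbers are hypothetical and are not measurements of Itanium or any other core.

```python
def helper_thread_overhead(main_instrs, slice_fraction, useless_fraction):
    """Toy count of issue slots for a main thread plus a helper thread.

    main_instrs      -- instructions the real code path executes
    slice_fraction   -- helper instructions as a fraction of main_instrs
                        (hypothetical parameter)
    useless_fraction -- share of helper instructions on paths the main
                        thread never reaches (hypothetical parameter)
    """
    helper_instrs = main_instrs * slice_fraction
    # Work on paths the main thread never takes: pure waste.
    wasted = helper_instrs * useless_fraction
    # Work that warms the cache but is executed again by the main thread.
    redundant = helper_instrs - wasted
    return {
        "total_slots": main_instrs + helper_instrs,
        "wasted": wasted,
        "redundant": redundant,
        "overhead": helper_instrs / main_instrs,
    }


# Made-up sample numbers: helper slices are 25% of the main thread's
# instruction count, and 40% of that work turns out to be useless.
stats = helper_thread_overhead(1_000_000, slice_fraction=0.25,
                               useless_fraction=0.4)
print(stats)
```

In this model every helper instruction is extra issue-slot demand, whether it ends up in the "wasted" or the "redundant" bucket; only the split between the two changes with how well the helper guesses.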

 > If you're supporting two cores with four threads each,

Do you have a source for the suggestion that each Montecito core supports 4
threads?

 > the
 > huge cache is inevitable.

Not if you're primarily using the SMT for helper threads (not that I'm
suggesting that this is a great idea).

- bill
Robert Myers

External


Since: Oct 06, 2003
Posts: 156



(Msg. 4) Posted: Sun Nov 16, 2003 7:50 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

On Sun, 16 Nov 2003 15:00:21 -0500, "Bill Todd"
<billtodd.TakeThisOut@metrocast.net> wrote:

 >
 >"Robert Myers" <rmyers.TakeThisOut@rustuck.com> wrote in message
 >news:bsgfrvcg4lfs92524p2r2i2tnqe3hhbs36@4ax.com...
  >> On Sun, 16 Nov 2003 16:50:50 GMT, "Yousuf Khan"
  >> <removethisspam.bjsk90.removethispam.TakeThisOut@hotmail.com> wrote:
  >>
<snip>
  >>
  >> SMT was always aimed at Itanium.
 >
 >Really? My impression is that the Itanic architecture was largely
 >established somewhat before SMT appeared on the horizon, that most of the
 >coordination by the University of Washington researchers was with DEC and
 >Alpha, and that SMT is particularly amenable to leveraging existing
 >mechanisms for out-of-order execution (e.g., in Alpha) that are
 >conspicuously absent in Itanic.
 >

Oh, there I go again.

SMT at _Intel_ was always aimed at Itanium.

 >Intel may later have investigated ways to make use of SMT in Itanic, but I
 >think it was definitely a retrofit.
 >

I don't think there's much doubt about that.

  >> You can achieve most of the benefits
  >> of OoO execution without actually going OoO by using SMT helper
  >> threads.
 >
 >Maybe. But without doubt one of the things that you sacrifice is power
 >efficiency (not that Itanic appears to worry about this much), since without
 >the OoO hardware facilities you don't have a clue whether the extra work
 >you're doing will be useful (and even if it is useful in preloading the
 >caches, when the *real* code path reaches that point the instructions still
 >get executed a second time anyway).
 >

I expect helper threads to find a place even in OoO processors. The
available work on prescheduled speculative slices looks very
promising. A helper thread would also make things like DynamoRIO look
more attractive.

 >Such helper threads are also a lot more expensive in use of execution units
 >than OoO SMT mechanisms are (again, because of the redundant or useless
 >execution activity noted above), so you need more EUs (and thus more core
 >area, which starts to limit clock rates unless you go asynchronous) than
 >you'd need in an OoO SMT implementation to perform as well.
 >

A paper at SC 2003 suggests that "arithmetic is free, bandwidth is
expensive." If someone else doesn't get there first, I'll post a
thread for discussion. It warrants a separate thread.

  >> If you're supporting two cores with four threads each,
 >
 >Do you have a source for the suggestion that each Montecito core supports 4
 >threads?
 >

The paper I cited previously in comp.arch
:
:http://www.cs.ucsd.edu/users/jbrown/papers/sp-cmp.pdf
:
:"Speculative Precomputation on Chip Multiprocessors"
:
:which I gather is from
:
:6th Workshop on Multithreaded Execution, Architecture, and Compilation
:(MTEAC-6) Tuesday, November 19 (2002) Istanbul, Turkey.
:
:"Figure 2 indicates that across the board, SMT consistently
:provides the greatest speedup of the four configurations
:shown, even though it has the fewest overall execution
:resources and the least amount of aggregate cache capacity."
:
:with the four configurations being 4-way SMT, vs 2, 4, and 8 way CMP.

 > the
  >> huge cache is inevitable.
 >
 >Not if you're primarily using the SMT for helper threads (not that I'm
 >suggesting that this is a great idea).
 >

Scheduling helper threads without a roomy cache is tricky. The whole
purpose is to pull stuff into cache ahead of time, and it would be
annoying to have a helper thread bump something else out of cache that
was needed sooner than what the helper thread just pulled in.
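The eviction worry can be illustrated with a small simulation. This is a toy sketch under loud assumptions: a fully associative LRU cache, a main thread that alternates between a small set of reused "hot" lines and a stream of new lines, and a helper thread that prefetches the stream a fixed number of steps ahead. The names `LRUCache` and `main_thread_misses` and all parameters are hypothetical; nothing here models the real Itanium cache hierarchy.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Touch addr, filling it on a miss. Return True on a hit."""
        hit = addr in self.lines
        if hit:
            self.lines.move_to_end(addr)
        else:
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict the LRU line
        return hit


def main_thread_misses(capacity, lookahead, hot_lines=8, steps=1000):
    """Count misses seen by the main thread only.

    lookahead == 0 disables the helper; otherwise the helper prefetches
    the stream line the main thread will need `lookahead` steps later.
    """
    cache = LRUCache(capacity)
    misses = 0
    for i in range(steps):
        if lookahead:
            cache.access(("stream", i + lookahead))  # helper prefetch
        # Main thread: one reused hot line, then one new stream line.
        for addr in (("hot", i % hot_lines), ("stream", i)):
            if not cache.access(addr):
                misses += 1
    return misses


for capacity in (16, 32):
    print(capacity,
          main_thread_misses(capacity, lookahead=0),
          main_thread_misses(capacity, lookahead=8))
```

With these toy parameters the 16-line cache goes from 1008 main-thread misses to 2000 once the helper is switched on, because the prefetched lines push out both the hot lines and each other before they are used. The 32-line cache drops from 1008 to 16. How far this carries over to a real multi-level hierarchy is a separate question.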

RM
Peter_Perlsø

External


Since: Nov 23, 2003
Posts: 15



(Msg. 5) Posted: Sun Nov 16, 2003 9:03 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

Yousuf Khan wrote:

 > Multicore, simultaneous multi-threading, and 24MB of cache. Looks like this one
 > was designed with help from the Alpha team that Intel just bought out
 > recently from HPaq.
 >
 > Yousuf Khan
 >
 > http://www.theinquirer.net/?article=12686
 >
 >

24 Megs of high-speed SRAM ???

Think $$$!

--



- Peter Perlsø - web: http://u238.dk

"If you have been voting for politicians who promise to give you goodies
at someone else's expense, then you have no right to complain when they
take your money and give it to someone else, including themselves."

-- Thomas Sowell (1992)
Yousuf Khan1

External


Since: Dec 13, 2003
Posts: 214



(Msg. 6) Posted: Sun Nov 16, 2003 9:03 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

"Peter Perlsø" <nospam.DeleteThis@nospam.com> wrote in message
news:3fb7ade3$0$27424$edfadb0f@dread16.news.tele.dk...
  > > Multicore, simultaneous multi-threading, and 24MB of cache. Looks like this one
  > > was designed with help from the Alpha team that Intel just bought out
  > > recently from HPaq.
 >
 > 24 Megs of high-speed SRAM ???
 >
 > Think $$$!

Yeah, I'm not even sure why they're dicking around. Just get it over and
done with, put 1GB of SRAM on it, and get rid of that DRAM already. That
would be a feature of the processor: it doesn't need any external RAM. :-)
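For a sense of scale, here is a back-of-envelope transistor count for both figures in this exchange. It assumes the classic six-transistor (6T) SRAM cell and counts only the data array, ignoring tags, ECC, decoders and redundancy, so real designs would need more. The thread does not say which cell Montecito uses; the 6T figure and the function name are assumptions.

```python
# Assumption: standard 6T SRAM cell, data array only.
TRANSISTORS_PER_BIT = 6
MIB = 2**20
GIB = 2**30


def sram_cell_transistors(capacity_bytes):
    """Transistors needed just for the storage cells of an SRAM array."""
    return capacity_bytes * 8 * TRANSISTORS_PER_BIT


cache_24mb = sram_cell_transistors(24 * MIB)
sram_1gb = sram_cell_transistors(1 * GIB)

print(f"24 MB of SRAM: {cache_24mb / 1e9:.2f} billion transistors")
print(f"1 GB of SRAM:  {sram_1gb / 1e9:.2f} billion transistors")
```

Under those assumptions the 24 MB cache alone is roughly 1.2 billion cell transistors, and 1 GB would be about 51.5 billion.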

Yousuf Khan
Bill Todd

External


Since: Jul 19, 2004
Posts: 15



(Msg. 7) Posted: Sun Nov 16, 2003 10:15 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

"Robert Myers" <rmyers.RemoveThis@rustuck.com> wrote in message
news:aaofrv8o2m2955keltiu8e3vlhiob0n077@4ax.com...
 > On Sun, 16 Nov 2003 15:00:21 -0500, "Bill Todd"
 > <billtodd.RemoveThis@metrocast.net> wrote:
 >
  > >
  > >"Robert Myers" <rmyers.RemoveThis@rustuck.com> wrote in message
  > >news:bsgfrvcg4lfs92524p2r2i2tnqe3hhbs36@4ax.com...

....

   > >> You can achieve most of the benefits
   > >> of OoO execution without actually going OoO by using SMT helper
   > >> threads.
  > >
  > >Maybe. But without doubt one of the things that you sacrifice is power
  > >efficiency (not that Itanic appears to worry about this much), since without
  > >the OoO hardware facilities you don't have a clue whether the extra work
  > >you're doing will be useful (and even if it is useful in preloading the
  > >caches, when the *real* code path reaches that point the instructions still
  > >get executed a second time anyway).
  > >
 >
 > I expect helper threads to find a place even in OoO processors.

Possibly, but I suspect only in situations where the workload has fewer
threads than the SMT core supports: otherwise, the other core threads will
likely be far more effective servicing real threads and leaving the
individual thread IPC up to the OoO mechanisms. With Itanic, the trade-off
may be less clear (since it has more to gain on an individual thread from SP
than an OoO core does).

 > The
 > available work on prescheduled speculative slices looks very
 > promising. A helper thread would also make things like DynamoRIO look
 > more attractive.
 >
  > >Such helper threads are also a lot more expensive in use of execution units
  > >than OoO SMT mechanisms are (again, because of the redundant or useless
  > >execution activity noted above), so you need more EUs (and thus more core
  > >area, which starts to limit clock rates unless you go asynchronous) than
  > >you'd need in an OoO SMT implementation to perform as well.
  > >
 >
 > A paper at SC 2003 suggests that "arithmetic is free, bandwidth is
 > expensive."

Free in what respect(s)? The specific context above is power and chip area
(and by extension of the latter clock rate).

 > If someone else doesn't get there first, I'll post a
 > thread for discussion. It warrants a separate thread.
 >
   > >> If you're supporting two cores with four threads each,
  > >
  > >Do you have a source for the suggestion that each Montecito core supports 4
  > >threads?
  > >
 >
 > The paper I cited previously in comp.arch
 > :
 > :http://www.cs.ucsd.edu/users/jbrown/papers/sp-cmp.pdf
 > :
 > :"Speculative Precomputation on Chip Multiprocessors"
 > :
 > :which I gather is from
 > :
 > :6th Workshop on Multithreaded Execution, Architecture, and Compilation
 > :(MTEAC-6) Tuesday, November 19 (2002) Istanbul, Turkey.
 > :
 > :"Figure 2 indicates that across the board, SMT consistently
 > :provides the greatest speedup of the four configurations
 > :shown, even though it has the fewest overall execution
 > :resources and the least amount of aggregate cache capacity."
 > :
 > :with the four configurations being 4-way SMT, vs 2, 4, and 8 way CMP.

That paper concentrates on SP in CMP-only environments, and uses the
4-thread SMT core only for comparison purposes. There's nothing in it to
suggest that it refers in any way specifically to Montecito.

 >
  > > the
   > >> huge cache is inevitable.
  > >
  > >Not if you're primarily using the SMT for helper threads (not that I'm
  > >suggesting that this is a great idea).
  > >
 >
 > Scheduling helper threads without a roomy cache is tricky. The whole
 > purpose is to pull stuff into cache ahead of time, and it would be
 > annoying to have a helper thread bump something else out of cache that
 > was needed sooner than what the helper thread just pulled in.

If that were a serious problem, it would be worst in the extremely small L1
cache and significant in the modest L2 cache. The size of the L3 cache
should be completely insensitive to it by comparison, especially with the
24-way associativity that the current Itanic2 L3 cache has: whatever data
is evicted from the L3 by the helper thread is unlikely to be very
important, whereas the new data that the helper thread is bringing in will
almost certainly be needed almost immediately.

- bill
James Boswell

External


Since: Oct 11, 2003
Posts: 5



(Msg. 8) Posted: Fri Nov 28, 2003 2:37 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

Yousuf Khan <removethisspam.bjsk90.removethispam DeleteThis @hotmail.com> wrote:
 > "Peter Perlsø" <nospam DeleteThis @nospam.com> wrote in message
 > news:3fb7ade3$0$27424$edfadb0f@dread16.news.tele.dk...
   >>> Multicore, simultaneous multi-threading, and 24MB of cache. Looks like
   >>> this one was designed with help from the Alpha team that Intel just
   >>> bought out recently from HPaq.
  >>
  >> 24 Megs of high-speed SRAM ???
  >>
  >> Think $$$!
 >
 > Yeah, I'm not even sure why they're dicking around. Just get it over and
 > done with, put 1GB of SRAM
 > on it, and get rid of that DRAM already. That would be a feature of the
 > processor, doesn't need any external RAM. :-)

Oddly enough, IBM were going on about that.

And on a 0.045-micron process, they could probably get a gig of eDRAM in under
200 mm^2 of die area, using the 36MB eDRAM dies they've got alongside the
POWER5 as a guide.
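That estimate can be sanity-checked with a rough scaling calculation. The post gives neither the area nor the process of the 36MB dies, so this sketch assumes they are built on a 130 nm process (an assumption, not something stated in the thread), that density scales ideally with the square of the feature size, and that a gig means 1024 MB. It then asks how small the existing 36MB die would have to be for the claim to hold. The function name is hypothetical.

```python
def max_source_die_area_mm2(target_mb, target_area_mm2,
                            source_mb, source_nm, target_nm):
    """Largest area the source die may have for the target to be reachable.

    Assumes ideal area scaling: density improves by (source_nm/target_nm)**2.
    """
    density_gain = (source_nm / target_nm) ** 2
    # Density (MB/mm^2) the target needs, translated back to the old process.
    required_source_density = (target_mb / target_area_mm2) / density_gain
    return source_mb / required_source_density


limit = max_source_die_area_mm2(
    target_mb=1024,       # "a gig"
    target_area_mm2=200,  # "under 200mm^2"
    source_mb=36,         # the 36MB eDRAM die
    source_nm=130,        # ASSUMPTION: process of the existing die
    target_nm=45,         # the 0.045-micron process
)
print(f"36MB die must be at most about {limit:.1f} mm^2")
```

Under those assumptions the 36MB die would need to be a little under 60 mm^2 for 1 GB to fit in 200 mm^2. Real shrinks rarely scale ideally, so this is an optimistic bound.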

-JB
Peter_Perlsø

External


Since: Nov 23, 2003
Posts: 15



(Msg. 9) Posted: Fri Nov 28, 2003 8:20 pm
Post subject: Re: Itanium Montecito stuff
Archived from groups: per prev. post

James Boswell wrote:

 > Yousuf Khan <removethisspam.bjsk90.removethispam.TakeThisOut@hotmail.com> wrote:
 >
  >>"Peter Perlsø" <nospam.TakeThisOut@nospam.com> wrote in message
  >>news:3fb7ade3$0$27424$edfadb0f@dread16.news.tele.dk...
  >>
   >>>>Multicore, simultaneous multi-threading, and 24MB of cache. Looks like
   >>>>this one was designed with help from the Alpha team that Intel just
   >>>>bought out recently from HPaq.
   >>>
   >>>24 Megs of high-speed SRAM ???
   >>>
   >>>Think $$$!
  >>
  >>Yeah, I'm not even sure why they're dicking around. Just get it over and
  >>done with, put 1GB of SRAM
  >>on it, and get rid of that DRAM already. That would be a feature of the
  >>processor, doesn't need any external RAM. :-)
 >
 >
 > Oddly enough, IBM were going on about that..
 >
 > and on a .045 process, they could probably get a gig of edram in under
 > 200mm^2 of die area, using the 36MB edram dies they've got alongside the
 > POWER5 as a guide
 >
 > -JB
 >
 >


EDRAM

Enhanced Dynamic Random Access Memory
(E-D-ram)

Another form of DRAM that includes an SRAM cache on the chip. This
allows frequently accessed data to be obtained faster. (Also known as
CDRAM.)


Just FYI.

--



- Peter Perlsø - web: http://u238.dk

"If you have been voting for politicians who promise to give you goodies
at someone else's expense, then you have no right to complain when they
take your money and give it to someone else, including themselves."

-- Thomas Sowell (1992)