2 Dec 2008 05:08
Re: Intel Core i7 specs available.
stephane eranian <eranian <at> googlemail.com>
2008-12-02 04:08:48 GMT
Dan,

On Mon, Dec 1, 2008 at 11:03 PM, Dan Terpstra <terpstra <at> eecs.utk.edu> wrote:
>
> So I spent the day cuddled up with my pdf of Intel Vol 3B, re-reading
> Chapter 18 with a hope of understanding i7.
> If any of the below is muddled or just plain wrong, I'd welcome
> clarification...
>
> Looks to me like there's 3 fixed counters, just like the good ol' days, and
> 4 programmable counters with some restrictions on what can be counted where
> (some events are counter 0 only; some are counters 0 and 1 only).
> Everything's per *logical* (hyper-)thread unless and until the thread-any
> bit is implemented, and any counter (including fixed) can overflow. All 4
> counters are PEBS enabled.
>
That is more or less correct: 3 fixed + 4 generic counters, with restrictions
on some events. All counters can overflow. Generic counters do support the
anythread bit as introduced by architectural perfmon v3 (Atom). The new
addition is PEBS on any generic counter. But if you look more closely at PEBS,
you will see that it now includes support for capturing where cache misses
occur, similar to Itanium Data EAR and AMD IBS.

You missed one thing, however: the offcore_response feature. That one is
tricky because it uses a register that is shared per core (if I recall
correctly). Perfmon handles offcore_response similarly to AMD northbridge
events: it enforces some form of mutual exclusion.

> And then there's the uncore. It has an additional 8 counters, and a bunch of
> interesting events. No restrictions on what can be counted where, but
> everything's system-wide. No clear indication of how contention between
> cores for these resources is resolved. I assume programming these things is
> atomic, but that doesn't resolve who 'owns' what. I suppose that could be
> done a'la' SiCortex, with each core getting 2 counters (but what about
> logical cores?). Probably a better way would be to let perfmon reserve
> resources on a first-come, first-served basis?
>
Perfmon manages uncore as follows:
  - restricted to system-wide
  - only one system-wide session per socket (i.e., per uncore).

I know it seems too restrictive, but there are nasty details. In particular,
you have to indicate which core to interrupt. It could be all of them, but
then perfmon would interpret those interrupts as spurious on cores without
the system-wide session. To make things easier, I have chosen the
restrictions listed above.

> I'm not sure that uncore counters should be restricted to system-wide
> counting only; I think it could be quite useful, as Phil described for
> SiCortex, to measure "what's happening to this shared resource while I'm
> active". That's not unlike Component PAPI measuring network activity on a

Well, that gets ugly, because you have to partition the resource. How? With
8 logical CPUs per socket and 8 counters, it does not make a lot of sense to
give only one counter per CPU. It would have to be dynamic, and that would
turn into something quite complicated.

> NIC from within a first-person interface. Maybe the "oncore" events could be
> implemented as a traditional PAPI CPU interface and the "uncore" events
> could be implemented as a second PAPI component. That raises a question
> whether there can be two simultaneous perfmon sessions active at the same
> time (system-wide and first-person).

No, this is not yet supported. I think on x86, this is not that far off.

> Fun times; interesting questions...

Things are getting very complicated very quickly... The good thing is that
PMUs are improving rapidly now.

>> -----Original Message-----
>> From: Corey J Ashford [mailto:cjashfor <at> us.ibm.com]
>> Sent: Friday, November 28, 2008 2:53 PM
>> To: Philip Mucci
>> Cc: eranian <at> gmail.com; perfmon2-devel
>> Subject: Re: [perfmon2] Intel Core i7 specs available.
>>
>> Philip Mucci <mucci <at> cs.utk.edu> wrote on 11/28/2008 09:38:37 AM:
>>
>> > > I will release support for core and uncore.
>> > > But uncore will be restricted to system-wide sessions only.
>> > > It does not make sense to support this for per-thread session,
>> > > as there is no correlation possible with a PID or even a CPU.
>> >
>> > This is true only if you assume each process wants all of the
>> > 'off-core' counters.
>> >
>> > An approach we took on SiCortex for our 'off-core' PMU, was to partition
>> > the 256 counters into 6 blocks of 42, each process on each core gets 42
>> > counters, dedicated. So there is no contention for these registers, and
>> > they can measure system-wide-events, but maintain the idea of a first
>> > person context, since, after all that is the workload that is being
>> > measured. When the process on that core is not active, neither is the
>> > perfmon context...so the counts are stopped. We can do very nice
>> > sampling on external events, DDR transactions, pciE stuff, fabric
>> > packets, using the same interface we do for the in-core counters, and
>> > these happen only when the process is active in non-system-wide mode.
>>
>> When you have two or more processes active, and one process (at least) is
>> sampling these off-core events on its dedicated counters, don't you still
>> end up with the same problem of getting events from other processes mixed
>> up with yours? I can see that it does give you a better idea about the
>> events while your process is running, because they are disabled while your
>> process is not running.
>>
>> - Corey
>>
>> Corey Ashford
>> Software Engineer
>> IBM Linux Technology Center, Linux Toolchain
>> Beaverton, OR
>> 503-578-3507
>> cjashfor <at> us.ibm.com
>>
>> -------------------------------------------------------------------------
>> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
>> Build the coolest Linux based applications with Moblin SDK & win great prizes
>> Grand prize is a trip for two to an Open Source event anywhere in the world
>> http://moblin-contest.org/redirect.php?banner_id=100&url=/
>> _______________________________________________
>> perfmon2-devel mailing list
>> perfmon2-devel <at> lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
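
For readers following the uncore discussion in this message, here is a minimal
sketch of the policy described in the reply ("restricted to system-wide" and
"only one system-wide session per socket"). It is a toy model in Python, not
perfmon code: the class name UncoreArbiter, its methods, and the session ids
are all hypothetical and chosen only for illustration. It also assumes the
simplest arbitration, first come, first served, per socket.

```python
class UncoreArbiter:
    """Toy model: uncore is system-wide only, one session per socket.

    All names here are hypothetical; this is not the perfmon implementation.
    """

    def __init__(self, num_sockets):
        # owner[s] is the id of the session holding socket s's uncore PMU,
        # or None when that uncore PMU is free.
        self.owner = [None] * num_sockets

    def reserve(self, socket, session_id, system_wide):
        """Return True if `session_id` now owns the uncore PMU of `socket`."""
        if not system_wide:
            # Per-thread sessions are refused: uncore events cannot be
            # correlated with a PID or a CPU.
            return False
        if self.owner[socket] is not None:
            # Mutual exclusion: another session already holds this socket.
            return False
        self.owner[socket] = session_id
        return True

    def release(self, socket, session_id):
        """Free the socket's uncore PMU if `session_id` is the current owner."""
        if self.owner[socket] == session_id:
            self.owner[socket] = None
            return True
        return False
```

With two sockets, a first system-wide session can reserve socket 0, a second
one is refused on socket 0 but can still take socket 1, and any per-thread
request is refused regardless of which socket it names. Partitioning the 8
uncore counters among logical CPUs, as discussed above, would need a more
dynamic scheme than this sketch attempts.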