Article <2024Oct13.172037@mips.complang.tuwien.ac.at>

Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Interrupts in OoO
Date: Sun, 13 Oct 2024 15:20:37 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 52
Message-ID: <2024Oct13.172037@mips.complang.tuwien.ac.at>
References: <2024Oct3.160055@mips.complang.tuwien.ac.at> <vdmrk6$3rksr$1@dont-email.me> <LyELO.69485$2nv5.62232@fx39.iad> <TdWLO.282116$FzW1.158190@fx14.iad> <963a276fd8d43e4212477cefae7f6e46@www.novabbs.org> <8IcMO.249144$v8v2.147178@fx18.iad> <2024Oct5.195712@mips.complang.tuwien.ac.at> <30YMO.378690$WOde.271267@fx09.iad>
Injection-Date: Sun, 13 Oct 2024 17:43:26 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="af58cdb9027b0750dcd0965a898e1e8e";
	logging-data="763577"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/UqCjefwJ0z+hkEmcX27dG"
Cancel-Lock: sha1:aj3ZqGLDSJ8AYKga3Ir7NTeMqrg=
X-newsreader: xrn 10.11
Bytes: 3667

EricP <ThatWouldBeTelling@thevillage.com> writes:
>Anton Ertl wrote:
>> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>> That's difficult with a circular buffer for the instruction queue/rob
>>> as you can't edit the order.
>> 
>> What's wrong with performing an asynchronous interrupt at the ROB
>> level rather than inserting it at the decoder?  Just stop committing at
>> some point, record this as the interrupt return address and start
>> decoding the interrupt code.
>
>That's worse than a pipeline drain because you toss things you already
>invested in, by fetch, decode, rename, schedule, and possibly execute.

The question is what you want to optimize.

Design simplicity?  I think my approach wins here, too.
Interrupt response latency?  Use what I propose.
Maximum throughput?  Then follow your approach.

The throughput issue is only relevant if you have lots of interrupts.
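
As a rough back-of-the-envelope sketch of that point: the fraction of
cycles lost to re-doing discarded work scales linearly with the
interrupt rate.  The function name and all numbers below (clock rate,
interrupt rates, 500 cycles of discarded work per interrupt) are
assumptions made up for illustration, not measurements of any real
core.

```python
def throughput_loss_fraction(interrupts_per_second,
                             wasted_cycles_per_interrupt,
                             clock_hz):
    """Fraction of all cycles spent redoing work that was discarded
    when an interrupt cancelled the in-flight instructions."""
    return interrupts_per_second * wasted_cycles_per_interrupt / clock_hz


# Hypothetical numbers, for illustration only.
CLOCK_HZ = 3_000_000_000   # assumed 3 GHz core
WASTED_CYCLES = 500        # assumed cycles of discarded work per interrupt

low = throughput_loss_fraction(1_000, WASTED_CYCLES, CLOCK_HZ)
high = throughput_loss_fraction(1_000_000, WASTED_CYCLES, CLOCK_HZ)

print(f"1k interrupts/s: {low:.4%} of cycles lost")
print(f"1M interrupts/s: {high:.2%} of cycles lost")
```

Under these assumed numbers the loss is about 0.017% at a thousand
interrupts per second and about 17% at a million per second, so the
throughput argument only starts to matter at very high interrupt rates.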

>The way I saw it, the core continues to execute its current stream while
>it prefetches the handler prologue into I$L1, then loads its fetch buffer.
>At that point fetch injects a special INT_START uOp into the instruction
>stream and switches to the handler. The INT_START uOp travels down the
>pipeline following right behind the tail of the original stream.
>If none of the flow disrupting events occur to the original stream then
>the handler just tucks in behind it. When INT_START hits retire, the core
>sends the commit signal to the interrupt controller to confirm the hand-off.
>
>The interrupt handler should start executing at the same time as it would
>otherwise.

Architecturally, an instruction is only executed when it
commits/retires.  Only then do I/O devices or other CPUs see any
stores or I/O operations performed in the interrupt handler.  With
your approach, if there are long-latency instructions in the pipeline
(say, dependence chains containing multiple cache misses) when the
interrupt strikes, the instructions in your interrupt handler will
have to wait until the preceding instructions retire, which can take
thousands of cycles in the worst case.

By contrast, if you treat an interrupt like a branch misprediction and
cancel all the speculative work, the instructions of the interrupt
handler go through the engine as fast as possible, and you get the
minimum response latency possible in the engine.
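
The difference can be sketched with a toy model of in-order retirement.
This is a deliberately simplified illustration, not a model of any real
core: the function names, the flush penalty parameter, and the cycle
counts are all assumptions, and handler prefetching is ignored.

```python
def append_policy_latency(older_completion_cycles, handler_pipeline_cycles):
    """Cycles until the first handler instruction can retire when the
    handler is queued behind the in-flight instructions.

    With in-order retirement it cannot retire before every older
    instruction has completed, nor before it has itself passed
    through the pipeline."""
    oldest_done = max(older_completion_cycles, default=0)
    return max(oldest_done, handler_pipeline_cycles)


def flush_policy_latency(handler_pipeline_cycles, flush_penalty_cycles=0):
    """Cycles until the first handler instruction can retire when the
    in-flight (uncommitted) instructions are cancelled, as on a branch
    misprediction."""
    return flush_penalty_cycles + handler_pipeline_cycles


# Hypothetical in-flight work: two short instructions and a dependence
# chain with cache misses that completes after 900 and 1800 cycles.
in_flight = [5, 12, 900, 1800]
HANDLER_PIPELINE_CYCLES = 20   # assumed fetch-to-retire time

print("append:", append_policy_latency(in_flight, HANDLER_PIPELINE_CYCLES))
print("flush: ", flush_policy_latency(HANDLER_PIPELINE_CYCLES))
```

In this model the appended handler waits for the slowest older
instruction, while the flushed case depends only on the pipeline depth
plus whatever flush penalty is assumed.  When nothing slow is in flight,
the two policies give the same latency.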

- anton
-- 
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
  Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>