Function:
While using a maths coprocessor (also referred to as a floating point
unit, or FPU), errors may occur and invalid numbers may be generated.
Most FPUs handle these situations without any problem, but some
steppings may lock up or otherwise misbehave. The list below shows known
malfunctions which may arise during FPU operations on some systems.
True bugs:
<FERR# not handled correctly by FPU>
<FPU performance degradation because IGNNE# active>
Incompatibilities between different types of FPU:
<Four indications for 'empty' in Condition Code Bits after FXAM>
'87 to 287 specific differences:
<Error signal does not go through PIC on 287+>
<Exceptions are different>
<Exception pointers saved by 287+ save prefixes>
<287+ need no synchronization>
<287 & 387 use reserved I/O ports>
FERR# not handled correctly by FPU
──────────────────────────────────────────────────────────────────────────────
<Back> (General Intel FPU bugs, unrelated to opcodes)
* FERR# not handled correctly by FPU:
In some cases an FPU operation may generate a floating point error
that the CPU does not recognize.
The workaround is to mask all floating point errors and either
replace every FWAIT with FNOP or follow every FWAIT with a NOP.
FPU performance degradation because IGNNE# active
──────────────────────────────────────────────────────────────────────────────
* FPU performance degradation because IGNNE# active:
If an unmasked exception occurs while bit NE (Numeric Error, also
called Numeric Exception) in CR0 is cleared (recognize exceptions) and
IGNNE# is active, every following FPU instruction requires an
additional 17 to 22 clocks. This is because the exception remains
pending due to the logic conflict caused by the contradicting signals.
Before executing the next FPU opcode, the 486/487 runs microcode to
classify and analyze the exception, but it is not allowed to handle it.
There are two workarounds. Either clear all unmasked exceptions with
FCLEX or FINIT within the exception handler before it finishes, or make
sure IGNNE# is never made active, so that exceptions are recognized and
handled immediately as they occur (when NE is cleared).
Four indications for 'empty' in Condition Code Bits after FXAM
──────────────────────────────────────────────────────────────────────────────
* Four different indications for 'empty' in Condition Code Bits after FXAM:
The various FPUs use different bit patterns to indicate an empty FPU
register after the FXAM instruction. To decide whether a register is
empty, test only that bits C0 and C3 are both 1; do not rely on C1 or C2.
(See <FPU Condition Code Bits>)
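The portable check described above can be sketched in Python, operating on a status word value as it would be stored by FSTSW. The function name and mask constants are hypothetical and chosen for illustration; the bit positions (C0 = bit 8, C3 = bit 14) follow the status word layout given later in this text.

```python
# Bit masks of two condition codes within the 16-bit FPU status word.
C0_MASK = 0x0100  # bit 8
C3_MASK = 0x4000  # bit 14
EMPTY_MASK = C0_MASK | C3_MASK


def fxam_reports_empty(status_word: int) -> bool:
    """Return True if a status word stored after FXAM marks ST as empty.

    Only C0 and C3 are tested. C1 and C2 are deliberately ignored,
    because their values for 'empty' differ between FPU types.
    """
    return (status_word & EMPTY_MASK) == EMPTY_MASK
```

All four 'empty' patterns (C3 C2 C1 C0 = 1001, 1011, 1101, 1111) pass this test, while +Zero (1000) and +NaN (0001) do not.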
Error signal does not go through PIC on 287+
──────────────────────────────────────────────────────────────────────────────
* Error signal does not go through PIC on 287+
On the 8086 / 87, an FPU error is signalled through the PIC
(Programmable Interrupt Controller). Starting with the 287, FPU errors
are signalled over a dedicated pin on the CPU / FPU combination,
namely ERROR#. Code which depends on the PIC handling the error may
exist; such error handlers will need to be rewritten.
Exceptions are different
──────────────────────────────────────────────────────────────────────────────
* Exceptions are different
The coprocessor segment overrun exception (09) is issued when the
FPU attempts to read the second or subsequent words of a data
operand beyond a segment limit on a 286. On a 386 it is not normally
used. The 486 signals exception 0dh instead.
The segment wraparound exception (General Protection exception 0dh)
will be issued if the FPU attempts to execute an instruction that
spans into or lies beyond a segment limit.
All other errors are signalled through interrupt 10h in 286 systems.
Exception pointers saved by 287+ save prefixes
──────────────────────────────────────────────────────────────────────────────
* Exception pointers saved by 287+ save prefixes
The exception pointers on the 87 would point to the ESC instruction
itself, regardless of any segment overrides (or other prefixes for
that matter). The 287+ pointers point to the first prefix before
the ESC instruction, if any.
287+ need no synchronization
──────────────────────────────────────────────────────────────────────────────
* 287+ need no synchronization
On the 87, the FPU and CPU worked independently of each other. Any
communication between the FPU and CPU had to be coordinated with
WAITs. On the 287+, no WAITs are required except for control
instructions. The CPU examines the BUSY# signal before communicating
with the FPU to ensure the FPU can accept commands.
The 386 also examines BUSY# before sending commands to the 387.
Data transfers are regulated by monitoring the PEREQ# pin.
287 & 387 use reserved I/O ports
──────────────────────────────────────────────────────────────────────────────
* 287 & 387 use reserved I/O ports
On the 287, FPU instructions and data are sent to and received from
the FPU via I/O ports. These ports are f0-ff on the 286 / 287.
This property is important to consider when the number of I/O
waitstates on the mainboard can be changed. To safely increase the
FPU performance some experimentation may be necessary, but a 25%
speed increase has been accomplished on a 12 MHz 286 with 20 MHz
IIT 2c87 by decreasing the number of I/O waitstates from 6 to 4.
On the 387, FPU instructions and data are sent to and received from
the FPU via I/O ports too. These ports are 800000f0 - 800000ff.
Note that the I/O waitstate trick may very well work on 386 / 387
systems as well.
FPU Condition Code Bits after a test, compare or reduction
──────────────────────────────────────────────────────────────────────────────
Various FPU test instructions set the Condition Code bits C0 to C3 based
on the values tested. Below is a list of possible bit combinations.
These C-bits map to the flags register as follows after FSTSW AX and SAHF:
 Eflags map:  ZF PF -  CF    (C1 has no flag assigned to it)
              C3 C2 C1 C0
 Examine      0  0  0  0     +Unnormal  (positive, valid, unnormalized)
              0  0  0  1     +NaN       (positive, invalid, exponent is all ones)
              0  0  1  0     -Unnormal  (negative, valid, unnormalized)
              0  0  1  1     -NaN       (negative, invalid, exponent is all ones)
              0  1  0  0     +Normal    (positive, valid, normalized)
              0  1  0  1     +Infinity  (positive, infinity)
              0  1  1  0     -Normal    (negative, valid, normalized)
              0  1  1  1     -Infinity  (negative, infinity)
              1  0  0  0     +Zero      (positive, zero)
              1  0  0  1     Empty      (empty register)
              1  0  1  0     -Zero      (negative, zero)
              1  0  1  1     Empty      (empty register)
              1  1  0  0     +Denormal  (positive, invalid, exponent is 0)
              1  1  0  1     Empty      (empty register)
              1  1  1  0     -Denormal  (negative, invalid, exponent is 0)
              1  1  1  1     Empty      (empty register)
 FCOM or
 FTST         0  0  ?  0     ST > Source with FCOM or ST > 0 with FTST
              0  0  ?  1     ST < Source with FCOM or ST < 0 with FTST
              1  0  ?  0     ST = Source with FCOM or ST = 0 with FTST
              1  1  ?  1     ST cannot be compared or tested
 Reduction    b1 0  b0 b2    If reduction was complete, b0, b1 and b2
                             equal the three lowest bits of the quotient
              ?  1  ?  ?     Reduction was incomplete
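As an illustration of how the table above can be read mechanically, the following Python sketch decodes a status word value (as stored by FSTSW) for each of the three cases. All function and constant names are hypothetical; the sketch assumes the condition codes sit at bits 8 (C0), 9 (C1), 10 (C2) and 14 (C3) of the status word, and it treats every pattern with C3 = C0 = 1 as 'empty' after an examine.

```python
C0, C1, C2, C3 = 0x0100, 0x0200, 0x0400, 0x4000  # status word bit masks

# Examine classes keyed by (C3, C2, C0); C1 carries the sign.
EXAMINE_CLASS = {
    (0, 0, 0): "Unnormal", (0, 0, 1): "NaN",
    (0, 1, 0): "Normal",   (0, 1, 1): "Infinity",
    (1, 0, 0): "Zero",     (1, 1, 0): "Denormal",
}
COMPARE_RESULT = {
    (0, 0, 0): ">", (0, 0, 1): "<", (1, 0, 0): "=", (1, 1, 1): "unordered",
}


def cc_bits(sw):
    """Return (C3, C2, C1, C0) as 0/1 integers."""
    return tuple(int(bool(sw & mask)) for mask in (C3, C2, C1, C0))


def decode_examine(sw):
    """Classify ST from the condition codes left by an examine."""
    c3, c2, c1, c0 = cc_bits(sw)
    if c3 and c0:  # empty, whatever C2 and C1 say
        return "Empty"
    return ("-" if c1 else "+") + EXAMINE_CLASS[(c3, c2, c0)]


def decode_compare(sw):
    """Return '>', '<', '=' or 'unordered'; None for unlisted patterns."""
    c3, c2, _c1, c0 = cc_bits(sw)
    return COMPARE_RESULT.get((c3, c2, c0))


def decode_reduction(sw):
    """Return the low three quotient bits, or None if incomplete (C2 = 1)."""
    c3, c2, c1, c0 = cc_bits(sw)
    if c2:
        return None
    return (c0 << 2) | (c3 << 1) | c1  # b2 = C0, b1 = C3, b0 = C1


def flags_after_sahf(sw):
    """Flags as they would read after FSTSW AX followed by SAHF."""
    c3, c2, _c1, c0 = cc_bits(sw)
    return {"ZF": c3, "PF": c2, "CF": c0}
```

For example, a status word with only C2 set decodes as "+Normal" after an examine, and a status word with C0 and C1 set after a completed reduction yields quotient bits 101 (5).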
FPU Status Word, Control Word and Tag Word layout
──────────────────────────────────────────────────────────────────────────────
The layout of the Status-, Control- and Tag Word of the FPU.
FPU Status Word
 Bit   15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
      ┌──┬──┬────────┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┐
      │ B│c3│  ST n  │c2│c1│c0│ES│sf│Pe│Ue│Oe│Ze│De│Ie│
      └──┴──┴────────┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┘
  B      Busy
  ST n   Stack Top
  c3-c0  Condition Code Bits
  ES     Exception Summary *
  sf     Stack fault
  Pe     Precision exception            (1=occurred)
  Ue     Underflow exception            (1=occurred)
  Oe     Overflow exception             (1=occurred)
  Ze     Zero division exception        (1=occurred)
  De     Denormalized operand exception (1=occurred)
  Ie     Invalid operation exception    (1=occurred)
* The Exception summary is called Interrupt request on 8087.
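The layout above can be turned into a small decoder. The following Python sketch is illustrative only: the `StatusWord` type and `decode_status_word` function are hypothetical names, and the bit numbers are read off the diagram (bit 15 Busy, bit 14 c3, bits 13-11 Stack Top, bits 10-8 c2 to c0, bit 7 ES, bit 6 sf, bits 5-0 the exception flags).

```python
from typing import NamedTuple

EXCEPTION_FLAGS = ("Ie", "De", "Ze", "Oe", "Ue", "Pe")  # bits 0..5


class StatusWord(NamedTuple):
    busy: bool
    top: int                 # Stack Top, 0..7
    cc: tuple                # (c3, c2, c1, c0)
    exception_summary: bool  # called Interrupt request on the 8087
    stack_fault: bool
    exceptions: tuple        # names of the exception flags that are set


def decode_status_word(sw: int) -> StatusWord:
    """Split a 16-bit FPU status word into its fields."""
    return StatusWord(
        busy=bool(sw & 0x8000),
        top=(sw >> 11) & 0x7,
        cc=((sw >> 14) & 1, (sw >> 10) & 1, (sw >> 9) & 1, (sw >> 8) & 1),
        exception_summary=bool(sw & 0x0080),
        stack_fault=bool(sw & 0x0040),
        exceptions=tuple(
            name for bit, name in enumerate(EXCEPTION_FLAGS) if sw & (1 << bit)
        ),
    )
```

Decoding the value 3885h, for instance, gives Stack Top 7, the Exception Summary bit set, and the Ie and Ze flags raised.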
FPU Control Word
 Bit   15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
      ┌──┬──┬──┬──┬─────┬─────┬──┬──┬──┬──┬──┬──┬──┬──┐
      │ r│ r│ r│ic│round│prec.│ie│ r│Pm│Um│Om│Zm│Dm│Im│
      └──┴──┴──┴──┴─────┴─────┴──┴──┴──┴──┴──┴──┴──┴──┘
  r      reserved
  ic     Infinity control
  round  Rounding control
  prec.  Precision control
  ie     Interrupt enable mask
  Pm     Precision exception Mask            1=masked
  Um     Underflow exception Mask            1=masked
  Om     Overflow exception Mask             1=masked
  Zm     Zero division exception Mask        1=masked
  Dm     Denormalized operand exception Mask 1=masked
  Im     Invalid operation exception Mask    1=masked
Infinity control is supported on the 8087 and 287 only.
The 87 and 287 (not the 287xl) have ic cleared by default and therefore
use projective closure. The 287xl+ support affine closure only and
ignore the ic setting.
To make sure an 87 or 287 handles infinities in the same way as the
287xl+, set bit ic so that the 87 & 287 use affine closure as well.
Note that FINIT clears ic again.
Rounding control is set to 00 by default.
00 = Round to nearest or even
01 = Round down (towards negative infinity)
10 = Round up (towards positive infinity)
11 = Chop towards zero
Precision control is set to 11 by default.
00 = 24 bit precision (mantissa)
01 = reserved
10 = 53 bit precision (mantissa)
11 = 64 bit precision (mantissa)
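The field encodings above can be summarised in a short Python sketch that decodes a control word value (as stored by FSTCW). The names `decode_control_word` and `with_affine_closure` are hypothetical; the bit positions are taken from the layout diagram (ic = bit 12, rounding = bits 11-10, precision = bits 9-8, ie = bit 7, masks = bits 5-0). The example value 037Fh is an assumption chosen to match the defaults listed above (rounding 00, precision 11, ic cleared, all exceptions masked).

```python
ROUNDING = {0b00: "nearest or even", 0b01: "down", 0b10: "up", 0b11: "chop"}
PRECISION_BITS = {0b00: 24, 0b01: None, 0b10: 53, 0b11: 64}  # None = reserved
MASK_NAMES = ("Im", "Dm", "Zm", "Om", "Um", "Pm")  # bits 0..5
IC_BIT = 0x1000  # infinity control, honoured by the 87 and 287 only


def decode_control_word(cw: int) -> dict:
    """Split a 16-bit FPU control word into its fields."""
    return {
        # On the 287xl+ this bit is ignored and closure is always affine.
        "infinity_control": "affine" if cw & IC_BIT else "projective",
        "rounding": ROUNDING[(cw >> 10) & 0x3],
        "precision_bits": PRECISION_BITS[(cw >> 8) & 0x3],
        "interrupt_enable_mask": bool(cw & 0x0080),
        "masked": tuple(
            name for bit, name in enumerate(MASK_NAMES) if cw & (1 << bit)
        ),
    }


def with_affine_closure(cw: int) -> int:
    """Return cw with ic set, so an 87 or 287 behaves like a 287xl+."""
    return cw | IC_BIT


fields = decode_control_word(0x037F)
```

Because FINIT clears ic again, a program that relies on affine closure on an 87 or 287 would have to reapply something like `with_affine_closure` to the control word after every FINIT.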