IWETHEY v. 0.3.0 | TODO
1,095 registered users | 0 active users | 1 LpH | Statistics
Login | Create New User
IWETHEY Banner

Welcome to IWETHEY!

New I've done the subtract
It returns 0 as near as I can tell.

The point is, regardless of error the calculation should be deterministic. If you do exactly the same thing on each side of the equality, you should get exactly the same result.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Assembly dump?
From my gcc .s intermediate file I get the following sequence (note I placed them side by side for comparison purposes - the sequence is to run thru the left side first):

\nLC1:\n   .long 0x0,0x40240000\n\n   pushl $3                pushl $3\n   fldl LC1                fldl LC1\n   subl $8,%esp            subl $8,%esp\n   fstpl (%esp)            fstpl (%esp)\n   call _pow               call _pow\n   addl $16,%esp           addl $16,%esp\n   movl %eax,-4(%ebp)      movl %eax,-4(%ebp)\n   movl $123456,%ecx       movl $123456,%ecx\n   movl %ecx,%eax          movl %ecx,%eax\n   cltd                    cltd\n   idivl -4(%ebp)          idivl -4(%ebp)\n   movl %eax,%ebx          movl %eax,-4(%ebp)\n\n   addl $-4,%esp           cmpl -4(%ebp),%ebx\n


Other than what registers (and offsets) are being used, the operations appear identical. On my tests, the equality test did return true.

New I did my dump with GDB
same result. Except equality returns "false". And, when I save the results of 2 computations in variables, they have the same value, bit by bit. I tried float, double and long double.

I have a hypothesis: what if pow uses random rounding to help eliminate rounding errors? It's so random that single command difference in calling seq can affect it. Far fetched, I know.

--

We have only 2 things to worry about: That
things will never get back to normal, and that they already have.
New What I get:
\n\tmovl\t$0, 8(%esp)\n\tmovl\t$1074266112, 12(%esp)\n\tmovl\t$0, (%esp)\n\tmovl\t$1076101120, 4(%esp)\n\tcall\tpow\n\tfldl\t.LC0\n\tfdivp\t%st, %st(1)\n\tfstpl\t-8(%ebp)\n\n\tmovl\t$0, 8(%esp)\n\tmovl\t$1074266112, 12(%esp)\n\tmovl\t$0, (%esp)\n\tmovl\t$1076101120, 4(%esp)\n\tcall\tpow\n\tfldl\t.LC0\n\tfdivp\t%st, %st(1)\n\tfldl\t-8(%ebp)\n\n\tfucompp\n

Looks the same to me, other than the fstdl vs. fldl.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Speculating...
...but I think that internally the FP processor is using Long Doubles (12 bytes) but it is storing the first result on the stack as a Double (8 bytes). That is, the operation that takes the first result out of the coprocessor:

\tfstpl\t-8(%ebp)

Results in a truncation of the result by placing it back on a stack. Notice how on the result of the second half of the compare, it is not pulled back locally - being left in the floating point register:

\tfdivp\t%st, %st(1)\n\tfldl\t-8(%ebp)


The compare is then done within the FPU. My guess is that the first calculation has been truncated to 8 bytes after being pulled out of the FPU, but the second calculation is never pulled out and remained in 12 byte form. Hence, you can get an inequality when the truncated bits differ.

Just speculation, of course. :-)
New gcc 2.95.4 results
\n.LM3:\n\taddl $-8,%esp\n\tfldl .LC0\n\tsubl $8,%esp\n\tfstpl (%esp)\n\tfldl .LC1\n\tsubl $8,%esp\n\tfstpl (%esp)\n.LCFI3:\n\tcall pow\n\taddl $16,%esp\n\tfstpl -8(%ebp)\n\tfldl .LC0\n\tsubl $8,%esp\n\tfstpl (%esp)\n\tfldl .LC1\n\tsubl $8,%esp\n\tfstpl (%esp)\n\tcall pow\n\taddl $16,%esp\n\tfldl .LC2\n\tfldl -8(%ebp)\n\tfdivrp %st,%st(1)\n\tfldl .LC2\n\tfdivp %st,%st(2)\n\tfucompp\n

Being a Bear Of Little Clue when it comes to interpreting x86 assembler, I'm not quite sure what the above means... ;-) However, the comparison does work. :-P
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Re: gcc 2.95.4 results
ESP is the 32-bit stack pointer. Instructions that start with F are FPU operations. EBP is the 32-bit "base pointer" for 32-bit memory operations.

Unfortunately, the business end of things is missing, the call to "pow". That will almost certainly be an implementation of 2^(y log2 x) as mentined in the code I posted.
-drl
New EBP is base of stack frame
EBP(-x) is a local variable, EBP(+x) is a function parameter. EBP(+4) (or may be EBP(0)), I suspect, is the function's return address.
--

We have only 2 things to worry about: That
things will never get back to normal, and that they already have.
New I86 Assembly is not my specialty...
...having done much more with the MC680x0 and MC6888x combinations. The considerations are similar, but I find x86 assembly to be harder to read and write. (Bottom line is I can guess, but it takes someone that is more up to speed on Assembly to spot the exact problem - I haven't done ASM for money in almost 10 years).

Ok, the difference between my original code and both your samples is that my code does the compare in the cpu not in the fpu. The difference between your code that returns equality and the one that doesn't is the order of the divide operations (fdivp). In the first code, the result of the left side is calculated all the way through the divide before the intermediate result is placed on the stack. Means the rounding occurs after the divide operation.

In the second case, the divides are saved as the last operation (kind of interspersing the calculation of the left and right hand sides). Means that there is some initial rounding which may occur but it is prior to the divide. Once division starts taking place, everything is done internal to the FPU.

Don't have enuf expertise to figure it out totally. It is surprising that the code has compiled out so differently - the newest code looks much cleaner (though it returns what appears to be an errant result).
New ASM Comments
Note: I've marked in red where I see the mismatch in precision being introduced. Although the 2.95 does test equal, I would speculate that if you placed some different numbers in the power function, it may also cause a failure in the equality (due to the fact that the result of the first power function is truncated to 64 bits, whereas the second one is carried on through the 80 bit FP register).

ggcc 3.2.1 results


STEPA:\n   movl  $0, 8(%esp)\n   movl  $1074266112, 12(%esp)            -- 3.0\n   movl  $0, (%esp)\n   movl  $1076101120, 4(%esp)             -- 10.0\n                                          -- stack should look something like\n                                          --    12(esp)  3.0\n                                          --    8(esp)     0\n                                          --    4(esp)  10.0\n                                          --    0(esp)     0\n   call  pow                              -- pow(10.0, 3.0)\n                                          -- result left in ST(0)\n   fldl  .LC0                             -- push 123456 onto stack\n                                          -- ST(0)=123456 and ST(1)=pow(10.0, 3.0)\n   fdivp %st, %st(1)                      -- 123456 / pow(10.0, 3.0)\n   fstpl -8(%ebp)                         -- store result in local variable\n                                          -- note: 80 bit ST result narrowed to 64 bits\nSTEPB:\n   movl  $0, 8(%esp)\n   movl  $1074266112, 12(%esp)            -- 3.0\n   movl  $0, (%esp)\n   movl  $1076101120, 4(%esp)             -- 10.0\n                                          -- stack should look something like\n                                          --    12(esp)  3.0\n                                          --    8(esp)     0\n                                          --    4(esp)  10.0\n                                          --    0(esp)     0\n   call  pow                              -- pow(10.0, 3.0)\n                                          -- result left in ST(0)\n   fldl  .LC0                             -- push 123456 onto stack\n                                          -- ST(0)=123456 and ST(1)=pow(10.0, 3.0)\n   fdivp %st, %st(1)                      -- 123456 / pow(10.0, 3.0)\n   fldl  -8(%ebp)                         -- push result from STEPA into the FP stack\n                                          -- FP Stack now has:\n                                          --    ST(0) = STEPA result\n                                          --    ST(1) = STEPB result\n                                          -- note that STEPB result was never round to 64bits\n   fucompp                                -- compare ST(0) with ST(1)\n



gcc 2.95.4 results


STEPA:\n   fldl .LC0                              -- push 3.0 into FPStack ST(0)\n   subl $8,%esp                           -- reserve another 8 bytes on the stack\n   fstpl (%esp)                           -- pull ST(0)=3.0 off the stack\n   fldl .LC1                              -- push 10.0 into FPStack ST(0)\n   subl $8,%esp                           -- reserve another 8 bytes on the stack\n   fstpl (%esp)                           -- pull ST(0)=10.0 off the stack\n                                          -- stack should look something like\n                                          --    12(esp)  3.0\n                                          --    8(esp)     ?\n                                          --    4(esp)  10.0\n                                          --    0(esp)     ?\n   call  pow                              -- pow(10.0, 3.0)\n                                          -- result left in ST(0)\n   addl $16,%esp                          -- reclaim stack space\n   fstpl -8(%ebp)                         -- store result in local variable\n                                          --    Conversion to 64 bits here.\n                                          --       but 10^^3 is less likely to have stray bits\n\nSTEPB:\n   fldl .LC0                              -- push 3.0 into FPStack ST(0)\n   subl $8,%esp                           -- reserve another 8 bytes on the stack\n   fstpl (%esp)                           -- pull ST(0)=3.0 off the stack\n   fldl .LC1                              -- push 10.0 into FPStack ST(0)\n   subl $8,%esp                           -- reserve another 8 bytes on the stack\n   fstpl (%esp)                           -- pull ST(0)=10.0 off the stack\n                                          -- stack should look something like\n                                          --    12(esp)  3.0\n                                          --    8(esp)     ?\n                                          --    4(esp)  10.0\n                                          --    0(esp)     ?\n   call pow                               -- pow(10.0, 3.0)\n                                          -- result left in ST(0)\n   addl $16,%esp                          -- reclaim stack space\n\n   fldl .LC2                              -- push 123456.0 into FPStack ST(0)\n                                          --    ST(0)=123456.0\n                                          --    ST(1)=pow(10.0, 3.0) => From STEPB\n   fldl -8(%ebp)                          -- push pow(10.0, 3.0)=>STEPA into FPStack ST(0)\n                                          --    ST(0)=pow(10.0, 3.0) => From STEPA\n                                          --    ST(1)=123456.0\n                                          --    ST(2)=pow(10.0, 3.0) => From STEPB\n   fdivrp %st,%st(1)                      -- Divide ST(1) / ST(0) and pop 2\n                                          -- 123456.0 / pow(10.0, 3.0)\n                                          --    ST(0)=123456.0/pow(10.0, 3.0) => From STEPA\n                                          --    ST(1)=pow(10.0, 3.0) => From STEPB\n   fldl .LC2                              -- push 123456.0 into FPStack ST(0)\n                                          --    ST(0)=123456.0\n                                          --    ST(1)=123456.0/pow(10.0, 3.0) => From STEPA\n                                          --    ST(2)=pow(10.0, 3.0) => From STEPB\n   fdivp %st,%st(2)                       -- Divide ST(2) / ST(0)\n                                          --    ST(0)=123456.0/pow(10.0, 3.0) => From STEPB\n                                          --    ST(1)=123456.0/pow(10.0, 3.0) => From STEPA\n   fucompp                                -- compare ST(0) with ST(1)\n

Expand Edited by ChrisR Jan. 6, 2003, 02:11:10 PM EST
Expand Edited by ChrisR Jan. 6, 2003, 02:12:59 PM EST
New Errrr...
Wow. Thanks for the comprehensive answer. :-D

So, the FPU on a Xeon chip is 80 bits?

Anyone aware of a way to force the compiler to use IEEE compliant 64-bit only for compares? I tried a few settings (-ffloat-store and -mieee-fp) but neither seemed to make a difference in the output.

I can get this to work in gcc 3.2.1 by explicitly storing to a double before making the compare, but I have been informed that there are a number of areas in the code that will fail with this behaviour.

On a side note, does anyone know how to view the intermediary ASM from Visual C++ 6? I want to compare that, gcc 2.95, gcc 3.2.1, and gcc 2.95 Solaris (which is where the system currently runs). If I can show that the behavior can be induced on either NT or Solaris, then this ceases to be a Linux-only problem.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Getting ASM from VC6
Project/Settings/C-C++ tab/Category dropdown -> select "Listing Files", then one of "Assem. Only", "Assem, Mach Code, Source" etc. in the listing file type dropdown.
-drl
New Results
Source:

#include <math.h>

void main(void)
{
(123456 / pow(10.0, 3)) == (123456 / pow(10.0, 3));
return;
}



Output:

\tTITLE\tC:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
\t.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT\tSEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT\tENDS
_DATA\tSEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA\tENDS
CONST\tSEGMENT DWORD USE32 PUBLIC 'CONST'
CONST\tENDS
_BSS\tSEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS\tENDS
$$SYMBOLS\tSEGMENT BYTE USE32 'DEBSYM'
$$SYMBOLS\tENDS
$$TYPES\tSEGMENT BYTE USE32 'DEBTYP'
$$TYPES\tENDS
_TLS\tSEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS\tENDS
;\tCOMDAT _main
_TEXT\tSEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT\tENDS
FLAT\tGROUP _DATA, CONST, _BSS
\tASSUME\tCS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC\t_main
EXTRN\t_pow:NEAR
EXTRN\t__chkesp:NEAR
EXTRN\t__fltused:NEAR
;\tCOMDAT _main
_TEXT\tSEGMENT
_main\tPROC NEAR\t\t\t\t\t; COMDAT
; File C:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
; Line 4
\tpush\tebp
\tmov\tebp, esp
\tsub\tesp, 64\t\t\t\t\t; 00000040H
\tpush\tebx
\tpush\tesi
\tpush\tedi
\tlea\tedi, DWORD PTR [ebp-64]
\tmov\tecx, 16\t\t\t\t\t; 00000010H
\tmov\teax, -858993460\t\t\t\t; ccccccccH
\trep stosd
; Line 5
\tpush\t1074266112\t\t\t\t; 40080000H
\tpush\t0
\tpush\t1076101120\t\t\t\t; 40240000H
\tpush\t0
\tcall\t_pow
\tfstp\tST(0)
\tadd\tesp, 16\t\t\t\t\t; 00000010H
\tpush\t1074266112\t\t\t\t; 40080000H
\tpush\t0
\tpush\t1076101120\t\t\t\t; 40240000H
\tpush\t0
\tcall\t_pow
\tfstp\tST(0)
\tadd\tesp, 16\t\t\t\t\t; 00000010H
; Line 7
\tpop\tedi
\tpop\tesi
\tpop\tebx
\tadd\tesp, 64\t\t\t\t\t; 00000040H
\tcmp\tebp, esp
\tcall\t__chkesp
\tmov\tesp, ebp
\tpop\tebp
\tret\t0
_main\tENDP
_TEXT\tENDS
END


-drl
New Request...
Could you change that line to:

if ((123456 / pow(10.0, 3)) == (123456 / pow(10.0, 3))) {\n   printf("hello\\n");\n}


As it stands, I can't if it really does a compare. Thanks.
Expand Edited by ChrisR Jan. 6, 2003, 04:34:54 PM EST
New Re: Request...
OK:


\tTITLE\tC:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
\t.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT\tSEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT\tENDS
_DATA\tSEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA\tENDS
CONST\tSEGMENT DWORD USE32 PUBLIC 'CONST'
CONST\tENDS
_BSS\tSEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS\tENDS
$$SYMBOLS\tSEGMENT BYTE USE32 'DEBSYM'
$$SYMBOLS\tENDS
$$TYPES\tSEGMENT BYTE USE32 'DEBTYP'
$$TYPES\tENDS
_TLS\tSEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS\tENDS
;\tCOMDAT _main
_TEXT\tSEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT\tENDS
FLAT\tGROUP _DATA, CONST, _BSS
\tASSUME\tCS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC\t_main
PUBLIC\t__real@8@400ff120000000000000
EXTRN\t_pow:NEAR
EXTRN\t__chkesp:NEAR
EXTRN\t__fltused:NEAR
;\tCOMDAT __real@8@400ff120000000000000
; File C:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
CONST\tSEGMENT
__real@8@400ff120000000000000 DQ 040fe240000000000r ; 123456
CONST\tENDS
;\tCOMDAT _main
_TEXT\tSEGMENT
_main\tPROC NEAR\t\t\t\t\t; COMDAT
; File C:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
; Line 4
\tpush\tebp
\tmov\tebp, esp
\tsub\tesp, 72\t\t\t\t\t; 00000048H
\tpush\tebx
\tpush\tesi
\tpush\tedi
\tlea\tedi, DWORD PTR [ebp-72]
\tmov\tecx, 18\t\t\t\t\t; 00000012H
\tmov\teax, -858993460\t\t\t\t; ccccccccH
\trep stosd
; Line 5
\tpush\t1074266112\t\t\t\t; 40080000H
\tpush\t0
\tpush\t1076101120\t\t\t\t; 40240000H
\tpush\t0
\tcall\t_pow
\tadd\tesp, 16\t\t\t\t\t; 00000010H
\tfdivr\tQWORD PTR __real@8@400ff120000000000000
\tfstp\tQWORD PTR -8+[ebp]
\tpush\t1074266112\t\t\t\t; 40080000H
\tpush\t0
\tpush\t1076101120\t\t\t\t; 40240000H
\tpush\t0
\tcall\t_pow
\tadd\tesp, 16\t\t\t\t\t; 00000010H
\tfdivr\tQWORD PTR __real@8@400ff120000000000000
\tfld\tQWORD PTR -8+[ebp]
\tfcompp
; Line 7
\tpop\tedi
\tpop\tesi
\tpop\tebx
\tadd\tesp, 72\t\t\t\t\t; 00000048H
\tcmp\tebp, esp
\tcall\t__chkesp
\tmov\tesp, ebp
\tpop\tebp
\tret\t0
_main\tENDP
_TEXT\tENDS
END

-drl
New Trimming it down...
Just wondering whether it printed the "hello" - i.e. did the equality test as true or false?

(Note: see [link|http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore/html/_core_.2f.op.asp|/Op Option] for a description of the settings for VC6).

\nSTEPA:\n   push 1074266112                        -- 3.0\n   push 0\n   push 1076101120                        -- 10.0\n   push 0\n                                          -- cpu stack should look something like\n                                          --    12(esp)  3.0\n                                          --    8(esp)     0\n                                          --    4(esp)  10.0\n                                          --    0(esp)     0\n   call _pow                              -- pow(10.0, 3.0)\n                                          -- result left in FPU ST(0)\n   add esp, 16                            -- reclaim cpu stack space\n   fdivr QWORD PTR __real@8@400ff120000000000000  -- 123456.0 / pow(10.0, 3.0) \n   fstp QWORD PTR -8+[ebp]                -- store result in local variable\n                                          -- not sure if this rounds to 64 bits\n                                          -- QWORD is quad-word - 8 bytes in length\n\nSTEPB:\n   push 1074266112                        -- 3.0\n   push 0\n   push 1076101120                        -- 10.0\n   push 0\n                                          -- cpu stack should look something like\n                                          --    12(esp)  3.0\n                                          --    8(esp)     0\n                                          --    4(esp)  10.0\n                                          --    0(esp)     0\n   call _pow                              -- pow(10.0, 3.0)\n   add esp, 16                            -- reclaim cpu stack space \n   fdivr QWORD PTR __real@8@400ff120000000000000 -- 123456.0 / pow(10.0, 3.0)\n\n   fld QWORD PTR -8+[ebp]                 -- load STEPA result into fpu ST(0)\n                                          --    ST(0) = STEPA result\n                                          --    ST(1) = STEPB result\n   fcompp                                 -- compare ST(0) and ST(1)

Expand Edited by ChrisR Jan. 6, 2003, 04:57:34 PM EST
New Re: Trimming it down...
Yes, it did the braces.

/Od BTW (no opt).
-drl
New Scratches head...
Kind of goes against my theory. :-(

From looking at the ASM again, it doesn't look like you put any instructions within the conditional brackets. The ASM does an float compare but it doesn't actually use the result for any branch or jump. Could you try it with a simple print within the conditionals?

New Re: Scratches head...
Oh sure. It won't really be essentially different, but here goes:


\tTITLE\tC:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
\t.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT\tSEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT\tENDS
_DATA\tSEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA\tENDS
CONST\tSEGMENT DWORD USE32 PUBLIC 'CONST'
CONST\tENDS
_BSS\tSEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS\tENDS
$$SYMBOLS\tSEGMENT BYTE USE32 'DEBSYM'
$$SYMBOLS\tENDS
$$TYPES\tSEGMENT BYTE USE32 'DEBTYP'
$$TYPES\tENDS
_TLS\tSEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS\tENDS
;\tCOMDAT ??_C@_02ELOP@42?$AA@
CONST\tSEGMENT DWORD USE32 PUBLIC 'CONST'
CONST\tENDS
;\tCOMDAT _main
_TEXT\tSEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT\tENDS
FLAT\tGROUP _DATA, CONST, _BSS
\tASSUME\tCS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC\t_main
PUBLIC\t??_C@_02ELOP@42?$AA@\t\t\t\t; `string'
PUBLIC\t__real@8@400ff120000000000000
EXTRN\t_pow:NEAR
EXTRN\t_puts:NEAR
EXTRN\t__chkesp:NEAR
EXTRN\t__fltused:NEAR
;\tCOMDAT ??_C@_02ELOP@42?$AA@
; File C:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
CONST\tSEGMENT
??_C@_02ELOP@42?$AA@ DB '42', 00H\t\t\t; `string'
CONST\tENDS
;\tCOMDAT __real@8@400ff120000000000000
CONST\tSEGMENT
__real@8@400ff120000000000000 DQ 040fe240000000000r ; 123456
CONST\tENDS
;\tCOMDAT _main
_TEXT\tSEGMENT
_main\tPROC NEAR\t\t\t\t\t; COMDAT
; File C:\\My Documents\\Visual Studio Projects\\Junk1\\Junk.cpp
; Line 5
\tpush\tebp
\tmov\tebp, esp
\tsub\tesp, 72\t\t\t\t\t; 00000048H
\tpush\tebx
\tpush\tesi
\tpush\tedi
\tlea\tedi, DWORD PTR [ebp-72]
\tmov\tecx, 18\t\t\t\t\t; 00000012H
\tmov\teax, -858993460\t\t\t\t; ccccccccH
\trep stosd
; Line 6
\tpush\t1074266112\t\t\t\t; 40080000H
\tpush\t0
\tpush\t1076101120\t\t\t\t; 40240000H
\tpush\t0
\tcall\t_pow
\tadd\tesp, 16\t\t\t\t\t; 00000010H
\tfdivr\tQWORD PTR __real@8@400ff120000000000000
\tfstp\tQWORD PTR -8+[ebp]
\tpush\t1074266112\t\t\t\t; 40080000H
\tpush\t0
\tpush\t1076101120\t\t\t\t; 40240000H
\tpush\t0
\tcall\t_pow
\tadd\tesp, 16\t\t\t\t\t; 00000010H
\tfdivr\tQWORD PTR __real@8@400ff120000000000000
\tfcomp\tQWORD PTR -8+[ebp]
\tfnstsw\tax
\ttest\tah, 64\t\t\t\t\t; 00000040H
\tje\tSHORT $L928
\tpush\tOFFSET FLAT:??_C@_02ELOP@42?$AA@\t; `string'
\tcall\t_puts
\tadd\tesp, 4
$L928:
; Line 8
\tpop\tedi
\tpop\tesi
\tpop\tebx
\tadd\tesp, 72\t\t\t\t\t; 00000048H
\tcmp\tebp, esp
\tcall\t__chkesp
\tmov\tesp, ebp
\tpop\tebp
\tret\t0
_main\tENDP
_TEXT\tENDS
END

-drl
New Thanks...
...I was just wanting to see the instructions that immediately followed the float compare. In this case it was:

fnstsw ax
test ah, 64
je SHORT $L928

Which is strange compared to my experience with the MC68881, where there are direct floating point branch instructions (at least that's the way I remember it).
New BTW
Did you figure out what I put between the brackets? :)
-drl
New Hmmmm
Well, I see the instructions:

push OFFSET FLAT:??_C@_02ELOP@42?$AA@ ; `string'\ncall _puts


And I know that _puts is most likely the printf function, but I can't really decode the constant that's pushed on the stack there. The assembler comment 'string is pretty useless - yes I know it's a string - tell me something I didn't know. But I can't make out the rest of the encryption (something with ELOP in it). :-)
New puts("42");
-drl
New gcc 2.95.26
Just as another data point, I tried this version under cygwin and it also fails to give an equality.

\n\tfldl LC0\n\tsubl $8,%esp\n\tfstpl (%esp)\n\tfldl LC1\n\tsubl $8,%esp\n\tfstpl (%esp)\n\tcall _pow\n\taddl $16,%esp\n\tfldl LC2\n\tfdivp %st,%st(1)\n\tfstpl -8(%ebp)\n\tfldl LC0\n\tsubl $8,%esp\n\tfstpl (%esp)\n\tfldl LC1\n\tsubl $8,%esp\n\tfstpl (%esp)\n\tcall _pow\n\taddl $16,%esp\n\tfldl LC2\n\tfdivp %st,%st(1)\n\tfldl -8(%ebp)\n\tfucompp\n\tfnstsw %ax\n\tandb $68,%ah\n\txorb $64,%ah\n\tjne L3\n
New Did you try it with optimizations off?
-drl
New A really good optimizing compiler...
...would have seen that the expression could be completely evaluated at compile time and eliminated any runtime evaluation.

My personal suspicion is that there's an intermediate form of optimization where the right hand side is evaluated in the FP chip (which uses 12 byte representations) and then returns the result to the stack as a double (8 bytes in length). Then it evaluates the right hand side in the FP chip (as a 12 byte result) but does not convert it back to double format (8 bytes) before it does the FCompare. In other words, the right side is converted to and from double to the double long format, but the left hand side stays as a double long.

Would have to see the assembly to see if that's what's happening. I can't seem to emulate here though.
New Yes..
..I was thinking along those lines. That's why I suggested to Scott that he append an L to the integer in his division.

Still, it's an interesting thing he's hit on.
-drl
New Compiler has no way to predict how pow() is implemented
For all it knows, it can have side effects.
--

We have only 2 things to worry about: That
things will never get back to normal, and that they already have.
New IEEE
has gone far toward eliminating this type of issue. The hardware has to be compliant - that's not the responsibility of the compiler.
-drl
New pow() is library, not hardware
it may be hardware on some architectures, but it's not standard.
--

We have only 2 things to worry about: That
things will never get back to normal, and that they already have.
New True
..but that's not the point - it's standardization of representation, and denormalization.
-drl
Expand Edited by deSitter Jan. 5, 2003, 07:38:57 PM EST
New Re: Compiler has no way to predict how pow() is implemented
Was poking around in my numerics code, and found the code from the Cephes math library below (attribution included) to illustrate the point I wanted to make - that pow(x,y) is exp(y log x). Note that FORTRAN has an exponentiation operator ** and that it works differently for small integers as exponents - so there would be alternate entry points.


/*\t\t\t\t\t\t\tpowf.c
*
*\tPower function
*
*
*
* SYNOPSIS:
*
* float x, y, z, powf();
*
* z = powf( x, y );
*
*
*
* DESCRIPTION:
*
* Computes x raised to the yth power. Analytically,
*
* x**y = exp( y log(x) ).
*
* Following Cody and Waite, this program uses a lookup table
* of 2**-i/16 and pseudo extended precision arithmetic to
* obtain an extra three bits of accuracy in both the logarithm
* and the exponential.
*
*
*
* ACCURACY:
*
* Relative error:
* arithmetic domain # trials peak rms
* IEEE -10,10 100,000 1.4e-7 3.6e-8
* 1/10 < x < 10, x uniformly distributed.
* -10 < y < 10, y uniformly distributed.
*
*
* ERROR MESSAGES:
*
* message condition value returned
* powf overflow x**y > MAXNUMF MAXNUMF
* powf underflow x**y < 1/MAXNUMF 0.0
* powf domain x<0 and y noninteger 0.0
*
*/
\x0c
/*
Cephes Math Library Release 2.2: June, 1992
Copyright 1984, 1987, 1988 by Stephen L. Moshier
Direct inquiries to 30 Frost Street, Cambridge, MA 02140
*/


#include "mconf.h"
static char fname[] = {"powf"};


/* 2^(-i/16)
* The decimal values are rounded to 24-bit precision
*/
static float A[] = {
1.00000000000000000000E0,
9.57603275775909423828125E-1,
9.17004048824310302734375E-1,
8.78126084804534912109375E-1,
8.40896427631378173828125E-1,
8.05245161056518554687500E-1,
7.71105408668518066406250E-1,
7.38413095474243164062500E-1,
7.07106769084930419921875E-1,
6.77127778530120849609375E-1,
6.48419797420501708984375E-1,
6.20928883552551269531250E-1,
5.94603538513183593750000E-1,
5.69394290447235107421875E-1,
5.45253872871398925781250E-1,
5.22136867046356201171875E-1,
5.00000000000000000000E-1
};
/* continuation, for even i only
* 2^(i/16) = A[i] + B[i/2]
*/
static float B[] = {
0.00000000000000000000E0,
-5.61963907099083340520586E-9,
-1.23776636307969995237668E-8,
4.03545234539989593104537E-9,
1.21016171044789693621048E-8,
-2.00949968760174979411038E-8,
1.89881769396087499852802E-8,
-6.53877009617774467211965E-9,
0.00000000000000000000E0
};

/* 1 / A[i]
* The decimal values are full precision
*/
static float Ainv[] = {
1.00000000000000000000000E0,
1.04427378242741384032197E0,
1.09050773266525765920701E0,
1.13878863475669165370383E0,
1.18920711500272106671750E0,
1.24185781207348404859368E0,
1.29683955465100966593375E0,
1.35425554693689272829801E0,
1.41421356237309504880169E0,
1.47682614593949931138691E0,
1.54221082540794082361229E0,
1.61049033194925430817952E0,
1.68179283050742908606225E0,
1.75625216037329948311216E0,
1.83400808640934246348708E0,
1.91520656139714729387261E0,
2.00000000000000000000000E0
};

#ifdef DEC
#define MEXP 2032.0
#define MNEXP -2032.0
#else
#define MEXP 2048.0
#define MNEXP -2400.0
#endif

/* log2(e) - 1 */
#define LOG2EA 0.44269504088896340736F
extern float MAXNUMF;

#define F W
#define Fa Wa
#define Fb Wb
#define G W
#define Ga Wa
#define Gb u
#define H W
#define Ha Wb
#define Hb Wb


#ifdef ANSIC
float floorf( float );
float frexpf( float, int *);
float ldexpf( float, int );
float powif( float, int );
#else
float floorf(), frexpf(), ldexpf(), powif();
#endif

/* Find a multiple of 1/16 that is within 1/16 of x. */
#define reduc(x) 0.0625 * floorf( 16 * (x) )

#ifdef ANSIC
float powf( float x, float y )
#else
float powf( x, y )
float x, y;
#endif
{
float u, w, z, W, Wa, Wb, ya, yb;
/* float F, Fa, Fb, G, Ga, Gb, H, Ha, Hb */
int e, i, nflg;


nflg = 0;\t/* flag = 1 if x<0 raised to integer power */
w = floorf(y);
if( w < 0 )
\tz = -w;
else
\tz = w;
if( (w == y) && (z < 32768.0) )
\t{
\ti = w;
\tw = powif( x, i );
\treturn( w );
\t}


if( x <= 0.0F )
\t{
\tif( x == 0.0 )
\t\t{
\t\tif( y == 0.0 )
\t\t\treturn( 1.0 ); /* 0**0 */
\t\telse
\t\t\treturn( 0.0 ); /* 0**y */
\t\t}
\telse
\t\t{
\t\tif( w != y )
\t\t\t{ /* noninteger power of negative number */
\t\t\tmtherr( fname, DOMAIN );
\t\t\treturn(0.0);
\t\t\t}
\t\tnflg = 1;
\t\tif( x < 0 )
\t\t\tx = -x;
\t\t}
\t}

/* separate significand from exponent */
x = frexpf( x, &e );

/* find significand in antilog table A[] */
i = 1;
if( x <= A[9] )
\ti = 9;
if( x <= A[i+4] )
\ti += 4;
if( x <= A[i+2] )
\ti += 2;
if( x >= A[1] )
\ti = -1;
i += 1;


/* Find (x - A[i])/A[i]
* in order to compute log(x/A[i]):
*
* log(x) = log( a x/a ) = log(a) + log(x/a)
*
* log(x/a) = log(1+v), v = x/a - 1 = (x-a)/a
*/
x -= A[i];
x -= B[ i >> 1 ];
x *= Ainv[i];


/* rational approximation for log(1+v):
*
* log(1+v) = v - 0.5 v^2 + v^3 P(v)
* Theoretical relative error of the approximation is 3.5e-11
* on the interval 2^(1/16) - 1 > v > 2^(-1/16) - 1
*/
z = x*x;
w = (((-0.1663883081054895 * x
+ 0.2003770364206271) * x
- 0.2500006373383951) * x
+ 0.3333331095506474) * x * z;
w -= 0.5 * z;

/* Convert to base 2 logarithm:
* multiply by log2(e)
*/
w = w + LOG2EA * w;
/* Note x was not yet added in
* to above rational approximation,
* so do it now, while multiplying
* by log2(e).
*/
z = w + LOG2EA * x;
z = z + x;

/* Compute exponent term of the base 2 logarithm. */
w = -i;
w *= 0.0625; /* divide by 16 */
w += e;
/* Now base 2 log of x is w + z. */

/* Multiply base 2 log by y, in extended precision. */

/* separate y into large part ya
* and small part yb less than 1/16
*/
ya = reduc(y);
yb = y - ya;


F = z * y + w * yb;
Fa = reduc(F);
Fb = F - Fa;

G = Fa + w * ya;
Ga = reduc(G);
Gb = G - Ga;

H = Fb + Gb;
Ha = reduc(H);
w = 16 * (Ga + Ha);

/* Test the power of 2 for overflow */
if( w > MEXP )
\t{
\tmtherr( fname, OVERFLOW );
\treturn( MAXNUMF );
\t}

if( w < MNEXP )
\t{
\tmtherr( fname, UNDERFLOW );
\treturn( 0.0 );
\t}

e = w;
Hb = H - Ha;

if( Hb > 0.0 )
\t{
\te += 1;
\tHb -= 0.0625;
\t}

/* Now the product y * log2(x) = Hb + e/16.0.
*
* Compute base 2 exponential of Hb,
* where -0.0625 <= Hb <= 0.
* Theoretical relative error of the approximation is 2.8e-12.
*/
/* z = 2**Hb - 1 */
z = ((( 9.416993633606397E-003 * Hb
+ 5.549356188719141E-002) * Hb
+ 2.402262883964191E-001) * Hb
+ 6.931471791490764E-001) * Hb;

/* Express e/16 as an integer plus a negative number of 16ths.
* Find lookup table entry for the fractional power of 2.
*/
if( e < 0 )
\ti = -( -e >> 4 );
else
\ti = (e >> 4) + 1;
e = (i << 4) - e;
w = A[e];
z = w + w * z; /* 2**-e * ( 1 + (2**Hb-1) ) */
z = ldexpf( z, i ); /* multiply by integer power of 2 */

if( nflg )
\t{
/* For negative x,
* find out if the integer exponent
* is odd or even.
*/
\tw = 2 * floorf( (float) 0.5 * w );
\tif( w != y )
\t\tz = -z; /* odd exponent */
\t}

return( z );
}
-drl
New An option of interest?
The gcc manual has the following options described:

-mieee-fp
-mno-ieee-fp
Control whether or not the compiler uses IEEE floating point comparisons. These handle correctly the case where the result of a comparison is unordered.
New I'll look into that. Datum: gcc 3.2.1 on glibc 2.3.1
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Doesn't seem to make any difference.
Another oddity:

double y = 0.0;
y = pow(10.0, 3);
if ((123456 / y) == (123456 / 7)) { printf("works\\n"); }

prints 'works'.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Re: Doesn't seem to make any difference.
This has to be gcc 3.x in interaction with 2.95.x horkage.

Will gcc 3 ever be ready?
-drl
     C decimal question: - (admin) - (53)
         Equality on Real/Double is always a crapshoot - (tablizer)
         Don't know... - (Simon_Jester)
         Re: C decimal question: - (deSitter) - (1)
             Makes no difference. - (admin)
         The boolean compare operators - (ChrisR) - (36)
             I've done the subtract - (admin) - (35)
                 Assembly dump? - (ChrisR) - (22)
                     I did my dump with GDB - (Arkadiy)
                     What I get: - (admin) - (20)
                         Speculating... - (ChrisR) - (19)
                             gcc 2.95.4 results - (admin) - (18)
                                 Re: gcc 2.95.4 results - (deSitter) - (1)
                                     EBP is base of stack frame - (Arkadiy)
                                 I86 Assembly is not my specialty... - (ChrisR)
                                 ASM Comments - (ChrisR) - (13)
                                     Errrr... - (admin) - (12)
                                         Getting ASM from VC6 - (deSitter) - (11)
                                             Results - (deSitter) - (10)
                                                 Request... - (ChrisR) - (9)
                                                     Re: Request... - (deSitter) - (8)
                                                         Trimming it down... - (ChrisR) - (7)
                                                             Re: Trimming it down... - (deSitter) - (6)
                                                                 Scratches head... - (ChrisR) - (5)
                                                                     Re: Scratches head... - (deSitter) - (4)
                                                                         Thanks... - (ChrisR) - (3)
                                                                             BTW - (deSitter) - (2)
                                                                                 Hmmmm - (ChrisR) - (1)
                                                                                     puts("42"); -NT - (deSitter)
                                 gcc 2.95.26 - (ChrisR)
                 Did you try it with optimizations off? -NT - (deSitter) - (7)
                     A really good optimizing compiler... - (ChrisR) - (6)
                         Yes.. - (deSitter)
                         Compiler has no way to predict how pow() is implemented - (Arkadiy) - (4)
                             IEEE - (deSitter) - (2)
                                 pow() is library, not hardware - (Arkadiy) - (1)
                                     True - (deSitter)
                             Re: Compiler has no way to predict how pow() is implemented - (deSitter)
                 An option of interest? - (ChrisR) - (3)
                     I'll look into that. Datum: gcc 3.2.1 on glibc 2.3.1 -NT - (admin)
                     Doesn't seem to make any difference. - (admin) - (1)
                         Re: Doesn't seem to make any difference. - (deSitter)
         Not a direct answer, but still on-topic. - (static)
         RESOLUTION: - (admin) - (10)
             A caveat from Borland: - (a6l6e6x) - (6)
                 This was slightly different. - (admin) - (5)
                     Re: This was slightly different. - (deSitter)
                     Oh - to discover a real compiler problem - (deSitter) - (1)
                         This is not considered a compiler problem by the GCC folks. - (admin)
                     Scott, that's a bug worthy of catching - (Simon_Jester) - (1)
                         Fun to track down, at least. - (admin)
             Re: RESOLUTION: - (deSitter) - (2)
                 Historically speaking... - (ChrisR) - (1)
                     Re: Historically speaking... - (deSitter)

They're going to sue you.
405 ms