Subject: Re: [sv-ec] Re: [sv-bc] Packed arrays
From: Kevin Cameron x3251 (Kevin.Cameron@nsc.com)
Date: Wed Jan 29 2003 - 16:27:48 PST
> From mac@verisity.com Wed Jan 29 15:16:01 2003
>
> Kevin Cameron x3251 writes:
> > You would only have to do that with packed structs/unions, and the
> > overhead difference is likely to be marginal.
> >
> > Using $rtb/$btr is definitely slower regardless of bit order than
> > using wreal or a shared variable.
> >
>
> Come now. Realtobits is presumably coded as:
>
> typedef union {
> double d;
> long long l;
> } rtb_u;
>
> long long realtobits( double d ) {
> rtb_u u;
> u.d = d;
> return(u.l);
> }
>
> which on the sparc is expanded to essentially a nop (three cheers for
> register windows!):
>
> 0x10608 <realtobits>: retl
> 0x1060c <realtobits+4>: nop
>
> and given the test program:
>
> main ()
> {
> union {
> long long l;
> int a[2];
> } u;
> double d = 3.1415926459045;
> long long l;
> l = realtobits(d);
> u.l = l;
> printf("D is %f; $realtobits(D) is 0x%08x%08x\n",d,u.a[0],u.a[1]);
> }
>
> on the sparc you get:
>
> kodiak 162 > a.out
> D is 3.141593; $realtobits(D) is 0x400921fb533c1c8a
>
> On the x86 it expands to:
>
> 0x8048460 <realtobits>: push %ebp
> 0x8048461 <realtobits+1>: mov %esp,%ebp
> 0x8048463 <realtobits+3>: mov 0x8(%ebp),%eax
> 0x8048466 <realtobits+6>: mov 0xc(%ebp),%edx
> 0x8048469 <realtobits+9>: pop %ebp
> 0x804846a <realtobits+10>: ret
> 0x804846b <realtobits+11>: nop
>
> because I have a 32 bit linux port, and hence you can't do a 64 bit
> move; and moreover without register windows you have to actually copy
> the input to the output. Curiously the CISC takes more instructions
> than the RISC, in this case!
>
> On Linux, of course, as a little endian, you get a different
> intermediate value:
> D is 3.141593; $realtobits(D) is 0x533c1c8a400921fb
Hmm.. that appears to be a straight 32-bit word swap. Obviously both
machines are using the same FP representation.
> The point is realtobits and bitstoreal do precisely what a union would
> do; and I believe this is what you are asking for in your packed
> discussion; and so I tell you that in terms of execution speed, you
> already have what you want, part of Verilog 1364-1995.
>
> All that you could optimize away is the procedure call, which on the
> sparc is the call and the return.
>
> I do agree that it would be far nicer if one didn't have to call
> $realtobits and $bitstoreal when passing things around.
That's where the real inefficiency is.
> However, if you do make this conversion to bits via a union part of SV,
> you must then choose:
>
> 1) State that the intermediate bit pattern is undefined and
> implementation specific,
>
> 2) Require the little endian (x86) to do it the big endian (sparc)
> way, making this operation more expensive on x86
>
> 3) Require the big endian (sparc) to do it the little endian (x86)
> way, making this operation more expensive on sparc
>
> -mac (beating a dead horse, and enjoying it enormously... You gave me
> a brief bit of joy, allowing me to program in C again... *sigh* )
The code above should behave the same in SV as it does in C; the
argument is about the behavior of packed structs/unions. Your example
shows that there is no real difference in the FP representation (IEEE,
I presume) between x86 and SPARC, so only minor bit twiddling would be
required if you pick a particular order for packed doubles.
You can consider an IEEE 64-bit real as the SV:
struct packed {
bit sign;
bit [10:0] exponent;
bit [51:0] mantissa;
} ieee_64bit_fp;
(http://archive.stsci.edu/fits/fits_standard/node94.html#SECTION002112000000000000000)
Since packed items are platform independent, you might as well say that's
what a double is when it's packed and have it behave as any other packed
struct. Shortreal is:
struct packed {
bit sign;
bit [7:0] exponent;
bit [22:0] mantissa;
} ieee_32bit_fp;
Kev.