RE: Thoughts about arrays in SystemVerilog


Subject: RE: Thoughts about arrays in SystemVerilog
From: Michael McNamara (mac@verisity.com)
Date: Tue Mar 26 2002 - 16:58:51 PST


Paul Graham writes:
> Here are a few thoughts about arrays in SystemVerilog. This is not a
> proposal, just an exploration of the issues raised at yesterday's phone
> meeting.

<diatribe>
This packed versus unpacked business reminds me, distinctly, of
Verilog-XL's 'vectored' and 'scalared' attributes for vectors.

As I recall, if you didn't state that a vector was 'scalared', then
from the PLI you could not fetch bit selects of the item directly; you
had to fetch the whole thing, and then write your own C code to slice
and dice the item that you received. Moreover, I believe you had to
fetch the 0/1 bits in one call and the x/z bits in another call.
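
To illustrate the kind of slice-and-dice code involved, here is a
minimal sketch (in Python rather than PLI C, for brevity). It assumes
the whole vector arrives as two integers, one per call, using the
aval/bval style of encoding; the function name part_select and the
sample values are hypothetical, not taken from any real PLI routine.

```python
def part_select(aval, bval, msb, lsb):
    """Return bits [msb:lsb] of a 4-state vector as a string, MSB first.

    Assumed per-bit encoding (aval, bval):
    (0,0) = 0, (1,0) = 1, (0,1) = z, (1,1) = x.
    """
    if msb < lsb:
        raise ValueError("msb must be >= lsb")
    symbols = {(0, 0): "0", (1, 0): "1", (0, 1): "z", (1, 1): "x"}
    out = []
    for i in range(msb, lsb - 1, -1):
        a = (aval >> i) & 1  # bit i of the 0/1 plane
        b = (bval >> i) & 1  # bit i of the x/z plane
        out.append(symbols[(a, b)])
    return "".join(out)


# Hypothetical 8-bit vector holding 1 0 x z 1 1 0 0 (bit 7 down to bit 0).
aval = 0b10101100
bval = 0b00110000
```

With these values, part_select(aval, bval, 5, 2) picks out the four
bits [5:2] of the vector.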

I could never understand why this delivered any benefit to the world
whatsoever.

A) if the user wanted a bit or part select, it didn't save anything by
not allowing him to get it easily: he would write the code anyway
(likely inefficiently as he is a chip designer not a C programmer).

B) It made for incredibly embarrassing run time errors: at time 500,000
you run some PLI code that you wrote to fetch out data for a timing
calculator, and you get a PLI error: can't do a part select of a non
scalared vector...

C) The vendor already had to write the code to fetch out bit and part
selects from those vectors declared 'scalared', so they didn't escape
any work. Perhaps there was some 'fast' representation of a vector
that Gateway came up with for which it was inconvenient to code up a
bit/part select? Give me a break!

D) At Chronologic we couldn't think of two ways to store a vector
such that one would be faster than the form in which it is convenient
to hold the data so that the PLI can fetch bit and part selects.

E) Even if we could come up with dual storage methods, the very idea
of having to design four ways to do addition, subtraction,
multiplication, division and modulus on vectors (V op V, V op S, S op
V, and S op S), code them all up, test them, and regress them just
boggled the mind.

Hence we implemented a single way of storing vectors (which made coding
a lot simpler!) and simply ignored the 'scalared' and 'vectored'
keywords.

From the PLI you could get a bit or part select of anything you
wanted.
</diatribe>

OK, that said, I am left wondering what is the advantage that both
packed and unpacked arrays are supposed to give us?

If there is a way to unpack a packed array so that you get the joys of
unpackedness; and also a way to pack an unpacked array to get the
reverse, why then are we making the user decide?

I say we pick the representation that offers the most flexibility, and
implement JUST ONE method for aggregating data.

If we are trying to grandfather in old Verilog 2001 code that does not
slice and dice, while at the same time allowing new code to do so, and
hence feel we need to introduce new syntax, then I say let us instead
just go ahead and remove the restrictions. No old code exists that
performs these operations because, until now, they have been
illegal.

>
> 1. Peter pointed out that one source of Verilog's simulation efficiency is
> its lack of variable length arrays. Because all array (or bit-vector)
> declarations have constant range bounds, their sizes are fixed when
> simulation begins. In contrast, VHDL generally requires some size
> information to be associated with an array, since an array's bounds may be
> determined after simulation begins, for instance, by a slice with variable
> bounds.
>
> Since a verilog array or array slice (part select) is required to
> have constant bounds (or constant width, in the case of part
> selects), and a multiple concatenation is required to have a
> constant repeat count, it is not necessary to pass around array
> size information. The size of each array can be determined by the
> *start* of simulation. Not *before*, since it is still possible,
> for example, for an array to be sized by a function, which may be
> affected by parameters, which may be changed by a module
> instantiation. But still, no extra simulation data structures are
> needed to store array sizes.

This really doesn't deliver _simulation_ efficiency. It does mean
that you can calculate, at elaboration time, the exact memory image of
the job, malloc it once, and use it from then on.
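
As a rough sketch of what "calculate the memory image at elaboration
time" can look like: the Python below walks a list of declarations
whose sizes are all constant, assigns each a fixed offset, and makes
one allocation. The layout function, the declaration tuples, and the
byte-granular 2-state storage are all simplifying assumptions for
illustration, not how any particular simulator does it.

```python
def layout(decls):
    """Assign each (name, width_bits, depth) declaration a fixed byte offset.

    Returns (offsets, total_bytes). Every size is known before the
    simulation starts, so nothing here depends on run-time data.
    """
    offsets = {}
    cursor = 0
    for name, width_bits, depth in decls:
        bytes_per_elem = (width_bits + 7) // 8  # round up to whole bytes
        offsets[name] = (cursor, bytes_per_elem)
        cursor += bytes_per_elem * depth
    return offsets, cursor


# Hypothetical declarations: a 32-bit register, a 1024 x 32 memory, a flag.
decls = [("memdata", 32, 1), ("memory", 32, 1024), ("flag", 1, 1)]
offsets, total = layout(decls)
image = bytearray(total)  # the single up-front allocation
```

After this, every access is a fixed offset into image; no per-array
size information needs to be carried around during simulation.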

>
> 2. A feature of VHDL and C, but not Pascal or Verilog, is the
> ability to declare a function which operates on a variable-length
> array. In VHDL this is done using an unconstrained array argument.
> In C it is done with a pointer to the start of the array and a
> specification of the element type, but no indication of the array
> length. It would be nice, at some point in the future, to allow
> Verilog functions (and modules) to accept variable-length array
> arguments.
>
> One objection to variable-length arguments is that they require
> simulation-time size data, which goes against the efficiency
> concerns in part 1. above. But a way to get around this problem is
> by function or module cloning. I don't know if simulators do this,
> but the way that our synthesis tool handles a variable-length array
> argument in vhdl is to treat it as being governed by an implicit
> parameter to the module. Since we generate a unique netlist for
> every different combination of parameters to a module, the effect
> is that every different argument size that a module is instantiated
> with results in a copy of the module with that argument sized
> appropriately.

> The same goes for functions. Given this function:
>
> function f(x : bit_vector) return integer;
>
> then the calls:
>
> f((x, y)) -- called with 2-element bit-vector
> f((x, y, z, w)) -- called with 4-element bit-vector
>
> result in two copies of the function, like this:
>
> function f_2(x : bit_vector(0 to 1)) return integer;
> function f_4(x : bit_vector(0 to 3)) return integer;
>
> For multidimensional arrays, each dimension can have its own implicit
> parameters. (Actually, for vhdl you need parameters for the left & right
> bounds, and for the direction, since these attributes can be queried from
> within the function. But you get the idea.)

 Module cloning is already required in Verilog because of parameters.
 However, the values of parameters are known at compile time.

 I.e., given:

 module pow2(out, in);
   parameter WIDTH = 8;
   input [WIDTH-1:0] in;
   output [WIDTH-1:0] out;

   assign out = in * in;

 endmodule

 module top;
    wire [63:0] o2;
    wire [31:0] o1;
    reg [63:0] i2;
    reg [31:0] i1;

    pow2 #(32) p1(o1,i1);
    pow2 #(64) p2(o2,i2);

 endmodule

 A Verilog simulator would likely clone pow2: one copy with 32-bit
 arguments, the other with 64-bit arguments.
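
The cloning idea can be sketched outside Verilog as a cache of
specialized functions keyed by the parameter value. This is only an
illustration in Python; clone_pow2 is a hypothetical name, and the
masking mimics the truncation of the product to WIDTH bits.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def clone_pow2(width):
    """Build (once per distinct WIDTH) a specialized copy of pow2."""
    mask = (1 << width) - 1

    def pow2(value):
        # out = in * in, truncated to WIDTH bits like the Verilog assign
        return (value * value) & mask

    return pow2


p1 = clone_pow2(32)  # corresponds to pow2 #(32) p1
p2 = clone_pow2(64)  # corresponds to pow2 #(64) p2
```

A second request for the same width returns the existing clone, so
the number of copies is bounded by the number of distinct parameter
values seen at compile time.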

 I assume that VHDL parameters are known at synthesis time, right?

 However, in a language like C, one cannot usefully 'clone' a
 function, as the myriad possible sizes of arrays discernible at
 compile time would mean requiring 2^32 implementations of strlen(3),
 for example. It seems the same would be the case in VHDL if indeed
 it has run-time sized arrays.

 As I said on the call, I did build a compiler for such a language
 (this one allowed you to vary, at run time, the length of each
 dimension of the array, as well as to define the lengths by a
 formula). (I hope Kurt Baty isn't reading this and getting any
 ideas!!!) And yes, you do add a shadow variable to every array
 dimension to hold its current length, so that your library code
 can properly iterate.
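
A minimal sketch of the shadow-variable idea, in Python: the array
keeps flat storage plus one stored length per dimension, and library
code iterates using only those stored lengths. The class DynArray and
the function array_sum are hypothetical names, and run-time resizing
is left out to keep the example short.

```python
from itertools import product


class DynArray:
    """Flat storage plus one shadow length per dimension."""

    def __init__(self, *dims):
        self.dims = list(dims)  # shadow variables: current length of each dimension
        size = 1
        for d in self.dims:
            size *= d
        self.data = [0] * size

    def _flat(self, idx):
        """Map a multi-dimensional index to a flat offset (row-major)."""
        if len(idx) != len(self.dims):
            raise IndexError("wrong number of indices")
        flat = 0
        for i, d in zip(idx, self.dims):
            if not 0 <= i < d:
                raise IndexError("index out of range")
            flat = flat * d + i
        return flat

    def get(self, *idx):
        return self.data[self._flat(idx)]

    def set(self, value, *idx):
        self.data[self._flat(idx)] = value


def array_sum(arr):
    """Library routine that knows the bounds only via the shadow lengths."""
    return sum(arr.get(*idx) for idx in product(*(range(d) for d in arr.dims)))


a = DynArray(2, 3)
a.set(5, 0, 0)
a.set(7, 1, 2)
```

The cost is visible here: every index computation and every loop has
to read the stored lengths at run time.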

> 3. The first item in my array proposal is to remove the packed array syntax
> and use only the unpacked array syntax. That is, all array declarations
> will look like one of the following:
>
> wire [3:0] x;
> wire [3:0] x [3:0];
> wire [3:0] x [3:0][3:0];
> ...
>
> and not like:
>
> wire [3:0][3:0] x [3:0];
>
> I think it is too bad that memories in verilog were defined with the
> bit-width on the left and the array width on the right. Certainly I found
> this confusing when I was learning verilog. It seemed odd to me that the
> leftmost bound in a memory declaration was the fastest varying bound, while
> the rightmost was the slowest varying. As a consequence, a memory (in
> Verilog-2001) is indexed backwards compared to its declaration. For
> instance:
>
> reg [7:0] mem [127:0];
>
> is indexed as mem[127][7], instead of mem[7][127].

 Paul, this came about because, given the Verilog rule that one could
 not directly reference elements of a memory but could only fetch from
 and store to them, one very commonly declared the memdata register
 and the memory in the same statement:

 module mem (memaddr, memdata);
  input [9:0] memaddr;
  output [31:0] memdata;
  reg [31:0] memdata, memory[0:1023];

  always @(memaddr)
    memdata <= #10 memory[memaddr];
 endmodule

> It would be nice, in my opinion, if we could just put all the array bounds
> on one side or the other, but we can't do that while maintaining
> compatibility with Verilog. So the next best thing is to confine the word
> bounds to the left of the declaration and put all the array bounds on the
> right. This is exactly what Verilog-2001 does, and so requires no change to
> the existing Verilog language syntax.
>
> 4. Someone pointed out that by eliminating packed arrays we run into a
> funny case with typedefs:
>
> typedef reg [7:0] foo;
> foo [7:0] bar;
>
> If packed arrays are disallowed, then the declaration of bar is illegal,
> because it would expand to be a two-dimensional packed array. The issue
> raised was that this makes a typedef's syntactic validity dependent on the
> specifics of its definition.
>
> But in fact this same problem occurs with packed arrays. Can I not do
> something like this:
>
> typedef reg foo [7:0];
> foo [7:0] bar;
>
> Here, the typedef 'foo' specifies an unpacked array of bits. Such a typedef
> cannot be used within the packed declaration of 'bar'.
>
> (I'm not sure if this example is legal in SystemVerilog as
> currently defined, because the manual mentions only that packed
> arrays can be built up with typedefs. If that's the case, why
> can't unpacked arrays be built with typedefs?)
>
> I think that both these problems are a consequence of splitting
> array bounds to the left and right of the declaration name. Such a
> problem does not occur in C, because you simply don't have as many
> choices:
>
> typedef int foo [3];
> foo bar[4];
>
> This is the only thing you can do. You can't have, for instance:
>
> typedef int [3] foo; // syntax error
> foo [4] bar; // syntax error
>
> If we remove the packed array syntax, then we can adopt the
> convention that all array bounds appearing in a declaration or
> typedef beyond the first (bit-vector) dimension are applied to the
> unpacked array dimensions. With this convention,
>
> typedef reg [3:0] foo; // bit-vector range [3:0];
> foo [4:0] bar; // same as reg [3:0] bar [4:0]
> foo bar2 [4:0]; // same as reg [3:0] bar2 [4:0]
>
> Or if you don't like having multiple ways of saying the same thing, then
> I have no problem with saying that
>
> foo [4:0] bar;
>
> is illegal, and requiring the use of
>
> foo bar [4:0];
>
> to specify an array of five foo's.

I really don't have an opinion on typedef and {un?}packed arrays;
except to say having two kinds of arrays is going to make explaining
typedef to hardware designers REALLY DIFFICULT. I believe it is well
known that few _C programmers_ are as comfortable with typedefs as
they should be.

> 5. With all this said, I have to admit that the packed array syntax
> is convenient for representing something like an array of registers
> which are in turn subdivided into bit-fields. For instance, the
> PowerPC Altivec register set can be declared as:
>
> reg [3:0][31:0] regs [31:0];
>
> If you're familiar with the Altivec register set, you'll know that it can
> also be declared as:
>
> reg [127:0] regs [31:0];
> reg [7:0][15:0] regs [31:0];
> reg [15:0][7:0] regs [31:0];
>
> Basically, a 128-bit register can be seen as a single integer, 4 words, 8
> half words, or 16 bytes.
>
> However, Verilog-2001 does provide variable part selects, which I suppose
> are intended to make it easy to index into an integer as though it were a
> small array. So even without packed arrays it is not too hard to index
> into 'regs' like:
>
> regs[x][y*8-1-:8] -- get y'th byte of x'th reg
> regs[x][y*16-1-:16] -- get y'th half-word of x'th reg
> regs[x][y*32-1-:32] -- get y'th word of x'th reg
>
>
> Comments anyone?
>
> Paul


