But removing comments is not blind; it does take some account of tokenization.
For example,
a // inside a string does not begin a comment,
a // inside an escaped identifier does not begin a comment,
etc.
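
As a rough illustration of the point above, here is a minimal sketch of a
line-comment stripper that is aware of string literals and escaped
identifiers. The function name strip_line_comments is made up for this
example, and the sketch assumes single-line input with no block comments;
it is not how any particular tool does it.

```python
def strip_line_comments(line: str) -> str:
    """Remove a // comment from one line, ignoring // inside strings
    and inside escaped identifiers (hypothetical helper, sketch only)."""
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch == '"':
            # String literal: skip to the closing quote, stepping over
            # backslash-escaped characters such as \" on the way.
            i += 1
            while i < n and line[i] != '"':
                i += 2 if line[i] == "\\" else 1
            i += 1
        elif ch == "\\":
            # Escaped identifier: a backslash followed by any printable
            # characters, terminated by whitespace.
            i += 1
            while i < n and not line[i].isspace():
                i += 1
        elif line.startswith("//", i):
            # A real comment start: drop it and any blanks before it.
            return line[:i].rstrip()
        else:
            i += 1
    return line


print(strip_line_comments('x = "a//b"; // note'))
print(strip_line_comments("wire \\a//b ; // c"))
```

In both calls only the trailing comment is removed; the // inside the
string and inside the escaped identifier \a//b is left alone.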
Many lexers indeed use a "maximum munch" principle, i.e., they
continue gobbling characters as long as doing so legally extends the
current token under consideration. By that principle,
://xxxx would indeed be lexed as :/ followed by a comment.
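
To make the maximum-munch behaviour concrete, here is a small sketch of a
longest-match tokenizer. The operator sets SV_OPERATORS and
V2001_OPERATORS are hypothetical and cut down to just the operators under
discussion, and munch is an invented name; a real Verilog or SystemVerilog
lexer handles far more than this.

```python
# Hypothetical, deliberately tiny operator sets: the only difference is
# that the SystemVerilog set contains ":/" and the Verilog-2001 set does not.
SV_OPERATORS = (":/", ":", "/")
V2001_OPERATORS = (":", "/")


def munch(text, operators=SV_OPERATORS):
    """Tokenize text, always taking the longest token at the current position."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text.startswith("//", i):
            # A comment is a single token running to end of line, so at
            # this position it is always longer than a lone "/".
            end = text.find("\n", i)
            end = len(text) if end == -1 else end
            tokens.append(("COMMENT", text[i:end]))
            i = end
        else:
            matches = [op for op in operators if text.startswith(op, i)]
            if matches:
                op = max(matches, key=len)  # longest operator wins
                tokens.append(("OP", op))
                i += len(op)
            else:
                j = i
                while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                    j += 1
                j = max(j, i + 1)  # consume at least one character
                tokens.append(("WORD", text[i:j]))
                i = j
    return tokens


print(munch(":///xxxxxxx"))
print(munch(":///xxxxxxx", V2001_OPERATORS))
```

Under these assumptions the first call yields :/ followed by a comment
token, while the second, with no :/ operator available, yields : followed
by a comment token. The same input tokenizes differently depending only on
whether :/ is in the operator set.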
Nikhil
Brad Pierce wrote:
> The // is not itself a token. The entire comment, including the //,
> is a single token. This is specified in 2.1 of the V2K LRM. Likewise,
> the /* and */ in a block comment are not tokens.
>
> So a line like
>
> ://xxxxxxxx
>
> is not ambiguous, because there is only one legal tokenization of it.
>
> The issue then is about unlikely constructs such as the following,
> in which extra slashes have been added
>
> :///xxxxxxx
>
> In my opinion, this is still clearly a : token followed by a comment
> token. Note that comments are not part of the BNF. Regardless of
> how most current lexers are written, comments are conceptually stripped
> out before any tokens are emitted to the parser.
>
> The main reason this is not done in practice is that there are
> often nonstandard language extensions (pragmas) embedded in the
> comments. (As an aside, note that if a // comment is included in
> the text of a macro, it *must* be stripped out, pragma or not,
> according to 19.3.1 of the V2K LRM. So if you want to embed a
> pragma using a macro, it must be a block comment.)
>
> Conceptually, the line
>
> :///xxxxxxx
>
> should be preprocessed to
>
> :
>
> There is still no real ambiguity. It's just a lexing issue.
>
> -- Brad
>
>
> -----Original Message-----
> From: owner-sv-bc@eda.org [mailto:owner-sv-bc@eda.org]On Behalf Of
> Hermann.Ilmberger@infineon.com
> Sent: Thursday, August 26, 2004 2:39 AM
> To: pgraham@cadence.com
> Cc: sv-bc@eda.org
> Subject: RE: [sv-bc] precedence of :/ vs. //
>
>
>
>>>However, if your tool has a Verilog2001 mode and a
>>>SystemVerilog mode,
>>>the :// in your example would have to be preprocessed as
>>>: // (2 tokens) for 2001, and as
>>>:// (1 token) for SystemVerilog.
>>
>>What do you mean that "://" is one token in SystemVerilog? I
>>don't see any "://" token in the lrm. (Or has one been added
>>since 3.1a draft 6?)
>
>
> Sorry - I scanned one / too many. This all looks like smilies.
> :// has to be scanned as
> : // (2 tokens) for 2001, and as
> :/ / (2 tokens) for SystemVerilog
> when we assume C++ LRM rules.
>
>>The problem is that SystemVerilog needs to be backwards
>>compatible (as much as possible) with standard Verilog. I
>>ran across this "://" problem in an existing verilog test case.
>
>
> I had the same problem. :// is incompatible, and if there is still a
> possibility, we should find a better name which does not break old Verilog.
> -Hermann
>
>
>>The "///" problem wasn't an issue for C/C++ since "///" could
>>never occur in a valid C program (except within a /* ... */ comment).
>>
>>Paul
>>
>
>
Received on Fri Aug 27 09:39:38 2004