Appreciate this feedback, Greg. By responding, however, you open yourself up to answering my questions in return, so I hope you don't mind if I leech some learnings off you in the process of improving my smithing nomenclature :) Let's see if I can respond to these issues and improve this proposal. See below.

-Tom

>-----Original Message-----
>From: Greg Jaxon [mailto:Greg.Jaxon@synopsys.com]
>Sent: Monday, November 19, 2007 11:23 AM
>To: Alsop, Thomas R
>Cc: sv-bc@eda.org
>Subject: Re: [sv-bc] 1339: (RESEND)`define behavior on trimming leading and trailing spaces in macros
>
>Alsop, Thomas R wrote:
>
>> Here is the additional wording. I took some of it from the ANSI C
>> preprocessing document that Steven pointed us to. I am not a wizard yet
>> on LRM word-smith'ing so any advice before we vote on this would be welcome.
>>
>> Thanks, -Tom
>
>You need more from the ANSI C standard - specifically the whole step-by-step
>operational definition approach. Here are some questions I have for your
>definition:
>
> A) Is the backslash escape for newline applied before or after other
>    uses of backslash as escape (for example in quoted strings, or escaped
>    identifiers)? If I want a quoted newline in the replacement, what
>    should I write? (see below for some alternatives)
>
[Alsop, Thomas R] I agree that the replacement should save newline escapes. So the way I would answer this is that they be preserved during macro replacement. For multi-line macros, the last escape character is the continuation character. All other escape characters are treated as part of the replacement text.

> B) Is backslash-newline whitespace? I always assumed it was, but you treat
>    it separately, why?
>
[Alsop, Thomas R] Simply because it should be preserved, while the other whi That's a good question WRT this definition. My question back to you would be whether replacement code should contain newlines?
If I had to visually see the code after the replacement, then I would argue that newlines must be preserved to make it readable. I don't know of any tools that I am currently using which require that I look at the replaced code, so perhaps I should be treating them the same.

> C) Can the backslash-newline ligature be the terminating whitespace
>    of an escaped identifier? If so, will the identifier end with a backslash
>    or not?
>
[Alsop, Thomas R] I see your point. It has to be the terminating character for escaped identifiers along with other white-space characters. Backslash newline must be considered "white-space" in this context. And no, the identifier cannot have the backslash as part of the identifier.

> D) The first sentence defines "macro text" as being arbitrary stuff
>    on the same "line"; veterans who know the Unix convention of escaped newlines
>    can factor this in as just more arbitrary bytes. But your additional
>    sentences describe the "macro replacement string", which I feel is a misuse
>    of the well-defined term "string". I think both the term "text" and "string"
>    are misleading, and the LRM should instead define the "macro_replacement formula",
>    since it clearly contains free variables. But ultimately the trimming
>    effort belies the original definition of this text as "arbitrary".
>
[Alsop, Thomas R] Yes, I was struggling to find the right nomenclature for all the "arbitrary" stuff that we know of as the macro. For me there are three things found in a macro: the string tokens, which are all just cut-and-paste stuff; the argument tokens, which are "search and replace" stuff; and the other stuff (sorry if I am sounding like Britney Spears here :), which has special replacement operations. For example, \" gets replaced with ". All of this combined stuff I just called the macro string. My bad. I like "macro text". "Macro_replacement formula" doesn't encapsulate or describe that stuff for me. Is this another term I am just not used to?
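
To make the answer to (C) concrete, here is a minimal Python sketch of how a lexer might end an escaped identifier under that reading. It assumes that space, tab, and newline are the terminating white-space characters, and that a backslash immediately followed by a newline also counts as white-space. The function name `read_escaped_identifier` and the `WHITESPACE` constant are hypothetical and come from neither the LRM nor any tool.

```python
# Assumption: these are the white-space characters that end an escaped identifier.
WHITESPACE = " \t\n"


def read_escaped_identifier(text, start=0):
    """Return (identifier, index of the terminator) for an escaped identifier.

    The leading backslash is not part of the returned identifier. Under the
    reading in the reply to (C), a backslash-newline pair terminates the
    identifier like any other white-space, so that backslash is excluded too.
    """
    if not text.startswith("\\", start):
        raise ValueError("an escaped identifier must begin with a backslash")
    pos = start + 1
    while pos < len(text):
        if text[pos] in WHITESPACE:
            break  # ordinary white-space ends the identifier
        if text.startswith("\\\n", pos):
            break  # backslash-newline is treated as white-space here
        pos += 1
    return text[start + 1:pos], pos
```

A backslash that is not followed by a newline stays inside the identifier, so only the continuation pair acts as a terminator in this sketch.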
>> The macro text can be any arbitrary text specified on the same line as
>> the text macro name. If more than one line is necessary to specify the
>> text, the newline shall be preceded by a backslash ( \ ). The first
>> newline not preceded by a backslash shall end the macro text. The
>> newline preceded by a backslash shall be replaced in the expanded macro
>> with a newline (but without the preceding backslash character).
>
>Which raises question (A) what tokenization happens after this replacement,
>and what backslash substitutions happen before it. Which text below
>expands to the newline character?
>
[Alsop, Thomas R] I hear that word "tokenization" a lot, but I am not familiar with it. What does that mean exactly?

>`define ascii_NL "\\
>"
>
>or
>
>`define ascii_NL "\\\
>"
>?
>
[Alsop, Thomas R] First, the above cases do not currently compile; I am sure you knew this already. I believe I understand where you are taking this question. Interestingly enough, I had to add a white-space before the escape continuation to get it to compile. I wasn't aware that this was a requirement. According to my interpretation of the LRM, it should compile, as we do not require white-space before the continuation character. In both of these examples the last escape continuation character, followed by the newline, expands into the newline without the continuation character. The first example should treat the preceding escape character as nothing, since there is nothing after it upon replacement. The second example replaces the two preceding escape characters with one escape character in the replacement.

>In Unix conventions, a "line" is defined as arbitrary text delimited by unescaped newlines.
>I'd prefer to see that definition once very early in the lexical convention section,
>and then simply not deal with it, except in notes or examples to illustrate the concept.
>
[Alsop, Thomas R] That would certainly eliminate some confusion.
I found that as I was reading through your reply it took me a while to understand your "newline" usage. My interpretation was a newline as literally defined in the LRM as the "\n" characters, when in reality it also refers to the unseen newline as the end of a line, which I believe you are referring to as the "unescaped" newline.

>Similarly "whitespace" comprises space, the horizontal and vertical tabs, and newline,
>maybe carriage return - and possibly others. Isn't there a standard covering this?
>
[Alsop, Thomas R] Not that I am aware of in the LRM. Shalom should answer this.

>As to whether the committee should back down from "arbitrary" text to "trimmed" text,
>I would personally recommend trimming /leading/ whitespace, but NOT /trailing/ whitespace.
>The first is done in the interest of free vertical alignment, to make
> `define A 1
>equal
> `define \A 1
>, and to prevent any macro from expanding to mere whitespace.
>
[Alsop, Thomas R] You lost me on this. What is the second example you have? How would you use the "\A" macro?

>The second is done to finesse this whole complication about escaped identifiers.
>
>> Any white-space characters preceding or following the macro replacement
>> string are not considered part of the replacement. Additionally, for
>> multi-line macros any trailing white-space between the last token on a
>> line and the newline before a backslash is not considered part of the
>> replacement.
>
>That "Additionally" clause is probably just a note on your original definition,
>not an actual addition to the rules. I oppose the trimming of trailing
>whitespace. However, I don't vote, so don't fret about it if you can get
>consensus otherwise.
>
[Alsop, Thomas R] No, not a note, just not well written. What do you think about this instead?

The macro text can be any arbitrary text specified on the same line as the text macro name. If more than one line is necessary to specify the text, the newline shall be preceded by a backslash ( \ ).
The first newline shall end the macro text. The newline preceded by a backslash shall be replaced in the expanded macro with a newline (but without the preceding backslash character). Any white-space characters preceding or following the macro replacement text are not considered part of the replacement. For multi-line macros any leading white-space and any trailing white-space preceding an escape continuation character will be removed.

>Greg

Received on Mon Nov 19 14:31:20 2007
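
As a rough illustration of the revised wording, the following Python sketch computes the replacement text from the source that follows a macro name. It rests on several assumptions that the thread leaves open: a backslash immediately before a newline is always a continuation (question (A) about doubled backslashes is not settled here), the macro text ends at the first newline not preceded by a backslash, white-space means space and tab within a line, and "leading and trailing white-space" is trimmed on every line of a multi-line macro. The function name `macro_replacement_text` is hypothetical.

```python
def macro_replacement_text(source):
    """Return the replacement text for the macro body at the start of `source`.

    `source` is everything after the macro name. Text beyond the first
    newline that is not preceded by a backslash is ignored.
    """
    lines = []
    rest = source
    while True:
        line, newline, rest = rest.partition("\n")
        if newline and line.endswith("\\"):
            # Continuation: drop the backslash; the newline itself is kept
            # when the pieces are joined below.
            lines.append(line[:-1])
            continue
        lines.append(line)  # the last line of the macro text
        break
    # Assumed reading: trim spaces and tabs at both ends of every line,
    # then trim any white-space around the whole replacement.
    return "\n".join(part.strip(" \t") for part in lines).strip()
```

For example, a body written as `a +   \` followed by `    b   ` on the next line yields `a +`, a newline, and `b`. Under Greg's alternative (trim leading but not trailing white-space), the final `strip` calls would become `lstrip`.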