Here are a few more curiosities:

10. You can escape the quotation mark and write "\"". This is a one-character string containing the quotation mark. The LRM defines this for text macros, but not for string literals. For that reason, you also have to use a double backslash if you want to end a string literal with a backslash; otherwise it escapes the closing quotation mark.

11. Because special characters like \n are really a single character, they are recognized only if you write them together in a single string literal. For example, if you concatenate two string literals, one that ends with a backslash and one that begins with 'n', the result stays two characters, a backslash followed by an 'n'.

12. In contrast to special characters like \n, %% is recognized as a single percent character only when the string is interpreted as a format string. So $display("%%"); displays a single %, whereas $display("%s", "%%"); displays %%. Generally, unless it is displayed via a %s format, a string literal in a $display may be interpreted as a format string, with everything that implies.

13. What about a concatenation as a format string, e.g., $display({"a","b"});? Here some simulators identify it as a string and display "ab", whereas others convert it to a numeric value and display a number.

14. If you end a format string with a % symbol, as in $display("a%");, some tools print the % symbol, even though only one appears and not two, and others issue an error that a format character is missing following the % symbol.

Shalom

________________________________
From: owner-sv-bc@server.eda.org [mailto:owner-sv-bc@server.eda.org] On Behalf Of Bresticker, Shalom
Sent: Sunday, May 11, 2008 2:37 PM
To: sv-bc
Cc: Geoffrey.Coram
Subject: [sv-bc] Special characters in strings - Mantis 1507

5.9 says, "Nonprinting and other special characters are preceded with a backslash." There is an example of "Hello world\n", which is considered a 12-character string.
That is, "\n" is considered a single character, even though it takes more than 1 character to write it. Is that true for all the characters in Table 5-1? What if a backslash is followed by a character that does not appear in Table 5-1? Is it one character or two?

20.2.1.1 says, "An escaped character not appearing in Table 20-1 shall cause the character to be printed by itself," but that relates to printing and not necessarily to the literal itself.

What about "%%"? As a display format, it has a special meaning, to just display "%". But is it one character or two?

There is need for clarification in the LRM. Mantis 1507 more or less covers this issue.

I started checking with four different simulators and here is what I found.

1. The special characters appearing in Table 5-1 and Table 20-1 that consist of a backslash followed by a single character are stored in a string literal as a single character, the ASCII code corresponding to the special character denoted. That is, "\n" is stored with the ASCII code corresponding to newline (line feed), "\t" is stored with the ASCII code corresponding to Tab, etc.

2. A backslash followed by a character not appearing in those tables is also stored as a single character, but as the ASCII code corresponding to that character, the same as if not escaped. So, for example, "\m" is stored as x6D, the same as "m", and "\o" is stored as x6F, but "\n" is stored as x0A, not as x6E.

3. What about the new special characters added in 1800-2005 3.6: \v, \f, \a? So far as I can see, none of the simulators implements them yet, and they are just treated as normal, i.e., "\v" is the same as "v", etc. One simulator gives a warning that these special characters are not implemented yet.

4. Presumably one would expect them to be stored as their special characters. I.e., "\a" would be stored with the ASCII code corresponding to "Bell" and not the same as "a".
This means that moving between Verilog and SystemVerilog modes would change the behavior in this case, not just when printing them as strings, but also in their internal storage value, and thus their printed value in non-string formats, etc. It changes the string length as well. Admittedly, this is normally a corner case.

5. 20.2.1.1 includes a new line from Mantis 1101: "An escaped character not appearing in Table 20-1 shall cause the character to be printed by itself. For example, a string argument "\a" shall print simply "a"." This was fine for 1364, but unfortunately "\a" is now also a special character, so the example should be changed to some other character. Stu, please note.

6. What about octal codes, a backslash followed by numerals? "\0" becomes x00, "\7" becomes x07, but "\8" becomes the ASCII code for the character "8". OK. That is according to the rule that a backslashed non-special character becomes the same as the character without the backslash.

7. What about an octal number greater than \377? The LRM says that a tool "may" issue an error. One tool did; the others truncated the octal number to 8 bits from the left, i.e., used the least significant 8 bits.

8. What about hexadecimal codes, \x followed by a one- or two-digit code? It seems that, like the other special characters added in 1800-2005, none of the simulators implements this yet. So they just look at this as "\x", which is the same as "x", followed by a digit string. As with the other new special characters, implementing this changes the behavior and internal stored value from 1364.

9. Oh, what about "%%"? In contrast to backslashed characters, this really is stored as 2 "%" characters, and therefore does not and should not appear in Table 5-1. It only becomes a single "%" character when displayed. This is the only difference between Table 5-1 and Table 20-1. Probably there should be only one table and "%%" should be described separately.
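
For anyone who wants to repeat these checks, a small probe module along the following lines may help. It is only a sketch: the module name string_escape_probe is made up for illustration, the comments restate what is reported in this thread rather than guaranteed output, and results will differ between tools and between Verilog and SystemVerilog modes. Printing each literal with %h shows its stored bytes, which is how a one-character literal can be told apart from a two-character one.

```systemverilog
// Hypothetical probe module; the name is not from the original post.
// Comments describe what the post reports, not guaranteed output.
module string_escape_probe;
  initial begin
    // Items 1-2: stored value of escaped characters, printed in hex.
    $display("\\n   stored as %h", "\n");   // reported: x0A, one character
    $display("\\m   stored as %h", "\m");   // reported: x6D, same as "m"
    $display("m    stored as %h", "m");

    // Items 3-4: escapes added in 1800-2005 (reported as not yet implemented).
    $display("\\v   stored as %h", "\v");
    $display("\\f   stored as %h", "\f");
    $display("\\a   stored as %h", "\a");

    // Item 6: octal escapes versus a backslashed non-octal digit.
    $display("\\0   stored as %h", "\0");   // reported: x00
    $display("\\7   stored as %h", "\7");   // reported: x07
    $display("\\8   stored as %h", "\8");   // reported: ASCII code for "8"

    // Item 8: hexadecimal escape (reported as treated like "x" plus digits).
    $display("\\x41 stored as %h", "\x41");

    // Item 9: "%%" as a data argument rather than a format string.
    $display("%%%%   stored as %h", "%%");  // reported: two "%" characters

    // Item 10: escaped quotation mark.
    $display("\\\"   stored as %h", "\"");

    // Item 11: backslash and 'n' joined by concatenation, not in one literal.
    $display("concat stored as %h", {"\\", "n"});  // reported: stays two characters

    // Item 12: "%%" as a format string versus as a %s argument.
    $display("%%");
    $display("%s", "%%");

    // Item 13: concatenation used as the format string (tool-dependent).
    $display({"a", "b"});

    // Items 7 and 14 are left commented out because some tools reject them.
    // $display("%h", "\400");
    // $display("a%");
  end
endmodule
```

Some tools may warn about "\m", "\8", or the 1800-2005 escapes; that is itself useful information when comparing simulators.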
Received on Mon May 12 01:06:22 2008