<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Implementing a scripting language with Antlr (Part 2: Parser)</title>
	<link>http://tech.puredanger.com/2007/01/15/antlr-2/</link>
	<description>Alex Miller's technical blog</description>
	<pubDate>Tue, 13 May 2008 09:49:11 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.2</generator>

	<item>
		<title>by: M Shekhar</title>
		<link>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1413</link>
		<pubDate>Mon, 26 Mar 2007 20:45:09 +0000</pubDate>
		<guid>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1413</guid>
					<description>Alex, I took a look at Vocabularies and it seems like that is not what I'm looking for. My problem is similar to this thread that someone started some time back, but which apparently got no responses.

http://www.antlr.org:8080/pipermail/antlr-interest/2005-September/013603.html

Let me know if you have any pointers on this problem.

Thanks,
manju</description>
		<content:encoded><![CDATA[<p>Alex, I took a look at Vocabularies and it seems like that is not what I&#8217;m looking for. My problem is similar to this thread that someone started some time back, but which apparently got no responses.</p>
<p><a href='http://www.antlr.org:8080/pipermail/antlr-interest/2005-September/013603.html' rel='nofollow'>http://www.antlr.org:8080/pipermail/antlr-interest/2005-September/013603.html</a></p>
<p>Let me know if you have any pointers on this problem.</p>
<p>Thanks,<br />
manju
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: M Shekhar</title>
		<link>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1412</link>
		<pubDate>Mon, 26 Mar 2007 20:36:26 +0000</pubDate>
		<guid>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1412</guid>
					<description>Thanks Alex, that helps.</description>
		<content:encoded><![CDATA[<p>Thanks Alex, that helps.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Alex</title>
		<link>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1404</link>
		<pubDate>Mon, 26 Mar 2007 13:31:14 +0000</pubDate>
		<guid>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1404</guid>
					<description>Regarding error handling, I posted &lt;a href=&quot;http://tech.puredanger.com/2007/02/01/recovering-line-and-column-numbers-in-your-antlr-ast/&quot; rel=&quot;nofollow&quot;&gt; about recovering line and column numbers in an AST&lt;/a&gt; earlier.  Beyond that, I have not done much yet with error recovery in antlr.  

Antlr does support the integration of multiple grammars and even grammar extension through the notion of &quot;vocabularies&quot;.  You might check out the antlr doc on &lt;a href=&quot;http://www.antlr.org/doc/vocab.html&quot; rel=&quot;nofollow&quot;&gt;Vocabularies&lt;/a&gt; for some more info.</description>
		<content:encoded><![CDATA[<p>Regarding error handling, I posted <a href="http://tech.puredanger.com/2007/02/01/recovering-line-and-column-numbers-in-your-antlr-ast/" rel="nofollow"> about recovering line and column numbers in an AST</a> earlier.  Beyond that, I have not done much yet with error recovery in antlr.  </p>
<p>Antlr does support the integration of multiple grammars and even grammar extension through the notion of &#8220;vocabularies&#8221;.  You might check out the antlr doc on <a href="http://www.antlr.org/doc/vocab.html" rel="nofollow">Vocabularies</a> for some more info.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: M Shekhar</title>
		<link>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1394</link>
		<pubDate>Mon, 26 Mar 2007 05:21:39 +0000</pubDate>
		<guid>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1394</guid>
					<description>Hi Alex, I'm glad I brought it to your attention. Thanks for the explanation. Also, I had a couple of requests. Perhaps they might need separate articles of their own. 

Firstly, I was wondering if you could discuss how to properly handle errors and report them (line no., module, etc.) instead of the cryptic 'exception thrown' kind of errors. I've seen a section on error recovery in the Antlr documentation, but it's such a mind-bender :-) I'm still new at all this, so please excuse my ignorance. 

Secondly, what's the best way to parse included modules, e.g. if a script includes another script (a la C includes), I would like to parse 
the included file first. Any links to articles on this subject would be great too.

Thanks again
Manju
-manju</description>
		<content:encoded><![CDATA[<p>Hi Alex, I&#8217;m glad I brought it to your attention. Thanks for the explanation. Also, I had a couple of requests. Perhaps they might need separate articles of their own. </p>
<p>Firstly, I was wondering if you could discuss how to properly handle errors and report them (line no., module, etc.) instead of the cryptic &#8216;exception thrown&#8217; kind of errors. I&#8217;ve seen a section on error recovery in the Antlr documentation, but it&#8217;s such a mind-bender :-) I&#8217;m still new at all this, so please excuse my ignorance. </p>
<p>Secondly, what&#8217;s the best way to parse included modules, e.g. if a script includes another script (a la C includes), I would like to parse<br />
the included file first. Any links to articles on this subject would be great too.</p>
<p>Thanks again<br />
Manju<br />
-manju
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Alex</title>
		<link>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1391</link>
		<pubDate>Mon, 26 Mar 2007 03:57:08 +0000</pubDate>
		<guid>http://tech.puredanger.com/2007/01/15/antlr-2/#comment-1391</guid>
					<description>Hi Manju,

Thanks for pointing this out!  As it turns out, I forgot something critical in the parser definition.  

But first, the reason you're seeing an error with characters other than those in the punctuation section is that these are not valid according to the lexer, so the lexer is throwing an error.  

In the case of the extra semicolon, the lexer does understand the token, so no error is thrown (by the lexer).  If you look at the output of the TestParser program with your example script, you'll notice that it prints only the first block, which was my first big clue to my mistake.  The key is that I didn't tell Antlr to parse the *whole* input - so instead Antlr just consumed input and matched it as asked until it was no longer valid, at which point it just stopped, leaving a slew of tokens unparsed.  

Obviously, this is not what we want.  To fix this, we need to add the pre-defined EOF token to the end of our script rule - this tells Antlr that we expect to match the entire input stream and end with EOF:

	script : (block)* EOF

Then if we run with your script we will see the expected error:

Parsing: test/puredanger/parser/comment.script
line 7:5: expecting EOF, found ';'
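
A rough way to picture the difference, as a toy Python sketch rather than anything Antlr generates: the block syntax (just a pair of braces), the token list, and the parse_script function are all made up for illustration. The only point is what happens to leftover tokens with and without the final EOF check.

```python
# Toy illustration (not Antlr itself): a hand-written parser for
#   script : (block)* EOF
# Assumption for this sketch: a block is just the two tokens "{" "}".

EOF = "EOF"


def parse_script(tokens, require_eof):
    """Return the number of blocks matched; raise SyntaxError on a mismatch.

    The token list is assumed to always end with the EOF marker.
    """
    pos = 0
    blocks = 0
    # (block)* : keep matching blocks while the next token can start one
    while tokens[pos] == "{":
        if tokens[pos + 1] != "}":
            raise SyntaxError("expecting '}', found %r" % tokens[pos + 1])
        pos += 2
        blocks += 1
    # Without this check the parser simply stops here and silently
    # leaves any remaining tokens unparsed.
    if require_eof and tokens[pos] != EOF:
        raise SyntaxError("expecting EOF, found %r" % tokens[pos])
    return blocks


# A stray ';' after the first block, as in the extra-semicolon case
tokens = ["{", "}", ";", "{", "}", EOF]
```

Calling parse_script(tokens, False) quietly reports a single block and ignores the rest of the input, while parse_script(tokens, True) raises a SyntaxError at the stray semicolon.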

Thanks again and I'll update the article appropriately.</description>
		<content:encoded><![CDATA[<p>Hi Manju,</p>
<p>Thanks for pointing this out!  As it turns out, I forgot something critical in the parser definition.  </p>
<p>But first, the reason you&#8217;re seeing an error with characters other than those in the punctuation section is that these are not valid according to the lexer, so the lexer is throwing an error.  </p>
<p>In the case of the extra semicolon, the lexer does understand the token, so no error is thrown (by the lexer).  If you look at the output of the TestParser program with your example script, you&#8217;ll notice that it prints only the first block, which was my first big clue to my mistake.  The key is that I didn&#8217;t tell Antlr to parse the *whole* input - so instead Antlr just consumed input and matched it as asked until it was no longer valid, at which point it just stopped, leaving a slew of tokens unparsed.  </p>
<p>Obviously, this is not what we want.  To fix this, we need to add the pre-defined EOF token to the end of our script rule - this tells Antlr that we expect to match the entire input stream and end with EOF:</p>
<p>	script : (block)* EOF</p>
<p>Then if we run with your script we will see the expected error:</p>
<p>Parsing: test/puredanger/parser/comment.script<br />
line 7:5: expecting EOF, found &#8216;;&#8217;</p>
<p>Thanks again and I&#8217;ll update the article appropriately.
</p>
]]></content:encoded>
				</item>
</channel>
</rss>
