<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>MettaProgramming &#187; xml</title>
	<atom:link href="http://mettadore.com/tag/xml/feed/" rel="self" type="application/rss+xml" />
	<link>http://mettadore.com</link>
	<description>Thoughts on Software and Technology</description>
	<lastBuildDate>Fri, 03 Feb 2012 17:39:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Scala and the mystery of the quotes</title>
		<link>http://mettadore.com/scala/scala-and-the-mystery-of-the-quotes/</link>
		<comments>http://mettadore.com/scala/scala-and-the-mystery-of-the-quotes/#comments</comments>
		<pubDate>Tue, 29 Dec 2009 17:48:49 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Scala]]></category>
		<category><![CDATA[strings]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://mettadore.com/?p=10</guid>
		<description><![CDATA[As a kid, I loved the Hardy Boys mysteries, they always had interesting mysteries and plots that were just simple enough that you could figure them out at just the right time. Sometimes, programming is like a mystery, something happens unexpectedly, you hunt down clues, solve the mystery, and find that it leads to deeper [...]]]></description>
			<content:encoded><![CDATA[<p>As a kid, I loved the Hardy Boys mysteries. They always had interesting plots that were just simple enough that you could figure them out at just the right time. Sometimes programming is like a mystery: something happens unexpectedly, you hunt down clues, solve the mystery, and find that it leads to deeper mysteries.</p>
<p>A while ago, I started exploring Scala and wrote a post called Head Smacking in Scala on <a href="http://positivelyglorious.com">Positively Glorious!</a> (<a href="http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/">Republished here</a> for convenience). Yesterday I found another head smacking problem that taught me a few things.<span id="more-10"></span></p>
<p>Recall that I was creating an XML parser to pull metadata and results from USGS Stream Gauge sites.<sup><a href="http://mettadore.com/scala/scala-and-the-mystery-of-the-quotes/#footnote_0_10" id="identifier_0_10" class="footnote-link footnote-identifier-link" title="See the original post for the code">1</a></sup> Yesterday, I decided to explore some of Scala&#8217;s DBC abilities so that I could populate my local database with the results. I added the following to my application (cribbed from <a href="http://la.scala.la/post/110327928/baby-steps-in-scala-dbc-and-postgresql">La.Scala.La</a>):</p>
<pre>import scala.dbc._
import scala.dbc.Syntax._
import scala.dbc.syntax.Statement._
import java.net.URI

object PgVendor extends Vendor {
  val uri = new URI("jdbc:postgresql://localhost:5432/dbname")
  val user = "dbuser"
  val pass = "dbpass"

  val retainedConnections = 5
  val nativeDriverClass = Class.forName("org.postgresql.Driver")
  val urlProtocolString = "jdbc:postgresql:"
}

object Runner extends Application {
  val db = new Database(PgVendor)

  val rows = db.executeStatement {
    select fields ("code" of characterVarying(20)) from ("site") where ("responsible_profile_id = 4")
  }
  for (val r &lt;- rows;
       val f &lt;- r.fields) {
    try {
      val str = f.content.sqlString
      val Station = new UsgsWaterQualityStationParser(str)
      println(Station.getMetaData)
    }
  }

  db.close
}</pre>
<p>The documentation on the scala.dbc library is woefully lacking, but I managed to pick up some information using a scala wiki detailing the <a href="http://scala.sygneca.com/libs/dbc">seemingly orphaned scala.dbc2 library</a>. It seemed to work, and gave me enough information to build a proper executeStatement.</p>
<p>The problem was that it didn&#8217;t work.</p>
<p>After some Hardy Boy-like investigation, I discovered a very interesting problem that I never expected.<sup><a href="http://mettadore.com/scala/scala-and-the-mystery-of-the-quotes/#footnote_1_10" id="identifier_1_10" class="footnote-link footnote-identifier-link" title="That&amp;#8217;s a great euphemism for &amp;#8220;I screwed up and created a bug&amp;#8221;   ">2</a></sup> It&#8217;s related to the sqlString&#8217;s return value, and has something to do with how Scala deals with double vs. single quotes.</p>
<p>Line 25 of the code shows the creation of a value using sqlString(), which, according to what documentation I can find, returns a Java String object. Because UsgsWaterQualityStationParser takes a String in the constructor, everything should work out. The problem is that the string returned from sqlString() contains single quotes. Thus, the URL that gets sent to the USGS website is something like:</p>
<pre>http://qwwebservices.usgs.gov/Station/search?siteid=USGS-'12345678'</pre>
<p>instead of what it should properly be, which is:</p>
<pre>http://qwwebservices.usgs.gov/Station/search?siteid=USGS-12345678</pre>
<p>It&#8217;s a bit strange to me that sqlString would return a String value containing quotes, but I expect it&#8217;s because of something that I&#8217;m misunderstanding about the scala.dbc library. </p>
<p>As a bit of a hack, I just used String&#8217;s replace method:</p>
<pre>    try {
      val str = f.content.sqlString.replace("'","")
      val Station = new UsgsWaterQualityStationParser(str)
      println(Station.getMetaData)
    }</pre>
<p>Which worked. Running this<sup><a href="http://mettadore.com/scala/scala-and-the-mystery-of-the-quotes/#footnote_2_10" id="identifier_2_10" class="footnote-link footnote-identifier-link" title="Not in full, since I didn&amp;#8217;t want to create a DOS attack on the USGS&amp;#8217;s REST server with 21,000 requests!">3</a></sup> I was able to print out the metadata for the sites that I have in our database that are under the responsibility of the USGS. This solved the problem of <em>getting</em> the data, even if it left open the mystery of the quotes for a later time.</p>
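<p>One caveat with the hack above: replace("'","") removes every single quote, including any that belong to the value itself. A slightly narrower sketch, assuming the value always comes back wrapped in exactly one pair of single quotes, strips only that outer pair. The helper name stripSqlQuotes is made up for illustration and is not part of scala.dbc:</p>
<pre>object QuoteStripper {
  // Hypothetical helper: drops one leading and one trailing single quote
  // if both are present, and leaves any other string untouched.
  def stripSqlQuotes(s: String): String =
    if (s.length &gt;= 2 &amp;&amp; s.startsWith("'") &amp;&amp; s.endsWith("'"))
      s.substring(1, s.length - 1)
    else s

  def main(args: Array[String]): Unit = {
    println(stripSqlQuotes("'12345678'")) // prints 12345678
    println(stripSqlQuotes("12345678"))   // prints 12345678
  }
}</pre>
<p>In the loop, that would mean calling stripSqlQuotes(f.content.sqlString) instead of chaining replace, so a site code that happened to contain an apostrophe would survive intact.</p>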
<p>All that&#8217;s left now is to write the query so that only sites without metadata are selected (allowing me to do, say, 50 sites at a time to limit server requests), and to create insert statements to load the metadata into my database. The result: all the information I could want about each USGS site, in the USGS&#8217;s own format.</p>
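<p>The 50-at-a-time idea could also be handled on the Scala side rather than in the query itself. This is only a rough sketch of that alternative, assuming the site codes are already in memory as a list; the codes below are made up and stand in for whatever the database returns:</p>
<pre>object Batching {
  def main(args: Array[String]): Unit = {
    // Made-up site codes standing in for the rows returned by the query.
    val siteCodes = (1 to 120).map(n =&gt; "SITE" + n).toList

    // grouped(50) yields consecutive chunks of at most 50 elements.
    val batches = siteCodes.grouped(50).toList

    for (batch &lt;- batches)
      println("would request " + batch.size + " sites")
  }
}</pre>
<p>With 120 codes this gives three batches of 50, 50 and 20, and each batch could be processed in a separate run to keep the load on the remote server low.</p>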
<h3>Coda</h3>
<p>Despite the learning curve, I&#8217;m still having fun with Scala. The issues I have with type safety (I&#8217;m a Pythonic Duck Typer) and trying to wrap my head around Scala&#8217;s syntax and Java connectivity are a small price to pay for the power, ease and flexibility of the language.</p>
<ol class="footnotes"><li id="footnote_0_10" class="footnote">See the <a href="http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/">original post</a> for the code</li><li id="footnote_1_10" class="footnote">That&#8217;s a great euphemism for &#8220;I screwed up and created a bug&#8221; <img src='http://mettadore.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  </li><li id="footnote_2_10" class="footnote">Not in full, since I didn&#8217;t want to create a DOS attack on the USGS&#8217;s REST server with 21,000 requests!</li></ol>]]></content:encoded>
			<wfw:commentRss>http://mettadore.com/scala/scala-and-the-mystery-of-the-quotes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Head smacking in Scala: XML Parsing</title>
		<link>http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/</link>
		<comments>http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/#comments</comments>
		<pubDate>Tue, 29 Dec 2009 15:47:01 +0000</pubDate>
		<dc:creator>john</dc:creator>
				<category><![CDATA[Scala]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://mettadore.com/?p=13</guid>
		<description><![CDATA[I program in a lot of different languages, everything from C and C++ to Awk and Sed, Visual Basic and ASP to PHP and Javascript. I&#8217;m a bit of a jack-of-all-trades when it comes to languages, but the main one for the past 10 or so years has been Python. Python is the language that [...]]]></description>
			<content:encoded><![CDATA[<p>I program in a lot of different languages, everything from C and C++ to Awk and Sed, Visual Basic and ASP to PHP and JavaScript. I&#8217;m a bit of a jack-of-all-trades when it comes to languages, but the main one for the past 10 or so years has been <a href="http://python.org">Python</a>. Python is the language that I automatically turn to when I say &#8220;I need to do ${X},&#8221; where X is any given task that does not require a UML diagram and user case studies. It&#8217;s fast, it&#8217;s powerful, and it&#8217;s about as comfortable as an old shoe.</p>
<p>Lately, many of my projects&#8211; including <a href="http://hydra.si">my really really big one</a>&#8211; have been in Java. Since I haven&#8217;t programmed in Java since about 1998 (about when I picked up Python, notably) it&#8217;s been a hard road. Java has become a harsh mistress. That sweet young thing that was so easy going and flexible so many years ago has grown up to be a cynical, hard-edged woman with a riding crop in her hand.</p>
<p>At least, that&#8217;s been <em>my</em> recent experience.<span id="more-13"></span></p>
<p>Still, it&#8217;s not all been bad. One thing that&#8217;s great about it is that I&#8217;ve discovered the <a href="http://scala-lang.org/">Scala programming language</a>.</p>
<p>If Python and Java got together and had a son, and then if Haskell and Ruby got together and had a daughter, and then that son and daughter got together and had a baby, that baby would be named Scala.</p>
<p>Scala is a language for the Java Virtual Machine that is powerful, yet fast. It&#8217;s super OOP-centric, yet still has a foot planted firmly in functional programming land. It&#8217;s simple in its syntax, yet able to use any Java classes natively, and it compiles to the same bytecode that Java does.</p>
<p>Scala is everything I remember Java promising back in the mid-90s. It&#8217;s basically a language that a Java-whipped Python programmer could only dream of, yet it&#8217;s real.</p>
<p>Still, as when learning <em>any</em> language, there are times when I smack my head. Today&#8217;s head smacker illustrates why it&#8217;s important to not make assumptions when programming in an unfamiliar language.</p>
<h3>My XML Parser</h3>
<p>Today, I was building an XML parser that would grab water quality station information from the websites of organizations like the Army Corps and USGS. This is normally something that I&#8217;d do in Python&#8211; I&#8217;d whip it out real fast and then forget about it&#8211; but I thought this would be another good chance to play with Scala.</p>
<p>It&#8217;s a very simple language;<sup><a href="http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/#footnote_0_13" id="identifier_0_13" class="footnote-link footnote-identifier-link" title="and this example was made simpler by the fact that Raphael Ferreira already built up the code to parse Amazon&amp;#8217;s website">1</a></sup> my HTTP request client consists of only the following:<sup><a href="http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/#footnote_1_13" id="identifier_1_13" class="footnote-link footnote-identifier-link" title="Code here, except where otherwise licensed, is licensed under the MIT license.">2</a></sup></p>
<pre>import java.io.InputStream;
import java.net.URL;

object Http {
  def request(urlString:String): (Boolean, InputStream) =
    try {
      val url = new URL(urlString)
      val body = url.openStream
      (true, body)
    }
    catch {
      case ex:Exception =&gt; (false, null)
   }
}</pre>
<p>That&#8217;s all that&#8217;s needed to make the HTTP request to a server. As you can see, we can pull in the Java classes for InputStream and URL and use them natively. That&#8217;s quite nice.</p>
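<p>Returning (false, null) throws away the reason a request failed. As a possible alternative, and not what the code in this post does, here is a small sketch using scala.util.Try, which only exists in Scala versions newer than the one this post was written against. The object name SafeHttp is made up for illustration:</p>
<pre>import java.io.InputStream
import java.net.URL
import scala.util.{Try, Success, Failure}

object SafeHttp {
  // Wraps the same two calls in Try, so a failure carries its exception
  // instead of being flattened to (false, null).
  def request(urlString: String): Try[InputStream] =
    Try(new URL(urlString).openStream())

  def main(args: Array[String]): Unit =
    request("not a url") match {
      case Success(body) =&gt; println("got a stream"); body.close()
      case Failure(ex)   =&gt; println("request failed: " + ex.getClass.getSimpleName)
    }
}</pre>
<p>The string "not a url" has no protocol, so new URL throws a MalformedURLException and the Failure branch runs without any network access.</p>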
<p>Scala&#8217;s design is a bit strange to me. The fact that you can&#8217;t have static methods in a class (they go in <em>objects</em>) is a little head scratching sometimes. Also, if you want both static and instance methods, you get them by giving a class and an object the same name. Furthermore, you can have traits, which are somewhat like Java interfaces, mixed into either one. Thus, you could have a class, an object, and a trait that are completely separate definitions, all of which function both independently and together. Talk about head scratching.</p>
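<p>To make that concrete, here is a minimal sketch of a class, an object of the same name, and a trait working together. All of the names (Describable, Station, CompanionDemo) are made up for illustration and do not come from the parser code:</p>
<pre>// A trait, roughly like a Java interface that may also carry implementations.
trait Describable {
  def describe: String
}

// The class holds the instance members...
class Station(val id: String) extends Describable {
  def describe: String = Station.prefix + id
}

// ...and the object of the same name holds what Java would make static.
object Station {
  val prefix = "USGS-"
  def apply(id: String): Station = new Station(id)
}

object CompanionDemo {
  def main(args: Array[String]): Unit =
    println(Station("12345678").describe) // prints USGS-12345678
}</pre>
<p>Station("12345678") calls apply on the object, which builds an instance of the class; the instance then reads prefix back from the object. The class and the object have to be defined in the same file for this pairing to work.</p>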
<p>While trying to get used to all this, I defined a base parser as a trait, because I&#8217;ll likely be creating parsers for a lot of different types of data sites.</p>
<pre>trait BaseParser {
  def baseUrl:String = "http://"
  def fetchAndParseURL(URL:String) {
      val (true, body) = Http request(URL)
      val xml = XML.load(body)
      xml
   }
  def fetchAndParseQuery(query:String) = fetchAndParseURL(baseUrl + query)
}</pre>
<p>Here is the skeleton of basic functionality for actually grabbing and parsing the XML of a website.<sup><a href="http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/#footnote_2_13" id="identifier_2_13" class="footnote-link footnote-identifier-link" title="yes, fellow geeks, I know that there are things I could do better here. This is a quick skeleton to get used to the language, not a final product that needs to be judged. Save that for when I can actually make something work">3</a></sup> fetchAndParseURL() is the basic method, which grabs a URL and parses the XML (error checking and unit tests stripped here). fetchAndParseQuery() is a way to generically modify the base URL, with specific modifications to be made in the class.</p>
<p>There are some interesting things in how Scala defines methods. The biggest one is the lack of return statements. Scala returns the value of the last expression in a method. That&#8217;s a bit like magic, sometimes, and thus somewhat scary. Another one is the tendency to use equal signs and not require parentheses (as in baseUrl, which is a method). Another nice thing to note here is that function definitions can be written on one line. It makes some class definitions quite small:</p>
<pre>class USGSStation(siteID:String) extends Application with BaseParser {
  override def baseUrl = "http://qwwebservices.usgs.gov/Station/search?siteid=USGS-"
  def GetMetaData() { this.fetchAndParseURL(baseUrl + siteID) }
}</pre>
<p>So, here&#8217;s my first parser class. It takes a USGS site ID as a string, and overrides baseUrl with the URL of the &#8220;station identification&#8221; REST query. It&#8217;s as simple and fast as Python, which blows my mind.</p>
<p>But it doesn&#8217;t work.</p>
<h3>Smacking Your Head With A Unit</h3>
<p>So, I spent a long time trying to figure out why this wouldn&#8217;t work. I mean, it compiled, and it ran, and it returned something. It just didn&#8217;t return what I wanted. I kept getting back an empty value of the type &#8220;Unit.&#8221;</p>
<p>Well, it turns out that this is the &#8220;don&#8217;t make assumptions&#8221; part of learning a new language. You see, in every other language I know (except JavaScript, which was designed explicitly to torture terrorist suspect detainees), methods are defined with a signature followed by the definition. That&#8217;s it. Signature, definition, done.</p>
<p>Scala is different. I thought that the equal sign was a clever way of making one line functions. No, it turns out that it&#8217;s a necessary part of defining a method&#8211; at least if you want it to return anything.</p>
<p>Strangely, Scala doesn&#8217;t break when you mis-define a method, it just returns Unit&#8211; which, as much as I can tell, is the number <a href="http://en.wikipedia.org/wiki/42_(number)#In_The_Hitchhiker.27s_Guide_to_the_Galaxy">42</a>. Once I figured out that my methods needed equal signs, everything worked as expected.</p>
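<p>To see the difference in isolation, here is a minimal sketch. The brace-only form def f(x: Int) { ... } is shorthand for declaring a Unit result type, so the example spells that out explicitly (newer Scala versions deprecate the shorthand). The method names are made up for illustration:</p>
<pre>object UnitDemo {
  // Same meaning as the brace-only form: def withoutEquals(x: Int) { x * 2 }
  // The product is computed and then discarded, and the method returns Unit.
  def withoutEquals(x: Int): Unit = { x * 2 }

  // With the equal sign and no declared type, the result type is inferred
  // from the last expression, here Int.
  def withEquals(x: Int) = { x * 2 }

  def main(args: Array[String]): Unit = {
    println(withoutEquals(21)) // prints ()
    println(withEquals(21))    // prints 42
  }
}</pre>
<p>The compiler accepts both definitions, which is why the original parser compiled and ran while quietly handing back Unit instead of the XML.</p>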
<p>Thus, here is the final XML grabber for the site metadata:</p>
<pre>import scala.xml.{Elem, XML}
import java.io.InputStream
import java.net.URL

object Http {
   // Returns (true, stream) on success, or (false, null) if the request fails.
   def request(urlString:String): (Boolean, InputStream) =
      try {
         val url = new URL(urlString)
         val body = url.openStream
         (true, body)
      }
      catch {
         case ex:Exception =&gt; (false, null)
      }
}

trait BaseParser {
  def baseUrl:String = "http://"
  // Note the equal sign: without it, this method returns Unit instead of the XML.
  def fetchAndParseURL(URL:String) = {
      // This pattern only matches a successful request; a failed one throws a MatchError.
      val (true, body) = Http request(URL)
      val xml = XML.load(body)
      xml
   }
  def fetchAndParseQuery(query:String) = fetchAndParseURL(baseUrl + query)
}

class USGSStation(siteID:String) extends Application with BaseParser {
  override def baseUrl = "http://qwwebservices.usgs.gov/Station/search?siteid=USGS-"
  // Returns the parsed XML for this site.
  def GetMetaData() = this.fetchAndParseURL(baseUrl + siteID)
}</pre>
<p>…which works correctly. It grabs the site&#8217;s metadata as the full XML object, which I can then parse for elements such as county code, name, latitude/longitude and sensor type. I&#8217;m using this, in conjunction with a bunch of other as-yet-poorly-written-code to pull both sites and site data from their pseudo-REST interface.</p>
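<p>As a rough idea of what pulling elements out of that XML object can look like, here is a sketch against a made-up document. The element names below are invented for illustration and are not the real USGS schema, and on newer Scala versions scala.xml is a separate library that has to be on the classpath:</p>
<pre>import scala.xml.XML

object MetaDataDemo {
  // A made-up document; the element names are NOT the real USGS schema.
  val sample =
    "&lt;Station&gt;" +
    "&lt;Name&gt;Example Creek&lt;/Name&gt;" +
    "&lt;CountyCode&gt;041&lt;/CountyCode&gt;" +
    "&lt;Latitude&gt;45.5&lt;/Latitude&gt;" +
    "&lt;/Station&gt;"

  def main(args: Array[String]): Unit = {
    val doc = XML.loadString(sample)
    // \ selects direct children by element name; .text gives their text content.
    println((doc \ "Name").text)
    println((doc \ "CountyCode").text)
    println((doc \ "Latitude").text.toDouble)
  }
}</pre>
<p>The same selectors work on the value returned by GetMetaData(), since XML.load and XML.loadString both produce an Elem.</p>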
<h3>Coda</h3>
<p>So far, I&#8217;m really impressed. Scala is a great language. The total time I spent on this (not counting my stupid missing-equal-sign problem) was not much more than what I&#8217;d spend on a Python version, and that&#8217;s without me being familiar with the language. More than that, we can compile it to Java bytecode and use it within our larger infrastructure without resorting to something like Jython or another solution.</p>
<ol class="footnotes"><li id="footnote_0_13" class="footnote">and this example was made simpler by the fact that <a href="http://blogs.sun.com/rafaelferreira/entry/pragmatic_scala">Raphael Ferreira</a> already built up the code to parse Amazon&#8217;s website</li><li id="footnote_1_13" class="footnote">Code here, except where otherwise licensed, is licensed under the <a href="http://www.opensource.org/licenses/mit-license.php">MIT</a> license.</li><li id="footnote_2_13" class="footnote">yes, fellow geeks, I know that there are things I could do better here. This is a quick skeleton to get used to the language, not a final product that needs to be judged. Save that for when I can actually make something work</li></ol>]]></content:encoded>
			<wfw:commentRss>http://mettadore.com/scala/head-smacking-in-scala-xml-parsing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

