The case: RPC vs. Messaging

December 17, 2009 by kamstrup

There’s a classical flamefest discussion about Remote Procedure Calls (RPC) vs. Messaging. People much brighter than me have discussed it elsewhere, but that doesn’t stop me from throwing in my 2 cents. It appears to me that there is a whole crowd of people still refusing to realize why RPC is so bad.

Before I get too deep into this let’s get the terms RPC and Messaging better defined. I won’t claim that I have the “correct” definitions, but here’s what I mean when I use those terms: RPC is a mechanism that allows you to call methods on remote services as though they were methods on a local object. In pseudo code:

calc = lookup_calculator_service("127.0.0.1", 8080)
four = calc.add(2, 2)
eight = calc.multiply(2, 4)
print ("Result of (2+2)+(2x4) = " + calc.add(four, eight))

For Messaging, think of it like email, not between people but between different apps on different machines. A message is typically some container-like format with some extra metadata naming the sender and the recipient(s), maybe timestamps and serial numbers. All you can do in a messaging system is basically to send a message to a particular address. Whether or when the resident at that address responds is not possible to determine – just like email in that sense. For a large scale example of a messaging system we have the internet itself. The very much hyped REST interactions of online services are also an example where messaging is starting to show success.
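To make the "container plus metadata" idea concrete, here is a minimal sketch of a message envelope in Java. The class name Envelope and all of its fields are made up for illustration; real messaging systems define their own formats.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical message envelope: routing metadata plus an opaque payload. */
final class Envelope {
    final String sender;
    final String recipient;
    final long serial;          // lets the receiver detect duplicates or gaps
    final long timestampMillis; // when the sender created the message
    final Map<String, Object> payload;

    Envelope(String sender, String recipient, long serial,
             long timestampMillis, Map<String, Object> payload) {
        this.sender = sender;
        this.recipient = recipient;
        this.serial = serial;
        this.timestampMillis = timestampMillis;
        // Defensive copy so the message cannot change after it is "sent".
        this.payload = Collections.unmodifiableMap(new LinkedHashMap<>(payload));
    }

    /** A reply is just another message travelling in the opposite direction. */
    Envelope reply(long serial, long timestampMillis, Map<String, Object> payload) {
        return new Envelope(recipient, sender, serial, timestampMillis, payload);
    }
}
```

Note that nothing here forces the recipient to answer, or to answer soon. A reply, if it ever comes, is simply a second message.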

Back to the RPC example above – it’s very convenient and easy to work with, right? If this example is really all you need to do, then I tend to agree that this kind of RPC is fine. But what happens if you are writing a mission critical system where data integrity is paramount, you have lots of interconnected services, and you need low latency and high throughput? Let’s examine the situation a bit…

The server might be implemented as:

function add (num1, num2) {
    return num1 + num2
}

The RPC system would then wrap the server object and expose some predefined methods as remote methods. It magically parses incoming calls and delegates control to my server’s add() function, giving it the right arguments.
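The "magic" is less mysterious than it sounds. Below is a toy sketch in Java of what the server half of an RPC system does: decode a call, look up an exposed method, delegate. The class name CalcDispatcher and the "name arg arg" wire format are assumptions made up for this illustration, not how any particular RPC library works.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

/** Toy stand-in for the server half of an RPC system (all names hypothetical). */
final class CalcDispatcher {
    private final Map<String, IntBinaryOperator> exposed = new HashMap<>();

    CalcDispatcher() {
        // Only methods registered here are reachable from remote callers.
        exposed.put("add", (a, b) -> a + b);
        exposed.put("multiply", (a, b) -> a * b);
    }

    /** Decodes a call such as "add 2 2" and delegates to the local function. */
    int dispatch(String wireCall) {
        String[] parts = wireCall.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed call: " + wireCall);
        }
        IntBinaryOperator method = exposed.get(parts[0]);
        if (method == null) {
            throw new IllegalArgumentException("No such remote method: " + parts[0]);
        }
        return method.applyAsInt(Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }
}
```

With this sketch, new CalcDispatcher().dispatch("add 2 2") delegates to the local add function, much like the RPC layer would for calc.add(2, 2).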

Problems of RPC

What happens in line 2 of the client code above if calc.add(2,2) causes the calculator service to run out of memory? Some RPC systems like Java RMI have the “feature” of sending you the raw exceptions as they happen on the server directly. In case of an OutOfMemoryError (OOM) the exception would completely escape the server’s logging or critical error handling and be sent to the caller. Our calculator client then gets an OOM without the slightest chance of figuring out whether it is itself OOM or the server is OOM. And all the while the client thinks it is OOM, and might crash, the server which is really OOM happily chugs along down whatever path of complete failure lies ahead of it.

This can be solved partially if the client wraps all remote calls in try/catch clauses catching the most general type of error the runtime has. In Java this would be Throwable. Also the server needs to wrap all of its remotely available methods in try/catch in order to shut down nicely (or protect itself in some way) in case of OOM or other critical errors. So our previous example now becomes:

calc = lookup_calculator_service("127.0.0.1", 8080)
try {
    four = calc.add(2, 2)
} catch (Throwable t) {
   log.warn("Error adding numbers!")
   return
}
... Nah... I am pretty sure you don't want to read the rest of the try/catch hell

As you of course realize this can all be solved by thorough exception handling in both clients and servers. It won’t be fun, but it can be done. Let’s call this problem the Non-Local Exceptions Problem.
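For the server half of that exception handling, one possible shape is sketched below in Java: failures are logged where they happen and the caller only ever receives a plain reply value saying that the server failed. The names CalcReply and GuardedServer are hypothetical and the sketch is not tied to any RPC library.

```java
import java.util.function.IntSupplier;

/** Hypothetical reply type: either a value or an error text, never a raw Throwable. */
final class CalcReply {
    final boolean ok;
    final int value;
    final String error;

    private CalcReply(boolean ok, int value, String error) {
        this.ok = ok;
        this.value = value;
        this.error = error;
    }

    static CalcReply success(int value) { return new CalcReply(true, value, null); }
    static CalcReply failure(String error) { return new CalcReply(false, 0, error); }
}

final class GuardedServer {
    /**
     * Runs one remote-facing operation. Only RuntimeExceptions are caught in
     * this sketch; what to do on fatal Errors such as OutOfMemoryError
     * (usually: log and shut down) is a separate policy decision.
     */
    static CalcReply handle(IntSupplier operation) {
        try {
            return CalcReply.success(operation.getAsInt());
        } catch (RuntimeException e) {
            // Log locally, then tell the caller only that the *server* failed.
            System.err.println("calc failed: " + e);
            return CalcReply.failure("server error: " + e.getClass().getSimpleName());
        }
    }
}
```

The point of the design is that the client can no longer confuse a server-side failure with one of its own: it gets a reply whose error field starts with "server error".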

The next problem inherent in RPC could be called the Indirect Memory Allocation Problem. This problem arises anywhere you accept a data structure of an arbitrary size in your methods’ arguments, e.g. an array. Suppose I change my calculator server’s API to be more flexible, so that the add() method takes an array of numbers to add, like calc.add([1,2,3,4,5,6]) = 21. Now what happens if a client sends me an array with 10^9 numbers to add? If we assume that a number is 4 bytes, then the RPC system on the server will try to allocate 4*10^9 bytes = 4GB for the array before passing control into add(). This will likely cause the server to OOM before even reaching into my method.

To handle the indirect memory allocation problem I must either be able to ensure that my RPC system will not allow clients to send such huge arguments, or be able to parse the arguments in some streaming manner on the server side – but the latter does not sound a lot like RPC does it?

And note that the indirect memory allocation problem is not only on the server side. The server may also return a huge data structure as a method response, so the client needs to guard against this too.
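Both guards can be combined when the receiver reads the numbers itself instead of being handed a ready-made array. The Java sketch below sums 4-byte integers straight off an InputStream and enforces a cap; the class name StreamingAdder, the wire format (big-endian ints back to back) and the maxCount parameter are all assumptions for the sake of the example.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class StreamingAdder {
    /**
     * Sums 4-byte big-endian ints from the stream one at a time, so memory use
     * stays constant no matter how many numbers the sender transmits.
     * maxCount is a hypothetical receiver-side limit on the request size.
     */
    static long sum(InputStream in, long maxCount) throws IOException {
        DataInputStream data = new DataInputStream(in);
        long total = 0;
        long seen = 0;
        while (true) {
            int next;
            try {
                next = data.readInt();
            } catch (EOFException endOfInput) {
                // End of stream (a trailing partial int is treated the same way).
                return total;
            }
            if (++seen > maxCount) {
                throw new IOException("Request exceeds limit of " + maxCount + " numbers");
            }
            total += next;
        }
    }
}
```

A sender pushing 10^9 numbers at this method costs the receiver a few bytes of state, and it is cut off as soon as it passes the limit.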

Next up on the list of problems is the Blocking Calls Problem. When the client calls the server it issues a request over the network and really has no way to anticipate when that call returns. While it waits it blocks the thread from which it is calling (or at least all RPC systems I know do this). So if you want to do concurrent calls you’d have one thread per call in progress. If you’ve never seen an app go belly up because of thread starvation I bet you’ve never programmed multi threaded production systems. Blocking calls make your system more fragile and also much more affected by network latency.
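One common way to keep the calling thread free is to hand back a future instead of a value. The Java sketch below uses CompletableFuture with a small fixed thread pool; AsyncCalcClient and remoteAdd are hypothetical names, and remoteAdd merely stands in for a real network round trip.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class AsyncCalcClient {
    // A small fixed pool bounds the thread count regardless of calls in flight.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    /** Stand-in for a slow network round trip; a real client would do I/O here. */
    private static int remoteAdd(int a, int b) {
        return a + b;
    }

    /** Returns immediately; the result arrives later through the future. */
    CompletableFuture<Integer> add(int a, int b) {
        return CompletableFuture.supplyAsync(() -> remoteAdd(a, b), pool);
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Two calls can then be in flight at once and combined when both are done, for example client.add(2, 2).thenCombine(client.add(2, 4), Integer::sum). Be aware that in this sketch a pool thread still blocks per outstanding call; the pool only caps the damage. Getting rid of the blocking altogether takes non-blocking I/O underneath.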

Skipping on to the next problem, this one particularly strikes strongly typed programming languages (like Java, which we use a lot here at the State and University Library of Denmark). Let’s call it the Static Interface Problem. In a strongly typed language you need to be able to resolve the method signatures at compile time (that or use varargs signatures everywhere – eeeks!). In order to do this one frequently hand writes or autogenerates some interface or stub classes. If the remote API changes your app is likely to crash or simply not run at all – the interface classes need to be regenerated and your code recompiled against these new interfaces. If you are a purist you might say that such public interfaces should never change and that I must surely be a slacker since I even bring this up, but the sad fact of the matter is that in real life you cannot control the entire world and interfaces do change.
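A message based receiver can sidestep much of this by reading only the fields it knows about. Here is a minimal Java sketch, assuming the message fields arrive as a map; the class name TolerantCalc and the optional field "c" are invented for the example.

```java
import java.util.Map;

final class TolerantCalc {
    /**
     * Handles an "add" message whose fields arrive as a map.
     * Unknown keys are ignored and "c" (a hypothetical field added in a later
     * API version) is optional, so old and new clients both keep working.
     */
    static int handleAdd(Map<String, Integer> fields) {
        Integer a = fields.get("a");
        Integer b = fields.get("b");
        if (a == null || b == null) {
            throw new IllegalArgumentException("Fields 'a' and 'b' are required");
        }
        Integer c = fields.get("c");
        return a + b + (c == null ? 0 : c);
    }
}
```

Nothing needs to be regenerated or recompiled when a sender starts including extra fields. The price is that mistakes which a compiler would have caught now surface at run time.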

Looking back, RPC has:

  • Non-Local Exceptions
  • Indirect Memory Allocation
  • Blocking Calls
  • Static Interfaces (in strongly typed languages)

The way these problems are solved in an RPC context is typically to write a CalcClient class which does the needed client side magic (catching exceptions, delegating work to an async thread, hiding the remote interface declaration etc.) and then pass a bunch of HashMaps or parameterized Value types around with each method, where you can stuff any arguments you need to add to the interface in a backwards compatible way. The only thing that is nearly impossible to tackle is the indirect memory allocation problem.

Enter Messaging. Messaging solves all of the above problems in one fell swoop, and if you decide to use a standard, like HTTP, for the connections then you can even talk to your messaging services via your browser or standard Unix command line tools like wget or curl.

Tooting my Own Horn

The above list of problems is not just pulled out of my hat. We have seen, and fought, them all in Summa. To start moving down the messaging road I started the no-nonsense Juglr project on Gitorious. It’s still far from ready but it’s coming along nicely. In a nutshell it is an Actor model implementation coupled with a JSON/HTTP high performance messaging system. In order not to reinvent the wheel too much I am basing the actor model implementation on Doug Lea’s Fork/Join framework that is also scheduled for inclusion in Java 7.

Real life examples

A non-complete list of the RPC systems I’ve crossed paths with:

  • Java RMI
  • SOAP
  • CORBA

A ditto list of Messaging systems:

  • HTTP and Email
  • REST(ful) web services
  • DBus
  • Protocol Buffers

The last two, DBus and Protobufs, deserve an extra note. When you get down to the protocol level these two systems are indeed both messaging systems, but they are most often used as RPC systems! I am honestly not sure why it is so, but it’s probably because it is (deceptively) easier to get started with an RPC based approach.

Solid Toys for the Boys

December 8, 2009 by kamstrup

As some may know we have experimented quite a bit with Lucene indexes on Solid State Drives and we’ve had very good experiences with it, seeing huge performance gains. Since we are also routinely running big applications and other heavy duty tasks on our desktop machines our dear Toke had the idea that we should all have SSDs in our desktops. After a good deal of shopping about he settled on the Kingston v 40GB drive as research revealed that this exact model had the good Intel metal inside (this is e.g. not the case for the 64GB model).

Yesterday we got the delivery and immediately started unpacking and upgrading our machines. And boy were these babies worth every penny! :-)

(sorry for the ugly scaling of the following images – WordPress is killing me)

Toke was the Super delivery boy

Quick - get them before they are gone!

Yours truly is a Super happy camper

Super tag team getting their hands dirty

Firstly we did clean installations of Ubuntu, with a 10GB root partition, a ~26GB /home partition and ~4GB swap. Root and /home were formatted with Ext4, all on the SSD. The time?

  • Installing Ubuntu Karmic 64 bit from USB stick: 4 minutes (with ~1 minute waiting for network on a slow repository)

The next thing was the boot… While we were rebooting from the install-session we talked about how fast the boot was going to be. But in the talking we almost didn’t react before the reboot was back up to the login screen. Wow. As we didn’t have a timer with sub-minute resolution at hand we can only give you subjective numbers. Among the spectators the opinions range from “negative time” to “5s” to “10s”. My personal estimates are:

  • Boot from GRUB to GDM login screen: 5s
  • From login screen to working GNOME desktop: 4s

This is pretty darn fast I tell you :-)

In general application launching is also noticeably faster. Especially so for applications with lots of IO, like the Evolution mail reader or our development environment IntelliJ Idea. Compiling the Summa project is also a heavily IO bound process. The result:

  • Compiling Summa from scratch with cold disk caches: With conventional drives ~6 minutes. With our new SSDs 2.5 minutes. That’s a speedup of a factor ~2.4.

As you might have guessed by now – we like SSDs – a lot!

IntelliJ Idea Open Sourced

October 16, 2009 by kamstrup

Wow, I must admit that the latest news from JetBrains takes me quite by surprise! But what a sweet surprise it is!

Scala and Git support out of the box you say? This is more than welcome – now the next generation development experience is enabled out of the box.

I can’t help but wonder why they did it though? Growing pressure from Netbeans and Eclipse? I’ve always thought that Idea was the best of the three – thus expecting it to generate a fine revenue? Perhaps not – or perhaps JetBrains had a sudden fit of philanthropy? Or perhaps open source is just a superior development model. No matter the true motivation I am pretty hyped about this :-)

Searching in the dark

September 25, 2009 by eskildsen

As part of our obligation to preserve our online cultural heritage, Statsbiblioteket and Det Kongelige Bibliotek in Denmark continuously harvest the Danish web (the *.dk-domains), digitize public Danish television, rip all Danish-produced music and generally just collect whatever we can get our hands on. The terabytes add up (120TB for the web pages so far, more for television, radio and so on) and the machines are happily harvesting, ripping and wolfing down the bytes into semi-safe storage (2 geographically and architecturally different setups, checksummed, re-checksummed etc.). All fine and dandy.

Except that access to most of the material is rather limited and that search is … well, pretty much non-existing.

Such things tend to change over time, preluded by meetings, committees, deals and whatnot. As technicians, we are normally not directly involved in all the politics surrounding this, but in order to get some concrete arguments, we were asked to try and index some of the harvested web material and do a search demo, where web material was presented together with our normal material (books, cds, articles et al).

The harvested web material is stored in ARC-files, so the obvious choice for a quick test was NutchWAX. Setup was easy, some 100 million documents were indexed (about 2% of the harvested web material) and searches were sub-second on a modest machine. A great success in terms of answering the “is it even feasible to do this?“-question.

The “but does it make sense to do integrated search for such different data sources as web and library books?“-question could not be answered by this, so naturally we had to hack something together with Summa, our precious hammer. Due to other highly-prioritized assignments, we only had about a week to get it to work, so corners were cut where possible. Using the ARC-reader from Heritrix and the Tika-toolkit for analyzing the wealth of different data, the aptly named Arctika was born. Arctika handled the web stuff and an aggregator handled the integration with our standard library index.

It could use a lot more work, but it worked surprisingly well for a quick hack. We were able to demonstrate everything we wanted: The integrated search made sense, the ranking generally pulled the good stuff to the top (admittedly, tweaking the ranking for different sources would surely be needed for a real application) and the faceting system clearly helped give an overview of material types & sources and provided an easy means to do temporal navigation in the search-result: Limiting searches to a specific period of time is quite usable for investigating the media handling of major events.

So what’s the dark part? Well, legislation. As always. That and money. Harvested web material is sensitive, only legally accessible for the selected few professors. On top of that, showing snippets from harvested web pages seems – at the moment – to require compensating the content owners, according to EU-law. Opening up for all the material at once will probably not happen in the foreseeable future.

Happily we don’t need to do everything at once. If we limit the publicly accessible index to websites from the government and companies, it should be legal to show the search-results and the stored versions (hello continuity). Add the recorded television and radio to the mix, pour in scanned newspapers, integrate with old-school books and presto, we have something great. Danish culture at our fingertips, past and present.

Dreaming, I know. But on the technical level, we just need the green light from the bigwigs to make this happen.

A screenshot, you say? Why, yes, of course. We present this super-cool bling bling interface with a stupendously large amount of interesting information to you. Slightly marred by the need to censor out some sensitive information and the fact that indexing time was capped at half a day to make the deadline.

Sample search in Arctika

Brilliant, guys!

September 20, 2009 by eskildsen

Cancel Undo

Our fine usability guys hard at work.

An Excursion in Java Recursion

September 4, 2009 by kamstrup

A quick Googling defines Excursion as: “a journey taken for pleasure”. Considering what I am about to blog about the title of this blog post might be a bit misleading, but you gotta give me one for the rhyme ;-)

As you might or might not know, doing recursion in Java is simply a bad thing. This is mainly because Java can’t do tail recursion. You can use recursion in Java if you are absolutely positive that you are only going to do a very limited number of recursive calls. If you could possibly go over 100 calls you should consider making it a for or while loop instead. Once the recursion gets deep enough you will get a StackOverflowError; the exact limit depends on the thread stack size and on how big each frame is, so treat anything around 1000 calls as the danger zone. This is really bad – you see, if you read the StackOverflowError docs you will see that it is a subclass of VirtualMachineError. The docs for VirtualMachineError say:

Thrown to indicate that the Java Virtual Machine is broken or has run out of resources necessary for it to continue operating

This means that you have pretty much no choice but to log a fatal error and abort the JVM.

There are ways of making the recursion limit of the JVM bigger by raising the thread stack size (for example with the -Xss JVM option), but that is really just a band aid and I would advise against relying on it.
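To illustrate the advice, here is a small Java sketch that computes the length of a linked list both ways. The class ListLength and its helpers are made up for the example. The recursive version needs one stack frame per node, the loop needs none.

```java
final class ListLength {
    /** Minimal singly linked list node. */
    static final class Node {
        final Node next;
        Node(Node next) { this.next = next; }
    }

    /** One stack frame per node: deep lists can end in a StackOverflowError. */
    static int lengthRecursive(Node node) {
        return node == null ? 0 : 1 + lengthRecursive(node.next);
    }

    /** Same result with a loop: constant stack depth for any list length. */
    static int lengthIterative(Node node) {
        int length = 0;
        for (Node n = node; n != null; n = n.next) {
            length++;
        }
        return length;
    }

    /** Builds a list with the given number of nodes, without recursing. */
    static Node chain(int size) {
        Node head = null;
        for (int i = 0; i < size; i++) {
            head = new Node(head);
        }
        return head;
    }
}
```

Both methods agree on short lists. On a list with a million nodes lengthIterative is fine, whereas lengthRecursive would be expected to blow the stack under default JVM settings.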

The Real Life Case: XML Parsing

Java 6 ships with a new XML parsing library, the core class of which is XMLStreamReader (also known as the “pull parser”). I must say that it is quite a nice library and a huge improvement over SAX parsing, while still keeping a blazing performance. We use it in Summa and have been very happy with it.

The problem came when we started indexing documents like this one: java-recursion-lection-1.xml. You can definitely expect to find similar structures out in the wild (as we have seen here at work). The basic document structure is as follows:

<mydocument>
  <mytag>
     SOME TEXT BLOCK
  </mytag>
</mydocument>

If we just want to extract the text block it would be annoying with a standard SAX parser because a SAX parser splits up character segments into arbitrary chunks and you have to collect them into one string yourself. The pull parser API makes this a lot easier because it defines the property XMLInputFactory.IS_COALESCING which, when set, requires the parser to collect all the text chunks into one string. So extracting the raw text contents is easy peasy lemon squeezy:

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;
import java.io.FileReader;

/**
 * A small excursion in Java recursion.
 */
public class JavaRecursionLecture1 {

  public static void main(String[] args) throws Exception {
    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    inputFactory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

    XMLStreamReader reader = inputFactory.createXMLStreamReader(
               new FileReader("/home/mke/Documents/java-recursion-lection-1.xml"));
    parse(reader);
  }

  public static void parse(XMLStreamReader reader) throws Exception {
    while (reader.hasNext()) {
      int event = reader.getEventType();
      switch (event) {
        case XMLEvent.START_DOCUMENT :
          System.out.println("Document start");
          break;
        case XMLEvent.START_ELEMENT :
          System.out.println("Element: " + reader.getLocalName() );
          break;
        case XMLStreamReader.CHARACTERS :
          // Warning: Here be StackOverflowErrors
          System.out.println("Char data:\n" + reader.getText());
          break;
      }
      reader.next();
    }
  }
}

Except that this will throw a StackOverflowError if you run it on the file I linked you to. “What is up with that, there is no recursion here!” – you ask?

The problem here is that XMLStreamReader is highly recursive underneath the hood. My file contains lots of XML entities and the parser will make a recursive call each time a new entity is found in the stream. Looking at the heart of the implementation you will see that the author(s) actually were very meticulous about making sure that all recursive calls were tail calls. This would have been very robust had the Java runtime supported tail recursion – alas.

There are two ways to work around this misfeature. The first one is to not set the IS_COALESCING property, and then change the switch statement to something like this, using reader.getElementText() instead:

switch (event) {
  case XMLEvent.START_DOCUMENT :
    System.out.println("Document start");
    break;
  case XMLEvent.START_ELEMENT :
    System.out.println("Element: " + reader.getLocalName() );

    if ("mytag".equals(reader.getLocalName())) {
      System.out.println(reader.getElementText());
    }
    break;
  case XMLStreamReader.CHARACTERS :
    // Warning: Here be StackOverflowErrors
    System.out.println("Char data:\n" + reader.getText());
    break;
}

This is not particularly elegant since it hard codes our <mytag> element. A more generic way is to provide your own coalescing implementation of getText():

/**
 * Use this method in response to an XMLEvent.CHARACTERS event instead of
 * calling XMLStreamReader.getElementText() on an XMLEvent.START_ELEMENT.
 * Needs an import of javax.xml.stream.XMLStreamException.
 * @param reader the XMLStreamReader to pull character data out of,
 *               the reader is expected to be in a XMLEvent.CHARACTERS state
 * @return A string containing the full character data as one string
 * @throws XMLStreamException if Jupiter aligns with Mars (or the parser fails)
 */
 public static String getCoalescedText(XMLStreamReader reader)
 throws XMLStreamException {
   StringBuilder builder = new StringBuilder();
   char[] buf = new char[1024];

   // Keep going for as long as the parser hands us consecutive text chunks
   while (reader.getEventType() == XMLEvent.CHARACTERS) {
     int offset = 0;
     int len;
     while (true) {
       len = reader.getTextCharacters(offset, buf, 0, buf.length);
       if (len != 0) builder.append(buf, 0, len);
       if (len < buf.length) break;
       // Advance in the source text, or we would read the same chunk forever
       offset += len;
     }
     reader.next();
   }
   return builder.toString();
 }
And then in the switch branch checking on character events do:

     case XMLStreamReader.CHARACTERS :
       // Warning: If you expect a StackOverflowError here, you are
       //          going to wait a long while!
       System.out.println("Character data:\n"
                          + getCoalescedText(reader));
       break;
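If you want to try the chunk-wise approach without the big test file, the following self-contained Java sketch parses a string with IS_COALESCING switched off and glues the text chunks together by hand. The class ManualCoalescing and its method names are made up for this example, and it collects all character data in the document rather than looking at specific elements.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class ManualCoalescing {
    /** Appends consecutive CHARACTERS events; leaves the reader on the next event. */
    static String coalesce(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        char[] buf = new char[1024];
        while (reader.getEventType() == XMLStreamConstants.CHARACTERS) {
            int offset = 0;
            int len;
            do {
                // Copy at most buf.length chars, starting at 'offset' in this chunk.
                len = reader.getTextCharacters(offset, buf, 0, buf.length);
                text.append(buf, 0, len);
                offset += len;
            } while (len == buf.length);
            reader.next();
        }
        return text.toString();
    }

    /** Returns all character data in the document, in document order. */
    static String extractText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder all = new StringBuilder();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                all.append(coalesce(reader));
            }
        }
        reader.close();
        return all.toString();
    }
}
```

One thing to keep in mind with this style of helper: when it returns, the reader has already moved on to the event after the text, so the calling loop must not assume it is still looking at character data.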

Anyway – this became a long and code-full post. All I really wanted to say was: avoid recursion in Java unless you know exactly what you are doing.

Summa Moving to SourceForge

August 4, 2009 by kamstrup

Yesterday I had the pleasure to announce on the mailing lists that Summa has reached the first milestone in migrating to SourceForge, and here follows the blog post :-)

From now on all Summa code is hosted and developed in the “summa” project on SourceForge. In addition, all bugs have been migrated from our old GForge solution to a Trac instance hosted via the cool new “hosted apps” functionality on SourceForge.

We will also move the mailing lists over in the near future. The fate of the Summa wiki is still left unclear.

I must be frank and admit that I have long felt that SourceForge was in a bit of a standstill, applying only visual refreshes every now and then, and never fixing the real issues with the site. However the new Hosted Apps approach is simply sweet! There is a huge list of popular open source products you can choose to run on your site as hosted apps (see an incomplete list here). For instance, some may be surprised to know that popular version control systems such as Git, Mercurial, and Bazaar are supported as well as Subversion. Right now we run only a Trac issue tracker and a Subversion repository.

On a personal note I must still admit that my heart lies with the recently open sourced Launchpad, despite the recent kick-assiness from the SF team.

Ticer Summer School

August 3, 2009 by villadsen

I went to the Ticer Summer School 2009 on Friday 31st of July to talk about Summa as part of a panel on Integrated Search alongside Jørgen Madsen (Primo), David Lindahl (eXtensible Catalog), and Benoît Pauwels (VuFind).

Tilburg University Campus

Thomas Place made the introduction explaining the basic concepts of Integrated Search. Afterwards we each presented our systems followed by a short question and answer session. I think it was during these presentations that people realized how similar most of the systems actually were.

After a coffee break each of us once again got up to do a second presentation – this time focusing more on a specific feature.

  • Jørgen Madsen: Joining Catalogues – Clean-up and Deduplication
  • David Lindahl: Metadata Handling and FRBR
  • Mads Villadsen: Facetting and Clustering
  • Benoît Pauwels: Web 2.0 Features of Integrated Search

Each presentation was once again followed by a longer question and answer session.

My slides from both of these can be found on the Presentations page on the Summa Wiki.

The day concluded with a panel discussion based on questions from the audience. All during the day the people from the audience had been very good at asking relevant questions, and I think they really managed to get to the core of the issues regarding usability, faceted searching, and integrated search itself.

This is the first time I have been part of a panel in this way, and also the first time I have been at a Ticer Summer School – and I found both things to be a really great experience. I have never been to anything that has been as well organized as this, and all the people were very interested and engaged in the discussions. The other panel members were well prepared, open for discussion, and willing to talk freely about any issues. I can only hope that I was in the same league as them.

All in all I had a really nice time, and if I ever get the chance to do something similar again I would be very interested.

Quick and dirty test of the YUI Compressor

July 2, 2009 by Jørn Thøgersen

As a part of our quest to optimize the speed of our search front end I recently tried out the Yahoo js and css minifier – YUI Compressor.

At first glance the nice things about the YUI Compressor are that it is Java based (we are a Java friendly team), open source and fairly easy to work with. The YUI Compressor handles both javascript and css but in this post I have chosen to focus on the js part.

The test integration into my IDE (Intellij IDEA) and the project was quite easy because somebody has taken the time to write YUIAnt. I just downloaded the YUI compressor version 2.4.2 and the YUIAnt.jar, added them to the project and modified my build scripts to run the compressor when the website is deployed to the web server. The beauty of this is that you naturally don’t have to look at the minified javascript when editing, and if you for some reason want to debug the code at run time you can easily set up a debug option in your build script and bypass the compressor for on the fly debugging. If you aren’t into all this build script stuff or have a simple project there are lots of online YUI Compressor sites out there where you can paste your js code or css and get a compressed version in return.

The version 2.4.2 of the YUI Compressor nearly worked without problems. For some reason – I didn’t bother to investigate further – the YUI Compressor had some issues with unterminated Strings in the jscalendar-1.0 library. I just excluded the directory and went on with my small non scientific test using Firebug as my test environment.

The first screen shot shows the size and load times for our js files. Business as usual – the YUI Compressor is disabled.

(Screenshot: js file sizes and load times with the YUI Compressor disabled)

The next screen shot shows the size and load times for the same js files now with the compression enabled.

(Screenshot: js file sizes and load times with compression enabled)

The file sizes have been reduced and the overall load time has shrunk by approximately half a second. When the file sizes are very small the load times are very sensitive to queuing effects, but the file size is in most cases reduced. In the case of bigger js files the improvement in speed as well as size is clear. I have tried to compensate for caching effects in both cases (compressed/not compressed). It seems that there is about a 20-25% reduction in file size and approximately the same reduction in load time for the js. These numbers are without using the obfuscation option (reduction of variable names to the shortest possible length and other tricks) simply because I don’t think we will be comfortable with this, knowing that it might cause errors.

As I am new to this I am interested to hear about any major drawbacks compressing/minifying may have.

This is of course a small step and not something which alone makes the difference between a slow and a fast site but I am hoping that attention to a number of different optimization issues will make a big difference in the long run.

Thoughts on optimizing our search web site

July 1, 2009 by Jørn Thøgersen

The code for our search front end has over time grown to a considerable size and we have started to suspect that the web site’s response time could be better. With this in mind I have for some time now been keen on looking into optimizing the speed of our front end – especially when the underlying search engine Summa has proven to be blazingly fast.

There are a lot of things we could do better such as:

1. Optimizing the javascript code by trawling through the lot and removing redundancy as well as rewriting some of the methods to be more efficient.

2. A thorough cleanup of the css. There is a lot we can do here as we have loads of redundancy, classes which are not in use anymore and declarations which could be handled way cooler. Another thing I noticed is we like divs – loads of divs.

3. Taking a critical look at our numerous DOM transformations. Some of them are downright unnecessary.

4. General optimizing of the server side code. In fact this part isn’t all that bad but a general clean up once in a while doesn’t hurt anybody.

Because my summer holiday is coming up soon I have chosen to start with some lightweight stuff. I have tried out the newest version of the YUI Compressor – a tool to compress/minify javascript and css. As we don’t use minifying at the moment we should be able to benefit from it performance wise. In order not to clutter up this post I will post my experience with this in a separate post soonish.