HELP! I think I’m becoming an architect

Caveat Lector: turns out I’m WRONG again. The following is mostly correct, but I’m not sure the Distributor will do what I want it to do. I’m examining my options, but don’t go away thinking this is gospel truth, because it’s not. Nor’s the gospel, for that matter, but that’s a longer subject.

I want to preface this by saying that part of me hates big architecture with a passion. I learned programming in VBScript and HTML, and not a day goes by when I don’t think we might have been better off. I spent three days recently instantiating abstract factory factory configurator factories and I was getting ready to lose my temper and just HARD CODE the damned thing. Still… it’ll all be worth it, apparently.

I mentioned a while back that I was working on ways to make Huddle more scalable. As part of that, I’ve had a lot of whinging conversations with our CTO where I’ve begged him for some time when we can spike a service bus.

Finally, my dreams have come true. I’ve been playing with NServiceBus today. When a user uploads a file in Huddle, we don’t just write the file to disk. We also have to create thumbnails of PDFs and images; we have to send email notifications for interested users; we have to update the news feed; and we have to send the file to the search engine for indexing.

In the olden days (read: a couple of months ago), all of this stuff happened inline, as soon as the file was uploaded. We had to change the way that we handled search engine indexing, though, because it would occasionally go bonkers and eat all the memory on the machine. To get around this, we placed Lucene on a separate server, and whenever we finish saving a file, we put the document’s unique id onto a message queue.

We have a WCF service set up to listen on the message queue, and whenever it receives a file change notification, it loads up the content and passes it to Lucene for indexing. As an added bonus, because we’re using a reliable message queue, if the indexing service goes bonkers, it’ll pick up where it left off once it’s restarted.
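
For the curious, the hand-off really is that dumb. Here’s a minimal sketch of it in plain System.Messaging; the queue path is made up and the real code lives behind a couple of abstractions, but this is the shape of it:

using System;
using System.Messaging;

// Minimal sketch of the hand-off: the queue path is a placeholder,
// not the real Huddle one.
public static class IndexingNotifier
{
    const string IndexingQueuePath = @".\private$\indexing";

    public static void NotifyFileSaved(Guid documentId)
    {
        using (var queue = new MessageQueue(IndexingQueuePath))
        {
            // All we put on the queue is the document's unique id;
            // the indexing service loads the content itself.
            queue.Send(documentId.ToString());
        }
    }
}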

This has helped A LOT with the stability of the application, but it started me thinking about how many other things I could do in the background.

Today, when we show users their news feed, we have to load the entire thing from the database. The query joins across a scary number of tables, and makes Baby Jesus cry. In the future, I’d like to be able to generate news feed messages in the background, and serve up static content. Increasingly, I’m realising that dynamic sites don’t actually have to be dynamic so long as they’re dynamic enough. Chances are that there’s a cunning way to generate static content instead, and to serve it in a cacheable way. I want to come back to cacheability and RESTful-ness in a future post, because that’s also keeping me awake at night.

So, for the next week I’ll be trying to get thumbnails to generate asynchronously. This will prevent the annoying pause after an upload finishes in Huddle, and will hopefully stop the mysterious bug where uploads refuse point blank to complete, though that might be a threading bug. It’ll also prevent the sudden spike in memory use when we load a file into RAM to thumbnail it. In the case of PDF or large graphics, this can be a fairly large chunk, and that can cause the application pool to recycle, which is a) expensive and b) lame. It’s really a starting point, though, for moving more and more of Huddle’s components onto separate machines and taking the load away from the web servers.

What does my solution look like so far? I’ve built the world’s most obscenely over-engineered hello world application. I thought the WCF/MSMQ hello world was bad, but at least that one only used a single message queue.

Ladies and gentlemen, Hello World 2.0 uses no fewer than seven message queues, three command line applications (which can be executed on physically separate machines), and two Inversion of Control frameworks (but I’m fixing that tomorrow).

It all makes me a little uneasy, truth be told. I was happier when Huddle was all running on a single webserver. This latest monstrosity is more Enterprisey than I care to admit.

ServiceNotificationGateway

This class is responsible for instantiating a new service bus connection, and firing off messages. That’s literally ALL it knows how to do, and it’s too dumb to know how to read those messages. If we ever need to do full duplex messaging, then it’ll get more complicated, but I don’t think we will – it would be missing the point, and our CTO would shout at me if I had a web server waiting for an async call to return.

A message is a class which implements the NServiceBus.IMessage interface. For the purposes of my Hello Universe app, the message looks something like this:

public class HelloWorldMessage : IMessage
{
    public Guid Id { get; set; }
    public string MessageBody { get; set; }
}

and the gateway looks like this (with private methods redacted):

public class ServiceNotificationGateway : IDisposable
{
    readonly IBus serviceBus;
    readonly ILog log;

    public ServiceNotificationGateway(IBuilder builder, ILogFactory logFactory)
    {
        log = logFactory.GetLogger();
        serviceBus = ConfigureBusWith(builder);
    }

    public void Send(IMessage message)
    {
        log.DebugFormat("sending message {0}", message.ToString());
        serviceBus.Send(message);
    }

    public void Dispose()
    {
        serviceBus.Dispose();
    }
}
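
Using it from the website looks something like this. It’s a sketch rather than real Huddle code (the controller, the SayHello method, and the way we get hold of the container are all invented for the example):

using System;
using Castle.Windsor;

public class HelloWorldController
{
    // Sketch of a web-tier call site; in the real app Windsor would inject
    // the gateway rather than us resolving it by hand like this.
    public void SayHello(IWindsorContainer container)
    {
        using (var gateway = container.Resolve<ServiceNotificationGateway>())
        {
            gateway.Send(new HelloWorldMessage
            {
                Id = Guid.NewGuid(),
                MessageBody = "Hello, Universe"
            });
        }
    }
}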

When we send a message from the ServiceNotificationGateway, we place it onto a message queue called distributor-bus and it gets picked up by the

Distributor

This service runs on another machine. When a message comes into the distributor-bus, it looks at the type of the message, and assigns it to an outbound queue which serves a particular worker type. In the case of HelloWorldMonstrosity, we only place the message onto one queue, the HelloWorld queue; but for a FileModified message, we would place the message onto separate queues for each of the services I listed above. The request is then finally picked up by the

Worker

This is the service which is actually responsible for fulfilling the original request. In the case of HelloWorldMonstrosity, it just writes out a message to the console, but the ThumbnailWorker will, for example, examine the FileModifiedMessage; check the mime-type to see if we need to do any thumbnailing; pull the original image over the network; thumbnail it locally; and finally save the thumbnailed version back to the filestore, updating the DB so we know where the latest thumbnail can be found.
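
To give a flavour of the worker end of things, here’s a rough sketch of what the ThumbnailWorker’s handler might turn into. FileModifiedMessage, IFileStore and IThumbnailRepository are all made-up stand-ins for the real Huddle types, and I’ve left out the NServiceBus handler plumbing (and PDF rendering, which is its own special misery):

using System;
using System.Drawing;
using System.IO;

// Sketch only: the message and the two interfaces below are invented names.
// The real message also implements NServiceBus.IMessage; omitted here to
// keep the sketch standalone.
public class FileModifiedMessage
{
    public Guid FileId { get; set; }
    public string MimeType { get; set; }
}

public interface IFileStore
{
    Stream Open(Guid fileId);
    Guid Save(Image thumbnail);
}

public interface IThumbnailRepository
{
    void RecordLatestThumbnail(Guid fileId, Guid thumbnailId);
}

public class ThumbnailWorker
{
    readonly IFileStore fileStore;
    readonly IThumbnailRepository thumbnails;

    public ThumbnailWorker(IFileStore fileStore, IThumbnailRepository thumbnails)
    {
        this.fileStore = fileStore;
        this.thumbnails = thumbnails;
    }

    public void Handle(FileModifiedMessage message)
    {
        // PDFs need a separate renderer, so this sketch only bothers with images.
        if (!message.MimeType.StartsWith("image/"))
            return;

        // Pull the original over the network and thumbnail it locally...
        // (fixed 120x120 for brevity; the real thing would keep the aspect ratio)
        using (var original = fileStore.Open(message.FileId))
        using (var image = Image.FromStream(original))
        using (var thumbnail = image.GetThumbnailImage(120, 120, () => false, IntPtr.Zero))
        {
            // ...then push the result back to the filestore and record where
            // the latest thumbnail lives so the website can find it.
            var thumbnailId = fileStore.Save(thumbnail);
            thumbnails.RecordLatestThumbnail(message.FileId, thumbnailId);
        }
    }
}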

Why a service broker?

It’s possible, even preferable for a service this simple, to hook up the publisher (ServiceNotificationGateway) and the subscriber (Worker) directly, but by communicating via a broker (the Distributor) we gain two rather important benefits.

Firstly, the website and the APIs don’t know anything about ThumbnailWorker or HelloWorldService or LuceneIndexingWorker. All they know is that when you finish saving a file to disk, you immediately call ServiceNotificationGateway.Send(new FileModifiedMessage(file)). This allows us to plug the services in, one by one, without having to change any client code or configuration.

If I decide that I want a service which browses text files for swear words and incriminating phrases (like confidential, UK eyes only, unmarked £10 notes, etc.) and puts the filenames into an RSS feed for my viewing pleasure, I can hook one up and the website and API are blissfully unaware; they just keep sending out the same dumb messages.

Secondly, the distributor is responsible for balancing load across many workers. When a worker service starts up, it registers itself with the distributor and tells it how many threads it is willing to run at a time.

When requests come in from clients, the distributor picks the next available worker to fulfil the request. When the worker finishes, it calls back to the distributor to let it know that it’s ready for more work.

This is a really nice mechanism called the Competing Consumers pattern that allows us to add more machines for handling a particular service with ease. If the Indexing service gets overloaded, we can create a second instance on another machine, and it will register itself with the distributor, which will automatically balance load across the two of them. If, on the other hand, we find out that the Indexing server is being underused, then we can increase the number of threads, and the distributor – again – will automatically send more work across to the service.
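
If Competing Consumers is new to you, the easiest way to picture it is several identical processes all receiving from the same queue, with MSMQ making sure each message goes to exactly one of them. This toy (plain System.Messaging, nothing to do with NServiceBus’s actual distributor internals) shows the shape of it:

using System;
using System.Messaging;

// Toy illustration of Competing Consumers: run this on as many machines
// (or threads) as you like and point them all at the same queue. Each
// message is delivered to exactly one of them, so adding a worker adds
// capacity without touching the senders.
public static class CompetingWorker
{
    public static void Run(string queuePath, string workerName)
    {
        using (var queue = new MessageQueue(queuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });

            while (true)
            {
                // Receive blocks until a message arrives; MSMQ hands each
                // message to whichever competing consumer asks for it first.
                using (var message = queue.Receive())
                {
                    Console.WriteLine("{0} handling {1}", workerName, message.Body);
                }
            }
        }
    }
}

The distributor layers the registration and ready-for-more-work handshake on top of that, but the net effect is the same: add consumers and you add throughput, without the senders ever noticing.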

So, it’s 23:19, I was at work until 22:00, I came home and immediately wrote up a blog post – that’s how excited I am about this thing, I’m blogging and I’m not even drunk. It’s not just HelloWorld, it’s the most scalable Hello World app evar!!!

What’s next?

At the moment, NServiceBus does all its building with Spring.Net, but we use Windsor at Huddle, so my plan for tomorrow is to consolidate the IOC frameworks and use just Windsor. If I’m feeling particularly winful, I might look at configuring with Binsor because, quite frankly, I am sick and fucking tired of fucking XML files everywhere.

After that, it’s time to stop playing with Hello World and actually write a thumbnailing service, which will be less fun, but probably more useful in the long run.

This really is a grotesquely over-engineered solution for our immediate needs, and I feel REALLY guilty about it, but it’s a solution that we can grow into, and that will support our longer term architectural goals. So Maybe We Are Gonna Need It.

tl;dr: Separate components can be run on multiple machines. Use an off-the-shelf service bus to do the plumbing for you. Over-engineering is fun.

