Azure Queue Agent – Introduction

For a small side project I’ve been working on I needed a way to schedule background tasks and handle them on a pool of workers. Since I was planning to run this on Azure, the Queues offered with Azure Storage seemed to be a no-brainer. Even more so since the Azure Scheduler, which can be used to periodically execute some action, can also be used to add messages to Queues. I figured that I wasn’t the only one needing something to handle such tasks, so I decided to build a lightweight open-source library for this.

Enter Azure Queue Agent (AQuA)

AQuA comes with two main components: a Producer and a Consumer. As the names suggest, the Producer can be used to produce (i.e. enqueue) new jobs, while the Consumer can be used to consume (i.e. dequeue and then handle) jobs from the queue. Job Descriptors are used to define which job should be executed and what parameters should be used for the execution. They are encoded as simple JSON objects like the one below, i.e. they can also easily be written manually (e.g. when used with the Azure Scheduler). That said, with AQuA it is very simple to build a scalable and robust collection of workers which take care of all your background processing jobs.

{ "Job": "HelloWho", "Properties": { "Who": "World" } }

The above example for instance would queue the HelloWho job, which does nothing more but print the value of the Who parameter on stdout like this: “Hello, <Who>!”. In addition, the Azure Queue Agent Consumer can be configured to either delete or requeue messages which were badly formatted, using unknown jobs or which could not be executed successfully, such that you can even use a single queue for multiple different pools of workers, should you ever find yourself in that situation.

Getting Started

The Azure Queue Agent is available as a NuGet package, currently however only in pre-release. You can get it like this:

Install-Package aqua.lib -Pre

Once this is done, you need to create an instance of Producer (if you want to create job requests from your code), and an instance of Consumer (for when you want to handle job requests).

// Setup and initialization
JobFactory factory = new JobFactory();

// Register all the jobs you want your consumer to be able to handle.
factory.RegisterJobType(typeof(HelloWho));
factory.RegisterJobType(typeof(MyBackgroundJob));

// Use the storage account from the emulator with queue "jobs".
ConnectionSettings connection = new ConnectionSettings("jobs");

Producer producer = new Producer(connection, factory);
Consumer consumer = new Consumer(connection, factory);

// Produce (i.e. enqueue) a HelloWho job request.
HelloWho job = new HelloWho() { Who = "Azure Queue Agent Example" };
producer.One(job);

// Consume (i.e. dequeue and handle) a job request.
consumer.One();

This should get you going for now. I’ll follow up with more later. Oh, and you can read the sources on GitHub.

Offline JSON Pretty Printing

Today when you’re dealing with Web APIs, you often find yourself in the situation of handling JSON, either in the input for these APIs or in the output, or both. Some browsers have the means to pretty print the JSON from their dev tools. But you don’t always have that opportunity. That’s why there are tools to pretty print JSON. I’ve found quite a few of them on the web, but all the ones I’ve found have one terrible flaw: they actually send the JSON you’re trying to pretty print to the server (*shudder*). I don’t want my JSON data (sensitive or not) to be sent to some random servers!

All your JSON are belong to us!

Now as I wrote, I don’t particularly like the fact that my JSON data is sent over the wire for pretty printing. It may not be super secret or anything, but in these days, you cannot be careful enough. Besides, it’s completely unnecessary to do it. All you need is already in your browser! So I quickly built my own JSON pretty printer (and syntax highlighter). You can find it right here.

Offline JSON Pretty Printing to the Rescue

Actually, the design is very simple. All my JSON pretty printer is doing, is to take your JSON input and try to parse it as JSON in the browser.

JSON.parse(yourJsonInput)

If that fails, I’m showing the parsing error and it’s done. If it succeeds, I get back a JavaScript object/array/value, which then I’m inspecting. For objects, I’m using basic tree navigation to go through all the properties and nested objects/arrays/values for pretty printing. That’s it, really simple. No need to transmit the data anywhere — it stays right in your browser!

So like it, hate it, use it or don’t: cymbeline.ch JSON Pretty Printer

LINQ with Lucene.Net.ObjectMapping

Last time I mentioned that I started to work on supporting LINQ with Lucene.Net.ObjectMapping. That includes LINQ queries like the following:

using (Searcher searcher = new IndexSearcher(directory))
{
    IQueryable<BlogPost> posts =
        from post in searcher.AsQueryable<BlogPost>()
        where obj.Tag == "lucene"
        orderby obj.Timestamp descending
        select post;
}

Now granted, the above example is a very basic one. So here’s a short list of other methods on IQueryable<T> that are already supported at this point: Any *, Count *, First *, FirstOrDefault *, OrderBy, OrderByDescending, Single *, SingleOrDefault *, Skip, Take, ThenBy, ThenByDescending, and finally Where.

* Method is supported both with and without a filter predicate.

With this, it becomes easy to build paging based on objects you get back as a result of a query on Lucene.Net. I’m still working on improving the supported filter expressions (most of all for Where, but all the other filterable methods naturally profit too). For instance, with the default JSON-based object mapping it is already possible to search for entries in a dictionary that maps a string to another property or object. Say you have a set of classes, defined as follows.

public class MyClass
{
    public int Id { get; set; }
    public Dictionary<string, MyOtherClass> Map { get; set; }
}

public class MyOtherClass
{
    public string Text { get; set; }
    public int Sequence { get; set; }
    public DateTime Timestamp { get; set; }
}

Now you can actually search for instances of MyClass that satisfy certain conditions in the Map dictionary, like this:

var query = from c in searcher.AsQueryable<MyClass>()
            where c.Map["MyKey"].Sequence == 123
            select c;

Since the items in the dictionary are mapped to analyzed fields in the Lucene.Net document, we can search on them!

Delete and Update By Query

Now since I have this query expression binder to create Lucene.Net queries based on LINQ filter expressions, I’ve added an extension method to update and one to delete documents that match a query. So it is now possible to do this:

indexWriter.Delete<MyClass>(x => x.Id == 1234);
indexWriter.Update(myObject, x => x.Id == myObject.Id);

Call to Action

Now with all this said, I’m looking for volunteers to help me get more coverage on the LINQ queries, because that’s definitely where the weak spot is right now. If you’re interested, leave a comment here or on GitHub.

Improvements to Lucene.Net.ObjectMapping

I’d like to discuss some improvements to Lucene.Net.ObjectMapping which I published yesterday as a new version (1.0.3) to NuGet. In addition, I want to take this opportunity to give a quick outlook on what’s to come next.

CRUD Operations

The library now comes with support for all of the CRUD operations. Let’s look at them one by one, starting with Create.

Create / Add

In Lucene.Net terms, that would be AddDocument. Since the library does object to document mapping, this is simplified to an Add operation.

IndexWriter myIndexWriter = ...;
MyClass myObject = new MyClass(...);

myIndexWriter.Add(myObject);

Or, if you need a specific analyzer for the document the object gets mapped to, you can use the overload which accepts a second parameter of type Analyzer.

IndexWriter myIndexWriter = ...;
MyClass myObject = new MyClass(...);

myIndexWriter.Add(myObject, new MyOwnAnalyzer());

Retrieve / Query

The retrieve operation, or mapping of a document to an object hasn’t changed since v1.0.0. There are examples for how to query and retrieve in my previous post. Of course, if you happen to know the ID of the document without a query, then you can just map that document to your class without going through a query. But since the document IDs can change over time, it’s usually more practical to pivot off a query.

Update

Update is maybe the most interesting operation here. Since document IDs can change over time, there’s really no good way to reliably update a specific document, without making a query. That’s why the UpdateDocument method from the IndexReader asks you for a query/term to use to match the document to update. And that’s why it’s generally a good idea to bring your own unique identifier to the game. Suppose your class has a property of type Guid and name “Id”, which is used as your unique identifier for the objects of that type.

IndexWriter myIndexWriter = ...;
MyClass myObject = ...;

myObject.MyPropertyToUpdate = "new value";

myIndexWriter.Update(
    myObject,
    new TermQuery(new Term("Id", myObject.Id.ToString())));

Under the covers, this will find all the documents matching the query and matching the type (MyClass), delete them and then add a new document for the mapped myObject. If you need an analyzer, for the newly mapped document, you can use the second overload.

IndexWriter myIndexWriter = ...;
MyClass myObject = ...;

myObject.MyPropertyToUpdate = "new value";

myIndexWriter.Update(
    myObject,
    new TermQuery(new Term("Id", myObject.Id.ToString())),
    new MyOwnAnalyzer());

Delete

Just like the retrieve operation, the Delete operation is also supported since v1.0.0. I realize though that I haven’t given any examples yet. But really, it’s quite simple again. You give the type of objects you want to delete the mapped documents for, and you give a query to identify the objects to delete. No magic at all.

IndexWriter myIndexWriter = ...;
myIndexWriter.DeleteDocuments<MyClass>(
    new TermQuery(new Term("Tag", "deleted")));

Naturally, you can use any Query you want for the delete operation (as well as for updates). You can make them arbitrarily complex as long as they’re still supported by Lucene.Net.

Summary and Outlook

That’s it, CRUD with no magic, no tricks. Let me know if there’s functionality you’d like to see added, either by commenting here or by opening a bug/enhancement/whatever on GitHub. I’ve started working on LINQ support for the ObjectMapping library too, with the goal that you can write LINQ queries like the following.

var query = from myObject in mySearcher.AsQueryable<MyClass>()
            where myObject.Tag == "history"
            select myObject;

It will likely take a little longer to get that stable, but I’ll try to make a pre-release on NuGet in the next few weeks.