Last.fm to Couchbase exporter
After Tug’s workshop I started playing around with Couchbase. In this article I will describe my experiments.
I have written a small application that loads artist information from Last.fm and saves the JSON results in Couchbase. The application starts with an arbitrary artist, loads its information, and continues recursively with the artists similar to that one. In practice this process never stops, since every artist on Last.fm has similar artists. To speed things up, loading and saving are done in multiple threads. With 10 threads the application reaches up to 25 writes per second, which is not a lot for Couchbase, I guess. The performance bottleneck is querying the Last.fm web service. After running the application for one hour you have thousands of artists in your Couchbase bucket. Run it for longer and it produces a kind of “Big Data”: useful data, not clutter created by a simple loop.
The source code of the application is on GitHub, in the repository lastfm-exporter.
The prerequisites for running the application are a running Couchbase Server and a Last.fm API account. For Couchbase installation instructions, please consult the documentation: “Chapter 2. Installing and Upgrading”. You can find the Last.fm API documentation here.
There are only two classes in the application. First I will show both classes en bloc; below you will find some explanations.
LastfmExporter.java - Main class for initialization and starting the application.
ArtistExportThread.java - Thread to load and save data concurrently
The constructor of LastfmExporter initializes three things:
- Couchbase client – Used to interact with the Couchbase server. You can add multiple URIs to the URI list if you have a Couchbase cluster. See the Java SDK guide for more details.
- Jersey client – Used to load artist data from the Last.fm web service. Jersey is the open source reference implementation of JAX-RS, the Java API for building RESTful web services.
- Thread executor – Used to execute multiple threads at the same time. See this documentation for details.
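As a rough illustration of the thread executor part only, here is a minimal sketch using the standard java.util.concurrent API. The class name ExecutorSketch, the method exportAll and the placeholder task are hypothetical and not taken from the repository; the Couchbase and Jersey clients are left out because their setup depends on the library versions you use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: runs one placeholder "export" task per artist on a fixed pool.
class ExecutorSketch {

    // Submits one task per artist name and returns how many tasks finished.
    static int exportAll(List<String> artistNames, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String name : artistNames) {
                // Placeholder: a real task would load the JSON and store it.
                pending.add(executor.submit(() -> "exported:" + name));
            }
            int finished = 0;
            for (Future<String> task : pending) {
                task.get(); // blocks until this task has completed
                finished++;
            }
            return finished;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("export failed", e);
        } finally {
            executor.shutdown(); // stop accepting work and let the pool wind down
        }
    }
}
```

Calling ExecutorSketch.exportAll with a list of three names and a pool size of 10 runs the three placeholder tasks concurrently and returns 3.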
This is the core of the application:
- Load similar artists from Last.fm
- Export similar artists in multiple threads
- Call execution recursively for each similar artist
Artist.getSimilar(artistName, key) is a static method provided by the Last.fm API bindings for Java.
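The recursive idea behind these steps can be sketched without any external library. In the sketch below the Last.fm lookup is replaced by a hypothetical function from an artist name to its similar artists, and the export step is reduced to recording the name in a set. The set of already handled artists and the limit parameter are my own additions so that the example terminates; they are assumptions, not a description of the original code.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of the recursive crawl; no network access, no threads.
class CrawlSketch {

    // Records the artist as exported, then recurses into its similar artists.
    // 'similar' stands in for the Last.fm lookup of similar artists.
    static void crawl(String artist,
                      Function<String, List<String>> similar,
                      Set<String> exported,
                      int limit) {
        if (exported.size() >= limit || !exported.add(artist)) {
            return; // limit reached, or this artist was handled before
        }
        for (String next : similar.apply(artist)) {
            crawl(next, similar, exported, limit);
        }
    }
}
```

To try it out, pass a start artist, a lookup backed by a small in-memory map, an empty LinkedHashSet and a limit. The set then holds the visited artists in depth-first order, and the crawl stops once the limit is reached.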
Load artist info from Last.fm
Loading artist info from Last.fm using the Jersey client is straightforward:
- Build web service URL
- Create a Jersey WebResource and call the get method
Both methods are part of class ArtistExportThread.java.
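The URL-building step might look like the following minimal sketch. It assumes the public Last.fm endpoint at ws.audioscrobbler.com with the artist.getinfo method and format=json, which may differ from what the repository actually uses. The class and method names are hypothetical, and the Jersey call itself is omitted because it depends on the Jersey version.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch: builds the request URL for one artist.
class LastfmUrlSketch {

    // Assumed root of the Last.fm web service.
    static final String API_ROOT = "http://ws.audioscrobbler.com/2.0/";

    // Returns the URL that asks for the artist info as JSON.
    static String artistInfoUrl(String artistName, String apiKey) {
        return API_ROOT
            + "?method=artist.getinfo"
            + "&artist=" + encode(artistName)
            + "&api_key=" + encode(apiKey)
            + "&format=json";
    }

    // URL-encodes a query parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

Encoding the artist name matters because many names contain spaces or special characters that are not valid in a query string.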
Save JSON in Couchbase
Since Couchbase stores documents in JSON format, you can take the result from the Last.fm web service and put it directly into your Couchbase bucket: very nice! Every entry in a Couchbase bucket needs a key, which is similar to a primary key in an “old school” database.
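One way to derive such a key is sketched below. The “artist::” prefix and the normalization rules are purely illustrative assumptions; the original code may well use a different scheme, for example the plain artist name.

```java
import java.util.Locale;

// Hypothetical sketch: derives a document key from an artist name.
class ArtistKeySketch {

    // Trims, lower-cases and replaces whitespace runs so that
    // "Daft Punk" and " daft  punk " map to the same key.
    static String keyFor(String artistName) {
        String normalized = artistName.trim()
            .toLowerCase(Locale.ROOT)
            .replaceAll("\\s+", "_");
        return "artist::" + normalized;
    }
}
```

Normalizing the name means that spelling variants of the same artist overwrite one document instead of creating several entries.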
You can now start to query the artist database. If you want to find out more about that I recommend chapter 9 from the Couchbase server manual: Views and Indexes.