An Album Is A Social Ontology

UPDATE: Looks like the BBC already has a bunch of data just like this and there’s even a mailing list for Music Ontology for people to join. Thanks to danbri for the pointer.

In my last blog post about an idea for a music browser, I talked about the album being a document analogous to the HTML page in a web browser. If we break down what it means to “browse Zed’s music site” in the same way we say “browse Zed’s web site”, we have to invent concepts similar to those found on a web server. It’s also important that these concepts preserve the key elements of the browser and web server that make for disconnected sites.

Any music browser that doesn’t let one site link to another without permission from the target site will fail to displace the current trend toward walled gardens and “Music CompuServes” like iTunes. In the same way that the web’s promiscuous, free-form, chaotic linking wiped out most BBS-like services such as CompuServe, the music browser concept has to allow and encourage linking between artists’ works without “approved” gatekeepers and registries.

Therefore, what would make a good model for the following concepts in a web server:

  1. site
  2. directory
  3. page
  4. media

and how would those concepts map to serving albums? Could they simply be reused, leaving the clients to present the information to the user as they see fit?

Obviously the browser could do whatever it wants, but the goal of a good set of site and document structure concepts is to make life easier for the implementer of a browser. Based on this, I’m going to keep the existing concept of URLs pointing at directories of files and media on a server. There’s no point in changing that, because everyone knows it very well and it works just fine.
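To make that reuse concrete, here is a minimal sketch of how a client might resolve the links it finds in an album document against plain URLs. The host names, paths, file names, and the `resolve_links` helper are all made up for illustration; the only real API used is `urllib.parse.urljoin` from the Python standard library.

```python
from urllib.parse import urljoin

# Hypothetical location of an album document on an artist's own server.
ALBUM_URL = "http://music.example.org/albums/whats-going-on/album.n3"


def resolve_links(base_url, links):
    """Turn the links found in an album document into absolute URLs."""
    return [urljoin(base_url, link) for link in links]


links = [
    "01-whats-going-on.ogg",                          # media in the album's directory
    "../covers/whats-going-on.png",                   # media elsewhere on the same site
    "http://other.example.com/albums/live/album.n3",  # another site entirely
]

resolved = resolve_links(ALBUM_URL, links)
for url in resolved:
    print(url)
```

The third link is the important one: it points at a different site, and nothing about resolving it requires that site's permission or a central registry.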

The only concept that needs to be developed for an album is a replacement for the “page”: an album model that works for the peculiar social system that makes up most musicians’ lives.

What’s In It For Everyone?

Before I go into what I think might be a good start at an album document model, I’d like to put forward a reason why a music browser could be beneficial to the artists and the music world in general:

Advertising to fans to generate revenue.

Now, I’m not necessarily talking about a musician doing this. They obviously could, but I’m talking more about a company such as Google now being able to conduct its AdWords business in a newly opened realm of the internet.

With the current walled garden system, a company like Apple is the only one allowed to advertise, and therefore it can control that revenue stream. They use this special status in their little world to cross-market other products they have, just like Amazon does. However, this cross-marketing only works to give Amazon and Apple more money, so they have an incentive not to open it up. If they do, it must come at a very heavy price.

If there’s a wide open, user controlled, promiscuously connected browser taking listeners wherever they want to go, then that opens up advertising revenue to all the established internet players. It also means that artists can see a new stream of revenue for their own publishing, and potentially could use tasteful advertising to fund the free release of their albums.

With that in mind, it will probably be important to make sure that any browser developed follows the Netscape and IE model of making sure that users are not snooped on while still providing advertising capabilities to publishers. This means the ability to show interactive content.

As you’ll see shortly, that won’t be much of a problem using some nice existing technologies.

An Album Is A Social Ontology

Who played bass guitar on “What’s Going On” by Marvin Gaye? James Jamerson, but could you name him right away? I hang out with musicians at school, and they can recite back to me who the producer of the third song on an obscure album was, then go through all the guitarists that Steely Dan had.

When you pop open a CD and look at an album, you find artwork and sometimes lyrics, but what you always find is credits. You’ll see who did what on the album, the guitars they used, amp brands, studios they recorded in, friends that helped them. Everything about producing music is a social network of people working to make the album sound great (or, like crap).

In a model for an album document, I want to make this connection between the players and supporting cast of primary importance. It’s important to the musicians and producers, but when you download an MP3 that information is always missing. You just hear a song, not all the people who went into making it.

Another important reason to describe the relationships between the people involved is improved searching. You could punch in a favorite guitarist and pull up all the albums he worked on, even if he was a session musician. This could lead to you finding out that he plays in your city quite frequently, and therefore bring the “eyeballs” to the artist without an approved eyeball-enhancing authority watching over the operations.
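As a rough sketch of that kind of search, assume a client has already collected credits from a handful of album documents. The `CREDITS` rows, `build_index`, and `albums_for` below are illustrative names, and apart from the Jamerson and Gaye credits mentioned in this post, the data is placeholder.

```python
from collections import defaultdict

# (album, person, role) credits. The first two rows restate this post's
# example; the remaining rows are made-up placeholders.
CREDITS = [
    ("What's Going On", "Marvin Gaye", "artist"),
    ("What's Going On", "James Jamerson", "bass guitar"),
    ("Placeholder Album A", "Session Player X", "guitar"),
    ("Placeholder Album B", "Session Player X", "guitar"),
]


def build_index(credits):
    """Map each person to the (album, role) pairs they are credited on."""
    index = defaultdict(list)
    for album, person, role in credits:
        index[person].append((album, role))
    return index


def albums_for(index, person):
    """Return the sorted, de-duplicated albums a person is credited on."""
    return sorted({album for album, _role in index.get(person, [])})


index = build_index(CREDITS)
print(albums_for(index, "James Jamerson"))
print(albums_for(index, "Session Player X"))
```

The session player shows up on every album that credits them, whether or not their name is on the cover.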

In short, I believe that being able to craft an album document that mentions all the people involved, and then using a format that makes analyzing that information trivial, will open a huge world for artists, even if they are just on the sidelines.

Now, the smart people reading are probably screaming what the perfect technology is for such a task.

Well, What About The Semantic Web?

Exactly what I was thinking. The semantic web is a set of standards and data formats for machine-interpretable, searchable information about the relationships between entities. What if an “album” was just an RDF document in N3, Turtle, or RDF/XML that described all the related pieces that make up that album?
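Underneath all of those formats, an RDF document is just a list of (subject, predicate, object) statements. The sketch below shows that idea without any RDF library: it writes a few statements in the line-per-statement N-Triples form. The `music.example.org` URIs, the `URI` and `Lit` helper classes, and the choice of FOAF terms are assumptions for illustration, and the serializer only handles absolute URIs and plain string literals.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class URI(str):
    """An absolute IRI, written <...> in N-Triples."""


class Lit(str):
    """A plain string literal, written "..." in N-Triples."""


def term(t):
    """Format one term of a statement."""
    if isinstance(t, URI):
        return "<%s>" % t
    escaped = t.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"%s"' % escaped


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(
        "%s %s %s ." % tuple(term(t) for t in triple) for triple in triples
    )


# Hypothetical identifiers for the album and the artist.
album = URI("http://music.example.org/albums/whats-going-on#album")
marvin = URI("http://music.example.org/people/marvin-gaye#me")

TRIPLES = [
    (album, URI(RDF_TYPE), URI(FOAF + "Project")),
    (album, URI(FOAF + "name"), Lit("What's Going On")),
    (marvin, URI(RDF_TYPE), URI(FOAF + "Person")),
    (marvin, URI(FOAF + "name"), Lit("Marvin Gaye")),
    (marvin, URI(FOAF + "pastProject"), album),
]

print(to_ntriples(TRIPLES))
```

Every statement stands on its own, which is what makes it cheap to merge album documents from many unrelated sites into one searchable pile.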

There’s already Friend of a Friend (FOAF), which supports many of these concepts. There are also plenty of other technologies available for processing these documents, analyzing them, querying them, sorting them, and everything else you’d need to do to a large collection of information about connections.

In the past, RDF was a nightmare. It was classic XML infinite abstraction, with some of the worst syntax ever. Recently, though, the W3C has taken its head out of its ass and produced some formats that are actually pleasant to work with, such as N3, Turtle, and N-Triples. Each has its place, and all of them are probably way harder to use than an average musician could deal with, but they are perfect for the coders putting the stuff together.

Now, with the new nicer formats and syntax, and the great libraries for semantic web technologies, it’s possible to just use them directly as the way an album is encoded into a document.

Since an album is almost entirely information about people and their roles on the album, the semantic web technologies fit almost perfectly.

A Simplistic Python Example

RDFLib is a decent Python library that supports all the big semantic web technologies. To give an idea of how you can use it, I’ll present a simple bit of code that crafts a stripped-down album in Python, and then show the output. This will show you both that the Python is simple and that the resulting N3 output is actually very nice.

from rdflib.Graph import Graph
from rdflib import URIRef, Literal, BNode, Namespace
from rdflib import RDF

store = Graph ()

# Bind a few prefix, namespace pairs.
store.bind ("dc",

# Create a namespace object for the Friend of a friend namespace.
FOAF = Namespace

# create the album as a project
wgo = BNode ()
store.add ((wgo, RDF.type, FOAF["Project"]))
store.add ((wgo, FOAF["name"], Literal ("What's Going On")))

# create Marvin Gaye
marvin = BNode ()
store.add ((marvin, RDF.type, FOAF["Person"]))
store.add ((marvin, FOAF["nick"], Literal ("marvin")))
store.add ((marvin, FOAF["name"], Literal ("Marvin Gaye")))

# connect them
store.add ((marvin, FOAF["pastProject"], wgo))

print store.serialize (format="pretty-xml")
print store.serialize (format="n3")

What’s going on here is that we grab the FOAF and DC namespaces, construct a FOAF record using the FOAF specification, and then print out an XML and an N3 document. The XML document is just to show you the contrast, but first the very nice N3 output:

@prefix foaf:
@prefix rdf:

[ a foaf:Person;
    foaf:name "Marvin Gaye";
    foaf:nick "marvin";
    foaf:pastProject [ a foaf:Project;
        foaf:name "What's Going On"]].

Yes, that’s all there is to it. It pretty much looks like a Smalltalk method call or any number of more modern languages. In fact, I like this syntax so much I’m not even going to bother putting the annoying XML on this page.

Once we have that document, we can now pull it up and gather information about this person and the projects he’s worked on. There’s even a SQL-like query language called SPARQL that would let you rip through a store of these.
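To give a feel for what such a query does, here is a toy pattern matcher in plain Python. It is not SPARQL and not RDFLib, just a sketch of the underlying idea: a query is a list of triple patterns, strings starting with `?` are variables, and a solution is any set of variable bindings that satisfies every pattern. The short names like `"marvin"` and `"wgo"` stand in for real URIs, and `match` and `query` are made-up helpers.

```python
TRIPLES = [
    ("marvin", "type", "Person"),
    ("marvin", "name", "Marvin Gaye"),
    ("marvin", "pastProject", "wgo"),
    ("wgo", "type", "Project"),
    ("wgo", "name", "What's Going On"),
]


def match(pattern, triple, bindings):
    """Extend bindings so the pattern fits the triple, or return None."""
    new = dict(bindings)
    for p, value in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            # A variable: bind it, or check it against an earlier binding.
            if new.setdefault(p, value) != value:
                return None
        elif p != value:
            return None
    return new


def query(triples, patterns):
    """Return every set of bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            for extended in [match(pattern, triple, bindings)]
            if extended is not None
        ]
    return solutions


# Roughly: SELECT ?person ?title WHERE {
#   ?who foaf:pastProject ?proj . ?who foaf:name ?person . ?proj foaf:name ?title }
results = query(TRIPLES, [
    ("?who", "pastProject", "?proj"),
    ("?who", "name", "?person"),
    ("?proj", "name", "?title"),
])
for row in results:
    print(row["?person"], "worked on", row["?title"])
```

A real SPARQL engine adds prefixes, filters, optional matches, and indexing, but the join over shared variables is the core of it.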

The end result is that we can leverage all this work done on making a machine readable store of related knowledge to do something mundane like say James Jamerson played bass guitar on “What’s Going On”.

Next up, an actual complete album done in N3 and how it might be displayed. A key component of this will be including information about time and related media files.