Tuesday, 19 August 2008
Mastering iTunes using Linq To Xml
Last updated : 19th August 2008
Background
Let us assume for this scenario that we are adding a new feature to our online dating web site. Let's presume that we already match people based on their interests and so on, but what if we could also match them based on the music they like? In fact, what if we could scan their iTunes library, see what kind of music they listen to most often and match this against other daters' listening habits? Wouldn't that be a good proposition?
Just about everyone I know has an Apple music product of some description, which inevitably means they have iTunes installed to manage it. But how do we get to this information? Fortunately, all this music data is stored within an XML document maintained by iTunes. The format of this data is not ideal though. Each track is stored within a <dict> node that holds its data as pairs of sibling nodes, a <key> node followed by a value node, similar in structure to key/value pairs. I have shown this below.
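As a rough illustration of that layout, a heavily trimmed library file looks something like the fragment below. The track values are made up, and a real file carries many more keys per track.

```xml
<plist version="1.0">
<dict>
    <key>Major Version</key><integer>1</integer>
    <key>Tracks</key>
    <dict>
        <!-- one entry per track: a <key> holding the track id, then a <dict> of its details -->
        <key>1001</key>
        <dict>
            <key>Track ID</key><integer>1001</integer>
            <key>Name</key><string>Example Song</string>
            <key>Artist</key><string>Example Artist</string>
            <key>Play Count</key><integer>42</integer>
        </dict>
    </dict>
</dict>
</plist>
```

Notice that nothing ties "Play Count" to 42 other than the order of the nodes, which is why the data is awkward to query directly.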
Strategy
To enable this matching we need to provide a facility to upload a dater's iTunes Music Library information, which includes all the music they have along with all the usage data. This will allow us to search through this information and use it on our web site. All this information is stored in an XML file called, funnily enough, iTunes Music Library.xml. To help us extract this data we will use a nice new feature of the .NET Framework 3.5 called Language Integrated Query (LINQ). More specifically we'll use LINQ to XML, as this provides a simple way to interrogate XML data using queries which look a lot like standard SQL.
Implementation
Before we start, let's have a look at how the final web site will look. In the figure below you will see that users are provided with a simple interface to upload their iTunes XML file. When they click on the Find Me My Match button, the system will:
- Upload this file
- Parse the file for the current user's favourite tracks
- Match these against a simulated database of users
- Decide who would be the best match
In Visual Studio we need to create a new web application. We need to make sure that we select .NET Framework 3.5, and we will use VB.NET for this demo.
To save a load of time on the user interface I downloaded a free web site template from AnVisionWebTemplates.com. This template gives us the HTML blueprint for our design, so we only need to add the upload functionality to the overall website application. You would normally spend a bit of time taking out the common HTML elements and placing them into a master page or pages. I have just copied the template into my directory structure and modified the aspx page that I'll be using. You can download the files in this article from the link provided above.
The Project structure should resemble the figure below:
I have created a simple XML file called userdb.xml, which simulates the database of users that we might have on our dating website. The structure of this file is shown below:
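A minimal sketch of that shape, assuming a <users> root element and made-up values. What matters to the queries later on is only that each <song> element carries a <UserID> and a <track> child:

```xml
<!-- the root element name and all values are placeholders for illustration -->
<users>
    <song>
        <UserID>1</UserID>
        <track>Example Song</track>
    </song>
    <song>
        <UserID>2</UserID>
        <track>Another Example Song</track>
    </song>
</users>
```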
Next we want to create a class to do all the work. In the App_Code folder I created a new class called MatchMaker, in the Bracora.Demo namespace, that implements the couple of static (Shared) methods we need.
''' <summary>
''' General class for demo
''' </summary>
''' <remarks>Note: you would want to separate the logic out into separate classes (tiers)
''' in a real world application
''' </remarks>
Public NotInheritable Class MatchMaker
    ''' <summary>
    ''' Private constructor, as the class only exposes shared methods
    ''' </summary>
    ''' <remarks>Don't instantiate</remarks>
    Private Sub New()
    End Sub
Now let's start using some of this lovely Linq functionality. The first method I want to implement is the one that loads all the user information from the userdb.xml file, to use as the basis for matching later on. For this simple demo, to match music I am merely looping through this information and performing a basic string comparison, so I am returning the general IEnumerable interface for my simulated database.
As you can see from the code below, I simply load the data into an XDocument object, which is part of the System.Xml.Linq namespace. This allows me to run queries against this data as if it were held in a relational database. The query syntax is similar to SQL but slightly more intuitive. Another nice feature of .NET 3.5 is anonymous types. In the query below I am returning a collection of objects containing two properties, UserID and track. The compiler generates the class for me, which saves me having to define one just to return these values. Because the method returns the non-generic IEnumerable, callers read these properties through late binding.
    ''' <summary>
    ''' Get all the data we have in our simulated database. We would normally keep all this
    ''' information in a database. But this is fine for the demo.
    ''' </summary>
    ''' <returns>A list of tracks and associated user IDs</returns>
    ''' <remarks>Just hard coding values for the demo</remarks>
    Public Shared Function GetAllMusic() As IEnumerable
        Dim databasePath = HttpContext.Current.Server.MapPath("App_Data") + "/userdb.xml"
        If System.IO.File.Exists(databasePath) Then
            Dim userdb = XDocument.Load(databasePath)
            ' Just return a list of our simulated users
            Dim dbSongs = From s In userdb...<song> _
                          Order By s.<UserID>.Value _
                          Select _
                              s.<UserID>.Value, _
                              s.<track>.Value
            Return dbSongs
        Else
            Return Nothing
        End If
    End Function
The next thing I need to do is create a method that will read through an uploaded iTunes library document and grab the top 25 songs most often played. Again, using Linq to XML this is an easy task.
    ''' <summary>
    ''' Process the uploaded file and get the top tracks based on how often they
    ''' are played. In the real world we would perform this processing offline.
    ''' </summary>
    ''' <param name="iTunesFile">The file location of the uploaded iTunes file</param>
    ''' <returns>A list of top songs for that user</returns>
    ''' <remarks>We are only taking the bare minimum information for this example.
    ''' Note how LINQ returns a collection of anonymous types without us having to
    ''' create a class definition for this ;-)
    ''' </remarks>
    Public Shared Function GetUsersTopTracks(ByVal iTunesFile As String) As IEnumerable
        If System.IO.File.Exists(iTunesFile) Then
            Dim iTunes = XDocument.Load(iTunesFile)
            ' Drill into the iTunes XML structure to get the music information
            ' Transform it into a more meaningful XML structure
            Dim tracks = From t In iTunes.<plist>.<dict>.<dict>.<dict> _
                         Select New XElement("song", _
                             From key In t.<key> _
                             Select New XElement(key.Value.ToString.Replace(" ", ""), _
                                                 CType(key.NextNode, XElement).Value.ToString))
            ' Get the top 25 most played songs. CInt makes the sort numeric; sorting
            ' the raw string would rank "9" above "100". Tracks that have never been
            ' played have no Play Count key, and CInt(Nothing) gives 0.
            Dim topTracks = From song In tracks _
                            Order By CInt(song.<PlayCount>.Value) Descending _
                            Take (25) _
                            Select _
                                Artist = song.<Artist>.Value, _
                                TrackName = song.<Name>.Value
            Return topTracks
        Else
            Return Nothing
        End If
    End Function
There are a couple of things to note here. You will notice that I am drilling into the iTunes XML structure using what are known as XML child axis properties, a syntax that VB.NET provides. The expression iTunes.<plist>.<dict>.<dict>.<dict> drills down to where the music information is stored within the file hierarchy. I don't like the way iTunes manages this data, so I am taking the raw information and transforming it into an XML structure that works for me: each <key> becomes an element named after the key (with the spaces removed) and the node that follows it becomes that element's value. After this I sort on the PlayCount value, descending, and take the top 25 songs. Again, I am simply returning a collection of anonymous objects that contain two properties, Artist and TrackName.
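To see the transformation in isolation, here is a small console sketch that runs the same kind of query over an inline XML literal instead of an uploaded file. The module name, the RankByPlayCount helper and the two tracks are all made up for illustration, and it uses ElementsAfterSelf rather than NextNode to pick up the value node, which is a small variation on the routine above.

```vbnet
Option Strict On
Option Infer On

Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module PlistTransformSketch

    ' Hypothetical helper: flattens each track <dict> into a <song> element and
    ' returns the track names ordered by numeric play count, highest first.
    Function RankByPlayCount(ByVal plist As XElement) As List(Of String)
        ' Child axis: plist/dict/dict/dict reaches the per-track dictionaries
        Dim songs = From t In plist.<dict>.<dict>.<dict> _
                    Select New XElement("song", _
                        From k In t.<key> _
                        Select New XElement(k.Value.Replace(" ", ""), _
                                            k.ElementsAfterSelf().First().Value))

        ' CInt makes the sort numeric; a plain string sort would put "9" above "100"
        Dim ranked = From s In songs _
                     Order By CInt(s.<PlayCount>.Value) Descending _
                     Select s.<Name>.Value

        Return ranked.ToList()
    End Function

    Sub Main()
        ' Made-up two-track library in the iTunes key/value layout
        Dim sample = <plist version="1.0">
                         <dict>
                             <key>Tracks</key>
                             <dict>
                                 <key>1</key>
                                 <dict>
                                     <key>Name</key><string>Song A</string>
                                     <key>Play Count</key><integer>9</integer>
                                 </dict>
                                 <key>2</key>
                                 <dict>
                                     <key>Name</key><string>Song B</string>
                                     <key>Play Count</key><integer>100</integer>
                                 </dict>
                             </dict>
                         </dict>
                     </plist>

        For Each trackName In RankByPlayCount(sample)
            Console.WriteLine(trackName)
        Next
    End Sub

End Module
```

Song B, with 100 plays, should be listed ahead of Song A. If the play counts were compared as strings the order would flip, which is why the sketch converts them with CInt before sorting.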
And now for the matching routine....
To hold the matching data I have created a simple class.
Namespace Bracora.Demo
    Public Class MatchedEntry
        Public Sub New()
        End Sub

        Private _userID As String
        Public Property UserId() As String
            Get
                Return _userID
            End Get
            Set(ByVal value As String)
                _userID = value
            End Set
        End Property

        Private _trackName As String
        Public Property TrackName() As String
            Get
                Return _trackName
            End Get
            Set(ByVal value As String)
                _trackName = value
            End Set
        End Property
    End Class
End Namespace
All I want to do now is pass two lists to a routine: one with the top 25 music tracks for the current user and another containing our simulated database of all our stored users' favourite selections. We then loop through the data and store all matches in a generic list of MatchedEntry objects. From here we run a Linq query to group and sort the matches based on the most 'hits' and then select the first, or top, match.
If we have a valid match at this stage, we execute another Linq query that gathers all the music tracks that were common to both users. So we now have a userId and an array of tracks that were matched (note the .ToArray() method on the Linq object). Thanks to .NET 3.5, we can just create an anonymous type with this information and return it :-)
    ''' <summary>
    ''' Attempts to find the most matched user based on checking their music information
    ''' against the music information we have in our simulated database
    ''' </summary>
    ''' <param name="userTracks">A list of user tracks to match</param>
    ''' <param name="dbTracks">A list of music tracks against users we have in our 'database'</param>
    ''' <returns>The best matched user ID and the songs in common, or Nothing for no matches</returns>
    ''' <remarks>Don't try this at home. Honestly there are much better ways to do this!</remarks>
    Public Shared Function GetTopMatchedUser(ByVal userTracks As IEnumerable, _
                                             ByVal dbTracks As IEnumerable) As Object
        ' Either list is Nothing when its source file could not be found
        If userTracks Is Nothing OrElse dbTracks Is Nothing Then
            Return Nothing
        End If
        ' contains a list of non-unique userIDs when songs are matched
        Dim MusicMatches As New Generic.List(Of MatchedEntry)
        Dim Match As MatchedEntry = Nothing
        ' Loop round each of the user's favourite tracks and try to match this
        ' against the list of tracks we have in our simulated database. If
        ' a match occurs then add the userId. The more instances of that userId
        ' then the more music they have in common.
        ' Note: track and song are anonymous types seen as Object here, so their
        ' properties are late bound (this needs Option Strict Off).
        For Each track In userTracks
            For Each song In dbTracks
                If track.TrackName = song.track Then
                    Match = New MatchedEntry()
                    Match.UserId = song.UserID
                    Match.TrackName = song.track
                    MusicMatches.Add(Match)
                End If
            Next
        Next
        ' Group by the userID and then work out the count for each one
        Dim top = From u In MusicMatches _
                  Group By u.UserId Into g = Group _
                  Order By g.Count() Descending _
                  Select _
                      UserId, _
                      NumberOfMatches = g.Count()
        If top.Count() > 0 Then
            Dim topMatchedUser = top.First()
            ' Get the matched songs based on the userId
            Dim matchedSongs = From u In MusicMatches _
                               Where u.UserId = topMatchedUser.UserId _
                               Select u.TrackName Distinct
            Dim MatchedUser = New With {.User = topMatchedUser.UserId, .Songs = matchedSongs.ToArray}
            Return MatchedUser
        End If
        Return Nothing
    End Function
To finish up with this class we need to create a controller routine that ties all these methods together.
    ''' <summary>
    ''' Main entry point for the demo and controller
    ''' </summary>
    ''' <remarks></remarks>
    Public Shared Function FindMeMyMatch(ByVal fileName As String) As Object
        Dim userSongs As IEnumerable = GetUsersTopTracks(fileName)
        Dim allUsersSongs As IEnumerable = GetAllMusic()
        Return GetTopMatchedUser(userSongs, allUsersSongs)
    End Function
All that remains is to implement the user interface. For this we need a FileUpload control, a few labels, an image control to show matched users and a button to kick the whole thing off. The full details are contained in the download, and I would encourage you to get it and run the application, as there are no database requirements. Below is the main routine that uploads the iTunes file and displays the outcome, either matched or unmatched.
One point to note is that ASP.NET constrains the upload size by default (4096 KB, or 4 MB). As iTunes library files can be quite large, you'll need to modify the web.config file to increase the allowed upload size. The maxRequestLength value is given in kilobytes, so the setting below allows uploads of up to 20 MB.
<!-- need to increase the size of the permitted upload size -->
<httpRuntime maxRequestLength="20480" />
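For context, the httpRuntime element belongs inside the system.web section. A minimal sketch of the placement, assuming an otherwise empty web.config (yours will have other settings around it):

```xml
<configuration>
  <system.web>
    <!-- maxRequestLength is in kilobytes: 20480 KB = 20 MB -->
    <httpRuntime maxRequestLength="20480" />
  </system.web>
</configuration>
```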
Here is the code-behind for the web page. It's pretty straightforward.
- We check that it's a proper XML file
- We generate a new unique filename and save the uploaded file
- We then pass the filename to the matching routine and await our fate!
I have coded some simple display logic for the demo: we either match or we don't, and the page displays accordingly.
    Protected Sub btnUpload_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnUpload.Click
        Dim savePath As String = HttpContext.Current.Server.MapPath("uploads") + "/"
        If (FileUpload1.HasFile) Then
            'make sure it's an XML file
            If (FileUpload1.PostedFile.ContentType.ToLower <> "text/xml") Then
                message.Text = "This is not a valid iTunes file. Please make sure you upload the proper file."
                Exit Sub
            End If
            ' create a unique name for the uploaded file.
            Dim fileName As String = Guid.NewGuid().ToString + ".xml"
            ' Append the name of the file to upload to the path.
            savePath += fileName
            FileUpload1.SaveAs(savePath)
            Dim MyMatch = Bracora.Demo.MatchMaker.FindMeMyMatch(savePath)
            Dim HeaderText As String = String.Empty
            Dim ImageOfMatchedUserUrl As String = String.Empty
            Dim DisplaySongs As String = String.Empty
            If MyMatch IsNot Nothing Then
                Select Case MyMatch.User
                    Case "1"
                        ImageOfMatchedUserUrl = "App_Themes/Standard/images/user1.jpg"
                    Case "2"
                        ImageOfMatchedUserUrl = "App_Themes/Standard/images/user2.jpg"
                    Case "3"
                        ImageOfMatchedUserUrl = "App_Themes/Standard/images/user3.jpg"
                    Case Else
                        ImageOfMatchedUserUrl = "App_Themes/Standard/images/user4.jpg"
                End Select
                For i As Int32 = 0 To MyMatch.Songs.Length() - 1
                    ' Encode the track name so characters such as & or < don't break the markup
                    DisplaySongs += String.Format("<li>{0}</li>", Server.HtmlEncode(CStr(MyMatch.Songs(i))))
                Next
                HeaderText = String.Format("<h3>{0}</h3>", "Success, we found a match!")
            Else
                ' No matches show the user who has been waiting longest :-)
                ImageOfMatchedUserUrl = "App_Themes/Standard/images/user4.jpg"
                DisplaySongs = String.Format("<li>{0}</li>", "You have no songs in common, here is your closest match")
                HeaderText = String.Format("<h3>{0}</h3>", "No luck I'm afraid!")
            End If
            Me.headerMatch.Text = HeaderText
            Me.songList.Text = DisplaySongs
            imageOfMatchedUser.ImageUrl = ImageOfMatchedUserUrl
            imageOfMatchedUser.Visible = True
        Else
            ' Error message for no file
            message.Text = "<p class=""error"">You did not specify an iTunes file to upload.</p>"
        End If
    End Sub
And that's all there is to it. Linq to XML is a wonderfully powerful addition to the .NET Framework.
Summary
Hopefully this tutorial has shown how you might use Linq in your own applications. Linq to XML is a really nice feature and makes working with XML documents really easy. The learning curve is gentle: you can quickly pick up the core concepts without too much effort and grow into the more complex aspects of the Linq namespaces later on.
The code download also contains a few sample iTunes library files in the upload directory.
Note: iPod® and iTunes® are registered trademarks of Apple Inc.
© Bracora 2013