I found some more time to add in the details of my progress on the radio playlist project. I’ve decided once again that the original information I included on that page should probably have been a blog post. I have removed it and replaced it with something more suitable. The text that used to be there is included at the end of this blog post just so I’ll have it, and also so I do not have to explain the whole thing over again. For now, here is the progress I have made so far:
The first thing I had to do was find a way to download the web page to my computer from the command line. I had used cURL before, so it seemed like the obvious answer. The first problem I ran into was that I didn’t know the URL of the actual playlist page. The way I normally got to it was by visiting the Edge’s website, clicking on the “Edge Music” button, and then selecting the “What song was that?” option. A URL for the page showed up in my address bar, but it was not the real address of the playlist. I could not for the life of me figure out the URL to the playlist page. Then my friend gave me a great idea: use Wireshark to see exactly what happens when I click on the link. That was the answer to this problem. Wireshark revealed that the URL to the playlist is:
http://www.mediabase.com/whatsong/whatsong.asp?var_s=075069068074045070077
It appears that The Edge uses another company called MediaBase to handle the playlist for them. Regardless, I had the URL. Now it was time to get cURL to download the page. This is a piece of cake. The command is simply:
curl "http://www.mediabase.com/whatsong/whatsong.asp?var_s=075069068074045070077"
Done. The code to the page is dumped to the screen. Now that I had a way to get the page locally, I needed a way to get rid of all the html code and just get the Artist and Title that corresponds to a specific time. I figured the best way to start was to use good old grep. I chose a random time that was available on the playlist and tried piping the curl command to it:
curl "http://www.mediabase.com/whatsong/whatsong.asp?var_s=075069068074045070077" | grep "11:30 AM"
It worked… almost. The line containing the time was returned to me, but it turns out that the time for a song sits on the line before the one that holds its artist and title. This means that when I find the line that contains the time for the song I want, I actually need the next line to see the artist and song title. The only way I could figure out to get that extra line returned with grep was to run the command:
curl "http://www.mediabase.com/whatsong/whatsong.asp?var_s=075069068074045070077" | grep -A 1 "11:30 AM"
The -A 1 argument tells grep to return the matching line, and one extra line after that. So now I had the artist and the title, but all this other html garbage as well. Since I didn’t need the first line that contained the time, I figured the next logical step was to get rid of it. I only wanted the second line returned by grep. How did I do it? Like this:
curl "http://www.mediabase.com/whatsong/whatsong.asp?var_s=075069068074045070077" | grep -A 1 "11:30 AM" | tail -n1
This command gets the web page and pipes the output to grep, which returns the line containing the time and the next line, which contains the artist name and song title. Then it pipes those two lines of output to the tail command. The “tail -n1” command shows the last “n” lines of whatever I pipe to it. In my case, n=1. Therefore, tail will cut out everything but the last line.
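Here is a minimal sketch of that grep and tail step run against a small made-up sample instead of the live page. The sample_page function and its markup are hypothetical stand-ins that only imitate the shape described above (time on one line, artist and title on the next); the real page will look different.

```shell
#!/bin/bash
# line_after_time TIME: read HTML on stdin and print the line that follows
# the line containing TIME (grep -A 1 keeps the match plus one line after it,
# tail -n1 keeps only the last of those lines).
line_after_time() {
    grep -A 1 -- "$1" | tail -n1
}

# Hypothetical sample page, NOT the real MediaBase markup.
sample_page() {
    cat <<'EOF'
<tr><td>11:26 AM</td>
<td nowrap><span class=blackMain11px>Artist One</td><TD>&nbsp;</TD><td nowrap><span class=blackMain11px>Song One</td>
<tr><td>11:30 AM</td>
<td nowrap><span class=blackMain11px>Artist Two</td><TD>&nbsp;</TD><td nowrap><span class=blackMain11px>Song Two</td>
EOF
}

# Prints the artist/title line that follows the "11:30 AM" line.
sample_page | line_after_time "11:30 AM"
```

One thing to keep in mind: if the same time string appears more than once on the page, tail -n1 keeps only the line after the last match.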
Now that I had only the line I needed, I decided to use sed to parse out the information I wanted. I found this awesome tutorial for sed during an earlier project and it proved to be a huge help with this project as well. This part took me a while. I’m not very good with sed, but eventually I found a solution that worked for me. I piped all the output from the first commands to this one:
sed 's/<td nowrap><span class=blackMain11px>//'
This command finds the text “<td nowrap><span class=blackMain11px>” (which is just some code in the page) and deletes it. I then pipe that output to the command:
sed 's/<\/td>/\n/'
This command finds the first part of the text that matches “</td>” and replaces it with the newline character (\n). Why did I do this? The first sed command removes everything before the artist name. The second sed command splits the one line into two lines. The first line will contain the artist name only. The second line contains all of the html code that I don’t care about. Now I can simply pipe that output to:
head -n1
The head command works opposite the tail command. “head -n1” outputs the first ‘n’ lines of the input. In this case, n=1 so head outputs one line only. The end result is the artist name and nothing else! Now I just had to do something similar to retrieve the song name. The full command I used for this was:
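To make the artist steps easier to try, here is a small sketch that wraps them in a function and feeds it one made-up line. The function name extract_artist and the sample line are hypothetical; only the two sed expressions and the head call come from the steps above.

```shell
#!/bin/bash
# extract_artist: read the artist/title line on stdin and print the artist.
# Step 1 deletes the markup in front of the artist name, step 2 turns the
# first "</td>" into a line break, step 3 keeps only the first line.
# Note: "\n" in the replacement is understood by GNU sed; other sed
# implementations may handle it differently.
extract_artist() {
    sed 's/<td nowrap><span class=blackMain11px>//' |
        sed 's/<\/td>/\n/' |
        head -n1
}

# Hypothetical line; the real page's markup may differ.
line='<td nowrap><span class=blackMain11px>Artist Two</td><TD>&nbsp;</TD><td nowrap><span class=blackMain11px>Song Two</td>'
printf '%s\n' "$line" | extract_artist
```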
cat savedPage.html | grep -A 1 "$1" | tail -n1 | sed 's/D><td nowrap><span class=blackMain11px>/\n/' | tail -n1 | sed 's/<\/td>/\n/' | head -n1
I won’t bother explaining this whole thing because it is the exact same idea as the last command. The only difference is that I removed all of the text before the song title, and split the one line right after the song title. I now have a way to retrieve the Artist name and Song title of any song as long as I know the time the song was played. I then created a bash script to do all of this for me. The script code looks like this:
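As a possible variation (this is not the command used above), the same two fields can be pulled out by deleting the unwanted text instead of splitting the line, which avoids relying on "\n" in a sed replacement. The function names and the sample line are hypothetical, and the sketch assumes the song marker appears only once on the line.

```shell
#!/bin/bash
# Variation on the pipelines above: delete what is not wanted rather than
# split on a newline and pick a line with head/tail.

# Artist: drop the leading markup, then drop everything from the first "</td>".
extract_artist_by_delete() {
    sed -e 's/<td nowrap><span class=blackMain11px>//' -e 's/<\/td>.*//'
}

# Song: drop everything up to and including the marker in front of the title,
# then drop everything from the first "</td>" that follows.
extract_song_by_delete() {
    sed -e 's/.*D><td nowrap><span class=blackMain11px>//' -e 's/<\/td>.*//'
}

# Hypothetical line; the real page's markup may differ.
line='<td nowrap><span class=blackMain11px>Artist Two</td><TD>&nbsp;</TD><td nowrap><span class=blackMain11px>Song Two</td>'
printf '%s\n' "$line" | extract_artist_by_delete
printf '%s\n' "$line" | extract_song_by_delete
```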
#!/bin/bash
#----The next three lines are to be used for checking the time in a later revision----
h=$(date +%l)
m=$(date +%M)
echo "time: $h:$m"
#--Get the web page and save it in a file called "savedPage.html"
curl "http://www.mediabase.com/whatsong/whatsong.asp?var_s=075069068074045070077" > savedPage.html
#-----Get the Artist-------
cat savedPage.html | grep -A 1 "$1" | tail -n1 | sed 's/<td nowrap><span class=blackMain11px>//' | sed 's/<\/td>/\n/' | head -n1
#-----Get the Song---------
cat savedPage.html | grep -A 1 "$1" | tail -n1 | sed 's/D><td nowrap><span class=blackMain11px>/\n/' | tail -n1 | sed 's/<\/td>/\n/' | head -n1
This script will simply output the artist name on one line, and the song title on the other line. I saved the webpage to a file first to avoid having to retrieve it twice. This lowers the run time of the script.
That’s all I have for now. The next step should probably be to set up the script with a fudge factor. It is very unlikely that my computer is set to the exact same time as the playlist computer’s clock. This means that I will never have the exact time the song was played. The script must look at the time I was listening and find the song with the closest time. Also, it takes about 20 minutes for the playlist to update after each song, so the script will have to receive my request and then wait before retrieving the song. After that, I will have to set up the e-mail portion of the script. This is the part that will actually allow me to send a text to my computer.
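As a rough idea of how the fudge factor could work, here is a sketch that picks the playlist time closest to a requested time. The function names to_minutes and closest_time are hypothetical, the times are assumed to look like "11:30 AM", and it does not handle times that wrap around midnight.

```shell
#!/bin/bash
# to_minutes "H:MM AM|PM": convert a 12-hour clock time to minutes since midnight.
to_minutes() {
    local time=$1 h m ampm
    h=${time%%:*}            # text before the colon, e.g. "11"
    m=${time#*:}             # text after the colon, e.g. "30 AM"
    m=${m%% *}               # keep only the minutes, e.g. "30"
    ampm=${time##* }         # "AM" or "PM"
    h=$((10#$h % 12))        # 10# forces base 10; 12 o'clock becomes 0
    if [ "$ampm" = "PM" ]; then
        h=$((h + 12))
    fi
    echo $((h * 60 + 10#$m))
}

# closest_time TARGET: read candidate times (one per line) on stdin and
# print the one nearest to TARGET. On a tie the earlier line wins.
closest_time() {
    local target best="" best_diff="" t diff
    target=$(to_minutes "$1")
    while IFS= read -r t; do
        diff=$(( $(to_minutes "$t") - target ))
        if [ "$diff" -lt 0 ]; then
            diff=$(( -diff ))
        fi
        if [ -z "$best" ] || [ "$diff" -lt "$best_diff" ]; then
            best=$t
            best_diff=$diff
        fi
    done
    echo "$best"
}

# Made-up playlist times; prints the one nearest to 11:29 AM.
printf '%s\n' "11:22 AM" "11:26 AM" "11:30 AM" | closest_time "11:29 AM"
```

The time that comes back could then be handed to the script as its first argument in place of an exact time.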
So far I’m really happy with how fast this is coming along. I expect it to be finished very soon.
-----------------Previously found in the project section-------------------
I thought of this project a few days ago. I was driving in my car and they played this new song on the radio (The Edge 103.9). I loved it and wanted to find it on my computer when I got home to listen again, but I had missed the names of the song and band. This happens a lot while driving. Normally in that situation I will have to send a text message to myself of some lyrics that I remember so I can Google them later and try to find the song. Well this project aims to prevent this problem.
I want to write a script that will sit on my server and look for incoming e-mails with a specific subject line coming from my cell phone. When it finds one of these e-mails it will go to The Edge’s web site and look at what song is playing at, or around, that time. When it finds the artist and song title it will save that to a “playlist” of sorts. I can then come home later and look at the list to see all of the songs I want to find.
This should work because The Edge keeps a list of all the songs that they play. Every time they play a song, the list gets updated. I’m thinking I can use cURL to check the website and Bash to do most of the other scripting. Maybe I should use Perl instead to make it more portable? We’ll see. I’ve already started this project, but I’ll post my progress in a blog post later.