Published in Application Programming Interfaces on Wednesday, November 9th, 2005
Looking at Yahoo!'s APIs, you can see a little of that "openness" that Yahoo! CEO Terry Semel referred to at 2005's Web 2.0 conference. From images to movies to maps and search, they offer a lot of data through their APIs.
Note: The script available at the end of the article was updated on 2005-11-16 to fix a small undefined index notice.
Yahoo! provides APIs for many of the services that it offers, all housed under the Yahoo! Developer Network. These include Flickr, Maps, Music, Search, Search Marketing, Shopping (including Shopping User Product Reviews), Travel and Konfabulator Widgets. They also provide the ability to create customizable RSS feeds.
This post will examine Yahoo! Search Web Services, an API that allows developers to integrate Yahoo!'s search functionality into their web sites and applications.
The goal of this article will be to query the Yahoo! API server, and then to process the data once we have it. At the end of the article I have linked up a sample script which does just that.
Yahoo!'s search API is quite a bit more complete than that offered by Google (which we will be looking at in a later post). Not only can you make more daily queries (up to 5000), but you can also get up to 100 results in one shot, request data from a start position (good for paginating results) and tighten your searches using other advanced search parameters.
As with most APIs, there are restrictions that must be followed. For the example that will be rolled out here, an integrated site search, we are within the bounds of the API Terms of Service, but be aware that you cannot use the API for commercial purposes without permission from Yahoo!.
Yahoo! has a demo API key that they use in a few examples on their site, but if you are going to play around with the examples, I would recommend that you head over and get an Application ID and the developer kit, which provides some examples in PHP penned by none other than Rasmus Lerdorf.
The Yahoo! Search Web Services are all REST services. In this case, this simply means that we need to make a GET request to their API server, passing it our API key along with any other parameters that outline the data we want to receive.
That request is structured according to the rules Yahoo! has set out, and the data returned is structured in a format that they have determined (in this case, XML).
Let's look at the base of a request URI:
http://api.search.yahoo.com/WebSearchService/V1/webSearch?
Starting from the hostname, they add the service name (WebSearchService) and version number followed by the method (webSearch) that will be used. From there we can add query parameters based on their rules for request parameters. Be aware that any values passed through the URI must be URL-encoded.
Yahoo! provides the following example URL, which passes the API key YahooDemo and asks for 2 results containing the term madonna:
http://api.search.yahoo.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna&results=2
Now let's look at building our own request. One very common way of accomplishing this is to make an associative array whose keys are the parameter names and whose values are the parameter values, as seen in lines 1 - 7 below.
Holding the base URI in another variable, we can pass both to a URI-building function that returns a complete URI for our purposes. See the code below for an example.
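A minimal sketch of that approach, not the article's own listing, might look like this. The parameter array sits on the first seven lines, and build_request_uri() is a hypothetical helper name chosen for illustration; the base URI and the YahooDemo values come from Yahoo!'s example above.

```php
<?php
// Request parameters: names are the keys, values are the values.
$params = array(
    'appid'   => 'YahooDemo', // demo key from Yahoo!'s example
    'query'   => 'madonna',
    'results' => 2,
);

// Base of the request URI, ending in "?" so parameters can be appended.
$base = 'http://api.search.yahoo.com/WebSearchService/V1/webSearch?';

// Hypothetical helper: URL-encode every name and value, join the pairs
// with "&" and append them to the base URI.
function build_request_uri($base, $params)
{
    $pairs = array();
    foreach ($params as $key => $value) {
        $pairs[] = urlencode($key) . '=' . urlencode((string) $value);
    }
    return $base . implode('&', $pairs);
}

$request = build_request_uri($base, $params);
```

With these values, $request should match the madonna example URL shown earlier. PHP 5 also ships http_build_query(), which does the encode-and-join step in a single call.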
Note: We could expand on this by writing a function that builds the parameter array for us, grabbing values from GET variables passed through our own search form. For example, it could take site.com/search?p=mysql&num=20&filetype=pdf and, with the right coding, build the correct Yahoo!-specific request. For this post, I am keeping it simple.
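For the curious, here is a rough sketch of what such a translation function could look like. Everything in it is an assumption made for illustration: the helper name params_from_query_string(), the mapping of our own p, num and filetype fields, and in particular the Yahoo! parameter name format for the file type, which should be checked against Yahoo!'s request parameter documentation. The cap of 100 results comes from the limit mentioned earlier.

```php
<?php
// Hypothetical helper: turn our own query string (p, num, filetype)
// into a Yahoo!-style parameter array.
function params_from_query_string($queryString, $appid)
{
    // Split "p=mysql&num=20" into an array, much like PHP fills $_GET.
    parse_str($queryString, $get);

    $params = array(
        'appid' => $appid,
        'query' => isset($get['p']) ? $get['p'] : '',
    );
    if (isset($get['num'])) {
        // Keep the requested count between 1 and 100 results.
        $params['results'] = max(1, min(100, (int) $get['num']));
    }
    if (isset($get['filetype'])) {
        // ASSUMPTION: "format" is the Yahoo! parameter for the file type.
        $params['format'] = $get['filetype'];
    }
    return $params;
}

$params = params_from_query_string('p=mysql&num=20&filetype=pdf', 'YahooDemo');
```

On a live page you would read $_GET directly instead of parsing a string; the string form just keeps the sketch easy to try from the command line.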
Given the code above, we can now easily build a request URI to query the Yahoo! Search API server.
So far so good? Cool. Now, let's get some data!
Now that we can build a request for some data, we need a function that sends the request and fetches the file.
As we are coding in PHP, our approach will be to open the file with fopen(), and then fread() the data into a variable, as outlined below:
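A minimal sketch of that fopen() and fread() approach follows. It is an illustration rather than the article's own listing, and fetch_response() is a hypothetical name. Note that opening a remote URI with fopen() only works when allow_url_fopen is enabled in php.ini.

```php
<?php
// Hypothetical helper: open $uri, read it chunk by chunk into a string
// and return it. Returns false if the resource cannot be opened.
function fetch_response($uri)
{
    // The @ suppresses PHP's warning; we report failure via the return value.
    $handle = @fopen($uri, 'r');
    if ($handle === false) {
        return false;
    }

    $xml = '';
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break; // read error: stop and return what we have so far
        }
        $xml .= $chunk;
    }
    fclose($handle);

    return $xml;
}

// Usage, given a request URI built earlier:
// $xml = fetch_response($request);
```

Reading in a loop matters because a single fread() call on a network stream may return less than the whole document.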
Okay, we've managed to build a request, access the resulting file and read it into a variable. Now may be a good time to have a quick look at what the response was from the Yahoo! API server.
I've included the response to the madonna example query below so that we can have a good look at it:
Let's look at the opening tag, ResultSet. For our purposes, we want to pay attention to the last three attributes, totalResultsAvailable="3610652" totalResultsReturned="2" firstResultPosition="1". Those attributes are fairly self-explanatory, and will be used when we present our results.
The next section is a series of results, in this case two, each of which has a title, a summary, a url and a ClickUrl. Yahoo! likes you to use the ClickUrl when you use their results in a system, so that they can track the usage.
And that is about it. Not too complicated, no?
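To make that structure concrete, here is a small sketch that reads those attributes and fields from a hand-written sample shaped like the response just described. Two caveats: it uses PHP 5's built-in SimpleXML purely for illustration, which is not the parsing route this article takes, and the sample itself is a stand-in. The attribute values are the ones quoted above, but the titles, URLs and the capitalised element names (Title, Summary, Url) are placeholders and assumptions, not real Yahoo! output.

```php
<?php
// Hand-written stand-in for a response; not actual Yahoo! data.
$xml = <<<XML
<ResultSet totalResultsAvailable="3610652" totalResultsReturned="2" firstResultPosition="1">
  <Result>
    <Title>Example title one</Title>
    <Summary>Placeholder summary one.</Summary>
    <Url>http://www.example.com/one</Url>
    <ClickUrl>http://www.example.com/click?one</ClickUrl>
  </Result>
  <Result>
    <Title>Example title two</Title>
    <Summary>Placeholder summary two.</Summary>
    <Url>http://www.example.com/two</Url>
    <ClickUrl>http://www.example.com/click?two</ClickUrl>
  </Result>
</ResultSet>
XML;

$resultSet = simplexml_load_string($xml);

// Attributes of the opening tag are read with array syntax.
$total    = (int) $resultSet['totalResultsAvailable'];
$returned = (int) $resultSet['totalResultsReturned'];
$first    = (int) $resultSet['firstResultPosition'];

// Child elements are read with property syntax; collect each ClickUrl.
$links = array();
foreach ($resultSet->Result as $result) {
    $links[] = (string) $result->ClickUrl;
}
```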
At this point we hit a bit of a crossroads. As we are using PHP in this example, we can do one of three things to parse the resulting XML document:
Getting into either of numbers 1 or 2 would be a bit much for this article, in my opinion. It is already long enough, and I'm sure people would rather get to the meat and play with the API, so we are going to use an external library to unserialize the returned XML document into an array.
There are a couple of libraries out there, for example minixml and Keith Devens' PHP XML library. For this article, I'll be using Keith's library, as it is a smaller, one file include (and open source).
It is worth noting that this library uses the SAX engine (#1 from above) to get its work done, which consists of serializing XML from an array and vice versa.
This is quite simple with Keith's library. From Step 2 above, we already have our data held in a variable called $xml, so now all we have to do is pass it to the XML_unserialize($xml) function and, as seen in the example below, our data will be held in an array called $data (note that this example builds on the function from Step 2 above):
Here is a look at print_r($data) after running the madonna search through the above code. As you will see, data for the search numbers is held in $data['ResultSet attr']. We can access our search results via $data['ResultSet']['Result']:
Now we have our data, and simply need to process the array into the format or markup that we desire. This little bit I'm not going to cover here, though I do offer an example in the file available at the end of this document.
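As a starting point, here is one hedged sketch of that processing step. The $data array is a hand-written stand-in with the shape described above ($data['ResultSet attr'] for the numbers, $data['ResultSet']['Result'] for the results); its values are placeholders, the per-result keys Title, Summary, Url and ClickUrl are assumed, and render_results() is a hypothetical helper, not the code from the sample script.

```php
<?php
// Hand-written stand-in for the unserialized array; values are placeholders.
$data = array(
    'ResultSet attr' => array(
        'totalResultsAvailable' => '3610652',
        'totalResultsReturned'  => '2',
        'firstResultPosition'   => '1',
    ),
    'ResultSet' => array(
        'Result' => array(
            array(
                'Title'    => 'Example <one>',
                'Summary'  => 'Placeholder summary one.',
                'Url'      => 'http://www.example.com/one',
                'ClickUrl' => 'http://www.example.com/click?one',
            ),
            array(
                'Title'    => 'Example two',
                'Summary'  => 'Placeholder summary two.',
                'Url'      => 'http://www.example.com/two',
                'ClickUrl' => 'http://www.example.com/click?two',
            ),
        ),
    ),
);

// Hypothetical helper: build a "Results x - y of z" line plus an ordered
// list of links, escaping every value before it goes into the markup.
function render_results($data)
{
    $attr  = $data['ResultSet attr'];
    $first = (int) $attr['firstResultPosition'];
    $last  = $first + (int) $attr['totalResultsReturned'] - 1;

    $html = '<p>Results ' . $first . ' - ' . $last . ' of '
          . (int) $attr['totalResultsAvailable'] . "</p>\n<ol>\n";
    foreach ($data['ResultSet']['Result'] as $result) {
        $html .= '<li><a href="' . htmlspecialchars($result['ClickUrl'], ENT_QUOTES) . '">'
               . htmlspecialchars($result['Title'], ENT_QUOTES) . '</a><br>'
               . htmlspecialchars($result['Summary'], ENT_QUOTES) . "</li>\n";
    }
    return $html . "</ol>\n";
}

$html = render_results($data);
```

The links point at ClickUrl rather than Url, following Yahoo!'s preference noted earlier.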
Now that we have gone through these explanations, let's look at what we need to do to build a site search feature with the Yahoo! Search API:
Here is an example script that pulls this whole article together and accomplishes the list outlined above. When using, remember:
Obviously there are many more things that can be done with this Yahoo! API. You can add features into the sample script by simply adding them to the form as options which get dumped into the $params array, or more simply by adding them directly to that array in the code.
Please keep in mind that I haven't done any cleaning of the user input search string. If you do use this code and plan on echoing the search terms back to the user, be sure to clean the input first.
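A minimal sketch of that cleaning step, assuming the term is echoed back into an HTML page; clean_for_display() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: trim the raw search term and escape the characters
// that have special meaning in HTML, so it is safe to echo into a page.
function clean_for_display($raw)
{
    return htmlspecialchars(trim((string) $raw), ENT_QUOTES, 'UTF-8');
}

// Usage on a results page, assuming the form field is named "p":
// echo 'You searched for: ' . clean_for_display($_GET['p']);
```

This only covers output to HTML. The value sent to Yahoo! still needs URL-encoding when the request URI is built.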
Some other possibilities exist as well. Obviously paginating the data is possible, and for some situations, like a site search, one may want to filter out home pages and other pages that may have new data on them since Yahoo! last crawled the site being queried.
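For pagination, the arithmetic is small enough to sketch. This assumes result positions are 1-based, as the firstResultPosition="1" attribute suggests, and that the start position is sent in a parameter named start, which is an assumption to verify against Yahoo!'s documentation. Both function names are hypothetical.

```php
<?php
// Hypothetical helper: 1-based start position for a given page number.
function start_for_page($page, $perPage)
{
    return ($page - 1) * $perPage + 1;
}

// Hypothetical helper: number of pages needed to show $total results.
function page_count($total, $perPage)
{
    return (int) ceil($total / $perPage);
}

// Example: page 3 at 10 results per page.
$params = array(
    'appid'   => 'YahooDemo',
    'query'   => 'madonna',
    'results' => 10,
    'start'   => start_for_page(3, 10), // ASSUMPTION: parameter is named "start"
);
```

The totalResultsAvailable value from the response is the natural input for page_count() when building "next" and "previous" links.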
Over at Using Wikipedia and the Yahoo API to give structure to flat lists, they have documented an interesting approach to cleaning up their data by using the Yahoo! API and a site-specific search. Great stuff.
This was my first crack at a longer technical post here on Fiftyfoureleven.com, so apologies if some things aren't very clear. Please feel free to ask away in the comments. Ditto if I've made an error somewhere!
I've already noticed some limitations of this new design, so I'm hoping to have a widescreen alternate stylesheet for code viewing ready for next week.
Next week will see a double attack of the Google (Wednesday) and MSN (later) search APIs, after which we'll try and move into some other juicier offerings and also deal with request caching, among other things.
Comments and Feedback
Mmmmmm...Y! API's. Mike, this is a fantastic piece you've put together.
This is very long indeed. I'll let some of the engineers on the search team know that you've put this together.
Long, yeah.
Lots to stick in there, and so much more that could be done as well! I hope that the code file at the end at least makes it worthwhile :-) Be interesting to hear what the insiders have to think...
Great piece of work. Will surely try this out. Looking forward to next Wednesday.
Thanks Kaushal, I hope you find it useful! I love that header image in Blue Horizon, btw. Very nice...
Thank you, I love playing with colours and images. I am here to rock the WordPress world, you just wait and watch. Still learning but will get the hang of it in 3-4 months :)