How To Create An Image Search Engine?
While crawling the web and following links, you should track img tags and links that point to images. Most images on the web are JPEGs, GIFs and PNGs, so you can detect them by file extension.
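As a rough sketch of that step, the snippet below collects image URLs from one page using only Python's standard library. The names `IMAGE_EXTENSIONS`, `is_image_url`, `ImageLinkCollector` and `find_images` are made up for this illustration, and a real crawler would also have to deal with images served without a telltale extension.

```python
from html.parser import HTMLParser
from posixpath import splitext
from urllib.parse import urljoin, urlparse

# Extensions mentioned in the text; extend as needed (assumption).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def is_image_url(url):
    """Return True if the URL path ends in a known image extension."""
    path = urlparse(url).path  # drops the query string and fragment
    return splitext(path)[1].lower() in IMAGE_EXTENSIONS


class ImageLinkCollector(HTMLParser):
    """Collects the src of every img tag and every link that points to an image."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            # An img tag is taken as an image regardless of extension.
            self.image_urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            # A plain link only counts if its extension looks like an image.
            url = urljoin(self.base_url, attrs["href"])
            if is_image_url(url):
                self.image_urls.append(url)


def find_images(html, base_url):
    """Return absolute image URLs found in the page, in document order."""
    collector = ImageLinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.image_urls
```

Calling `find_images(page_html, page_url)` on each crawled page gives the list of candidate images to index.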
Now what metadata can you find about a picture? You can find information in the file name, in the alt attribute, in the link description, in the text near the image and in the context of the page. Although the text that surrounds the image is sometimes relevant, nothing guarantees that the image doesn't illustrate a very small detail or something only loosely related to the topic of the page.
To see which image search engine is best, I tested 5 big search engines (OK, Flickr is not a search engine, but it has great photos) against 20 searches.
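One way to combine those sources is to give each a weight and add up the weights for every term. The sketch below is only an illustration: the weights in `WEIGHTS` are guesses (the surrounding text gets the lowest one because it is the least reliable), and `tokenize` and `score_terms` are hypothetical helper names.

```python
import re
from posixpath import basename, splitext
from urllib.parse import unquote, urlparse

# Hypothetical weights: nearby text counts least because it is the least reliable.
WEIGHTS = {"alt": 3.0, "filename": 2.0, "link_text": 2.0, "nearby_text": 1.0}


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_terms(url, alt="", link_text="", nearby_text=""):
    """Return a dict mapping each term to the summed weight of the fields it appears in."""
    # File name without directory or extension, e.g. "robbie_williams-01".
    filename = splitext(basename(unquote(urlparse(url).path)))[0]
    fields = {
        "alt": alt,
        "filename": filename,
        "link_text": link_text,
        "nearby_text": nearby_text,
    }
    scores = {}
    for field, text in fields.items():
        # Count each term once per field so repetition does not inflate the score.
        for term in set(tokenize(text)):
            scores[term] = scores.get(term, 0.0) + WEIGHTS[field]
    return scores
```

For an image at `http://example.com/img/robbie_williams-01.jpg` with the alt text "Robbie Williams on stage", the terms "robbie" and "williams" score highest because they appear in both the file name and the alt attribute.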
paint shop pro
solaris [the book, the film, the operating system]
New York attacks
Robbie Williams concert
I looked mainly at the first result, but I took the others into consideration if they were great.
picsearch.com : 17/20
Probably the best image search. It doesn't have many results, but the sources are carefully selected.
(bad results for: new york attacks, shallow, tough argument)
Hilarious results, not-safe-for-work results; logos usually come first.
(bad results for: fear, tough argument, freebsd, new york attacks, gnomedex)
Not as good as the web search.
(bad results for: fear, shallow, unrequited love, paint shop pro, freebsd, new york attacks)
Ask has really good related searches.
(bad results for: fear, shallow, unrequited love, tough argument, paint shop pro, freebsd)
Where is the search button? The tags are great, but they restrict the search queries.
(bad results or no results for: shallow, unrequited love, laughing child, tough argument, gnuplot, paint shop pro, openoffice, new york attacks, india tsunami, gnomedex, robbie williams concert, oscars 2005)