This week I entered a whole new world of relational databases, the art of SQL, Boolean logic and the need for efficient, relevant search results from online sources. Hmm, the memorable Kupor quote regarding ‘drinking from the fire hydrant’ from Ali’s blog last week immediately came to mind. So much to absorb and so little time… hence bite-size pieces and context are needed. So for this entry, I’d like to write briefly about information retrieval in terms of image search technology, linking in my interest in image research and my volunteer work at the British Library.
A picture is worth a thousand words….
Image searching uses algorithms to search for features of still and moving images rather than relying on text indexing. Generally, in order for an image to be findable, it needs to be described in some way and needs metadata – but what happens if the image has no accurate metadata? This is a question that has arisen for me when tagging images from the British Library photostream, many of which come with little or no information, for creative research projects.
By way of background, image retrieval methods range from a concept-based (or text-based) approach, where keywords or metadata are used, to a content-based approach, where the image content itself (such as shape, colour or texture) is used to provide the ‘map’ for searching. In recent research comparing methods of image retrieval, content-based systems are leading the field in the attempt to ‘bridge the semantic gap,’ as,
‘the starting point of the retrieval process is typically the high level query from a human. Translating or converting the question posed by a human to the low level features seen by the computer illustrates the problem in bridging the semantic gap.’ (Lew et al. 2002)
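To make those ‘low level features’ a little more concrete for myself, here is a minimal sketch of content-based matching on colour alone. It is only an illustration under my own assumptions: the images are tiny hand-made lists of RGB pixels, the function names are hypothetical, and real CBIR systems combine many more features (shape, texture and so on) than this.

```python
from collections import Counter

def colour_histogram(pixels, bins_per_channel=4):
    """Quantise each RGB pixel into a coarse bin; return normalised bin frequencies."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity between 0 and 1: the summed overlap of the two histograms."""
    return sum(min(share, h2.get(bin_, 0.0)) for bin_, share in h1.items())

def rank_by_colour(query_pixels, library):
    """Return (score, name) pairs for every library image, best match first."""
    query_hist = colour_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, colour_histogram(pixels)), name)
        for name, pixels in library.items()
    ]
    return sorted(scored, reverse=True)

# Toy 'images' (hypothetical data): mostly orange, mostly blue, and an orange query.
sunset = [(250, 120, 30)] * 6 + [(40, 20, 60)] * 2
sea = [(20, 80, 200)] * 7 + [(240, 240, 240)] * 1
query = [(245, 110, 20)] * 5 + [(30, 30, 50)] * 3

ranking = rank_by_colour(query, {"sunset": sunset, "sea": sea})
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```

The computer never ‘knows’ that the query is a sunset; it only sees that the proportions of colour bins overlap, which is exactly the semantic gap the quote describes.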
Various computer vision and image identification tools have evolved for content-based image retrieval (CBIR), also known as query by image content (QBIC) or content-based visual information retrieval (CBVIR). TinEye is a reverse image search engine developed in 2008.
As it returns information on where a user’s selected image appears on the web, it has significant uses for improving metadata and, in the copyright world, both for spotting potential infringement and for managing ‘orphan works.’
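I don’t know how TinEye matches images internally, so the following is not its method. It is a sketch of one simple, widely described technique, the ‘average hash’, which shows how a copy of an image might still be found after it has been lightened or darkened. The brightness grids, the example.org addresses and the function names are all hypothetical, and I assume each image has already been shrunk to a small greyscale grid.

```python
def average_hash(grey):
    """grey: 2D list of brightness values (0-255) for an already-shrunk image.
    Returns a tuple of bits: 1 where a cell is brighter than the grid's mean."""
    cells = [value for row in grey for value in row]
    mean = sum(cells) / len(cells)
    return tuple(1 if value > mean else 0 for value in cells)

def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))

def find_matches(query_grey, index, max_distance=2):
    """Return (distance, url) pairs for indexed images close to the query."""
    query_hash = average_hash(query_grey)
    hits = [(hamming_distance(query_hash, h), url) for url, h in index.items()]
    return sorted(hit for hit in hits if hit[0] <= max_distance)

# Hypothetical 4x4 greyscale grids standing in for images found on the web.
original = [[10, 20, 200, 210], [15, 25, 205, 215]] * 2
unrelated = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]

# A small 'index' mapping (made-up) page addresses to precomputed hashes.
index = {
    "https://example.org/blog/post-1": average_hash(original),
    "https://example.org/gallery/other": average_hash(unrelated),
}

# The query is a lightened copy of the original: every cell is 30 brighter.
lightened = [[value + 30 for value in row] for row in original]
matches = find_matches(lightened, index)
print(matches)
```

Because the hash records only which cells are brighter than the image’s own average, an overall change in brightness leaves it unchanged, so the lightened copy still points back to the page holding the original.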
Following on from TinEye, Google Images launched its own reverse image search facility in 2011, built directly into its image search bar.
In terms of comparison, Stephen Chapman’s ZDNet article claims that Google’s ‘vast reach’ greatly outperforms TinEye; however, further debates online reveal a loyal following for TinEye regarding accuracy and sorting options.
Further application of reverse image retrieval in the British Library …
Very recently, the British Library worked in partnership with the Technology Strategy Board to challenge software developers to produce a tool that could measure or assess the impact of releasing its digital content into the public domain. For example, how were the one million Flickr images from its collection of 19th century books digitised by Microsoft being utilised?
Enter Peter Balman, the developer who won the competition with an idea for a tool that searches for the British Library’s digital content on the web and gives a detailed breakdown of where, how and by whom it is being used. Named ‘Visibility,’ this project could help the Library make choices about targeting users by releasing similar content, and could encourage further use and deeper engagement within these groups.
A link to his project is here (NB: playback is good in IE, but I had problems in Google Chrome).
Until next week, kind viewers…