The BirdPix Project



Executive Summary

This project aims to produce a web application that can be used to describe, store, and search images. While this technology will be developed in as general a form as possible, it will also be applied specifically to the realm of bird pictures. The core technology will be a wizard-like image submission and review module that asks questions in plain English to generate RDF (Resource Description Framework) descriptors associated with the image. This RDF will then be exported for use by other web agents, including other instances of this application. If you are interested in helping out, please join our SourceForge mailing lists. A high-level design (work in progress!) is available here.
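
To make the idea of turning wizard answers into RDF a little more concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: the vocabulary namespace, the property names (species, behavior, lens), and the image URI are all hypothetical, and the output is written by hand as N-Triples rather than through any RDF toolkit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: turn wizard answers into RDF statements in N-Triples syntax. */
public class DescriptorSketch {

    // Hypothetical vocabulary namespace; the project has not defined one.
    static final String VOCAB = "http://example.org/birdpix/terms#";

    /** Formats one triple whose object is a plain string literal. */
    static String triple(String subjectUri, String predicateUri, String literal) {
        String escaped = literal
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
        return "<" + subjectUri + "> <" + predicateUri + "> \"" + escaped + "\" .";
    }

    /** Maps each answered question (keyed by a hypothetical property name) to one triple. */
    static List<String> describe(String imageUri, Map<String, String> answers) {
        List<String> triples = new ArrayList<String>();
        for (Map.Entry<String, String> e : answers.entrySet()) {
            String value = e.getValue();
            if (value == null || value.trim().isEmpty()) {
                continue; // unanswered question: emit nothing rather than an empty literal
            }
            triples.add(triple(imageUri, VOCAB + e.getKey(), value.trim()));
        }
        return triples;
    }

    public static void main(String[] args) {
        Map<String, String> answers = new LinkedHashMap<String, String>();
        answers.put("species", "Helmitheros vermivorus"); // "What species is shown?"
        answers.put("behavior", "perching");              // "What is the bird doing?"
        answers.put("lens", "");                          // question skipped by the user
        for (String t : describe("http://example.org/birdpix/images/42", answers)) {
            System.out.println(t);
        }
    }
}
```

Each answered question becomes one statement about the image, and a skipped question simply produces no statement. A real implementation would presumably use typed values and resource URIs rather than plain literals.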

Genesis

Besides being a Java programmer, I am also a bird watcher and an amateur photographer with a strong preference for photographing birds and wildlife. Several months ago, I was contemplating the rather large stack of unlabeled, uncategorized pictures I have accumulated that really should be put in photo albums. I really wanted the photos to be organized, and I even had an empty photo album at the ready. Yet the photos remained in the box and the album remained empty.

I thought back on all the different times I had looked at the box and dreamed up ways to organize the pictures: by photographic characteristics (lens, lighting, film, shutter, etc.), so that I could use them to learn how to take better pictures; by bird species, to form a sort of self-made photographic life list or identification guide; by trip or excursion, for the memory of the fun I had taking them... I always seemed to think of a new way. But each way excluded the others unless I built a meticulous database of cross-references. Then there was the aspect of sharing the pictures with friends... Albums are heavy, and most online photo services like WebShots have limited space. I have a web server, and I had regularly found myself contemplating a web site full of pictures, but as soon as I added scanning, uploading, and authoring pages to the sorting, collating, and putting pictures in albums (which I wasn't doing), the amount of work just seemed unreasonable.

Then, as I stood there, my thoughts began to wander. I thought about pictures of birds on the web, and times when I had tried to find them. It can take quite a bit of browsing to find a picture that has a particular view or shows a particular behavior. Most sites have 1-6 pictures on a page. Using Google image search helps speed things up, but you get duplication, thumbnails whose originals turn out not to be available, and, even worse, results like this page, which came up in a search for "worm-eating warbler" simply because of a comment on the page that says:

It can be a challenge to distinguish between the various tchagras unless good views are obtained; all have head patterns reminding me of large versions of the Worm-eating Warbler Helmitheros vermivorus in North America.

This mismatch also points to another weakness of current technology. Let's say you want a picture of a warbler eating a worm (yes, I know that warblers don't actually consume Annelids... that isn't the point). You would probably never find it, even if it existed, because you would get large numbers of pictures of this particular species, whose name (incorrectly) happens to match the behavior.
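
The difference between matching words and matching described content can be sketched in a few lines of Java. This is a toy illustration, not a design: the Picture class, its species and behavior fields, and the two sample pictures are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Sketch: keyword search on captions versus search on a structured field. */
public class SearchSketch {

    /** A picture with a free-text caption plus two hypothetical structured fields. */
    static class Picture {
        final String caption;
        final String species;
        final String behavior;

        Picture(String caption, String species, String behavior) {
            this.caption = caption;
            this.species = species;
            this.behavior = behavior;
        }
    }

    /** Returns pictures whose caption contains every word, ignoring case. */
    static List<Picture> keywordSearch(List<Picture> pictures, String... words) {
        List<Picture> hits = new ArrayList<Picture>();
        for (Picture p : pictures) {
            String text = p.caption.toLowerCase(Locale.ROOT);
            boolean all = true;
            for (String w : words) {
                if (!text.contains(w.toLowerCase(Locale.ROOT))) {
                    all = false;
                    break;
                }
            }
            if (all) {
                hits.add(p);
            }
        }
        return hits;
    }

    /** Returns pictures whose behavior field matches exactly, ignoring case. */
    static List<Picture> byBehavior(List<Picture> pictures, String behavior) {
        List<Picture> hits = new ArrayList<Picture>();
        for (Picture p : pictures) {
            if (p.behavior.equalsIgnoreCase(behavior)) {
                hits.add(p);
            }
        }
        return hits;
    }

    /** Made-up sample data: one species name that mentions worms, one actual worm eater. */
    static List<Picture> sample() {
        return Arrays.asList(
                new Picture("Worm-eating Warbler on a branch", "Helmitheros vermivorus", "perching"),
                new Picture("Yellow Warbler eating a worm", "Setophaga petechia", "eating worm"));
    }

    public static void main(String[] args) {
        System.out.println(keywordSearch(sample(), "warbler", "eating", "worm").size());
        System.out.println(byBehavior(sample(), "eating worm").size());
    }
}
```

The keyword search returns both sample pictures, because the species name alone contains all three words. The search on the behavior field returns only the picture that actually shows the behavior.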

Another problem is that the picture you find can't easily be used for anything without the possibility that the owner will show up someday and demand that you stop distributing it or, worse yet, demand damages and cause all kinds of annoyance. Posting it on your website may violate the owner's copyright (something I feel it is not unreasonable for the photographer to care about), and the quality of the image, not to mention the file format, is entirely unpredictable. Even if it is a good photo in an acceptable format, finding the photographer or owner of the photo to obtain permission to use it on your site is somewhere between agonizingly slow and impossible. Because this is so difficult, there is at least a fair chance that the owner is unknown to the web site author, who simply gave up and copied the picture without permission.

The whole thing seems to be a disorganized mess. But I am a person who likes to solve puzzles, and the problem of freely available pictures of birds, described and searchable by content, with known provenance, and easily shared across the web, suddenly seemed like a rather entertaining puzzle. This project, and the web app it will eventually produce, is my attempt to solve the puzzle. I also hope that the result may be easily generalized for use with any type of picture, perhaps with any type of resource on the web. But let's start with bird pictures and see what we learn...

Technology

Presentation                        Tapestry
Command framework?                  XWork
Workflow?                           OSWorkflow??
Modeling?                           Eclipse Modeling Framework
Persistence of non-RDF data         Hibernate or JDO
RDF                                 Joseki server?? Jena toolkit and a separate DB for RDF data
Database                            MySQL 4
Application Server                  Tomcat
Web Server (birdpix.org)            Apache 2.0 (static content, proxy to web app)
Server OS (birdpix.org)             Red Hat 9 (initially); move to another distro before beta
Suggested Development Environment   Windows or Linux, Eclipse IDE

Technological Goals

Release Plan

Obviously, at this point it is mostly conjecture, but these are my thoughts on how to break up the development:

Release  Description
0.1 Start by creating a generic app with the usual user management stuff. Get this stuff out of the way early. It's a web app, and user roles are important. Get them in early; don't retrofit later. In this familiar territory, learn more about Tapestry and some of the other tools.
0.2 Registered users can submit a picture, which is appropriately stored in the system. Checks on the size and number of submissions are in place. The picture simply appears as a link on a page.
0.3 Newly submitted pictures are routed to the admin and additional reviewers, can be approved, and can be viewed by general users only after approval. No RDF yet, just workflow.
0.4 The descriptor wizard and its admin/expert review variant generate RDF. The display of a picture also shows the RDF associated with it. The wizard is dynamically generated from a configuration file, not statically coded. Only a small subset of RDF descriptors is configured. Absence of a descriptor MUST be handled gracefully.
0.5 Implement a Wizard configuration web interface that does intelligent things when descriptors are added/removed from the wizard. Changes should not require restart of the application.
0.6 Implement a query interface. Page with picture shows English descriptions (from wizard config) not raw RDF. This is the first version where we begin letting external users submit pictures on birdpix.org.
0.7 Export the RDF, allow servers to cooperate.
1.0 Address all remaining bugs and user feedback. birdpix.org begins full operation. Hopefully we have outside experts involved by now. Range of species will depend on availability of confirming experts. Actively try to solicit others to host servers. See where it goes from here.
1.1 Bird quiz... Bird identifier using an RDF reasoner? Fun stuff. Online (cell/PDA accessible) life lists?
2.0 Allow addition of sound files (songs) and other resources (range maps, life histories)?
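
Releases 0.4 and 0.6 both lean on the wizard configuration: the wizard is built from it, and picture pages take their English descriptions from it. A minimal sketch of that second use, in plain Java, might look like the following. The properties-style config format, the descriptor names, and the fallback rule are all assumptions made for illustration; nothing here reflects a decided file format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Sketch: English labels come from a config file, not from code. */
public class WizardConfigSketch {

    /** Loads "descriptor name = English label" pairs from properties-format text. */
    static Properties load(String configText) {
        Properties labels = new Properties();
        try {
            labels.load(new StringReader(configText));
        } catch (IOException e) {
            throw new IllegalStateException("could not read wizard config", e);
        }
        return labels;
    }

    /**
     * Renders a picture's descriptors as English lines. A descriptor with no
     * configured label falls back to its raw name, and a configured descriptor
     * the picture lacks is simply not shown.
     */
    static List<String> render(Properties labels, Map<String, String> descriptors) {
        List<String> lines = new ArrayList<String>();
        for (Map.Entry<String, String> e : descriptors.entrySet()) {
            String label = labels.getProperty(e.getKey(), e.getKey());
            lines.add(label + ": " + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical config: only two descriptors are configured.
        Properties labels = load("species=Species shown\nbehavior=What the bird is doing\n");

        Map<String, String> descriptors = new LinkedHashMap<String, String>();
        descriptors.put("species", "Helmitheros vermivorus");
        descriptors.put("lens", "400mm"); // not in the config; rendered with its raw name

        for (String line : render(labels, descriptors)) {
            System.out.println(line);
        }
    }
}
```

Because the labels are read at run time, reloading the config would change what users see without touching the code, which is roughly what release 0.5 asks for.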
