Scientists are working to collate millions of images for use in future research. Steve Farrar looks at how technology will make them accessible to all
More than 1.75 million species of organism have been named and described by scientists, each one by definition distinct from every other. From elephants to pondweed, all of this taxonomic information is locked away in a mixture of type specimens, photographs, drawings and microscope slides stored in libraries and institutes across the world.
Biologists are keen to make this fragmented treasure trove of data readily available for their research, and many groups are looking for ways to computerise photographs and drawings in a standard fashion.
British experts are producing a system in which data are input as a series of photographs and corresponding drawings and then processed to map onto each other. This allows new photographs to be classified according to which drawing each matches.
The pilot has so far focused on diatoms, a group of unicellular algae whose cells are enclosed in silica shells bearing a remarkable range of geometric patterns.
The project, dubbed "Diadist", or diatom and desmid (a type of alga) identification by shape and texture, is backed by the Biotechnology and Biological Sciences Research Council's bioinformatics programme. It is led by researchers from the vision and geometry group in the department of computer science at Cardiff University and the diatom laboratory at the Royal Botanic Garden Edinburgh.
Yulia Hicks, research associate with the Cardiff group, says the advance has been driven by recent work in computer graphics that enables photographs and drawings to be mapped onto each other.
She says that parts of the system were piloted in an earlier phase, involving the European Union-funded automated diatom identification and classification consortium, on 37 species of diatom chosen for their variety of form.
The new system extracts key features of each diatom from its two-dimensional contours, as revealed in microscope photographs. These are then converted into standardised drawings. Once the system has built up a database of drawings, it can be used to identify new biological specimens automatically by converting them into drawings using the same system and searching for a match. Data are included on particular species at different points in their life cycle, during which their size and shape vary.
Such an approach makes it possible for modern computer systems to analyse photographic data and either identify diatoms within reasonable time scales or even suggest the discovery of an unknown species.
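To give a flavour of how matching a new outline against a stored library might work, here is a minimal Python sketch. It is not the Diadist method, whose details are not given here: it swaps in a standard technique, Fourier descriptors of a closed contour, and assumes outlines are supplied as lists of (x, y) points traced in the same direction. The function names, the labels "elongate_form" and "round_form", and the distance threshold are all hypothetical.

```python
import cmath
import math


def fourier_descriptors(contour, n_coeffs=8):
    """Shape signature of a closed 2D contour given as (x, y) points.

    The k=0 term is skipped (removes position), only magnitudes are kept
    (removes rotation and starting point), and the result is divided by
    the strength of the first harmonics (removes size).
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)

    def dft(k):
        return sum(p * cmath.exp(-2j * math.pi * k * t / n)
                   for t, p in enumerate(z)) / n

    mags = []
    for k in range(1, n_coeffs + 1):
        mags.append(abs(dft(k)))
        mags.append(abs(dft(-k)))
    scale = math.hypot(mags[0], mags[1]) or 1.0  # guard against a degenerate contour
    return [m / scale for m in mags]


def identify(contour, database, max_distance=0.1):
    """Return (best_label, distance), or (None, distance) if nothing is close.

    max_distance is an illustrative threshold, not a calibrated value.
    """
    query = fourier_descriptors(contour)
    best_label, best_dist = None, float("inf")
    for label, reference in database.items():
        d = math.dist(query, reference)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist > max_distance:
        return None, best_dist  # possible unknown form
    return best_label, best_dist


def sample_outline(radius_fn, a=1.0, b=1.0, angle=0.0, shift=(0.0, 0.0), n=64):
    """Hypothetical helper: synthetic outlines standing in for real contours."""
    points = []
    for j in range(n):
        t = 2 * math.pi * j / n
        r = radius_fn(t)
        p = complex(a * r * math.cos(t), b * r * math.sin(t))
        p = p * cmath.exp(1j * angle) + complex(*shift)
        points.append((p.real, p.imag))
    return points


# A toy "database" of two made-up reference shapes.
database = {
    "elongate_form": fourier_descriptors(sample_outline(lambda t: 1.0, a=4, b=1)),
    "round_form": fourier_descriptors(sample_outline(lambda t: 1.0)),
}

# Same elongate shape, but larger, rotated and moved.
specimen = sample_outline(lambda t: 1.0, a=8, b=2, angle=0.7, shift=(30, -12))
# A five-lobed outline that resembles neither reference.
oddity = sample_outline(lambda t: 1.0 + 0.5 * math.cos(5 * t))

print(identify(specimen, database)[0])
print(identify(oddity, database)[0])
```

In this toy example the enlarged, rotated specimen is still assigned to "elongate_form", while the lobed outline falls outside the threshold and is returned as unmatched, the kind of result that could prompt a closer look for an undescribed species. A real system would need descriptors for texture as well as shape, and reference entries for each stage of the life cycle.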
Hicks says: "The ultimate goal would be for this system to produce drawings of internal features as well."
The Diadist project's preliminary results were revealed at last year's British machine vision conference in Cardiff. Once the system has been shown to work on diatoms, the project will be extended to cover more three-dimensional structures, using desmids as a test case.