The Phyloreferencing project seeks a postdoctoral fellow for researching and developing computational semantics approaches to large-scale biodiversity data integration problems. The project aims to enable addressing elements in the Tree of Life by unambiguous, transparent, and fully computable semantics of their patterns of evolutionary descent. The work involves researching and developing OWL models and ontologies, tools for converting existing data into OWL format, and online proof-of-concept applications for using machine reasoning to navigate biodiversity data by Tree of Life-semantics.
The project, newly funded by the National Science Foundation for 3 years, is a collaboration between Dr. Nico Cellinese (Florida Museum of Natural History, University of Florida) and Hilmar Lapp (Center for Genomic and Computational Biology, Duke University). The incumbent will be based in the lab of Dr. Cellinese, but will work closely with both PIs and will periodically travel to Lapp’s lab at Duke University. The starting date is negotiable, but the sooner the better. Salary starts at $45,000.
Integrating and querying biological data across organisms, whether on small or large scales, to this day relies on traditional names for organisms and groups of organisms based on Linnaean nomenclature. However, when it comes to integrating and communicating data, these names suffer from two major limitations. Firstly, because they are simple text strings, machines cannot access the meaning intended by those who coin a name and those who apply it, resulting in rampant ambiguity and inconsistency in what organism names are interpreted to mean. Secondly, there are many groups of organisms for which a Linnaean name does not and may never exist, but for which valuable biological knowledge needs to be communicated. The Phyloreferencing project aims to address these challenges by defining ontology-based references (“phyloreferences”) to elements on the Tree of Life that are unambiguous and have fully computable semantics. To accomplish this, we are using OWL, OWL ontologies, and machine reasoning. Phyloreferences build on a large body of theoretical and applied work on phylogenetic taxonomy.
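As a rough illustration of what "computable semantics" can mean here, the sketch below resolves a node-based clade definition from phylogenetic taxonomy (the least inclusive clade containing a set of specifier taxa) on a toy tree. It is plain Python standing in for the idea only; it is not the project's OWL model, and the tree, taxon labels, and function names are all hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy phylogeny as a child -> parent map; None marks the root.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "n1": "root",
    "n2": "n1",
    "A": "n2",
    "B": "n2",
    "C": "n1",
    "D": "root",
}


def ancestors(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from a node up to the root, starting with the node itself."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def resolve_node_based(specifiers: List[str],
                       parent: Dict[str, Optional[str]]) -> str:
    """Return the most recent common ancestor node of all specifiers."""
    first, *rest = specifiers
    other_paths = [set(ancestors(s, parent)) for s in rest]
    # Walk upward from the first specifier; the first node that lies on
    # every other specifier's path to the root is the common ancestor.
    for candidate in ancestors(first, parent):
        if all(candidate in path for path in other_paths):
            return candidate
    raise ValueError("specifiers do not share an ancestor")


def clade_members(node: str, parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the leaf taxa descended from (or equal to) the given node."""
    internal = {p for p in parent.values() if p is not None}
    leaves = [n for n in parent if n not in internal]
    return {leaf for leaf in leaves if node in ancestors(leaf, parent)}


# "The least inclusive clade containing A and C" on the toy tree.
clade_root = resolve_node_based(["A", "C"], PARENT)
members = clade_members(clade_root, PARENT)
```

On this toy tree the definition picks out the node `n1` and the taxa A, B, and C, whether or not that group has ever been given a Linnaean name. An OWL-based phyloreference would express the same kind of definition as a class expression and leave the resolution to a reasoner.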
The overall deliverables of the project include a formal specification for phyloreference encoding and reasoning in OWL; small-scale tests, verifiable by domain experts, that ascertain the correctness of the specification; and a large-scale proof-of-concept application for using phyloreferences to navigate biodiversity data resources.
The full grant proposal is available at http://dx.doi.org/10.6084/m9.figshare.1401984.
Postdoctoral project responsibilities
The postdoc will work closely with the project PIs and graduate students to generate the major project deliverables, which include the following:
As part of creating a specification for encoding phyloreferences and phylogenies in OWL, the postdoc will participate in OWL model and ontology development for phylogenies and phyloreferences; develop templates for constructing phyloreferences in OWL; and develop tools to convert published phylogenies and phyloreferences to OWL ontologies.
The postdoc will create an online proof-of-concept application that uses OWL reasoning to allow users to query biodiversity data resources using phyloreferences and Tree of Life-scale phylogenies.
All software source code, ontologies, and related products will be developed collaboratively on GitHub and be made available under open-source licenses.
University of Florida
PhD in Computer Science, Computational Biology, Bioinformatics or a related field.
Domain knowledge in biology, biodiversity, or evolution is helpful, but not required.
Strong knowledge of Description Logic semantics, ontologies and modeling (ideally OWL ontologies).
Experience with logic-based machine reasoning (ideally OWL DL reasoners and entailments).
Strong programming experience in languages frequently used in scientific computing, and especially for computing with ontologies (Java, Scala, functional languages) and for managing or converting scientific data (e.g., Python).
Experience programming with the OWL API or other APIs for working with machine reasoners.
About University of Florida
University of Florida is a major, public, comprehensive, land-grant, research university. The state's oldest, largest and most comprehensive university, it is among the nation's most academically diverse public universities. University of Florida has a long history of established programs in international education, research and service. It was founded in 1853 and is based in Gainesville, Florida.