Author
GESSLER, DAMIAN - NCGR
SCHILTZ, GARY - NCGR
MAY, GREG - NCGR
AVRAHAM, SHULY - COLD SPRING HARBOR LABS
TOWN, CHRISTOPHER - J. CRAIG VENTER INSTITUTE
Grant, David
Nelson, Rex
Submitted to: BMC Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/15/2009
Publication Date: 9/23/2009
Citation: Gessler, D.D., Schiltz, G.S., May, G.D., Avraham, S., Town, C.D., Grant, D.M., Nelson, R. 2009. SSWAP: A Simple Semantic Web Architecture and Protocol for Semantic Web Services. BMC Bioinformatics. 10:309. doi:10.1186/1471-2105-10-309.
Interpretive Summary: Species-specific genetics and genomics databases have been established and used for many years for many agronomically important crops. Researchers now want to access data across these databases to do comparative studies. Unfortunately, in most cases the only way to do this to date has been to query the databases manually and compile the results. Semantic web technologies offer the potential to link internet resources and data by shared concepts without having to rely on absolute lexical matches. SSWAP (Simple Semantic Web Architecture and Protocol) has been developed to allow cross-database queries based on semantics, rather than just exact word matches. SSWAP relies on an architecture where individual service and data providers advertise their services and deliver content. The underlying semantic relationships allow a program to traverse them and determine analogous concepts without relying on exact lexical matches. Reliance on semantic reasoning is important because most species databases were developed with idiosyncratic naming conventions for each data type. This was not a problem in the past because the nomenclature was familiar to each species' research community. However, as users try to query across multiple databases, nomenclature issues have become a serious problem. By linking each data type to a common, semantically defined system, data can be retrieved without a curator or user needing to identify analogous concepts in separately maintained databases.
This will allow both casual database users and species-specific database curators to employ computer algorithms to identify, retrieve and transfer data more efficiently.
Technical Abstract: SSWAP (Simple Semantic Web Architecture and Protocol) is an architecture, protocol, and platform for using reasoning to semantically integrate heterogeneous disparate data and services on the web. SSWAP is the driving technology behind the Virtual Plant Information Network, an NSF-funded semantic web services project for the plant sciences. There are currently over 1500 resources published in SSWAP. Approximately two dozen are custom-written services for QTL (Quantitative Trait Loci) and mapping data for legumes and grasses (grains). The remainder are wrappers to Nucleic Acids Research Database and Web Server entries. As an architecture, SSWAP establishes how clients (users of data, services, and ontologies), providers (suppliers of data, services, and ontologies), and discovery servers (semantic search engines) interact to allow for the description, querying, discovery, invocation, and response of semantic web services. As a protocol, SSWAP provides the vocabulary and semantics to allow clients, providers, and discovery servers to engage in semantic web services. The protocol is based on the W3C-sanctioned semantic web ontology language OWL DL. As an open source platform, a discovery server running at http://sswap.info uses the description logic reasoner Pellet to integrate semantic resources under the protocol. The platform hosts an interactive guide to the protocol at http://sswap.info/protocol.jsp, developer tools at http://sswap.info/developer.jsp, and a portal to third-party ontologies at http://sswapmeet.sswap.info. SSWAP is open source and available at http://sourceforge.net/projects/sswap.
SSWAP addresses the three basic requirements of any semantic web services architecture (i.e., a common syntax, shared semantics, and semantic discovery) while addressing three technology limitations common in distributed service systems: i) the fatal mutability of traditional interfaces, ii) the rigidity and fragility of static subsumption hierarchies, and iii) the confounding of content, structure, and presentation. SSWAP is novel in establishing the concept of a canonical yet mutable OWL DL graph that allows data and service providers to describe their resources, discovery servers to offer semantically rich search engines, clients to discover and invoke those resources, and providers to respond with semantically tagged data.
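
The difference between lexical matching and semantic discovery described above can be illustrated with a small sketch. This is not the SSWAP API or protocol: the term names, the service names, and the `DiscoveryServer` class are all hypothetical, and a plain single-parent subclass table stands in for the OWL DL reasoning that the real discovery server delegates to Pellet.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology (child term -> parent term). Real SSWAP
# ontologies are OWL DL documents; this table only mimics subsumption.
SUBCLASS_OF = {
    "soy:QTL": "common:QuantitativeTraitLocus",
    "grass:QuantTraitLocus": "common:QuantitativeTraitLocus",
    "common:QuantitativeTraitLocus": "common:GenomicFeature",
    "common:GeneticMap": "common:GenomicFeature",
}


def ancestors(term):
    """Return the term followed by all of its superclasses."""
    chain = [term]
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def is_a(term, concept):
    """True if `term` is `concept` or one of its subclasses."""
    return concept in ancestors(term)


@dataclass
class DiscoveryServer:
    """Toy registry mapping a service name to the term it returns."""
    registry: dict = field(default_factory=dict)

    def publish(self, name, returns):
        self.registry[name] = returns

    def lexical_search(self, concept):
        # Exact string match on each provider's own vocabulary.
        return sorted(n for n, t in self.registry.items() if t == concept)

    def semantic_search(self, concept):
        # Match any service whose output term is subsumed by the concept.
        return sorted(n for n, t in self.registry.items() if is_a(t, concept))


# Providers advertise services using their own idiosyncratic terms.
server = DiscoveryServer()
server.publish("SoyQtlService", "soy:QTL")
server.publish("GrassQtlService", "grass:QuantTraitLocus")
server.publish("MapService", "common:GeneticMap")

print(server.lexical_search("common:QuantitativeTraitLocus"))
print(server.semantic_search("common:QuantitativeTraitLocus"))
```

In this sketch the lexical search finds nothing, because no provider used the shared term verbatim, while the semantic search finds both QTL services through their declared superclass. A description logic reasoner generalizes this idea to classes defined by properties and restrictions rather than by a fixed parent table.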