Data management and data integration are fundamental problems in the life sciences. Advances in molecular biology and molecular medicine are almost universally underpinned by enormous efforts in data management, data integration, automatic data quality assurance, and computational data analysis. Many hot topics in the life sciences, such as systems biology, personalized medicine, and pharmacogenomics, critically depend on integrating data sets and applications produced by different experimental methods, in different research groups, and at different levels of granularity. Despite more than a decade of intensive research in these areas, there remain many unsolved problems. In some respects, these problems are becoming more severe, both due to continuous increases in data volumes and the growing diversity in types of data that need to be managed. And the next big challenge is already upon us: the need to integrate the different “omics” data sets with the vast amounts of clinical data, collected daily in thousands of hospitals and physicians’ offices all over the world.

DILS 2006 is the third in an annual workshop series that aims at fostering discussion, exchange, and innovation in research and development in the areas of data integration and data management for the life sciences. DILS 2004 in Leipzig and DILS 2005 in San Diego each attracted around 100 researchers from all over the world. This year the number of submitted papers again increased. The Program Committee selected 23 papers out of 50 strong full submissions.
Refereed proceedings of the Third International Workshop on Data Integration in the Life Sciences, DILS 2006
Presents 19 revised full papers and 4 revised short papers together with 2 keynote talks
Topics include data integration, text mining, systems, and workflow
Keynotes.- An Application Driven Perspective on Biological Data Integration.- Towards a National Healthcare Information Infrastructure.
Data Integration.- Data Access and Integration in the ISPIDER Proteomics Grid.- A Cell-Cycle Knowledge Integration Framework.- Link Discovery in Graphs Derived from Biological Databases.
Text Mining.- Towards an Automated Analysis of Biomedical Abstracts.- Improving Text Mining with Controlled Natural Language: A Case Study for Protein Interactions.- SNP-Converter: An Ontology-Based Solution to Reconcile Heterogeneous SNP Descriptions for Pharmacogenomic Studies.
Systems I.- SABIO-RK: Integration and Curation of Reaction Kinetics Data.- SIBIOS Ontology: A Robust Package for the Integration and Pipelining of Bioinformatics Services.- Data Structures for Genome Annotation, Alternative Splicing, and Validation.- BioFuice: Mapping-Based Data Integration in Bioinformatics.
Potpourri.- A Method for Similarity-Based Grouping of Biological Data.- On Querying OBO Ontologies Using a DAG Pattern Query Language.- Using Term Lists and Inverted Files to Improve Search Speed for Metabolic Pathway Databases.
Systems II.- Arevir: A Secure Platform for Designing Personalized Antiretroviral Therapies Against HIV.- The Distributed Annotation System for Integration of Biological Data.- An Information Management System for Collaboration Within Distributed Working Environment.
Short Papers.- Ontology Analysis on Complexity and Evolution Based on Conceptual Model.- Distributed Execution of Workflows in the INB.- Knowledge Networks of Biological and Medical Data: An Exhaustive and Flexible Solution to Model Life Science Domains.- On Characterising and Identifying Mismatches in Scientific Workflows.
Workflow.- Collection-Oriented Scientific Workflows for Integrating and Analyzing Biological Data.- Towards a Model of Provenance and User Views in Scientific Workflows.- An Extensible Light-Weight XML-Based Monitoring System for Sequence Databases.