e[curate]
Scientific challenge
Data curation is an essential part of life sciences R&D, and it demands significant manual effort from experts. It diverts biologists and data scientists from their research: curation processes are entangled, error-prone and labor-intensive, which makes them a major bottleneck.
Biocuration also relies heavily on tacit and implicit knowledge, which is not easily transferable across divisional and organizational boundaries.
Current tools for data engineers and scientists are limited to structural, lexical and syntactic data wrangling. Complex, context-specific life sciences data also requires semantic enrichment capabilities.
The challenge is further exacerbated by the complexity of microbiome and genomic datasets, which are often stored in silos with no consistency or context.
e[curate] automates the identification, cross-mapping and semantic enrichment of complex, diverse and disparate life sciences data types.
Key features
- Systematize and automate data curation in accordance with de facto standards, including DOD 7Cs
- Automate data discovery, improve data quality for analysis and increase the potential for data reuse
- Generate structured and annotated data to enable quicker and more targeted information search and retrieval
- Perform semantic enrichment and contextualize data and metadata on a multi-layer hypergraph
- Ensure data governance by design, in standardization of reference data and ontologies
Benefits
- Build trust in data collection, characterization, cleansing, categorization and cataloging processes
- Integrate multi-dimensional external and internal datasets to self-service particular data needs
- Prepare experimental information to streamline the journey to discover and evaluate insight
- Build meaningful understanding and contextualize knowledge surrounding experimental data
- Identify the most compelling research areas on which to focus investment, increasing research productivity and scalability