Data integration and handling
2017
Modern technology allows researchers to generate data at an ever-increasing rate,
outpacing their capacity to analyse it. Developing automated support systems for the
collection, management and distribution of information is therefore an important step
towards reducing error rates, accelerating progress and enabling high-quality research
based on large data volumes. This thesis comprises five articles that describe
strategies for creating technical research platforms, as well as the platforms
themselves.
The key conclusion of the thesis is that technical solutions for many issues have been
available for a long time. These solutions are, however, overlooked or simply ignored
if they fail to recognise the social dimensions of the issues they try to solve.
The Molecular Methods database (MolMeth) is an example of a solution that is
technically sound but only partially successful with regard to social viability.
Thousands of researchers have used the website to access protocols, but only a handful
have shared their own work on MolMeth. Experiences from the Molecular Methods database
and other projects have provided a foundation for studies supporting the development
of the eB3Kit.
The eB3Kit is a portable, robust and scalable informatics platform for structured data
management. Deploying the platform enables research groups to carry out advanced
research projects with very limited means. With the eB3Kit, researchers can integrate
data from a wide variety of sources, including the local laboratory information
management system, and analyse it using the Galaksio interface. Galaksio provides
user-friendly access to the Galaxy workflow management system, giving eB3Kit users
access to tools developed by a far larger user community than the one actively
developing the eB3Kit. Using a workflow management system improves
reproducibility and enables bioinformaticians to prepare workflows without directly
accessing ethically or commercially sensitive data. It is therefore especially well
suited to applications where researchers are concerned about privacy, and to disease
outbreaks where persistent storage and analysis capacity must be established quickly.