Abstract Taverna workbench is an environment for the construction, visualization and execution of bioinformatic workflows that integrate specialized tools available through the internet. It is rapidly gaining popularity because it supports the most important bioinformatic services and offers a simple yet robust graphical notation. Here we present XQTav, an extension of Taverna that provides full integration with an XQuery (the query language for XML) engine. XQTav allows execution of XQuery scripts in Taverna workflow diagrams. All existing Taverna processors can be accessed in the XQuery scripts. This provides an alternative way of specifying subworkflows in Taverna and is useful when one deals with query-like algorithms (e.g. filters and inner joins). Moreover, XQTav may be used to automatically generate an XQuery script that is equivalent to a Taverna workflow. This constitutes another way of creating and enacting bioinformatic workflows: the overall structure of a diagram is drawn in the Taverna environment, and XQuery code is then generated and possibly adjusted by hand. The resulting code can be executed by XQuery engines or incorporated into other software environments. Availability: XQTav is open source software. It may be downloaded from . The page also contains various tutorials and examples, including the one described in this report. Contact: sroka@mimuw.edu.pl, a.kierzek@surrey.ac.uk
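The query-like algorithms mentioned in the abstract (filters and inner joins) can be illustrated with a minimal Python sketch. This is not XQTav or Taverna code and does not reproduce XQuery syntax; the record fields (`id`, `score`, `name`) and the function names `filter_records` and `inner_join` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def filter_records(records: List[Record], min_score: float) -> List[Record]:
    """Keep only the records whose (assumed) 'score' field reaches min_score."""
    return [r for r in records if r["score"] >= min_score]


def inner_join(left: List[Record], right: List[Record], key: str) -> List[Record]:
    """Pair every left record with every right record sharing the same key value."""
    # Index the right-hand side once so each left record is matched by lookup.
    index: Dict[Any, List[Record]] = {}
    for r in right:
        index.setdefault(r[key], []).append(r)

    joined: List[Record] = []
    for l in left:
        for r in index.get(l[key], []):
            # Merge the two records; left records without a match are dropped.
            joined.append({**l, **r})
    return joined


if __name__ == "__main__":
    # Hypothetical inputs standing in for the outputs of two workflow steps.
    hits = [
        {"id": "P1", "score": 0.9},
        {"id": "P2", "score": 0.4},
        {"id": "P3", "score": 0.8},
    ]
    annotations = [
        {"id": "P1", "name": "kinase"},
        {"id": "P2", "name": "protease"},
    ]
    print(inner_join(filter_records(hits, 0.5), annotations, "id"))
```

In a query language such as XQuery, the same filter and join would typically be written declaratively in a single expression, which is the kind of case the abstract has in mind.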
data - training data for peptides < 25 AA (16.8 MB)
models - checkpoints of HydrAMP, PepCVAE, and Basic models for every training epoch (466 MB)
results - dumped generation results for every model; required for running comparison notebooks (832 MB)
wheels - custom TensorFlow packages (1 GB)
This report summarizes the presentations and discussions of the third workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR'16). The BeyondMR workshop was held in conjunction with the 2016 SIGMOD conference in San Francisco, California, USA on July 1, 2016. The goal of the workshop was to bring together researchers and practitioners to explore algorithms, computational models, architectures, languages and interfaces for systems that need large-scale parallelization and systems designed to support efficient parallelization and fault tolerance. These include specialized programming and data-management systems based on MapReduce and its extensions, graph processing systems, and data-intensive workflow and dataflow systems. The program featured two very well-attended invited talks, by Ion Stoica from AMPLab, University of California, Berkeley, and Carlos Guestrin from the University of Washington.
Abstract

Background: Bioimaging techniques offer a robust tool for studying molecular pathways and morphological phenotypes of cell populations subjected to various conditions. As modern high-resolution 3D microscopy provides access to an ever-increasing amount of high-quality images, there arises a need for their analysis in an automated, unbiased, and simple way. Segmentation of structures within the cell nucleus, which is the focus of this paper, presents a new layer of complexity in the form of dense packing and significant signal overlap. At the same time, the available segmentation tools present a steep learning curve for new users with a limited technical background. This is especially apparent in the bulk processing of image sets, which requires the use of some form of programming notation.

Results: In this paper, we present PartSeg, a tool for segmentation and reconstruction of 3D microscopy images, optimised for the study of the cell nucleus. PartSeg integrates refined versions of several state-of-the-art algorithms, including a new multi-scale approach for segmentation and quantitative analysis of 3D microscopy images. The features and user-friendly interface of PartSeg were carefully planned with biologists in mind, based on analysis of multiple use cases and difficulties encountered with other tools, to offer an ergonomic interface with a minimal entry barrier. Bulk processing in an ad-hoc manner is possible without the need for programmer support. As the size of datasets of interest grows, such bulk processing solutions become essential for proper statistical analysis of results. Advanced users can use PartSeg components as a library within Python data processing and visualisation pipelines, for example within Jupyter notebooks. The tool is extensible so that new functionality and algorithms can be added by the use of plugins. For biologists, the utility of PartSeg is presented in several scenarios, showing the quantitative analysis of nuclear structures.

Conclusions: In this paper, we have presented PartSeg, a tool for precise and verifiable segmentation and reconstruction of 3D microscopy images. PartSeg is optimised for cell nucleus analysis and offers multi-scale segmentation algorithms best suited for this task. PartSeg can also be used for the bulk processing of multiple images, and its components can be reused in other systems or computational experiments.
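To make the idea of bulk segmentation concrete, the following Python sketch applies one fixed procedure to several 3D images and reports the size of each segmented structure. It does not use the PartSeg API and does not implement the multi-scale approach described above; a plain global threshold followed by 6-connected component labelling is swapped in as the simplest stand-in. The function names (`label_components`, `bulk_measure`), the nested-list image layout `[z][y][x]`, and the threshold value are all assumptions made for illustration, and each volume is assumed to be non-empty and rectangular.

```python
from collections import deque
from typing import Dict, List, Tuple

Volume = List[List[List[float]]]  # assumed layout: volume[z][y][x]

# Offsets of the six face-adjacent neighbours of a voxel.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def label_components(volume: Volume, threshold: float) -> Tuple[List[List[List[int]]], List[int]]:
    """Label 6-connected groups of voxels with intensity >= threshold.

    Returns the label volume (0 = background) and the voxel count per label.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    labels = [[[0] * width for _ in range(height)] for _ in range(depth)]
    sizes: List[int] = []

    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if volume[z][y][x] < threshold or labels[z][y][x]:
                    continue
                # Start a new component and grow it with a breadth-first search.
                label = len(sizes) + 1
                labels[z][y][x] = label
                queue = deque([(z, y, x)])
                size = 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    size += 1
                    for dz, dy, dx in NEIGHBOURS:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        if (0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
                                and volume[nz][ny][nx] >= threshold
                                and not labels[nz][ny][nx]):
                            labels[nz][ny][nx] = label
                            queue.append((nz, ny, nx))
                sizes.append(size)
    return labels, sizes


def bulk_measure(images: Dict[str, Volume], threshold: float) -> Dict[str, List[int]]:
    """Run the same segmentation on every image and collect component sizes."""
    return {name: label_components(vol, threshold)[1] for name, vol in images.items()}
```

Keeping the per-image routine separate from the batch wrapper means the same parameters are applied uniformly to every image, which is the property that makes the resulting measurements comparable across a dataset.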