DarkQ: Continuous genomic monitoring using message queues
2020
Motivation: The representation of text as dense, low-dimensional vectors of numbers ("embeddings") is a common practice in the field of natural language processing (NLP), because these vectors can be used as direct input to a variety of learning algorithms such as neural networks and make training more efficient due to their "pretrained" nature.
Results: We developed nanotext, an open source Python library and command line interface that allows for training and analysis of protein domain and genome embeddings analogous to word and document embeddings in NLP.
Availability: nanotext is released under the BSD-3 license at [github.com/phiweger/nanotext](https://github.com/phiweger/nanotext).