A Map/Reduce Parallelized Framework for Rapidly Classifying Astrophysical Transients

2009 
The Berkeley Transients Classification Pipeline (TCP) is a source identification, classification, and broadcast pipeline that federates data streams from multiple surveys. The TCP identifies variable science by making probabilistic statements about the scientific classification of newly discovered sources observed by the Palomar Transient Factory's (PTF) all-sky survey. The primary purpose of PTF is to consistently map the available sky with the intent to discover a variety of galactic and extragalactic transient sources and events. The TCP identifies these newly discovered transient sources and alerts follow-up telescopes, such as PAIRITEL (Bloom et al. 2005), and end users to them. Here we discuss software used within the TCP to generate science classifiers when little or no data has been acquired by the survey of interest. This case is more challenging than generating classifiers for a well-populated survey. We present some of the difficulties encountered and a parallelized Hadoop/MapReduce-based technique we use to resolve them.