Extracting statistically significant behaviour from fish tracking data with and without large dataset cleaning

2017 
Extracting a statistically significant result from video of natural phenomenon can be difficult for two reasons: (i) there can be considerable natural variation in the observed behaviour and (ii) computer vision algorithms applied to natural phenomena may not perform correctly on a significant number of samples. This study presents one approach to clean a large noisy visual tracking dataset to allow extracting statistically sound results from the image data. In particular, analyses of 3.6 million underwater trajectories of a fish with the water temperature at the time of acquisition are presented. Although there are many false detections and incorrect trajectory assignments, by a combination of data binning and robust estimation methods, reliable evidence for an increase in fish speed as water temperature increases are demonstrated. Then, a method for data cleaning which removes outliers arising from false detections and incorrect trajectory assignments using a deep learning-based clustering algorithm is proposed. The corresponding results show a rise in fish speed as temperature goes up. Several statistical tests applied to both cleaned and not-cleaned data confirm that both results are statistically significant and show an increasing trend. However, the latter approach also generates a cleaner dataset suitable for other analysis.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    47
    References
    10
    Citations
    NaN
    KQI
    []