Mathis Boerner, Tim Ruhe, Katharina Morik, Wolfgang Rhode
Affiliation(s): TU Dortmund University
Astrophysical experiments produce Big Data that require efficient and effective data analytics. In this paper we present a general data analysis process that has been successfully applied to data from IceCube, a cubic-kilometer neutrino detector located at the geographic South Pole. The goal of the analysis is to separate neutrinos from the background in the data in order to determine the muon neutrino energy spectrum. The presented process covers straight cuts, feature selection, classification, and unfolding. A major challenge in the separation is the imbalance of the dataset: the expected signal-to-background ratio was worse than 1:1000 and, moreover, any surviving background would hinder further analysis of the data. The overall process was embedded in a multi-fold cross-validation to control its performance. A subsequent regularized unfolding yields the sought-after energy spectrum.
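To illustrate why a signal-to-background ratio worse than 1:1000 is demanding, the following minimal sketch computes the purity of a selected sample and the largest background efficiency compatible with a target purity. The function names, the signal efficiency of 0.5, and the target purity of 0.99 are illustrative assumptions and are not taken from the analysis itself.

```python
def purity(n_signal, n_background, eff_signal, eff_background):
    """Fraction of selected events that are true signal.

    eff_signal and eff_background are the fractions of signal and
    background events that survive the selection (hypothetical inputs).
    """
    kept_signal = n_signal * eff_signal
    kept_background = n_background * eff_background
    total = kept_signal + kept_background
    if total == 0:
        raise ValueError("no events survive the selection")
    return kept_signal / total


def max_background_efficiency(n_signal, n_background, eff_signal, target_purity):
    """Largest background efficiency that still reaches target_purity."""
    if not 0 < target_purity <= 1:
        raise ValueError("target_purity must lie in (0, 1]")
    kept_signal = n_signal * eff_signal
    # Solve kept_signal / (kept_signal + kept_background) = target_purity
    # for the number of background events that may survive.
    allowed_background = kept_signal * (1 - target_purity) / target_purity
    return allowed_background / n_background


# Illustrative numbers only: 1 signal event per 1000 background events,
# an assumed signal efficiency of 0.5, and an assumed target purity of 0.99.
limit = max_background_efficiency(1, 1000, 0.5, 0.99)
print(f"background efficiency must stay below {limit:.2e}")
print(f"purity without any selection: {purity(1, 1000, 1.0, 1.0):.4f}")
```

Under these assumed numbers, only a few background events per million may survive the selection, which is why even a small amount of surviving background matters for the subsequent steps.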