High-Performance Computing Framework Based on Distributed Systems for Large-Scale Neurophysiological Data

Document Type: Original Article

Authors

1 School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran.

2 Department of Cognitive Sciences, Faculty of Psychology and Educational Sciences, University of Tehran, Tehran, Iran.

3 Cognitive Systems Laboratory, Control and Intelligent Processing Center of Excellence (CIPCE), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran.

Abstract

Recent advances in neurophysiological recording technologies have made large-scale neural data considerably harder to manage, creating potential bottlenecks in storing, sharing, and processing data across the neuroscience community. To address these challenges, we developed the Big Neuronal Data Framework (BNDF), a distributed high-performance computing (HPC) solution. BNDF builds on two open-source big data frameworks, Hadoop and Spark, to provide a flexible and scalable architecture. We tested BNDF on three large-scale electrophysiological datasets recorded from nonhuman primate brains and observed improved runtimes and scalability, which we attribute to its distributed design. In a comparison with MATLAB, a widely used platform, BNDF performed spike sorting, a common task in neuroscience, more than five times faster. This speed advantage highlights BNDF’s potential to make neural data processing and analysis more efficient and to streamline the handling of large neural datasets.
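
To make the idea of a distributed design concrete, the sketch below illustrates the map-style, per-channel parallelism that frameworks such as Spark are built to exploit. It is an illustration under stated assumptions and not BNDF's code: the function names (`detect_spikes`, `make_channel`), the synthetic data, and the choice of a simple amplitude threshold with a median-based noise estimate are all hypothetical, and a standard-library thread pool stands in for a cluster so that the example runs anywhere.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def detect_spikes(samples, k=4.0):
    """Return the indices where |signal| first rises above k * sigma.

    sigma is a robust noise estimate, median(|x|) / 0.6745, a common choice
    in spike detection. This is an assumed stand-in, not BNDF's algorithm.
    """
    noise_sigma = statistics.median(abs(s) for s in samples) / 0.6745
    threshold = k * noise_sigma
    crossings = []
    above = False
    for i, s in enumerate(samples):
        now_above = abs(s) > threshold
        if now_above and not above:  # record only the start of each crossing
            crossings.append(i)
        above = now_above
    return crossings


def make_channel(seed, spike_times, n_samples=2000):
    """Synthetic channel: bounded uniform noise plus large one-sample spikes."""
    rng = random.Random(seed)
    signal = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    for t in spike_times:
        signal[t] += 20.0
    return signal


# Four hypothetical channels, each with its own spike times.
channels = {
    f"ch{c}": make_channel(seed=c, spike_times=(200 + 10 * c, 1000, 1800 - 10 * c))
    for c in range(4)
}

# Each channel is processed independently, so the work is a plain "map".
# A thread pool is used here only to show the pattern; on a cluster the same
# per-channel function could be handed to a distributed map instead.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(zip(channels, pool.map(detect_spikes, channels.values())))

for name, spike_indices in results.items():
    print(name, spike_indices)
```

Because no channel depends on another, the work can be split across as many workers as there are channels, which is the property a distributed framework relies on to scale. The thread pool here demonstrates the pattern only and says nothing about the speedups reported above.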

Volume 1, Issue 1
March 2025
Pages 1-12
  • Receive Date: 01 December 2024
  • Revise Date: 02 January 2025
  • Accept Date: 04 January 2025
  • First Publish Date: 01 March 2025
  • Publish Date: 01 March 2025