During my PhD studies, I used nonlinear signal analysis methods to analyse physiological time series data. Entropy, which is commonly used to estimate the randomness, complexity or irregularity of data, was just one of these methods. However, reliable, accurate and validated software for entropic data analysis was lacking. This was not just an issue for me and other early-stage researchers, but for the scientific community as a whole. Without a reliable resource for conducting data analysis with entropy, it was difficult to cross-validate and replicate results from different studies.
I set out to change this and ended up creating EntropyHub - an open-source toolkit for entropic time series and image data analysis.
Over the last few decades, there has been an increasing number of studies employing measures of entropy to estimate the irregularity/randomness/complexity of time series and image data in real-world applications. Matching the rise in popularity of entropy, the number of new methods of estimating entropy has similarly increased, leaving what has recently been termed “The Entropy Universe”. This universe of entropies continues to expand as more and more methods are derived with improved statistical properties over their precursors, such as robustness to short signal lengths, resilience to noise, and insensitivity to amplitude fluctuations. Furthermore, new entropy variants are being identified which quantify the variability of time series data in specific applications, including assessments of cardiac disease from electrocardiograms and examinations of machine failure from vibration signals.
As the popularity of entropy spreads beyond the field of mathematics to subjects ranging from neurophysiology to finance, there is an emerging demand for software packages with which to perform entropic time series analysis. Open-source software plays a critical role in tackling the replication crisis in science by providing validated algorithmic tools that are available to all researchers. Without access to these software tools, researchers lacking computer programming literacy may resort to borrowing algorithms from unverified sources which could be vulnerable to coding errors. Furthermore, software packages often serve as entry points for researchers unfamiliar with a subject to develop an understanding of the most commonly used methods and how they are applied. This point is particularly relevant in the context of entropy, a concept that is often misinterpreted, and where the names and number of variant methods may be difficult to follow. For example, derivatives of the original sample entropy algorithm, already an improvement on approximate entropy, include modified sample entropy (fuzzy entropy), multiscale (sample) entropy, composite multiscale entropy, refined multiscale entropy, and refined-composite multiscale entropy.
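To make the starting point of that family concrete, here is a minimal sketch of sample entropy in plain Python. It is an illustration only and not EntropyHub's implementation: the function name `sample_entropy`, the default tolerance of 0.2 times the standard deviation, and the use of "less than or equal to" for the match test are assumptions chosen for this example.

```python
import math


def sample_entropy(signal, m=2, r=None):
    """Illustrative sample entropy of a 1-D sequence (hypothetical helper).

    m : template (embedding) length
    r : match tolerance; assumed default is 0.2 * population standard deviation
    """
    n = len(signal)
    if n <= m + 1:
        raise ValueError("signal is too short for the chosen template length")

    if r is None:
        mean = sum(signal) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
        r = 0.2 * sd

    def count_matches(length):
        # Use the same n - m starting points for both template lengths,
        # and count each pair once (i < j), so self-matches are excluded.
        n_templates = n - m
        count = 0
        for i in range(n_templates):
            for j in range(i + 1, n_templates):
                # Chebyshev distance between the two templates
                dist = max(abs(signal[i + k] - signal[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1

    if a == 0 or b == 0:
        # Undefined in the strict sense; reported here as infinity.
        return float("inf")
    return math.log(b / a)    # equivalent to -ln(a / b)
```

For a perfectly periodic input such as `[0, 1] * 50`, every match of length m also matches at length m + 1, so the sketch returns 0. Irregular signals give larger values. The nested loop is quadratic in the signal length, which is fine for illustration but slow for long recordings.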
Several packages offering entropy-related functions have been released in recent years, intended primarily for the analysis of physiological data. Although these packages offer some useful tools, they lack the capacity to perform extensive data analysis with multiple methods from the cross-entropy, bidimensional entropy, and multiscale entropy families of algorithms. The utility of these packages is also limited for several reasons. Nearly all operate through graphical user interfaces (GUIs) and/or are designed for use with the MATLAB programming environment, which requires a purchased license. This paywall prevents many users from accessing the software and consequently impedes the replication of results achieved using these packages. Few have accompanying documentation describing how to use the software, and none are hosted on the native package repository for MATLAB (MathWorks File Exchange) or Python (PyPI), which facilitate direct and simplified installation and updating.
Against this background, I developed EntropyHub, an open-source toolkit for entropic time series analysis in the MATLAB, Python and Julia programming environments. Incorporating entropy estimators from information theory, probability theory and dynamical systems theory, EntropyHub features a wide range of functions to calculate the entropy of, and the cross-entropy between, univariate time series data. In contrast to other entropy-focused toolboxes, EntropyHub runs from the command line without the use of a GUI and provides many new benefits, including:
- Functions to perform refined, composite, refined-composite and hierarchical multiscale entropy analysis using more than twenty-five different entropy and cross-entropy estimators (approximate entropy, cross-sample entropy, etc.).
- Functions to calculate bidimensional entropies from two-dimensional (image) data.
- An extensive range of function arguments to specify additional parameter values in the entropy calculation, including options for time-delayed state-space reconstruction and entropy value normalisation where possible.
- Availability in multiple programming languages – MATLAB, Python and Julia – to enable open-source access and provide cross-platform translation of methods through consistent function syntax. This is the first entropy-specific toolkit for the Julia language, and the first package of its kind to be available in all three languages.
- Compatibility with Windows, Mac and Linux operating systems.
- Comprehensive documentation describing installation, function syntax, examples of use, and references to source literature. Documentation is available online at www.EntropyHub.xyz (or at MattWillFlood.github.io/EntropyHub), where it can also be downloaded as a booklet (EntropyHub Guide.pdf). Documentation specific to the MATLAB edition can also be found in the ‘supplemental software’ section of the MATLAB help browser after installation. Documentation specific to the Julia edition can also be found at MattWillFlood.github.io/EntropyHub.jl/stable.
- Hosting on the native package repositories for MATLAB (MathWorks File Exchange), Python (PyPI) and Julia (Julia General Registry), to facilitate straightforward downloading, installation and updating. The latest development releases can also be downloaded from the EntropyHub GitHub repository - www.github.com/MattWillFlood/EntropyHub.
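The multiscale analysis mentioned in the list above rests on a simple coarse-graining step: the signal is averaged over non-overlapping windows of increasing length, and an entropy estimator is applied at each scale. The sketch below shows that idea only. The names `coarse_grain` and `multiscale_profile` are hypothetical and are not EntropyHub functions, and the estimator is passed in as any callable so the example stays self-contained.

```python
from typing import Callable, List, Sequence


def coarse_grain(signal: Sequence[float], scale: int) -> List[float]:
    """Average the signal over consecutive non-overlapping windows of `scale` samples.

    Trailing samples that do not fill a complete window are dropped.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    n_windows = len(signal) // scale
    return [
        sum(signal[j * scale:(j + 1) * scale]) / scale
        for j in range(n_windows)
    ]


def multiscale_profile(
    signal: Sequence[float],
    estimator: Callable[[List[float]], float],
    max_scale: int,
) -> List[float]:
    """Apply `estimator` to the coarse-grained signal at scales 1..max_scale."""
    return [estimator(coarse_grain(signal, s)) for s in range(1, max_scale + 1)]
```

Any function that maps a sequence to a number can serve as `estimator`, so the same scaffold works with whichever entropy measure is of interest. The refined and composite variants change how the coarse-grained series are built or combined and are not covered by this sketch.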
As new measures enter the ever-growing entropy universe, EntropyHub aims to incorporate these measures accordingly.
EntropyHub is licensed under the Apache License (version 2.0) and is available for use by all, on condition that the following reference is cited in any scientific outputs realised using the EntropyHub toolkit:
Matthew W. Flood and Bernd Grimm (2021). EntropyHub: An open-source toolkit for entropic time series analysis. PLoS ONE 16(11): e0259448. DOI: 10.1371/journal.pone.0259448
Download EntropyHub from:
- EntropyHub Website
- EntropyHub Journal Paper
- GitHub repository
- MATLAB download centre
- Python package repository
- Julia package registry