Multivariate Entropies

Functions for estimating the entropy of a multivariate time series dataset.

The following functions also serve as the multivariate entropy methods used by the multivariate multiscale functions MvMSEn and cMvMSEn.

EntropyHub._MvSampEn.MvSampEn — Function

MSamp, B0, Bt, B1 = MvSampEn(Data)

Returns the multivariate sample entropy estimate (MSamp) and the average number of matched delay vectors (m: B0; joint total m+1 subspace: Bt; all possible m+1 subspaces: B1), from the M multivariate sequences in Data using the default parameters: embedding dimension = 2*ones(M), time delay = ones(M), radius threshold = 0.2, logarithm = natural, data normalization = false

Note

The entropy value returned as MSamp is estimated using the "full" method [i.e. -log(Bt/B0)], which compares delay vectors across all possible m+1 expansions of the embedding space as applied in [1][2]. Contrary to conventional definitions of sample entropy, this method does not provide a lower bound of 0. Thus, it is possible to obtain negative entropy values for multivariate sample entropy, even for stochastic processes.

Alternatively, one can calculate MSamp via the "naive" method, which ensures a lower bound of 0, by using the average number of matched vectors for an individual m+1 subspace (B1) [e.g. -log(B1[1]/B0)], or the average for all m+1 subspaces [i.e. -log(mean(B1)/B0)].
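The difference between the "full" and "naive" estimates can be sketched as follows. The values of B0, Bt and B1 below are hypothetical numbers chosen for illustration, not output from MvSampEn, and the natural logarithm is assumed.

```julia
using Statistics  # stdlib, provides mean

# Hypothetical values for M = 2 sequences (illustration only).
B0 = 0.30            # average matches in the m-dimensional space
Bt = 0.15            # average matches over the joint m+1 space
B1 = [0.10, 0.14]    # average matches in each individual m+1 subspace

full_est   = -log(Bt / B0)         # "full" method, the form returned as MSamp
naive_one  = -log(B1[1] / B0)      # "naive" method, first m+1 subspace only
naive_mean = -log(mean(B1) / B0)   # "naive" method, mean over all m+1 subspaces
```

In practice B0, Bt and B1 would be taken from the values returned by MvSampEn, so either naive estimate can be computed after a single call.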

To maximize the number of points in the embedding process, this algorithm uses N - max(m * tau) delay vectors and not N-max(m) * max(tau) as employed in [1][2].
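As a small worked example of the two counts, assuming illustrative values for N, m and tau (these are hypothetical, not defaults of the function):

```julia
# Hypothetical settings: N samples and M = 3 sequences.
N   = 1000
m   = [2, 3, 1]    # embedding dimension per sequence
tau = [4, 1, 2]    # time delay per sequence

n_here = N - maximum(m .* tau)           # N - max(m * tau)      -> 1000 - 8  = 992
n_ref  = N - maximum(m) * maximum(tau)   # N - max(m) * max(tau) -> 1000 - 12 = 988
```

The two counts coincide only when the largest embedding dimension and the largest time delay belong to the same sequence.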

MSamp, B0, Bt, B1 = MvSampEn(Data::AbstractArray{T} where T<:Real; m::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, tau::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, r::Real=0.2, Logx::Real=exp(1), Norm::Bool=false)

Returns the multivariate sample entropy estimates (MSamp) estimated from the M multivariate data sequences in Data using the specified keyword arguments:

Arguments:

Data - Multivariate dataset, NxM matrix of N (>10) observations (rows) and M (cols) univariate data sequences

m - Embedding Dimension, a vector of M positive integers

tau - Time Delay, a vector of M positive integers

r - Radius Distance Threshold, a positive scalar

Logx - Logarithm base, a positive scalar

Norm - Normalisation of all M sequences to unit variance, a boolean

See also SampEn, XSampEn, SampEn2D, MSEn, MvFuzzEn, MvPermEn

References:

[1] Ahmed Mosabber Uddin, Danilo P. Mandic
    "Multivariate multiscale entropy: A tool for complexity
    analysis of multichannel data."
    Physical Review E 84.6 (2011): 061918.

[2] Ahmed Mosabber Uddin, Danilo P. Mandic
    "Multivariate multiscale entropy analysis."
    IEEE signal processing letters 19.2 (2011): 91-94.

EntropyHub._MvFuzzEn.MvFuzzEn — Function

MFuzz, B0, Bt, B1 = MvFuzzEn(Data)

Returns the multivariate fuzzy entropy estimate (MFuzz) and the average vector distances (m: B0; joint total m+1 subspace: Bt; all possible m+1 subspaces: B1), from the M multivariate sequences in Data using the default parameters: embedding dimension = 2*ones(M,1), time delay = ones(M,1), fuzzy membership function = "default", fuzzy function parameters = (0.2, 2), logarithm = natural, data normalization = false

Note

The entropy value returned as MFuzz is estimated using the "full" method [i.e. -log(Bt/B0)], which compares delay vectors across all possible m+1 expansions of the embedding space as applied in [1][3]. Contrary to conventional definitions of sample entropy, this method does not provide a lower bound of 0. Thus, it is possible to obtain negative entropy values for multivariate fuzzy entropy, even for stochastic processes.

Alternatively, one can calculate MFuzz via the "naive" method, which ensures a lower bound of 0, by using the average vector distances for an individual m+1 subspace (B1) [e.g. -log(B1[1]/B0)], or the average for all m+1 subspaces [i.e. -log(mean(B1)/B0)].

To maximize the number of points in the embedding process, this algorithm uses N - max(m * tau) delay vectors and not N - max(m) * max(tau) as employed in [1] and [3].

MFuzz, B0, Bt, B1 = MvFuzzEn(Data::AbstractArray{T} where T<:Real; m::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, tau::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, r::Union{Real,Tuple{Real,Real}}=(.2,2.0), Fx::String="default", Logx::Real=exp(1), Norm::Bool=false)

Returns the multivariate fuzzy entropy estimates (MFuzz) estimated from the M multivariate data sequences in Data using the specified keyword arguments:

Arguments:

Data - Multivariate dataset, NxM matrix of N (>10) observations (rows) and M (cols) univariate data sequences

m - Embedding Dimension, a vector of M positive integers

tau - Time Delay, a vector of M positive integers

Fx - Fuzzy function name, one of the following: {"sigmoid", "modsampen", "default", "gudermannian", "bell", "triangular", "trapezoidal1", "trapezoidal2", "z_shaped", "gaussian", "constgaussian"}

r - Fuzzy function parameters, a scalar or a 2 element tuple of positive values. The r parameters for each fuzzy function are defined as follows: [default: (0.2, 2.0)]

            default:        r[1] = divisor of the exponential argument
                            r[2] = argument exponent (pre-division)
            sigmoid:        r[1] = divisor of the exponential argument
                            r[2] = value subtracted from argument (pre-division)
            modsampen:      r[1] = divisor of the exponential argument
                            r[2] = value subtracted from argument (pre-division)
            gudermannian:   r  = a scalar whose value is the numerator of
                                argument to gudermannian function:
                                GD(x) = atan(tanh(`r`/x))
            triangular:     r = a scalar whose value is the threshold (corner point) of the triangular function.
            trapezoidal1:   r = a scalar whose value corresponds to the upper (2r) and lower (r) corner points of the trapezoid.
            trapezoidal2:   r[1] = a value corresponding to the upper corner point of the trapezoid.
                            r[2] = a value corresponding to the lower corner point of the trapezoid.
            z_shaped:       r = a scalar whose value corresponds to the upper (2r) and lower (r) corner points of the z-shape.
            bell:           r[1] = divisor of the distance value
                            r[2] = exponent of generalized bell-shaped function
            gaussian:       r = a scalar whose value scales the slope of the Gaussian curve.
            constgaussian:  r = a scalar whose value defines the lower threshold and shape of the Gaussian curve.
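To make the role of the r parameters concrete, two of the membership functions listed above can be sketched as below. The helper names `fuzzy_default` and `fuzzy_gudermannian` are hypothetical, `d` stands for a non-negative distance between two delay vectors, and the negative sign in the exponential is an assumption; the Gudermannian form is taken directly from the parameter list. This is a reading of the descriptions, not the library's internal code.

```julia
# "default": the first element of r divides the exponential argument and the
# second is the exponent applied to the distance before division.
# The negative sign is an assumption, so that membership decays with distance.
fuzzy_default(d, r::Tuple{Real,Real}) = exp(-(d^r[2]) / r[1])

# "gudermannian": GD(x) = atan(tanh(r/x)), with scalar r as the numerator.
fuzzy_gudermannian(d, r::Real) = atan(tanh(r / d))

# Membership degree of a distance of 0.2 under the default parameters (0.2, 2.0).
mu = fuzzy_default(0.2, (0.2, 2.0))
```

Under this reading, a larger first parameter widens the region of distances that are treated as similar.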

Logx - Logarithm base, a positive scalar

Norm - Normalisation of all M sequences to unit variance, a boolean

See also MvSampEn, FuzzEn, XFuzzEn, FuzzEn2D, MSEn, MvPermEn

References:

[1] Ahmed, Mosabber U., et al. 
    "A multivariate multiscale fuzzy entropy algorithm with application
    to uterine EMG complexity analysis." 
    Entropy 19.1 (2016): 2.

[2] Hamed Azami, Alberto Fernández, Javier Escudero.
    "Refined multiscale fuzzy entropy based on standard deviation for 
    biomedical signal analysis." 
    Medical & biological engineering & computing 55 (2017): 2037-2052.

[3] Ahmed Mosabber Uddin, Danilo P. Mandic
    "Multivariate multiscale entropy analysis."
    IEEE signal processing letters 19.2 (2011): 91-94.

EntropyHub._MvPermEn.MvPermEn — Function

MPerm, MPnorm = MvPermEn(Data)

Returns the multivariate permutation entropy estimate (MPerm) and the normalized permutation entropy (MPnorm) for the M multivariate sequences in Data using the default parameters: embedding dimension = 2*ones(M,1), time delay = ones(M,1), logarithm = base 2, normalisation = w.r.t. # symbols (sum(m)-1)

Note

The multivariate permutation entropy algorithm implemented here uses multivariate embedding based on Takens' embedding theorem, and follows the methods for multivariate entropy estimation through shared spatial reconstruction as originally presented by Ahmed & Mandic [1].

This function does NOT use the multivariate permutation entropy algorithm of Morabito et al. (Entropy, 2012) where the entropy values of individual univariate sequences are averaged because such methods do not follow the definition of multivariate embedding and therefore do not consider cross-channel statistical complexity.

To maximize the number of points in the embedding process, this algorithm uses N - max(m * tau) delay vectors and not N - max(m) * max(tau) as employed in [1].
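The multivariate embedding described in this note can be sketched as follows: each composite delay vector concatenates m[k] samples, spaced tau[k] apart, from every sequence k, and an ordinal pattern is then taken over the whole composite vector. The helper `mv_embed` and the toy dataset are hypothetical, and applying `sortperm` to the composite vector is one plausible reading of the shared reconstruction, not a copy of the library's implementation.

```julia
# Build composite delay vectors from an N x M matrix.
# Sequence k contributes m[k] samples spaced tau[k] apart.
function mv_embed(Data::AbstractMatrix{<:Real}, m::Vector{Int}, tau::Vector{Int})
    N, M = size(Data)
    nvec = N - maximum(m .* tau)      # number of delay vectors, as stated above
    X = zeros(nvec, sum(m))           # one composite vector per row
    for i in 1:nvec
        col = 1
        for k in 1:M, j in 0:m[k]-1
            X[i, col] = Data[i + j * tau[k], k]
            col += 1
        end
    end
    return X
end

# Toy dataset: 6 observations of 2 sequences (hypothetical values).
Data = [1.0 10.0; 2.0 9.0; 4.0 7.0; 3.0 8.0; 5.0 6.0; 6.0 5.0]
X = mv_embed(Data, [2, 2], [1, 1])

# Ordinal pattern of each composite vector (rank order across all sequences).
patterns = [sortperm(X[i, :]) for i in 1:size(X, 1)]
```

Because the pattern is taken over samples from all sequences at once, it reflects cross-channel ordering, which averaging univariate entropies cannot capture.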

MPerm, MPnorm = MvPermEn(Data::AbstractArray{T} where T<:Real; m::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, tau::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, Typex::String="none", tpx::Union{Int,Nothing}=nothing, Norm::Bool=false, Logx::Real=2)

Returns the multivariate permutation entropy estimate (MPerm) for the M multivariate data sequences in Data using the specified keyword arguments:

Arguments:

Data - Multivariate dataset, NxM matrix of N (>10) observations (rows) and M (cols) univariate data sequences

m - Embedding Dimension, a vector of M positive integers

tau - Time Delay, a vector of M positive integers

Typex - Permutation entropy variation, can be one of the following strings:

        {"modified", "ampaware", "weighted", "edge", "phase"}
        See the EntropyHub Guide (https://github.com/MattWillFlood/EntropyHub/blob/main/EntropyHub%20Guide.pdf) for more info on MvPermEn variants.

tpx - Tuning parameter for associated permutation entropy variation.

        *   [ampaware]  `tpx` is the A parameter, a value in range [0 1]; default = 0.5
        *   [edge]      `tpx` is the r sensitivity parameter, a scalar > 0; default = 1
        *   [phase]     `tpx` is the option to unwrap the phase angle of Hilbert-transformed signal, either [] or 1 (default = 0)

Norm - Normalisation of MPnorm value, a boolean:

        *   false -  normalises w.r.t log(# of permutation symbols [sum(m)-1]) - default
        *   true  -  normalises w.r.t log(# of all possible permutations [sum(m)!])
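The two normalising constants can be computed as below. This is a literal reading of the Norm description and may differ in detail from the library's internal code; the value of MPerm is hypothetical and m and Logx are illustrative.

```julia
# Illustrative settings and a hypothetical entropy value (not library output).
m     = [2, 2]
Logx  = 2
MPerm = 1.2

norm_symbols = log(Logx, sum(m) - 1)          # Norm = false: log(sum(m) - 1)
norm_perms   = log(Logx, factorial(sum(m)))   # Norm = true:  log(sum(m)!)

MPnorm_false = MPerm / norm_symbols
MPnorm_true  = MPerm / norm_perms
```

Since sum(m)! grows much faster than sum(m) - 1, the Norm = true option yields the smaller normalized value for the same MPerm.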

Logx - Logarithm base, a positive scalar

See also PermEn, PermEn2D, XPermEn, MSEn, MvFuzzEn, MvSampEn

References:

[1] Ahmed Mosabber Uddin, Danilo P. Mandic
    "Multivariate multiscale entropy: A tool for complexity
    analysis of multichannel data."
    Physical Review E 84.6 (2011): 061918.

[2] Christoph Bandt and Bernd Pompe, 
    "Permutation entropy: A natural complexity measure for time series." 
    Physical Review Letters,
    88.17 (2002): 174102.

[3] Chunhua Bian, et al.,
    "Modified permutation-entropy analysis of heartbeat dynamics."
    Physical Review E
    85.2 (2012) : 021906

[4] Bilal Fadlallah, et al.,
    "Weighted-permutation entropy: A complexity measure for time 
    series incorporating amplitude information." 
    Physical Review E 
    87.2 (2013): 022911.

[5] Hamed Azami and Javier Escudero,
    "Amplitude-aware permutation entropy: Illustration in spike 
    detection and signal segmentation." 
    Computer methods and programs in biomedicine,
    128 (2016): 40-51.

[6] Zhiqiang Huo, et al.,
    "Edge Permutation Entropy: An Improved Entropy Measure for 
    Time-Series Analysis," 
    45th Annual Conference of the IEEE Industrial Electronics Soc,
    (2019), 5998-6003

[7] Maik Riedl, Andreas Müller, and Niels Wessel,
    "Practical considerations of permutation entropy." 
    The European Physical Journal Special Topics 
    222.2 (2013): 249-262.

[8] Kang Huan, Xiaofeng Zhang, and Guangbin Zhang,
    "Phase permutation entropy: A complexity measure for nonlinear time
    series incorporating phase information."
    Physica A: Statistical Mechanics and its Applications
    568 (2021): 125686.

EntropyHub._MvDispEn.MvDispEn — Function

MDisp, RDE = MvDispEn(Data)

Returns the multivariate dispersion entropy estimate (MDisp) and the reverse dispersion entropy (RDE) for the M multivariate sequences in Data using the default parameters: embedding dimension = 2*ones(M,1), time delay = ones(M,1), # symbols = 3, algorithm method = "v1" (see below), data transform = normalised cumulative density function (ncdf), logarithm = natural, data normalization = false

Note

By default, MvDispEn uses the method termed mvDEii in [1], which follows the original multivariate embedding approach of Ahmed & Mandic [2]. The v1 method therefore returns a singular entropy estimate.

If the v2 method is selected (Methodx=="v2"), the main method outlined in [1] termed mvDE is applied. In this case, entropy is estimated using each combination of multivariate delay vectors with lengths 1:max(m), with each entropy value returned accordingly. See [1] for more info.

MDisp, RDE = MvDispEn(Data::AbstractArray{T,2} where T<:Real; m::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, tau::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, c::Int=3, Methodx::String="v1", Typex::String="NCDF", Norm::Bool=false, Logx::Real=exp(1))

Returns the multivariate dispersion entropy estimate (MDisp) for the M multivariate data sequences in Data using the specified keyword arguments:

Arguments:

Data - Multivariate dataset, NxM matrix of N (>10) observations (rows) and M (cols) univariate data sequences

m - Embedding Dimension, a vector of M positive integers

tau - Time Delay, a vector of M positive integers

c - Number of symbols in transform, an integer > 1

Methodx - The method of multivariate dispersion entropy estimation as outlined in [1], either:

    * `"v1"` - employs the method consistent with the original multivariate embedding approach of Ahmed &
                Mandic [2], termed `mvDEii` in [1]. (default)
    * `"v2"` - employs the main method derived in [1],  termed `mvDE`.

Typex - Type of data-to-symbolic sequence transform, one of the following:

            {"linear", "kmeans", "ncdf", "equal"}
        See the EntropyHub Guide for more info on these transforms.

Norm - Normalisation of MDisp and RDE values, a boolean:

            * [false]   no normalisation (default)
            * [true]    normalises w.r.t number of possible dispersion patterns (`c^m`).

Logx - Logarithm base, a positive scalar

See also DispEn, DispEn2D, MvSampEn, MvFuzzEn, MvPermEn, MSEn

References:

[1] H Azami, A Fernández, J Escudero
      "Multivariate Multiscale Dispersion Entropy of Biomedical Times Series"
      Entropy 2019, 21, 913.

[2] Ahmed Mosabber Uddin, Danilo P. Mandic
      "Multivariate multiscale entropy: A tool for complexity
      analysis of multichannel data."
      Physical Review E 84.6 (2011): 061918.

[3] Mostafa Rostaghi and Hamed Azami,
       "Dispersion entropy: A measure for time-series analysis." 
       IEEE Signal Processing Letters 
       23.5 (2016): 610-614.

[4] Hamed Azami and Javier Escudero,
       "Amplitude-and fluctuation-based dispersion entropy." 
       Entropy 
       20.3 (2018): 210.

[5] Li Yuxing, Xiang Gao and Long Wang,
       "Reverse dispersion entropy: A new complexity measure for sensor signal." 
       Sensors 
       19.23 (2019): 5203.

EntropyHub._MvCoSiEn.MvCoSiEn — Function

MCoSi, Bm = MvCoSiEn(Data)

Returns the multivariate cosine similarity entropy estimate (MCoSi) and the corresponding global probabilities (Bm) estimated for the M multivariate sequences in Data using the default parameters: embedding dimension = 2*ones(M), time delay = ones(M), angular threshold = 0.1, logarithm = base 2, data normalization = none

Note

To maximize the number of points in the embedding process, this algorithm uses N-max(m * tau) delay vectors and not N-max(m) * max(tau) as employed in [1][2].

MCoSi, Bm = MvCoSiEn(Data::AbstractArray{T,2} where T<:Real; m::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, tau::Union{AbstractArray{T} where T<:Int, Nothing}=nothing, r::Real=.1, Logx::Real=2, Norm::Int=0)

Returns the multivariate cosine similarity entropy estimates (MCoSi) estimated from the M multivariate data sequences in Data using the specified keyword arguments:

Arguments:

Data - Multivariate dataset, NxM matrix of N (>10) observations (rows) and M (cols) univariate data sequences

m - Embedding Dimension, a vector of M positive integers

tau - Time Delay, a vector of M positive integers

r - Angular threshold, a value in range [0 < r < 1]

Logx - Logarithm base, a positive scalar (enter 0 for natural log)

Norm - Normalisation of Data, one of the following integers:

        *  [0]  no normalisation - default
        *  [1]  remove median(`Data`) to get zero-median series
        *  [2]  remove mean(`Data`) to get zero-mean series
        *  [3]  normalises each sequence in `Data` to unit variance and zero mean
        *  [4]  normalises each sequence in `Data` to the range [-1 1]
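The Norm options can be sketched column-wise as follows. The helper name `normalise` is hypothetical, and treating each column separately for options 1 and 2, as well as using the sample standard deviation for option 3, are assumptions rather than documented behaviour.

```julia
using Statistics  # stdlib: mean, median, std

# Column-wise sketch of the Norm options for an N x M data matrix.
function normalise(Data::AbstractMatrix{<:Real}, Norm::Int)
    if Norm == 1
        return Data .- median(Data, dims=1)                        # zero-median
    elseif Norm == 2
        return Data .- mean(Data, dims=1)                          # zero-mean
    elseif Norm == 3
        return (Data .- mean(Data, dims=1)) ./ std(Data, dims=1)   # zero mean, unit variance
    elseif Norm == 4
        lo, hi = minimum(Data, dims=1), maximum(Data, dims=1)
        return 2 .* (Data .- lo) ./ (hi .- lo) .- 1                # rescale to [-1, 1]
    end
    return Data   # Norm == 0: data left unchanged
end

Data = [1.0 2.0; 2.0 4.0; 3.0 9.0]   # toy 3 x 2 dataset (hypothetical values)
Z = normalise(Data, 4)
```

Option 4 maps the minimum of each column to -1 and the maximum to 1, so sequences with very different amplitudes end up on a common scale.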

See also CoSiEn, MvDispEn, MvSampEn, MvFuzzEn, MvPermEn, MSEn

References:

[1] H. Xiao, T. Chanwimalueang and D. P. Mandic, 
    "Multivariate Multiscale Cosine Similarity Entropy" 
    2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
    pp. 5997-6001, doi: 10.1109/ICASSP43922.2022.9747282.

[2] Xiao, H.; Chanwimalueang, T.; Mandic, D.P., 
    "Multivariate Multiscale Cosine Similarity Entropy and Its 
    Application to Examine Circularity Properties in Division Algebras."
    Entropy 2022, 24, 1287. 

[3] Ahmed Mosabber Uddin, Danilo P. Mandic
    "Multivariate multiscale entropy: A tool for complexity
    analysis of multichannel data."
    Physical Review E 84.6 (2011): 061918.

[4] Theerasak Chanwimalueang and Danilo Mandic,
    "Cosine similarity entropy: Self-correlation-based complexity
    analysis of dynamical systems."
    Entropy 
    19.12 (2017): 652.
