Merge branch 'master' of https://github.com/eonu/sequentia into master
eonu committed Jan 7, 2021
2 parents 2e23549 + da74076 commit 403f0e6
Showing 10 changed files with 178 additions and 78 deletions.
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,17 @@
## [0.10.2](https://github.com/eonu/sequentia/releases/tag/v0.10.2)

#### Major changes

- Add support for dependent feature warping (addresses [#124](https://github.com/eonu/sequentia/pull/124)). ([#135](https://github.com/eonu/sequentia/pull/135))
- Add multi-processed predictions for `HMMClassifier` (addresses [#121](https://github.com/eonu/sequentia/pull/121)). ([#136](https://github.com/eonu/sequentia/pull/136))
- Re-order `predict()` and `evaluate()` arguments. ([#138](https://github.com/eonu/sequentia/pull/138))

#### Minor changes

- Add `original_labels` documentation to `KNNClassifier`. ([#133](https://github.com/eonu/sequentia/pull/133))
- Simplify `GMMHMM` documentation. ([#134](https://github.com/eonu/sequentia/pull/134))
- Fix posterior comment in `classifier.svg`. ([#137](https://github.com/eonu/sequentia/pull/137))

## [0.10.1](https://github.com/eonu/sequentia/releases/tag/v0.10.1)

#### Minor changes
5 changes: 3 additions & 2 deletions README.md
@@ -58,12 +58,13 @@ The following algorithms provided within Sequentia support the use of multivaria

### Classification algorithms

- [x] Hidden Markov Models (via [`hmmlearn`](https://github.com/hmmlearn/hmmlearn))<br/><em>Learning with the Baum-Welch algorithm [[1]](#references)</em>
- [x] Hidden Markov Models (via [`hmmlearn`](https://github.com/hmmlearn/hmmlearn))<br/><em>Learning with the Baum-Welch algorithm</em> [[1]](#references)
- [x] Gaussian Mixture Model emissions
- [x] Linear, left-right and ergodic topologies
- [x] Multi-processed predictions
- [x] Dynamic Time Warping k-Nearest Neighbors (via [`dtaidistance`](https://github.com/wannesm/dtaidistance))
- [x] Sakoe–Chiba band global warping constraint
- [x] Feature-independent warping (DTWI)
- [x] Dependent and independent feature warping (DTWD & DTWI)
- [x] Custom distance-weighted predictions
- [x] Multi-processed predictions
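
As a quick illustration of the two features highlighted above (dependent feature warping and multi-processed predictions), here is a minimal, hypothetical usage sketch. The toy data and labels are invented; the `KNNClassifier` constructor and `predict()` keyword arguments follow the diffs to `knn_classifier.py` below, and the import assumes the classifiers are re-exported at the package root as `lib/sequentia/__init__.py` suggests.

```python
import numpy as np
from sequentia import KNNClassifier  # assumed top-level export

# Toy dataset: 10 observation sequences, each a (length, n_features) array
X = [np.random.rand(np.random.randint(20, 30), 3) for _ in range(10)]
y = ['a', 'b'] * 5

# independent=False selects dependent feature warping (DTWD);
# independent=True selects feature-independent warping (DTWI)
clf = KNNClassifier(k=1, classes=['a', 'b'], independent=False)
clf.fit(X, y)

# n_jobs=-1 spreads the predictions over all available CPU cores
predictions = clf.predict(X, n_jobs=-1, verbose=False)
```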

2 changes: 1 addition & 1 deletion docs/_static/classifier.svg
(Diff not rendered for this SVG file.)
21 changes: 10 additions & 11 deletions docs/sections/classifiers/gmmhmm.rst
@@ -70,17 +70,16 @@ Note that even in the case that multiple Gaussian densities are not needed, the
can be adjusted so that irrelevant Gaussians are omitted and only a single Gaussian remains.
However, the default setting of the :class:`~GMMHMM` class is a single Gaussian.

Then a GMM-HMM is completely determined by the learnable parameters
:math:`\lambda=(\boldsymbol{\pi}, A, B)` where :math:`B=(C,\Pi,\Psi)` and
Then a GMM-HMM is completely determined by
:math:`\lambda=(\boldsymbol{\pi}, A, B)`, where :math:`B` is a collection of
:math:`M` emission distributions (one for each state :math:`m=1,\ldots,M`), which are each
parameterized by a collection of

- :math:`C=\big(c_1^{(m)}, \ldots, c_K^{(m)}\big)_{m=1}^M` is
a collection of the mixture weights,
- :math:`\Pi=\big(\boldsymbol\mu_1^{(m)}, \ldots, \boldsymbol\mu_K^{(m)}\big)_{m=1}^M` is
a collection of the mean vectors,
- :math:`\Psi=\big(\Sigma_1^{(m)}, \ldots, \Sigma_K^{(m)}\big)_{m=1}^M` is
a collection of the covariance matrices,
- mixture weights :math:`c_1^{(m)}, \ldots, c_K^{(m)}`,
- mean vectors :math:`\boldsymbol\mu_1^{(m)}, \ldots, \boldsymbol\mu_K^{(m)}`,
- covariance matrices :math:`\Sigma_1^{(m)}, \ldots, \Sigma_K^{(m)}`,

for every mixture component of each state of the HMM.
for each of the :math:`1,\ldots,K` mixture components of each state.
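
For reference (an aside, not part of this diff), the emission density these parameters define for state :math:`m` is the usual Gaussian mixture:

.. math::

    b_m(\mathbf{o}) = \sum_{k=1}^{K} c_k^{(m)} \, \mathcal{N}\big(\mathbf{o} \,;\, \boldsymbol\mu_k^{(m)}, \Sigma_k^{(m)}\big),
    \qquad \sum_{k=1}^{K} c_k^{(m)} = 1.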

Usually if :math:`K` is large enough, a mixture of :math:`K` Gaussian densities can effectively
model any probability density function. With large enough :math:`K`, we can also restrict the
@@ -89,8 +88,8 @@ and at the same time decrease the number of parameters that need to be updated d

The covariance matrix type can be specified by a string parameter ``covariance_type`` in the
:class:`~GMMHMM` constructor that takes values `'spherical'`, `'diag'`, `'full'` or `'tied'`.
The various types are explained well in this `StackExchange answer <https://stats.stackexchange.com/a/326678>`_,
and summarized in the below image (also courtesy of the same StackExchange answerer).
The various types are explained well `here <https://stats.stackexchange.com/a/326678>`_,
and summarized in the below image (also courtesy of the author of the response in the previous link).

.. image:: /_static/covariance_types.png
:alt: Covariance Types
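
As a brief aside on what the four settings conventionally mean for a component covariance matrix :math:`\Sigma_k^{(m)}` (this is the usual GMM reading; the exact per-state behaviour is determined by hmmlearn, so treat it as a summary rather than a definitive statement):

- ``'spherical'``: :math:`\Sigma = \sigma^2 I`, a single variance per component
- ``'diag'``: :math:`\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_D^2)`, one variance per feature
- ``'full'``: an unrestricted symmetric positive-definite matrix
- ``'tied'``: one full covariance matrix shared by all mixture components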
2 changes: 1 addition & 1 deletion lib/sequentia/__init__.py
@@ -1,4 +1,4 @@
__version__ = '0.10.1'
__version__ = '0.10.2'

from .classifiers import *
from .preprocessing import *
53 changes: 47 additions & 6 deletions lib/sequentia/classifiers/hmm/hmm_classifier.py
@@ -1,4 +1,6 @@
import numpy as np, pickle
import tqdm, tqdm.auto, numpy as np, pickle
from joblib import Parallel, delayed
from multiprocessing import cpu_count
from .gmmhmm import GMMHMM
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
@@ -43,7 +45,7 @@ def fit(self, models):
self._encoder = LabelEncoder()
self._encoder.fit([model.label for model in models])

def predict(self, X, prior='frequency', return_scores=False, original_labels=True):
def predict(self, X, prior='frequency', return_scores=False, original_labels=True, verbose=True, n_jobs=1):
"""Predicts the label for an observation sequence (or multiple sequences) according to maximum likelihood or posterior scores.
Parameters
@@ -66,6 +68,18 @@ def predict(self, X, prior='frequency', return_scores=False, original_labels=Tru
original_labels: bool
Whether to inverse-transform the labels to their original encoding.
verbose: bool
Whether to display a progress bar or not.
.. note::
If both ``verbose=True`` and ``n_jobs > 1``, then the progress bars for each process
are always displayed in the console, regardless of where you are running this function from
(e.g. a Jupyter notebook).
n_jobs: int > 0 or -1
| The number of jobs to run in parallel.
| Setting this to -1 will use all available CPU cores.
Returns
-------
prediction(s): str/numeric or :class:`numpy:numpy.ndarray` (str/numeric)
@@ -92,6 +106,9 @@ def predict(self, X, prior='frequency', return_scores=False, original_labels=Tru
else:
self._val.one_of(prior, ['frequency', 'uniform'], desc='prior')
self._val.boolean(return_scores, desc='return_scores')
self._val.boolean(original_labels, desc='original_labels')
self._val.boolean(verbose, desc='verbose')
self._val.restricted_integer(n_jobs, lambda x: x == -1 or x > 0, 'number of jobs', '-1 or greater than zero')

# Create look-up for prior probabilities
if prior == 'frequency':
@@ -105,10 +122,15 @@ def predict(self, X, prior='frequency', return_scores=False, original_labels=Tru
# Convert single observation sequence to a singleton list
X = [X] if isinstance(X, np.ndarray) else X

# Lambda for calculating the log un-normalized posteriors as a sum of the log forward probabilities (likelihoods) and log priors
posteriors = lambda x: np.array([model.forward(x) + np.log(prior[model.label]) for model in self._models])

# Calculate log un-normalized posteriors as a sum of the log forward probabilities (likelihoods) and log priors
# Perform the MAP classification rule and return labels to original encoding if necessary
posteriors = lambda x: np.array([model.forward(x) + np.log(prior[model.label]) for model in self._models])
scores = np.array([posteriors(x) for x in X])
n_jobs = min(cpu_count() if n_jobs == -1 else n_jobs, len(X))
X_chunks = [list(chunk) for chunk in np.array_split(np.array(X, dtype=object), n_jobs)]
scores = Parallel(n_jobs=n_jobs)(delayed(self._chunk_predict)(i+1, posteriors, chunk, verbose) for i, chunk in enumerate(X_chunks))
scores = np.concatenate(scores)
best_idxs = np.atleast_1d(scores.argmax(axis=1))
labels = self._encoder.inverse_transform(best_idxs) if original_labels else best_idxs
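
The new parallel path above splits the batch into one chunk per job, scores each chunk in a separate process, then stitches the per-chunk score arrays back together; each score is the log forward likelihood plus the log class prior, and the row-wise arg max gives the MAP label. A minimal standalone sketch of the same chunking pattern (the scoring function is a stand-in, not Sequentia's):

```python
import numpy as np
from joblib import Parallel, delayed
from multiprocessing import cpu_count

def score_chunk(chunk):
    # Stand-in for _chunk_predict: produce one score per sequence in the chunk
    return np.array([seq.sum() for seq in chunk])

# Toy batch of variable-length observation sequences
X = [np.random.rand(np.random.randint(10, 20), 3) for _ in range(7)]

n_jobs = -1
n_jobs = min(cpu_count() if n_jobs == -1 else n_jobs, len(X))  # never more jobs than sequences
X_chunks = [list(chunk) for chunk in np.array_split(np.array(X, dtype=object), n_jobs)]
scores = Parallel(n_jobs=n_jobs)(delayed(score_chunk)(chunk) for chunk in X_chunks)
scores = np.concatenate(scores)  # one score per sequence, original order preserved
```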

@@ -117,7 +139,7 @@ def predict(self, X, prior='frequency', return_scores=False, original_labels=Tru
else:
return (labels, scores) if return_scores else labels

def evaluate(self, X, y, prior='frequency'):
def evaluate(self, X, y, prior='frequency', verbose=True, n_jobs=1):
"""Evaluates the performance of the classifier on a batch of observation sequences and their labels.
Parameters
Expand All @@ -137,6 +159,18 @@ def evaluate(self, X, y, prior='frequency'):
Alternatively, class prior probabilities can be specified in an iterable of floats, e.g. `[0.1, 0.3, 0.6]`.
verbose: bool
Whether to display a progress bar or not.
.. note::
If both ``verbose=True`` and ``n_jobs > 1``, then the progress bars for each process
are always displayed in the console, regardless of where you are running this function from
(e.g. a Jupyter notebook).
n_jobs: int > 0 or -1
| The number of jobs to run in parallel.
| Setting this to -1 will use all available CPU cores.
Returns
-------
accuracy: float
@@ -146,7 +180,7 @@ def evaluate(self, X, y, prior='frequency'):
The confusion matrix representing the discrepancy between predicted and actual labels.
"""
X, y = self._val.observation_sequences_and_labels(X, y)
predictions = self.predict(X, prior=prior, return_scores=False, original_labels=False)
predictions = self.predict(X, prior=prior, return_scores=False, original_labels=False, verbose=verbose, n_jobs=n_jobs)
cm = confusion_matrix(self._encoder.transform(y), predictions, labels=self._encoder.transform(self._encoder.classes_))
return np.sum(np.diag(cm)) / np.sum(cm), cm

@@ -183,6 +217,13 @@ def load(cls, path):
with open(path, 'rb') as file:
return pickle.load(file)

def _chunk_predict(self, process, posteriors, chunk, verbose): # Requires fit
"""Makes predictions (scores) for a chunk of the observation sequences, for a given subprocess."""
return np.array([posteriors(x) for x in tqdm.auto.tqdm(
chunk, desc='Classifying examples (process {})'.format(process),
disable=not(verbose), position=process-1
)])

@property
def models(self):
try:
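
Taken together, a hypothetical end-to-end use of the new `predict()`/`evaluate()` signatures might look like the sketch below. Only the `HMMClassifier` calls, the `label` attribute and `forward()` method mirror the diff above; the `GMMHMM` constructor and initialisation calls, the data and the import path are assumptions made for illustration.

```python
import numpy as np
from sequentia import GMMHMM, HMMClassifier  # assumed top-level exports

# Hypothetical per-class training data: each sequence is a (length, n_features) array
data = {'a': [np.random.rand(25, 3) for _ in range(5)],
        'b': [np.random.rand(25, 3) for _ in range(5)]}

# One GMMHMM per class (constructor/initialisation arguments are assumptions)
models = []
for label, sequences in data.items():
    model = GMMHMM(label=label, n_states=3)
    model.set_random_initial()
    model.set_random_transitions()
    model.fit(sequences)
    models.append(model)

clf = HMMClassifier()
clf.fit(models)

X_test = [np.random.rand(25, 3) for _ in range(4)]
y_test = ['a', 'b', 'a', 'b']

# New keyword arguments from this commit: verbose toggles the per-process progress
# bars, n_jobs sets the number of parallel processes (-1 = all available CPU cores)
predictions = clf.predict(X_test, prior='frequency', verbose=False, n_jobs=-1)
accuracy, confusion = clf.evaluate(X_test, y_test, prior='frequency', verbose=False, n_jobs=-1)
```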
38 changes: 28 additions & 10 deletions lib/sequentia/classifiers/knn/knn_classifier.py
@@ -1,7 +1,7 @@
import warnings, tqdm, tqdm.auto, numpy as np, types, pickle, marshal
from joblib import Parallel, delayed
from multiprocessing import cpu_count
from dtaidistance import dtw
from dtaidistance import dtw, dtw_ndim
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
from ...internals import _Validator
@@ -20,7 +20,7 @@ class KNNClassifier:
k: int > 0
Number of neighbors.
classes: array-liike of str/numeric
classes: array-like of str/numeric
The complete set of possible classes/labels.
weighting: 'uniform' or callable
@@ -60,6 +60,9 @@ class KNNClassifier:
pip install -vvv --upgrade --no-cache-dir --force-reinstall dtaidistance
independent: bool
Whether or not to allow features to be warped independently from each other. See `here <https://www.cs.ucr.edu/~eamonn/Multi-Dimensional_DTW_Journal.pdf>`_ for a good overview of both approaches.
random_state: numpy.random.RandomState, int, optional
A random state object or seed for reproducible randomness.
@@ -84,7 +87,7 @@ class KNNClassifier:
The complete set of possible classes/labels.
"""

def __init__(self, k, classes, weighting='uniform', window=1., use_c=False, random_state=None):
def __init__(self, k, classes, weighting='uniform', window=1., use_c=False, independent=False, random_state=None):
self._val = _Validator()
self._k = self._val.restricted_integer(
k, lambda x: x > 0, desc='number of neighbors', expected='greater than zero')
@@ -116,6 +119,9 @@ def __init__(self, k, classes, weighting='uniform', window=1., use_c=False, rand
warnings.warn('DTAIDistance C library not available – using Python implementation', ImportWarning)
self._use_c = False

self._independent = self._val.boolean(independent, 'independent')
self._dtw = self._dtwi if independent else self._dtwd

def fit(self, X, y):
"""Fits the classifier by adding labeled training observation sequences.
@@ -131,14 +137,17 @@ def fit(self, X, y):
self._X, self._y = X, self._encoder.transform(y)
self._n_features = X[0].shape[1]

def predict(self, X, verbose=True, original_labels=True, n_jobs=1):
def predict(self, X, original_labels=True, verbose=True, n_jobs=1):
"""Predicts the label for an observation sequence (or multiple sequences).
Parameters
----------
X: numpy.ndarray (float) or list of numpy.ndarray (float)
An individual observation sequence or a list of multiple observation sequences.
original_labels: bool
Whether to inverse-transform the labels to their original encoding.
verbose: bool
Whether to display a progress bar or not.
@@ -165,16 +174,17 @@ def predict(self, X, verbose=True, original_labels=True, n_jobs=1):
raise RuntimeError('The classifier needs to be fitted before predictions are made')

X = self._val.observation_sequences(X, allow_single=True)
self._val.boolean(original_labels, desc='original_labels')
self._val.boolean(verbose, desc='verbose')
self._val.restricted_integer(n_jobs, lambda x: x == -1 or x > 0, 'number of jobs', '-1 or greater than zero')

if isinstance(X, np.ndarray):
distances = np.array([self._dtw(X, x) for x in tqdm.auto.tqdm(self._X, desc='Calculating distances', disable=not(verbose))])
return self._output(self._find_nearest(distances), original_labels)
else:
n_jobs = cpu_count() if n_jobs == -1 else n_jobs
n_jobs = min(cpu_count() if n_jobs == -1 else n_jobs, len(X))
X_chunks = [list(chunk) for chunk in np.array_split(np.array(X, dtype=object), n_jobs)]
labels = Parallel(n_jobs=min(n_jobs, len(X)))(delayed(self._chunk_predict)(i+1, chunk, verbose) for i, chunk in enumerate(X_chunks))
labels = Parallel(n_jobs=n_jobs)(delayed(self._chunk_predict)(i+1, chunk, verbose) for i, chunk in enumerate(X_chunks))
return self._output(np.concatenate(labels), original_labels) # Flatten the resulting array

def evaluate(self, X, y, verbose=True, n_jobs=1):
@@ -205,7 +215,7 @@ def evaluate(self, X, y, verbose=True, n_jobs=1):
"""
X, y = self._val.observation_sequences_and_labels(X, y)
self._val.boolean(verbose, desc='verbose')
predictions = self.predict(X, verbose=verbose, original_labels=False, n_jobs=n_jobs)
predictions = self.predict(X, original_labels=False, verbose=verbose, n_jobs=n_jobs)
cm = confusion_matrix(self._encoder.transform(y), predictions, labels=self._encoder.transform(self._encoder.classes_))
return np.sum(np.diag(cm)) / np.sum(cm), cm

@@ -235,6 +245,7 @@ def save(self, path):
'weighting': marshal.dumps((self._weighting.__code__, self._weighting.__name__)),
'window': self._window,
'use_c': self._use_c,
'independent': self._independent,
'random_state': self._random_state,
'X': self._X,
'y': self._y,
@@ -259,7 +270,7 @@ def load(cls, path):
data = pickle.load(file)

# Check deserialized object dictionary and keys
keys = set(('k', 'classes', 'weighting', 'window', 'use_c', 'random_state', 'X', 'y', 'n_features'))
keys = set(('k', 'classes', 'weighting', 'window', 'use_c', 'independent', 'random_state', 'X', 'y', 'n_features'))
if not isinstance(data, dict):
raise TypeError('Expected deserialized object to be a dictionary - make sure the object was serialized with the save() function')
else:
@@ -277,6 +288,7 @@ def load(cls, path):
weighting=weighting,
window=data['window'],
use_c=data['use_c'],
independent=data['independent'],
random_state=data['random_state']
)
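
One detail worth noting in `save()`/`load()`: because the distance-weighting callable cannot be pickled reliably, its code object is serialized with `marshal`. A self-contained sketch of that round trip follows; the reconstruction side is an assumption made for illustration, since the corresponding line of `load()` is not shown in this hunk.

```python
import marshal, types

def weighting(distances):
    # Example distance-weighting function
    return 1.0 / (1.0 + distances)

# Serialize the function's code object together with its name, as save() does
payload = marshal.dumps((weighting.__code__, weighting.__name__))

# Reconstruct a callable from the payload (assumed to mirror what load() does)
code, name = marshal.loads(payload)
restored = types.FunctionType(code, globals(), name)

print(restored(3.0))  # 0.25
```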

@@ -290,11 +302,16 @@ def _dtw_1d(self, a, b, window): # Requires fit
"""Computes the DTW distance between two univariate sequences."""
return dtw.distance(a, b, use_c=self._use_c, window=window)

def _dtw(self, A, B): # Requires fit
"""Computes the multivariate DTW distance as the sum of the pairwise per-feature DTW distances."""
def _dtwi(self, A, B): # Requires fit
"""Computes the multivariate DTW distance as the sum of the pairwise per-feature DTW distances, allowing each feature to be warped independently."""
window = max(1, int(self._window * max(len(A), len(B))))
return np.sum([self._dtw_1d(A[:, i], B[:, i], window=window) for i in range(self._n_features)])

def _dtwd(self, A, B): # Requires fit
"""Computes the multivariate DTW distance so that the warping of the features depends on each other, by modifying the local distance measure."""
window = max(1, int(self._window * max(len(A), len(B))))
return dtw_ndim.distance(A, B, use_c=self._use_c, window=window)
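
In dtaidistance terms, the difference between the two warping modes comes down to the two calls above; a minimal standalone comparison on toy data (the window fraction of 0.5 is arbitrary, mirroring how the methods above derive the Sakoe–Chiba window from the longer sequence):

```python
import numpy as np
from dtaidistance import dtw, dtw_ndim

A = np.random.rand(50, 3)  # two multivariate sequences with 3 features each
B = np.random.rand(60, 3)

# Sakoe–Chiba window as a fraction of the longer sequence, as in the methods above
window = max(1, int(0.5 * max(len(A), len(B))))

# DTWI: each feature is warped independently and the per-feature distances are summed
dtwi = sum(dtw.distance(A[:, i], B[:, i], window=window) for i in range(A.shape[1]))

# DTWD: all features are warped together under a single multivariate local distance
dtwd = dtw_ndim.distance(A, B, window=window)

print(dtwi, dtwd)
```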

def _argmax(self, a):
"""Same as numpy.argmax but returns all occurrences of the maximum, and is O(n) instead of O(2n).
From: https://stackoverflow.com/a/58652335
@@ -391,6 +408,7 @@ def __repr__(self):
('k', repr(self._k)),
('window', repr(self._window)),
('use_c', repr(self._use_c)),
('independent', repr(self._independent)),
('classes', repr(list(self._encoder.classes_)))
]
try: