
Probabilistic Programming and Topological Data Analysis

Probabilistic Programming and Topological Data Analysis may be among the most theoretical subjects in data science. Probabilistic Programming is designed to solve problems involving uncertainty via Bayesian statistics. If $\{A_i\}_{i=1}^{n}$ are disjoint events and ${\cup}_{i=1}^{n}A_i$ covers all possible outcomes, we should keep in mind:

$$ P(A_i\mid B)=\frac{P(B\mid A_i)P(A_i)}{P(B)}, \quad P(B)=\sum_{j=1}^{n}P(B\mid A_j)P(A_j), \\ P(A_i\mid B)\propto P(B\mid A_i)P(A_i). $$
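
As a quick numeric illustration of these formulas, here is a small Python sketch; the priors and likelihoods are made up for the example:

```python
# Worked numeric example of Bayes' theorem over disjoint events A1, A2, A3.
# The priors and likelihoods below are illustrative, not from any dataset.
priors = {"A1": 0.5, "A2": 0.3, "A3": 0.2}        # P(A_i), sums to 1
likelihoods = {"A1": 0.9, "A2": 0.5, "A3": 0.1}   # P(B | A_i)

# Law of total probability: P(B) = sum_j P(B | A_j) P(A_j)
p_b = sum(likelihoods[a] * priors[a] for a in priors)

# Posterior: P(A_i | B) = P(B | A_i) P(A_i) / P(B)
posteriors = {a: likelihoods[a] * priors[a] / p_b for a in priors}
print(posteriors)  # {'A1': 0.7258..., 'A2': 0.2419..., 'A3': 0.0322...}
```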

Bayesian Learning

Bayesian learning can be regarded as an extension of Bayesian statistics. The core idea of Bayesian learning is to use prior information to express the uncertainty of parameters. It is related to Bayesian statistics, computational statistics, probabilistic programming and machine learning.

Naive Bayes

The naive Bayes classifier adopts the attribute conditional independence assumption: given the class label, the attributes are assumed to be independent of each other.

Gaussian Naive Bayes
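
Below is a minimal, illustrative sketch of a Gaussian naive Bayes classifier, where the conditional independence assumption reduces fitting to one univariate Gaussian per class and feature. The class and variable names are ours, not from any particular library:

```python
import numpy as np

class GaussianNB:
    """A minimal Gaussian naive Bayes classifier (illustrative sketch)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        # Conditional independence: one univariate Gaussian per (class, feature).
        self.mu_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def predict(self, X):
        # log P(c | x) is proportional to log P(c) + sum_j log N(x_j; mu_cj, var_cj)
        log_like = -0.5 * np.sum(
            np.log(2 * np.pi * self.var_)[None, :, :]
            + (X[:, None, :] - self.mu_[None, :, :]) ** 2 / self.var_[None, :, :],
            axis=2,
        )
        return self.classes_[np.argmax(np.log(self.priors_) + log_like, axis=1)]

# Tiny usage example on two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(GaussianNB().fit(X, y).predict(X[:3]))  # -> [0 0 0]
```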

Average One-Dependence Estimator (AODE)

Bayesian Belief Network (BBN)

Bayesian Network

It is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.

Optimal Learning

The Bayesian perspective casts a different interpretation on the statistics we compute, which is particularly useful in the context of optimal learning. In the frequentist perspective, we do not start with any knowledge about the system before we have collected any data. By contrast, in the Bayesian perspective we assume that we begin with a prior distribution of belief about the unknown parameters.

Everyday decisions are made without the benefit of accurate information. Optimal learning develops the principles needed for gathering information to make decisions, especially when collecting information is time-consuming and expensive. It addresses the problem of efficiently collecting the information with which to make decisions, and it matters primarily in applications where observations or measurements are costly.

It is possible to approach the learning problem using classical and familiar ideas from optimization. The operations research community is very familiar with the use of gradients to minimize or maximize functions. Dual variables in linear programs are a form of gradient, and these are what guide the simplex algorithm. Gradients capture the value of an incremental change in some input such as a price, fleet size or the size of buffers in a manufacturing system. We can apply this same idea to learning.

There is a list of optimal learning problems.

Bayesian Optimization

Bayesian optimization has been successful at global optimization of expensive-to-evaluate multimodal objective functions. However, unlike most optimization methods, Bayesian optimization typically does not use derivative information.

As a response surface method, it dates back to Box and Wilson (1951).
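
For concreteness, here is a hedged sketch using the scikit-optimize package (assumed installed); the objective function is a made-up stand-in for an expensive black-box function:

```python
import math
from skopt import gp_minimize

def objective(x):
    # Illustrative stand-in for an expensive, multimodal black-box function.
    return math.sin(3.0 * x[0]) + 0.3 * x[0] ** 2

# A Gaussian-process surrogate models the objective, and an acquisition
# function chooses the next evaluation point; no derivatives are needed.
result = gp_minimize(objective, dimensions=[(-3.0, 3.0)],
                     n_calls=20, random_state=0)
print(result.x, result.fun)  # best input found and its objective value
```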



Probabilistic Graphical Model

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics — particularly Bayesian statistics — and machine learning. A PGM is a marriage of graph theory and probability theory, and it aims to support causal inference based on principles rather than ad hoc models.
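
As a small illustration, the classic rain/sprinkler/wet-grass network can be encoded and queried by brute-force enumeration; the probability tables below are illustrative, not canonical:

```python
# A minimal Bayesian network: Rain -> Sprinkler, and (Rain, Sprinkler) -> WetGrass.
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {  # P(Sprinkler | Rain)
    True: {True: 0.01, False: 0.99},
    False: {True: 0.4, False: 0.6},
}
P_wet = {  # P(WetGrass=True | Sprinkler, Rain)
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.0,
}

def joint(r, s, w):
    # The graph structure factorizes the joint: P(R) P(S|R) P(W|S,R).
    pw = P_wet[(s, r)] if w else 1.0 - P_wet[(s, r)]
    return P_rain[r] * P_sprinkler[r][s] * pw

# Query P(Rain=True | WetGrass=True) by summing out the sprinkler.
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(num / den)  # approximately 0.3577
```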

Probabilistic Programming

Probabilistic graphical models provide a formal lingua franca for modeling and a common target for efficient inference algorithms. Their introduction gave rise to an extensive body of work in machine learning, statistics, robotics, vision, biology, neuroscience, artificial intelligence (AI) and cognitive science. However, many of the most innovative and useful probabilistic models published by the AI, machine learning, and statistics community far outstrip the representational capacity of graphical models and associated inference techniques. Models are communicated using a mix of natural language, pseudo code, and mathematical formulae and solved using special purpose, one-off inference methods. Rather than precise specifications suitable for automatic inference, graphical models typically serve as coarse, high-level descriptions, eliding critical aspects such as fine-grained independence, abstraction and recursion.

Probabilistic programming languages aim to close this representational gap, unifying general purpose programming with probabilistic modeling; literally, users specify a probabilistic model in its entirety (e.g., by writing code that generates a sample from the joint distribution) and inference follows automatically given the specification. These languages provide the full power of modern programming languages for describing complex distributions, and can enable reuse of libraries of models, support interactive modeling and formal verification, and provide a much-needed abstraction barrier to foster generic, efficient inference in universal model classes.
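
A minimal sketch of this workflow in PyMC (assuming PyMC v4+ is installed): the generative model is written as ordinary code, and inference follows automatically from the specification:

```python
import numpy as np
import pymc as pm

# Synthetic observations; in practice this would be real data.
data = np.random.default_rng(0).normal(loc=1.0, scale=2.0, size=100)

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)      # prior on the mean
    sigma = pm.HalfNormal("sigma", sigma=5.0)     # prior on the scale
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)  # likelihood
    idata = pm.sample(1000, tune=1000, progressbar=False)  # automatic MCMC

print(idata.posterior["mu"].mean().item())  # posterior mean of mu, near 1.0
```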

We believe that the probabilistic programming language approach within AI has the potential to fundamentally change the way we understand, design, build, test and deploy probabilistic systems. This approach has seen growing interest within AI over the last 10 years, yet the endeavor builds on over 40 years of work in a range of diverse fields including mathematical logic, theoretical computer science, formal methods, programming languages, machine learning, computational statistics, systems biology and probabilistic AI.

Graph Nets library
Graph net

Hierarchical Bayesian Regression

Topological Data Analysis

Topological data analysis (TDA) has the potential to find better representations of data. TDA can visualize high-dimensional data and characterize its intrinsic invariants. It is close to computational geometry, manifold learning and computational topology, and it can be seen as one kind of descriptive representation learning.

As the NIPS 2012 workshop on Algebraic Topology and Machine Learning puts it:

Topological methods and machine learning have long enjoyed fruitful interactions as evidenced by popular algorithms like ISOMAP, LLE and Laplacian Eigenmaps which have been borne out of studying point cloud data through the lens of geometry. More recently several researchers have been attempting to also understand the algebraic topological properties of data. Algebraic topology is a branch of mathematics which uses tools from abstract algebra to study and classify topological spaces. The machine learning community thus far has focused almost exclusively on clustering as the main tool for unsupervised data analysis. Clustering however only scratches the surface, and algebraic topological methods aim at extracting much richer topological information from data.



Topology

Topology focuses on the invariants under continuous mappings. It pays more attention to the qualitative or discrete properties of objects, such as the number of circles or holes, than to quantitative ones; it is not distance-based.

Definition: Let $X$ be a non-empty set. A set $\tau$ of subsets of $X$ is said to be a topology if

  • $X$ and the empty set $\emptyset$ belong to $\tau$;
  • the union of any number of sets in $\tau$ belongs to $\tau$;
  • the intersection of any two sets in $\tau$ belongs to $\tau$.

The pair $(X,\tau)$ is called a topological space.

As the definition shows, a topology need not be based on any notion of distance or measure. The underlying set can be countable or uncountable.

Definition: Let $(X,\tau)$ be a topological space. Then the members of $\tau$ (which are subsets of $X$) are said to be open sets. If $X-S$ is an open set, we call $S$ a closed set.
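
These axioms are easy to check mechanically on a finite set; the following small sketch verifies them for an example family of subsets:

```python
from itertools import combinations

X = frozenset({1, 2, 3})
tau = {frozenset(), frozenset({1}), frozenset({1, 2}), X}

def is_topology(X, tau):
    # Axiom 1: X and the empty set belong to tau.
    if frozenset() not in tau or X not in tau:
        return False
    # Axioms 2 and 3: closure under unions and pairwise intersections.
    # For a finite family, pairwise closure implies closure under
    # arbitrary unions by induction.
    for a, b in combinations(tau, 2):
        if a | b not in tau or a & b not in tau:
            return False
    return True

print(is_topology(X, tau))  # True
```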

(Figure: the Klein bottle.)

TDA

Topological data analysis, as a data processing method, is a selected topic for some students of computer science and applied mathematics. It is less popular among statisticians, since it traditionally involves no estimation or hypothesis testing.

Topological data analysis (TDA) refers to statistical methods that find structure in data. As the name suggests, these methods make use of topological ideas. Often, the term TDA is used narrowly to describe a particular method called persistent homology.

TDA, which originates from mathematical topology, is a discipline that studies shape. It is concerned with measuring shape, by means of applying mathematical functions to data, and with representing it in the form of topological networks or combinatorial graphs.

There is another field that deals with the topological and geometric structure of data: computational geometry. The main difference is that in TDA we treat the data as random points, whereas in computational geometry the data are usually seen as fixed.


TDA can be applied to manifold estimation, nonlinear dimension reduction, mode estimation, ridge estimation and persistent homology.
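
As a hedged sketch of persistent homology, the following snippet (assuming the `ripser` package is installed) recovers the single loop of a noisy circle:

```python
import numpy as np
from ripser import ripser

# Sample a noisy circle in the plane.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
points = np.column_stack([np.cos(theta), np.sin(theta)])
points += rng.normal(scale=0.05, size=points.shape)

# dgms[0] holds H0 intervals (connected components);
# dgms[1] holds H1 intervals (loops).
dgms = ripser(points, maxdim=1)["dgms"]

# One long-lived H1 interval should dominate: the circle's single loop.
h1 = dgms[1]
print(h1[np.argmax(h1[:, 1] - h1[:, 0])])  # (birth, death) of the loop
```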


Computational Topology

Computational topology is the mathematical and theoretical foundation of topological data analysis. It differs from deep neural networks, which originate from engineering and from the simulation of biological neural networks. Topological data analysis is principle-driven and application-inspired in some sense.

CS 598: Computational Topology (Spring 2013) puts it:

Potential mathematical topics include the topology of cell complexes, topological graph theory, homotopy, covering spaces, simplicial homology, persistent homology, discrete Morse theory, discrete differential geometry, and normal surface theory. Potential computing topics include algorithms for computing topological invariants, graphics and geometry processing, mesh generation, curve and surface reconstruction, VLSI routing, motion planning, manifold learning, clustering, image processing, and combinatorial optimization.
