
【Hackathon 5th No.19】Add ContinuousBernoulli and MultivariateNormal API #58004

Merged
merged 29 commits into PaddlePaddle:develop
Dec 18, 2023

Conversation

NKNaN
Contributor

@NKNaN NKNaN commented Oct 11, 2023

PR types

New features

PR changes

APIs

Description

Add ContinuousBernoulli and MultivariateNormal API

@paddle-bot

paddle-bot bot commented Oct 11, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@NKNaN
Contributor Author

NKNaN commented Oct 11, 2023

Math Derivation for entropy of the Continuous Bernoulli distribution and kl_divergence of 2 Continuous Bernoulli distributions:

  • entropy:
$$\begin{aligned} H &= -\int_x C(\lambda) \lambda^x (1-\lambda)^{1-x} \log\{C(\lambda) \lambda^x (1-\lambda)^{1-x}\} dx \\\ & = -\int_0^1 C \lambda^x (1-\lambda)^{1-x} \left[ \log C + x \log \lambda + (1 -x) \log (1 - \lambda)\right] dx \\\ & = -\left[ C \log C \int_0^1 \lambda^x (1-\lambda)^{1-x} dx + C \log \lambda \int_0^1 x \lambda^x (1-\lambda)^{1-x} dx + C \log(1 - \lambda) \int_0^1 (1-x) \lambda^x (1-\lambda)^{1-x} dx \right] \\\ & = - \left[ \log C + \mathbb{E}(X) \log \lambda + \mathbb{E}(1 - X) \log(1 - \lambda) \right] \\\ & = -\log C + \left[ \log (1 - \lambda) -\log \lambda \right] \mathbb{E}(X) - \log(1 - \lambda) \end{aligned}$$
  • kl_divergence:
$$\begin{aligned} \mathcal{D}_{KL}(p_1|| p_2) &= \int_x p_1(x)\log\frac{p_1(x)}{p_2(x)} dx \\\ & = \int_0^1 C_1 \lambda_1^x (1-\lambda_1)^{1-x} \{\log[C_1 \lambda_1^x (1-\lambda_1)^{1-x}] - \log[C_2 \lambda_2^x (1-\lambda_2)^{1-x}]\} dx \\\ & = -H - [C_1 \log C_2 \int_0^1 \lambda_1^x (1-\lambda_1)^{1-x} dx + C_1 \log \lambda_2 \int_0^1 x \lambda_1^x (1-\lambda_1)^{1-x} dx + C_1 \log (1-\lambda_2) \int_0^1 (1-x) \lambda_1^x (1-\lambda_1)^{1-x} dx] \\\ & = -H - [\log C_2 + \log \lambda_2 \mathbb{E}_1(X) + \log (1-\lambda_2) \mathbb{E}_1(1-X) ] \\\ & = - H - \{\log C_2 + [\log \lambda_2 - \log (1-\lambda_2)] \mathbb{E}_1(X) + \log (1-\lambda_2) \} \end{aligned}$$
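The two closed forms above can be sanity-checked numerically. Below is a sketch with numpy/scipy; `norm_const` and `mean` use the standard Continuous Bernoulli normalizing constant $C(\lambda) = 2\,\mathrm{artanh}(1-2\lambda)/(1-2\lambda)$ and mean, which this thread assumes rather than derives.

```python
import numpy as np
from scipy.integrate import quad

def norm_const(lam):
    # C(lam) = 2 * artanh(1 - 2*lam) / (1 - 2*lam); C(0.5) = 2 by continuity.
    if abs(lam - 0.5) < 1e-12:
        return 2.0
    return 2.0 * np.arctanh(1.0 - 2.0 * lam) / (1.0 - 2.0 * lam)

def mean(lam):
    # E(X) of the Continuous Bernoulli (standard closed form).
    if abs(lam - 0.5) < 1e-12:
        return 0.5
    return lam / (2.0 * lam - 1.0) + 1.0 / (2.0 * np.arctanh(1.0 - 2.0 * lam))

def pdf(x, lam):
    return norm_const(lam) * lam**x * (1.0 - lam)**(1.0 - x)

def entropy_closed(lam):
    # H = -log C + [log(1-lam) - log lam] * E(X) - log(1-lam)
    return (-np.log(norm_const(lam))
            + (np.log1p(-lam) - np.log(lam)) * mean(lam)
            - np.log1p(-lam))

def kl_closed(l1, l2):
    # KL = -H_1 - {log C_2 + [log l2 - log(1-l2)] * E_1(X) + log(1-l2)}
    return (-entropy_closed(l1)
            - (np.log(norm_const(l2))
               + (np.log(l2) - np.log1p(-l2)) * mean(l1)
               + np.log1p(-l2)))

lam = 0.3
h_num, _ = quad(lambda x: -pdf(x, lam) * np.log(pdf(x, lam)), 0.0, 1.0)
print(abs(h_num - entropy_closed(lam)))  # numerical and closed forms agree

l1, l2 = 0.3, 0.7
kl_num, _ = quad(lambda x: pdf(x, l1) * np.log(pdf(x, l1) / pdf(x, l2)), 0.0, 1.0)
print(abs(kl_num - kl_closed(l1, l2)))
```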

@NKNaN
Contributor Author

NKNaN commented Oct 11, 2023

Math Derivation for entropy of the Multivariate Normal distribution and kl_divergence of 2 Multivariate Normal distributions:

  • entropy:
$$\begin{aligned} H &= -\int_x f(x) \log f(x) dx \\\ & = -\int_{x \in \mathbb{R}^n} f(x) \{ -\frac{n}{2}\log(2\pi) -\frac{1}{2} (x-\mu)^{\intercal} \Sigma^{-1} (x-\mu) - \frac{1}{2}\log (\det\Sigma) \} dx \\\ & = -\int_{x \in \mathbb{R}^n} f(x) \{ -\frac{n}{2}\log(2\pi) -\frac{1}{2} [A^{-1}(x-\mu)]^{\intercal}[A^{-1}(x-\mu)] - \log (\det A) \} dx \\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{1}{2}\int_{x \in \mathbb{R}^n} [A^{-1}(x-\mu)]^{\intercal}[A^{-1}(x-\mu)] f(x) dx\\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{1}{2} \mathbb{E}[(X-\mu)^{\intercal} \Sigma^{-1} (X - \mu)] \\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{1}{2} \mathbb{E}[tr[(X-\mu)^{\intercal} \Sigma^{-1} (X - \mu)] ] \\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{1}{2} \mathbb{E}[tr[\Sigma^{-1} (X - \mu) (X-\mu)^{\intercal}]] \\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{1}{2} tr[\mathbb{E}[\Sigma^{-1} (X - \mu) (X-\mu)^{\intercal}]] \\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{1}{2} tr[\Sigma^{-1} \mathbb{E}[ (X - \mu) (X-\mu)^{\intercal}]] \\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{1}{2} tr[\Sigma^{-1} \Sigma] \\\ & = \frac{n}{2} \log(2\pi) + \log {\det A} + \frac{n}{2} \end{aligned}$$
  • kl_divergence:
$$\begin{aligned} \mathcal{D}_{KL}(f_1|| f_2) &= \int_x f_1(x)\log\frac{f_1(x)}{f_2(x)} dx \\\ & = \int_{x \in \mathbb{R}^n} f_1(x)\left\{\left[ -\frac{n}{2} \log(2\pi) - \log(\det A_1) - \frac{1}{2}(x-\mu_1)^{\intercal} \Sigma_1^{-1} (x - \mu_1) \right] + \left[ \frac{n}{2} \log(2\pi) + \log(\det A_2) + \frac{1}{2}(x-\mu_2)^{\intercal} \Sigma_2^{-1} (x - \mu_2)\right]\right\} dx \\\ & = \log(\det A_2) - \log(\det A_1) +\frac{1}{2}\mathbb{E}_1[(X-\mu_2)^{\intercal} \Sigma_2^{-1} (X - \mu_2)] -\frac{n}{2} \\\ & = \log(\det A_2) - \log(\det A_1) +\frac{1}{2}tr [\Sigma_2^{-1}\mathbb{E}_1[ (X - \mu_2) (X-\mu_2)^{\intercal} ]] -\frac{n}{2} \\\ & = \log(\det A_2) - \log(\det A_1) -\frac{n}{2} +\frac{1}{2}tr [\Sigma_2^{-1}\mathbb{E}_1[ XX^{\intercal} -X \mu_2^{\intercal} - \mu_2 X^{\intercal} + \mu_2\mu_2^{\intercal}]] \\\ & = \log(\det A_2) - \log(\det A_1) -\frac{n}{2} +\frac{1}{2}tr [\Sigma_2^{-1} [ Var_1(X) + \mathbb{E}_1(X)\mathbb{E}_1(X)^{\intercal} -\mu_1\mu_2^{\intercal} - \mu_2 \mu_1^{\intercal} + \mu_2\mu_2^{\intercal}]] \\\ & = \log(\det A_2) - \log(\det A_1) -\frac{n}{2} +\frac{1}{2}tr [\Sigma_2^{-1} [ \Sigma_1 + \mu_1\mu_1^{\intercal} -\mu_1\mu_2^{\intercal} - \mu_2 \mu_1^{\intercal} + \mu_2\mu_2^{\intercal}]] \\\ & = \log(\det A_2) - \log(\det A_1) -\frac{n}{2} +\frac{1}{2}\{tr [\Sigma_2^{-1} \Sigma_1] + tr[\Sigma_2^{-1}(\mu_1 - \mu_2)(\mu_1 - \mu_2)^{\intercal}]\} \\\ & = \log(\det A_2) - \log(\det A_1) -\frac{n}{2} +\frac{1}{2}[tr [\Sigma_2^{-1} \Sigma_1] + (\mu_1 - \mu_2)^{\intercal} \Sigma_2^{-1} (\mu_1 - \mu_2)] \end{aligned}$$
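These closed forms can likewise be checked numerically. A numpy/scipy sketch follows; here $A$ is taken to be the Cholesky factor of $\Sigma$, so $\log\det A$ is the sum of the log-diagonal of the factor.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_entropy(cov):
    # H = n/2 * log(2*pi) + log det A + n/2, with A the Cholesky factor of cov.
    n = cov.shape[0]
    A = np.linalg.cholesky(cov)
    return 0.5 * n * np.log(2.0 * np.pi) + np.log(np.diag(A)).sum() + 0.5 * n

def mvn_kl(mu1, cov1, mu2, cov2):
    # KL = log det A2 - log det A1 - n/2
    #      + 1/2 * [tr(cov2^-1 cov1) + (mu1-mu2)^T cov2^-1 (mu1-mu2)]
    n = mu1.shape[0]
    logdet1 = np.log(np.diag(np.linalg.cholesky(cov1))).sum()
    logdet2 = np.log(np.diag(np.linalg.cholesky(cov2))).sum()
    cov2_inv = np.linalg.inv(cov2)
    diff = mu1 - mu2
    return (logdet2 - logdet1 - 0.5 * n
            + 0.5 * (np.trace(cov2_inv @ cov1) + diff @ cov2_inv @ diff))

mu1 = np.array([0.0, 1.0]); cov1 = np.array([[2.0, 0.5], [0.5, 1.0]])
mu2 = np.array([1.0, -1.0]); cov2 = np.array([[1.0, 0.3], [0.3, 2.0]])

# The closed-form entropy matches scipy's reference implementation,
# and the KL of a distribution with itself is zero.
print(abs(mvn_entropy(cov1) - multivariate_normal(mu1, cov1).entropy()))
print(mvn_kl(mu1, cov1, mu1, cov1))
```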

@paddle-ci-bot

paddle-ci-bot bot commented Oct 19, 2023

Sorry to inform you that 064e8a9's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.


paddle-ci-bot bot commented Nov 1, 2023

Sorry to inform you that 4ce267e's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@luotao1
Contributor

luotao1 commented Nov 17, 2023

The test_distribution_continuous_bernoulli_static unit test did not pass.


paddle-ci-bot bot commented Nov 30, 2023

Sorry to inform you that 8b913d3's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

# convert type
if isinstance(probability, (float, int)):
    probability = [probability]
probability = paddle.to_tensor(probability, dtype=self.dtype)
Contributor

If `probability` is already a Tensor, this changes its data type. For example, the user passes `probability` as fp32 while the default dtype is fp64.
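A dtype-preserving conversion along the lines the reviewer asks for might look like this. This is a sketch with numpy standing in for paddle; `convert_probability` and `default_dtype` are illustrative names, not Paddle API.

```python
import numpy as np

def convert_probability(probability, default_dtype=np.float32):
    # Scalars get wrapped and given the default dtype; an array the caller
    # already constructed keeps its own dtype, so an fp32 input is not
    # silently promoted to an fp64 default.
    if isinstance(probability, (float, int)):
        return np.asarray([probability], dtype=default_dtype)
    return np.asarray(probability)  # no dtype argument: dtype is preserved

print(convert_probability(0.3).dtype)                                # float32
print(convert_probability(np.array([0.3], dtype=np.float64)).dtype)  # float64
```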

Contributor Author

Fixed.

cxxly
cxxly previously approved these changes Dec 12, 2023
Contributor

@cxxly cxxly left a comment


LGTM

@jeff41404
Contributor

code is fine, please add a link to the rfc in the description above

@jeff41404
Contributor

the design of ContinuousBernoulli in rfc needs to be consistent with the code

@jeff41404
Contributor

the rfc of MultivariateNormal has the same issue

@NKNaN
Contributor Author

NKNaN commented Dec 13, 2023

code is fine, please add a link to the rfc in the description above

Added the rfc link. The rfc design doc needed some changes; I have submitted the corresponding PR.

@luotao1
Contributor

luotao1 commented Dec 13, 2023

The corresponding Chinese documentation can be submitted as well.

@NKNaN
Contributor Author

NKNaN commented Dec 13, 2023

The corresponding Chinese documentation can be submitted as well.

The Chinese documentation PR has been submitted.

I also revised the corresponding English docs.

Comment on lines 51 to 57
Args:
    probability(int|float|Tensor): The probability of Continuous Bernoulli distribution between [0, 1],
    which characterize the shape of the pdf. If the input data type is int or float, the data type of
    `probability` will be convert to a 1-D Tensor the paddle global default dtype.
    eps(float): Specify the bandwith of the unstable calculation region near 0.5. The unstable calculation region
    would be [0.5 - eps, 0.5 + eps], where the calculation is approximated by talyor expansion. The
    default value is 0.02.
Contributor

Suggested change
Args:
    probability(int|float|Tensor): The probability of Continuous Bernoulli distribution between [0, 1],
    which characterize the shape of the pdf. If the input data type is int or float, the data type of
    `probability` will be convert to a 1-D Tensor the paddle global default dtype.
    eps(float): Specify the bandwith of the unstable calculation region near 0.5. The unstable calculation region
    would be [0.5 - eps, 0.5 + eps], where the calculation is approximated by talyor expansion. The
    default value is 0.02.
Args:
    probability(int|float|Tensor): The probability of Continuous Bernoulli distribution between [0, 1],
        which characterize the shape of the pdf. If the input data type is int or float, the data type of
        `probability` will be convert to a 1-D Tensor the paddle global default dtype.
    eps(float): Specify the bandwith of the unstable calculation region near 0.5. The unstable calculation region
        would be [0.5 - eps, 0.5 + eps], where the calculation is approximated by talyor expansion. The
        default value is 0.02.

For each parameter under Args, continuation lines of the same parameter's description need an extra indent.

Contributor Author

Fixed.

r"""The Continuous Bernoulli distribution with parameter: `probability` characterizing the shape of the density function.
The Continuous Bernoulli distribution is defined on [0, 1], and it can be viewed as a continuous version of the Bernoulli distribution.

[1] Loaiza-Ganem, G., & Cunningham, J. P. The continuous Bernoulli: fixing a pervasive error in variational autoencoders. 2019.
Contributor

Could you link the paper directly? For the citation format, see "How to cross-reference documentation".

Contributor Author

Fixed.

would be [0.5 - eps, 0.5 + eps], where the calculation is approximated by talyor expansion. The
default value is 0.02.

Examples:
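The unstable region this docstring describes can be sketched as a two-branch computation of log C(λ). This is illustrative only, not the PR's implementation; the series coefficients come from artanh(z)/z = 1 + z²/3 + z⁴/5 + ….

```python
import numpy as np

def log_norm_const(lam, eps=0.02):
    # log C(lam) with C(lam) = 2 * artanh(1 - 2*lam) / (1 - 2*lam).
    # At lam = 0.5 the ratio is 0/0, so for lam in [0.5 - eps, 0.5 + eps]
    # (i.e. |z| < 2*eps with z = 1 - 2*lam) use the truncated series
    # artanh(z)/z = 1 + z**2/3 + z**4/5 + ...
    z = 1.0 - 2.0 * lam
    if abs(z) < 2.0 * eps:
        return np.log(2.0) + np.log1p(z * z / 3.0 + z**4 / 5.0)
    return np.log(2.0 * np.arctanh(z) / z)

# Inside the region the two branches agree closely (truncation error is
# roughly z**6/7, about 1e-10 at z = 0.03):
z = 0.03
print(abs((np.log(2.0) + np.log1p(z * z / 3.0 + z**4 / 5.0))
          - np.log(2.0 * np.arctanh(z) / z)))
```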
Contributor Author

Fixed.


In the above equation:

* :math:\Omega: is the support of the distribution.
Contributor

Suggested change
* :math:\Omega: is the support of the distribution.
* :math:`\Omega` is the support of the distribution.

Contributor Author

Fixed.

Contributor

This file has the same issues as continuous_bernoulli.py; I won't repeat them here.

Contributor Author

Fixed in the corresponding places.

@cxxly
Contributor

cxxly commented Dec 14, 2023

Sorry, one more comment: keep the ContinuousBernoulli signature consistent with PyTorch's.

@NKNaN
Contributor Author

NKNaN commented Dec 14, 2023

Sorry, one more comment: keep the ContinuousBernoulli signature consistent with PyTorch's.

Fixed.

sunzhongkai588
sunzhongkai588 previously approved these changes Dec 14, 2023
Contributor

@sunzhongkai588 sunzhongkai588 left a comment


LGTM~

New system message errors were found in the doc-preview CI. @ooooo-create please do a full check for related errors later and fix them.

[0.20103608, 0.07641447])
"""

def __init__(self, probs=None, lims=(0.499, 0.501)):
Contributor

No need for the None default; probs is a required parameter.

Contributor Author

Fixed.

Contributor

@cxxly cxxly left a comment


LGTM

@jeff41404
Contributor

There are spelling and capitalization issues in the link to the rfc; e.g. it should be 20230927_api_design_for_ContinuousBernoulli.md instead of 20230927_api_design_for_continuous_bernoulli.md, causing a 404 error

@NKNaN
Contributor Author

NKNaN commented Dec 18, 2023

There are spelling and capitalization issues in the link to the rfc; e.g. it should be 20230927_api_design_for_ContinuousBernoulli.md instead of 20230927_api_design_for_continuous_bernoulli.md, causing a 404 error

Links have been updated

Contributor

@jeff41404 jeff41404 left a comment


LGTM

@luotao1 luotao1 merged commit 1208eab into PaddlePaddle:develop Dec 18, 2023
28 of 29 checks passed
HermitSun pushed a commit to HermitSun/Paddle that referenced this pull request Dec 21, 2023
…PI (PaddlePaddle#58004)

* add api and test

* add kl-div registration for cb and mvn

* fix docs and test

* fix test

* fix test

* fix mvn test coverage

* fix docs

* update docs

* update cb and mvn

* fix mvn test

* fix test

* fix test

* fix test

* fix test

* fix unstable region calculation

* fix test

* update dtype conversion and tests

* fix test

* fix test

* fix test

* refine docs

* update docs

* update docs

* update docs

* update cb api

* increase cb static test timeout

* fix test time

* fix test

* update cb
@NKNaN NKNaN deleted the ayase/develop2 branch February 2, 2024 04:08