
Doubt in the derivatives from eq7 to eq8 #6

Open
pherrusa7 opened this issue Apr 11, 2019 · 2 comments

Comments


pherrusa7 commented Apr 11, 2019

Dear @adityac94 , @tataiani

If I understand correctly, you use the following assumption to go from eq.7 to eq.8, and from eq.8 to eq.9:

$\frac{\partial^2 Y^C}{\partial A^k_{ab} \partial A^k_{ij}} = 0, \text{ if } (a,b) \neq (i, j)$ [1]

Can you please provide an explanation for this?

I am also confused about why $\frac{\partial^2 Y^C}{(\partial A^k_{ij})^2} \neq 0$ [2] since it seems to me that $\frac{\partial Y^C}{\partial A^k_{ij}} = C\text{ (Constant)}$ [3].
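For concreteness, here is a minimal finite-difference sketch of concern [3], assuming a plain GAP classification head, $Y^c = \sum_k w^c_k \sum_{i,j} A^k_{ij}$ (a hypothetical setting; `w_k` is an illustrative class weight, not taken from the paper's code):

```python
# Sketch of [3]: with a GAP head, dY^c/dA_ij^k = w_k is constant,
# so the second derivative [2] would be zero.
w_k = 0.7                      # hypothetical class weight for feature map k
Y = lambda a: w_k * a          # contribution of a single activation A_ij^k

a, h = 1.3, 1e-4
g1 = (Y(a + h) - Y(a - h)) / (2 * h)            # first derivative
g2 = (Y(a + h) - 2 * Y(a) + Y(a - h)) / h ** 2  # second derivative

assert abs(g1 - w_k) < 1e-9   # dY^c/dA_ij^k = w_k, a constant [3]
assert abs(g2) < 1e-6         # so the second derivative vanishes [2]
```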

Thank you for your time and efforts in advance!

[Screenshot attachment, 2019-04-11 15:25]


mt-cly commented Apr 6, 2020


Hi, although I am also confused about the motivation for introducing conv_third_grad/conv_second_grad, your formula does not match the author's code.
In the released code the derivatives are actually taken with respect to exp(Yc) rather than Yc as stated in the paper, so the second- and higher-order gradients can be computed by repeatedly multiplying by the first-order gradient.
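A quick finite-difference check of this point (a sketch; the slope `c` is a hypothetical local value of $\partial Y^c/\partial A$, treated as constant because a ReLU network is piecewise linear on a fixed activation pattern):

```python
import math

c = 0.7                       # hypothetical local slope dY^c/dA (fixed ReLU pattern)
Y = lambda a: c * a           # Y^c assumed locally linear in the activation A
S = lambda a: math.exp(Y(a))  # the code differentiates exp(Y^c), not Y^c

a, h = 1.3, 1e-4
# finite-difference first and second derivatives of exp(Y^c)
g1 = (S(a + h) - S(a - h)) / (2 * h)
g2 = (S(a + h) - 2 * S(a) + S(a - h)) / h ** 2

assert abs(g1 - c * S(a)) < 1e-6  # d exp(Y)/dA = (dY/dA) * exp(Y)
assert abs(g2 - g1 * c) < 1e-4    # each extra order just multiplies by dY/dA again
```

So unlike the second derivative of $Y^c$ itself (which vanishes in the linear regime), every order of derivative of $\exp(Y^c)$ is nonzero and is obtained from the previous one by multiplying by the first gradient.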


mlerma54 commented Apr 16, 2022

I have a similar issue: if we take the partial derivative of (7) with respect to A_{ij}^k, we do not get (8) unless we assume that the cross-derivatives are zero. However, (7) can be seen as a system of linear equations with the alphas as unknowns. Since the system has more unknowns than equations, it will in general be underdetermined and will have infinitely many solutions. Assuming that the cross-derivatives are zero imposes additional restrictions on the unknowns and reduces the degrees of freedom in the space of solutions, leading to a possible formula for the alphas. However, I do not see any particular reason to assume that the cross-derivatives are zero, except for the pragmatic one of getting an equation from which the alphas can be isolated.
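To make the counting explicit, eq. (7) can be rearranged schematically into a single scalar linear constraint on the alphas (a sketch in the notation quoted above, grouping everything except the alphas into coefficients $g^{k}_{ab}$):

$$Y^c \;=\; \sum_{k}\sum_{a,b} \alpha^{kc}_{ab}\, g^{k}_{ab}, \qquad g^{k}_{ab} \;:=\; \mathrm{relu}\!\left(\frac{\partial Y^c}{\partial A^k_{ab}}\right)\sum_{i,j} A^k_{ij},$$

i.e., one equation in as many unknowns $\alpha^{kc}_{ab}$ as there are feature maps times spatial positions, hence (generically) infinitely many solutions.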

That said, we can still notice that the alphas computed may no longer be a solution of the original equation (7), because the derivatives used to get (9) kill linearities. In other words, if we add any linear function of the A_{ij}^k to Y^c, the method used in the paper still produces the same alphas, so there is no guarantee that the alphas obtained actually solve equation (7).
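The linearity point can be stated in one line (a sketch; the $c_{ij}$ are arbitrary constants):

$$\tilde{Y}^c := Y^c + \sum_{i,j} c_{ij}\, A^k_{ij} \;\;\Longrightarrow\;\; \frac{\partial^n \tilde{Y}^c}{(\partial A^k_{ij})^n} \;=\; \frac{\partial^n Y^c}{(\partial A^k_{ij})^n} \quad (n \ge 2),$$

so the second- and third-order derivatives entering (9), and hence the alphas, are unchanged, while equation (7) itself does change under this shift.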
