Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring the viz module to a more modular design. #476

Merged
merged 1 commit into from
Sep 28, 2023

Conversation

pfebrer
Copy link
Contributor

@pfebrer pfebrer commented Jul 18, 2022

The discussion for this PR started in #446, but the final aim is very different from what I intended at the beggining of that PR, so I prefer to open a new one.

Basically, I refactored the module to have a more modular and functional design while keeping the functionalities that I thought were important. The refactoring is still not complete. I have refactored PDOS and bands as a proof of concept and wanted to know your opinion before carrying on.

As a first introduction and to display the differences, this is now the code of BandsPlot, which at a high level is much easier to understand from scratch than the previous design:

@Plot.from_func
def BandsPlot(bands_data: BandsData = None, 
    Erange=[-2, 2], E0=0, E_axis: Literal["x", "y"] = "y", bands_range=None, spin=None, 
    bands_style={'color': 'black', 'width': 1, "opacity": 1}, spindown_style={"color": "blue", "width": 1},
    gap=False, gap_tol=0.01, gap_color="red", gap_marker={"size": 7}, direct_gaps_only=False, custom_gaps=[],
    bands_mode: Literal["line", "scatter", "area_line"] = "area_line"
):
    if bands_data is None:
        raise ValueError("You need to provide a bands data source in `bands_data`")

    # Filter the bands
    filtered_bands = filter_bands(bands_data, Erange=Erange, E0=E0, bands_range=bands_range, spin=spin)

    # Add the styles
    styled_bands = style_bands(filtered_bands, bands_style=bands_style, spindown_style=spindown_style)

    # Determine what goes on each axis
    x = get_axis_var(axis="x", var="E", var_axis=E_axis, other_var="k")
    y = get_axis_var(axis="y", var="E", var_axis=E_axis, other_var="k")

    # Get the actions to plot lines
    bands_plottings = PlotterNodeXY(data=styled_bands, x=x, y=y, set_axrange=True, what=bands_mode)

    # Gap calculation
    gap_info = calculate_gap(filtered_bands)
    # Plot it if the user has asked for it.
    gaps_plottings = draw_gaps(bands_data, gap, gap_info, gap_tol, gap_color, gap_marker, direct_gaps_only, custom_gaps, E_axis=E_axis)

    return CompositePlotterNode(bands_plottings, gaps_plottings, composite_method=None)

The main ideas are explained in this notebook (I include the html output so that you don't need to run it), because I thought it was much easier to explain in a notebook. I tried to not go into the details and explain the framework at a high level, so I expect it will be a fast read even if it seems a bit long.

Looking forward to hear your thoughts whenever you have time!

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jul 18, 2022

This pull request introduces 140 alerts and fixes 5 when merging 4391711 into ea81dcd - view on LGTM.com

new alerts:

  • 26 for Unused import
  • 22 for 'import *' may pollute namespace
  • 17 for Unused local variable
  • 11 for Modification of parameter with default
  • 10 for Except block handles 'BaseException'
  • 7 for Conflicting attributes in base classes
  • 7 for Missing call to `__init__` during object initialization
  • 7 for Unnecessary 'else' clause in loop
  • 7 for Module is imported with 'import' and 'import from'
  • 7 for Wrong number of arguments in a call
  • 6 for Unreachable code
  • 4 for Wrong name for an argument in a call
  • 3 for Redundant assignment
  • 2 for Signature mismatch in overriding method
  • 2 for Unnecessary pass
  • 1 for `__eq__` not overridden when adding attributes
  • 1 for Variable defined multiple times

fixed alerts:

  • 4 for 'import *' may pollute namespace
  • 1 for Unused import

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jul 18, 2022

This pull request introduces 140 alerts and fixes 5 when merging 471fc2c into ea81dcd - view on LGTM.com

new alerts:

  • 26 for Unused import
  • 22 for 'import *' may pollute namespace
  • 17 for Unused local variable
  • 11 for Modification of parameter with default
  • 10 for Except block handles 'BaseException'
  • 7 for Conflicting attributes in base classes
  • 7 for Missing call to `__init__` during object initialization
  • 7 for Unnecessary 'else' clause in loop
  • 7 for Module is imported with 'import' and 'import from'
  • 7 for Wrong number of arguments in a call
  • 6 for Unreachable code
  • 4 for Wrong name for an argument in a call
  • 3 for Redundant assignment
  • 2 for Signature mismatch in overriding method
  • 2 for Unnecessary pass
  • 1 for `__eq__` not overridden when adding attributes
  • 1 for Variable defined multiple times

fixed alerts:

  • 4 for 'import *' may pollute namespace
  • 1 for Unused import

@codecov
Copy link

codecov bot commented Jul 21, 2022

Codecov Report

Attention: 1033 lines in your changes are missing coverage. Please review.

Comparison is base (8d9c331) 87.42% compared to head (bc1ea37) 87.27%.
Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #476      +/-   ##
==========================================
- Coverage   87.42%   87.27%   -0.16%     
==========================================
  Files         376      355      -21     
  Lines       48172    47680     -492     
==========================================
- Hits        42113    41611     -502     
- Misses       6059     6069      +10     
Files Coverage Δ
src/sisl/_environ.py 87.17% <100.00%> (+0.33%) ⬆️
src/sisl/io/siesta/tests/test_eig.py 100.00% <100.00%> (ø)
src/sisl/io/tests/test_xsf.py 100.00% <ø> (ø)
src/sisl/nodes/__init__.py 100.00% <100.00%> (ø)
src/sisl/nodes/context.py 88.46% <ø> (ø)
src/sisl/nodes/dispatcher.py 0.00% <ø> (ø)
src/sisl/nodes/syntax_nodes.py 100.00% <100.00%> (ø)
src/sisl/nodes/tests/test_context.py 97.43% <ø> (ø)
src/sisl/nodes/tests/test_node.py 97.76% <100.00%> (-0.19%) ⬇️
src/sisl/nodes/tests/test_workflow.py 86.17% <ø> (ø)
... and 83 more

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jul 22, 2022

This pull request introduces 150 alerts and fixes 5 when merging 0b8546a into 7f1f979 - view on LGTM.com

new alerts:

  • 33 for Unused import
  • 23 for 'import *' may pollute namespace
  • 18 for Unused local variable
  • 11 for Modification of parameter with default
  • 10 for Except block handles 'BaseException'
  • 7 for Conflicting attributes in base classes
  • 7 for Missing call to `__init__` during object initialization
  • 7 for Unnecessary 'else' clause in loop
  • 7 for Module is imported with 'import' and 'import from'
  • 7 for Wrong number of arguments in a call
  • 6 for Unreachable code
  • 4 for Wrong name for an argument in a call
  • 3 for Redundant assignment
  • 2 for Signature mismatch in overriding method
  • 2 for Unnecessary pass
  • 2 for Variable defined multiple times
  • 1 for `__eq__` not overridden when adding attributes

fixed alerts:

  • 4 for 'import *' may pollute namespace
  • 1 for Unused import

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jul 30, 2022

This pull request introduces 152 alerts and fixes 5 when merging 0b035be into 7f1f979 - view on LGTM.com

new alerts:

  • 33 for Unused import
  • 23 for 'import *' may pollute namespace
  • 18 for Unused local variable
  • 12 for Modification of parameter with default
  • 10 for Except block handles 'BaseException'
  • 7 for Conflicting attributes in base classes
  • 7 for Missing call to `__init__` during object initialization
  • 7 for Unnecessary 'else' clause in loop
  • 7 for Module is imported with 'import' and 'import from'
  • 7 for Wrong number of arguments in a call
  • 6 for Unreachable code
  • 4 for Wrong name for an argument in a call
  • 3 for Unnecessary pass
  • 3 for Redundant assignment
  • 2 for Signature mismatch in overriding method
  • 2 for Variable defined multiple times
  • 1 for `__eq__` not overridden when adding attributes

fixed alerts:

  • 4 for 'import *' may pollute namespace
  • 1 for Unused import

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Aug 5, 2022

This pull request introduces 149 alerts and fixes 5 when merging 5f40203 into 29b81cf - view on LGTM.com

new alerts:

  • 33 for Unused import
  • 20 for 'import *' may pollute namespace
  • 18 for Unused local variable
  • 12 for Modification of parameter with default
  • 10 for Except block handles 'BaseException'
  • 7 for Conflicting attributes in base classes
  • 7 for Missing call to `__init__` during object initialization
  • 7 for Unnecessary 'else' clause in loop
  • 7 for Module is imported with 'import' and 'import from'
  • 7 for Wrong number of arguments in a call
  • 6 for Unreachable code
  • 4 for Wrong name for an argument in a call
  • 3 for Unnecessary pass
  • 3 for Redundant assignment
  • 2 for Signature mismatch in overriding method
  • 2 for Variable defined multiple times
  • 1 for `__eq__` not overridden when adding attributes

fixed alerts:

  • 4 for 'import *' may pollute namespace
  • 1 for Unused import

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Nov 19, 2022

This pull request introduces 149 alerts and fixes 5 when merging 597399b into 51b57c7 - view on LGTM.com

new alerts:

  • 33 for Unused import
  • 20 for 'import *' may pollute namespace
  • 18 for Unused local variable
  • 12 for Modification of parameter with default
  • 10 for Except block handles 'BaseException'
  • 7 for Conflicting attributes in base classes
  • 7 for Missing call to `__init__` during object initialization
  • 7 for Unnecessary 'else' clause in loop
  • 7 for Module is imported with 'import' and 'import from'
  • 7 for Wrong number of arguments in a call
  • 6 for Unreachable code
  • 4 for Wrong name for an argument in a call
  • 3 for Unnecessary pass
  • 3 for Redundant assignment
  • 2 for Signature mismatch in overriding method
  • 2 for Variable defined multiple times
  • 1 for `__eq__` not overridden when adding attributes

fixed alerts:

  • 4 for 'import *' may pollute namespace
  • 1 for Unused import

Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog.

@zerothi
Copy link
Owner

zerothi commented Jan 9, 2023

This is not to push you, what is your ETA on this.
I would like to get the currents devs out pretty soon, so if your PR is close to ending, I could wait, otherwise I would try and push out a 0.13 release ASAP.

@pfebrer
Copy link
Contributor Author

pfebrer commented Jan 9, 2023

Hmm I think it's better not to rush it, so you can publish the release without it. Actually I was waiting for your opinion on the core ideas, because maybe there are some things that can be simplified/made more functional.

Whenever you have time if you could look at the html file that I provided here with a rendered notebook that would be nice :)

@zerothi
Copy link
Owner

zerothi commented Jan 9, 2023

Ok, I'll have a look (hopefully this week or next)! Thanks!

@zerothi
Copy link
Owner

zerothi commented Feb 2, 2023

Ok, I have had a look. And all-in-all I really like your ideas!

I have uploaded a new notebook with comments in them,
Sislvizrefactordisplay.txt

(change extension to ipynb :)

I say we go! :)

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 2, 2023

Thanks for looking at it!

I will go through some of your comments here:

  1. Will this be problematic for nodes storing large data-sets? Imagine nodes with grids and doing a workflow: read -> do function -> reduce grid. This would result in 2 full grids, and one at reduced size? Not only that, also the arguments to the functions (which needs to be in memory for recalculations).

Yes, I also thought that this would be problematic with grids. But then my thought was that if it does not make sense to store its inputs and outputs it shouldn't be a node. E.g. if you have a plot that does ... -> interpolate_grid -> smooth_grid -> reduce_grid -> ... you should probably pack all that functionality into a single node: ... -> process_grid -> ....

  1. Will a static code analyzer figure out that a node with a return value is valid input? (not necessary, just out of interest)

No, it doesn't. This is one of the reasons I want to get rid of defining all functions as nodes (see design tweak below).

  1. Perhaps there is something I am missing, but couldn't you just have done @Node.from_funct as well? What is the difference from a Workflow?

Yeah this is a bit tricky, but I think Workflow deserves a class of its own. The difference between a simple Node and a Workflow is that a node can contain any computation inside it, while a workflow is just a way of organizing nodes. A workflow just links nodes with each other, there is nothing computed outside the nodes it contains.

Of course you can also consider that you can split all the functionality in a node into individual pieces that are linked together, but you wouldn't want to store checkpoints for all of them 😅

My thoughts for a design tweak

I didn't like the fact that there was @Node.from_func on top of every function of sisl.viz (I had to introduce the default lazy_context=False to make them usable as normal functions). Also I didn't like that to use any other function in a workflow you need to define it explicitly as a node. I.e.:

from sisl import some_func
from sisl.viz import Node, Workflow

node_func = Node.from_func(some_func)

@Workflow.from_func
def my_workflow(...):
    ...
    a = node_func()
    ...

I have now an implementation (not merged here) that on Workflow definition builds replaces all function calls with Node calls. I.e. it allows you to do:

from sisl import some_func
from sisl.viz import Workflow

@Workflow.from_func
def my_workflow(...):
    ...
    a = some_func()
    ...

and it will be exactly the same as the previous code.

To me this is a great improvement because:

  • You don't need to redefine functions as nodes.
  • sisl.viz functionality doesn't need to be defined as nodes.
  • It makes going from workflow to function very simple (just remove the decorator). It is also much easier the other way.
  • Because of the previous to points, there is no need now for the lazy_context. If you use a node it will be lazy. Otherwise just use the normal functions.

I do the change from function calls to node calls getting the Abstract Syntax Tree, modifying it and recompiling. I don't know if it is too "hacky", but it seems very robust to me, and cleaner than any other implementation. Do you think it is fine or should I stick to "more normal" python trickery?

Also, if you don't need to define all functionalities as nodes I thought that this might be useful not only for sisl.viz. You could create a workflow with any functions in sisl (in fact with any function in any module). So do you think it makes sense to have something like sisl.nodes to keep this functionality?

Cheers!

@zerothi
Copy link
Owner

zerothi commented Feb 2, 2023

I'll have to wrap my head a bit more around the node stuff, I am a bit worried it might make the code more complicated and not as intuitive.

I have however thought about the dispatch methods since there are some short-comings, here the nodes might be interesting.

A note on the workflows, it seems to me that if you do it via ast you'll be forcing all things in the node tree and thus you'll be surprised about a workflow that does the process_grid thing, no?

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 2, 2023

I'll have to wrap my head a bit more around the node stuff, I am a bit worried it might make the code more complicated and not as intuitive.

Which part will be more complicated? I don't mean to define sisl functions as nodes, I just mean to store the Node and Workflow class in sisl.nodes instead of sisl.viz.nodes. Although since the functionality has nothing to do with sisl or science I also thought of creating a separate python package that sisl.viz would depend on. I don't know 😅

A note on the workflows, it seems to me that if you do it via ast you'll be forcing all things in the node tree and thus you'll be surprised about a workflow that does the process_grid thing, no?

What do you mean by doing the process_grid thing? Inside the workflow you would just call process_grid(), so process_grid is just converted to a node without caring about what's inside. All you know is that the developer wants to checkpoint the process_grid function.

@zerothi
Copy link
Owner

zerothi commented Feb 3, 2023

  1. Ok, sure lets do that! Could you make a separate PR, then we can get it in immediately, I'll promise to just accept any PR's in that sub-module so you can quickly go about it ;)
  2. It is more that end-users needs to be aware of the shortcomings of the workflow. Meaning they have to consider that every function call will be a node (static memory consumption). And also, what happens for x = y + z code lines? Will this be turned into a lambda and then a node? I found this article which arguably has some points to the matter https://www.georgeho.org/manipulating-python-asts/ It might be dangerous to change the code in-place. But if you like it the way it is, then fine!

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 3, 2023

  1. I want to clarify that I'm not replacing the code in place. When you create a workflow, you get a new object that is the workflow, but the original function remains untouched (it's the same as for a normal node).

Regarding x = y + z, I don't know what to do with it. The thing is that if y or z are a node, then the sum produces a node as well. So the only problem is if y and z are both direct inputs of the workflow, i.e. not nodes. Then you would need to do what you are saying. I don't know if it is worth it to try and catch these things. You need to determine all the dependencies of that line of code to turn it into a node.

(update) Just converting all workflow inputs to a Node when they are received, e.g. y -> WorkflowInputNode(y), would solve the problem. In fact I didn't realise, but this is what I'm doing now.

My first idea was to restrict the usage of workflows to very simple things. At the beggining I thought I would only allow assignments from function calls. E.g.:

@Workflow.from_func
def my_workflow(a, b):
      val = calc_val(a, b)
      shift = calc_shift(a)
      return shift_val(val, shift)

I think if you do this you are forcing devs to create easy to understand workflows and it is very clear where the checkpoints are.

Then I thought I could just replace function calls by node calls and everything would work the same. So in my current implementation it is now valid to do:

@Workflow.from_func
def my_workflow(a, b):
      return shift_val(calc_val(a, b), calc_shift(a))

and it produces the same workflow with the same checkpoints. My idea in doing this was that it would be much simpler to convert any given function into a workflow, which is nice.

However, I don't know if I should go back to forcing the first syntax. I like it because it explicitly declares the checkpoints. And you can access checkpoints of the workflow from variable names (e.g. workflow.val or workflow.shift). In some sense these are not just checkpoints, it is as if you had a state, only that you know what are the minimal functions to run to keep it up to date.

What I'm thinking now is that I could support the second syntax, but a variable declaration could mean that we should create a breakpoint. This is a bit different because checkpoints would be a thing external to nodes instead of using their intrinsic functionality. One could allow:

@Workflow.from_func
def my_workflow(a, b):
      # Private variables indicate that these are just temporal values and should not be checkpointed
      _val_left = calc_val(a, b)
      _val_right = calc_val(b, a)
     
      # I want to create a checkpoint here
      val = _val_left + _val_right 

      # Another checkpoint is implicitly created for the output.
      return shift_val(val, calc_shift(a))

Sorry for the long text :) Any ideas on what you think is best?

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 3, 2023

In the last idea, you wouldn't necessarily need to use _* for temporal variables, it could also be:

@Workflow.from_func(store=["val"])
def my_workflow(a, b):
      # Temporal values that are not checkpointed
      val_left = calc_val(a, b)
      val_right = calc_val(b, a)
     
      # I want to create a checkpoint here
      val = val_left + val_right 

      # Another checkpoint is implicitly created for the output.
      return shift_val(val, calc_shift(a))

@zerothi
Copy link
Owner

zerothi commented Feb 3, 2023

Would all nodes (calc_val, calc_shift) be present in the workflow node list? I think one shouldn't be able to access variables, rather they should access nodes and use get:

val = my_workflow.nodes.calc_val.get()

should be the way to get it, no? Since each node is accessible and the values them selves are .get()'ed then you wont have this problem.
In this way changing of variable names is not a problem! Consider taking a workflow that does A, but now it should also do B, but you accidentally used a variable name in the first workflow that clashes with the understanding of the 2nd workflow...

@zerothi
Copy link
Owner

zerothi commented Feb 3, 2023

To add to that, if a user wants a check-point, they should use a node! :)
So wanting to checkpoint your sum, you would need to nodify it ;)
This of course gets problematic when you have multiple calls to the same node function... hmm... How to do that!

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 3, 2023

So wanting to checkpoint your sum, you would need to nodify it ;)

Yes, but that's what happens automatically, thanks to your advice on Ufuncs. Any operation that you do with nodes results in a node. E.g. if you do y = node_1 + node_2, then y is a "SumNode" with it's inputs linked to node_1 and node_2 outputs. That is as if you did: y = SumNode(node_1, node_2).

This of course gets problematic when you have multiple calls to the same node function... hmm... How to do that!

Yes, exactly. Naming checkpoints by their node name has the downside that if you use the same node more than once you need to modify the names. What this PR implements right now is to add a number to the name if that's the case. I.e. there is calc_val and calc_val_2.

I thought that giving names to checkpoints was much nicer for this and because a normal user would understand it much better without knowing the inner workings I believe. calc_val is the generic name for the function that computes the value, but if you give a name to its output it can be much more meaningful in the context of the workflow.

The naming for the checkpoints would just be a local table to access the nodes outputs, it doesn't impose anything on the node itself. So a separate workflow would be able to use the same node and map its output to another name. (this is just the normal way variables work in python, nothing particularly strange, you can have multiple variables pointing to the same object and each named differently).

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 3, 2023

One could even argue if the variable name is more stable than the node name. You might change the way you compute a given variable of your workflow, or the function name from a third party package might simply change.

@zerothi
Copy link
Owner

zerothi commented Feb 3, 2023

Well, everything is up for changes ;)
The problem is sharing workflows. It requires documentation.

With this I think your idea of notifying the workflow what to expose is the best idea.

@Workflow.from_func(store=["val"])

You could imagine a Workflow.add_store("variable") in case users wants to examine a temporary variable.

The above approach ensures that the implementors of the workflow will document the stored variables for consistency and ease of use.

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 3, 2023

You could imagine a Workflow.add_store("variable") in case users wants to examine a temporary variable.

Yes, my idea was exactly that! It would generate a copy of the workflow with this extra checkpoint.

The problem is sharing workflows. It requires documentation.

I would say it requires the same amount of documentation as any other object, no? You need to document the variables that you expose and try not to change them, private variables need less documentation since they are not expected to be used by users, etc...

@zerothi
Copy link
Owner

zerothi commented Feb 3, 2023

You could imagine a Workflow.add_store("variable") in case users wants to examine a temporary variable.

Yes, my idea was exactly that! It would generate a copy of the workflow with this extra checkpoint.

Good.

The problem is sharing workflows. It requires documentation.

I would say it requires the same amount of documentation as any other object, no? You need to document the variables that you expose and try not to change them, private variables need less documentation since they are not expected to be used by users, etc...

Yes, but here it is a bit more difficult, right.
The workflow returns something, but you want users to be able to access temporary variables defined (in the workflow)!
This is not the same as documenting a function! Variables in a function is by definition private, what you are doing is making them public in some way.

Ideally the add_store should change the documentation and insert the variable name and some documentation on the variable, in this way it becomes automatic, and users will also more easily document their changes.

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 3, 2023

The workflow returns something, but you want users to be able to access temporary variables defined (in the workflow)!
This is not the same as documenting a function! Variables in a function is by definition private, what you are doing is making them public in some way.

Yes, I think of the workflow more as an object with state instead of a function. So what I meant is that you need to document the checkpoints of your workflow as you do with the state of any other object in python.

@zerothi
Copy link
Owner

zerothi commented Feb 3, 2023

Yes, just so we agree that the documentation needs to follow the store values ;)

@zerothi
Copy link
Owner

zerothi commented Sep 25, 2023

Just an idea, would it make sense to do something like this:

def inject_plot(yfunction, base=sisl.viz.Plot):
    def dec(cls):
        d = dict(cls.__dict__)
        d["function"] = staticmethod(function) # or possibly check if it needs to be a class call?
        return type(cls.__name__,cls.__bases__ + (base,), d)
    return dec

@inject_plot(a_cool_plot)
class CoolPlot:
    def color_like(self, color): ...

Or is this annoying, another complexity level?

Otherwise, it looks good! :)

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 25, 2023

I think the class definition not inheriting from Plot is too obscure, isn't it?

But I like the idea of the decorator as an alternative to defining the function variable. Or would it be too little benefit to have the decorator and explicitly inherit from Plot.

By the way is it possible to support

class CoolPlot(Plot, a_cool_plot):
    ...

? (I know, a bit crazy haha)

Or maybe

class CoolPlot(Plot[a_cool_plot]):
    ...

This one is possible for sure, but I don't know if it is sane, you tell me 😅

@zerothi
Copy link
Owner

zerothi commented Sep 25, 2023

But my decorator is having Plot as the base? Check type arguments. The base argument allows to change the default base. It might check that base is a subclass of Plot, just to be sure!

Your first is not viable, and the 2nd is too obscure IMHO...

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 25, 2023

Yes I know that it has the base argument, but the class definition without inheritance looks very weird 😅

The Plot[function] syntax would just be a shorthand for Plot.from_func(function), which already exists (this is Node functionality).

Maybe if we don't agree we could just leave it as it is for now. And then if someone uses it (which would already be very nice) and complains about it being too verbose we can think about it.

@zerothi
Copy link
Owner

zerothi commented Sep 26, 2023

Yes I know that it has the base argument, but the class definition without inheritance looks very weird 😅
Ah, like that. Yeah, true.

The Plot[function] syntax would just be a shorthand for Plot.from_func(function), which already exists (this is Node functionality).

I think this is too weird. I would think an index of a plot could possibly be hooked only to the backend, that wouldn't be so weird. But an indexer that subclasses with a function handle... hmm... not something I would expect.

Maybe if we don't agree we could just leave it as it is for now. And then if someone uses it (which would already be very nice) and complains about it being too verbose we can think about it.

Yes, lets do that. :)
Could you go through the tests that fail, i.e. there is one place where you have:

def test_calculate_gap(bands_data, gap):
    
        spin = bands_data.attrs["spin"]
    
        gap_info = calculate_gap(bands_data)
    
        # Check that the gap value is correct
>       assert gap_info['gap'] == gap
E       assert 2.4695733283635555 == 2.5

I think here it would be better to use gap_info['gap'] == pytest.approx(2.5) with appropriate range.

Could you go through them and make the tests succeed, then I'll merge!

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 26, 2023

Fixed the bug :)

But don't merge yet, I realised I still need to add some docstrings.

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 26, 2023

Hmm this is very strange, the test is failing with exactly the same exact gap value and only for spin orbit.

And I can't reproduce the failing test locally. Can you? The test data is generated randomly so I wonder if the CI sets a value for the random seed and that is what explains the exact same value error.

I don't think it is just a numerical issue, because from 2.5 to 2.46 it's too much error. That's why I don't want to add pytest.approx().

@zerothi
Copy link
Owner

zerothi commented Sep 27, 2023

I get these errors:

    @pytest.mark.parametrize("nodes_lazy", [True, False])
    def test_workflow(nodes_lazy):
    
        def sum_node(a, b):
            return a + b
    
        @Workflow.from_func
        def my_workflow(a, b, c):
            first_sum = sum_node(a, b)
            return sum_node(first_sum, c)
    
        with temporal_context(context=Node.context, lazy=nodes_lazy):
            #It shouldn't matter whether nodes have lazy computation on or off for the working of the workflow
            with temporal_context(context=Workflow.context, lazy=True):
>               val = my_workflow(a=2, b=3, c=4)

/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/tests/test_context.py:59: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:103: in __init__
    self.setup(*args, **kwargs)
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/workflow.py:740: in setup
    super().setup(*args, **kwargs)
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:131: in setup
    self._output = self._blank
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/workflow.py:846: in _set_output
    self.nodes.output._output = value
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:537: in __getattr__
    return GetAttrNode(obj=self, key=key)
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:110: in __init__
    self.get()
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:355: in get
    evaluated_inputs = self.map_inputs(
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:314: in map_inputs
    input_val = func(input_val)
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:340: in evaluate_input_node
    return node.get()
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/workflow.py:825: in get
    return self.nodes.output.get()
/opt/gnu/12.3.0/python/packages/3.11.4/sisl-dev/0/lib/python3.11/site-packages/sisl/nodes/node.py:537: in __getattr__
    return GetAttrNode(obj=self, key=key)
E   RecursionError: maximum recursion depth exceeded
!!! Recursion detected (same locals & position)
bands_data = <sisl.viz.data.bands.BandsData object at 0x14fbed324410>, gap = 2.5

    def test_calculate_gap(bands_data, gap):
    
        spin = bands_data.attrs["spin"]
    
        gap_info = calculate_gap(bands_data)
    
        # Check that the gap value is correct
>       assert gap_info['gap'] == gap
E       assert 2.4695733283635555 == 2.5

I did use the updated one...

@zerothi
Copy link
Owner

zerothi commented Sep 27, 2023

But is the gap you have entered to the test the correct one? I can not imagine anything where the gap is exactly 2.5...

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 27, 2023

Those errors are locally in your computer?

In this case it is exactly 2.5 (to machine precision, which I don't think will introduce an error of 0.03) because the data is sintetically generated with that gap.

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 27, 2023

I thought the error could be related to python 3.11 but even with 3.11 I could not reproduce it 😪

@zerothi
Copy link
Owner

zerothi commented Sep 27, 2023

Yes, locally on my machine... That is weird...

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 27, 2023

And to add more mistery it only fails for spin orbit, not even non-colinear spin. Hmm I think I'll mark it as skip for now.

The workflow one I also get it so I'll try to fix it.

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 27, 2023

Ok, interesting, I can reproduce if I run the full test suite (previously I was only running the viz tests)

@zerothi
Copy link
Owner

zerothi commented Sep 27, 2023

Ok, interesting, I can reproduce if I run the full test suite (previously I was only running the viz tests)

Sounds like a state?

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 27, 2023

It's a bug in the generation of the synthetic data. The bands are random polynomials, and if there were band crossings in the valence band the gap was not properly set. At some point of running the tests numpy's random seed is set and that's why it's generating always exactly the same gap.

@zerothi
Copy link
Owner

zerothi commented Sep 28, 2023

Could you rebase, then the tests should run through

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 28, 2023

I still haven't pushed the fix, give me 30 min or something like that hehe

@pfebrer pfebrer force-pushed the viz_data_sources branch 2 times, most recently from 7870e52 to fc9d1b0 Compare September 28, 2023 11:55
@pfebrer pfebrer changed the title WIP: Refactoring the viz module to a more modular design. Refactoring the viz module to a more modular design. Sep 28, 2023
@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 28, 2023

Unbelievably, this is finally ready for merge :)

@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 28, 2023

Sorry, as always I was pushing a change in files.

By the way, I remember now that I didn't change the naming of the modules. I don't know what to rename them to :(
But it is true that these are things that will most likely not be accessed, specially plotters and processors.

@zerothi
Copy link
Owner

zerothi commented Sep 28, 2023

Some things that needs to be cleaned in the commits (probably I'll do a squash to remove everything)

  • changes in .ipynb related to the version and nb_format should be reverted, it doesn't do anything
  • changes in the docs/*/generated should be deleted, they should be auto-generated. So remove them.

I think that is more or less it.

Everything is now based on functions that adhere to functional programming.
We combine these functions to create workflows and then use the sisl.nodes
framework to provide the update functionality.
Plotting backends are now implemented each on a single `Figure` class that
implements generic drawing methods, as well as subplots, animations... functionality.
@pfebrer
Copy link
Contributor Author

pfebrer commented Sep 28, 2023

Done, sorry.

Shouldn't the docs/**/generated directories be in .gitignore?

@zerothi zerothi merged commit 2798d15 into zerothi:main Sep 28, 2023
4 of 7 checks passed
zerothi added a commit that referenced this pull request Sep 28, 2023
Refactoring the viz module to a more modular design.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants