
VTK support for unstructured grids with global arrays. #3688

Open · jorgensd opened this issue Jul 7, 2023 · 13 comments

jorgensd commented Jul 7, 2023

Currently there are some limitations to the VTK unstructured-grid reader in ParaView:

> Unstructured Grid, vtu, is supported with ADIOS2 Local Arrays Variables only (https://adios2.readthedocs.io/en/latest/ecosystem/visualization.html)

This means that the data is written into blocks, making it challenging to read this data in again on a different number of processes.

A motivating example would be to write a mesh to file:

  1. We write the mesh on N processes.
  2. We want to read the mesh in on M processes, with M != N.
     To do so, I would use a Global Array or a JoinedArray (https://adios2.readthedocs.io/en/stable/components/components.html).

This would be extremely beneficial for using VTX as a default mesh format.
A reference implementation of how I want this to work in DOLFINx (previously using local arrays) can be found at: FEniCS/dolfinx#2717
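
For concreteness, here is a minimal sketch of the three ways a point array could be declared in adios2. The variable names and the sizes nLocal, offset and nGlobal are illustrative, joined arrays require a recent adios2 (>= 2.9), and defining all three variants in one IO is only for side-by-side comparison:

#include <adios2.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  {
    adios2::ADIOS adios(MPI_COMM_WORLD);
    adios2::IO io = adios.DeclareIO("MeshWriter");

    // Illustrative sizes; a real application computes these from its mesh.
    const std::size_t nLocal = 4, offset = 0, nGlobal = 4;

    // Local array: one block per writer, no global shape (what VTX uses today).
    io.DefineVariable<double>("PointsLocal", {}, {}, {nLocal, 3});

    // Global array: each writer places its block into a global shape, which
    // requires knowing nGlobal and this rank's offset up front.
    io.DefineVariable<double>("PointsGlobal", {nGlobal, 3}, {offset, 0},
                              {nLocal, 3});

    // Joined array: blocks are concatenated along the first dimension at read
    // time, so no offsets are needed.
    io.DefineVariable<double>("PointsJoined", {adios2::JoinedDim, 3}, {},
                              {nLocal, 3});
  }
  MPI_Finalize();
  return 0;
}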

jorgensd (Author) commented Jul 7, 2023

Maybe I should move this to VTK? Any thoughts?

williamfgc (Contributor) commented

@jorgensd it's OK if it's here. I am trying to understand the request. You can read current datasets generated with M processes launching ParaView with any number of N MPI processes. Local vs Global is orthogonal to that.

williamfgc (Contributor) commented

We use local arrays as they fit typical VTK (or any) unstructured mesh engines and partitioning representations. For this case we targeted MFEM outputs.

jorgensd (Author) commented Jul 7, 2023

> @jorgensd it's OK if it's here. I am trying to understand the request. You can read current datasets generated with M processes launching ParaView with any number of N MPI processes. Local vs Global is orthogonal to that.

I agree that I can use ParaView with a different number of processes.

However, I would like to target reading the .bp file into distributed/partitioned finite element software (FEniCSx), where I am not reading the data with the same number of processes as I used to write it out.

So for this I would use a global ADIOS array, so that I can choose the partitioning of the data in the BP file within DOLFINx.

The motivation for this is that we want to use the VTXFile for more than just output, also making it an input format for our programs (for meshes).

williamfgc (Contributor) commented Jul 7, 2023

> However, I would like to target reading the .bp file into distributed/partitioned finite element software (FEniCSx), where I am not reading the data with the same number of processes as I used to write it out.

You can do this today with local arrays; ParaView or FEniCSx would use the underlying adios2 library for reading BP files.

> So for this I would use a global ADIOS array, so that I can choose the partitioning of the data in the BP file within DOLFINx.

Global vs local arrays just describes the relationship among local data producers; the partitioning between producers and consumers is an orthogonal topic, and using global arrays doesn't buy you much here. Ideally, you want N writers and N readers for performance, but that's hardly ever the case. You can consume the current local meshes with any number of readers.
I think there is something I am missing in understanding what you're trying to do. Hope it helps.

jorgensd (Author) commented Jul 7, 2023

Maybe I'm missing something basic in ADIOS2.
How, from an adios2 perspective, would I read, with 2 MPI processes, a local array written in 3 blocks (say in C++)?
As far as I can tell, you would have to select the blocks (usually based on the MPI rank) and, if we have fewer processes than blocks, decide how to distribute them.

I'll make a minimal example over the weekend (I realized the Python bindings cannot Put local variables, so I will have to write it in C++).

jorgensd (Author) commented Jul 7, 2023

I.e. say we have this minimal script:

#include <adios2.h>
#include <filesystem>
#include <mpi.h>
#include <numeric>
#include <vector>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);

  {
    std::filesystem::path filename = "test.bp";
    adios2::ADIOS adios(MPI_COMM_WORLD);
    adios2::IO io = adios.DeclareIO("DataWriter");
    io.SetEngine("BP4");
    adios2::Engine engine = io.Open(filename.string(), adios2::Mode::Write);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Each rank writes a different number of points (rank 0: 4 points,
    // rank 1: 5 points, ...), stored as a (num_points, 3) array.
    std::vector<double> v(12 + 3 * rank);
    std::iota(v.begin(), v.end(), rank);

    // Empty shape and start arguments declare a local array variable.
    adios2::Variable<double> var
        = io.DefineVariable<double>("Points", {}, {}, {v.size() / 3, 3});

    engine.Put(var, v.data());
    engine.PerformPuts();
    engine.Close();
  }
  MPI_Finalize();

  return 0;
}

which generates a bp-file with the following contents:

bpls test.bp/ -l -d Points
  double   Points  [2]*{__, 3} = 0 / 15
        step 0: 
          block 0: [0:3, 0:2] = 0 / 11
    (0,0)    0 1 2 3 4 5
    (2,0)    6 7 8 9 10 11
          block 1: [0:4, 0:2] = 1 / 15
    (0,0)    1 2 3 4 5 6
    (2,0)    7 8 9 10 11 12
    (4,0)    13 14 15 

How would one read this into ADIOS2 in C++ with, say, three MPI processes?

williamfgc (Contributor) commented

@jorgensd in the adios2 API you set the Variable block selection with SetBlockSelection and, for local arrays, selections within a block with SetSelection. The latter can be a function of your MPI rank to set up the consumer partition you need (in your case ranks [0-2] using 3 MPI processes, as the data is not evenly distributed). See this example and tests. Hope it helps.
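
For the test.bp written above, a minimal reading sketch along those lines (the rank-to-block mapping and the selection are illustrative; a real consumer would compute its own partition):

#include <adios2.h>
#include <mpi.h>
#include <vector>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  {
    adios2::ADIOS adios(MPI_COMM_WORLD);
    adios2::IO io = adios.DeclareIO("DataReader");
    io.SetEngine("BP4");
    adios2::Engine engine = io.Open("test.bp", adios2::Mode::Read);

    engine.BeginStep();
    adios2::Variable<double> var = io.InquireVariable<double>("Points");

    // Inspect the blocks available in this step (2 in the example above).
    const auto blocks = engine.BlocksInfo(var, engine.CurrentStep());

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Illustrative mapping: rank r reads block r % nblocks in full; a real
    // consumer would also split blocks across ranks via SetSelection.
    const std::size_t b = rank % blocks.size();
    var.SetBlockSelection(b);

    const std::size_t rows = blocks[b].Count[0];
    var.SetSelection({{0, 0}, {rows, 3}});  // whole block; any sub-box works

    std::vector<double> points(rows * 3);
    engine.Get(var, points.data(), adios2::Mode::Sync);

    engine.EndStep();
    engine.Close();
  }
  MPI_Finalize();
  return 0;
}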

jorgensd (Author) commented Jul 7, 2023

> @jorgensd in the adios2 API you set the Variable block selection with SetBlockSelection and, for local arrays, selections within a block with SetSelection. The latter can be a function of your MPI rank to set up the consumer partition you need (in your case ranks [0-2] using 3 MPI processes, as the data is not evenly distributed). See this example and tests. Hope it helps.

So you would set, say, block selections:

Rank 0: block 0
Rank 1: blocks 0 and 1
Rank 2: block 1

and then use SetSelection to read a different slice of each block on each process? Say

Rank 0 (block 0): [0:9]
Rank 1 (block 0): [9:]
Rank 1 (block 1): [0:8]
Rank 2 (block 1): [8:]

This would in theory work for our applications (just a bit more work on our end to do this).
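
A sketch of that two-block pattern for rank 1, reusing var and engine from the reading sketch above (row-based boxes, since a selection is a box in the block's (rows, 3) index space; the exact splits are illustrative):

// Rank 1: tail of block 0 plus head of block 1, as two synchronous reads.
std::vector<double> tail0(1 * 3), head1(3 * 3);

var.SetBlockSelection(0);
var.SetSelection({{3, 0}, {1, 3}});  // last row of block 0 (4 rows total)
engine.Get(var, tail0.data(), adios2::Mode::Sync);

var.SetBlockSelection(1);
var.SetSelection({{0, 0}, {3, 3}});  // first three rows of block 1
engine.Get(var, head1.data(), adios2::Mode::Sync);

Sync mode executes each Get immediately, so the selection can be changed between the two reads; with Deferred you would need a PerformGets in between.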

If one cannot use the same block on multiple processes, or multiple blocks on a single process, reading in data easily becomes a bottleneck in certain cases:

  1. Write the bp-file in serial, read on 100 processes
  2. Write the bp-file on 100 processes, read in serial

However, I would stress that it would be very neat if the VTXReader in ParaView could support global, joined and local arrays, as it would make it a lot easier to encode global information about the mesh topology.

I guess this is then more of a VTK issue than a ParaView issue, as ADIOS2 can do any of these operations. Feel free to close the issue, and I'll re-open a rephrased issue in VTK.

williamfgc (Contributor) commented

Yes, the adios2 reader is the one handling things based on your workflow needs, regardless of front end.

> However, I would stress that it would be very neat if the VTXReader in ParaView could support global, joined and local arrays, as it would make it a lot easier to encode global information about the mesh topology.

We follow the VTK data model for unstructured grids. Global arrays are by definition more structured than local arrays and perhaps a better option for structured grids at the global level (e.g. as in vtkImageData); I don't know what the added value would be, since unstructured meshes are topologically irregular (hence the choice of local arrays).

From the VTK docs:

> - Structured. The dataset is a topologically regular array of cells such as pixels and voxels (e.g., image data) or quadrilaterals and hexahedra (e.g., structured grid) (see “The Visualization Model” on page 19 for more information). Rectangular subsets of the data are described through extents. The structured dataset types are vtkImageData, vtkRectilinearGrid, and vtkStructuredGrid.
>
> - Unstructured. The dataset forms a topologically irregular set of points and cells. Subsets of the data are described using pieces. The unstructured dataset types are vtkPolyData and vtkUnstructuredGrid (see “The Visualization Model” on page 19 for more information).

My two cents.

garth-wells commented

Seems to me that this is a straightforward VTK question/suggestion: support reading local and/or global arrays in VTK. If there are reasons why this would not be appropriate/possible, it would be helpful to understand them.

To my mind the VTK data model isn't relevant. VTK unstructured grid cell connectivity works with (i) an array of point indices, (ii) an array of offsets (the starting index of each cell), and (iii) an array of integer cell type identifiers [1]. This can be done with global arrays. Using global arrays to create VTX files would lead to files that are effectively unrelated to the number of processes with which they were created, i.e. for a given mesh the output on 1 MPI rank or 10 MPI ranks would be logically the same.
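
A write-side sketch of that layout as adios2 global arrays (assuming an adios2::IO io in scope; the size and offset names are illustrative quantities the application would compute, e.g. with MPI_Exscan):

// Hypothetical global 1D arrays matching the VTK connectivity layout;
// nConnGlobal/connOffset/nConnLocal etc. come from the application.
adios2::Variable<std::int64_t> conn = io.DefineVariable<std::int64_t>(
    "connectivity", {nConnGlobal}, {connOffset}, {nConnLocal});
adios2::Variable<std::int64_t> offsets = io.DefineVariable<std::int64_t>(
    "offsets", {nCellsGlobal}, {cellOffset}, {nCellsLocal});
adios2::Variable<std::uint8_t> types = io.DefineVariable<std::uint8_t>(
    "types", {nCellsGlobal}, {cellOffset}, {nCellsLocal});

Written this way, each array has a single global shape that is independent of the number of writers.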

[1] https://examples.vtk.org/site/VTKFileFormats/#unstructuredgrid

williamfgc (Contributor) commented Jul 12, 2023

@garth-wells the question is more about the added value. What exactly is not possible to do with local arrays today? Also, it would be good to understand how your mesh engine generates the unstructured grid input to adios2. The engines I have used so far follow a local array logic.

> To my mind the VTK data model isn't relevant.

It is very relevant when implementing a data model and maintaining a reader in VTK, which is the request here.

> Using global arrays to create VTX files would lead to files that are effectively unrelated to the number of processes with which they were created, i.e. for a given mesh the output on 1 MPI rank or 10 MPI ranks would be logically the same.

The keyword is "logically". Global (and local) arrays are written the same way, in blocks that can be inspected with bpls -lavD file.bp. The difference is in the reading logic, which is only metadata related: for global arrays you make selections, while for local arrays you make selections on blocks. If anything, the block partitioning is better exposed in the read API for local arrays, which gives you information for better load balancing at read time (and I understand this is of more interest to developers, like myself on the VTX reader, than to end users).
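
For contrast, a read-side sketch for a global array (assuming a file containing a global Points array of shape {nGlobal, 3}, the reader setup from the earlier sketch, and rank/size being the MPI rank and communicator size; the even row split is illustrative):

// Global array read: one box selection in global index space, independent
// of how many blocks the writers produced.
adios2::Variable<double> g = io.InquireVariable<double>("Points");
const std::size_t nGlobal = g.Shape()[0];
const std::size_t q = nGlobal / size, r = nGlobal % size;
const std::size_t begin = rank * q + (std::size_t(rank) < r ? rank : r);
const std::size_t rows = q + (std::size_t(rank) < r ? 1 : 0);
g.SetSelection({{begin, 0}, {rows, 3}});
std::vector<double> pts(rows * 3);
engine.Get(g, pts.data(), adios2::Mode::Sync);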

My two cents is to first give reading local arrays in an unstructured mesh data model a shot, and understand the trade-offs and the added value of global arrays for the developer. For the end user of your product there won't be any difference in accessing the data or in performance. Keep in mind that adios2's motivation is performance in HPC, and we provide the pieces to do that. We'd need a lot more resources to build and maintain a fully-featured ecosystem with functionality that doesn't tackle performance issues.

williamfgc (Contributor) commented Jul 12, 2023

Needless to say, I'm happy to assist with your reader efforts. You can look at the VTX reader as an example, and feel free to open a PR into VTK providing and maintaining the functionality and tests for a global mesh reader; it's all open source.
