Discard trailing entries of a line in a MM-format file #1628

Merged
MarcelKoch merged 2 commits into develop from mm-io-discard-rest-of-row
Aug 14, 2024

Conversation

MarcelKoch
Member

This PR discards any characters in a line of an MM-format file after the values have been read in. This addresses #1627: using multiple columns (several values on one line) within the MM file will now lead to an exception. However, the exception is thrown by the read_entry function, so the exception message is not very helpful.
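The discard behavior described above can be sketched with standard streams. This is only an illustration under stated assumptions, not the code in core/base/mtx_io.cpp: the names `read_one_value_per_line` and `read_from_string` are hypothetical, and Ginkgo's real read_entry and GKO_CHECK_STREAM are not used here.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not Ginkgo's read_entry): reads `count` values,
// one per line, and discards everything after the value on each line.
inline std::vector<double> read_one_value_per_line(std::istream& is,
                                                   std::size_t count)
{
    std::vector<double> values;
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        double v{};
        if (!(is >> v)) {
            // Too few lines: the stream ran out before `count` values.
            throw std::runtime_error("error when reading matrix entry " +
                                     std::to_string(i));
        }
        values.push_back(v);
        // Drop the remainder of the current line, including the newline.
        is.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return values;
}

// Hypothetical convenience wrapper for trying the reader on a string.
inline std::vector<double> read_from_string(const std::string& text,
                                            std::size_t count)
{
    std::istringstream is(text);
    return read_one_value_per_line(is, count);
}
```

With this sketch, a body with one value per line is read in full, while a body such as `"1.0 2.0\n3.0 4.0\n"` requested with `count = 4` runs out of lines after two values and throws. As in the PR, the error is raised where the next entry is read, so the message does not say that extra values on a line were the cause.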

@MarcelKoch MarcelKoch self-assigned this Jun 19, 2024
@MarcelKoch MarcelKoch linked an issue Jun 19, 2024 that may be closed by this pull request
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. mod:core This is related to the core module. labels Jun 19, 2024
@thoasm thoasm self-requested a review June 21, 2024 11:39
Member

@thoasm thoasm left a comment

LGTM!
I just have one question: Does the GKO_CHECK_STREAM also throw in case we don't read enough elements?

Collaborator

nbeams commented Jun 21, 2024

> LGTM! I just have one question: Does the GKO_CHECK_STREAM also throw in case we don't read enough elements?

With the current develop (not this PR) I tried a case where I left off one element, and it threw a read error.

@MarcelKoch MarcelKoch added the 1:ST:ready-for-review This PR is ready for review label Jul 11, 2024
Member

@yhmtsai yhmtsai left a comment

nit: Could you also remove the nnz from the error message in lines L577-L579 of core/base/mtx_io.cpp?
That message is for the dense case, so there's no need for nnz.

Could you also share the real case (I assume from some application) that does not fit the expected MM file?
I am wondering whether we should ignore the additional entries or throw an error unless they are marked as a comment with %.
Could you update the documentation with this behavior?

Collaborator

nbeams commented Aug 12, 2024

The "real case" was just me making a mistake in the dense/"array" format for Matrix Market (vs coordinate). I didn't check that the entries were supposed to be in column-major order, and instead provided them row-major (the way you would print out the matrix to look at it visually) and without the correct one-per-line formatting. Because I still had the correct total number of entries, Ginkgo did not throw an error (whitespace between values was treated the same as newlines), so I didn't realize the values were being interpreted incorrectly and therefore causing problems with my test.

So it was a case of user error for sure, but the idea behind this PR was that Ginkgo could help users by making this a failure mode -- if you do not have the correct number of lines, it will be an error, and you will realize your file formatting is incorrect. It has the benefit of more strictly adhering to the MM standard, apart from cases of user error.
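The silent misreading described in this comment can be reproduced with a small sketch. It assumes a lenient reader that treats spaces and newlines alike and fills the matrix in column-major order, as the array format expects. The `Dense` struct and `read_array_lenient` are hypothetical names for illustration, not Ginkgo types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical dense matrix, stored row-major so it is easy to inspect.
struct Dense {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;
    double at(std::size_t i, std::size_t j) const
    {
        return data[i * cols + j];
    }
};

// Lenient reader: values are consumed in file order, with spaces and
// newlines skipped alike, and assigned in column-major order.
// Error handling is omitted for brevity.
inline Dense read_array_lenient(const std::string& body, std::size_t rows,
                                std::size_t cols)
{
    std::istringstream is(body);
    Dense m{rows, cols, std::vector<double>(rows * cols, 0.0)};
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t i = 0; i < rows; ++i) {
            double v{};
            is >> v;
            m.data[i * cols + j] = v;
        }
    }
    return m;
}
```

Reading the body `"1 2\n3 4\n"` as a 2x2 matrix with this reader succeeds, because the total number of values is correct, but entry (0, 1) becomes 3 and entry (1, 0) becomes 2. The result is the transpose of the matrix the file appears to show, and no error is reported.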

MarcelKoch and others added 2 commits August 13, 2024 12:00
- formatting
- fix error message

Co-authored-by: Thomas Grützmacher <thomas.gruetzmacher@tum.de>
Co-authored-by: Yu-Hsiang M. Tsai <yhmtsai@gmail.com>
@MarcelKoch
Member Author

> Could you update the documentation with this behavior?

I don't think there is anything to update. This just fixes our previously incorrect parsing behavior.

@MarcelKoch MarcelKoch added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Aug 13, 2024
@MarcelKoch MarcelKoch merged commit ceee174 into develop Aug 14, 2024
10 of 14 checks passed
@MarcelKoch MarcelKoch deleted the mm-io-discard-rest-of-row branch August 14, 2024 06:53
Labels
1:ST:ready-to-merge This PR is ready to merge. mod:core This is related to the core module. reg:testing This is related to testing.
Development

Successfully merging this pull request may close these issues.

MM-Format IO Checking
5 participants