Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing of template specializations and templated friend functions #733

Merged
merged 15 commits into from
Jan 9, 2024

Conversation

HGuillemet
Copy link
Contributor

Parsing this:

template <class T>
struct S<T, T>  {};

generates a S<T,T>.java class.

When it's parsed the first time, and no info is found for the class, it adds a dummy one for S<T,T>.
I think this adding is to allow proper namespace resolution if S is referenced afterwards.
But when DeclarationList.add is called, the query that looks for template instances in info map returns S<T,T> and triggers the instantiation.

I suggest to untemplate the name of the class before adding it to the infoMap, in order to still help namespace resolution but prevent spurious template instantiation.
Does it make sense ?

@saudet
Copy link
Member

saudet commented Jan 2, 2024

Sounds good, but let's see if there are other similar cases elsewhere we could fix before merging this

@HGuillemet
Copy link
Contributor Author

I have a fix for friend function declaration, not related to this one. Do I add the commit to this PR or do I open another ?

@saudet
Copy link
Member

saudet commented Jan 2, 2024

You can add it here since this fix is just one line

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Jan 3, 2024

So included in this PR, some fixes for handling templated friend functions:

namespace ns {
  template <typename U>
  void f(U u) {};

  struct S {
    friend void f<S>(S t);
  };

  template <typename U>
  struct X;

  template <typename U>
  void g(X<U> x) {};

  template <typename U>
  struct X {
    friend void g<U>(X x);
  };
}

With master, the parsing of this code has several issues:

  1. Without any info (but new Info().friendly()), is generated:
public class S extends Pointer {
  /*  ... */
  private static native @Namespace void f<ns::S>(@ByVal S t);
  public void f<ns::S>() { f<ns::S>(this); }
}

Which doesn't compile because of f<ns::S> method name. However, I think the behaviour is correct. The alternative would be either to silently skip the friend function because it has no explicit Java name, or to remove the template argument in the Java name. I think it's ok to force to add an info with the correct name. It's consistent with how class templates that are automatically instantiated because they are used somewhere else are given Java classes with invalid class names. Removing the template argument would work in most cases but could also hide functions if the function template is instantiated more than once in the same class. So, if you agree, there is no real issue here.

  1. If we add an info to give a valid name to the function: new Info("ns::S::f<ns::S>").javaNames("f").friendly(), it doesn't work. Same class is generated. This is because of this line in declarator() that prevents to pick the Java name from the info map. I'm not sure to understand the intent of the context.templateMap != null && context.templateMap.type == null clause. I suggest to instead test if the cpp names equal: info.cppNames[0].equals(dcl.cppName). Thus the Java name is picked if the main info cpp name has no template arguments or if it has template arguments but they exactly match the ones of the declaration being parsed.

  2. If we instantiate X<int> with new Info("ns::X<int>").pointerTypes("X"). The info map query when g is parsed is ns::X<int>::g<U>(int) instead of ns::X<int>::g<int>(int). I suggest to fix this by replacing these lines by a call to templateArguments, which performs the substitution, correctly qualifies the types, and probably also parses more robustly complex arguments.

  3. After applying the fixes above, the JNI doesn't compile with error:

/home/java/javacpp/test/test/jniglobal.cpp:583:9: error: 'f' was not declared in this scope; did you mean 'ns::f'?
  583 |         f<ns::S>(*ptr0);

This is because we rely on ADL when we call a friend function, reseting the namespace with @Namespace (some friend operators or functions are only accessible through ADL). But ADL is not performed when calling a function with explicit template arguments, for some reasons. So in declarator we must strip the template arguments from the @Name of friend functions, and hope the compiler will figure out the right template instance.

Regression tests with the whole PR:

  • pytorch : info for templated friend c10::impl::ListElementReference::swap must be adjusted.

  • onnx: there are template specifializations in proto_utils.h:

template <typename T>
inline std::vector<T> RetrieveValues(const AttributeProto& attr);
template <>
inline std::vector<int64_t> RetrieveValues(const AttributeProto& attr) {
 return {attr.ints().begin(), attr.ints().end()};
}

template <>
inline std::vector<std::string> RetrieveValues(const AttributeProto& attr) {
 return {attr.strings().begin(), attr.strings().end()};
}

template <>
inline std::vector<float> RetrieveValues(const AttributeProto& attr) {
 return {attr.floats().begin(), attr.floats().end()};
}

And presets has:

infoMap
  .put(new Info("onnx::RetrieveValues<int64_t>").javaNames("RetrieveValuesLong"))
  .put(new Info("onnx::RetrieveValues<std::string>").javaNames("RetrieveValuesString"))

WIth master, when the specializations are parsed and declarator() is called, the Java name from first info is picked because context.templateMap != null && context.templateMap.type == null is true. With PR, info.cppNames[0].equals(dcl.cppName) is false so the Java name is not picked and the original name is kept. This results in master with only 2 declarations (specializations picked java name RetrieveValuesLong clashes with template instantiation from info), and with PR with 3 declarations. I'd say the PR behaviour is more logical, although the resulting mapping is sloppy.

  • tvm: for similar reasons, PR generated an extra declaration for this template specialization in array.h:
template <>
inline ObjectPtr<ArrayNode> make_object() {
  return ArrayNode::Empty();
}
  • All other presets: no change
  • Not checked: libfreenect2,chilitags,hyperscan,tensorflow,ale, ngraph, qt, skia, videoinput

The best behavior for template specializations is probably not to map them unless requested via info. What do you thing ?

@HGuillemet HGuillemet changed the title Untemplate class name before adding them to infoMap Fix parsing of template specializations and templated friend functions Jan 3, 2024
@saudet
Copy link
Member

saudet commented Jan 3, 2024 via email

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Jan 3, 2024

Nothing important. I'd give up the mapping of swap in pytorch but nobody would care.

If we look at the 4 fixes included, in order of appearance in Parser.java:

  • Fix 1 (parsing of template arguments in declarator) is simple, not game changing, improves and simplifies the code by logically reusing templateArguments. Can you think of a possible side effect ?
  • Fix 2 (conditions for picking a java name from info) you might want to avoid, since the implications are not clear, at least for me. If we merge it I think we must include another fix to prevent automatic mapping of full template specializations.
  • Fix 3 (remove template args from @Name of friends) has very limited impact and the fix is clear. No reason not to merge it I believe.
  • Fix 4 (remove template args before creating an entry in info map) is the one that solves the issue of top post.

@saudet
Copy link
Member

saudet commented Jan 4, 2024

I'm not sure which ones of those modifications are number 1, 2, 3, but anyway the results for ONNX and TVM don't make sense. If you can figure out a way to not break that, please do it.

@saudet
Copy link
Member

saudet commented Jan 4, 2024

Like, for example, with ONNX and this:

               .put(new Info("onnx::RetrieveValues<float>").javaNames("RetrieveValuesFloat"))
               .put(new Info("onnx::RetrieveValues<int64_t>").javaNames("RetrieveValuesLong"))
               .put(new Info("onnx::RetrieveValues<std::string>").javaNames("RetrieveValuesString"))

we still get something that doesn't work:

@Namespace("onnx") public static native @ByVal @Name("RetrieveValues<float>") FloatVector RetrieveValuesFloat(@Const @ByRef AttributeProto attr);

@Namespace("onnx") public static native @ByVal @Name("RetrieveValues<int64_t>") LongVector RetrieveValuesLong(@Const @ByRef AttributeProto attr);

@Namespace("onnx") public static native @ByVal @Name("RetrieveValues<std::string>") StringVector RetrieveValuesString(@Const @ByRef AttributeProto attr);
@Namespace("onnx") public static native @ByVal LongVector RetrieveValues(@Const @ByRef AttributeProto attr);

So please fix this

@HGuillemet
Copy link
Contributor Author

I'm not sure which ones of those modifications are number 1, 2, 3,

I added precisions in my previous comment.

@saudet
Copy link
Member

saudet commented Jan 4, 2024

Ok, thanks, so like I said before, it's hard to come up with meaningful names for template instances, that's not a bug. Don't try to implement anything related to that, unless maybe define() is set

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Jan 4, 2024

Ok, thanks, so like I said before, it's hard to come up with meaningful names for template instances, that's not a bug. Don't try to implement anything related to that

That's not the intent.

Like, for example, with ONNX and this:

               .put(new Info("onnx::RetrieveValues<float>").javaNames("RetrieveValuesFloat"))
               .put(new Info("onnx::RetrieveValues<int64_t>").javaNames("RetrieveValuesLong"))
               .put(new Info("onnx::RetrieveValues<std::string>").javaNames("RetrieveValuesString"))

we still get something that doesn't work:

@Namespace("onnx") public static native @ByVal @Name("RetrieveValues<float>") FloatVector RetrieveValuesFloat(@Const @ByRef AttributeProto attr);

@Namespace("onnx") public static native @ByVal @Name("RetrieveValues<int64_t>") LongVector RetrieveValuesLong(@Const @ByRef AttributeProto attr);

@Namespace("onnx") public static native @ByVal @Name("RetrieveValues<std::string>") StringVector RetrieveValuesString(@Const @ByRef AttributeProto attr);
@Namespace("onnx") public static native @ByVal LongVector RetrieveValues(@Const @ByRef AttributeProto attr);

So please fix this

The problem here is with instantiation of template specializations:
Say we have this header:

template<typename A, typename B>
struct X;

template<typename A>
struct X<A,int>;

template<>
struct X<int,int>;

and this info: new Info("X<int,float>").javaNames("X_int_float").

Each template is instantiated in turn with same info, ignoring the extra template arguments in the second and third ones.
With master, all instantiations are given the same name, obtained from partial info match. So in practice only the first one is kept, because of java name clash.
With PR, the java name for X<int,float> is not accepted for naming X<int> (which happens to be a X<int,int>, so it looks sane). So the cpp name X is kept and we end up with an extra declaration.
But I think the best is to never instantiate a template with an info that has more template arguments that the template has parameters.

So, what do you think is the correct behavior ?

  1. Master
  2. Current PR
  3. Current PR + test so that an info for X<A,B> is never used to instantiate a X<A>

@saudet
Copy link
Member

saudet commented Jan 4, 2024

template<typename A, typename B>
struct X;

template<typename A>
struct X<A,int>;

template<>
struct X<int,int>;

The last two are just specializations of the first one. It's just an implementation detail, so they can and should be ignored.

@HGuillemet
Copy link
Contributor Author

I fully agree.
If in Info we have only X<A,B>, the 2 specializations should not be instantiated. This is not the case currently.
So I suggest the 3rd option (PR + prevent X<A,B> to instantiate X<A>).
With this option, tvm and onnx will be back to unchanged. However a quick test shows me some declarations in opencv that are generated because template full specializations (template <> ...) are automatically instantiated and some info will have to be added to instantiate them explictly.
Shall I proceed ?

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Jan 4, 2024

I forgot to check classes that would not be generated anymore. Here are they:

  • depthai:
    • ConnectionHash: full specialization of std::hash<dai::Node::Connection>: will need a .define() now. This will also add the proper methods from the container. The class seems quite useless as is.
  • opencv
    • CvMatDefaultDeleter : the parent, non specialized, template DefaultDeleter is in cvstd_wrapper.hpp which is not parsed.
    • Accumulator : a specialization for cv::Accumulator<unsigned char>, seems useless as is. Other accumulators are not mapped.
    • CvCaptureDefaultDeleter : same as CvMatDefaultDeleter
    • CvVideoWriterDefaultDeleter : same as CvMatDefaultDeleter
  • pytorch
    • pack : spurious template instance
    • class_ : spurious template instance
  • tensorflowlite
    • TfLiteTypeToType : full specialization for tflite::TfLiteTypeToType<kTfLiteInt32>. Seems useless as is. Parent template not mapped.
    • ValueHasher : Same, for tflite::op_resolver_hasher::ValueHasher<tflite::BuiltinOperator>
  • tensorrt
    • EnumMaxImpl : Same for nvinfer1::impl::EnumMaxImpl<nvinfer1::DataType>
  • tvm
    • Handler
    • PackedFuncValueConverter
    • Type2str
    • is_floating_point
    • is_integral
    • is_pod
    • type_name_helper
    • typed_packed_call_dispatcher

@saudet
Copy link
Member

saudet commented Jan 4, 2024

That sounds alright, yes, thanks

@saudet
Copy link
Member

saudet commented Jan 5, 2024

And pull bytedeco/javacpp-presets#1455 passes with these changes, is that it?

@HGuillemet
Copy link
Contributor Author

Yes, but I'll have to add commits to adjust some info and remove spurious classes from gen.
Tell me if you'd like a PR for simiar adjustments of depthai, opencv, tvm... or if you prefer to do it yourself.

@saudet
Copy link
Member

saudet commented Jan 5, 2024 via email

@HGuillemet
Copy link
Contributor Author

For most of them, yes. But we need to remove them from src/gen.

A .define must be added for ConnectionHash in depthai.

Concerning opencv, there are a bunch of functions void write(@ByRef FileStorage fs, SOMECLASS value); in opencv2/core/persistence.hpp. Some of them are plain functions, some are templates, some are full template specializations.
In master, plain functions and full specializations are mapped.
With PR, only plain functions.
I believe we must either map none, or all. In both cases we must add some info.

@saudet
Copy link
Member

saudet commented Jan 5, 2024

For most of them, yes. But we need to remove them from src/gen.

Sure, but it's not too important, it'll get done eventually :)

A .define must be added for ConnectionHash in depthai.

It doesn't appear to be used anywhere, we don't need it

Concerning opencv, there are a bunch of functions void write(@ByRef FileStorage fs, SOMECLASS value); in opencv2/core/persistence.hpp. Some of them are plain functions, some are templates, some are full template specializations. In master, plain functions and full specializations are mapped. With PR, only plain functions. I believe we must either map none, or all. In both cases we must add some info.

Ah, yes, we should create instances for those...

Well, please open a pull request for that, sure.

So, you think we can merge this pull request here?

@HGuillemet
Copy link
Contributor Author

Please hold on, I'm tracking another potential issue.

@HGuillemet
Copy link
Contributor Author

For most of them, yes. But we need to remove them from src/gen.

Sure, but it's not too important, it'll get done eventually :)

A .define must be added for ConnectionHash in depthai.

It doesn't appear to be used anywhere, we don't need it

Ok, then we can remove its info.

Concerning opencv, there are a bunch of functions void write(@ByRef FileStorage fs, SOMECLASS value); in opencv2/core/persistence.hpp. Some of them are plain functions, some are templates, some are full template specializations. In master, plain functions and full specializations are mapped. With PR, only plain functions. I believe we must either map none, or all. In both cases we must add some info.

Ah, yes, we should create instances for those...

Well, please open a pull request for that, sure.

Ok, finishing pytorch adaptation first.

@HGuillemet
Copy link
Contributor Author

Please hold on, I'm tracking another potential issue.

Can you remember why there is a !decl.custom clause at this line ?
This prevents instantiatons of templates with a javaText.
For my current need it's not blocking, since it concerns a constructor template and I can split between an info with define for template infomap query and another info with with full cpp name and javaText for query instance generation. So no hurry for a fix.

I'm finishing adaption of pytorch and opencv, and if there is no issue, we can merge this PR.

@saudet
Copy link
Member

saudet commented Jan 5, 2024

Can you remember why there is a !decl.custom clause at this line ? This prevents instantiatons of templates with a javaText. For my current need it's not blocking, since it concerns a constructor template and I can split between an info with define for template infomap query and another info with with full cpp name and javaText for query instance generation. So no hurry for a fix.

That's from commit e473740. I guess we could ignore that flag when define() is set, yes, that should be fine.

I'm finishing adaption of pytorch and opencv, and if there is no issue, we can merge this PR.

👍

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Jan 5, 2024

Something like sed 's/decl.custom = true/decl.custom = !info.define/g' src/main/java/org/bytedeco/javacpp/tools/Parser.java should do what you want, no?

Ok I have done this, although it looks uselessly cryptic to me. Unless I missed something, all this declaration.custom stuff is only here to allow writing new Info("X").javaText("...") instead of new Info("X<XXX>").javaText("..."), for the same result.

Pytorch presets is ok now. There is only the problem with Stack constructors, for which I have no solution, but it might well have 0 impact.

@HGuillemet
Copy link
Contributor Author

Please wait a bit more before merging.

@saudet
Copy link
Member

saudet commented Jan 7, 2024

Sure, take your time to finalize this, but don't try to start anything new here. I'd like to make a release sooner rather than later while we still got all the builds working.

@HGuillemet
Copy link
Contributor Author

Commits added:

  • Function templates: a previous commit in this PR added template arguments to function fullnames for querying the infoMap. But declarator() called before to determine the cppName could already have added this template arguments (because declarator calls Context.qualify which proposes a list of cpp names some of them with template arguments, and, depending on what is found in the infoMap, those cpp names could be selected). I added a check to not append the template arguments if the cpp name already ends with >. That's not very satisfactory but it seems to work and I didn't see another alternative besides redesigning how Context.qualify works and how the cpp name is selected from what is returned.
  • Updating the opencv presets showed me an upcast info were missing, but adding it triggered some bugs related to how virtual inheritance is tracked to determine if static or dynamic cast is needed in downcast constructors. Second commit fixes that.

All presets are now updated for this PR, so I think you can merge unless you spot something wrong. I'll then push a PR with changes for Pytorch, Opencv and Depthai.

@HGuillemet
Copy link
Contributor Author

Concerning template specializations and the problem of Stack above and other similar problems, like Complex<Half> in Pytorch, where template specialization change the API of the parent template:
It would be nice to be able to write info like this:

infoMap
  .put(new Info("S<double>").javaNames("Sdouble"))
  .put(new Info("S<>", "S<int>").javaNames("Sint"))
;

So that the template specialization is used to generate instance S<int> and the parent template for instance S<double>.
That doesn't work currently, I'm trying to understand why. There is probably little to modify to make this possible.

@saudet
Copy link
Member

saudet commented Jan 8, 2024 via email

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Jan 8, 2024

It does unfortunately.
The specialization replaces the definitions in the parent/primary template.

infoMap
  .put(new Info("S<double>").javaNames("Sdouble"))
  .put(new Info("S<>", "S<int>").javaNames("Sint"))
;

won't work in fact, since the class Sint would have a @Name("S<>") which is not a valid name. It should be S<int>. But if S<int> is listed as the first cppName in Info, it will trigger the instantiation with the primary template.

Another way we could fix that is to arrange so that a new definition that clashes with an already generated definition (same signature) replaces it instead of being ignored. That way we first instantiate S<int> as Sint using primary template, then we re-instantiate it using the specialization, and the specialization takes precedence.
What do you think ?
Is not worth the effort ?

@saudet
Copy link
Member

saudet commented Jan 8, 2024

C++ already picks the specialization in priority, in fact it's not possible to not use the specialization

@HGuillemet
Copy link
Contributor Author

Right. That's why theoretically we should parse the specialization when generating an instance that matches, and not use the primary template.

@saudet
Copy link
Member

saudet commented Jan 8, 2024

You mean like the specialization adds and removes members?! 🤔

@HGuillemet
Copy link
Contributor Author

Yes, specializations can do that.
Example:

template <typename T>
struct S {
  void f(T x);
};

template <>
struct S<int> {
  void g();
};

int main(int ac, char **av) {
  S<int> sint;
  S<double> sdouble;
  sint.f(1); // Compilation error
  sdouble.f(1); // Ok
  sint.g(); // Ok
}

In this case we should have a way to generate a Sint.java with g and no f.

@saudet
Copy link
Member

saudet commented Jan 9, 2024

Is there any one last thing you might want to look at before merging this?

@HGuillemet
Copy link
Contributor Author

No. I'm done.

@saudet saudet merged commit f540614 into bytedeco:master Jan 9, 2024
11 checks passed
@HGuillemet HGuillemet deleted the fix_info_template branch January 9, 2024 13:46
HGuillemet added a commit to HGuillemet/javacpp-presets that referenced this pull request Jan 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants