Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve parsing of operator and function templates #732

Merged
merged 6 commits into from
Jan 1, 2024

Conversation

HGuillemet
Copy link
Contributor

@HGuillemet HGuillemet commented Dec 29, 2023

See Issue #729

  • Add Parser.operator that explicitly parses an operator following the operator keyword. In master, all tokens up to next open parenthesis were taken as the operator name. We need this to be able to parse template arguments that may appear after the operator and before a parenthesis .
  • Add support for function parameters to Templates.splitNamespace.
  • Make Infomap.normalize use methods in Templates to untemplate. This allows to untemplate things like operator<<double>.
  • Fix construction of fullname in Parser.function to insert the function template arguments.
  • Clarify distinction between constructor calling name NS::C and declaration name NS::C::C. We cannot specify template arguments of constructor templates with calling name, so info keys should use the declaration name. We still query the calling name for backwards compatibility when generating a constructor template instance, but we query only the declaration name when looking for template instances in DeclarationList.add. This fixes a bug where template arguments of the class were used instead of template arguments of the constructor when both were templated.
  • Fix a bug where the cppName of constructor was set to the declaration name in the particular case of the class lying in root namespace.

Example derived from @benshiffman's:

template <typename T>
class TemplatedClass {
public:
    template <typename U> TemplatedClass(U val);
    template <typename U> void funTemp(U val);
    template <typename U> bool operator<(TemplatedClass<U>& rhs);
};

WIth info:

infoMap
    .put(new Info("TemplatedClass<double>").pointerTypes("TemplatedClassDouble"))

    .put(new Info("TemplatedClass<double>::TemplatedClass<int>").javaNames("XXX"))

    .put(new Info("TemplatedClass<double>::funTemp<int>").javaNames("funTempInt"))

    .put(new Info("TemplatedClass<double>::operator <<double>").javaNames("lt"))
;

we now get:

    public TemplatedClassDouble(int val) { super((Pointer)null); allocate(val); }
    private native void allocate(int val);
    public native @Name("funTemp<int>") void funTempInt(int val);
    public native @Cast("bool") @Name("operator <<double>") boolean lt(@ByRef TemplatedClassDouble rhs);

Note that .javaNames("XXX") is necessary because in DeclarationList.add we require a Java name to be present in the info to trigger the template instantiation. This doesn't make sense for constructors. Maybe should we also accept info with define, and javaText. Or maybe remove any constraint on what is in the info (not tested).

If, because of polymorphism, we need to target functions with specific arguments, things get a bit more complicated:

infoMap
     .put(new Info("TemplatedClass<double>").pointerTypes("TemplatedClassDouble"))

     .put(new Info("TemplatedClass<double>::TemplatedClass<int>(U)").javaNames("XXX"))

     .put(new Info("TemplatedClass<double>::funTemp<int>(U)").javaNames("XXX"))
     .put(new Info("TemplatedClass<double>::funTemp<int>(int)").javaNames("funTempInt"))

     .put(new Info("TemplatedClass<double>::operator <<double>(TemplatedClass<U>&)").javaNames("XXX"))
     .put(new Info("TemplatedClass<double>::operator <<double>(TemplatedClass<double>&)").javaNames("lt"))
;

The use of placeholder U is a bit ugly, but I don't think we can do much about it. It's necessary to answer the infoMap queries in DeclarationList.add, like TemplatedClass<double>::funTemp<U>(U). TemplatedClass<double>::funTemp<int>(U) matches, but not TemplatedClass<double>::funTemp<int>(int).

Info("TemplatedClass<double>::funTemp<int>(int)").javaNames("funTempInt") is still necessary if we need to set the javaName because when we generate the function instance, the infoMap is queried with the function fullname, with template arguments replaced by their concrete values, and TemplatedClass<double>::funTemp<int>(U) won't match.

This PR should only introduce one breaking change: info keys using calling name of templated constructor won't match anymore and must be changed to use declaration name.

Regression tests:

  • pytorch: two infos for constructor template using calling names must be changed.
  • bullet: this PR generates one less constructor for btDefaultMotionState for a complex reason. This class has a virtualize info and a constructor with parameters having default values.
    • in master, because of the bug mentioned above and because the class lies in root namespace, the info map for the constructor is queried with the declaration name only and doesn't match the class info with virtualize.
    • with PR, constructor cppName is correct, and infomap is queries first with declaration name, then with calling (class) name. So virtualize is (wrongly) set to true for the constructor and since function overloads are not generated for virtualized functions, the no-arg constructor is not generated.
      virtualize info doesn't make sense for constructors. Does it ?
      If it does not, I suggest to ignore this info for constructors, which should avoid this problem related to a confusion between constructor info and class info.
  • all other presets: no change
  • presets not checked: libfreenect2,chilitags,hyperscan,tensorflow,ale, ngraph, qt, skia, videoinput

@benshiffman
Copy link

benshiffman commented Dec 29, 2023

Thanks for doing this! I'm doing my best to understand what's going on in the code and I think I get the gist of your changes. Will this allow templated operators with a different type than the parent class to map correctly with an info like so?

infoMap.put(new Info("TemplatedClass<double>::operator +<int>(TemplatedClass<U>&)").javaNames("XXX"));
infoMap.put(new Info("TemplatedClass<double>::operator +<int>(TemplatedClass<int>&)").javaNames("addInt"));

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Dec 29, 2023

Didn't you mean operator +<U> and operator +<int> ?
Yes, that's one of the things fixed.
And you won't need to specify the parameters. This should be enough:

infoMap.put(new Info("TemplatedClass<double>::operator +<int>").javaNames("addInt"));

Can you give this PR a try with your code ?

@benshiffman
Copy link

Didn't you mean operator +<U> and operator +<int> ?

It should have been operator +<int>(TemplatedClass<U>&) and operator +<int>(TemplatedClass<int>&) I think per your example above. I've updated my comment.

infoMap.put(new Info("TemplatedClass<double>::operator +<int>").javaNames("addInt"));

This mapping alone works will consequentially rename any nontemplated operator + to addInt. I had to use this mapping for operator =<U>(const TemplatedClass<U>&) to ensure proper naming since I have a nontemplated operator =:

infoMap.put(new Info("TemplatedClass<double>::operator =<std::complex<double> >(const TemplatedClass<U>&)").javaNames("XXX"));
infoMap.put(new Info("TemplatedClass<double>::operator =<std::complex<double> >(const TemplatedClass<std::complex<double> >&)").javaNames("putComplexDouble"));

infoMap.put(new Info("TemplatedClass<std::complex<double> >::operator =<double>(const TemplatedClass<U>&)").javaNames("XXX"));
infoMap.put(new Info("TemplatedClass<std::complex<double> >::operator =<double>(const TemplatedClass<double>&)").javaNames("putDouble"));

Now the templated one is named putDouble or putComplexDouble and the nontemplated one is just put.

Normal constructors and functions work as per your example.

@HGuillemet
Copy link
Contributor Author

Didn't you mean operator +<U> and operator +<int> ?

It should have been operator +<int>(TemplatedClass<U>&) and operator +<int>(TemplatedClass<int>&) I think per your example above. I've updated my comment.

Right.

infoMap.put(new Info("TemplatedClass<double>::operator +<int>").javaNames("addInt"));

This mapping alone works will consequentially rename any nontemplated operator + to addInt. I had to use this mapping for operator =<U>(const TemplatedClass<U>&) to ensure proper naming since I have a nontemplated operator =:

infoMap.put(new Info("TemplatedClass<double>::operator =<std::complex<double> >(const TemplatedClass<U>&)").javaNames("XXX"));
infoMap.put(new Info("TemplatedClass<double>::operator =<std::complex<double> >(const TemplatedClass<std::complex<double> >&)").javaNames("putComplexDouble"));

infoMap.put(new Info("TemplatedClass<std::complex<double> >::operator =<double>(const TemplatedClass<U>&)").javaNames("XXX"));
infoMap.put(new Info("TemplatedClass<std::complex<double> >::operator =<double>(const TemplatedClass<double>&)").javaNames("putDouble"));

Ok.
What about this:

infoMap.put(new Info("TemplatedClass<double>::operator =<std::complex<double> >").javaNames("putComplexDouble"));
infoMap.put(new Info("TemplatedClass<double>::operator =<double>").javaNames("putDouble"));
infoMap.put(new Info("TemplatedClass<double>::operator =").javaNames("put"));

@benshiffman
Copy link

This ended up working. In the header file, my nontemplated assignment operator overload comes first.

infoMap.put(new Info("TemplatedClass<double>::operator =").javaNames("put")).
infoMap.put(new Info("TemplatedClass<double>::operator =<std::complex<double> >").javaNames("putComplexDouble"));
infoMap.put(new Info("TemplatedClass<std::complex<double> >::operator =").javaNames("put"));
infoMap.put(new Info("TemplatedClass<std::complex<double> >::operator =<double>").javaNames("putDouble"));

There is a catch though. If the templated operator comes first in the C/C++ header (or if there is no nontemplated operator), the above infoMapping will create identical functions to the nontemplated-first scenario but the Parser technically ignores the nontemplated version completely. Reversing the order of the infoMapping will cause the javaName for the nontemplated operator not to map, but both functions will still be inserted where the templated version should be.

So, I think best practice moving forward is nontemplated version first, in both header and mapping.

@saudet
Copy link
Member

saudet commented Dec 30, 2023

  • bullet: this PR generates one less constructor for btDefaultMotionState for a complex reason. This class has a virtualize info and a constructor with parameters having default values.

    • in master, because of the bug mentioned above and because the class lies in root namespace, the info map for the constructor is queried with the declaration name only and doesn't match the class info with virtualize.
    • with PR, constructor cppName is correct, and infomap is queries first with declaration name, then with calling (class) name. So virtualize is (wrongly) set to true for the constructor and since function overloads are not generated for virtualized functions, the no-arg constructor is not generated.
      virtualize info doesn't make sense for constructors. Does it ?
      If it does not, I suggest to ignore this info for constructors, which should avoid this problem related to a confusion between constructor info and class info.

Are you talking about this line?

if (context.virtualize) {
    // Prevent creation of overloads, that are not supported by the generated C++ proxy class.
    break;
}

Right, I guess it would make more sense to check for the @Virtual annotation like

if (context.virtualize && type.annotations.contains("@Virtual")) {
    // Prevent creation of overloads, that are not supported by the generated C++ proxy class.
    break;
}

Does this fix it?

@HGuillemet
Copy link
Contributor Author

Are you talking about this line?

if (context.virtualize) {
    // Prevent creation of overloads, that are not supported by the generated C++ proxy class.
    break;
}

Yes.
context.virtualize is set to true in the case of PR because info.virtualize is true.
And info is set by a first query on btDefaultMotionState::btDefaultMotionState, then, since it returns nothing, on btDefaultMotionState which returns the class info, which has virtualize on.
So to fix this we either can remove the ambiguous query on btDefaultMotionState (my preference, but could break too much. I didn't check on existing presets), or add a check in this if to exclude constructors.

Right, I guess it would make more sense to check for the @Virtual annotation like

if (context.virtualize && type.annotations.contains("@Virtual")) {
    // Prevent creation of overloads, that are not supported by the generated C++ proxy class.
    break;
}

Does this fix it?

I don't think so since, in case of virtualized non-constructor functions, @Virtual is added to modifiers variable and is not in type.annotations. I see that modifiers then doesn't apply to constructors, so we cannot have a virtualized constructor.
What about simply testing :

if (context.virtualize && !type.constructor) {
    // Prevent creation of overloads, that are not supported by the generated C++ proxy class.
    break;
}

Or, probably cleaner, ignore info.virtualize for constructors:

        context.virtualize = (context.virtualize && type.virtual) || (info != null && info.virtualize && !type.constructor);

@saudet
Copy link
Member

saudet commented Dec 30, 2023

I don't think so since, in case of virtualized non-constructor functions, @Virtual is added to modifiers variable and is not in type.annotations. I see that modifiers then doesn't apply to constructors, so we cannot have a virtualized constructor.

Ok, so let's add a check for modifiers then. My understanding is that we can't have overloads only for the @Virtual methods, regardless of whether they are constructors or not. Are you saying you tried that, and it didn't work?

@HGuillemet
Copy link
Contributor Author

Checking modifier should work, right.

But I'm realizing that @Virtual just doesn't apply to constructor:

@Target({ElementType.METHOD})
public @interface Virtual
Test.java:21: error: annotation type not applicable to this kind of declaration
    public @Virtual Test(int x) {}

So we might as well enforce this in Parser and ignore virtualize for constructors.

@saudet
Copy link
Member

saudet commented Dec 30, 2023

Right, I guess, but your context.virtualize = ... line above doesn't work when type.virtual is set, no?

BTW, shouldn't we have modifiers = "@Virtual + modifiers"; instead of overwriting everything with just modifiers = "@Virtual";?

@HGuillemet
Copy link
Contributor Author

Right, I guess, but your context.virtualize = ... line above doesn't work when type.virtual is set, no?

type.virtual is true when we encounter the C++ keyword virtual. Which never happens for constructors. What do you mean ?

BTW, shouldn't we have modifiers = "@Virtual + modifiers"; instead of overwriting everything with just modifiers = "@Virtual";?

The public native default value for modifiers that we overwrite is replaced 6 lines below:

                modifiers += (context.inaccessible ? "protected native " : "public native ");

so it works in the general case.
modifiers is set to another value above in the case of friend and of static/global functions,
but in those case virtualize makes no sense.
We could throw an exception if such case is detected ?

@saudet
Copy link
Member

saudet commented Dec 30, 2023

type.virtual is true when we encounter the C++ keyword virtual. Which never happens for constructors. What do you mean ?

Yes, you're right, so that works. Let's try that

so it works in the general case. modifiers is set to another value above in the case of friend and of static/global functions, but in those case virtualize makes no sense. We could throw an exception if such case is detected ?

I see, so that's OK too, no need to modify that for now.

@HGuillemet
Copy link
Contributor Author

Could you comment about this too:

Note that .javaNames("XXX") is necessary because in DeclarationList.add we require a Java name to be present in the info to trigger the template instantiation. This doesn't make sense for constructors. Maybe should we also accept info with define, and javaText. Or maybe remove any constraint on what is in the info (not tested).

I think we should at least trigger the template instantiation if javaText is present, and not only javaName.

@HGuillemet
Copy link
Contributor Author

type.virtual is true when we encounter the C++ keyword virtual. Which never happens for constructors. What do you mean ?

Yes, you're right, so that works. Let's try that

Pushed and tested: now bullet is unchanged and no new changes are introduced.

@saudet
Copy link
Member

saudet commented Dec 30, 2023

Note that .javaNames("XXX") is necessary because in DeclarationList.add we require a Java name to be present in the info to trigger the template instantiation. This doesn't make sense for constructors. Maybe should we also accept info with define, and javaText. Or maybe remove any constraint on what is in the info (not tested).

I think we should at least trigger the template instantiation if javaText is present, and not only javaName.

javaText replaces everything, I don't understand what effect that should have? In any case, we could add a check for define , sure, that should be fine. That's kind of what it's used for macros. I just didn't do that for templates because usually it's hard to come up with a nice name automatically.

Pushed and tested: now bullet is unchanged and no new changes are introduced.

👍 Does this PR supersede pull #730?

@saudet
Copy link
Member

saudet commented Dec 30, 2023

Yes, this does appear to supersede pull #730.

@benshiffman Let me know if this is good to merge for you! Thanks

@HGuillemet
Copy link
Contributor Author

Note that .javaNames("XXX") is necessary because in DeclarationList.add we require a Java name to be present in the info to trigger the template instantiation. This doesn't make sense for constructors. Maybe should we also accept info with define, and javaText. Or maybe remove any constraint on what is in the info (not tested).

I think we should at least trigger the template instantiation if javaText is present, and not only javaName.

javaText replaces everything, I don't understand what effect that should have?

We might want to instantiate a function template, and set the Java text of this instance explicitly, not only the Java name. That's what @benshiffman tried to do here.

In any case, we could add a check for define , sure, that should be fine.

Ok, I'll add this to the PR. The main question is whether we need to test anything about the info or just test for its presence. I guess I'll have to try out and see.

That's kind of what it's used for macros. I just didn't do that for templates because usually it's hard to come up with a nice name automatically.

Right, but there are 2 cases where java name is not meaningful : constructors and function templates with full info (see XXX in top post).

Does this PR supersede pull #730?

Yes, the line fixed in PR #730 doesn't exist any more, it has been replaced by the use of Templates.

@saudet
Copy link
Member

saudet commented Dec 30, 2023

We might want to instantiate a function template, and set the Java text of this instance explicitly, not only the Java name. That's what @benshiffman tried to do here.

You mean the issue #729 (comment) bit about ArrayInt? Exactly how is it supposed to know to name the instance "ArrayInt" without any javaNames?

Right, but there are 2 cases where java name is not meaningful : constructors and function templates with full info (see XXX in top post).

I get it for constructors, but not for the other use cases. Are you saying that TemplatedClass<double>::TemplatedClass<int> matches with TemplatedClass<int>? That doesn't sound right

@benshiffman
Copy link

We might want to instantiate a function template, and set the Java text of this instance explicitly, not only the Java name. That's what @benshiffman tried to do here.

In any case, we could add a check for define , sure, that should be fine.

Ok, I'll add this to the PR. The main question is whether we need to test anything about the info or just test for its presence. I guess I'll have to try out and see.

The need for this functionality might come up in the project I’m working on but I think the current interaction with templates in this PR is great as it is.

@HGuillemet
Copy link
Contributor Author

We might want to instantiate a function template, and set the Java text of this instance explicitly, not only the Java name. That's what @benshiffman tried to do here.

You mean the issue #729 (comment) bit about ArrayInt? Exactly how is it supposed to know to name the instance "ArrayInt" without any javaNames?

javaName is not needed if we have a javaText to generate the template instance. But I'm realizing that it is needed if the type is referenced elsewhere, for example if another function returning a value which type is this template instance.
So ok, we can require a javaName in this case.

Right, but there are 2 cases where java name is not meaningful : constructors and function templates with full info (see XXX in top post).

I get it for constructors, but not for the other use cases. Are you saying that TemplatedClass<double>::TemplatedClass<int> matches with TemplatedClass<int>? That doesn't sound right

No.
Given this function template:

template <typename U> void f(U);
template <typename U> void f(U, int);

We want to instantiate the first overload, using the fullname with parameters.
DeclarationList.add will look for instances with this info query: f<U>(U).
To match this query, we need an info like f<int>(U).
An info like f<int>(int) won't match.
Then, when the parser generates the template instance in Parser.function, it queries: f<int>(int) to get the javaName and other informations. The existing info f<int>(U) won't match. That's why we need 2 info, and the javaName is meaningful for the second one only:

infoMap.put(new Info("f<int>(U)").javaNames("XXX"))
       .put(new Info("f<int>(int)").javaNames("fInt"));

If we want to avoid the double info, we need Infomap.normalize to "unparameter", like it "untemplates" and "unconst", so that f<int>(int) matches with f<U>(U). But I think it'll break too much and it's probably not worth for this corner case.

@saudet
Copy link
Member

saudet commented Dec 31, 2023

I see, but in that case infoMap.put(new Info("f<int>(U)", "f<int>(int)").javaNames("fInt")); should work so, that's alright, no?

@HGuillemet
Copy link
Contributor Author

Indeed.
So I suggest to just check for the presence of javaName or define.
define could be used for constructors but also for methods where the default name is ok because the template arg is a function parameter and we can overload using the same java name (typically for operators).

@saudet
Copy link
Member

saudet commented Dec 31, 2023 via email

@saudet saudet merged commit a15c408 into bytedeco:master Jan 1, 2024
11 checks passed
@HGuillemet HGuillemet deleted the function_template branch January 1, 2024 09:35
@benshiffman
Copy link

Hi all, having great success with these changes despite one issue that I'm currently trying to isolate. Templated operator overloads with a std::vector<U> as the right-hand parameter are not mapping. This does not affect non-templated overloads. Will update with more info when I have it.

@HGuillemet
Copy link
Contributor Author

HGuillemet commented Jan 4, 2024

Any news about the remaining issue ?

Could you try PR #733 with your code ? Some commits may impact you.

@benshiffman
Copy link

@HGuillemet As far as I can tell there are no changes in functionality with the new code. Using std::vector<U> is not super high on my priority list so it may be a few days before I can keep cracking at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants