Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[question] Porting from boost multiprecision #234

Closed
illera88 opened this issue Feb 24, 2022 · 26 comments
Closed

[question] Porting from boost multiprecision #234

illera88 opened this issue Feb 24, 2022 · 26 comments
Assignees
Labels
question Further information is requested

Comments

@illera88
Copy link

Hi,

I'm thinking on creating a branch to move away from boost in Triton. I have multiple questions:

  • Casting one type to another should be as easy to use static_cast<T>(value) right? i.e.
this->memory[addr+i] = (cv & 0xff).convert_to<triton::uint8>(); // old
this->memory[addr+i] = static_cast<triton::uint8>((cv & 0xff)); //new
  • How can I cast a number to string? Is the following correct?
math::wide_integer::uint512_t value = <whatever>;           
std::stringstream ss;
ss << value;
std::string strValue(ss.str());
  • How can I cast from a std::string to a math::wide_integer::uint512_t?
  • How can I create a wide integer whose size is 80 bits? This is the code I want to replicate. I've tried the follwing:
typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(80)), std::uint32_t> uint80;

but I get:

error C2338:  Error: Width2 must be 2^n times 1...63 (with n >= 3), while being 16, 24, 32 or larger, and exactly divisible by limb count

because of:

// Verify that the Width2 template parameter (mirrored with my_width2):
    //   * Is equal to 2^n times 1...63.
    //   * And that there are at least 16, 24 or 32 binary digits, or more.
    //   * And that the number of binary digits is an exact multiple of the number of limbs.
    static_assert(   (detail::verify_power_of_two_times_granularity_one_sixty_fourth<my_width2>::conditional_value)
                  && (my_width2 >= 16U) // NOLINT(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers)
                  && (my_width2 == (number_of_limbs * static_cast<size_t>(std::numeric_limits<limb_type>::digits))),
                  "Error: Width2 must be 2^n times 1...63 (with n >= 3), while being 16, 24, 32 or larger, and exactly divisible by limb count");

Thank you so much for your help

@ckormanyos ckormanyos self-assigned this Feb 24, 2022
@ckormanyos ckormanyos added the question Further information is requested label Feb 24, 2022
@ckormanyos
Copy link
Owner

Thank you for your queries @illera88

creating a branch to move away from boost...

There is a bunch to say on this. I'm also co-author/co-maintainer of Boost.Multiprecision so that also has lots of strengths I see.

One thing in the oven is when Boost 1.79 rolls out this spring, Multiprecision will/is-expected-to-be standalone and essentially dependency-free (except for config and possibly math, which are also standalone). So the dependencies of boost 1.79's version of Multiprecision and following versions will become a lot more dependency-free.

Casting one type to another should be as easy to use static_cast<T>(value) right?

Yes. This is an area that I have done quite a bit of work on recently, so i'm not sure it is entirely correct. This also means if problems are found, we can discuss these. At the moment, cast to built-in integral type is explicit (as considered to be a narrowing cast. These should run through the operator here.

How can I cast a number to string?

Good question. At the moment, you do, in fact, have to string-stream it. There is, however #153 which purposes to implement to_chars for this namespace-specific type of wide-integer. This will make it a lot more standards-close in the future. But it's not yet done...

How can I cast from a std::string to a math::wide_integer::uint512_t

Oddly enough, wide-integer started as a project for bare-metal microcontrollers. No-weight was the first mandate. Nowdays folks use it for high-performance all over so it's not such a thing. But a relic of this origin is the fact that there is not a constructor from std::string. It would be a on-liner to make one, but it's not there yet. At the moment, it supports explicit construction from const char*, so using the .c_str() method on the string should give you explicit construction.

How can I create a wide integer whose size is 80 bits?

This is a weakness of wide-integer. At the moment, there is no support for fractional limbs. There is an issue (#7, which I see is a very old issue) that is expected to handle this. At ehe moment, you need to fill up the limbs. So you could synthesize uint80_t with 10 limbs of type uint8_t or 5 limbs of type uint16_t. I know, it's a shame to go 5 limbs when 2-and-a-half would/could do the trick. But that's what #7 is going for.

theoretically, we could clear up all this stuff this spring/summer, as I've opted out of one project in particular that usually keeps me busy at the time.

If you like, try the branch, see how it goes and if you hit any road blocks, please contact me any time.

Thanks for your interest.

@johnmcfarlane
Copy link
Contributor

tl;dr it might be best to avoid in-class conversion to/from other types such as std::string.

Firstly, if you mean representation of a number in human-readable form, that's not a cast, it's a conversion.

Certainly, it may not be appropriate to supply conversion via constructors or conversion operators. There are many reasons why.

E.g., the choice of std::string value is wide. Should a wide integer with value 16 result in a string of value "16" or "016" or "0x10" or "10". All of these are correct - given appropriate context. But the context isn't obvious from a cast operation alone.

Secondly - and as Chris mentions - it would be great if wide-integer remained embedded-friendly. Drawing in a type like std::string imposes various requirements - especially related to dynamic allocation and exception handling - which makes use on standalone (bare metal) platforms far more difficult.

Finally, it would be ideal to reduce coupling between types. If you think of all the different types that can be converted between, you quickly get a combinatorial explosion. For this reason, a free function (like std::to_chars or std::to_string) helps ensure that the explosion doesn't happen directly in the wide-integer header. And it makes it clear what kind of transformation is being performed. In this case, there's a base parameter for example.

Does that sound like an OK way to use uintwide_t?

@illera88
Copy link
Author

Hey guys!

First of all, thank you for your responses.

I've already adapted most of my questions but there are still two things left. Here is the branch where I'm working on migrating Triton from using Boost to wide-integers.

The two things left:

  • Can you help me out to create the 80bits unsigned int we talked about before? Here is how I am trying to create it now:
    //! unsigned 80-bits
    typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(80)), std::uint32_t> uint80;

You mentioned: So you could synthesize uint80_t with 10 limbs of type uint8_t or 5 limbs of type uint16_t Can you respond with a example code on how to create those 10 or 5 limbs?

  • How can I use the | operator with two wide-integer types? In this part of the code I'm using | on a uint32 and a uint512 but the compiler is complaining about math::wide_integer::uint512_t not having the | operator.
 triton::sint512 modularSignExtend(AbstractNode* node) {
      triton::sint512 value = 0;

      if ((node->evaluate() >> (node->getBitvectorSize()-1)) & 1) {
        value = -1;
        value = ((value << node->getBitvectorSize()) | node->evaluate()); // node->evaluate() returns a math::wide_integer::uint512_t type
      }
      else {
        value = node->evaluate();
      }

      return value;
    }

Thank you!!

@ckormanyos
Copy link
Owner

create the 80bits unsigned int

Simply replace the so-called limb type (the second template parameter) with std::uint16_t. The templates do the rest.

In your exact cast, then (using a typedef)

typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(80)), std::uint16_t> uint80;

A lot of mixed operators are not specialized for intermixing wide-integers having different limb types. So you might find other conversions needing to be explicit when you use the synthesized type uint80.

@ckormanyos
Copy link
Owner

ckormanyos commented Feb 25, 2022

How can I use the | operator with two wide-integer types?

OK. Well, it's moving right along. But a few things are not yet quite resolved...

I need to understand the use case a bit more clearly. Could you please describe more clearly what the types are in lines around here?

The subroutine seems to use a signed integer. Also, it would really help diagnosing this issue if I know exactly what the types of the results of node->evaluate() and node->getBitvectorSize() are. What are these types?

Also node->evaluate() is being called multiple times in each logic path. I wonder if it could be called once to clarify the code

ckormanyos pushed a commit that referenced this issue Feb 26, 2022
@illera88
Copy link
Author

node->evaluate() is a math::wide_integer::uint512_t
node->getBitvectorSize() is a std::uint32_t

I've tried to change:
typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(80)), std::uint32_t> uint80; to
typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(80)), std::uint16_t> uint80; as you mentioned. But now I'm getting errors in other parts of the code since I can't convert from a uint80 to uint512 being:

typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(80)), std::uint16_t> uint80;
typedef math::wide_integer::uint512_t uint512;

Any thoughts?

@ckormanyos
Copy link
Owner

getting errors in other parts of the code since I can't convert from a uint80 to uint512

Any thoughts?

Yes. I have numerous thoughts on this phenomenon.

Consider what I had written in a previous thread entry above:

mixed operators are not specialized for intermixing wide-integers having different limb types. So you might find other conversions needing to be explicit when you use the synthesized type uint80

This, combined with your empirical observations, means we have reached a point where wide-integer is not at the moment a one-to-one seamless drop in replacement for what you have or need in your project.

There are several ways to proceed.

  1. You can write the needed conversions in a project-specific way. This might get you a working model fastest.
  2. You could consider using uint96 and masking out the 16 bits not needed for uint80. But I fear this would change too much of your internal code.
  3. You could write a project-specific uint80 based on wide-integer's uint96 (synthesized from 3 uint32_ts). This is similar to 2 above.
  4. In wide-integer, wait for support of interconversion/construction between types having different limb types.
  5. In wide-integer, wait for support of sub-limb bit resolutions, as mentioned both above as well as in Relax most constraints on binary digit count #7.

I realize that none of these solutions is a perfect fit. But I fear that either 4 or 5 might really take a while. Wide-integer has settled down quite a bit recently after a big year of change last year. In fact, last year, we handled signed-ness and constexpr-ness. I would need to think a bit before agreeing to 4 or 5 because most of the heavy wide-integer-crunchers are silent (and hopefully happy with it) and a bit of peace has moved into the project.

@ckormanyos
Copy link
Owner

I guess there is a final option. You could walk every integer type down to 16-bit internal limb type. But that hits performance rather hard.

@johnmcfarlane
Copy link
Contributor

johnmcfarlane commented Feb 26, 2022 via email

@ckormanyos
Copy link
Owner

Would a 12-byte type with 80-bit semantics suffice?

Exactly! Indeed, I was thinking a 96-bit (12-byte) type composed of 3 internal uint32_t would be a rather close fit. But I am not exactly sure how the OP is using the 80-bit type.

@johnmcfarlane
Copy link
Contributor

johnmcfarlane commented Feb 26, 2022 via email

@ckormanyos
Copy link
Owner

if uintwide can support 3 x .uint32

Maybe I'm missing the point of this comment, but yes, of course. This might not be quite clear, but uintwide_t can already be instantiated for 96 bits with an internal 32-bit limb size.

@ckormanyos
Copy link
Owner

Maybe the docs are unclear (and I should clear them up). But as long as the bits fit entirely within the limbs, then you can instantiate uintwide_t.

using uint96_t = math::wide_integer::uintwide_t<96U>;

gives the above-mentioned 96-bit unsigned integer composed of three 32-bit limbs, where the default limb type of std::uint32_t is used.

@johnmcfarlane
Copy link
Contributor

A clearer description would be appreciated, but understood, efficient 96-bit type is supported.

However, a type named uint96_t implies a fundamental 96-bit type, not a class-based type. Probably best to avoid using a name like those in .

@illera88
Copy link
Author

illera88 commented Feb 27, 2022

Hi guys

Thank you for your explanation. It makes more sense now to me. Out of the 5 or 6 options you proposed giving that I can't for now count with the 4th and 5th option, I think the more straightforward one is the 1st one (You can write the needed conversions in a project-specific way).

To do so I would need to write:

typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(80)), std::uint16_t> uint80;
typedef math::wide_integer::uintwide_t<static_cast<size_t>(UINT32_C(  512)), std::uint32_t> uint512;

uint80 convert_to_uint80(uint512 value) {
//impl
}

uint512 convert_to_uint512 (uint80 value) {
// impl
}

Thankfully those are the only two types that I need to convert between. Do you have some code I can inspire from to fill those functions or can you help me out?

Regarding the | operator, the response from what you asked:
node->evaluate() is a math::wide_integer::uint512_t
node->getBitvectorSize() is a std::uint32_t

thank you guys

@ckormanyos
Copy link
Owner

clearer description would be appreciated, ...

I will add some lines of clarification near the top of the MD.

However, a type named uint96_t implies a fundamental 96-bit type, not a class-based type.

We discussed this previously, though I do not exactly remember where. In fact, the use of the _t permeates uintwide_t's code, its tests and the examples. It would be a larger operation to reduce/remove these standards-reserved type names.

@johnmcfarlane now that we discuss this yet again, and I'm not really sure of the exact syntax in the standard, but is the use-of/reserved-nature-of the _t suffix found in a clear location in the standard? or is it generally accepted?

@ckormanyos
Copy link
Owner

have some code I can inspire from to fill those functions or can you help me out?

Hi @illera88 you need to use the representation()/crepresentation() member(s) of the class to access the internal data array. Then split the limbs or extend them, dpending on the conversion direction. Along the way, be sure to handle the different array-sizes of the representations 80-bit/512-bit.

I will be happy to draft out some code for those two conversions. Please give me a day or two for that. OK?

@illera88
Copy link
Author

Sure thing. I'll wait to check your code.

Thank you

@johnmcfarlane
Copy link
Contributor

@johnmcfarlane now that we discuss this yet again, and I'm not really sure of the exact syntax in the standard, but is the use-of/reserved-nature-of the _t suffix found in a clear location in the standard? or is it generally accepted?

Specifically, I'm making a suggestion WRT Triton. Admittedly I have very little knowledge of that library so please don't feel the need to act on it. :)

However, yes, we already discussed this WRT wide-integer. To recap/update my thoughts...

The _t used to be reserved - 'in spirit' at least - for aliases used by standards such as C, C++ and POSIX libraries. However, I really don't think it's a problem to use this consistently to highlight the different between a non-alias (e.g. short) and an aliases (e.g. int16_t). Also, there's a convention of aliasing type traits, for example):

template<class T, int Digits>
using set_digits_t = typename set_digits<T, Digits>::type;

I was going to add that aliases in the form std::[u]intn_t tend to be for aliases of fundamental types only. However, looking at P1889R1, that doesn't appear to be the case! There are still aliases to wide integers in there.

Personally, I'd advocate to revise that paper and and reserve those aliases for fundamentals. They already mean 'fundamental' to users and while wide-integer is a great way to break the width barrier, that doesn't mean the user should be unaware/unsure what kind of type is used to do that. They may depend on qualities of the type (e.g. trivially-copyable), which we're not sure we want wide-integer to grant.

In particular, what do you do about int128_t, which is already defined by some implementations on some platforms? If std::int128_t were an alias for a fundamental type on 64-bit GCC or Clang, but was also an alias for a class-based wide-integer object on 32-bit GCC/Clang and on MSVC, that could pose problems - and certainly surprise - for users.

I personally believe it should be an opt-in for the user to decide: I want a machine-efficient integer and I'm OK if that requires a library feature. E.g.:

namespace acme { // could be std some day, who knows?!
  template<int Width, unsigned_integer Limb = unsigned, bool IsSigned = true>
  fixed_int_t = ?;
}

But what wide-integer library chooses to do already is completely fine, and backward-compatibility is an important quality.#236 HTH, thanks, John

@ckormanyos
Copy link
Owner

ckormanyos commented Feb 27, 2022

draft out some code for those two conversions

Hi @illera88 the draft of the conversion routines can be found here.

I decided after a bit of deliberation to make the from_rep() method public and also add a move-optimized version of it. So I actually did change the uintwide_t.h header when making this draft. And you need the modified header for the conversions written in this way. So you'll need to update your uintwide_t.h from the main branch.

In the conversion subroutines, I also did some modernization via aliases and trailing return type. This is mostly because i have a bunch of syntax checkers running for these checks. Traditional style backport will, of course, also work fine if you prefer typedef, etc.

These conversions are very specific and apply solely and to exactly to the conversions we mentioned above (80/512). They are not generic, and they are not flexible.

If you have any problems adapting them or other roadbloks, please get back to me. Those drafts are very new code and might need more testing. Please see how it goes with them...

For the final issue of the 32-bit mixed-OR operations, we need to see what's exactly going on...

@illera88
Copy link
Author

illera88 commented Feb 28, 2022

Hey @ckormanyos

thank for the conversion. I've already added it to the code and now there is just the OR operation problem remaining.

I've reduced the problem to these few lines so you can debug it easily:

math::wide_integer::int512_t value = 0;
std::uint32_t to_shift = 11111;
math::wide_integer::uint512_t value2 = 31337; // node->evaluate()
      
math::wide_integer::int512_t shift = (value << to_shift); 
value = shift | value2; // math::wide_integer::int512_t | math::wide_integer::uint512_t

This is the error I get"
error C2678: binary '|': no operator found which takes a left-hand operand of type 'math::wide_integer::uintwide_t<512,uint32_t,void,true>' (or there is no acceptable conversion)

Do you have any idea why this is not working?

@ckormanyos
Copy link
Owner

ckormanyos commented Feb 28, 2022

Do you have any idea why this is not working?

Yes. Wide-integer's casting/conversion philosophy differs slightly from that of Boost's.

Although admittedly I actually think Boost's philosophy might be better (or more right, or less wrong) than that of wide-integer. On the other hand, neither of them can actually be considered right or wrong because the behavior of user-defined types is not yet specified in the standard.

In this particular case, you need an explicit cast to either signed or unsigned 512-bit wide-integer in order to tell the compiler which operator to select. At the moment, wide-integer does not specialize global binary mathematical operations involving different signed-ness of the participants.

You would need to explicitly tell the compiler which of the available wide-integer operators to take via explicit cast or construct. For instance.

    math::wide_integer::int512_t value = 1;
    std::uint32_t to_shift = 111;
    math::wide_integer::uint512_t value2 = 31337; // node->evaluate()

    math::wide_integer::int512_t shift = (value << to_shift); 
    value = shift | static_cast<math::wide_integer::int512_t>(value2); // explicit cast of value2

Also note that I changed a few values in your sample. The value of value in your case was 0 and not very expressive. And I found left shift with 11111 to be quite a large shift, maybe too large. Anyway, the actual numbers do not change the example's point.

@ckormanyos
Copy link
Owner

ckormanyos commented Feb 28, 2022

Again, if Boost is more right or less wrong than wide-integer regarding some casts/conversions, that may be one result of these kinds of porting efforts.

Yet even if this is the case (and it might not be), I probably won't be able to (in a stable, error-free fashion) change wide-integer's behavior very quickly. So you might ultimately need to consider the cast.

ckormanyos pushed a commit that referenced this issue Mar 2, 2022
@ckormanyos
Copy link
Owner

Hi @illera88 what is the status of this issue? Open? Or can I close this issue? Thx.

@illera88
Copy link
Author

illera88 commented Apr 2, 2022

You can close it.

It was successfully integrated in Triton JonathanSalwan/Triton#1105

@illera88 illera88 closed this as completed Apr 2, 2022
@ckormanyos
Copy link
Owner

was successfully integrated in Triton

Great! Glad it works. Thank you @illera88 for pushing forward on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants