Safe support for architecture independent serialization #36

Open
mgsloan opened this issue May 31, 2016 · 3 comments

@mgsloan

mgsloan commented May 31, 2016

As described by Chris Done, here, we've gotten a lot of mileage out of using store on existing binary formats. However, that only works well when we can assume a machine architecture (endianness, alignment, etc.).

See #31 for ideas on how to support declaring machine independent serialization. The focus of this issue is on making it safe. Ideally, store would allow you to declare architecture independent serializations in such a way that you have static checking that your serialization is architecture independent. Here's the best scheme I can come up with for this:

  1. Choose the behavior on x86_64 machines to be the canonical serialization behavior. So, machine representations will be used on the most common platform.

  2. Add new class(es) and types for machine independent serialization. It just occurred to me to name these after the binary types (see "Rename Peek and Poke to binary / cereal naming", #35). This makes sense, as these are good names for machine independent serialization, and will make it easier to port code from binary / cereal.

class StoreCrossArch a where
    putSize :: Size a
    put :: a -> Put ()
    get :: Get a

newtype Put a = Put (Poke a)
newtype Get a = Get (Peek a)

(probably split up - #21)

  3. On little endian machines with no alignment issues, StoreCrossArch should get implemented and Store defined in terms of it (on a per instance basis). This way the serialization performance should be the same on x86 whether or not you are using the machine independent interface.

  4. On big endian machines, Store would be implemented separately from StoreCrossArch. The StoreCrossArch instances would handle endianness flipping. So, put :: Int -> Put () would use a little endian int even though we're on a big endian architecture. poke :: Int -> Poke () would use a big endian int, as that's the machine representation.
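
To make the put / poke distinction concrete, here is a minimal sketch that uses plain byte lists instead of store's real Poke machinery. The names littleEndianBytes, putBytes and pokeBytes are hypothetical stand-ins, not part of store, and it assumes a base recent enough to ship GHC.ByteOrder (4.11 or later):

```haskell
import Data.Bits (shiftR)
import Data.Word (Word8, Word32)
import GHC.ByteOrder (ByteOrder (..), targetByteOrder)

-- Bytes of a Word32, least significant byte first. This is the
-- canonical (x86_64-style) order chosen in step 1 of the scheme.
littleEndianBytes :: Word32 -> [Word8]
littleEndianBytes w = [fromIntegral (w `shiftR` (8 * i)) | i <- [0 .. 3]]

-- Hypothetical stand-in for the cross-architecture `put`:
-- produces the same bytes on every host.
putBytes :: Word32 -> [Word8]
putBytes = littleEndianBytes

-- Hypothetical stand-in for the machine-representation `poke`:
-- follows whatever byte order the host uses.
pokeBytes :: Word32 -> [Word8]
pokeBytes w = case targetByteOrder of
  LittleEndian -> littleEndianBytes w
  BigEndian -> reverse (littleEndianBytes w)
```

On a little endian host the two functions agree, which is the property that lets Store be defined in terms of StoreCrossArch there. On a big endian host they differ by a byte swap.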

Not sure if this will be implemented in the immediate future, but it's rather fun to think about. PRs appreciated!

@mgsloan

mgsloan commented May 31, 2016

One interesting consideration is how the default methods for Store should work.

newtype Put a = Put { unPut :: Poke a }
newtype Get a = Get { unGet :: Peek a }

class StoreCrossArch a where
    putSize :: Size a
    put :: a -> Put ()
    get :: Get a

    default putSize :: (Generic a, GStorePutSize (Rep a)) => Size a
    putSize = genericPutSize
    {-# INLINE putSize #-}

    default put :: (Generic a, GStorePut (Rep a)) => a -> Put ()
    put = genericPut
    {-# INLINE put #-}

    default get :: (Generic a , GStoreGet (Rep a)) => Get a
    get = genericGet
    {-# INLINE get #-}

class Store a where
    size :: Size a
    poke :: a -> Poke ()
    peek :: Peek a

#if BIG_ENDIAN   -- Or whatever
    default size :: (Generic a, GStoreSize (Rep a)) => Size a
    size = genericSize
    {-# INLINE size #-}

    default poke :: (Generic a, GStorePoke (Rep a)) => a -> Poke ()
    poke = genericPoke
    {-# INLINE poke #-}

    default peek :: (Generic a , GStorePeek (Rep a)) => Peek a
    peek = genericPeek
    {-# INLINE peek #-}
#else
    default size :: StoreCrossArch a => Size a
    size = putSize
    {-# INLINE size #-}

    default poke :: StoreCrossArch a => a -> Poke ()
    poke = unPut . put
    {-# INLINE poke #-}

    default peek :: StoreCrossArch a => Peek a
    peek = unGet get
    {-# INLINE peek #-}
#endif

This would allow convenient and efficient generics. It'd encourage the convention that "on little endian, machine representation is used". This does seem a little bit too magical, though. In particular, there is the danger that a user on a big endian machine might derive a Store instance and not realize they also need to derive a StoreCrossArch instance.

It also may not be as efficient in the event that the convention that "on little endian, machine representation is used for StoreCrossArch" isn't followed.
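
For anyone who wants to play with the little endian branch of these defaults in isolation, here is a self-contained toy version. It is only a sketch: Poke and Put are simplified to byte-list wrappers (store's real types write to memory), the size and peek methods are left out, and the Word16 instance is just an illustration.

```haskell
{-# LANGUAGE DefaultSignatures #-}

import Data.Bits (shiftR)
import Data.Word (Word16, Word8)

-- Toy stand-ins for store's Poke and the proposed Put wrapper.
-- Here a "poke" is simply the list of bytes it would write.
newtype Poke = Poke { runPoke :: [Word8] }
newtype Put = Put { unPut :: Poke }

-- Architecture independent encoding (always little endian).
class StoreCrossArch a where
  put :: a -> Put

-- Machine representation encoding.
class Store a where
  poke :: a -> Poke
  -- Default for little endian hosts: reuse the cross-arch encoding,
  -- so both interfaces produce identical bytes at identical cost.
  default poke :: StoreCrossArch a => a -> Poke
  poke = unPut . put

instance StoreCrossArch Word16 where
  -- Low byte first, then high byte.
  put w = Put (Poke [fromIntegral w, fromIntegral (w `shiftR` 8)])

-- No methods given: the instance picks up the default above.
instance Store Word16
```

With this in scope, runPoke (poke (0x0102 :: Word16)) goes through put. The empty Store instance only compiles because a StoreCrossArch instance exists, which is the "defined in terms of it" relationship from the original proposal.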

@EarthCitizen

Is there any update on this issue?

@mgsloan

mgsloan commented Jun 26, 2017

@EarthCitizen Nope, same status. Haven't yet needed this.
