[Feature Request] Add support for custom attachments of type "Picture (Metafile)" [CLSID: 00000315-0000-0000-C000-000000000046] #425

Closed
ragebear00 opened this issue Aug 1, 2024 · 16 comments

Comments

@ragebear00

Bug Metadata

  • Version of extract_msg: [0.48.7]
  • Your python version: Python [3.10]
  • How did you launch extract_msg?

BTW, can I get an attachment count before saving, without counting embedded images as attachments?

def msgAttachment(filename):
    msg = extract_msg.openMsg(filename)  # <---- error here
    try:
        msg.saveAttachments(customPath=r'C:\_a', extractEmbedded=True)
        msg.close()
        return(1)
    except Exception as e:
        print("=================================Error Extract_Msg Attachment==================")

Traceback (most recent call last):
  File "C:\T\ServerWarez\Python\Loogle\WalkerAttach.py", line 237, in <module>
    attach_walk(MainFolderArchive+"\\")
  File "C:\T\ServerWarez\Python\Loogle\WalkerAttach.py", line 84, in attach_walk
    msgAttachment(filename)
  File "C:\T\ServerWarez\Python\Loogle\Extract_Msg.py", line 32, in msgAttachment
    msg = extract_msg.openMsg(filename) #filename including path
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\open_msg.py", line 124, in openMsg
    return Message(path, **kwargs)
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\msg_classes\message_base.py", line 82, in __init__
    super().__init__(path, **kwargs)
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\msg_classes\msg.py", line 221, in __init__
    self.attachments
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\functools.py", line 981, in __get__
    val = self.func(instance)
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\msg_classes\msg.py", line 862, in attachments
    attachments.append(self.initAttachmentFunc(self, attachmentDir))
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\attachments\__init__.py", line 106, in initStandardAttachment
    return CustomAttachment(msg, dir_, propStore)
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\attachments\custom_att.py", line 40, in __init__
    self._customHandler = getHandler(self)
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\attachments\custom_att_handler\__init__.py", line 80, in getHandler
    raise FeatureNotImplemented(f'No valid handler could be found for the attachment. Contact the developers for help. If the CLSID is not all zeros, include it in the title or message. (CLSID: {attachment.clsid})')
extract_msg.exceptions.FeatureNotImplemented: No valid handler could be found for the attachment. Contact the developers for help. If the CLSID is not all zeros, include it in the title or message. (CLSID: 00000315-0000-0000-C000-000000000046)

@TheElementalOfDestruction changed the title from "msg open error" to "[Feature Request] Add support for custom attachments of type "Picture (Metafile)" [CLSID: 00000315-0000-0000-C000-000000000046]" on Aug 1, 2024
@TheElementalOfDestruction
Collaborator

Not a bug, this is intended.

This issue is caused by an unsupported type of custom attachment. I've never seen this type of custom attachment, so if you can send me the file I can work on implementing it. If you can't, we'll have to exchange information back and forth until I get some code that works on your file.

You can still open the file for the most part by adding the following argument to openMsg: errorBehavior = extract_msg.enums.ErrorBehavior.ATTACH_NOT_IMPLEMENTED

There is no easy way to detect if an attachment is actually embedded or not, so there is no easy way to do an attachment count that excludes them. Best I can offer is something like this to count them and hope the number is accurate:

count = sum(not att.hidden for att in msg.attachments)
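
For anyone who wants to reuse that count, here is a minimal sketch that wraps it in a helper. `count_visible` is a hypothetical name, not part of extract_msg; it only assumes each attachment object exposes a boolean `hidden` attribute, as the one-liner above does. The stand-in objects are made up for illustration, and the result is still a heuristic rather than a reliable embedded-image detector.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def count_visible(attachments: Iterable[Any]) -> int:
    """Count attachments whose `hidden` attribute is falsy.

    Objects without a `hidden` attribute are treated as visible.
    """
    return sum(1 for att in attachments if not getattr(att, 'hidden', False))


# Stand-in objects for illustration only; with extract_msg you would pass
# msg.attachments instead.
sample = [
    SimpleNamespace(name='report.pdf', hidden=False),
    SimpleNamespace(name='image001.png', hidden=True),
    SimpleNamespace(name='notes.txt', hidden=False),
]
print(count_visible(sample))
```

With a real file, the call would be `count_visible(msg.attachments)` after opening the message with the `errorBehavior` argument mentioned above.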

@ragebear00
Author

Thanks, brilliant. All of the above works.

As to the Picture (Metafile), the msg includes information I cannot post publicly. If you would like to test your code with this msg, I am more than happy to assist.

@TheElementalOfDestruction
Collaborator

If you can share it over email (in which case it will be tested against but never made public) you can send it to arceusthe@gmail.com

Otherwise, please run the following code on the file and let me know the results, thank you.

import extract_msg

from extract_msg.enums import ErrorBehavior
from extract_msg.attachments.unsupported_att import UnsupportedAttachment

# Open the msg file and ignore errors. I can easily find the attachment this way.
with extract_msg.openMsg(filename, errorBehavior = ErrorBehavior.ATTACH_NOT_IMPLEMENTED) as msg:
    att = next((att for att in msg.attachments if isinstance(att, UnsupportedAttachment)), None)
    if att is None:
        # What happened here???
        print('Failed to find custom attachment, wtf?')
    else:
        # Run tests on the attachment.
        print(att.exists('__substg1.0_3701000D/CONTENTS'))
        print(att.exists('__substg1.0_3701000D/\x01Ole'))
        print(att.exists('__substg1.0_3701000D/\x03MailStream'))
        # Those were just quick validation checks. Now I want as much detail about that special folder as possible.
        for path in att.listDir():
            # Print every file name in the custom attachment data. This should not contain anything sensitive since it should just be part of an unknown standard.
            if path[0] == '__substg1.0_3701000D':
                print(path)

@ragebear00
Author

ragebear00 commented Aug 1, 2024

Here you go! Unfortunately, the msg includes over 170 email addresses.

True
True
True
['__substg1.0_3701000D', '\x01Ole']
['__substg1.0_3701000D', '\x03MailStream']
['__substg1.0_3701000D', 'CONTENTS']

@TheElementalOfDestruction
Collaborator

Excellent, it looks like it should be similar to the already existing Outlook DIB image implementation. My guess is that there may just be slight differences and that the contents stream will be a Windows Metafile as described in [MS-WMF].

The following code is used to test that.

import extract_msg

from extract_msg.enums import ErrorBehavior
from extract_msg.attachments.unsupported_att import UnsupportedAttachment

# Open the msg file and ignore errors. I can easily find the attachment this way.
with extract_msg.openMsg(filename, errorBehavior = ErrorBehavior.ATTACH_NOT_IMPLEMENTED) as msg:
    att = next((att for att in msg.attachments if isinstance(att, UnsupportedAttachment)), None)
    if att is None:
        # What happened here???
        print('Failed to find custom attachment, wtf?')
    else:
        # Run tests on the attachment.
        # Don't want to reveal the actual contents, so just check to see if we can find a WMF header that will identify the file type.
        contents = att.getStream('__substg1.0_3701000D/CONTENTS')
        header = contents[:4]
        # Check for placeable WMF record ([MS-WMF] section 2.3.2.3)
        if header == b'\xD7\xCD\xC6\x9A':
            print(True)
        else:
            print(False)
            # Not a placeable one, so check if it matches one of the header types for normal WMF files.
            if header[:2] in (b'\x01\x00', b'\x02\x00'):
                print(True)
                # Check for the version.
                print(contents[4:6] == b'\x00\x01')
                print(contents[4:6] == b'\x00\x03')
            else:
                print(False)
                
        # I just want to know how much data is in these, not what the data is.
        print(len(att.getStream('__substg1.0_3701000D/\x01Ole')))
        print(len(att.getStream('__substg1.0_3701000D/\x03MailStream')))
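
The header checks above can also be sketched as a standalone function that works on raw bytes, with no msg file needed. This is only an illustration: `classify_wmf` and `PLACEABLE_KEY` are hypothetical names, and unlike the script above, this sketch requires both a known type and a known version before it reports a standard header.

```python
import struct

# META_PLACEABLE key from [MS-WMF]; stored on disk as bytes D7 CD C6 9A.
PLACEABLE_KEY = 0x9AC6CDD7


def classify_wmf(data: bytes) -> str:
    """Return 'placeable', 'standard', or 'unknown' for the given bytes."""
    if len(data) >= 4 and struct.unpack_from('<I', data)[0] == PLACEABLE_KEY:
        return 'placeable'
    if len(data) >= 6:
        # A standard header starts with Type, HeaderSize, Version,
        # each a 16-bit little-endian integer.
        file_type, _header_size, version = struct.unpack_from('<HHH', data)
        if file_type in (0x0001, 0x0002) and version in (0x0100, 0x0300):
            return 'standard'
    return 'unknown'


print(classify_wmf(b'\xD7\xCD\xC6\x9A' + b'\x00' * 18))
print(classify_wmf(b'\x01\x00\x09\x00\x00\x03'))
print(classify_wmf(b'BM\x00\x00\x00\x00'))
```

Reading the fields as little-endian integers keeps the comparison values in the same form the specification uses (0x0100, 0x0300) instead of hand-reversed byte strings.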

PS: I have received your email, but unfortunately it is very common for forwarded emails or msg files sent as direct attachments to be mangled away from their original form, so it was not usable. Thanks for trying.

@ragebear00
Author

No problem at all, glad to help make this app better! The result is

True
20
12

@TheElementalOfDestruction
Collaborator

Perfect, the two other streams have the lengths I expect, and the data has come back as definitely being from the [MS-WMF] standard, so I can work on trying to implement something for it.

@TheElementalOfDestruction
Collaborator

Alright, I've added some support in the next-release branch now. Please install the version from there and see if it works for that file.

@ragebear00
Author

I downloaded the next-release branch and copied it to
C:\Users\z\AppData\Local\Programs\Python\Python310\Lib\site-packages\extract_msg

Code to run:
msg = extract_msg.openMsg(filename, errorBehavior = extract_msg.enums.ErrorBehavior.SUPPRESS_ALL)  # without SUPPRESS_ALL, the msg cannot be opened

print(msg.body)  # works
print(msg.to)  # works
print(sum(not att.hidden for att in msg.attachments))  # 7, works
msg.saveAttachments(customPath=r'C:\_a')  # error below
msg.close()

Traceback (most recent call last):
  File "C:\T\Extract_Msg.py", line 103, in <module>
    print(ErrorCheck(r"C:\_a\1 - Copy.msg"))
  File "C:\T\Extract_Msg.py", line 97, in ErrorCheck
    msg.saveAttachments(customPath=r'C:\_a')
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\msg_classes\msg.py", line 808, in saveAttachments
    attachment.save(skipHidden = skipHidden, **kwargs)
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\attachments\unsupported_att.py", line 29, in save
    raise NotImplementedError('Unsupported attachments cannot be saved.')
NotImplementedError: Unsupported attachments cannot be saved.

@TheElementalOfDestruction
Collaborator

I see the problem: I forgot to add the imports to register the custom attachment handler. Try now.

@ragebear00
Author

Traceback (most recent call last):
  File "C:\T\Extract_Msg.py", line 1, in <module>
    import extract_msg
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\__init__.py", line 65, in <module>
    from . import attachments, msg_classes, null_date, properties, structures
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\attachments\__init__.py", line 28, in <module>
    from . import custom_att_handler
  File "C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\attachments\custom_att_handler\__init__.py", line 59, in <module>
    from .outlook_image_meta import OutlookImageMetafilr
ImportError: cannot import name 'OutlookImageMetafilr' from 'extract_msg.attachments.custom_att_handler.outlook_image_meta' (C:\Users\z\AppData\Local\Programs\Python\Python310\lib\site-packages\extract_msg\attachments\custom_att_handler\outlook_image_meta.py)

@TheElementalOfDestruction
Collaborator

I made a typo since I did this from my phone. It's fixed now and hopefully it works as expected.

@ragebear00
Author

It works. Attachments are saved.

However, the files were saved with a BMP extension and cannot be opened.

They are actually WMF files. Once the extension is changed, the files open correctly.
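
In case anyone else hits the same thing on a build with the wrong extension, the manual rename can be scripted. This is only a rough sketch: `fix_wmf_extensions` is a hypothetical helper, not part of extract_msg, and it assumes the saved files start with the placeable WMF key (D7 CD C6 9A) that the header test earlier in this thread reported for my file.

```python
from pathlib import Path

# Placeable WMF key as it appears on disk (assumption: the saved files
# begin with it, as the header check in this thread indicated).
WMF_PLACEABLE_MAGIC = b'\xD7\xCD\xC6\x9A'


def fix_wmf_extensions(folder: Path) -> list[Path]:
    """Rename *.bmp files that start with the placeable WMF key to *.wmf."""
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(folder.glob('*.bmp')):
        with path.open('rb') as f:
            magic = f.read(4)
        if magic == WMF_PLACEABLE_MAGIC:
            target = path.with_suffix('.wmf')
            if not target.exists():  # never overwrite an existing file
                path.rename(target)
                renamed.append(target)
    return renamed
```

Calling `fix_wmf_extensions(Path(r'C:\_a'))` would rename only the mislabeled files and leave real bitmaps, which start with `BM`, untouched.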

@TheElementalOfDestruction
Collaborator

I've fixed the extension, so that should no longer be a problem. I'll notify you when the release that contains this feature is out.

@ragebear00
Author

Congratulations! Let me know if you need another test regarding the attachment extension.

@TheElementalOfDestruction
Collaborator

Alright, a basic implementation has been added in the newest released version. I'm going to leave this open while I work on a full implementation for this and DIB images.
