Remove "<br/> Empty media" from Telegram messages #1454

Closed
aberenguel opened this issue Dec 20, 2022 · 9 comments

@aberenguel
Contributor

aberenguel commented Dec 20, 2022

I'm using IPED 4.0.6.

I processed a UFDR file, and some Telegram messages sent by the user end with the text "<br/> Empty media".

image

@aberenguel
Contributor Author

As a workaround, how can I edit the HTML file in the case? I haven't found it.

FelipeFcosta added a commit that referenced this issue Dec 20, 2022
install older version of imagemagick in the CI build using apt
@lfcnassif
Member

lfcnassif commented Dec 20, 2022

> As a workaround, how can I edit the HTML file in the case? I haven't found it.

Well, it is not that straightforward. You would need to take the item hash and go to %case%/iped/storage/storage-X.db, where X is the decimal number corresponding to the first hex char of the hash value. Then open that SQLite DB, go to the t1 table and find the item by id (it is equal to the hash value). The item content is in the data column, but it is compressed using org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream.
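
For readers following the steps above, here is a minimal sketch of the two mechanical parts: mapping a hash to its storage file name and round-tripping the gzip-compressed content. The class name `StorageLookupSketch`, its helper names, and the sample hash prefix `A3F0` are hypothetical. It assumes the data column holds a standard gzip stream, so the JDK's `java.util.zip` classes can read and write it. Reading the row from SQLite is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of IPED.
class StorageLookupSketch {

    // Maps an item hash to its storage file name: first hex char, written as a decimal number.
    static String storageFileName(String hash) {
        int index = Character.digit(hash.charAt(0), 16); // -1 if not a hex digit
        if (index < 0) {
            throw new IllegalArgumentException("Not a hex hash: " + hash);
        }
        return "storage-" + index + ".db";
    }

    // Decompresses gzip bytes (as read from the data column) into UTF-8 text.
    static String gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Compresses UTF-8 text back to gzip bytes, ready to be written to the data column.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "A3F0" is a made-up hash prefix: 'A' is 10 in decimal.
        System.out.println(storageFileName("A3F0")); // prints "storage-10.db"
        System.out.println(gunzip(gzip("<html>hello</html>")));
    }
}
```

Back up the storage files before writing anything to them, since the case reads item content directly from these databases.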

@lfcnassif lfcnassif added the bug label Dec 20, 2022
@lfcnassif
Member

lfcnassif commented Dec 20, 2022

I'm tagging this as a bug because the <br/> tag shouldn't be printed. It possibly started being printed when we began HTML-encoding all message texts to avoid JavaScript vulnerabilities. The triggering line is in the telegram-decoder dependency:
https://github.com/sepinf-inc/telegram-decoder/blob/35d89dfb42a81de8c6a669dd9227bb64ae27276c/telegram-decoder-impl/src/main/java/telegramdecoder/DecoderTelegram.java#L206

All <br/> tags should be removed from that class. But it seems to me the "Empty media" string is intentional. @hauck-jvsh should we keep (and possibly translate) the "Empty media" string, or could it be removed?
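
To illustrate the suspected cause, here is a small sketch of what happens when text that already contains a literal `<br/>` is HTML-encoded afterwards. The `escapeHtml` function below is a hypothetical stand-in written for this example, not the encoder IPED actually uses, and the sample message text is made up.

```java
// Hypothetical illustration, not IPED or telegram-decoder code.
class HtmlEscapeSketch {

    // Minimal HTML escaping of the characters that matter for this example.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The decoder appends markup to the message text before it is encoded.
        String decoded = "some text" + "<br/> Empty media";
        // After encoding, the browser shows the tag as visible text instead of a line break.
        System.out.println(escapeHtml(decoded)); // prints "some text&lt;br/&gt; Empty media"
    }
}
```

This would also explain why the encoded form `&lt;br/&gt; Empty media` is what ends up stored in the generated chat HTML.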

@aberenguel
Contributor Author

aberenguel commented Dec 20, 2022

For anyone interested in applying the fix after the case has been processed:

package general;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.zip.GZIPOutputStream;

import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.io.FileUtils;

public class FixTelegram {

    private static String[] hashes;

    public static void main(String[] args) throws IOException {

        // One hash per line; trim and split on any line ending so "\r\n" files also work.
        hashes = FileUtils.readFileToString(new File("case/chats-telegram-hashes.txt"),
                Charset.defaultCharset()).trim().split("\\R");

        for (int i = 0; i < 16; i++) {

            fixDatabase(i);
        }

    }

    private static void fixDatabase(int index) throws IOException {

        File file = new File("case/IPED/iped/storage/storage-" + index + ".db");

        try (Connection connection = DriverManager.getConnection("jdbc:sqlite:" + file.getAbsolutePath())) {

            System.out.println("Connected to " + file);

            PreparedStatement stmt = connection.prepareStatement("select data from t1 where id = ?");

            PreparedStatement stmtUpdate = connection.prepareStatement("update t1 set data = ? where id = ?");

            for (String hash : hashes) {

                if (hash.startsWith(Integer.toHexString(index).toUpperCase())) {

                    stmt.setString(1, hash);
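                    byte[] compressedBytes;
                    try (ResultSet resultSet = stmt.executeQuery()) {
                        // Move to the first row; skip hashes with no entry in this storage file.
                        if (!resultSet.next()) {
                            continue;
                        }
                        compressedBytes = resultSet.getBytes(1);
                    }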

                    // Decompress the stored item content, closing the stream when done.
                    String html;
                    try (GzipCompressorInputStream gzipInput = new GzipCompressorInputStream(
                            new ByteArrayInputStream(compressedBytes))) {
                        html = new String(gzipInput.readAllBytes(), StandardCharsets.UTF_8);
                    }

                    String newHtml = html.replace("&lt;br/&gt; Empty media", "");

                    if (newHtml.equals(html)) {
                        continue;
                    }
                    
                    ByteArrayOutputStream output = new ByteArrayOutputStream();
                    try (GZIPOutputStream gzipOutput = new GZIPOutputStream(output)) {
                        gzipOutput.write(newHtml.getBytes(StandardCharsets.UTF_8));
                    }

                    stmtUpdate.setBytes(1, output.toByteArray());
                    stmtUpdate.setString(2, hash);

                    stmtUpdate.executeUpdate();
                }
            }
        } catch (SQLException e) {
            System.out.println(e.getMessage());
        }
    }
}

@lfcnassif
Member

lfcnassif commented Dec 20, 2022

Thanks @aberenguel. How do those messages render in the native Telegram application? Is there any kind of "empty media" indication, like an empty picture? Or just the text before <br/> Empty media?

@lfcnassif
Member

Or how does Cellebrite software render those messages?

@lfcnassif
Member

@aberenguel said the original evidence is no longer available to check, and that in Cellebrite software those messages are shown with just the text, without any media indication, so let's make the behavior the same.

lfcnassif added a commit to sepinf-inc/telegram-decoder that referenced this issue Dec 21, 2022
@lfcnassif
Member

lfcnassif commented Dec 21, 2022

This commit tries to fix the issue:
sepinf-inc/telegram-decoder@aa38ab8

I'm just unsure whether the old line 175 should be kept or not, since it's not the issue reported here, but I commented it out for now (it looks similar). @hauck-jvsh please let me know if that is not ok.

@lfcnassif
Member

> I'm just unsure whether the old line 175 should be kept or not, since it's not the issue reported here, but I commented it out for now (it looks similar). @hauck-jvsh please let me know if that is not ok.

I just found one single occurrence of it in a case and removing the "Empty media" string would leave an empty message balloon, so I'll revert that specific change.
