Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hold for Payment 13 August] Implement Hyperlink and Email link HTML to markdown conversion #4187

Closed
3 tasks
Jag96 opened this issue Jul 23, 2021 · 28 comments · Fixed by Expensify/expensify-common#406
Assignees
Labels
External Added to denote the issue can be worked on by a contributor Weekly KSv2

Comments

@Jag96
Copy link
Contributor

Jag96 commented Jul 23, 2021

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Related to #3047, similar to Expensify/expensify-common#392. Email and link MD->HTML version can be found here

Action Performed:

Copied a hyperlink from slack to E.cash
image
image

Expected Result:

Rules of parsing would be as

  1. [html](www.google.com) When <a href="www.google.com">html</a> and you will see this type pf link as html.
  2. www.google.com when <a href="www.google.com">www.google.com</a> and you will see this type pf link as www.google.com.
  3. abc@gmail.com when <a href="abc@gmail.com">abc@gmail.com</a> and visible as abc@gmail.com.

Actual Result:

Message just pasted text

Platform:

Where is this issue occurring?

  • Web
  • Desktop App
  • Mobile Web

View all open jobs on Upwork

@Jag96 Jag96 added the AutoAssignerTriage Auto assign issues for triage to an available triage team member label Jul 23, 2021
@MelvinBot
Copy link

Triggered auto assignment to @MitchExpensify (AutoAssignerTriage), see https://stackoverflow.com/c/expensify/questions/4749 for more details.

@MelvinBot MelvinBot removed the AutoAssignerTriage Auto assign issues for triage to an available triage team member label Jul 23, 2021
@parasharrajat
Copy link
Member

parasharrajat commented Jul 23, 2021

A suggestion to the description.
Rules of parsing would be as

  1. [html](www.google.com) When <a href="www.google.com">html</a> and you will see this type of link as html.
  2. www.google.com when <a href="www.google.com">www.google.com</a> and you will see this type of link as www.google.com.
  3. abc@gmail.com when <a href="abc@gmail.com">abc@gmail.com</a> and visible as abc@gmail.com.

@Jag96
Copy link
Contributor Author

Jag96 commented Jul 23, 2021

Sounds good, updated!

@parasharrajat
Copy link
Member

parasharrajat commented Jul 23, 2021

To match anchor tags we can use this
<(a)(?:href=(['"])(.*?)\2|"[^"]*"|'[^']*'|[^'">])*>(.*?)<\/\1>(?![^<]*(<\/pre>|<\/code>))

  1. Groups 3 and 4 contain the URL and text respectively.
  2. We will use a process method that will parse all three types of URLs based on these rules.
    a. if g2 and g3 are not equal, then output [g3](g2). we don't need to worry even if href was an email it just works.
    b. If not a then the output would be simple g2. We don't care if it's an email or a normal link.

@MitchExpensify
Copy link
Contributor

Replicated! Looks like a good external issue right @Jag96?

@MelvinBot MelvinBot added the Monthly KSv2 label Jul 26, 2021
@Jag96 Jag96 added External Added to denote the issue can be worked on by a contributor Monthly KSv2 and removed Monthly KSv2 labels Jul 27, 2021
@MelvinBot MelvinBot added Daily KSv2 and removed Monthly KSv2 labels Jul 27, 2021
@Jag96
Copy link
Contributor Author

Jag96 commented Jul 27, 2021

@MitchExpensify ah yes, forgot to add the External label!

@MitchExpensify
Copy link
Contributor

Weird that it didn't auto assign someone

@Jag96 Jag96 added External Added to denote the issue can be worked on by a contributor and removed External Added to denote the issue can be worked on by a contributor labels Jul 27, 2021
@MelvinBot
Copy link

Triggered auto assignment to @laurenreidexpensify (External), see https://stackoverflow.com/c/expensify/questions/8582 for more details.

@mananjadhav
Copy link
Collaborator

mananjadhav commented Jul 28, 2021

Ignore the PR reference.

Proposal for Link HTML to Markdown

I would go ahead with this regex.

<a.*?href=["']([^"']*)["'][^>]*>([^<]*)</a>

This will result in two matches $1 and $2. I'll add code to matching groups to handle plain links and mailto: href too.

It basically covers the odd cases such as inconsistent parenthesis etc. I can implement this if you plan to hire along with the tests. Will handle this for all the three groups mentioned.

@laurenreidexpensify
Copy link
Contributor

Hi Folks - @Jag96 @MitchExpensify I'm confused here - should I have hired @mananjadhav in Upwork yet? did we agree on the path for the fix in the PR?

@MitchExpensify
Copy link
Contributor

We agreed the problem was a good candidate for external so I think an Upwork job can be created. Then we'd hire @mananjadhav if the proposal suggested here looks good, right?

@Jag96
Copy link
Contributor Author

Jag96 commented Jul 28, 2021

@laurenreidexpensify @MitchExpensify first we need to create the Upwork job, then we can assign the Exported label which will assign an engineer to review proposals.

@mananjadhav
Copy link
Collaborator

mananjadhav commented Jul 29, 2021

Will wait for the job to be created. For the engineer to review, I've added this commit in my fork along with the unit test cases that confirm it is working for hyperlink as well as email.

mananjadhav/expensify-common@02e23c3

@laurenreidexpensify
Copy link
Contributor

okay we're on upwork here now

@mananjadhav
Copy link
Collaborator

I've added a PR for review and also sent a proposal on Upwork!

@laurenreidexpensify
Copy link
Contributor

Thanks @mananjadhav, just for reference the process here is that once a job has been added to Upwork folks can submit proposals on this GH issue and @Luke9389 will review them here and confirm when we're ready to move to PR. So for now you can close that PR and if @Luke9389 confirms your proposal is the one we're moving forward with, I will hire you in Upwork and then you can reopen or create a new PR. Thanks!

@mananjadhav
Copy link
Collaborator

Noted. Thanks for explaining the process. Closing the PR.

@laurenreidexpensify
Copy link
Contributor

@Luke9389 can you take a look today when you're back? Thanks

@MelvinBot MelvinBot removed the Overdue label Aug 2, 2021
@parasharrajat
Copy link
Member

Proposal #4187 (comment).

Please let me know if you need further explanation. Thanks.

@Luke9389
Copy link
Contributor

Luke9389 commented Aug 3, 2021

Ok, @parasharrajat had a proposal down first, so I'm gonna give him the green light to write up a PR.

@laurenreidexpensify
Copy link
Contributor

https://www.upwork.com/jobs/~015d7fb34d72ccdc11

@parasharrajat hired you on Upwork

@laurenreidexpensify laurenreidexpensify removed the Help Wanted Apply this label when an issue is open to proposals by contributors label Aug 3, 2021
@MelvinBot
Copy link

@parasharrajat, @Luke9389, @laurenreidexpensify it looks like no one is assigned to work on this job.
Please double the price or add a comment stating why the job isn't being doubled.

@laurenreidexpensify
Copy link
Contributor

melvin u liar

@MelvinBot MelvinBot removed the Overdue label Aug 5, 2021
@Luke9389
Copy link
Contributor

Luke9389 commented Aug 5, 2021

Strange, I'm gonna look into why Melvin misfired here...

@laurenreidexpensify laurenreidexpensify changed the title Implement Hyperlink and Email link HTML to markdown conversion [Hold for Payment 13 August] Implement Hyperlink and Email link HTML to markdown conversion Aug 6, 2021
@laurenreidexpensify laurenreidexpensify added Weekly KSv2 and removed Daily KSv2 labels Aug 9, 2021
@MelvinBot MelvinBot removed the Overdue label Aug 9, 2021
@parasharrajat parasharrajat mentioned this issue Aug 9, 2021
5 tasks
@laurenreidexpensify
Copy link
Contributor

laurenreidexpensify commented Aug 13, 2021

@parasharrajat looks like we missed a step here and you were never hired in Upwork for this one 😬 can you apply and i'll hire / pay you as soon as that's done! thanks!

@laurenreidexpensify
Copy link
Contributor

All sorted

@0xmiros
Copy link
Contributor

0xmiros commented Aug 19, 2022

Rules of parsing would be as

[html](www.google.com) When <a href="www.google.com">html</a> and you will see this type pf link as html.
www.google.com when <a href="www.google.com">www.google.com</a> and you will see this type pf link as www.google.com.
abc@gmail.com when <a href="abc@gmail.com">abc@gmail.com</a> and visible as abc@gmail.com.

I think this case should be added as well: (when text includes http/https so text and url are same)

https://www.google.com when <a href="https://www.google.com">https://www.google.com</a> and you will see this type pf link as https://www.google.com.

Here's video of failure:

html.markdown.mov

https://github.com/Expensify/expensify-common/blob/e0ec0b9abca4d7b417ba0b016f8725856cfeeece/lib/ExpensiMark.js#L259-L269

After reviewing code, I suspect this line which excludes failure case above:

                    if (link !== g4) {

But we cannot remove this condition completely because of email case.

Solution:
https://github.com/Expensify/expensify-common/blob/e0ec0b9abca4d7b417ba0b016f8725856cfeeece/lib/ExpensiMark.js#L264-L267
replace these lines with this simple line:

                    return g3.startsWith('mailto:') && link === g4 ? link : `[${g4}](${link})`;

final code:

            {
                name: 'anchor',
                regex: /<(a)[^><]*href\s*=\s*(['"])(.*?)\2(?:".*?"|'.*?'|[^'"><])*>(.*?)<\/\1>(?![^<]*(<\/pre>|<\/code>))/gi,
                replacement: (match, g1, g2, g3, g4) => {
                    const link = g3.startsWith('mailto:') ? g3.slice(7) : g3;
                    return g3.startsWith('mailto:') && link === g4 ? link : `[${g4}](${link})`;
                },
            },

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
External Added to denote the issue can be worked on by a contributor Weekly KSv2
Projects
None yet
8 participants