Make use of recipe-scrapers for more generic web imports #18

Closed
cydanil opened this issue Aug 20, 2021 · 10 comments

Comments

@cydanil
Contributor

cydanil commented Aug 20, 2021

https://github.com/hhursev/recipe-scrapers provides a nice interface to scrape a number of websites, which could replace our importers for more flexible imports.

This would also relieve us of maintaining importers on top of the application itself, and it would support far more websites than we could.

@cydanil
Contributor Author

cydanil commented Sep 2, 2021

I have a branch that does this.

The import dialog was reworked into a single dialog for file and web imports, with validation of the supported URLs:

[screenshot of the reworked import dialog]

File import behaves the same.
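
As a rough sketch of what validating a supported URL could look like (this is not the code from the branch), the check below compares the host of a URL against an allow-list. `SUPPORTED_HOSTS` and `is_supported_url` are hypothetical names, and the set contents are placeholders; a real list would come from whatever the scraping library reports as supported.

```python
from urllib.parse import urlparse

# Hypothetical allow-list, for illustration only.
SUPPORTED_HOSTS = {"allrecipes.com", "example.com"}


def is_supported_url(url: str) -> bool:
    """Return True if url is http(s) and its host is in SUPPORTED_HOSTS."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # Treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[4:]
    return host in SUPPORTED_HOSTS
```

A dialog could call `is_supported_url` on every text change and enable the import button only when it returns `True`.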

Web recipes are imported using recipe-scrapers, e.g. https://www.allrecipes.com/recipe/257887/lunch-biscuits/ :

[screenshot of the imported recipe]

However, there are still a few items needing improvement:

  1. Separate the web_importer from Gourmand internals;
  2. Create web import status;
  3. Set the cooking and preparation times.

The motivation for 1 is to be able to plug in more web importers in a homogeneous way should the need arise, and to decouple the import mechanism from user feedback.

Web imports still take a few seconds, mostly because of retrieving the recipe image. As such, it would be beneficial to show the import status in an info bar, as file imports do.
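
One way to picture the decoupling described above, as a minimal sketch rather than the actual Gourmand design: the importer takes a pluggable `fetch` callable and reports status through a callback, so it never touches the UI itself. Every name here (`ImportedRecipe`, `import_recipes`, `ProgressCallback`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# All names below are hypothetical, for illustration only.
ProgressCallback = Callable[[float, str], None]


@dataclass
class ImportedRecipe:
    title: str
    source_url: str
    ingredients: List[str] = field(default_factory=list)


def import_recipes(urls, fetch, on_progress: Optional[ProgressCallback] = None):
    """Import each URL with `fetch`, reporting progress through a callback.

    `fetch` is any callable taking a URL and returning an ImportedRecipe, so
    different web importers can be plugged in the same way. The importer only
    calls `on_progress(fraction, message)`; what the UI does with it (an info
    bar, a log line) is up to the caller.
    """
    recipes = []
    total = len(urls)
    for index, url in enumerate(urls, start=1):
        if on_progress:
            on_progress((index - 1) / total, f"Importing {url}")
        recipes.append(fetch(url))
    if on_progress:
        on_progress(1.0, f"Imported {len(recipes)} recipe(s)")
    return recipes
```

In the application, `on_progress` could update the info bar, while tests can pass a function that appends to a list.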

@cydanil
Contributor Author

cydanil commented Sep 4, 2021

To finalize this feature, an import status still needs to be implemented: recipes take a while to import, mostly due to retrieving and scaling images.

@eliotb
Contributor

eliotb commented Sep 11, 2021

Does this change mean that it will only be possible to import from one of the supported websites, rather than from any webpage with a recipe on it? I notice that the recipe-scrapers library has a sort of wildcard option that may work on sites that are not directly supported.
If so, a workaround is to save the page as a file, or to copy and paste the text into a plain text file, save that, and then import it.
Related: to shortcut the copy, paste, save, import sequence, would you be open to supporting direct import from the clipboard?

@cydanil
Contributor Author

cydanil commented Oct 9, 2021

I would have preferred to support these websites indeed, but it's not realistic.
Their wildcard option still expects the recipes to be formatted in a specific way (JSON-LD).
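
To illustrate the kind of structure that option relies on, here is a standard-library-only sketch that looks for a schema.org `Recipe` object in a page's JSON-LD. It is not how recipe-scrapers is implemented, and `find_recipe` is a hypothetical helper; the point is that a page without such a block yields nothing.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def find_recipe(html: str):
    """Return the first schema.org Recipe object found in html, or None."""
    collector = _JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        # JSON-LD may hold one object, a list, or an "@graph" of objects.
        items = data if isinstance(data, list) else [data]
        expanded = []
        for item in items:
            if isinstance(item, dict):
                expanded.append(item)
                graph = item.get("@graph")
                if isinstance(graph, list):
                    expanded.extend(g for g in graph if isinstance(g, dict))
        for item in expanded:
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Recipe" in types:
                return item
    return None
```

A page written in basic HTML, with no JSON-LD block, makes `find_recipe` return `None`, which is why such pages cannot be imported this way.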

I would prioritise recipe-scrapers.

I like your suggestion of importing recipes by pasting selected text (or a recipe file) in a window (the main window?). I will prototype something and ask for your feedback :)

@founderio

@cydanil Someone made a wrapper for text parsing here: hhursev/recipe-scrapers#9 (comment)

Maybe that can be utilized?

@cydanil
Contributor Author

cydanil commented Oct 9, 2021

Thanks for pointing me towards this comment, @founderio. However, it still seems to expect SchemaOrg content.
I tried to use it with a recipe in basic HTML (which I think will be the #1 need), to no avail.

I picture the imports as unstructured text, some variation of what's below:

text = """
<html>
<title>Recipe title </title>
<br/>
Ingredients:

Ingredient Group 1:

    1 ⅓ cups all-purpose flour
    1 tablespoon granulated sugar
    ½ teaspoon salt
    ½ cup shortening
    3 ½ tablespoons cold water

Ingredient Group 2:

    2 cups mashed, cooked pumpkin, or about 1 1/2 pounds skin-on, raw pumpkin; Shortcut: You can substitute with canned pumpkin. Before you buy, you should check the label to make sure of what's in the can; if it's labeled "pumpkin pie filling" it's already spiced. If you plan to sweeten yourself with the ingredients below, go for unflavored canned pumpkin.
    1 (12 fluid ounce) can evaporated milk
    2 eggs, beaten
    ¾ cup packed brown sugar
    ½ teaspoon ground cinnamon
    ½ teaspoon ground ginger
    ½ teaspoon ground nutmeg
    ½ teaspoon salt
<br/>
whole pumpkin for pie

1. Instructions Part 1
blabla

2. Instructions Part 2
text

3. Instruction Part 3
text
</html>
"""
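
For text like the sample above, one small building block would be reading the leading quantity of an ingredient line, including Unicode fractions such as ⅓ and ½. The sketch below is only an illustration under that assumption; `split_amount` and `_token_value` are hypothetical helpers, not part of Gourmand or recipe-scrapers.

```python
import unicodedata


def _token_value(token: str):
    """Return the numeric value of one token ("1", "½", "1/2"), or None."""
    if "/" in token:
        num, _, den = token.partition("/")
        if num.isdecimal() and den.isdecimal() and int(den) != 0:
            return int(num) / int(den)
        return None
    if token.isdecimal():
        return float(int(token))
    if len(token) == 1:
        # Returns None for characters without a numeric value.
        return unicodedata.numeric(token, None)
    return None


def split_amount(line: str):
    """Split an ingredient line into (amount, rest); amount is None if absent."""
    tokens = line.split()
    if not tokens:
        return None, ""
    first = _token_value(tokens[0])
    if first is None:
        return None, " ".join(tokens)
    amount, consumed = first, 1
    # Accept a fraction right after a whole number, as in "1 ⅓" or "1 1/2".
    if len(tokens) > 1:
        second = _token_value(tokens[1])
        if second is not None and second < 1:
            amount += second
            consumed = 2
    return amount, " ".join(tokens[consumed:])
```

Lines without a leading number, such as "whole pumpkin for pie", come back with an amount of `None` and the text unchanged, so they can be left for the user to tag by hand.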

@eliotb
Contributor

eliotb commented Oct 9, 2021

I like your suggestion of importing recipes by pasting selected text (or a recipe file) in a window (the main window?).

I have just realised that the traditional importer (i.e. with the text pane and all the tag buttons on the right) almost does this now.
The text pane is already editable, and supports paste. So all that is needed is to be able to open it with the text pane empty. Then the recipe can be pasted in there, edited, tagged and imported.
I tested this by "importing" a local text file, and then deleting the content of the text pane.

@cydanil
Contributor Author

cydanil commented Oct 28, 2021

I've been experimenting with drag and drop and pasting.
It resulted in the following (pasting not demonstrated):

[animation demonstrating drag and drop import]

What do you think?

@eliotb
Contributor

eliotb commented Oct 28, 2021

This looks great!
I'd be really happy with the second part where you drop selected text and get the import editor.

@cydanil
Contributor Author

cydanil commented Nov 10, 2021

This has been finalized in #67, where there are a couple of screencasts demonstrating drag and drop, which works similarly to copy/paste.

I appreciate that supporting only a set of websites looks like a regression in terms of functionality, even though it's compensated for by plain text import.
I will see how easily I can make any URL importable, as it used to be.

Thanks for your feedback and ideas!

cydanil closed this as completed Nov 10, 2021