Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement an ISBN Fetcher for book-info.com #9145

Closed
Siedlerchr opened this issue Sep 7, 2022 · 15 comments · Fixed by #9205
Closed

Implement an ISBN Fetcher for book-info.com #9145

Siedlerchr opened this issue Sep 7, 2022 · 15 comments · Fixed by #9205
Assignees
Labels
fetcher good first issue An issue intended for project-newcomers. Varies in difficulty. type: enhancement

Comments

@Siedlerchr
Copy link
Member

Is your suggestion for improvement related to a problem? Please describe.
Some ISBNs like 3-85869-197-6 and 3-549-05585-4 cannot be found on OpenLibrary or Ebook.de

Describe the solution you'd like
Add book-info.com as a third fallback source for checking ISBN Data.

Additional context
The HTML needs to be parsed. Seems pretty straightforward to parse the data with jsoup
https://www.book-info.com/isbn/84-08-08890-4.htm

@Siedlerchr Siedlerchr added fetcher good first issue An issue intended for project-newcomers. Varies in difficulty. labels Sep 7, 2022
@anudeep-bpgc
Copy link

Hi @Siedlerchr ,
This issue seems to be a good initiation of my journey towards contribution to FOSS projects.
Can you please assign this to me?

@sreenath-tm
Copy link
Contributor

I'm interested in contributing to this issue, so before I start working it, would you mind sparing your time explaining what the fix will be like.

@ThiloteE
Copy link
Member

ThiloteE commented Sep 8, 2022

General remarks for contribution: Check out https://github.com/JabRef/jabref/blob/main/CONTRIBUTING.md for a start. Feel free to ask if you have any questions here on GitHub or also at JabRef's Gitter chat.

Try to open a (draft) pull request early on, so that people can see you are working on the issue and so that they can see the direction the pull request is heading towards. This way, you will likely receive valuable feedback.

@ThiloteE
Copy link
Member

ThiloteE commented Sep 8, 2022

@sreenath-tm I have seen you are interested in multiple issues (9031 and 9042). I assume you are a student or a project newcomer and for this reason I would suggest to focus on one of the other issues, since:

  1. First come, first served is nice manners.
  2. Double work is inefficient.
    • In principle, all issues are free to take, so if you are fast you could theoretically come up with your own solution first and propose a change, but since @anudeep-bpgc already asked to be assigned to this issue and may already have started working on a fix, it could create a little bit of disappointment and double work for the both of you. Ultimately only one of your pull-requests will be merged, so if your study course requires you to demonstrage a successfully merged commit this is a risky plan.
    • Supervising two different pull-requests creates more work for maintainers, since they have to hop between pull-requests, communicate with the both of you and compare which pull-request is better.
  3. You could work together on this issue, but this would require good communication skills on your part. If both of you are willing, you should exchange contact details (e.g. via gitter/e-mail etc.)

@ThiloteE
Copy link
Member

ThiloteE commented Sep 8, 2022

With regard to the code, Christoph or one of the other maintainers will be able to explain more.

I would start with

  1. Look at devdocs with regard to fetchers: https://devdocs.jabref.org/code-howtos/fetchers.html
  2. looking at how other fetchers are implemented. You need to look into JabRefs code for that.
  3. Inspect documentation of jsoup: https://jsoup.org/

@Siedlerchr
Copy link
Member Author

Codewise take a look at the existing ISBN fetchers https://github.com/JabRef/jabref/blob/main/src/main/java/org/jabref/logic/importer/fetcher/IsbnFetcher.java
Implement a new ISBN Fetcher and add it to the handling code for checking if any of the two other returned a result.

@sreenath-tm
Copy link
Contributor

@anudeep-bpgc Please do take a look at https://github.com/JabRef/jabref/blob/main/docs/getting-into-the-code/guidelines-for-setting-up-a-local-workspace.md to get a headstart. If there is any help needed please feel free to let me know.

@ezienecker
Copy link
Contributor

Tomorrow the hacktoberfest starts. I would like to use this task as a project start. Is it ok for you if I take it over? Nothing has happened to the task for 20 days now.

@anudeep-bpgc
Copy link

hi @ezienecker, you may take it.

@anudeep-bpgc anudeep-bpgc removed their assignment Sep 30, 2022
@ezienecker
Copy link
Contributor

I am having trouble adding a new dependency to the project. Unfortunately I don't really understand where the problem is. I added the following dependency (implementation 'net.sourceforge.htmlunit:htmlunit:2.64.0') in the jabref/build.gradle and then triggered a reload via the Gradle View. After that I see the dependency under "External Libraries" but cannot use it in the project. What am I doing wrong?

A normal Gradle build works without problems. I have set up the project as described here https://devdocs.jabref.org/getting-into-the-code/guidelines-for-setting-up-a-local-workspace.

@Siedlerchr
Copy link
Member Author

@ezienecker why do you want to add a new dependency? You should be able to use jsoup for this. See the implementation of the other fetchers

@ezienecker
Copy link
Contributor

The problem with this page is that when you make the call with the appropriate ISBN, you are redirected first (apparently a javascript redirect). The response contains the following in the body tag:

<body onload="location.replace('https://www.book-info.com/isbn/9780321356680.htm?ID=51ijj1ootp0h688u4t9583fs34')">

The rest of the body does not provide any useful content. If you now extract this URL and load it in jsoup accordingly, you again get only a page with only this link. Both jsoup and the URLDownload class do not support the functionality to load pages using javascript. After some research I saw that you can load such pages with libraries like htmlunit.

I would alternatively use this site: https://doi-to-bibtex-converter.herokuapp.com/. There you can search for an ISBN and get a JSON back, that looks more usable than this page in my opinion.

@muzhenying
Copy link

muzhenying commented Oct 11, 2022

I would like to contribute to this issue and make some improvements.

@ThiloteE
Copy link
Member

ThiloteE commented Oct 11, 2022

@muzhenying, what would these improvements be?

I am thankful for your interest, but have you noticed that @ezienecker has already delivered some good work? As you can see there is an open pull-request just waiting to be finalised.

Unless you substantially can improve ezeneckers work (I would not know how), I would suggest to look for another issue. If you are pretty new to open source programming, have a look at issues with the tag "good first issues". Good first issues from Koppor also count. If you are a student, the "candidates for university projects" page also offers some issues of varying difficulty and scope and that have been estimated to be compatible with university courses as well and often bring a larger feature to JabRef.

@muzhenying
Copy link

muzhenying commented Oct 11, 2022

Thanks, @ThiloteE But actually I had a look at the latest repo and it seems that @ezienecker used a source to implement a new ISBN fetcher so I still want to have a try to see if we can use Book-info as a fallback source.

Siedlerchr added a commit to tomazari/jabref that referenced this issue Nov 1, 2022
…ok-info.com-JabRef#9145

* upstream/main:
  New Crowdin updates (JabRef#9333)
  Refresh example styles
  Squashed 'buildres/csl/csl-styles/' changes from 4eee79a..13fd98e
  Bump unirest-java from 3.13.11 to 3.13.12 (JabRef#9330)
  Bump checkstyle from 10.3.4 to 10.4 (JabRef#9331)
  Bump gittools/actions from 0.9.14 to 0.9.15 (JabRef#9332)
  Group context menu presents relevant options depending on number of subgroups (JabRef#9286)
  Removed BibTeX file type and included HTML and Markdown types (JabRef#9318)
  Fix issue: Auto-linking files with safe character replacements JabRef#9267 (JabRef#9316)
  Fix for issue 8806: Button highlights doesn't respect rounded corners (JabRef#9320)
  New Crowdin updates (JabRef#9324)
  Update CHANGELOG.md
  Try to relocate listener binding (JabRef#9238)
  Changed the messages after importing unlinked local files to past passive tense. (JabRef#9308)
  Changed the color of found text from red to high contrast  (JabRef#9315)

# Conflicts:
#	CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fetcher good first issue An issue intended for project-newcomers. Varies in difficulty. type: enhancement
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

6 participants