Added support for google scholar citations (alshedivat#2193)
Closes alshedivat#1809, but there are caveats:
1 - it only works at build time, which means the numbers won't update
unless you build your site again
2 - Google might block the requests if it receives too many of them,
causing the whole process to fail.
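
To limit the traffic behind caveat 2, the plugin in this commit fetches each article at most once per build and pauses for a random interval before every request. The following is a minimal sketch of that pattern, not the plugin itself: the `CitationCache` class, its injected fetcher block, and the `sleeper` hook are hypothetical names introduced here so the idea can be run without network access.

```ruby
# Hypothetical illustration: CitationCache and its parameters are not part of the commit.
class CitationCache
  # delay_range: seconds to wait before each real fetch (the plugin uses 1.5..3.5)
  # sleeper:     callable that performs the wait; replaceable so examples run instantly
  # fetcher:     block that takes an article id and returns its citation label
  def initialize(delay_range: 1.5..3.5, sleeper: ->(seconds) { sleep(seconds) }, &fetcher)
    @cache = {}
    @delay_range = delay_range
    @sleeper = sleeper
    @fetcher = fetcher
  end

  def fetch(article_id)
    # Serve repeated lookups from memory so each article costs one request per build
    return @cache[article_id] if @cache.key?(article_id)

    # Wait a random amount of time before hitting the remote site
    @sleeper.call(rand(@delay_range))
    @cache[article_id] = @fetcher.call(article_id)
  end
end

calls = []
cache = CitationCache.new(sleeper: ->(_seconds) {}) do |id|
  calls << id # record every real fetch
  "42"        # stand-in for a scraped citation count
end

cache.fetch("qyhmnyLat1gC")
cache.fetch("qyhmnyLat1gC")
puts calls.length # the fetcher ran only once
```

Because the cache lives in memory, it is rebuilt on every site build, which is also why the numbers only change when the site is built again.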

This is what it looks like when it can fetch the information:

![Screenshot from 2024-02-13 00-37-52](https://github.com/alshedivat/al-folio/assets/31376482/646d1f3c-1294-491b-bc13-9013e38918b4)

And this is what it looks like when it fails:


![image](https://github.com/alshedivat/al-folio/assets/31376482/516eefff-d394-44ad-8702-8982233f8c4f)

Signed-off-by: George Araujo <george.gcac@gmail.com>
george-gca authored and Karapost committed Jul 4, 2024
1 parent 46c8789 commit aa6da9c
Showing 4 changed files with 88 additions and 2 deletions.
1 change: 1 addition & 0 deletions _bibliography/papers.bib
@@ -54,6 +54,7 @@ @article{PhysRev.47.777
pdf={example_pdf.pdf},
altmetric={248277},
dimensions={true},
+ google_scholar_id={qyhmnyLat1gC},
selected={true}
}

3 changes: 2 additions & 1 deletion _config.yml
@@ -73,7 +73,7 @@ x_username: # your X handle
mastodon_username: # your mastodon instance+username in the format instance.tld/@username
linkedin_username: # your LinkedIn user name
telegram_username: # your Telegram user name
- scholar_userid: # your Google Scholar ID
+ scholar_userid: qc6CJjYAAAAJ # your Google Scholar ID
semanticscholar_id: # your Semantic Scholar ID
whatsapp_number: # your WhatsApp number (full phone number in international format. Omit any zeroes, brackets, or dashes when adding the phone number in international format.)
orcid_id: # your ORCID ID
@@ -311,6 +311,7 @@ scholar:
enable_publication_badges:
altmetric: true # Altmetric badge (https://www.altmetric.com/products/altmetric-badges/)
dimensions: true # Dimensions badge (https://badge.dimensions.ai/)
+ google_scholar: true # Google Scholar badge (https://scholar.google.com/intl/en/scholar/citations.html)

# Filter out certain bibtex entry keywords used internally from the bib output
filtered_bibtex_keywords:
8 changes: 7 additions & 1 deletion _layouts/bib.liquid
@@ -211,7 +211,8 @@
{% if site.enable_publication_badges %}
{% assign entry_has_altmetric_badge = entry.altmetric or entry.doi or entry.eprint or entry.pmid or entry.isbn %}
{% assign entry_has_dimensions_badge = entry.dimensions or entry.doi or entry.pmid %}
- {% if entry_has_altmetric_badge or entry_has_dimensions_badge %}
+ {% assign entry_has_google_scholar_badge = entry.google_scholar_id %}
+ {% if entry_has_altmetric_badge or entry_has_dimensions_badge or entry_has_google_scholar_badge %}
<div class="badges">
{% if site.enable_publication_badges.altmetric and entry_has_altmetric_badge %}
<span
@@ -249,6 +250,11 @@
style="margin-bottom: 3px;"
></span>
{% endif %}
+ {% if site.enable_publication_badges.google_scholar and entry_has_google_scholar_badge %}
+ <a href="https://scholar.google.com/citations?view_op=view_citation&hl=en&user={{ site.scholar_userid }}&citation_for_view={{ site.scholar_userid }}:{{ entry.google_scholar_id }}">
+ <img src="https://img.shields.io/badge/scholar-{% google_scholar_citations site.scholar_userid entry.google_scholar_id %}-4285F4?logo=googlescholar&labelColor=beige">
+ </a>
+ {% endif %}
</div>
{% endif %}
{% endif %}
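
The template change above assembles two URLs per entry: the link to the article's Google Scholar page and the shields.io image whose message segment is whatever the `google_scholar_citations` tag returns. A minimal Ruby sketch of the same string assembly follows, only to make the URL shapes explicit. The helper names `scholar_citation_url` and `scholar_badge_url` are hypothetical and do not exist in the theme; the IDs in the usage lines are the example values from this commit.

```ruby
# Hypothetical helpers mirroring the string interpolation done in _layouts/bib.liquid.

# Link target: the article view on Google Scholar for a given user and article.
def scholar_citation_url(user_id, article_id)
  "https://scholar.google.com/citations?view_op=view_citation&hl=en" \
    "&user=#{user_id}&citation_for_view=#{user_id}:#{article_id}"
end

# Badge image: a static shields.io badge labelled "scholar" with the count as its message.
def scholar_badge_url(count_label)
  "https://img.shields.io/badge/scholar-#{count_label}-4285F4?logo=googlescholar&labelColor=beige"
end

puts scholar_citation_url("qc6CJjYAAAAJ", "qyhmnyLat1gC")
puts scholar_badge_url("N/A") # the label the plugin falls back to when fetching fails
```

Since the count is baked into the image URL when the page is rendered, the badge shows the value from the last build rather than a live number.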
78 changes: 78 additions & 0 deletions _plugins/google-scholar-citations.rb
@@ -0,0 +1,78 @@
require "active_support/all"
require 'nokogiri'
require 'open-uri'

module Helpers
  extend ActiveSupport::NumberHelper
end

module Jekyll
  class GoogleScholarCitationsTag < Liquid::Tag
    Citations = { }

    def initialize(tag_name, params, tokens)
      super
      splitted = params.split(" ").map(&:strip)
      @scholar_id = splitted[0]
      @article_id = splitted[1]
    end

    def render(context)
      article_id = context[@article_id.strip]
      scholar_id = context[@scholar_id.strip]
      article_url = "https://scholar.google.com/citations?view_op=view_citation&hl=en&user=#{scholar_id}&citation_for_view=#{scholar_id}:#{article_id}"

      begin
        # If the citation count has already been fetched, return it
        if GoogleScholarCitationsTag::Citations[article_id]
          return GoogleScholarCitationsTag::Citations[article_id]
        end

        # Sleep for a random amount of time to avoid being blocked
        sleep(rand(1.5..3.5))

        # Fetch the article page
        doc = Nokogiri::HTML(URI.open(article_url, "User-Agent" => "Ruby/#{RUBY_VERSION}"))

        # Attempt to extract the "Cited by n" string from the meta tags
        citation_count = 0

        # Look for the "description" meta tag, falling back to "og:description"
        description_meta = doc.css('meta[name="description"]')
        og_description_meta = doc.css('meta[property="og:description"]')

        if !description_meta.empty?
          cited_by_text = description_meta[0]['content']
          matches = cited_by_text.match(/Cited by (\d+[,\d]*)/)

          if matches
            citation_count = matches[1].delete(',').to_i # strip thousands separators before converting
          end

        elsif !og_description_meta.empty?
          cited_by_text = og_description_meta[0]['content']
          matches = cited_by_text.match(/Cited by (\d+[,\d]*)/)

          if matches
            citation_count = matches[1].delete(',').to_i # strip thousands separators before converting
          end
        end

        citation_count = Helpers.number_to_human(citation_count, :format => '%n%u', :precision => 2, :units => { :thousand => 'K', :million => 'M', :billion => 'B' })

      rescue StandardError => e
        # Handle any errors that may occur during fetching
        citation_count = "N/A"

        # Print the error message including the exception class and message
        puts "Error fetching citation count for #{article_id}: #{e.class} - #{e.message}"
      end


      GoogleScholarCitationsTag::Citations[article_id] = citation_count
      return "#{citation_count}"
    end
  end
end

Liquid::Template.register_tag('google_scholar_citations', Jekyll::GoogleScholarCitationsTag)
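
The plugin's regular expression captures digits together with any thousands separators, and Ruby's `String#to_i` stops reading at the first character that is not part of a number, so a captured `"1,234"` has to lose its commas before conversion. The sketch below isolates that parsing step using only the standard library. The helper name `parse_cited_by` and the sample description strings are assumptions made for illustration; whether Google Scholar's meta description actually contains separators is not confirmed by the commit.

```ruby
# Hypothetical helper isolating the "Cited by N" parsing step from the plugin.
def parse_cited_by(text)
  match = text.match(/Cited by (\d+[,\d]*)/)
  return 0 unless match # same default the plugin uses when nothing matches

  # Remove thousands separators so to_i sees the whole number
  match[1].delete(",").to_i
end

puts "1,234".to_i                                  # 1, to_i stops at the comma
puts parse_cited_by("Some article. Cited by 1,234") # 1234
puts parse_cited_by("No count in this description") # 0
```

The integer returned here is what the plugin then passes to `number_to_human` to produce labels with the K, M, and B unit suffixes configured in the call.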
