elizabethpermina/citations
---
title: "readMe"
author: "Elizabeth Permina"
date: "1/22/2022"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## RISmed

RISmed is a package for extracting data from PubMed and parsing the output. It has functions to process authors, titles, abstracts, etc.

I'm looking for a way to extract the reference list, which is part of the extracted XML record.

Issue: I haven't found a way to parse it yet. The XML record of the reference list is highly nested and has many repeating tags, so XML-to-data-frame functions don't work (they fail with a "columns with the same name" error).

The XML package effectively translates the full PubMed XML (with references) into a list with

1) non-unique variable names
2) nested names like

```{r, eval=FALSE}
$PubmedArticle$PubmedData$ReferenceList$Reference$ArticleIdList
$PubmedArticle$PubmedData$ReferenceList$Reference$ArticleIdList$ArticleId
$PubmedArticle$PubmedData$ReferenceList$Reference$ArticleIdList$ArticleId$text
$PubmedArticle$PubmedData$ReferenceList$Reference$ArticleIdList$ArticleId$.attrs
  IdType
"pubmed"
$PubmedArticle$PubmedData$ReferenceList$Reference$PubmedArticle$PubmedData$ReferenceList$Reference$Citation
```

The corresponding XML looks like this: 88dc43de-71be-11ec-8542-d21848e7cc81.xml

To get an XML file from a PMID (one at a time): https://pubmed2xl.com/xml/
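
One possible way around the duplicate-name problem is to skip the whole-record conversion and query each `Reference` node directly with XPath. The sketch below is only an illustration, not a tested solution for real PubMed exports. It assumes the `xml2` package (not used elsewhere in this README) and a tiny hand-written XML snippet that mimics the nesting shown in the paths above. The names `xml_txt`, `refs`, and `ref_df` are made up for the example, and the citation text and ID are placeholders rather than real data.

```r
library(xml2)

# Hand-written stand-in for a PubMed record (placeholder values, not real data).
# The second reference deliberately has no ArticleIdList.
xml_txt <- '<PubmedArticleSet>
  <PubmedArticle>
    <PubmedData>
      <ReferenceList>
        <Reference>
          <Citation>Example citation A</Citation>
          <ArticleIdList>
            <ArticleId IdType="pubmed">11111111</ArticleId>
          </ArticleIdList>
        </Reference>
        <Reference>
          <Citation>Example citation B</Citation>
        </Reference>
      </ReferenceList>
    </PubmedData>
  </PubmedArticle>
</PubmedArticleSet>'

doc <- read_xml(xml_txt)

# One node per <Reference>, however many times the tag repeats
refs <- xml_find_all(doc, ".//ReferenceList/Reference")

# xml_find_first() returns one result per input node, so the columns stay
# aligned; a reference without a PubMed ID yields NA instead of an error
ref_df <- data.frame(
  citation = xml_text(xml_find_first(refs, "./Citation")),
  pmid = xml_text(
    xml_find_first(refs, "./ArticleIdList/ArticleId[@IdType='pubmed']")
  ),
  stringsAsFactors = FALSE
)

print(ref_df)
```

Because each column is built from its own XPath query relative to a single `Reference` node, repeated tag names never collide as data frame columns. For a record downloaded as a file, `read_xml()` should accept the file path in place of the literal string, though I have not checked this against the full PubMed XML.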