oec package #217

pachadotdev · 2018-05-14T16:45:48Z

Summary

What does this package do? (explain in 50 words or less):

oec provides an easy way to obtain data from the Observatory of Economic Complexity by accessing its API.

Paste the full DESCRIPTION file inside a code block below:

Package: oec
Type: Package
Title: Observatory of Economic Complexity API Wrapper and Utility Program
Version: 2.8.2
Date: 2018-06-09
Authors@R: c(
  person("Mauricio", "Vargas S.", role = c("aut", "cre", "cph"), email = "oec@media.mit.edu"),
  person("Manuel", "Aristaran", role = c("ctb")),
  person("Pablo", "Paladino", role = c("ctb")),
  person("Gabriela", "Perez", role = c("ctb")),
  person(family = "UN Comtrade", role = c("dtc")),
  person(family = "MIT Media Lab", role = c("dtc")),
  person(family = "Datawheel", role = c("fnd"))
  )
URL: https://CRAN.R-project.org/package=oec
BugReports: https://github.com/pachamaltese/oec-r/issues
Description: Access The Observatory of Economic Complexity API from R to download international trade data.
License: MIT + file LICENSE
LazyData: TRUE
Depends:
  R (>= 3.2)
Imports:
  rlang,
  magrittr,
  dplyr,
  stringr,
  curl,
  purrr,
  jsonlite
RoxygenNote: 6.0.1
Suggests: 
    knitr,
    rmarkdown,
    testthat,
    covr
VignetteBuilder: knitr

URL for the package (the development repository, not a stylized html page):

https://github.com/pachamaltese/oec-r-package

Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):

databases, because the package connects to an API and does 3 or more API calls to simplify things for the final user who wants imports/exports and some metrics such as % of increase/decrease.

Who is the target audience and what are scientific applications of this package?

Non-expert users that use international trade data. This can also be targeted to intermediate/advanced users who can benefit from the speed and short syntax that this package provides.

Are there other R packages that accomplish the same thing? If so, how does
yours differ or meet our criteria for best-in-category?

Not at the moment (in 2 yrs this is the only one)

If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

No.

Requirements

Confirm each of the following by checking the box. This package:

does not violate the Terms of Service of any service it interacts with.
has a CRAN and OSI accepted license.
contains a README with instructions for installing the development version.
includes documentation with examples for all functions.
contains a vignette with examples of its essential functions and uses.
has a test suite.
has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

Do you intend for this package to go on CRAN?
[] Do you wish to automatically submit to the Journal of Open Source Software? If so:
- The package has an obvious research application according to JOSS's definition.
- The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
- The package is deposited in a long-term repository with the DOI:
- (Do not submit your package separately to JOSS)
Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
- The package is novel and will be of interest to the broad readership of the journal.
- The manuscript describing the package is no longer than 3000 words.
- You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
- (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no gaurantee that your manuscript willl be within MEE scope.)
- (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
- (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:
Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

None.

If this is a resubmission following rejection, please explain the change in circumstances:

No.

If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

I don't really know.

The text was updated successfully, but these errors were encountered:

maelle · 2018-05-15T10:44:15Z

@pachamaltese Thanks for your submission! From a quick look you checked "contains a vignette with examples of its essential functions and uses." and "has a test suite." but I do not see neither a vignette nor a test suite in the GitHub repo? We can put this submission on hold while you add them, and we can provide links about e.g. unit testing if necessary.

maelle · 2018-05-15T10:45:12Z

Also, could you please expand here on the added value of the data processing offered by your package as opposed to using the original data sources?

pachadotdev · 2018-05-15T13:18:51Z

@maelle thanks, no worries, let me check my laptop hdd vs my external drive, I probably forgot to move something

About the added value if I run getdata_batch("chl", "chn", 2010, 2015) I will obtain a tibble with 6 years of data for chilean exports to china but with these additions:

country codes corrected and written as ISO-3
full product names
full country names
trade balance
economic complexity measures and ranks
top importers and exporters
a color column to use the same palette as atlas.media.mit.edu

On the other hand, with API calls I would need to read https://atlas.media.mit.edu/hs92/export/2010/chl/chn/show/
https://atlas.media.mit.edu/hs92/export/2011/chl/chn/show/
... etc ...
https://atlas.media.mit.edu/hs92/export/2015/chl/chn/show/

and also to read this
https://atlas.media.mit.edu/hs92/export/2010/chl/all/show/
https://atlas.media.mit.edu/hs92/export/2011/chl/all/show/
... etc ...
https://atlas.media.mit.edu/hs92/export/2015/chl/all/show/

and also this
https://atlas.media.mit.edu/hs92/export/2010/all/all/show/
https://atlas.media.mit.edu/hs92/export/2011/all/all/show/
... etc ...
https://atlas.media.mit.edu/hs92/export/2015/all/all/show/

After reading those 15 json files the user would need some additional operations to see the same result as getdata_batch("chl", "chn", 2010, 2015):

a left_join to match API product code (e.g. 2709) to product name (e.g. petroleum), a mayor drawback here is that the code-name table can't be accessed by using the API
another left_join to match country codes (e.g. sachl) to country names (e.g. Chile) with the big problem that the API returns non-ISO codes (e.g. sachl vs chl), again the user would need a code-name table and would need to crop the strings
join everything and finally use purrr to create a unique tibble

pachadotdev · 2018-05-19T20:37:16Z

Hi @maelle.
I did put all the pieces together: external disks, google drive, etc. I have pushed a new version with lots of changes as well as a testing unit.
Have a good weekend!

maelle · 2018-05-24T16:39:42Z

Hi @pachamaltese! Thanks.

I've been traveling so my initial checks will take a bit longer than expected.

I am not sure I understood how your package got scattered over hard drives &co but please ask us any git/GitHub question you might have. See also http://happygitwithr.com ☺

pachadotdev · 2018-05-24T17:00:23Z

thanks a lot @maelle

happy travelling!!

I have +5 external hdd since my macbook fried itself and I lost tons of tidy useful data :(
I'm still recovering from that accident ;)

maelle · 2018-05-24T21:20:05Z

Editor checks:

Fit: The package meets criteria for fit and overlap
Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
License: The package has a CRAN or OSI accepted license
Repository: The repository link resolves correctly
Archive (JOSS only, may be post-review): The repository DOI resolves correctly
Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Thanks again for your submission @pachamaltese!

Here are a some points that should be solved/discussed before I start looking for reviewers. Note that now that you have everything on GitHub, you shouldn't loose pieces again, which is good news. Sorry to hear about your Macbook! 😢

This process is an ongoing discussion so ask any question/make any remark you might have about my comments!

Code

You use assign in functions instead of returning output. Please return output, you don't need to use assign here.
Why do you use rm inside a function?
There is a separate file containing imports (confusingly called exports 😉). It'd be easier to see where functions are imported from if you used the syntax jsonlite::fromJSON. You could then remove the separate R code with imports. Cf http://r-pkgs.had.co.nz/namespace.html#imports
Your package depends on purrr for a single function. Consider replacing it with do.call and rbind.
Regarding all the stopifnot, have informative error messages when the user entered a wrong parameter value.
You could have a single exported function instead of two if I follow correctly? It'd take arguments like the batch one, that could be vector of length 1 (the user would input say one or several country pairs). The other function would be an utility function, that wouldn't need to be exported. It might simplify the interface?
If there is only one function call in a pipeline it might be clearer to remove the pipe.

  origin_destination <- origin_destination %>%
    mutate(trade_exchange_val = 
!!sym("export_val") + !!sym("import_val"))

would become

  origin_destination <-  mutate(origin_destination,
    trade_exchange_val = !!sym("export_val") + !!sym("import_val"))

In case you might ignore it, and in case it could be useful, there's a countrycode package.

Data

Where do the data files in data come from? Why are they pre-prepared, are you sure they won't be updated?
You could keep their data processing in data-raw.

Documentation

Please add the URL to the data source.
Are you allowed to use the logo in your README?
A flowchart of what processing your package does might be useful to explain the added value of using the package and to explain what has been preprocessed or not (what's being downloaded vs what's in data/).
In the README when mentioning the vignette have a link to the pkgdown website.
Please add @return fields to the documentation of the functions.
A vignette with an actual use case, and visualizations, would be nice (or an extension of the existing vignette).
Please explain what the data is e.g. SITC or add an external link.
Because some chunks of the vignette are unevaluated without a pre-generated output, I can't understand what the "problem" is. You could pre-generate the output.

Misc

Could you rename the repo "oec" like the package?
The "Maintainer" field in DESCRIPTION is useless because it'll be generated automatically from Authors.
What are the files .Rapp.history in data/ and .build.timestamp in vignettes/?
Please add a code coverage integration and badge (via e.g. usethis::usethis::use_coverage())

goodpractice output

 ✖ omit "Date" in DESCRIPTION. It is not required and it gets invalid quite
    often. A build date will be added to the package when you perform `R CMD build` on
    it.
  ✖ avoid long code lines, it is bad for readability. Also, many people
    prefer editor windows that are about 80 characters wide. Try make your lines
    shorter than 80 characters

    R\getdata.R:56:1
    R\getdata_batch.R:44:1
    R\getdata_batch.R:68:1
    tests\testthat\test-oec.R:3:1
    tests\testthat\test-oec.R:15:1
    ... and 10 more lines

  ✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...), 1:NROW(...) and
    1:NCOL(...) expressions. They are error prone and result 1:0 if the expression on
    the right hand side is zero. Use seq_len() or seq_along() instead.

    R\getdata_batch.R:66:13

Reviewers: @aedobbyn @cimentadaj
Due date: 2018-07-04

maelle · 2018-06-07T16:04:50Z

@pachamaltese 👋 any question or update? 😺

pachadotdev · 2018-06-07T16:17:58Z

@maelle not yet, I've been moving a lot. Can I send it tomorrow at night?

maelle · 2018-06-07T16:19:41Z

If you think you won't have enough time to work on the package in the next weeks, we can set this submission on hold until you have more time.

pachadotdev · 2018-06-07T16:20:44Z

tomorrow's night is ok, I have to complete the vignettes and that takes 1-2 hours :)

pachadotdev · 2018-06-09T17:11:42Z

@maelle it's ok now, I did all the steps from your feedback and just a few are explained why yes/no to it is better for this particular API.

maelle · 2018-06-09T17:19:59Z

Thanks, can you please answer with the explanations in this thread? 🙂

pachadotdev · 2018-06-09T18:39:54Z

Maelle' s initial feedback

Code

You use assign in functions instead of returning output. Please return output, you don't need to use assign here.
Why do you use rm inside a function? fixed, it was a bad try to make lighter function calls
There is a separate file containing imports (confusingly called exports wink). It'd be easier to see where functions are imported from if you used the syntax jsonlite::fromJSON. You could then remove the separate R code with imports. http://r-pkgs.had.co.nz/namespace.html#imports
Your package depends on purrr for a single function. Consider replacing it with do.call and rbind. To (over)simplify now I also use purrr's flatten df function
Regarding all the stopifnot, have informative error messages when the user entered a wrong parameter value.
You could have a single exported function instead of two if I follow correctly? It'd take arguments like the batch one, that could be vector of length 1 (the user would input say one or several country pairs). The other function would be an utility function, that wouldn't need to be exported. It might simplify the interface?
If there is only one function call in a pipeline it might be clearer to remove the pipe.

origin_destination <- origin_destination %>%
    mutate(trade_exchange_val = 
!!sym("export_val") + !!sym("import_val"))

would become

origin_destination <-  mutate(origin_destination,
    trade_exchange_val = !!sym("export_val") + !!sym("import_val"))

In case you might ignore it, and in case it could be useful, there's a countrycode package. the OEC uses an ISO modification, so I'm not so sure about it

Data

Where do the data files in data come from? Why are they pre-prepared, are you sure they won't be updated? everything in data/ comes from atlas.media.mit.edu and corresponds to unaltered trade codes (i.e HS96 was created not to touch HS92 created in 1992, HS02 was an update to HS96, etc so
these tables do not change
You could keep their data processing in data-raw. ok, I'll add a vignette to explain the changes

Documentation

Please add the URL to the data source.
Are you allowed to use the logo in your README? yes, I' m a part of the original team
A flowchart of what processing your package does might be useful to explain the added value of using the package and to explain what has been preprocessed or not (what's being downloaded vs what's in data/).
In the README when mentioning the vignette have a link to the pkgdown website.
Please add @return fields to the documentation of the functions.
A vignette with an actual use case, and visualizations, would be nice (or an extension of the existing vignette).
Please explain what the data is e.g. SITC or add an external link.
Because some chunks of the vignette are unevaluated without a pre-generated output, I can't understand what the "problem" is. You could pre-generate the output.

Misc

Could you rename the repo "oec" like the package? travis also renamed
The "Maintainer" field in DESCRIPTION is useless because it'll be generated automatically from Authors.
What are the files .Rapp.history in data/ and .build.timestamp in vignettes/? temp files from R CMD, ok now
Please add a code coverage integration and badge (via e.g. usethis::usethis::use_coverage())

maelle · 2018-06-10T06:31:05Z

Thanks a lot @pachamaltese! 😺 I started looking and two small comments already:

The URL to the pkgdown site should be updated now that the repo has a new name.
You don't export the batch function any more but it's still present in the website docs at least cf http://pacha.hk/oec/reference/index.html

maelle · 2018-06-10T06:36:55Z

The link to the logo is also broken at the moment.

maelle · 2018-06-10T12:26:27Z

Thanks again @pachamaltese!

A few more points before I start looking for reviewers

The URL in the repo description should be updated.
The documentation website has to be updated (it still documents the function that was removed)
The logo doesn't appear any more in the README.
You can now remove my feedback from the README 😉 and have a more prominent link to the pkgdown website in it.
You can now add a peer-review badge to your README

[![](https://badges.ropensci.org/217_status.svg)](https://github.com/ropensci/onboarding/issues/217)

It'll indicate the package is under review, until after onboarding when it'll then indicate it was peer-reviewed.

Please add a code of conduct to your repo, e.g. via usethis::use_code_of_conduct()
Rbuildignore data-raw e.g. via usethis::use_build_ignore("data-raw")
The codecov badge has a wrong URL.
goodpractice output

It is good practice to

  ✖ write unit tests for all functions, and all package code in
    general. 89% of code lines are covered by test cases.

    R/getdata.R:44:NA
    R/getdata.R:47:NA
    R/getdata.R:50:NA
    R/getdata.R:55:NA
    R/getdata.R:59:NA
    ... and 13 more lines

  ✖ omit "Date" in DESCRIPTION. It is not required and it gets
    invalid quite often. A build date will be added to the package when you
    perform `R CMD build` on it.
  ✖ avoid long code lines, it is bad for readability. Also, many
    people prefer editor windows that are about 80 characters wide. Try make
    your lines shorter than 80 characters

    R\getdata.R:47:1
    R\getdata.R:50:1
    R\getdata.R:59:1
    R\getdata.R:64:1
    R\getdata.R:69:1
    ... and 5 more lines

  ✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...), 1:NROW(...) and
    1:NCOL(...) expressions. They are error prone and result 1:0 if the
    expression on the right hand side is zero. Use seq_len() or seq_along()
    instead.

    R\getdata.R:111:32
    R\getdata.R:174:26
    R\getdata.R:200:25

I think that you could easily increase code coverage, looking at https://codecov.io/gh/pachamaltese/oec/src/master/R/getdata.R you should

test that the function returns errors when it should
test more use cases (e.g. when destination == "all")

Thanks for all your work!

pachadotdev · 2018-06-10T14:21:18Z

Great, thanks ! I'll do that before 14.00 And sorry I didn't notice I broke links when renaming

…

On Sun, Jun 10, 2018, 8:26 AM Maëlle Salmon ***@***.***> wrote: Thanks again @pachamaltese <https://github.com/pachamaltese>! A few more points before I start looking for reviewers - The URL in the repo description should be updated. - The documentation website has to be updated (it still documents the function that was removed) - The logo doesn't appear any more in the README. - You can now remove my feedback from the README 😉 and have a more prominent link to the pkgdown website in it. - You can now add a peer-review badge to your README [![](https://badges.ropensci.org/217_status.svg)](https://github.com/ropensci/onboarding/issues/217) It'll indicate the package is under review, until after onboarding when it'll then indicate it was peer-reviewed. - Please add a code of conduct to your repo, e.g. via usethis::use_code_of_conduct() - Rbuildignore data-raw e.g. via usethis::use_build_ignore("data-raw") - The codecov badge has a wrong URL. - goodpractice output It is good practice to ✖ write unit tests for all functions, and all package code in general. 89% of code lines are covered by test cases. R/getdata.R:44:NA R/getdata.R:47:NA R/getdata.R:50:NA R/getdata.R:55:NA R/getdata.R:59:NA ... and 13 more lines ✖ omit "Date" in DESCRIPTION. It is not required and it gets invalid quite often. A build date will be added to the package when you perform `R CMD build` on it. ✖ avoid long code lines, it is bad for readability. Also, many people prefer editor windows that are about 80 characters wide. Try make your lines shorter than 80 characters R\getdata.R:47:1 R\getdata.R:50:1 R\getdata.R:59:1 R\getdata.R:64:1 R\getdata.R:69:1 ... and 5 more lines ✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...), 1:NROW(...) and 1:NCOL(...) expressions. They are error prone and result 1:0 if the expression on the right hand side is zero. Use seq_len() or seq_along() instead. R\getdata.R:111:32 R\getdata.R:174:26 R\getdata.R:200:25 I think that you could easily increase code coverage, looking at https://codecov.io/gh/pachamaltese/oec/src/master/R/getdata.R you should - test that the function returns errors when it should - test more use cases (e.g. when destination == "all") Thanks for all your work! — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#217 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AJn6Oa6i5RC86vR5agDIBsMCxfApsP9-ks5t7RB6gaJpZM4T-I-i> .

maelle · 2018-06-10T14:27:19Z

You're welcome, no hurry and no problem!

maelle · 2018-06-10T14:28:08Z

Note that only one vignette appeared on the pkgdown website last time I looked

pachadotdev · 2018-06-10T20:40:58Z

@maelle ready, I used Jim Hester's lintr package for this 2nd round of feedback. Thanks a lot for the useful comments !!

maelle · 2018-06-11T04:27:44Z

Many thanks, I'll now start looking for reviewers!

maelle · 2018-06-13T06:37:59Z

Thanks @aedobbyn and @cimentadaj for accepting to review this package! 😸

Your reviews are due on 2018-07-04. As a reminder here are links to our bookdown in particular the packaging guidelines and reviewer's guide and to the review template.

cimentadaj · 2018-08-23T15:41:51Z

Will do this weekend!

aedobbyn · 2018-08-24T15:38:48Z

Looking good @pachamaltese!

Here are a few notes. I hope to be able to to look through more thoroughly this weekend but in case I'm not able to I wanted to give you something. By topic, in no particular order:

get_data

In get_data, the warnings No encoding supplied: defaulting to UTF-8. occur during the parse() call
- This warning isn't very useful to the user (since they can't do anything about it) and there could potentially be a lot of them since you have one call per year. I'd either set the encoding explicitly using parse(encoding = "UTF-8") or suppress the message
In read_from_api you have an attempts_left argument but there's no way for the user to change this. Not required but you could think about including this as a user-defined variable in get_data
This is super nitpicky but the processing data message is a little misleading because it seems like a progress bar when really it messages all the years upfront. That means that if there were a problem with a certain year these messages wouldn't help you pin down which year failed. Maybe something like sprintf("\nProcessing %s to %s data...", range(years)[1], range(years)[2]) would be better. Otherwise, you could include the message of which year we're on in the read_from_api function.

get_countrycode

The error message for country code that doesn't exist could be moe informative than length(countrycode) != 0 is not TRUE
You might want to include a mapping from "us" to "united states", "uk" to "united kingdom", and other similar situations since I can see people trying get_countrycode("us") and being surprised when they don't get a US result

Other

In the devtools install line, it should be devtools::install_github("pachamaltese/oec")
You shouldn't need to prefix package functions with oec:: (e.g. https://github.com/pachamaltese/oec/tree/64b185675e9faf1318f6cae10d16b75f97607423/R/get_data.R#L87), I think
I'm not sure what the correct protocol for the fixtures dir . Looks like you use it in tests/testthat/helper-oec.R to get vcr configurations. The presence of the extra directory does give you a NOTE on devtools::check(), so I think it should at least be buildignored
Should update the vignette titles (line 7 for both). On devtools::release_checks(): WARNING: placeholder 'Vignette Title' detected in 'title' field and/or 'VignetteIndexEntry' for: oec-data.Rmd,oec.Rmd
May want to update the year on your LICENSE and in the README to 2018
What was the rationale for keeping the tidyeval instead of removing it and using a globals.R file?

This is looking really solid overall -- will be excited to use it very soon!

pachadotdev · 2018-08-24T15:45:04Z

WHOA !! thanks a lot @aedobbyn !!!!!!!!!
I'm making changes this weekend. For now, the rationale for keeping the tidyeval was just to mimic some design decisions I like, for example those in the highcharter package by @jbkunst.

aedobbyn · 2018-08-24T16:16:42Z

No problem, glad it's helpful! And makes sense -- it doesn't affect the functionality of course and assuming rlang stays backward-compatible (which I think they will?) you should be fine if they change the tidyeval syntax.

pachadotdev · 2018-08-24T16:22:30Z

get_data

In get_data, the warnings No encoding supplied: defaulting to UTF-8. occur during the parse() call. This warning isn't very useful to the user (since they can't do anything about it) and there could potentially be a lot of them since you have one call per year. I'd either set the encoding explicitly using parse(encoding = "UTF-8") or suppress the message
In read_from_api you have an attempts_left argument but there's no way for the user to change this. Not required but you could think about including this as a user-defined variable in get_data
This is super nitpicky but the processing data message is a little misleading because it seems like a progress bar when really it messages all the years upfront. That means that if there were a problem with a certain year these messages wouldn't help you pin down which year failed. Maybe something like sprintf("\nProcessing %s to %s data...", range(years)[1], range(years)[2]) would be better. Otherwise, you could include the message of which year we're on in the read_from_api function.

get_countrycode

The error message for country code that doesn't exist could be moe informative than length(countrycode) != 0 is not TRUE
You might want to include a mapping from "us" to "united states", "uk" to "united kingdom", and other similar situations since I can see people trying get_countrycode("us") and being surprised when they don't get a US result

Other

In the devtools install line, it should be devtools::install_github("pachamaltese/oec")
You shouldn't need to prefix package functions with oec:: (e.g. https://github.com/pachamaltese/oec/blob/master/R/get_data.R#L87), I think
I'm not sure what the correct protocol for the fixtures dir . Looks like you use it in tests/testthat/helper-oec.R to get vcr configurations. The presence of the extra directory does give you a NOTE on devtools::check(), so I think it should at least be buildignored
Should update the vignette titles (line 7 for both). On devtools::release_checks(): WARNING: placeholder 'Vignette Title' detected in 'title' field and/or 'VignetteIndexEntry' for: oec-data.Rmd,oec.Rmd
May want to update the year on your LICENSE and in the README to 2018
What was the rationale for keeping the tidyeval instead of removing it and using a globals.R file?

aedobbyn · 2018-08-26T12:51:13Z

Thanks for implementing those changes so quickly! Last bit of suggestions from me, I think:

In your get_data @return documentation I would point people to your handy reference about what each of the columns means since a lot of these are non-obvious.
If you feel it's worth it (though I don't think strictly necessary), you could refactor out some of the redundancies in the get_data function. For instance, you use the same rename + mutate flow in each extraction, so this could be pulled out into a utility function. Similarly, the checks at the beginning of the function making sure that the years specified are in bounds for the classification could be stored in a lookup table. This way you only need one of these in get_data instead of five.
Right now you only allow years up to 2016, but in the coming years will the OEC have data past that year? Instead of updating this by hand is there a way to check what the greatest year available is, maybe on package load and store that as a variable?
I think in all cases you can replace single &s with &&s and |s with ||s
Small typo here ("lower of equal to")
Unnecessary line, I think
Future improvement for get_productcode would be a word_boundary boolean param for an exact word match. That would take care of the first result for get_productcode("wine") being "Live swine" 😆
Very small detail -- you might want to put the top_exporter_code and top_exporter cols next to each other in the output since they reference the same thing. Right now they're sometimes separated by a few columns (e.g. trade_exchange_val). Same for top_importer_code and top_importer
Should re-run pkgdown once new changes are implemented

pachadotdev · 2018-08-26T15:41:28Z

thanks a lot @aedobbyn !!
I still have some pendings that I am adding to this new list
btw, after all of this cool changes you suggested, can I contact you by email? it's for a related project that benefited a lot from this feedback
mine is mvargas &a t& dcc * uchile * cl

cimentadaj · 2018-08-26T15:47:22Z

Great job @pachamaltese. I think @aedobbyn has covered most of the major problems. Here are some minor concerns.

https://github.com/pachamaltese/oec/blob/master/R/get_countrycode.R#L51-L52 should say 'vector' rather than package.
Perhaps you can move this outside the main function so it's cleaner.
In general, I would create tests that check that the content of the request resembles what you requested and not only the format of the tibble. I would add a few tests checking whether get_data returned the requested country as well as for the different years. Maybe this applies to some topics as well. I just had a nasty bug in one of my packages related to this and it was awful!
Here I would check whether it has a 'try-error' to be safe. This makes sense because you're using try anyways.

Hope these are useful!

pachadotdev · 2018-08-28T01:33:24Z

@cimentadaj once again thanks for the feedback !! I completed all the points that Amanda listed (above I wrote a clickable list that I completed)

the server is down again, please give 1-2 days to complete Jorge's addresses?

@cimentadaj: any chance to contact you by email? mine is " m vargas a/t dcc uchile cl"

cimentadaj · 2018-08-28T06:01:59Z

Yes @pachamaltese , feel free to contact me and no worries about the 1-2 days.

pachadotdev · 2018-09-05T17:01:53Z

@aedobbyn @cimentadaj thanks a lot!!
I had hectic days at the office, here are some changes that I'll try to complete ASAP

In your get_data @return documentation I would point people to your handy reference about what each of the columns means since a lot of these are non-obvious.
If you feel it's worth it (though I don't think strictly necessary), you could refactor out some of the redundancies in the get_data function. For instance, you use the same rename + mutate flow in each extraction, so this could be pulled out into a utility function. Similarly, the checks at the beginning of the function making sure that the years specified are in bounds for the classification could be stored in a lookup table. This way you only need one of these in get_data instead of five. ok , I added a wrapper to tidy IDs, probably more wrappers coming during the week
Right now you only allow years up to 2016, but in the coming years will the OEC have data past that year? Instead of updating this by hand is there a way to check what the greatest year available is, maybe on package load and store that as a variable? there's no way with the current API without loading "trash" with the year value (for future work, sure!)
I think in all cases you can replace single &s with &&s and |s with ||s* ok, modified to use correct 1st element comparison
Small typo here ("lower of equal to")
Future improvement for get_productcode would be a word_boundary boolean param for an exact word match. That would take care of the first result for get_productcode("wine") being "Live swine" laughing Added to issues get_productcode pachadotdev/oec#22
Very small detail -- you might want to put the top_exporter_code and top_exporter cols next to each other in the output since they reference the same thing. Right now they're sometimes separated by a few columns (e.g. trade_exchange_val). Same for top_importer_code and top_importer noted, thanks!! I moved that and others columns to a more "logical" order
Should re-run pkgdown once new changes are implemented
https://github.com/pachamaltese/oec/blob/master/R/get_countrycode.R#L51-L52 should say 'vector' rather than package. thanks!! totally re-phrased now
Here I would check whether it has a 'try-error' to be safe. This makes sense because you're using try anyways. https://github.com/pachamaltese/oec/blob/master/R/get_data.R#L145
Unnecessary line, I think (https://github.com/pachamaltese/oec/blob/73f4a7493aa649311f88f60dcbd0cfd4a203c895/R/get_data.R#L97)* Without it the checks fail
Perhaps you can move this outside the main function so it's cleaner. https://github.com/pachamaltese/oec/blob/master/R/get_data.R#L112-L161
In general, I would create tests that check that the content of the request resembles what you requested and not only the format of the tibble. I would add a few tests checking whether get_data returned the requested country as well as for the different years. Maybe this applies to some topics as well. I just had a nasty bug in one of my packages related to this and it was awful!

pachadotdev · 2018-09-12T20:17:14Z

@aedobbyn @cimentadaj Hi. Thanks for the feedback! there are four point I can't solve without breaking the check :S
This weekend I' ll try to solve https://github.com/pachamaltese/oec/blob/master/R/get_data.R#L112-L161 as it is totally cleaner with that change that without

maelle · 2018-09-27T16:54:05Z

👋 @pachamaltese did you get any chance to make progress on the package? When you do, please summarize your response again (using your checklist again I guess 😉).

pachadotdev · 2018-09-27T17:11:06Z

@maelle Hi, I was not able to test a lot because the server has been down a lot actually.
I updated the list to reflect the only 3 point I could not fix. Those can be more tricky actually!

maelle · 2018-10-16T04:25:45Z

@pachamaltese any update?

pachadotdev · 2018-10-16T14:17:19Z

Hi Maelle The server is just acting weird. Can we put this on hold? Yesterday I spoke to Amanda and Jorge about providing an alternative API that would be hassle-free. Best

…

On Tue, Oct 16, 2018, 1:25 AM Maëlle Salmon ***@***.***> wrote: @pachamaltese <https://github.com/pachamaltese> any update? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#217 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AJn6OSLbH9Nksd8-NGk4wq6cnnhuxbxjks5ulV_KgaJpZM4T-I-i> .

maelle · 2018-10-16T14:56:29Z

Do you mean you don't trust the data provider enough with stability?

pachadotdev · 2018-10-16T15:00:01Z

Exactly, at the present time that' the problem. Last year it was different

…

On Tue, Oct 16, 2018, 11:56 AM Maëlle Salmon ***@***.***> wrote: Do you mean you don't trust the data provider enough with stability? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#217 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AJn6Oef7UB08vWE37dgCVnvIWxC0SVTIks5ulfOrgaJpZM4T-I-i> .

maelle · 2018-10-16T15:05:55Z

Ok, too bad, will put the submission on hold, I hope you can solve the problem, good luck!

maelle · 2018-12-06T10:00:11Z

@pachamaltese 👋, any news?

pachadotdev · 2018-12-06T12:44:24Z

Hi Maelle Sorry about the long silence! I have a tentative PostgreSQL database and now I only have to work on the API. The good part is that both Jorge and Amanda are collaborating on this. If you also want to participate please do not hesitate to give ideas to create a better API. Best! ————— *Mauricio Vargas Sepúlveda 帕夏* Do you like Data Science? visit pacha.hk 你爱科学数据专吗？你走pacha.hk El jue., 6 de dic. de 2018 a la(s) 07:00, Maëlle Salmon ( notifications@github.com) escribió:

…

@pachamaltese <https://github.com/pachamaltese> 👋, any news? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#217 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AJn6OSeG4jl5GGT2v8Rmexs1HF1fz3Llks5u2OqsgaJpZM4T-I-i> .

maelle · 2018-12-06T13:22:10Z

Cool to read! Is there a public repo we can link from here in case someone reads the thread and gets curious?

pachadotdev · 2018-12-06T13:52:41Z

sure !! I have created this organization to make my last 3 years of work fully reproducible https://github.com/tradestatistics ————— *Mauricio Vargas Sepúlveda 帕夏* Do you like Data Science? visit pacha.hk 你爱科学数据专吗？你走pacha.hk El jue., 6 de dic. de 2018 a la(s) 10:22, Maëlle Salmon ( notifications@github.com) escribió:

…

Cool to read! Is there a public repo we can link from here in case someone reads the thread and gets curious? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#217 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AJn6ORf8VRizQyg9eFGZAlW8UEnD0ffUks5u2RoDgaJpZM4T-I-i> .

pachadotdev · 2019-01-02T21:40:53Z

@maelle Hi!!
Sorry for the very long silence. After all the issues with the "server is down" and whatever, I ended up creating my own project.

A new package based on oec is now available here https://github.com/tradestatistics/r-package

But in broader terms, tradestatistics.io is something new I made with a help from the awesome community and everything is pure R (even the API, which is where the new tradestatistics package takes the data)

There are tons of changes such as the ability to inspect different tables, filter by commodity code length, etc provided I made my own PSQL DB.

Probably I am using a laser beam to open a tuna can, but anyways I've tried do to something to foster reproducibility.

Do you have ideas to rename the package? tradestatistics sounds ok, but maybe tsapi is more appropiate.

Least but not last, api.tradestatistics.io/friend is a (boring?) joke I made with the API. It should be easy to complete the joke with the correct query.

maelle · 2019-01-03T17:41:40Z

👋 @pachamaltese! Happy New Year! Congrats on the project (and fun endpoint)!

Should we close this issue? And when your new package is ready you can open a pre-submission inquiry (if it does the exact same things at the package using the broken server used to do, it should be in scope, but better asking first).

I find tradestatistics a better name because tsapi can be misleading since ts is often an abbreviation for time series.

pachadotdev · 2019-01-03T17:43:00Z

@maelle ok !! so I'm closing this and opening a new submission
I'll prepare the new submission during the next hour

maelle · 2019-01-03T17:43:51Z

I'll prepare the new submission during the next hour

Maybe best open a pre-submission inquiry and wait until the package is more mature until a full submission (unless the package is mature already).

noamross assigned maelle May 14, 2018

maelle added package topic:data-access 1/editor-checks labels May 15, 2018

maelle added 2/seeking-reviewer(s) and removed 1/editor-checks labels Jun 11, 2018

maelle added the 3/reviewer(s)-assigned label Jun 13, 2018

maelle added the holding label Oct 16, 2018

pachadotdev closed this as completed Jan 3, 2019

mpadge mentioned this issue Jan 16, 2019

tradestatistics package #274

Closed

25 tasks

oec package #217

oec package #217

Comments

pachadotdev commented May 14, 2018 • edited Loading

Summary

Requirements

Publication options

Detail

maelle commented May 15, 2018

maelle commented May 15, 2018

pachadotdev commented May 15, 2018 • edited Loading

pachadotdev commented May 19, 2018

maelle commented May 24, 2018

pachadotdev commented May 24, 2018

maelle commented May 24, 2018 • edited Loading

Editor checks:

Editor comments

Code

Data

Documentation

Misc

goodpractice output

maelle commented Jun 7, 2018

pachadotdev commented Jun 7, 2018

maelle commented Jun 7, 2018

pachadotdev commented Jun 7, 2018

pachadotdev commented Jun 9, 2018

maelle commented Jun 9, 2018

pachadotdev commented Jun 9, 2018

Maelle' s initial feedback

Code

Data

Documentation

Misc

maelle commented Jun 10, 2018

maelle commented Jun 10, 2018

maelle commented Jun 10, 2018

pachadotdev commented Jun 10, 2018 via email

maelle commented Jun 10, 2018

maelle commented Jun 10, 2018

pachadotdev commented Jun 10, 2018

maelle commented Jun 11, 2018

maelle commented Jun 13, 2018 • edited Loading

cimentadaj commented Aug 23, 2018

aedobbyn commented Aug 24, 2018 • edited Loading

pachadotdev commented Aug 24, 2018

aedobbyn commented Aug 24, 2018

pachadotdev commented Aug 24, 2018 • edited Loading

aedobbyn commented Aug 26, 2018

pachadotdev commented Aug 26, 2018 • edited Loading

cimentadaj commented Aug 26, 2018

pachadotdev commented Aug 28, 2018 • edited Loading

cimentadaj commented Aug 28, 2018

pachadotdev commented Sep 5, 2018 • edited Loading

pachadotdev commented Sep 12, 2018

maelle commented Sep 27, 2018

pachadotdev commented Sep 27, 2018

maelle commented Oct 16, 2018

pachadotdev commented Oct 16, 2018 via email

maelle commented Oct 16, 2018

pachadotdev commented Oct 16, 2018 via email

maelle commented Oct 16, 2018

maelle commented Dec 6, 2018

pachadotdev commented Dec 6, 2018 via email

maelle commented Dec 6, 2018

pachadotdev commented Dec 6, 2018 via email

pachadotdev commented Jan 2, 2019

maelle commented Jan 3, 2019

pachadotdev commented Jan 3, 2019

maelle commented Jan 3, 2019

pachadotdev commented May 14, 2018 •

edited

Loading

pachadotdev commented May 15, 2018 •

edited

Loading

maelle commented May 24, 2018 •

edited

Loading

maelle commented Jun 13, 2018 •

edited

Loading

aedobbyn commented Aug 24, 2018 •

edited

Loading

pachadotdev commented Aug 24, 2018 •

edited

Loading

pachadotdev commented Aug 26, 2018 •

edited

Loading

pachadotdev commented Aug 28, 2018 •

edited

Loading

pachadotdev commented Sep 5, 2018 •

edited

Loading