Make a staticman app instead of staticmanapp to avoid reaching quotas #243

Closed
robinmetral opened this issue Dec 11, 2018 · 68 comments

@robinmetral

The public instance of Staticman is in trouble: API calls now regularly hit GitHub's rate limit of 5,000 requests per hour per user.

The issue is that every comment on any site using this instance goes through the staticmanapp user, so staticmanapp easily exhausts its API quota.

This causes the problems that have been repeatedly mentioned in #227, #222, and #242, for example.

I got in touch with the GitHub staff about this and here's their recommendation:

instead of using a single account to make all those requests, you should build an app so that your users can authorize/install the app. That way, the rate limits will scale much better -- for OAuth Apps, each user has their own quota and for GitHub Apps each installation has its own quota. So, your total rate limit would scale with the number of users instead of being static, which is what you want -- you want the limits to grow with your userbase.

@eduardoboucas you know your software - would this be possible?

robinmetral changed the title from "Make a staticman app instead of hitting the Github API to avoid hitting quotas" to "Make a staticman app instead of staticmanapp to avoid reaching quotas" on Dec 11, 2018
@gourav

gourav commented Dec 12, 2018

@robinmetral
I am in the process of creating a GitHub App that tries to work like Staticman. It is still in development and is in a private repository. I can extend an invitation if you want to look or contribute.

I expect it to be completed by the end of this month.

@robinmetral
Author

Good to hear @Erised!
Let me know when it's ready, I can help with testing 🙂

@gourav

gourav commented Dec 12, 2018

@robinmetral Sure. I will let you know.

@casually-creative

casually-creative commented Dec 14, 2018

Just a friendly wake-up call to @eduardoboucas. Staticman is awesome, but right now, it's hitting its limits and people are developing alternatives. Please find the time to develop a solution so we can continue using this fantastic app without this very annoying limitation.

@eduardoboucas
Owner

Hi everyone. I'm sorry that some people are frustrated with the project, I can relate to that. But please remember that the code is fully open-source, which means anyone can simply run their own instance, with their own GitHub account, and bypass all these limitations. The issue we're seeing here is really a problem with the free, public instance I decided to host for everyone. In hindsight, this probably wasn't the best of ideas, because it puts pressure on me to be the sole gatekeeper of this service.

I've not abandoned the project and I'm thinking of solutions to solve both problems: the problem that a single GitHub account acting on behalf of everyone isn't maintainable, and the problem that in the current scheme of things, where the public instance that I run is the centrepiece of the project, I represent a bottleneck.

For example, I'm keen on the idea of rebuilding Staticman as a Netlify function, so that effectively everyone is running their own instance (for free) rather than relying on a centralised service (which I'm covering the costs for). In this scenario, people would provide access to their own GitHub account, which commits would be made from, thus removing the issue of quota limits.

All I can say is that your patience is much appreciated and I'd love to hear everyone's thoughts on how we can make this more manageable for everyone.

@gourav

gourav commented Dec 14, 2018

@eduardoboucas The best solution, in my opinion, would simply be a GitHub App.
GitHub Apps also have rate limits, but they apply per installation.

If repo A and repo B each install the app on their repos, each gets 5k hits per hour.
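
To make the scaling argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the 5,000 requests per hour figure quoted in this thread applies independently to each installation. The function name and the example installation count are made up for illustration.

```python
# Hourly quota quoted in this thread (per user, and per app installation).
REQUESTS_PER_HOUR = 5000


def aggregate_hourly_budget(installations: int,
                            per_installation: int = REQUESTS_PER_HOUR) -> int:
    """Total requests per hour available across all installations.

    Assumption: every installation gets its own independent quota.
    """
    if installations < 0:
        raise ValueError("installations must be non-negative")
    return installations * per_installation


# Single shared account (the current staticmanapp setup): one quota for everyone.
shared_account_budget = REQUESTS_PER_HOUR

# Hypothetical example: 200 repositories each install the app.
app_budget = aggregate_hourly_budget(200)

print(shared_account_budget)
print(app_budget)
```

With a shared account the budget stays fixed however many sites join, whereas the per-installation budget grows linearly with the number of installations.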
I have already started working on this, and I will present it here when it is presentable.

Till then, I am also open to any discussion on how we can make it work for everybody.

@robinmetral
Author

Thanks for the updates @eduardoboucas ! 🙌

A Netlify function would be great as a simpler solution for people wanting to self-host their instance.

However, one of the things I love about Staticman is how simple it is to set up using only a GitHub repo!

I think that many people would benefit from such a "centralized" service, be it for testing or to allow comments on small websites and blogs (GH Pages, for example), where setting up Netlify + AWS Lambda to run a self-hosted instance seems like a lot of trouble.

In this case a GitHub App would probably be the best solution! @Erised can you consider making your working repo public so that we can take a look and maybe contribute? 🙂

@gourav

gourav commented Dec 15, 2018

@robinmetral Please give me a day or two.

I want to finish a few tasks by myself.
I don't want myself to be an embarrassment as this is my first project.
I just need it to be presentable with a few features. Then I will work on it with everyone.

@eduardoboucas
Owner

I don't want myself to be an embarrassment as this is my first project.

No reason to feel embarrassed at all! We all appreciate the effort you're putting in. Whenever you feel comfortable showing your code, I'm happy to review and help you change anything that needs tweaking.

@gourav

gourav commented Dec 15, 2018

@eduardoboucas Thank you. It means a great deal. I am looking over Staticman's code all the time to see how everything works.
I will be sure to let you know.

@maciek134
Contributor

@Erised if you need any help I'd be happy to assist as well.

@casually-creative

Thanks for your reply @eduardoboucas. Staticman running in a netlify function sounds like a very good idea. I'm looking forward to you testing this out and, if found feasible, giving us feedback on how to set it up for ourselves. Keep up the good work :D

@rliebling

rliebling commented Dec 29, 2018

If I understand things correctly, Netlify provides a way to translate form submissions into Functions (AWS Lambda events). However, the free plan limits this to 100 form submissions per month (even though the Functions limit for the free plan is much higher). Just FYI.

Meanwhile, if I understand things correctly (and I may not), the /connect controller invokes the GET /user/repository_invitations API method (https://developer.github.com/v3/repos/invitations/#list-a-users-repository-invitations), which returns only the first 30 invitations by default. I do not see any handling of pagination (although possibly it's built into the GitHub client library you are using by default, or I am missing some setup someplace). Thus, many calls to connect are likely to fail with Invitation Not Found if there are more than 30 queued up. And their continuing retry attempts help exhaust the API limits.

If the above analysis is correct a few things would greatly help:

  1. Set the page size to 100 (the max GitHub allows).
  2. Handle pagination. If the API returns them in FIFO order, then perhaps starting from the tail would help, which really means: get the first page, then jump to the last page (the link should be provided in the API response) and work toward the front until found. This would favor recent invitations.
  3. Possibly, when traversing the list of invitations, reject all those older than N hours.
  4. Cache the list of invitations for N minutes to reduce API hits to GitHub. I suspect that if you reject the old invitations, the current list at any given time should be maintainable at <100 (so a single page).

Sorry I don't have time to do it myself, but maybe someone else can contribute improvements here. My suggestion would be to start with just setting the per_page parameter and rejecting stale invitations (to unclog things and keep them unclogged). Then pagination support and caching probably become unnecessary. Although users with pending invitations who have not yet given up will have to re-invite, at least they will likely be successful at that time.
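
The suggestions above (larger page size, pagination, rejecting stale invitations) could be sketched roughly as follows in Python. This is not Staticman's code. `fetch_page` is a hypothetical stand-in for a call to GET /user/repository_invitations with `page` and `per_page` parameters, and the `created_at` and `repository.full_name` fields are assumed to be present on each invitation object.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, Iterator, List, Optional

Invitation = Dict[str, Any]
# fetch_page(page, per_page) -> list of invitation dicts (hypothetical helper
# wrapping GET /user/repository_invitations?page=...&per_page=...).
FetchPage = Callable[[int, int], List[Invitation]]

PER_PAGE = 100  # the maximum page size mentioned above


def iter_invitations(fetch_page: FetchPage,
                     per_page: int = PER_PAGE) -> Iterator[Invitation]:
    """Yield every pending invitation, following pages until a short page."""
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        yield from batch
        if len(batch) < per_page:  # a short (or empty) page is the last one
            return
        page += 1


def find_fresh_invitation(fetch_page: FetchPage,
                          repo_full_name: str,
                          max_age: timedelta,
                          now: Optional[datetime] = None) -> Optional[Invitation]:
    """Return the invitation for repo_full_name, ignoring stale ones."""
    now = now or datetime.now(timezone.utc)
    for invitation in iter_invitations(fetch_page):
        # Assumed timestamp format: ISO 8601 with a trailing "Z".
        created = datetime.fromisoformat(
            str(invitation["created_at"]).replace("Z", "+00:00"))
        if now - created > max_age:
            continue  # suggestion 3: skip invitations older than N hours
        if invitation["repository"]["full_name"] == repo_full_name:
            return invitation
    return None
```

Passing the page fetcher in as a callable keeps the sketch independent of any particular GitHub client library and makes it easy to exercise with fake data.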

Update: It was late when I originally posted this. While lying in bed I wondered why you don't just accept all invitations instead. Each time you getRepoInvites, just accept them all. This will keep the queue low. And if you schedule this to happen, say, each minute, then you won't need folks to hit the connect endpoint at all. I'm assuming the point of this was only to trigger the invite acceptance.
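
A minimal Python sketch of the "accept everything on each pass" idea follows. Both callables are hypothetical: `list_invitations` would wrap GET /user/repository_invitations and `accept_invitation` would wrap PATCH /user/repository_invitations/{invitation_id}. Scheduling (for example, once a minute) is left to whatever job runner is available.

```python
from typing import Any, Callable, Dict, Iterable, List


def accept_all_pending(
    list_invitations: Callable[[], Iterable[Dict[str, Any]]],
    accept_invitation: Callable[[int], None],
) -> List[int]:
    """Accept every pending invitation and return the IDs that succeeded."""
    accepted: List[int] = []
    for invitation in list_invitations():
        invitation_id = invitation["id"]
        try:
            accept_invitation(invitation_id)
        except Exception:
            # Broad catch for the sketch only: leave the invitation in the
            # queue so the next scheduled pass can retry it.
            continue
        accepted.append(invitation_id)
    return accepted
```

Because each pass drains the queue, the list should rarely grow beyond a single page, which is the effect described above.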

@maciek134
Contributor

@rliebling great analysis, I didn't even think about that. I'm more than happy to provide a PR for this.

@eduardoboucas
Owner

eduardoboucas commented Jan 12, 2019

Hi all.

I had a go at implementing Staticman as a GitHub App, which should fix many of the issues people are seeing at the moment. Can I ask for some volunteers to help me test it? Here's how:

  1. Remove staticmanapp as a collaborator
  2. Go to https://github.com/apps/staticman-net and install the application on your repository
  3. Submit a comment to the new v3 endpoint, using dev.staticman.net as the base URL – i.e. https://dev.staticman.net/v3/entry/github/[USERNAME]/[REPOSITORY]/[BRANCH]

Any help is much appreciated.
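
For anyone scripting a test submission, here is a hedged Python sketch of how the v3 URL and the form body might be assembled. The function names are made up. The optional trailing property segment (e.g. `comments`) and the `fields[...]` / `options[...]` naming are taken from requests posted elsewhere in this thread, not from Staticman's documentation. Actually sending the POST is left to the HTTP client of your choice.

```python
from typing import Mapping, Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://dev.staticman.net"  # development instance named above


def build_entry_url(username: str, repository: str, branch: str,
                    property_name: Optional[str] = None,
                    base_url: str = BASE_URL) -> str:
    """Build the v3 entry URL: /v3/entry/github/USER/REPO/BRANCH[/PROPERTY]."""
    parts = ["v3", "entry", "github", username, repository, branch]
    if property_name:
        parts.append(property_name)
    # Percent-encode each path segment so unusual names stay one segment.
    return base_url + "/" + "/".join(quote(part, safe="") for part in parts)


def encode_submission(fields: Mapping[str, str],
                      options: Optional[Mapping[str, str]] = None) -> str:
    """Encode a submission as an application/x-www-form-urlencoded body."""
    pairs = [("fields[%s]" % key, value) for key, value in fields.items()]
    pairs += [("options[%s]" % key, value) for key, value in (options or {}).items()]
    return urlencode(pairs)


if __name__ == "__main__":
    # Hypothetical user, repository and field values, for illustration only.
    print(build_entry_url("some-user", "some-repo", "master", "comments"))
    print(encode_submission({"name": "Jane", "message": "test comment"},
                            {"slug": "my-post"}))
```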

@rliebling

Hi @eduardoboucas

Great to hear the news from you!

Not sure if I've just done something wrong, but I'm getting 500 errors. I removed staticmanapp as a collaborator. I installed the staticman-net GitHub App with access to my repo. And I tried submitting a comment to https://dev.staticman.net/v2/entry/rliebling/my_blog/master/comments. After retrying a couple of times I tried curl'ing the /v2/connect/rliebling/my_blog endpoint and also got a 500 response.

Note that I'm using Hugo with the engimo theme, which has Staticman support built in, but I've never successfully used Staticman before (as my invitation was "not found"). Also, I'm testing using Hugo on my localhost. I assume the post-id stuff wouldn't cause the problem, but I mention it in case I'm wrong. The request that's failing is (as copied from the Chrome debugger, with user agent and cookies removed):

curl -i  'https://dev.staticman.net/v2/entry/rliebling/my_blog/master/comments' -H 'authority: dev.staticman.net' -H 'cache-control: max-age=0' -H 'origin: http://localhost:1313' -H 'upgrade-insecure-requests: 1' -H 'content-type: application/x-www-form-urlencoded' -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8' -H 'accept-encoding: gzip, deflate, br' -H 'accept-language: en-US,en;q=0.9' --data 'options%5BpostId%5D=dc33308b4aef09b40e86a6783d501abd&options%5Bredirect%5D=http%3A%2F%2Flocalhost%3A1313%2Fposts%2Fmore_on_tech_debt%2F%23submission-success&options%5BredirectError%5D=http%3A%2F%2Flocalhost%3A1313%2Fposts%2Fmore_on_tech_debt%2F%23submission-failure&fields%5Bhoneypot%5D=&fields%5Bpermalink%5D=%2Fposts%2Fmore_on_tech_debt%2F&fields%5Bparent_id%5D=&fields%5Bcontent%5D=test+comment&fields%5Bauthor%5D=rich&fields%5Bemail%5D=rliebling%40gmail.com&fields%5Bsite%5D=https%3A%2F%2Fexample.com' --compressed

@eduardoboucas
Owner

Thanks for the feedback. I’ll try to debug it tonight using your sample request and will come back with my findings.

Thank you all for your patience.

@eduardoboucas
Owner

@rliebling Your site doesn't seem to be configured properly. I don't see a configuration file for Staticman on https://github.com/rliebling/my_blog.

@rliebling

rliebling commented Jan 13, 2019

Ah! My bad! I had only configured it locally and never pushed to GH, as my invite had not been accepted. And I did this testing having forgotten all about that!

Sorry, I should have checked and figured this out myself.

I've fixed that now, however, and I'm still getting a 500 response. I don't want you having to go about debugging my config if you think that's likely the issue. I'll try looking at the code to understand better what it's doing. But one quick thing to confirm: if I configure path: "data/comments/{options.postId}" inside staticman.yml, does that directory/path need to already exist?

Note: I've also enabled commenting on my live site now, though it's not working yet (e.g. https://rich.liebling.us/posts/more_on_tech_debt/)

@eduardoboucas
Owner

@rliebling Your config is fine, it was an issue with an environment variable on the development instance. I've fixed it, tested again and it seems to be working.

You can see a submission here: rliebling/my_blog@e36206d

@rliebling

@eduardoboucas Thanks so much for this project, moving it to a github app, and for your help here!

@simonarnell

simonarnell commented Jan 13, 2019

Hi all.

I had a go at implementing Staticman as a GitHub App, which should fix many of the issues people are seeing at the moment. Can I ask for some volunteers to help me test it? Here's how:

  1. Remove staticmanapp as a collaborator
  2. Go to https://github.com/apps/staticman-net and install the application on your repository
  3. Submit a comment as usual, but use https://dev.staticman.net instead of https://api.staticman.net as the base URL.

Any help is much appreciated.

Thanks @eduardoboucas. This seems to be working great for me on my project. https://github.com/simonarnell/GDPRDPIAT

@eduardoboucas
Owner

I've updated the comment above to point to the new v3 endpoint. The idea is that people will carry on using v1 or v2 endpoints if they're using the legacy staticmanapp authentication method, whilst people that have installed the new GitHub App will use the v3 endpoints.

@pacollins

I don't have reCAPTCHA set but keep getting this error:

{"success":false,"message":"Missing reCAPTCHA API credentials","rawError":{"_smErrorCode":"RECAPTCHA_MISSING_CREDENTIALS"},"errorCode":"RECAPTCHA_MISSING_CREDENTIALS"}

@deadlydog
Contributor

I think the v3 Staticman app may be broken, but I'm not certain. See #294 for more details.

@pacollins

pacollins commented Jun 16, 2019 via email

@VincentTam
Contributor

I don't have reCAPTCHA set but keep getting this error:

{"success":false,"message":"Missing reCAPTCHA API credentials","rawError":{"_smErrorCode":"RECAPTCHA_MISSING_CREDENTIALS"},"errorCode":"RECAPTCHA_MISSING_CREDENTIALS"}

@pacollins That's a duplicate of #223.

@VincentTam
Contributor

VincentTam commented Jul 6, 2019

@cloudwheels

Hi, I appreciate you're working hard on this and this may just be a user config error, but I just get a 500 error back with no error message other than {success:false}.
My repo is at:

https://github.com/hortigraph/hortigraph.github.io/

The request / response is:

Request URL: https://dev.staticman.net/v3/entry/github/hortigraph/hortigraph.github.io/master/comments
Request Method: POST
Status Code: 500
Remote Address: 104.27.141.78:443
Referrer Policy: no-referrer-when-downgrade

Response headers:

access-control-allow-headers: Origin, X-Requested-With, Content-Type, Accept
access-control-allow-origin: *
cf-ray: 49bb3c136f74bbd8-LHR
content-length: 17
content-type: application/json; charset=utf-8
date: Sat, 19 Jan 2019 18:06:21 GMT
etag: W/"11-UIVUdQWNarX1D9mk06okyEMbpS8"
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
status: 500
via: 1.1 vegur
x-powered-by: Express

Request Headers:

Provisional headers are shown
Accept: */*
Content-Type: application/x-www-form-urlencoded
Origin: https://hortigraph.github.io
Referer: https://hortigraph.github.io/rhs/r-2101-2/
User-Agent: Mozilla/5.0 (X11; CrOS x86_64 11316.82.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.59 Safari/537.36

Form data:

fields[message]: comment
fields[name]: name
fields[email]: nigel@nigelwheeler.com
fields[url]:
options[slug]: r-2101-2
fields[hidden]:

/staticman.yml


comments:
  # (*) REQUIRED
  #
  # Names of the fields the form is allowed to submit. If a field that is
  # not here is part of the request, an error will be thrown.
  allowedFields: ["name", "email", "url", "message"]

  # (*) REQUIRED WHEN USING NOTIFICATIONS
  #
  # When allowedOrigins is defined, only requests sent from one of the domains
  # listed will be accepted. The origin is sent as part of the `options` object
  # (e.g. <input name="options[origin]" value="http://yourdomain.com/post1">)
  # allowedOrigins: ["yourdomain.com"]

  # (*) REQUIRED
  #
  # Name of the branch being used. Must match the one sent in the URL of the
  # request.
  branch: "master"

  commitMessage: "New comment by {fields.name}"

  # (*) REQUIRED
  #
  # Destination path (filename) for the data files. Accepts placeholders.
  filename: "comment-{@timestamp}"

  # The format of the generated data files. Accepted values are "json", "yaml"
  # or "frontmatter"
  format: "yaml"

  # List of fields to be populated automatically by Staticman and included in
  # the data file. Keys are the name of the field. The value can be an object
  # with a `type` property, which configures the generated field, or any value
  # to be used directly (e.g. a string, number or array)
  generatedFields:
    date:
      type: "date"
      options:
        format: "iso8601" # "iso8601" (default), "timestamp-seconds", "timestamp-milliseconds"

  # Whether entries need to be approved before they are published to the main
  # branch. If set to `true`, a pull request will be created for your approval.
  # Otherwise, entries will be published to the main branch automatically.
  moderation: true

  # Akismet spam detection.
  # akismet:
  #   enabled: true
  #   author: "name"
  #   authorEmail: "email"
  #   authorUrl: "url"
  #   content: "message"
  #   type: "comment"

  # Name of the site. Used in notification emails.
  # name: "Your Site"

  # Notification settings. When enabled, users can choose to receive notifications
  # via email when someone adds a reply or a new comment. This requires an account
  # with Mailgun, which you can get for free at http://mailgun.com.
  # notifications:
    # Enable notifications
    # enabled: true

    # (!) ENCRYPTED
    #
    # Mailgun API key
    # apiKey: ""

    # (!) ENCRYPTED
    #
    # Mailgun domain (encrypted)
    # domain: ""

  # (*) REQUIRED
  #
  # Destination path (directory) for the data files. Accepts placeholders.
  path: "_data/comments/{options.slug}" # "_data/comments/{options.slug}" (default)

  # Names of required fields. If any of these isn't in the request or is empty,
  # an error will be thrown.
  requiredFields: ["name", "email", "message"]

  # List of transformations to apply to any of the fields supplied. Keys are
  # the name of the field and values are possible transformation types.
  transforms:
    email: md5

  # reCaptcha
  # Register your domain at https://www.google.com/recaptcha/ and choose reCAPTCHA V2
  reCaptcha:
    enabled: false
    siteKey: #"6LdRBykTAAAAAFB46MnIu6ixuxwu9W1ihFF8G60Q"
    # Encrypt reCaptcha secret key using Staticman /encrypt endpoint
    # For more information, https://staticman.net/docs/encryption
    secret: #"PznnZGu3P6eTHRPLORniSq+J61YEf+A9zmColXDM5icqF49gbunH51B8+h+i2IvewpuxtA9TFoK68TuhUp/X3YKmmqhXasegHYabY50fqF9nJh9npWNhvITdkQHeaOqnFXUIwxfiEeUt49Yoa2waRR7a5LdRAP3SVM8hz0KIBT4="

In the reCAPTCHA section of your Staticman config, even with enabled: false, leaving the unused siteKey and secret keys uncommented can contribute to an error. Because everything after the # on those lines is a YAML comment, you've effectively passed the value null to these two attributes.

To avoid that you may either comment out the whole line (e.g. https://github.com/VincentTam/BJPubTest1)

  # reCaptcha
  # Register your domain at https://www.google.com/recaptcha/ and choose reCAPTCHA V2
  reCaptcha:
    enabled: false
    #siteKey: "6LdRBykTAAAAAFB46MnIu6ixuxwu9W1ihFF8G60Q"
    # Encrypt reCaptcha secret key using Staticman /encrypt endpoint
    # For more information, https://staticman.net/docs/encryption
    #secret: "PznnZGu3P6eTHRPLORniSq+J61YEf+A9zmColXDM5icqF49gbunH51B8+h+i2IvewpuxtA9TFoK68TuhUp/X3YKmmqhXasegHYabY50fqF9nJh9npWNhvITdkQHeaOqnFXUIwxfiEeUt49Yoa2waRR7a5LdRAP3SVM8hz0KIBT4="

or use an empty string "" instead of null. (e.g. Hugo Future Imperfect Slim)

  # reCaptcha
  # Register your domain at https://www.google.com/recaptcha/ and choose reCAPTCHA V2
  reCaptcha:
    enabled: false
    siteKey: ""
    # Encrypt reCaptcha secret key using Staticman /encrypt endpoint
    # For more information, https://staticman.net/docs/encryption
    secret: ""
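
The difference between the failing and working variants above can be checked on an already-parsed config. The Python sketch below is a hypothetical helper, not part of Staticman. It assumes the config has been loaded into a dict by a YAML parser, which would turn `siteKey: #"..."` into a key whose value is None, since the `#` starts a comment.

```python
from typing import Any, Dict, List


def null_recaptcha_keys(property_config: Dict[str, Any]) -> List[str]:
    """List reCaptcha keys that are present but null in a parsed config."""
    recaptcha = property_config.get("reCaptcha") or {}
    return [key for key in ("siteKey", "secret")
            if key in recaptcha and recaptcha[key] is None]


# Parsed form of the original config: both values end up as None.
failing = {"reCaptcha": {"enabled": False, "siteKey": None, "secret": None}}
# The two suggested fixes: keys commented out entirely, or empty strings.
commented_out = {"reCaptcha": {"enabled": False}}
empty_strings = {"reCaptcha": {"enabled": False, "siteKey": "", "secret": ""}}

print(null_recaptcha_keys(failing))
print(null_recaptcha_keys(commented_out))
print(null_recaptcha_keys(empty_strings))
```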

Reference:

  1. My experiment with Staticman's reCAPTCHA support in Staticman documentation enhancement daattali/beautiful-jekyll#514 (comment)
  2. My blog post about different types of Staticman errors
