This repository has been archived by the owner on Jan 25, 2024. It is now read-only.

Feature Request: allow getOrder to take more than 50 order IDs #105

Open

pwrnrd opened this issue Dec 5, 2018 · 15 comments

@pwrnrd
Contributor

pwrnrd commented Dec 5, 2018

Current behaviour
When passing an array of more than 50 elements to mws.getOrder(), the following error occurs:

ValidationError: List parameter AmazonOrderId can only take up to 50 items

Preferred behaviour
Allow mws.getOrder() to accept an array of more than 50 elements, cut the array into chunks of at most 50 elements, and make a getOrder request per chunk. It would be nice if, when making a lot of requests, these requests also automatically dealt with throttling (perhaps that's already the case given #98).

Disadvantage
This is a clear deviation from the behaviour described by the MWS API.

Advantage
Allow users to work with getOrder() as if there were no limits.

Alternative
Perhaps we can create another function (e.g. getMultipleOrders()) that implements the preferred behaviour.

@ericblade
Owner

I'm guessing, since it's got a max request quota of 6 and it restores at 1 per minute (wow), that it doesn't work like the Products functions, quota-wise. Over in Products, if you send multiple SKUs or ASINs, each product looked up counts against your quota, regardless of whether you split them into multiple requests or send them all in the same one.

My first feelings on this are a bit conflicted. One, it'd be great to have a utility function that can queue up a ton of virtually identical requests, like a dozen GetOrder calls with 50 different orders each, patiently wait for all of their results, and return them in a sensible fashion. Two, however, that's probably out of the scope of this library.

I don't think changes in #98 affect this too much -- with #98 if you spam past the quota, it'll back off for a minute and a half before retrying, instead of 6 minutes.

I think this'd take implementing proper queueing and throttle handling to make sense as part of this lib. The throttle handling right now is too bare-bones: it just lets you throw as much as you want at it, then retries it all at a fixed interval, which isn't good behavior.

I'd suggest, for right now, writing a quick wrapper that splits the incoming order list into chunks of 50, then sends off each getOrder request in series (something like the sketch below). You could probably do it in parallel as long as you know your ID isn't doing any other getOrder calls elsewhere... but you don't want to spam 12 requests at it and then have 6 of them retrying every minute and a half, then 5, then 4, until they all go through.
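Something along these lines, as a minimal sketch; the exact call shape (an AmazonOrderId array parameter, an array result) is an assumption here, not this library's confirmed API:

    // Hypothetical user-side wrapper: split the ID list into chunks of 50
    // and call getOrder once per chunk, in series, concatenating results.
    const CHUNK_SIZE = 50; // MWS limit on the AmazonOrderId list parameter

    async function getOrderChunked(mws, orderIds) {
        const results = [];
        for (let i = 0; i < orderIds.length; i += CHUNK_SIZE) {
            const chunk = orderIds.slice(i, i + CHUNK_SIZE);
            // assumed call shape -- adjust to the real getOrder signature
            const orders = await mws.getOrder({ AmazonOrderId: chunk });
            results.push(...orders);
        }
        return results;
    }

Running the chunks in series keeps only one request in flight at a time, which stays well under the quota of 6 at the cost of total wall time.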

@ericblade
Owner

... note to self: a queue handler could be added at callEndpoint that accepts the known endpoint data (category, name, throttle info) and attempts to handle it based on the known maxInFlight and restoreRate in our metadata... argh, you've always got me thinking about things. :-D

The queue handler stores a map indexed by category/name, which contains an array of promises, a count of how many are in flight, and the last time that call broke quota. Whenever a promise succeeds, decrement the counter and send off another request. If a promise returns that it broke quota, halt that queue until a timer (restoreRate * 60 * 1000) clears it and currentlyInFlight < maxInFlight . . .

Sounds kind of sensible; something like the sketch below.
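To make that concrete, a rough sketch of the shape I mean; the class name, the isThrottle error marker, and treating restoreRate as restores-per-minute are illustrative assumptions, not the actual implementation:

    class QueueHandler {
        constructor() {
            this.queues = new Map(); // one queue per endpoint, keyed 'category/name'
        }

        queueFor(category, name, throttle) {
            const key = `${category}/${name}`;
            if (!this.queues.has(key)) {
                this.queues.set(key, { pending: [], inFlight: 0, halted: false, throttle });
            }
            return this.queues.get(key);
        }

        // request: a function returning a promise for a single API call
        add(category, name, throttle, request) {
            const q = this.queueFor(category, name, throttle);
            return new Promise((resolve, reject) => {
                q.pending.push({ request, resolve, reject });
                this.drain(q);
            });
        }

        drain(q) {
            while (!q.halted && q.pending.length && q.inFlight < q.throttle.maxInFlight) {
                const item = q.pending.shift();
                q.inFlight += 1;
                item.request().then((result) => {
                    q.inFlight -= 1;
                    item.resolve(result);
                    this.drain(q); // a slot freed up; send the next queued request
                }).catch((err) => {
                    q.inFlight -= 1;
                    if (err.isThrottle) { // assumed marker for a quota-exceeded error
                        q.pending.unshift(item); // requeue it, then halt the queue
                        q.halted = true;
                        setTimeout(() => {
                            q.halted = false;
                            this.drain(q);
                        }, q.throttle.restoreRate * 60 * 1000);
                    } else {
                        item.reject(err);
                    }
                });
            }
        }
    }

The key property is the one described above: a success frees a slot and sends the next queued request, while a quota break halts the whole queue until the restore timer fires.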

@ericblade
Owner

ericblade commented Dec 6, 2018

... guess I had the "I left a comment in my head" (or maybe on my laptop) thing too :-D

I thought quite a lot about this last night, and did some work to implement a simple request queue. I realized that this really does belong as an enhancement to this project, as I've already implemented a couple of similar items on the Products side. In Products, there are at least a couple of APIs where the Amazon side only accepts a single item, but the helper in the lib accepts an array and will make multiple API calls until it gets them all.

My hesitation is simply how long the restore rate on GetOrder is on the MWS side. It's fine for automated background data pulls, but hitting the request quota on GetOrder in a user-interactive situation is going to leave someone hanging for quite a while (for example, 600 order IDs means 12 chunks: with a max quota of 6 restoring at 1 per minute, six requests can go immediately and the rest trickle out over roughly six more minutes). So that should probably be documented: it'll chunk the list into 50-unit requests, and if that means multiple requests, it could take a while to complete.

So, yes, I would accept a pull req that updates getOrder to call GetOrder on chunks of 50 per request, compiles the results into a single result, and returns it. I probably wouldn't merge it until I'm a little happier with my request Queue implementation, though. The existing "throw requests at MWS and pray" approach doesn't lend itself well to multiple calls with a minute-long reset. It works out OK for an API that resets at 20 per second, though, which is where I had problems with quota breaking.

@ericblade
Owner

ericblade commented Dec 7, 2018

I'm not ready to accept this, but here's where I've started: ea44846

Each instance of MWSA creates a Queue instance for each service call (indexed by category/action). The Queue accepts requests, returning a promise that will eventually fulfill or reject once the request gets to run and either succeeds or fails for a reason other than throttling. The Queue slots right into the existing code where it was doing "results = requestPromise(...)", does its handling, and then calls requestPromise() for each queued request when it's ready to fire.

Since I'm explaining it, I'm realizing that there should also be a Queue Manager, and that the Queue should belong not to a specific MWSA instance but to a specific MWSA user (whichever SellerId is being used): it doesn't matter how many instances of MWSA you create, if they all have the same SellerId, then they all share the same quota limits (rough shape sketched below). Hooray for rubber-ducking to a comment box. :-)
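A minimal sketch of that per-SellerId sharing, reusing the hypothetical QueueHandler shape from earlier in the thread (names illustrative only):

    // One shared set of queues per SellerId, so multiple MWSA instances
    // using the same credentials also share the same quota queues.
    const queuesBySeller = new Map(); // SellerId -> QueueHandler

    function queuesFor(sellerId) {
        if (!queuesBySeller.has(sellerId)) {
            queuesBySeller.set(sellerId, new QueueHandler());
        }
        return queuesBySeller.get(sellerId);
    }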

So, anyway, it relies on the existing brute-force throw-it-at-the-wall-and-see-if-it-sticks method for retries, and that's not really acceptable, but if you do spam callEndpoint (and therefore all the helper functions) with more calls than the quota allows, it will auto-throttle them. So I need to get the Queue class to understand how to handle retrying without relying on the code outside it to do it, and then add a Queue manager that keeps track of Queues across multiple MWSA instances... anyway.

It's already a big improvement over what is in master.

@ericblade
Owner

So with the latest Queue update (assume throttling WILL occur if you fire off a lot, instead of waiting FOR it to occur), I'm able to fire off ~850 requests and have between 0 and 12 of them throttled. Not too shabby, although I feel like it should be able to be zero all the time, even without the headers telling when quota resets, so long as that Queue is the only one acting on that user...

850 requests completed in 171746 milliseconds, approx 202ms per request. GetMatchingProductForId resets at 5 items per second, which is 200ms per item.

@pwrnrd
Contributor Author

pwrnrd commented Dec 13, 2018

Hi @ericblade, I took a look at ea44846. I'm not sure whether I understand the behavior correctly.

Assume MWS has a "getStatus" endpoint which is called through MWSA. The endpoint has a limit of 60 requests per minute and a restore rate of 1 per second. Then, if we make, say, 600 requests, will this function loop over all 600 requests, fire off 10 using runQueue(), and call setThrottleTimer() for the remaining 590? Will it then, after waiting 1 second (the restore rate), run the cycle again over the remaining 590 requests, so it fires off 1 request and calls setThrottleTimer() again for the remaining 589?

EDIT: P.S. This class is a good read, nicely structured!

@ericblade
Owner

If it has a maxInFlight of 10, that sounds correct. It will send the first 10 right off, and then run the throttleTimer for those that remain. Once maxInFlight * restoreRate time passes, it'll release another 10 at a time (or however many slots are available, maxInFlight - inFlight). That's where it gets a little bit sticky and sometimes hits the throttle point at the server.

It could probably use a little more fiddling to adjust the average request time, but I'm pretty happy with how well it works throwing 900 getProductById requests down the pipe at once.

@ericblade
Owner

ericblade commented Dec 14, 2018

Oh, and there should only be one setThrottleTimer in action at a time; runQueue() should shift() the first element off the array and run it. If it hasn't been throttled, drainQueue() just fires as many requests as are in the queue at once, but when it's throttled, it tries to slow down to allow only one per restore-rate period.

@pwrnrd
Contributor Author

pwrnrd commented Dec 17, 2018

Hi Eric, I'm not sure whether the current code actually fires off once every restoreRate * maxInFlight. I think it's set to (this.restoreRate || 1) * 60 * 1000. Is that correct?

I'm also looking at the following:

while (this.queue.length && this.inFlight < this.maxInFlight) {
    this.runQueue();
}
if (this.queue.length >= this.maxInFlight) {
    this.setThrottleTimer();
}

Is it possible for us to change the if to if (this.queue.length)?

I'm not sure whether the current if can handle the special case of hitting maxInFlight just before running out of queue. For example, we might have 5 requests remaining on the queue with a maxInFlight of 10. Then, if we've just hit maxInFlight, we won't call this.setThrottleTimer(), and the remaining 5 requests might never be executed... Or am I seeing this the wrong way?

@ericblade
Owner

drainQueue took some more changing; it looks like this now:

    drainQueue() {
        if (this.queueTimer) {
            // console.warn('* ignoring drain request, waiting on throttle timer');
            return;
        }
        if (!this.queue.length) {
            // console.warn('* ignoring drain request, queue empty');
            return;
        }

        if (!this.singleDrain && this.queue.length > 1) {
            while (this.queue.length && this.inFlight < this.maxInFlight - 1) {
                this.runQueue();
            }
        } else {
            this.runQueue();
        }

        if (this.inFlight >= this.maxInFlight * 0.5) {
            this.throttle();
        }
        // console.warn('* drainQueue at end', this.queue.length, this.inFlight);
    }

In the version that you're looking at, that 'if' on the end is definitely wrong.

@pwrnrd
Contributor Author

pwrnrd commented Dec 30, 2018

Ah okay! Thank you!

@pwrnrd pwrnrd closed this as completed Dec 30, 2018
@ericblade
Owner

Did you implement the original request?

@ericblade ericblade reopened this Jan 1, 2019
@pwrnrd
Contributor Author

pwrnrd commented Jan 2, 2019

I implemented the most recent pull from the master branch.

@ericblade
Owner

So, that's all stuff that I think was good to do before extending getOrders, so now someone should fix up the getOrder helper to chunk the orders array into 50-unit pieces, send off multiple requests, and put the results back together when they come in :-D

@pwrnrd
Contributor Author

pwrnrd commented Jan 2, 2019

Sure man, happy to help! Just assign it to me. I don't know when I'll have time to look at it, though... Kind of related to this is a helper for requestAndDownloadReport(). I wrote a helper that splits a listings report request into one request per marketplace. From the MWS docs:

If no MarketplaceId value is provided, reports that are not Listings Reports show all marketplaces the seller is registered in. You must specify a MarketplaceId value for Listings Reports.

Quote from: https://docs.developer.amazonservices.com/en_US/reports/Reports_UsingMultipleMarketplaces.html

I'll create a new issue for this and provide the simple helper function that I wrote. That helper would also benefit from stitching the results back together, so perhaps we could think of a general function that can perform this kind of stitching (something like the sketch below).
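As a starting point, a generic chunk-and-stitch sketch; the name and call shape are hypothetical, not part of the library:

    // Hypothetical generic helper: split `items` into chunks, run requestFn
    // on each chunk in series (friendlier to quotas), and flatten the
    // per-chunk results back into a single array.
    async function chunkedRequest(items, chunkSize, requestFn) {
        const results = [];
        for (let i = 0; i < items.length; i += chunkSize) {
            results.push(...await requestFn(items.slice(i, i + chunkSize)));
        }
        return results;
    }

With that shape, the getOrder case becomes chunkedRequest(orderIds, 50, ids => mws.getOrder({ AmazonOrderId: ids })), and the one-request-per-marketplace report case is the same idea with a chunk size of 1 over the marketplace list.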
