Skip to content

joelkuiper/aplomb

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

47 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This is an alpha release. We are using it internally in production, but the API and organizational structure are subject to change. Comments and suggestions are much appreciated.

Introduction

“For every complex problem there is an answer that is clear, simple, and wrong.” - H. L. Mencken

This is such an answer. It is wrong because a more scalable, durable, and reliable solution would be to use a queue like RabbitMQ, Kafka, or similar. But sometimes quick and dirty does the trick, with minimal overhead.

Consider the following use case:

  1. You have an HTTP service that requires some time to process requests (think several minutes)
  2. These requests take considerable computational resources to process (e.g. 100% CPU), so the amount of parallelism is constraint
  3. You have a somewhat complicated non-asynchronous application that would like to make requests
  4. You have a client-side application where users will wait for responses, and you wish to relay progress information

Enter Aplomb.

aplomb |əˈplɒm| noun [ mass noun ] self-confidence or assurance, especially when in a demanding situation

Aplomb queues requests and dispatches them to upstream servers when they are available, proxies their responses, and has an HTTP callback API that can deliver status updates over WebSockets. Concretely, we use it to proxy requests to OpenCPU, an R analysis solution over HTTP.

API

MethodURIResponseDescription
POSTapi/submit202 AcceptedQueues the POST body, and will relay it to the first upstream core that becomes available, immediately returns a response of format [1]
GETresponse/:idUpstream HTTP responseBlocks (Comet/long-polling) until the response for that ID becomes available
GETresponse/:id/*Upstream HTTP responseProxies additional data associated with that ID [2]
GETstatus/:id/ws101 Web socket connectionWebSocket connection to listen for updates
PUTstatus?id=204 No contentPuts an update on the generated callback URI (using JWS) [3]

[1]: The response looks like

{
  "id": "mHsqG4fyKQs",
  "requestUri": "http://localhost:5000/api/submit?url=...",
  "responseUri": "http://192.168.178.120:5000/api/response/mHsqG4fyKQs",
  "statusUri": "ws://192.168.178.120:5000/api/status/mHsqG4fyKQs/ws",
  "queue": {
    "num-slabs": 1,
    "num-active-slabs": 1,
    "enqueued": 1,
    "retried": 0,
    "completed": 0,
    "in-progress": 1
  }
}

Will return a 401 Unauthorized if the correct API_SECRET was not provided in the the Authorization header.

[2]: In practice this means that http://192.168.178.120:5000/api/response/<id>/foo gets proxied to the <upstream-response>/foo. See OpenCPU documentation for concrete examples (e.g. retrieving additional images).

[3]: The callback URI is generated from the ID as JWS and inserted as an additional form field parameter in the POST called statusUri.

Caveat Q&A

  • What happens if the upstreams are unresponsive? The request will time out, or stall, there is no explicit handling of this case (maybe look into Hystrix)
  • What if the proxy crashes? All queued and running tasks are lost, and will not be retried
  • What if I restart the proxy? Existing request-response pairs are held in memory so are lost in this case. It’s best to think of the proxy results as ephemeral
  • What if memory runs out? The request-response pairs are held as SoftReference, under memory pressure they will be garbage collected. Again, treat the proxy as non-durable and non-persistent
  • What if a non-finished request/response gets garbage collected? Tough luck
  • What about security? The POST route is secured with a simple HTTP Token, which should be secure enough if properly deployed under HTTPS. The PUT routes are generated from the generated ID’s using JWS, so risk of rogue agents pushing status updates is minimal. The ID’s are cryptographically randomly generated, but could be swept for valid ID’s so rate-limiting solutions are recommended.

Configuration

Configuration is provided via environ, which uses environment variables to control settings. The following settings are available:

  • UPSTREAMS A comma delimited list of the format addr_1|num_cores,addr_2|num_cores which specifies the available upstream servers
  • HOST_ADDR The canonical address of the server, without protocol. Will be guessed at if not supplied
  • PORT The port to run the Aplomb on
  • API_SECRET The API token that will give access to the post routes if provided as the Authorization: Token <API_SECRET> header
  • DEV Development mode, disables HTTPS and WSS URIs, attempts hot code reload, gives more debug output
  • REPL_PORT If provided will start an nREPL on that port

Deploy

To build

lein uberjar
docker build .

docker save [image id] > aplomb.tar

SCP it to the server.

To run

docker load < aplomb.tar

# for sanity you should tag the image with `docker tag [image id] [name]`

docker run -d --restart="on-failure" -e "UPSTREAMS=http://172.16.8.11|2,http://172.16.8.12|2" -e "HOST_ADDR=foo.bar" -e "PORT=3000" -e "API_SECRET=foo" -e "DEV=false" -p 3000:3000 [image id]

Obviously change the UPSTREAMS, API_SECRET and HOST_ADDR. Look into the recommend Nginx configuration for reverse proxy-ing with support for HTTPS and rate limiting.

License

Copyright (c) 2015, Joël Kuiper All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

About

A simple HTTP queue + load balancer

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages