Filter analytics data by API Backend? #252

brylie · 2016-06-16T06:50:01Z

We are building an analytics dashboard for API Umbrella, and would like users to be able to filter the dashboard to show only analytics for a single API Backend. E.g. users can filter by API Umbrella backend ID.

How can we query Elastic or the Admin API to receive analytics related to a single API Umbrella API Backend?

GUI · 2016-06-16T14:17:38Z

Unfortunately, we don't currently store the API Backend ID in the analytics database. But I can see some potential use-cases for that, so we'd welcome any pull requests, or we can see about adding it ourselves some day.

Although, if it helps, we approach filtering for specific APIs slightly differently in the default admin. We filter everything based on the URL host and path of the requests (which are already stored in the analytics). This is also how the default admin permissions with API Scopes work--you're granted permissions to a URL host and path prefix, and then the analytics are automatically filtered based on that root. There may be any number of logical APIs under a prefix, so to make the filtering more granular, you can add more specific API scopes.

We prefer filtering based on the URL, since we view API Backend IDs as an implementation detail that might change over time for the same API URL (API backends might be replaced or more specific API backends could be added that route sub-URLs differently). By basing the filtering on the URL we don't need to worry about potential API backend ID changes (or trying to track those over time).
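To illustrate the URL-based approach described above, here is a minimal Python sketch of scope filtering over in-memory log entries. It is not API Umbrella's actual implementation: the `ApiScope` class, the helper functions, the `request_host` and `request_path` keys, and the example hosts and paths are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiScope:
    """Hypothetical model of an API Scope: a host plus a URL path prefix."""
    host: str
    path_prefix: str


def in_scope(scope, request_host, request_path):
    """Return True when a logged request falls under the scope's root."""
    return request_host == scope.host and request_path.startswith(scope.path_prefix)


def filter_logs(scopes, logs):
    """Keep only the log entries covered by at least one scope."""
    return [
        entry
        for entry in logs
        if any(in_scope(s, entry["request_host"], entry["request_path"]) for s in scopes)
    ]


# Example log entries; the key names are assumptions, not a documented schema.
logs = [
    {"request_host": "api.example.com", "request_path": "/weather/v1/today.json"},
    {"request_host": "api.example.com", "request_path": "/weather/v2/today.json"},
    {"request_host": "api.example.com", "request_path": "/traffic/v1/roads.json"},
]

# A broad scope covers every API under /weather/; a narrower one covers only v2.
broad = [ApiScope("api.example.com", "/weather/")]
narrow = [ApiScope("api.example.com", "/weather/v2/")]

broad_hits = filter_logs(broad, logs)
narrow_hits = filter_logs(narrow, logs)
```

Because the filter depends only on the URL, replacing or splitting the backend that serves `/weather/` would not change which log entries match.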

Does that approach make sense? If so, does it seem like that could fit your use-case, or would you still prefer to filter on specific backend IDs for your purposes?

If you're looking to add the backend ID to the analytics storage, I think the main areas involved would be storing the matched API backend on an ngx.ctx variable, then adding that to the log message data, and then ensuring that gets pushed onto the data sent to elasticsearch.

bajiat · 2016-06-22T08:19:42Z

Thanks for the reply @GUI! I guess your approach entails that a certain API URL can only be stored once in the database - or does it?

GUI · 2016-06-23T03:05:38Z

@bajiat: Are you referring to the analytics database or the API Backends database?

If you're referring to the analytics database, then each individual API request is logged as a separate entry to the elasticsearch database. So there can be lots of duplicate log entries for a single URL. We then perform aggregation queries to determine the total number of API hits. Filters can also be applied to those queries to find the totals for a more specific subset of the API hits (for example, you could filter to just view APIs where the URL path began with /foo/*, or you could filter to just view a specific API where the URL path equaled /some/specific/api/endpoint.json).
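As a rough sketch of what such a filtered aggregation query could look like, the Python function below builds an Elasticsearch request body as a plain dictionary. The field names `request_host` and `request_path` and the function itself are assumptions for illustration; the `bool`/`filter` form is the Elasticsearch 2.x-and-later syntax, and the term and prefix filters assume those fields are indexed as exact (not analyzed) values.

```python
import json


def hits_query(host, path=None, path_prefix=None):
    """Build a query body that counts hits for one host, optionally narrowed
    to an exact path or to every path under a prefix."""
    filters = [{"term": {"request_host": host}}]
    if path is not None:
        # Exact match: a single specific endpoint.
        filters.append({"term": {"request_path": path}})
    elif path_prefix is not None:
        # Prefix match: every endpoint under the given root.
        filters.append({"prefix": {"request_path": path_prefix}})
    return {
        "size": 0,  # no individual log entries, only totals and aggregations
        "query": {"bool": {"filter": filters}},
        "aggs": {"hits_by_path": {"terms": {"field": "request_path", "size": 10}}},
    }


prefix_query = hits_query("api.example.com", path_prefix="/foo/")
exact_query = hits_query("api.example.com", path="/some/specific/api/endpoint.json")
print(json.dumps(prefix_query, indent=2))
```

The resulting dictionary would be sent as the body of a search request against the analytics index; the index name and client are left out because they depend on the installation.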

If you're referring to the API Backends database, then generally speaking, yes, there would only be one backend per URL. Only a single API backend can be matched for a specific API URL (although which specific API backend is matched may change over time, if you add, edit, delete, or change the match order of API backends).

I'm not totally sure I understood the question, but does that help answer things? Let me know if not.

brylie · 2016-06-27T07:25:55Z

@GUI I am able to add two API backends with the same frontend prefix, directly via the API Umbrella UI. How would we distinguish between the API Backends that share the same frontend prefix?

brylie · 2016-06-27T07:32:10Z

Here are two screenshots showing separate API Backends with duplicate configuration:

[Screenshot: Amazing API]

[Screenshot: Amazing API Duplicate]

brylie · 2016-06-27T07:34:23Z

In both of those API Backends, the following attributes are the same:

Backend protocol
Server
Frontend host
Backend host
Frontend prefix
Backend prefix

In our analytics, how would we distinguish Amazing API from Amazing API Duplicate?

GUI · 2016-06-27T12:19:10Z

In the case of duplicate frontend prefixes, only one of those API backends would be matched and used. Which one is used would depend on the "Matching Order" of the API backend (by default, the first one added would be matched, unless you explicitly altered the matching order to give one of them higher matching precedence).
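The first-match behavior described above can be sketched in Python as follows. This is only a simplified model, not API Umbrella's routing code: the `Backend` class, the `sort_order` field standing in for "Matching Order", and the example host and prefix are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Backend:
    name: str
    frontend_host: str
    frontend_prefix: str
    sort_order: int  # hypothetical stand-in for "Matching Order"; lower wins


def match_backend(backends, host, path) -> Optional[Backend]:
    """Return the first backend, in matching order, whose host and prefix match."""
    for backend in sorted(backends, key=lambda b: b.sort_order):
        if host == backend.frontend_host and path.startswith(backend.frontend_prefix):
            return backend
    return None


backends = [
    Backend("Amazing API", "api.example.com", "/amazing/", sort_order=0),
    Backend("Amazing API Duplicate", "api.example.com", "/amazing/", sort_order=1),
]

# Both backends share the same route, so only the first in matching order is used.
matched = match_backend(backends, "api.example.com", "/amazing/items.json")
```

In this model the duplicate never receives traffic unless its matching order is changed, which is why the analytics cannot tell the two apart.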

So in the case of colliding routes, we don't have a way to distinguish between the API backends in the analytics data. But this specific situation might be more related to the need for better validation or warnings when there are duplicate routes (#239 and 18F/api.data.gov#186). Or do you have situations where those duplicate prefixes are expected?

bajiat · 2016-06-27T13:01:15Z

@GUI Are you expecting that APIs are only added by particular organizations or persons so that you don't have to restrict duplicate backends? At the moment anyone can add an API backend to our database, so there is a high possibility for duplicates.

brylie · 2016-06-27T13:04:27Z

We need to prevent route collision. E.g. we are considering adding a unique validation to ensure routes are unique in our database. Are there any plans for stricter validation in the API Umbrella schema?
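A uniqueness check of this kind could be sketched in Python as below. The dictionary keys (`name`, `frontend_host`, `frontend_prefix`) and the function name are assumptions for illustration, and the check only catches exact duplicates, not prefixes that overlap (such as `/amazing/` and `/amazing/v2/`).

```python
from collections import defaultdict


def find_route_collisions(backends):
    """Group backends by (frontend host, frontend prefix) and return the
    routes that are claimed by more than one backend."""
    by_route = defaultdict(list)
    for backend in backends:
        route = (backend["frontend_host"], backend["frontend_prefix"])
        by_route[route].append(backend["name"])
    return {route: names for route, names in by_route.items() if len(names) > 1}


backends = [
    {"name": "Amazing API", "frontend_host": "api.example.com", "frontend_prefix": "/amazing/"},
    {"name": "Amazing API Duplicate", "frontend_host": "api.example.com", "frontend_prefix": "/amazing/"},
    {"name": "Other API", "frontend_host": "api.example.com", "frontend_prefix": "/other/"},
]

collisions = find_route_collisions(backends)
```

A validation step could reject a new backend whenever adding it would make this result non-empty.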

KrishnaPG · 2016-07-24T15:17:18Z

If validation of backend routes is going to be supported, another case that could add value to the scenario @bajiat mentioned is:

verifying that the new route (that is being added) passes a generic validation regex (specified by admin)

For example,

allow only routes from particular domain or set of domains to be added
or reject domains or routes that match specific regex patterns (inappropriate sites or content)

The reason is that there is no point in taxing our gateway servers with routing traffic to, say, Google APIs or other public APIs. A regex that rejects all Google API domain routes from being added to our gateway should ensure our servers only cater to the owners' needs.
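A minimal Python sketch of such admin-specified regex validation might look like the following. No such feature is confirmed to exist in API Umbrella; the function name, the allow and deny lists, and the example patterns and hosts are all hypothetical.

```python
import re


def is_route_allowed(backend_host, allow_patterns, deny_patterns):
    """Check a backend host against admin-supplied regex lists.

    A host is rejected if it matches any deny pattern. If allow patterns are
    given, the host must also match at least one of them.
    """
    if any(re.search(pattern, backend_host) for pattern in deny_patterns):
        return False
    if not allow_patterns:
        return True
    return any(re.search(pattern, backend_host) for pattern in allow_patterns)


# Hypothetical admin configuration: block Google API hosts, allow one domain.
deny = [r"(^|\.)googleapis\.com$"]
allow = [r"(^|\.)example\.org$"]

blocked = is_route_allowed("maps.googleapis.com", [], deny)
allowed = is_route_allowed("api.example.org", allow, deny)
outside = is_route_allowed("api.other.net", allow, deny)
```

Deny patterns are checked first here so that a blocked domain cannot be re-enabled by an overly broad allow pattern; the opposite precedence would be an equally valid design choice.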

See #252. This logs the API backend ID that was matched while serving the API request, along with the ID of the more specific "url_matches" record within that API. These details might serve as a more efficient way to look up logs that a user has permissions to view based on IDs (versus the current approach based on host and path prefix). Although, switching to use this might be tricky given our current approach to admin permissions (since IDs are a little less flexible and we have to deal with historical IDs). But in any case, let's start logging this so we can explore this and allow for other use cases that have requested this functionality.

GUI · 2017-02-08T03:23:26Z

Logging the API backend details (in the api_backend_id and api_backend_url_match_id fields) has been added in dfea879. This will be part of the forthcoming v0.14 release, which I'm hoping to finally get wrapped up in the next week.

GUI · 2017-02-23T03:06:25Z

v0.14.0 has been released, which adds these additional backend details to the analytics database.
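With those fields available, a query for a single backend could be sketched in Python as below. The field names `api_backend_id` and `api_backend_url_match_id` come from the comment above; the function, the query shape (Elasticsearch 2.x-and-later `bool`/`filter` syntax), and the placeholder IDs are assumptions. Requests logged before v0.14.0 would not carry these fields, so such a filter would presumably miss them.

```python
def backend_hits_query(api_backend_id, url_match_id=None):
    """Build a query body that counts hits for one API backend, optionally
    narrowed to a single url_matches record within that backend."""
    filters = [{"term": {"api_backend_id": api_backend_id}}]
    if url_match_id is not None:
        filters.append({"term": {"api_backend_url_match_id": url_match_id}})
    # size 0 asks only for the total hit count, not the individual log entries.
    return {"size": 0, "query": {"bool": {"filter": filters}}}


# The IDs below are placeholders, not real backend identifiers.
backend_query = backend_hits_query("example-backend-id")
narrowed_query = backend_hits_query("example-backend-id", "example-url-match-id")
```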

bajiat mentioned this issue Jun 16, 2016: Allow owner to select the API they are viewing the analytics for apinf/platform#999 (Closed, 5 tasks)

bajiat mentioned this issue Jun 20, 2016: Select charting library and research available data types apinf/platform#1066 (Closed, 2 tasks)

bajiat mentioned this issue Jun 30, 2016: Allow only unique proxy backend base paths apinf/platform#1200 (Closed, 2 tasks)

brylie mentioned this issue Feb 7, 2017: Consider whether to remove analytics data from API Umbrella when deleting ProxyBackend in Apinf apinf/platform#2032 (Closed)

GUI added this to the v0.14.0 milestone Feb 8, 2017

GUI closed this as completed Feb 23, 2017

bajiat mentioned this issue Feb 23, 2017: Test environment for API Umbrella rel. 0.14 apinf/platform#2176 (Closed)
