This repository has been archived by the owner on Sep 9, 2022. It is now read-only.

Expand/refactor dynamic filtering #433

Closed
gorhill opened this issue Dec 22, 2014 · 16 comments

Comments

@gorhill
Contributor

gorhill commented Dec 22, 2014

To address one way or another various filed issues: #361, #358, #331, #282, #236, #68.

Definitions:

  • "Static filtering" = filters based on ABP net filters syntax, or hosts file entries
    • Disadvantage: Adding/removing filter(s) is CPU/short-term memory expensive
    • Advantage: Can be very fine-grained
  • "Dynamic filtering"
    • Advantage: Adding/removing filter(s) is virtually CPU/memory noop relative to static filters
    • Advantage: Very useful to unbreak (or further restrict) web sites without the overhead of static filters
    • Disadvantage: Coarse-grained

Filtering:

  • Whitelisting overrides all dynamic and static filters
  • Dynamic exception filters override static and dynamic block filters
  • Dynamic block filters override static exception filters and default behavior
  • Static exception filters override static block filters
  • Static block filters override default behavior
  • Default behavior: all net requests are allowed
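
The override order above can be read as a first-match-wins chain. What follows is only a minimal sketch of that reading, not the actual µBlock code; the function name `decide` and its boolean parameters are hypothetical and stand for "a filter of that kind matched the request".

```python
def decide(whitelisted: bool = False,
           dynamic_exception: bool = False,
           dynamic_block: bool = False,
           static_exception: bool = False,
           static_block: bool = False) -> str:
    """Return 'allow' or 'block' for one net request.

    Each flag says whether a filter of that kind matched the request
    (hypothetical inputs, for illustration only).
    """
    # Highest precedence first, mirroring the list above.
    checks = [
        (whitelisted, "allow"),        # whitelisting overrides everything
        (dynamic_exception, "allow"),  # overrides static and dynamic block filters
        (dynamic_block, "block"),      # overrides static exception filters
        (static_exception, "allow"),   # overrides static block filters
        (static_block, "block"),       # overrides default behavior
    ]
    for matched, verdict in checks:
        if matched:
            return verdict
    return "allow"  # default behavior: all net requests are allowed
```

For instance, `decide(dynamic_block=True, static_exception=True)` yields `"block"`, since a dynamic block filter outranks a static exception filter.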

Filtering precedence inside dynamic filtering -- most specific to least specific:

  • Hostname - hostname - any type (new)
  • Any - hostname - any type (new)
  • Hostname - any - specific type (these already exist)
  • Any - any - specific type (these already exist)
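
A rough sketch of that lookup order, assuming each rule is keyed by (hostname of the current site, hostname of the request, request type) and that `"*"` stands for "any". The key layout, the `lookup` name, the `"frame"` type label and the action strings are all assumptions made for illustration. Hostnames are compared exactly here, which is a simplification.

```python
from typing import Dict, Optional, Tuple

# (site hostname, request hostname, request type); "*" means "any".
RuleKey = Tuple[str, str, str]


def lookup(rules: Dict[RuleKey, str], site: str, host: str,
           req_type: str) -> Optional[str]:
    """Return the action of the most specific matching dynamic rule, if any."""
    candidates = [
        (site, host, "*"),      # hostname - hostname - any type
        ("*", host, "*"),       # any - hostname - any type
        (site, "*", req_type),  # hostname - any - specific type
        ("*", "*", req_type),   # any - any - specific type
    ]
    for key in candidates:
        if key in rules:
            return rules[key]
    return None  # no dynamic rule applies


# Usage: block frames everywhere, but leave youtube.com to static filtering.
rules = {
    ("*", "*", "frame"): "block",
    ("*", "youtube.com", "*"): "noop",
}
print(lookup(rules, "wired.com", "disqus.com", "frame"))
print(lookup(rules, "wired.com", "youtube.com", "frame"))
```

The hostname-based rule is found before the type-based one, so the youtube.com request never reaches the "block frames" rule.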

UI:

  • The default UI will always be minimalist -- just as it is now
  • Optionally expand panel to unveil dynamic filtering
  • It is not feature bloat, it just expands on the current dynamic filtering (script, iframe) to address the issues enumerated above.
  • Just as now, it is optional, tucked away by default
  • But readily available as a very useful tool to help users help themselves
@uBlock-LLC uBlock-LLC locked and limited conversation to collaborators Dec 22, 2014
@gorhill gorhill added the Fixing label Dec 23, 2014
@gorhill
Contributor Author

gorhill commented Dec 28, 2014

webRequest.onBeforeRequest overhead after re-factoring to support broader dynamic filtering (last lines reported after completing reference benchmark once):

µBlock> onBeforeRequest: 0.132 ms (8599 samples)
µBlock> onBeforeRequest: 0.132 ms (8649 samples)
µBlock> onBeforeRequest: 0.132 ms (8698 samples)
µBlock> onBeforeRequest: 0.132 ms (8763 samples)
µBlock> onBeforeRequest: 0.132 ms (8815 samples)
µBlock> onBeforeRequest: 0.132 ms (8894 samples)
µBlock> onBeforeRequest: 0.132 ms (8969 samples)
µBlock> onBeforeRequest: 0.132 ms (9025 samples)
µBlock> onBeforeRequest: 0.131 ms (9154 samples)
µBlock> onBeforeRequest: 0.131 ms (9169 samples)
µBlock> onBeforeRequest: 0.131 ms (9266 samples)
µBlock> onBeforeRequest: 0.131 ms (9271 samples)
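
For readers wondering how a line of that shape can be produced: one plausible way is a cumulative mean over all timed calls. This is a sketch under that assumption, not the instrumentation actually used; the class name `RunningMean` and its methods are hypothetical.

```python
class RunningMean:
    """Cumulative mean of per-call durations, in milliseconds."""

    def __init__(self) -> None:
        self.total_ms = 0.0
        self.samples = 0

    def add(self, duration_ms: float) -> None:
        # Accumulate one measured call.
        self.total_ms += duration_ms
        self.samples += 1

    def report(self, label: str) -> str:
        # Guard against division by zero before any sample is recorded.
        mean = self.total_ms / self.samples if self.samples else 0.0
        return f"{label}: {mean:.3f} ms ({self.samples} samples)"


stats = RunningMean()
stats.add(0.1)
stats.add(0.2)
print(stats.report("onBeforeRequest"))  # onBeforeRequest: 0.150 ms (2 samples)
```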

@gorhill
Contributor Author

gorhill commented Dec 31, 2014

New plumbing code works fine now.

Test case for issue #358:

  • Open http://www.wired.com/2014/12/2014-year-in-dystopia/
  • Just as before, popup UI is minimalist:
    [screenshot]
  • Dynamic filtering is available for users who want more control: block 3rd-party frames everywhere (strongly advised):
    [screenshot]
  • Frames from disqus.com, google.com, googlesyndication.com, outbrain.com, twitter.com, youtube.com were blocked (great)
  • However, as per #358 ("Click to unblock temporarily a frame"), one may want to not block frames from Youtube (generally, or on that particular site)
  • The new plumbing makes it possible to bypass dynamic filtering for a specific hostname -- youtube.com here -- globally (pic) or just site-specific; the user chooses
    [screenshot]
  • Result: iframes from disqus.com, google.com, googlesyndication.com, outbrain.com, twitter.com are still blocked (great), while the embedded Youtube video now plays just fine and is still filtered through static filtering (great)

@uBlock-LLC uBlock-LLC unlocked this conversation Dec 31, 2014
@gorhill
Contributor Author

gorhill commented Jan 1, 2015

Three columns:

  1. hostname
  2. global rules
  3. site-specific rules (site of current web page)

A cell can have one of four states:

  1. no rule = pale gray, can inherit broader dynamic rule
  2. red = block everything without exception (bypass static exception filters)
  3. green = allow everything without exception (useful to un-break sites broken by static filtering)
  4. dark gray = do not inherit any dynamic filtering rules -- static filtering will still apply

The + and - signs in the site-specific column provide an overview of the number of requests, allowed or blocked respectively, which were made for a specific hostname. One + means the number of requests was in the single-digit range, i.e. less than 10, ++ means the number of requests was in the double-digit range, while +++ means 100 or more.
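
The bucketing described above fits in a few lines. A minimal sketch, with a hypothetical function name; showing nothing for a count of zero is an assumption, since that case is not spelled out here.

```python
def magnitude(count: int, sign: str) -> str:
    """Map a request count to '', sign, sign*2 or sign*3 ('+' or '-')."""
    if count <= 0:
        return ""        # assumption: no requests, no indicator
    if count < 10:
        return sign      # single-digit range
    if count < 100:
        return sign * 2  # double-digit range
    return sign * 3      # 100 or more


print(magnitude(7, "+"), magnitude(42, "-"), magnitude(250, "+"))
```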

@gorhill
Contributor Author

gorhill commented Jan 1, 2015

Case: take your privacy into your own hands

  1. Do EasyPrivacy, "Fanboy's Social Blocking List" and "Anti-ThirdpartySocial" together prevent your browsing history from leaking?
  2. Not necessarily
  3. Go to wired.com
  4. See that twitter.com now knows you went to wired.com (see request to platform.twitter.com)
    [screenshot]
  5. Another benefit of dynamic filtering: full disclosure of remote connections as a result of loading a web page
  6. Do not want your browsing history to be leaked to ubiquitous twitter.com? Block twitter.com everywhere by default (the 1st column means "apply everywhere"):
    [screenshot]
  7. However, when you visit twitter.com, it's not very practical to have requests to twitter.com blocked
    [screenshot]
  8. No problem, just create an exception for twitter.com when on twitter.com
    [screenshot]
  9. The "dark gray" exception above means: do not dynamically filter requests to twitter.com when on twitter.com -- static network filtering will still take place
  10. Now as a result, no requests to twitter.com will be made outside twitter.com, so you are no longer leaking browsing history to ubiquitous Twitter.
  11. The same way you can create an exception for twitter.com when on twitter.com, you can create an exception for twitter.com for any other web site where you do not mind having Twitter available.
  12. That's just an example using Twitter; it could be any other ubiquitous server: Facebook, Youtube, Google, Gravatar, Disqus, whatever.
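
The Twitter steps above can be modeled with the four cell states. This is only a sketch of my reading of them: the `Cell` enum and `resolve` function are hypothetical names, and it assumes a site-specific cell takes precedence over the global cell for the same hostname.

```python
from enum import Enum


class Cell(Enum):
    NONE = "pale gray"   # no rule, can inherit a broader dynamic rule
    BLOCK = "red"        # block everything without exception
    ALLOW = "green"      # allow everything without exception
    NOOP = "dark gray"   # do not inherit dynamic rules, static filtering applies


def resolve(global_cell: Cell, site_cell: Cell, static_verdict: str) -> str:
    """Return 'allow' or 'block' for a request to one hostname.

    static_verdict is what static filtering alone would decide.
    """
    # Assumption: a site-specific rule, when present, wins over the global one.
    cell = site_cell if site_cell is not Cell.NONE else global_cell
    if cell is Cell.BLOCK:
        return "block"
    if cell is Cell.ALLOW:
        return "allow"
    return static_verdict  # NONE or NOOP: static filtering decides


# twitter.com blocked everywhere, with a dark gray exception on twitter.com:
print(resolve(Cell.BLOCK, Cell.NONE, "allow"))  # request made from wired.com
print(resolve(Cell.BLOCK, Cell.NOOP, "allow"))  # request made from twitter.com
```

The first call is blocked by the global rule. The second falls through to static filtering, so the request is allowed unless a static filter blocks it.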

@gorhill
Contributor Author

gorhill commented Jan 1, 2015

Case: un-breaking sites broken by some overzealous filters in one of the (static) filter lists.

  1. I will use a real case which occurred earlier this week.
  2. Broken site: boldchat.com
    [screenshot]
  3. Found out the site was broken because "Peter Lowe's Ad server list" blacklists boldchat.com (I don't know the reason).
  4. So all requests to boldchat.com were blocked, which obviously is an issue when visiting boldchat.com
  5. As seen above in uBlock's popup, a user could see that network requests to boldchat.com were blocked (the −−).
  6. To un-break the site, it's a matter of overriding static filtering using a dynamic allow rule (green)
    [screenshot]
  7. As seen above, an allow rule was created for boldchat.com for when visiting a web page on boldchat.com: it overrides whatever block filters are in effect.
  8. Of course, the rule could be set to apply anywhere if you do not agree at all with the blacklisting of boldchat.com
  9. An allow rule (green) means: unconditionally allow requests, i.e. override any existing static block filters, or any more generic dynamic block rules (like the blocking of 3rd-party frames, etc.)

@gorhill
Contributor Author

gorhill commented Jan 4, 2015

Seeing the log of net requests and why they were allowed/blocked is currently a real PITA, yet it is something an advanced user of dynamic filtering will want to have (I do) -- as one of its purposes is to un-break broken sites. As part of this major feature, I will provide the ability to see net requests in real time from a developer tool panel.

@gorhill
Contributor Author

gorhill commented Jan 5, 2015

Being able to see everything at once is definitely a huge improvement for more advanced users:

[screenshot]

Unsure though if/how the dev tool pane can be ported to Firefox. There is really only one platform-specific call to create the pane, all the rest is platform-independent, so the pane could be in an iframe on the target page, or in a separate tab. Will see what @Deathamns thinks of this.

Never mind, using the devtools API was non-trivial with regard to portability. I could not solve it either: I faced a systematic whole-browser crash when trying to do it in the proper manner. In the end, I went with a fully portable mechanism. Currently the network request logger will have its own separate tab, which a user can detach in order to observe the flow of network requests in real time. Down the road, it can also easily be embedded into an existing tab (in an iframe) through a mini content script if ever we want to sort of mimic devtools, except that this will also work for Firefox (as far as I understand).

[screenshot]

@harshanvn

Can you consider to add filter name to the filter list, in the dev tools? This would help users identify the culprit filter list themselves and report it to the filter list maintainers.

@gorhill
Contributor Author

gorhill commented Jan 6, 2015

> Can you consider to add filter name to the filter list

I guess you meant "to add filter list to the filter name". Dup of #43. Not trivial unless one doesn't mind sacrificing efficiency. I mind. So far a reverse lookup is what I have in mind. Only those who care to find the list will incur a resource cost.
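
To illustrate what I mean by a reverse lookup: nothing about the originating list is stored alongside each filter. The raw lists are only scanned on demand when someone asks where a filter came from. A hypothetical sketch, in which `find_lists`, the list names and the list contents are all made up for illustration:

```python
from typing import Dict, List


def find_lists(filter_text: str, raw_lists: Dict[str, str]) -> List[str]:
    """Return the names of the lists containing filter_text as a whole line."""
    needle = filter_text.strip()
    return [
        name
        for name, raw in raw_lists.items()
        if any(line.strip() == needle for line in raw.splitlines())
    ]


# Made-up list names and contents, for illustration only.
raw_lists = {
    "list-a": "||ads.example.com^\n||tracker.example.net^",
    "list-b": "127.0.0.1 chat.example.org\n||ads.example.com^",
}
print(find_lists("||ads.example.com^", raw_lists))
```

The scan is linear in the total size of the lists, but that cost is paid only by whoever triggers the lookup, not on every net request.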

@harshanvn

OK, thanks. I agree, efficiency comes first. This is just a trivial case in terms of usability.

@Mikey1993
Contributor

Just to note, IMO, it's almost mandatory to warn the user that flipping a given option will consume additional, non-trivial resources.

@gorhill
Contributor Author

gorhill commented Jan 6, 2015

What option?

@Mikey1993
Contributor

"..of a given option.." = any option.

Just wanted to note this for the future, so that efficiency can (and should, IMO) be controlled by the user, but with the user aware of the effect of each option offered to him.

@gorhill
Contributor Author

gorhill commented Jan 6, 2015

No worry, I don't plan to depart from what has been among top priorities since the beginning.

@gorhill
Contributor Author

gorhill commented Jan 6, 2015

Just keeping track, given code change:

µBlock> onBeforeRequest: 0.123 ms (9201 samples)
µBlock> onBeforeRequest: 0.123 ms (9274 samples)
µBlock> onBeforeRequest: 0.123 ms (9361 samples)
µBlock> onBeforeRequest: 0.123 ms (9411 samples)
µBlock> onBeforeRequest: 0.122 ms (9575 samples)
µBlock> onBeforeRequest: 0.122 ms (9590 samples)
µBlock> onBeforeRequest: 0.123 ms (9741 samples)
µBlock> onBeforeRequest: 0.123 ms (9764 samples)
µBlock> onBeforeRequest: 0.123 ms (9794 samples)
µBlock> onBeforeRequest: 0.123 ms (9854 samples)

@gorhill
Contributor Author

gorhill commented Jan 10, 2015

Fixed in 0.8.5.0.
