[Proposal] "Unified" web console #6832

Closed
gianm opened this issue Jan 10, 2019 · 25 comments

@gianm
Contributor

gianm commented Jan 10, 2019

Motivation

Currently, Druid has three web consoles:

  1. The coordinator console at http://coordinator:8081/#/, whose source lives at https://github.com/druid-io/druid-console
  2. The overlord console at http://overlord:8090/.
  3. The old coordinator console at http://coordinator:8081/old-console/.

Each one has a distinct design and there are very few cross-links between them. This leads to a disjointed user experience. Also, the fact that the sources for console (1) are not in the Druid repo makes it difficult to keep up-to-date.

A unified console would improve the user experience and would make it easier for Druid developers to keep it up-to-date as Druid's APIs evolve. The change also provides an opportunity to update the consoles' design and functionality a bit.

Proposed change

The proposed change is to create a new unified web console that subsumes the functionality of the three existing consoles. The unified UI would have its code live inside the Druid repository, making it easier to keep up to date.

The UI may end up being based on Druid SQL system tables (http://druid.io/docs/latest/querying/sql#retrieving-metadata), which would allow it to be more flexible than a console based on the currently-available coordinator and overlord APIs. The currently-available APIs were designed to support the existing consoles and return exactly what they need to know, but the system tables are more generic, and could power a more flexible new UI. (Imagine being able to show the distribution of data on historical nodes just for a particular datasource -- that kind of thing.) It would also allow the UI to more easily gain new powers as new columns or tables are added to system tables.
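To make the "distribution of data on historical nodes for a particular datasource" idea concrete, here is a minimal sketch of how a console might build such a request against the Druid SQL endpoint (`/druid/v2/sql`) using the `sys.segments` and `sys.server_segments` system tables. The helper names (`sqlLiteral`, `segmentDistributionQuery`, `buildSqlRequest`) and the base URL in the test are hypothetical, and the query has not been run against a live cluster.

```typescript
// Sketch only: helper names are hypothetical. The sys.* tables and the
// /druid/v2/sql endpoint are Druid's; everything else is an assumption.

interface SqlRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Escape a value for use inside a single-quoted SQL string literal.
function sqlLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Per-server segment count and total segment size for one datasource.
function segmentDistributionQuery(datasource: string): string {
  return [
    'SELECT ss."server", COUNT(*) AS "num_segments", SUM(s."size") AS "total_size"',
    'FROM sys.server_segments ss',
    'JOIN sys.segments s ON ss."segment_id" = s."segment_id"',
    `WHERE s."datasource" = ${sqlLiteral(datasource)}`,
    'GROUP BY ss."server"',
    'ORDER BY "total_size" DESC',
  ].join('\n');
}

// Describe the HTTP request without sending it, so the caller can use
// whatever HTTP client it prefers.
function buildSqlRequest(baseUrl: string, query: string): SqlRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/druid/v2/sql`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
  };
}
```

Because the view is just a query, adding a column to a system table would only require changing the SQL text, not adding a new coordinator or overlord API.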

Since a unified UI would presumably need to access APIs on both the coordinator and overlord, we would need to either use proxying or CORS to allow that to happen. Proxying makes more sense, imo, since CORS requires extra configuration and some kind of consistent domain, which is not necessarily common amongst Druid deployments (many of which would use raw IPs). It could be hosted on the coordinator if the coordinator makes a proxy available to the overlord and broker. Or, it could be hosted on the router, which can already proxy to the coordinator, overlord, and broker (the former two via the optional management proxy). Or, it could be hosted on a new node type (CliWebConsole?). But we have enough node types already…
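As an illustration of the proxying option, the sketch below builds same-origin URLs that go through the router's management proxy, which exposes `/proxy/coordinator/...` and `/proxy/overlord/...` routes when the proxy is enabled. The function name `proxiedUrl` and the host in the test are hypothetical; this only shows the URL shape, not how the console is actually implemented.

```typescript
// Sketch only: proxiedUrl is a hypothetical helper. It assumes the router's
// management proxy is enabled and serving the /proxy/<target>/ routes.

type ManagementTarget = 'coordinator' | 'overlord';

// Build a URL on the router that is forwarded to the given process.
// With proxying, the browser only ever talks to one origin, so no CORS
// configuration is needed.
function proxiedUrl(routerBase: string, target: ManagementTarget, path: string): string {
  const base = routerBase.replace(/\/+$/, '');
  const suffix = path.charAt(0) === '/' ? path : `/${path}`;
  return `${base}/proxy/${target}${suffix}`;
}
```

When the console itself is served from the router, `routerBase` can be an empty string, which yields relative URLs such as `/proxy/overlord/status`.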

New or changed public interfaces

No new or changed APIs, but there would be a new human interface.

Migration plan and compatibility

Keep the existing consoles as-is and offer the new console as a new option. But deprecate the existing consoles and plan to remove them in a future version.

Rejected alternatives

None.

@fjy
Contributor

fjy commented Jan 10, 2019

👍

@gianm added the Design Review and Apache labels and removed the Apache label Jan 14, 2019
@gianm
Contributor Author

gianm commented Jan 14, 2019

We have begun work on this today. In the version currently being worked on, most views in the UI are based on system tables. The UI is expected to be hosted on the Druid Router and to require that the Router management proxy be enabled.

@vogievetsky
Contributor

I wanted to update this thread with the current state of progress. I am hoping there will be a PR ready early next week.

One of the main goals of this unified console project was to make it as simple as possible technically. It needs to be maintainable by anyone (even people that are not full-time web developers).

Work in progress

Here are some screenshots of the current WIP:

Data sources view:

image
Highly interactive: all the views are cross-linked.

Segments view:

image
The queries here get pushed down to Druid so the browser does not get overwhelmed.

Tasks view:

image

Servers view:

image

SQL view:

image
This is for running arbitrary SQL. Every view also provides a link to see its raw query, so you can go beyond the predefined structure.

Legacy consoles:

image
All 3 legacy consoles are also available, embedded within this console, so you have a fallback if some old functionality has not made it into the new console yet. Note that these legacy consoles will be served from the router as well as from their original locations, meaning that going forward you can go to the router for all your needs.
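As a rough illustration of the push-down mentioned for the segments view, the sketch below turns hypothetical view state (filter, sort, page size) into a single SQL statement over `sys.segments`, so that filtering, sorting, and limiting happen in Druid rather than in the browser. The `SegmentsViewState` shape and the function name are assumptions for illustration, not the console's actual code.

```typescript
// Sketch only: the view-state shape and function name are hypothetical.
// The columns used are ones sys.segments exposes.

interface SegmentsViewState {
  datasource?: string;                        // optional filter chosen in the UI
  sortColumn: 'start' | 'size' | 'num_rows';  // restricted to known columns
  descending: boolean;
  pageSize: number;
}

// Build the query for the current view. The same string could double as
// the "raw query" a user is shown when they ask what a view is running.
function segmentsViewQuery(state: SegmentsViewState): string {
  const lines = [
    'SELECT "segment_id", "datasource", "start", "end", "size", "num_rows"',
    'FROM sys.segments',
  ];
  if (state.datasource !== undefined) {
    const escaped = state.datasource.replace(/'/g, "''");
    lines.push(`WHERE "datasource" = '${escaped}'`);
  }
  lines.push(`ORDER BY "${state.sortColumn}" ${state.descending ? 'DESC' : 'ASC'}`);
  lines.push(`LIMIT ${Math.max(1, Math.floor(state.pageSize))}`);
  return lines.join('\n');
}
```

Restricting `sortColumn` to a union of literal names keeps arbitrary user input out of the `ORDER BY` clause.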

Design

Here are some key points driving the design of this console:

  • The console is designed to run on the router node and rely on its management proxy to talk to the overlord and coordinator. This makes things much simpler from the point of view of both the console developer and the user. The router presents a good unifying interface to all of Druid.

  • Heavy reliance on the SQL system tables: this allows great flexibility and fast iteration, both in developing the views now and in evolving them later. Hopefully one day this console will be 100% powered by system tables.

  • All frontend code is written in TypeScript with strict typing turned on. This is both the best way to develop a complex UI and the best way to ensure that future changes do not break the code. Also it should be very familiar to you Java people.
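To give a flavour of what strict typing buys when a view is backed by a system table, here is a small sketch that validates untyped JSON rows before the rest of the UI touches them. `ServerRow` is a simplified, assumed subset of the `sys.servers` columns, and all function names are hypothetical.

```typescript
// Sketch only: ServerRow is a simplified subset of sys.servers columns and
// the helper names are hypothetical.

interface ServerRow {
  server: string;
  tier: string;
  curr_size: number;
  max_size: number;
}

// Type guard: narrows an unknown value to ServerRow only if every field
// has the expected primitive type.
function isServerRow(value: unknown): value is ServerRow {
  if (typeof value !== 'object' || value === null) return false;
  const row = value as Record<string, unknown>;
  return (
    typeof row.server === 'string' &&
    typeof row.tier === 'string' &&
    typeof row.curr_size === 'number' &&
    typeof row.max_size === 'number'
  );
}

// Validate a decoded JSON payload (an array of row objects).
function parseServerRows(payload: unknown): ServerRow[] {
  if (!Array.isArray(payload)) throw new Error('expected an array of rows');
  return payload.map((row, i) => {
    if (!isServerRow(row)) throw new Error(`row ${i} is not a valid server row`);
    return row;
  });
}

// Fraction of a server's capacity currently in use (0 when max_size is 0).
function fillFraction(row: ServerRow): number {
  return row.max_size > 0 ? row.curr_size / row.max_size : 0;
}
```

If a column is renamed or removed, code like this fails at the boundary with a clear error (and at compile time wherever the old field name is used), rather than rendering `undefined` somewhere deep in a view.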

All feedback is very welcome. I will keep updating this thread.

@b-slim
Contributor

b-slim commented Jan 18, 2019

while i totally agree and love the idea of unifying the consoles, making (1) router node and (2) sql module mandatory elements is a major limitation IMO.

  1. Most of the complaints we hear about Druid relate to how many different node types we have, and this work is going to make the Router yet another mandatory node, which is not great IMO.
  2. The SQL module comes with some overhead (updating the schema, etc.) that is not needed per se, since all the info can be retrieved when needed. The caching instead adds extra GC, CPU pressure, and memory footprint on all the brokers.

@fjy
Contributor

fjy commented Jan 18, 2019

@b-slim The tradeoff of incorporating another Druid process (note: Druid nodes are processes and were poorly named by myself and @cheddar) to unify the consoles is worth it, and this is coming from someone who has been the #1 advocate of reducing processes in Druid. The router requires minimal configuration and there's no other straightforward way of even building a unified console without the router. As someone who created the original coordinator and overlord consoles and was heavily involved in creating the 2nd gen coordinator console, I can't emphasize enough how much this contribution is needed to make Druid better. Today, you have to go to multiple places to understand where segments are getting built and when they've been finalized. This is a huge pain for initial setup, and debugging segment problems in production is difficult. I've had probably 1000+ conversations about the problems with split consoles and the information that is lacking from them today. This is not to say that reducing Druid processes shouldn't be a goal of the project. As a community I think we should strive to think about how we can merge and simplify processes, but I think incorporating these changes today will make the project better as we continue to work on longer-term changes to simplify Druid's architecture.

These changes, combined with the repackaging of Druid, will make significant strides toward more adoption and a much better out-of-the-box experience. Furthermore, debugging issues in production should also become much simpler. Once we reduce Druid's deployment down to 3 server types, each with colocated processes, people will stop focusing on each individual process and focus much more on the server types themselves.

@fjy
Contributor

fjy commented Jan 18, 2019

@b-slim with regard to the SQL module: once again, I think the tradeoff is well worth it. System tables have completely changed how people get metadata from Druid, and drastically simplify operations of the project. They've been tested and validated in production on large-scale clusters with minimal impact.

@jihoonson
Contributor

I think @b-slim agrees that the unified UI is cool, but is concerned that the router would have to be set up for the new UI. I think it's a valid concern. The router is currently optional, so this would not be good for someone who wants to use the new UI but doesn't want to run the router.

How about adding a new configuration like enableRouting to brokers? If this is set to true, then the brokers are also capable of what routers do.

@fjy
Contributor

fjy commented Jan 18, 2019

I'm on board with that

@jon-wei
Contributor

jon-wei commented Jan 18, 2019

Most of the complaints we hear about Druid relate to how many different node types we have, and this work is going to make the Router yet another mandatory node, which is not great IMO.

I think adding an out-of-the-box UI for some querying functionality (SQL view in this new UI) is pretty great for users, and since the UI would also have the coordinator/overlord console functionality, the router seems like a reasonable place to run the UI.

I definitely agree with your point on too many node types; adding to @jihoonson's comment, what do you think about eventually combining the router and broker into a process that's positioned as a "unified access point" for the cluster?

@b-slim
Contributor

b-slim commented Jan 19, 2019

As I said, I love the idea of consolidating the UI and having fewer nodes. But to get to that goal, I think this UI needs to be part of the coordinator (so that we do not create a need for the router). It seems to me that most of what the UI serves is part of the coordinator/overlord state, so it makes more sense to me to add it there. It should rely on the coordinator to construct the timeline, or cache it at the coordinator, instead of making ALL the brokers do that duplicate work, which has an impact on the query side. Also, a lot of users don't care about SQL, and it would be unfair to make it mandatory or on by default. I would love to hear from the original main author how hard it is to bypass the SQL layer and what the pros/cons are.

@KenjiTakahashi
Contributor

I love the idea of a better, unified UI as well. But I also agree with @b-slim on the Router part. We sometimes do quite small, on-premise deployments of Druid (think 1-2 machines) for our clients, and adding another node there can actually be a dealbreaker because of already scarce resources. I know this may be a somewhat "unorthodox" usage of Druid, but it's real.

@vogievetsky
Contributor

Just wanted to update this thread with some progress:

Dark mode style:

image

Ability to change runtime configurations:

image

All 'actions' implemented:

image

Currently working on getting this integrated with maven.

@vogievetsky
Contributor

@KenjiTakahashi how would you feel if the unified web console ran on the broker node (enabled by a config)?

@gianm
Contributor Author

gianm commented Jan 22, 2019

Or maybe the coordinator? Either way (coordinator or broker) a new proxy route would be needed.

@vogievetsky
Contributor

Just for completeness I would also like to throw in the historical node for consideration of where the console might run :-p

@jihoonson
Contributor

@b-slim the new UI serves cluster management, data management, and a SQL query interface, so I think the router is a good place for it, since it already works as a proxy for overlords, coordinators, and brokers. So, as @jon-wei commented, we can promote it to a unified access point for a Druid cluster.

But I agree that it's not good to make the router mandatory. To avoid that, we can add a new configuration to some module, so that one of the mandatory modules can work as a proxy instead.

I'm not sure which module is the best place for this configuration, though. From the implementation side, maybe the broker is a good place, since it already has a heavy dependency on the coordinator: it gets info about all published and used segments for the system tables from the coordinator. The coordinator also has some dependencies on the overlord, but they're small.

@KenjiTakahashi
Contributor

I think for us the best way would be to leave it in Coordinator. For such small deployments, you can sometimes actually leave out the Broker and just query the Historical directly (as there's only one anyway). But Broker is acceptable, too, I think.
As for Overlord, we usually run it in the "embedded" mode, inside Coordinator.

BTW: Have to say that the new UI looks quite nice already :-).

@vogievetsky
Contributor

Thank you for all the feedback! We are revising the proposal to have the new UI run in two places:

  1. The router (gated by the management proxy)
  2. The coordinator - replacing the existing UI and with a link to the old UI and a notification about where to find it.

(1) is already done and is a great way to go for someone who wants a unified endpoint for the Druid cluster. As people have noted, the drawback is that the router is an optional node and should not be required by default.

(2) is the change that a lot of people have advocated for. It will require adding a capability to proxy requests from the coordinator to the broker (for the SQL queries), and we are adding that.

Does this make sense to everyone?

We are planning on making a blogpost that will go into more details about the new functionality of the UI. Meanwhile here are some screenshots (there has been a lot of style cleanup since the last update):

Datasources view:

image

Tasks view:

image

About dialog (over the Data servers view):

image

@jihoonson
Contributor

I feel like it makes sense to split this proposal into two PRs: adding a new UI to the router and adding a capability to enable routing to the coordinator (or broker). It would be easier to focus on a specific functionality and its implementation.

@vogievetsky
Contributor

@jihoonson I agree. This will also make it easier to review. The new console is quite a bit of raw code (as it also embeds into itself the old consoles - all 3 of them).

@b-slim
Contributor

b-slim commented Jan 25, 2019

(2) is the change that a lot of people have advocated for. It will require adding a capability to proxy requests from the coordinator to the broker (for the SQL queries), and we are adding that.

We need to ensure that users can index data (via the Overlord) and look at the coordinator UI if needed to get what they want about indexing, even if the Broker is not running or does not expose the SQL module.

@jihoonson
Contributor

@b-slim I don't think we are replacing the old UIs with the new one in this proposal. People can still use the old ones if they prefer, so we don't have to ensure that right now.

In the future, we may want to choose one of them as the default. The new UI combines data from multiple sources (coordinator, overlord, and SQL-enabled broker) and shows them together. I think this would be really useful for users, but it introduces some new requirements (enabling SQL) and dependencies (between the node running the UI and the coordinator, overlord, and broker). IMO, this is a trade-off and the price is worth paying. Unlike HDFS and Yarn, Druid's indexing, storage, and querying systems are tightly coupled with each other, so it would be really useful for debugging if we could see all that information on one page.

FYI, when SQL is enabled, the broker starts DruidSchema, which periodically gathers segment information from historicals by executing SegmentMetadataQuery. This may be quite expensive, but is inevitable if we want to show this information in the UI.

@vogievetsky
Contributor

@b-slim absolutely, there is no plan here to remove ANY functionality afforded by the older consoles. They will be embedded within the unified console and easily accessible.

In fact there are two notifications to make things really clear.

One to generally inform people that the old consoles still exist and to smooth the transition:

image

And one to specifically detect the case where the SQL endpoint is disabled:

image
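One way such a check could work is sketched below: the console sends a trivial query to the SQL endpoint and classifies the HTTP status it gets back. This is an assumption about the approach, not a description of the console's actual logic; in particular, treating a 404 as "SQL disabled" is a guess, and the type and function names are hypothetical.

```typescript
// Sketch only: names are hypothetical, and the mapping from HTTP status to
// "disabled" is an assumption rather than documented Druid behaviour.

type SqlAvailability = 'enabled' | 'disabled' | 'unknown';

// Classify the status code returned by a probe request to the SQL endpoint.
function classifySqlProbe(status: number): SqlAvailability {
  if (status >= 200 && status < 300) return 'enabled';
  if (status === 404) return 'disabled'; // assumed: endpoint not registered
  return 'unknown';                      // e.g. auth failure or broker down
}

// Pick the notification text to show, or null when nothing needs saying.
function sqlNotice(availability: SqlAvailability): string | null {
  switch (availability) {
    case 'enabled':
      return null;
    case 'disabled':
      return 'Druid SQL appears to be disabled; SQL-backed views are unavailable. The legacy consoles remain accessible.';
    case 'unknown':
      return 'Could not determine whether Druid SQL is enabled.';
  }
}
```

Keeping an explicit `unknown` state avoids telling users that SQL is disabled when the real problem is, for example, an unreachable broker.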

@vogievetsky
Contributor

Part (1) of this is now done. Stand by for a part (2) follow-up.

@gianm
Contributor Author

gianm commented Feb 19, 2019

Implemented by #6923.
