[RFC] Kibana notification service #90297

mshustov · 2021-02-04T16:37:38Z

RFC for a Kibana notification service. The notification service improves UX by providing a way of drawing the user's attention to an event that occurred in Kibana or another Elastic product.

Open questions:

how to store notification content translations [RFC] Kibana notification service #90297 (comment)
user identity availability [RFC] Kibana notification service #90297 (comment)
security plugin availability outside of x-pack [RFC] Kibana notification service #90297 (comment)
UI-sourced notifications [RFC] Kibana notification service #90297 (comment)
URL backward compatibility [RFC] Kibana notification service #90297 (comment)

elasticmachine · 2021-02-04T16:49:13Z

Pinging @elastic/kibana-core (Team:Core)

jowiho · 2021-02-05T07:52:01Z

About the icon being a public URL: I'm not sure if that will be feasible in all situations. I'm thinking about on-prem ECE installations where the icon would have to be hosted by the ECE UI (because some customers don't like ECE making outside calls, or run ECE in an air-gapped env, so using an elastic.co URL won't work). But the user-visible ECE UI URL isn't always known to us, and can change. For instance, customers can decide to move their ECE UI behind a reverse proxy.

Even without ECE, how would this work for an air-gapped on-prem Kibana?

Would it make sense to support in-line svg icons? Perhaps combined with a set of predefined icon names (like we have for the Kibana login selector icons)?

jowiho · 2021-02-05T09:37:50Z

I've left a few other comments in the gdoc (https://docs.google.com/document/d/1iIjL1CzlYQ6EcVzOY_5_vgDsGetnxw7Xtp_9PUUJXrs/edit#) because GH comments aren't ideal for discussing review feedback.

mshustov · 2021-02-08T11:46:56Z

Would it make sense to support in-line svg icons? Perhaps combined with a set of predefined icon names (like we have for the Kibana login selector icons)?

yes, euiIconType: string supports SVG URL and data URL out of the box. I will clarify it in the RFC

…m Jochem

joshdover · 2021-02-10T09:46:16Z

rfcs/text/0014_notification_service.md

+- `message: { i18n_key: string, values: object } | { text: string }` - a notification message is subject to i18n. 
+Locale might be changed on the Space level (on the User level in the future).
+Thus, we cannot determine a message locale during notification creation.
+Considering that Kibana notifications might be rendered outside of Kibana UI (in Cloud UI, for example), the message must contain translations for all the supported locales.
+- `title: { i18n_key: string, values: object, url?: string } | { text: string, url?: string }` - notification header.
+- `action: { i18n_key: string, values: object, url: string } | { text: string, url: string }` - notification CTA


I'm still concerned about i18n keys changing between releases. We don't have enough rigor or tooling in place to be sure that the keys don't change and then break notifications created before the change (or sourced from different Stack versions). I also think we'd need to be able to add new i18n keys to old Kibana versions so that newer notifications could be rendered in the UI of older Kibanas. Or are we expecting the UNS to handle this for us?

If we have decided that locale config changes should also affect notifications that already exist, we need to account for taking on the work to make sure i18n keys are stable. Would it be acceptable to delay this work and do the i18n translating at creation time until we enforce that?

Also, what do you think about always providing a fallback message string even when i18n_key is provided? That way we can make changes to how we handle this in the future without ever breaking older Kibana instances.

I'm still concerned about i18n keys changing between releases. We don't have enough rigor or tooling in place to be sure that the keys don't change and then break notifications created before the change (or sourced from different Stack versions).

That's a valid concern. We are going to store translations for a given notification:

Considering that Kibana notifications might be rendered outside of Kibana UI (in Cloud UI, for example), the message must contain translations for all the supported locales.

we need i18n_key to retrieve all the translations from @kbn/i18n before storing them in an index or sending them to UNS.

Could we simplify things by only storing the translation for the users current language preference? Old notifications won't change when a user changes their language prefs, but that seems like an acceptable tradeoff.

I agree that it seems simpler to store the translated text. Spaces does throw an interesting wrench in that plan though... Perhaps if the source of the notification event is some space-specific entity, we use the Space's locale if the user hasn't specified one?

Could we simplify things by only storing the translation for the users current language preference? Old notifications won't change when a user changes their language prefs, but that seems like an acceptable tradeoff.

@alexfrancoeur How do you feel about this from a product point of view? It's not the best UX, but might be an acceptable trade-off.

I talked about this with @alexfrancoeur a bit yesterday. In general, I don't like when surprising things happen in our product and having UI be in multiple languages depending on when the state of a cluster setting changes is surprising in my opinion. That being said, language setting changes are probably infrequent and it is my understanding that our internationalization sophistication is still maturing today, so we agreed this would be okay.

If we put ourselves in a user's shoes, the worst case scenario is probably something like: English is the default language for a deployment, I bring a deployment up, the deployment experiences problems and generates notifications about what went wrong, I do not read English, I switch my deployment language expecting to be able to return and read those messages, I am frustrated that this expectation was not met.

It will be a while before the notification system and internationalization is capable of producing an experience like that, but it's good to paint a real picture.

In case we want to look back at this for context, are our concerns around storing multiple languages in a notification driven by disk concerns or network concerns?

In case we want to look back at this for context, are our concerns around storing multiple languages in a notification driven by disk concerns or network concerns?

mostly yes.

I agree that it seems simpler to store the translated text. Spaces does throw an interesting wrench in that plan though... Perhaps if the source of the notification event is some space-specific entity, we use the Space's locale if the user hasn't specified one?

@kobelb That will require extending User profile API to retrieve user-specific settings (once we have them). Worth noting that Kibana doesn't have access to user credentials at notification creation time. Probably we can mark some settings as public not to apply the security model to them.

In general, this approach might not give us a lot of benefits if several users set different locales because we have to duplicate a notification for every locale.
For example, Alert is triggered for UserA, UserB, UserC:
UserA and UserB have locale en: we store a notification with a message in English.
UserC has locale de: we store the same notification with the message in German.

So instead of storing the single notification object with en and de locales, we persist two notification objects, which ends up consuming even more disk space.

interface TranslatedContent { translations: { [locale: string]: string; } arguments: TranslateArguments; } interface Notification { recipient_id: string[]; priority?: 'normal' | 'high' | 'critical'; created_at: number; expire_at?: number; is_pinned?: boolean; source_type: string; group_id?: string; content: { icon?: { euiIconType: string; } severity?: 'normal' | 'high' | 'critical'; message: { text: TranslatedContent; } title: { url: string; text: TranslatedContent; } action: { url: string; text: TranslatedContent; } } // vs several objects interface Notification { recipient_id: string[]; priority?: 'normal' | 'high' | 'critical'; created_at: number; expire_at?: number; is_pinned?: boolean; source_type: string; group_id?: string; content: { icon?: { euiIconType: string; } severity?: 'normal' | 'high' | 'critical'; text: string; title: { url: string; text: string; } action: { url: string; text: string; } }

Also, it's unclear how to store notifications without a particular recipient, addressed to all the users: creating a copy of notification per a locale?

@jowiho @kobelb What do you think?

Considering that at the current moment there is no way to get user locale without user credentials, I'd suggest either starting with storing all possible translations for a message or storing translations for en locale only.

@restrry as we discussed over Zoom, I think that both options are fine for the short-term. Long-term, I think that we're going to end up having to store all possible translations to provide the ideal UX.

rfcs/text/0014_notification_service.md

joshdover · 2021-02-10T10:16:35Z

rfcs/text/0014_notification_service.md

+This option consumes less storage space but might lead to expensive read operations due to an additional JOIN call to 
+search for notification state objects. We don’t expect Kibana to read more than 10-20 notifications at once, so the overhead might be acceptable.
+- Create a copy for every notification in a user-specific list of notifications. It’s [the recommended way to store data](https://www.elastic.co/guide/en/elasticsearch/reference/current/joining-queries.html),
+but it leads to duplicating notification messages. 


It would be useful to do some rough math on the scale of data we'd be duplicating before choosing an option here.

My calculation is based on https://www.javamex.com/tutorials/memory/ and https://www.javamex.com/tutorials/memory/object_memory_usage.shtml:
The single notification object:

interface Notification { // 16 bytes recipient_id: string[]; // the single uuid 36 char, so use StringBuilder = 144 bytes + 16 bytes for Array = 88 bytes priority?: 'normal' | 'high' | 'critical'; // max. 16 char. = 56 bytes created_at: number; // 8 bytes expire_at?: number; // 8 bytes is_pinned?: boolean; // 8 bytes source_type: string; // max. 16 chars = 56 bytes group_id?: string // StringBuilder max 100 char. = 144 bytes content: { // 16 bytes icon?: { // 16 bytes euiIconType: string; // StringBuilder = 144 bytes. URL or inlined SVG can take up to several KB } severity?: 'normal' | 'high' | 'critical' // max. 16 char. = 56 bytes message: { // 16 bytes translations: { // 16 bytes [locale: string]: string; // Unicode, max. 80 char. x 2bytes = 160 bytes x N locales } arguments: { // 16 bytes values: object; // assume max. 100 bytes defaultMessage: string; // StringBuilder max 100 char. = 144 bytes description: string; // StringBuilder max 100 char. = 144 bytes } } title: { // 16 bytes url: string; // url max. 200 char x 2 bytes = 400 bytes. translations: { // 16 bytes [locale: string]: string; // Unicode, max. 30 char. x 2bytes = 60 bytes x N locales } arguments: { // 16 bytes values: object; // assume max. 100 bytes defaultMessage: string; // StringBuilder max 100 char. = 144 bytes description: string; // StringBuilder max 100 char. = 144 bytes } } action: { url: string; // url max. 200 char x 2 bytes = 400 bytes. translations: { // 16 bytes [locale: string]: string; // Unicode, max. 30 char. x 2bytes = 60 bytes x N locales } arguments: { // 16 bytes values: object; // assume max. 100 bytes defaultMessage: string; // StringBuilder max 100 char. = 144 bytes description: string; // StringBuilder max 100 char. = 144 bytes } } } }

with N = 3 supported locales, one message consumes ~ 3.7KB. (It will a bit more since I have no idea how JSON object keys are stored in java)

for an state with interface:

interface NotificationState { // 16 bytes notification_id: string; // the single uuid 36 char, so use StringBuilder = 144 bytes recipient_id: string; // StringBuilder = 144 bytes is_read: boolean; // 8 bytes is_pinned: boolean; // 8 bytes }

one state object consumes ~0.3KB.

let's assume that the deployment has a message addressed to 10 users, then the scenario with storing notification content with the state will require (3.7 + 0.3) x 10 ~ 40KB, while the case storing message content separately from the state requires 3.7 + (0.3 x 10) ~ 6.7 KB.

joshdover · 2021-02-10T10:17:24Z

rfcs/text/0014_notification_service.md

+- Change a notification state.
+
+Kibana-specific entities shouldn’t leak to UNS: 
+- Kibana sends message translations for all the supported locales.


Can elaborate on what this means? We would send the fully translated string, one for each locale?

yes. If we want to support i18n on Cloud and show notifications across deployments with different Kibana versions.
@jowiho Can you see a better option? This one consumes a lot of disk space and network bandwidth, but a notification contains everything that's needed to show it in UI.

I just commented above that we could store the notification in a single language, according to the user's preference at that time. Not sure what the best interface would be. Either the notification source would first lookup the user's language preference and then generate a notification in that language. Or the source would generate a notification in all languages, and the notification service would lookup op the preferred language and store the notification in that language.

rfcs/text/0014_notification_service.md

joshdover · 2021-02-10T10:23:23Z

rfcs/text/0014_notification_service.md

+Writing to the same Notification storage might cause conflict on the `write`. If the storage mechanism implementation
+doesn’t make them impossible, we will have to use a single place to change notification status (via the [queue](#localrepository)).


Are you saying that we should leverage the ES-backed queue to determine the order of state changes? So the flow would be:

NotificationService.markAsRead() is called

NotificationEvent (or similar) is put into ES queue

markAsRead returns (even before the change has been fully processed)

Later on, the background task picks up events form the queue:

Take the latest X events in queue and process them serially

This should eliminate write conflicts because only one instance of the background task is running (a guarantee provided by Task Manager).

Are you saying that we should leverage the ES-backed queue to determine the order of state changes?

It depends on the data storage implementation. If we end up storing a notification state object per user, the probability of writing conflicts is reduced significantly.

joshdover · 2021-02-10T10:26:47Z

rfcs/text/0014_notification_service.md

+A notification cannot have a particular recipient in the lack of a way to identify a user in the system. A notification will be shown to all the users instead.
+User-specific notification state (*read_status*) is not stored in Kibana but browser Local Storage.
+### Phase II
+When [user profiles](https://github.com/elastic/kibana/issues/17888) are supported, Kibana UI starts showing user-specific notifications.


One concern I have is how we will handle the case of when security is disabled. How do you imagine the storage mechanism working in that case? Do we know if we could just not support notifications in that type of deployment?

Do we know if we could just not support notifications in that type of deployment?

This is my vote. The onboarding notifications potentially make sense w/o security, and I could make an argument for others, but this could also be an enticing forcing function to get our clusters secured.

We are still planning on enabling security by default starting in 8.0, so the burden of getting security setup will be greatly diminished.

One concern I have is how we will handle the case of when security is disabled. How do you imagine the storage mechanism working in that case? Do we know if we could just not support notifications in that type of deployment?

The storage mechanism doesn't depend on user_id as long as a notification source passes it during notification creation.

We need user_id when retrieving notifications addressed to a particular recipient.
Lack of user_id is not an obstacle to show notifications addressed to all the users (referred to as global notifications in the RFC).

Also, the ability to disable Security plugin has been deprecated #85159

cc @alexfrancoeur as it is a product decision as well.

Also, the ability to disable Security plugin has been deprecated #85159

Note that enabling Kibana's security plugin !== having stack security features enabled. Future versions of the stack will still be able to run without security features -- they just won't be able to explicitly disable Kibana's security plugin anymore.

I think we should prefer for "global" cluster notifications to still function even if the parts of our stack that allow us to differentiate users is unavailable. That being said, it isn't yet clear to me how complex bringing user identity into Kibana and this system will prove to be (very is my guess). So we should keep an eye on this and I'd like to understand the details of this better myself as we go.

afharo

Thank you @restrry for putting together this great RFC! I love the structure and the thinking process! Also the design, of course :)

I added some NIts and questions. Another thing I wanted to point out: the rendered link in the description looks outdated to me.

rfcs/text/0014_notification_service.md

afharo · 2021-02-10T10:36:56Z

rfcs/text/0014_notification_service.md

+Kibana HTTP API:
+- Endpoint: `GET /notifications/` 
+    - `search_after?: number` - Unix timestamp to start searching from.
+    - `type_id?: string[]` - list of source types to filter by.


is this meant to be an allow list? If so, as explained later on in the Filtering section, when a user hides a specific channel, we'll store in the local storage all the types the user wants to see. What happens when a new channel pops up (i.e.: after an upgrade, or a new type that the user never saw before, so it didn't add it to the list)?
Should we add the query parameter exclude_type_id?: string[]?

What happens when a new channel pops up

Maybe, it is ok to show the new type?
If we hide it by default, the user will not even know that a new type is available.

If we hide it by default, the user will not even know that a new type is available.

That's why I'm asking, the suggested behaviour in this RFC makes me think we'll leave out any new types if we follow the type_id list based on the local storage "known subscription list".

After thinking more, I see the point to add exclude_type_id, but I don't think it justifies the complexity.
We will have to support include / exclude options in filtering for all the parameters. It's hard to reason about what is the final combination of filters will be applied, esp. if exclude_type_id and include_type_id contain the same id.

Fair enough, as long as we keep this limitation in mind when handling the local storage to handle the unsubscribed channels in our first implementation.

afharo · 2021-02-10T10:49:19Z

rfcs/text/0014_notification_service.md

+- The service relies on the yet-to-be-added ability to uniquely identify the user, enumerate users with a given set of permissions.
+- The service doesn't provide a built-in security mechanism but relies on a notification source implementation.
+- Lack of a centralized i18n service makes message transferring over-bloated due to the necessity to send all the possible translations over the wire.
+


Suggested change

- We are relying on X-Pack features (like security and task manager), so we are tied to that. Having to maintain both: the current Newsfeed and NC as an X-Pack replacement.

the current Newsfeed and NC as an X-Pack replacement.

We should support Notifications for plugins outside of x-pack. With the latest licensing changes, it's likely a subset of security plugin functionality ends up outside of the x-pack folder. But still waiting for an announcement from Tech Leads. @kobelb

rfcs/text/0014_notification_service.md

pgayvallet · 2021-02-10T09:45:50Z

rfcs/text/0014_notification_service.md

+- `pinned_status?: boolean` - a flag to pin a notification at top of the list of notifications. Used by a source to draw a user's attention to a notification.
+- `source_type: string` - source type. the same as source domain name: `cloud`, `alerting`.
+- `source_subtype: string` - source-specific type (`cloud.insufficient_funds`, `alerting.something`)
+- `group_id?: string` - identifier to associate several notifications in the UI


Would the grouping be only based on group_id / can distinct sources use the same group_id and have their notifications grouped together under the same group?

Would the grouping be only based on group_id

group_id per a source_type. Alerting will leverage it to group notifications for an event: an alert triggered, an alert acknowledged, etc. With this grouping, UI provides a sort of a domain event timeline.

In the MVP implementation, a click on + 3 messages shows them in the flyout.

Later, when we have Notification Center UI, a click on + 3 messages navigates to the NC UI and applies a filter by group_id.

rfcs/text/0014_notification_service.md

pgayvallet · 2021-02-10T09:53:17Z

rfcs/text/0014_notification_service.md

+- `message: { i18n_key: string, values: object } | { text: string }` - a notification message is subject to i18n. 
+Locale might be changed on the Space level (on the User level in the future).
+Thus, we cannot determine a message locale during notification creation.
+Considering that Kibana notifications might be rendered outside of Kibana UI (in Cloud UI, for example), the message must contain translations for all the supported locales.


the message must contain translations for all the supported locales

Not sure to understand that, given the described format?

Was wondering the same, I assumed this would be something like

message: { i18n_key: string; values: Record<string, string> }

Where Record<string, string> is { [languageCode]: 'translation of message' }

where values: { dictionary, defaultMessage, values, description }.
Ok, I updated the section with

import type { TranslateArguments } from '@kbn/i18n'; interface TranslatedContent { translations: { [locale: string]: string; } arguments: TranslateArguments; } interface NotificationContent { ... message: TranslatedContent; title: TranslatedContent & { url?: string }; action: TranslatedContent & { url: string }; }

We need to extend @kbn/i18n with a method returning all the translations for a given string.

notification.create({ message: i18n.getAllForKey('hello.wonderful.world', { defaultMessage: '...', values: {...}, description: '...' }); });

pgayvallet · 2021-02-10T10:28:44Z

rfcs/text/0014_notification_service.md

+    - `expired_at` if not specified. Kibana allows administrators to set how long notifications will be stored. Falls back to 30 - 90 days be default. 
+- Writing new incoming messages to the Notification storage.


Kibana allows administrators to set how long notifications will be stored

How will that work once we switch to UNS?

I'd expect administrators will configuration a retention period:

on Cloud via Cloud UI

on-prem via kibana.yml

rfcs/text/0014_notification_service.md

pgayvallet · 2021-02-10T11:37:03Z

rfcs/text/0014_notification_service.md

+Filtering allows users to hide notifications they aren’t interested in. Kibana UI attaches a list of filters when requesting notifications.
+Kibana stores filter as a part of User settings data to provide a seamless experience between different deployments on Cloud and user sessions.


What 'User settings' refers to exactly ? If it's stored using the local Kibana's user data / user persisted data, how will this be accessible from cloud / other deployments?

What 'User settings' refers to exactly ?

Right now, it refers to filters only.

If it's stored using the local Kibana's user data / user persisted data, how will this be accessible from cloud / other deployments?

The notification system writes them to the same storage when notifications are persisted. On Cloud, it will be stored in UNS to provide the same settings across different deployments.

rfcs/text/0014_notification_service.md

legrego · 2021-02-10T17:52:35Z

rfcs/text/0014_notification_service.md

+A notification might be addressed to:
+- a user
+- a specific group of users
+- all the users


How do we represent a notification that's addressed to all users? I don't expect we'd want to actually place all users within the recipient list

I added it initially, but removed later. Notifications addressed to all users will be created via a special method

createForAllUsers(events: Omit<NotificationEvent, 'recipient_id'>): void;

legrego · 2021-02-10T18:21:27Z

rfcs/text/0014_notification_service.md

+Kibana reuses Notification service implementation on Cloud but enhances it with additional pluggable storage of external notifications.
+![img](../images/0014/alternative_implementation.png)
+
+The main benefit of this approach is that Kibana always uses the same service in any environment.


A few of other benefits of this approach come to mind for me:

Graceful degradation

If the UNS is unavailable for any reason (even a planned/scheduled outage), then all cloud hosted instances will be unable to publish or retrieve notifications, even if these notifications were generated by that very Kibana instance.

If we took this pluggable storage approach, then locally sourced notifications could still work, as they wouldn't rely on the UNS for storage or retrieval.

To me, this is similar to your Netflix feed: the different categories are managed by different services to provide a single UX. If one of those services is unavailable, then the rest of your feed continues to work.

Extensibility

Once solutions start taking advantage of the notification service, we might uncover new requirements, which may in turn require changes to the data model. If we are coupled to the UNS's representation of the data model, then we are potentially limited by what we can do (short of a _meta dumping ground).

If, on the other hand, we store locally sourced events locally, then we have full control over the data model, and we can perform data migrations as needed.

We would still need to support the UNS's data model, but the types of events created by external sources may not require the same amount of functionality as locally sourced events.

Flexible authorization model

We could choose to authorize locally sourced notifications differently from externally sourced notifications. The UNS (i.e. externally sourced notification) is likely in a better position to perform auth on their own, but locally sourced notifications could have a stack-specific ACL attached.

Migration to ESS?

If an on-prem user wants to move their cluster to the Cloud, then a snapshot/restore would allow them to retain their existing notifications and notification history.

Having thought about this some more, I don't think many of these benefits really apply if locally sourced notifications have to be available consistently in other Elastic products

Graceful degradation
If the UNS is unavailable for any reason (even a planned/scheduled outage), then all cloud hosted instances will be unable to publish or retrieve notifications, even if these notifications were generated by that very Kibana instance.

Later, when UNS is available, it should synchronize notifications and their state between the remote and local storage. For sure, it's undoable, but it's not simple.

Extensibility
Once solutions start taking advantage of the notification service, we might uncover new requirements, which may in turn require changes to the data model.

Agree. It might be useful if we are ready to put up with the cost of data and UI inconsistency between a notification shown in Kibana UI and Console UI.

Flexible authorization model
We could choose to authorize locally sourced notifications differently from externally sourced notifications. The UNS (i.e. externally sourced notification) is likely in a better position to perform auth on their own, but locally sourced notifications could have a stack-specific ACL attached.

Based on the requirements that Cloud must show Kibana notifications in Cloud Console UI, how would it implement the Kibana-compatible authorization model?

Migration to ESS?
If an on-prem user wants to move their cluster to the Cloud, then a snapshot/restore would allow them to retain their existing notifications and notification history.

That's a valid point as long as we are sure that a user migrates to deployment with the same configuration as the local one.

legrego · 2021-02-10T18:29:14Z

rfcs/text/0014_notification_service.md

+A notification cannot have a particular recipient in the lack of a way to identify a user in the system. A notification will be shown to all the users instead.
+User-specific notification state (*read_status*) is not stored in Kibana but browser Local Storage.
+### Phase II
+When [user profiles](https://github.com/elastic/kibana/issues/17888) are supported, Kibana UI starts showing user-specific notifications.


Do we know if we could just not support notifications in that type of deployment?

This is my vote. The onboarding notifications potentially make sense w/o security, and I could make an argument for others, but this could also be an enticing forcing function to get our clusters secured.

We are still planning on enabling security by default starting in 8.0, so the burden of getting security setup will be greatly diminished.

rfcs/text/0014_notification_service.md

legrego · 2021-02-10T18:31:51Z

rfcs/text/0014_notification_service.md

+- Security solutions plugin - Cases
+- Alerting plugin - Alerts
+
+The notification system supports a user-specific state for notifications.


If it's not already done for Phase I, then it sounds like Phase II would need space-awareness? Or will you always see notifications from all of your spaces?

As far as I'm concerned, notifications are space-agnostic.

Spaces are difficult to use on the Cloud because a user does not have an active Space during the session in Cloud Console UI.

For on-prem Kibana, it makes sense to show notifications from all the user's Spaces. Say, you have an alert triggered in Space-A, why shouldn't you receive a notification even if your active Space is Space-B?

@alexfrancoeur could you please confirm these conclusions?

Confirming the current notification UX/UI is space agnostic, and the spirit of the feature is to provide visibility across our stack/products where ever you are (as you illustrate @restrry ).

That being said, I could see filtering by space making sense at some point. If we wanted to include the concept of a space as an optional part of the notification data model (intended as a way to organize notifications) what challenges would we run into?

If we wanted to include the concept of a space as an optional part of the notification data model (intended as a way to organize notifications) what challenges would we run into?

Let's continue with Alerting example. When an alert notification is created, it stores in what Space the alert has been created. https://github.com/elastic/kibana/blob/master/x-pack/plugins/alerts/server/alerts_client/alerts_client.ts#L1335
It can pass spaceId as a part of Notification content. Notification service itself cannot get this information in runtime.

It can pass spaceId as a part of Notification content. Notification service itself cannot get this information in runtime.

Agreed that the default experience should be showing all notifications across all spaces, but if we wanted to support filtering by space, couldn't we automatically source the current space from the request context when a notification is created? There may be some licensing and/or technical limitations that make this difficult, but it may be worth solving those if we want this experience.

couldn't we automatically source the current space from the request context when a notification is created?

Sure, we can do it when we have a request context (as in the case of Background search). but it's not always the case - we don't have request context when an alert is triggered. I'd consider SpaceId to be passed to provide A-Single-Way-To-Do-Things™.

legrego · 2021-02-10T18:35:05Z

rfcs/text/0014_notification_service.md

+although there are some odd instances with >100k users. So we consider two options at the moment (see the [workflow](https://docs.google.com/document/d/18s-BTHog_oPaQKf871gzuhxkah4pK1GYJM9DCLceaLo/edit#)):
+- Store notification state separately from a notification. State for new notifications created for every user when they change the state.
+```typescript
+interface  NotificationState {


Will we want audit logging for this state? Would administrators find it useful to know that a specific user read a specific notification?

I'm not aware of such a requirement at the moment. @alexfrancoeur

Let's say the answer is "no" for the MVP and we can think about it more when read-state moves server-side. I don't know enough about our audit logging myself, so feel free to speak up if anyone disagrees.

If read-state ends up being saved-object backed, will we kindof get this for free with audit logging or are system-managed saved objects exempt?

If read-state ends up being saved-object backed, will we kindof get this for free with audit logging or are system-managed saved objects exempt?

That's correct, we will have a record that someone updates SO (saved_object_update event, the full list of events). But we will need to add integration if we want to provide a more meaningful event on notification state change: notification_marked_read, notification_marked_unread, etc.

legrego · 2021-02-10T18:48:32Z

rfcs/text/0014_notification_service.md

+Kibana-specific entities shouldn’t leak to UNS: 
+- Kibana sends message translations for all the supported locales.
+- Kibana-specific groups of recipients unfolded to a list of *recipient_id*.
+- *recipient_id* is set to *elastic_id*.


What is the distinction between a recipient_id and an elastic_id? Will all users on ESS have a unique elastic_id?

What is the distinction between a recipient_id and an elastic_id?

I suspect it will be elastic_id as long as we need to unambiguously identify a user across the Stack and Cloud.
In any case, Notification plugin relies on:

recipient_id passed in from outside when creating the notification. So recipient_id form is defined by Security plugin eventually.

getting user id from Security plugin when requesting notifications from Kibana UI.

I'd refer to Security team whether:

Will all users on ESS have a unique elastic_id?

Will elastic_id be created for on-prem Kibana users or user id will be in another format?

I don't know the answer here, but I do think it's important to not overlook. As far as I'm aware, we don't have precedent to rely on when figuring out how to translate from "Cloud users" to "Stack users".

lukeelmers · 2021-02-11T17:25:43Z

rfcs/text/0014_notification_service.md

+- `created_at: number` - Unix timestamp in UTC timezone when a notification has been created.
+- `expired_at?: number` - Unix timestamp in UTC timezone when notification can be removed from the system. 
+- `pinned_status?: boolean` - a flag to pin a notification at top of the list of notifications. Used by a source to draw a user's attention to a notification.
+- `source_type: string` - source type. the same as source domain name: `cloud`, `alerting`.


I'm probably just thrown off by the examples here, but I assume the plugins/sources relationship is one:many? (That is, a plugin could create notifications as multiple sources). Or are you suggesting that for internal sources, source type === plugin name?

That is, a plugin could create notifications as multiple sources

I assumed source_type to be a domain name. Sometimes source type === plugin name, sometimes source type might be the same for several plugins in the common domain.

OTOH the only use case for source_type usage at the moment is filtering.

lukeelmers · 2021-02-11T17:33:01Z

rfcs/text/0014_notification_service.md

+- `message: { i18n_key: string, values: object } | { text: string }` - a notification message is subject to i18n. 
+Locale might be changed on the Space level (on the User level in the future).
+Thus, we cannot determine a message locale during notification creation.
+Considering that Kibana notifications might be rendered outside of Kibana UI (in Cloud UI, for example), the message must contain translations for all the supported locales.


Was wondering the same, I assumed this would be something like

message: { i18n_key: string; values: Record<string, string> }

Where Record<string, string> is { [languageCode]: 'translation of message' }

rfcs/text/0014_notification_service.md

Co-authored-by: Alejandro Fernández Haro <afharo@gmail.com> Co-authored-by: Luke Elmers <lukeelmers@gmail.com> Co-authored-by: Josh Dover <1813008+joshdover@users.noreply.github.com>

…tion for all users

rfcs/text/0014_notification_service.md

kobelb · 2021-02-12T00:00:31Z

rfcs/text/0014_notification_service.md

+- `message: { i18n_key: string, values: object } | { text: string }` - a notification message is subject to i18n. 
+Locale might be changed on the Space level (on the User level in the future).
+Thus, we cannot determine a message locale during notification creation.
+Considering that Kibana notifications might be rendered outside of Kibana UI (in Cloud UI, for example), the message must contain translations for all the supported locales.
+- `title: { i18n_key: string, values: object, url?: string } | { text: string, url?: string }` - notification header.
+- `action: { i18n_key: string, values: object, url: string } | { text: string, url: string }` - notification CTA


I agree that it seems simpler to store the translated text. Spaces does throw an interesting wrench in that plan though... Perhaps if the source of the notification event is some space-specific entity, we use the Space's locale if the user hasn't specified one?

rfcs/text/0014_notification_service.md

kobelb · 2021-02-17T23:30:08Z

rfcs/text/0014_notification_service.md

+Locale might be changed on the Space level (on the User level in the future).
+Thus, we cannot determine a message locale during notification creation.
+Considering that Kibana notifications might be rendered outside of Kibana UI (in Cloud UI, for example), the message must contain translations for all the supported locales.
+- `title: TranslatedContent & { url?: string }` - notification header.


Assuming these URLs are for Kibana, the backward compatibility guarantees for URLs don't cross major versions. To side-step this complication, would it make sense for us to drop all Kibana-generated notifications from the prior major version?

To side-step this complication, would it make sense for us to drop all Kibana-generated notifications from the prior major version?

In this case, the user will lose access to all notifications created in the previous version.
If we store notifications with the SO mechanism, might it be better to implement migrations for notifications? Plugins can extend notification service migrations with custom logic. We have something similar for embeddable plugin https://github.com/restrry/kibana/blob/099e9bc50fafa89301d438ae7112330da5ab421c/src/plugins/kibana_utils/common/persistable_state/index.ts#L60

kobelb · 2021-02-17T23:46:34Z

rfcs/text/0014_notification_service.md

+Kibana-specific entities shouldn’t leak to UNS: 
+- Kibana sends message translations for all the supported locales.
+- Kibana-specific groups of recipients unfolded to a list of *recipient_id*.
+- *recipient_id* is set to *elastic_id*.


I don't know the answer here, but I do think it's important to not overlook. As far as I'm aware, we don't have precedent to rely on when figuring out how to translate from "Cloud users" to "Stack users".

rfcs/text/0014_notification_service.md

kobelb · 2021-02-17T23:52:56Z

rfcs/text/0014_notification_service.md

+
+# Drawbacks
+
+- It is not entirely clear what functionality will be supported by UNS and when this will be known.


The uncertainty around UNS does concern me because it's going to be a drop-in replacement for the local implementation. There's the potential that both systems won't integrate seamlessly because we know so little about it.

rfcs/text/0014_notification_service.md

kobelb · 2021-02-17T23:55:41Z

rfcs/text/0014_notification_service.md

+Integration with UNS is something that is not going to happen any time soon. The work can be divided into the following phases:
+
+### Phase I
+Kibana shows notifications created by the Newsfeed Kibana plugin in Notification flyout.


Do you know how we're going to "pull" from the newsfeed and persist these notifications locally? Will this be using TaskManager? Are there any situations where we currently delete newsfeed items that complicate this implementation?

Are there any situations where we currently delete newsfeed items that complicate this implementation?

The current Newsfeed implementation doesn't store any data. It fetches notifications from the browser on every visit and shows them in UI.
On Phase I we start with moving the existing Newsfeed plugin logic to the Notification service and add transformation layer to convert https://github.com/elastic/newsfeeds/blob/master/data/kibana.yaml#L10-L17 to the notification compatible interface. News is shown in Notification UI, but not stored in Notification storage.

Do you know how we're going to "pull" from the newsfeed and persist these notifications locally? Will this be using TaskManager? Are there any situations where we currently delete newsfeed items that complicate this implementation?

Newsfeed plugin can fetch notifications and create a notification for every news item. Every news has hash property, which is recommended to use as id(see output fields). We can make notification creation idempotent operation and use hash as id to avoid creating duplicates.

Or we have to track manually what news already created notifications.

kobelb · 2021-02-18T18:55:14Z

rfcs/text/0014_notification_service.md

+Integration with UNS is something that is not going to happen any time soon. The work can be divided into the following phases:
+
+### Phase I
+Kibana shows notifications created by the Newsfeed Kibana plugin in Notification flyout.


Having the Kibana server pull the notifications from the remote newsfeed service won't work if the user has a network topology where Kibana can't access the newsfeed on the public internet. It's my understanding that this is a significant number of users. @afharo do you happen to know a rough percentage of users that have Kibana firewalled off from the public internet?

@kobelb I think this is a key discussion you raised here (and I think it extends to more than the Newsfeed context):

To answer your question, as of today: close to 70% of the clusters that reported in the last month, did so from the server at least once. We enabled server-side reporting ON by default on 7.9.0. If we filter the stacks from that version, the percentage grows to ~90%. Disclaimer: this is only for those clusters opted-in telemetry

Newsfeed nowadays runs entirely on the public side: each user's browser fetches the newsfeed as they load the page (and keeps the data in the browser's memory, not even the local storage). The Kibana server is not involved in the process: it doesn't fetch/store anything. Even when the trend looks like we're heading to more relaxed firewall rules, I think we should consider UI-sourced notifications, and the implications they might have. In my experience, if we don't provide a common way to do so, plugins will follow their own way, potentially leading to security issues.

If we solve how UI-sourced notifications may work, I don't think we need to worry about the implementation of the Newsfeed fetcher.

Apart from working around the potential firewall issue, other UI-sourced notifications I can think of:

Accumulate toasts? i.e.: Errors, "Upgrade your license", ...

Search session (although this could be generated from the server)

"Upgrade your browser for a better experience", "Use this feature to save you these 10 clicks you always go through" (maybe they are more related to Suggestion service though).

What do you think?

I think we should consider UI-sourced notifications, and the implications they might have. In my experience, if we don't provide a common way to do so, plugins will follow their own way, potentially leading to security issues.

This RFC doesn't distinguish between Kibana client-side and server-side sources.
But I would prefer not to introduce the concept of UI-sourced notifications as long as possible because of:

the security concern. The browser environment is open, it much easier for an intruder to use API. For example, if we provided an HTTP endpoint to create a notification within the system, someone would use it to spam the users with notifications.

proper separation of concerns. It's almost always a good idea to keep business logic out of UI: it's easier to reason about and test it.

coordination problem: a user can have Kibana open in several tabs (and on different devices). We have to come up with a solution to coordinate notification creation to avoid duplicate records.

If we filter the stacks from that version, the percentage grows to ~90%. Disclaimer: this is only for those clusters opted-in telemetry

and exclude Elastic Cloud the percentage on Kibana instances not hidden behind the firewall is ~91%

~90% is a lot better than I expected...@restrry do you think it's going to make the architecture a lot more complicated to get the last 10% working with the newsfeed?

@kobelb It's doable if we are okay to live with API that has the potential problems outlined above #90297 (comment)

Are there other alternatives? What if we reconsider the decision to display News as part of the Notification experience? @thesmallestduck

Are there other alternatives? What if we reconsider the decision to display News as part of the Notification experience?

Newsfeed notifications are paused. We are going to wait for more use-cases of UI-sourced notifications.

add notification service rfc

2285d9c

mshustov changed the title ~~[RFCadd notification service~~ [RFC] Kibana notification service Feb 4, 2021

add ToC

39b23cc

mshustov added RFC backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc labels Feb 4, 2021

mshustov added the v8.0.0 label Feb 4, 2021

mshustov requested review from kobelb, legrego and a team February 4, 2021 16:50

mshustov mentioned this pull request Feb 4, 2021

Notification system investigation #86251

Closed

26 tasks

clarify euiIcon interface, filtering storage mechanisms, comments fro…

580c32e

…m Jochem

joshdover reviewed Feb 10, 2021

View reviewed changes

afharo reviewed Feb 10, 2021

View reviewed changes

pgayvallet reviewed Feb 10, 2021

View reviewed changes

legrego reviewed Feb 10, 2021

View reviewed changes

lukeelmers reviewed Feb 11, 2021

View reviewed changes

mshustov and others added 7 commits February 15, 2021 14:14

Apply suggestions from code review

27bbe44

Co-authored-by: Alejandro Fernández Haro <afharo@gmail.com> Co-authored-by: Luke Elmers <lukeelmers@gmail.com> Co-authored-by: Josh Dover <1813008+joshdover@users.noreply.github.com>

Merge remote-tracking branch 'origin' into rfc-kns

9232051

Merge remote-tracking branch 'origin/rfc-kns' into rfc-kns

f8dc88f

fix typo

3d00fd2

address comments

0337af5

change statuses to prefixed versions, add a method to create notifica…

ba63757

…tion for all users

clarify group_id, remove source_subtype

026f53d

mshustov added 5 commits February 16, 2021 12:27

flesh out 817n strategty

8509058

use consistent naming with snake_case

24b9dec

add a note to use SO

255db48

add other options to store notification and its state

e908524

Merge branch 'master' into rfc-kns

099e9bc

kobelb reviewed Feb 17, 2021

View reviewed changes

address Brandon comments

8b04961

kobelb reviewed Feb 18, 2021

View reviewed changes

mshustov added 5 commits February 19, 2021 11:12

propose new UserStorageStatus interface, add a note about SO

1994f9c

rename RFC, im-memory queue, no notification state on Phase I

c995ae9

Merge remote-tracking branch 'origin/rfc-kns' into rfc-kns

dcdbbc0

add a note about URL format in CTA & title

68e64c1

URL format remark

30e6ee6

lukeelmers mentioned this pull request May 3, 2021

[Deprecation service] Using i18n for deprecation messages #99072

Closed

mshustov mentioned this pull request May 4, 2021

Investigate Server --> Client communication channel support #98881

Open

tsullivan mentioned this pull request May 13, 2021

Move reporting from management application into the user setting #36141

Closed

mshustov mentioned this pull request May 26, 2021

Move task manager out of x-pack #100663

Open

legrego mentioned this pull request Sep 23, 2021

[Discuss] Support for authenticating users via proxy-based mechanisms #113002

Closed

lukeelmers removed the v8.0.0 label Feb 10, 2022

legrego closed this Jul 17, 2024

		Writing to the same Notification storage might cause conflict on the `write`. If the storage mechanism implementation
		doesn’t make them impossible, we will have to use a single place to change notification status (via the [queue](#localrepository)).


	- We are relying on X-Pack features (like security and task manager), so we are tied to that. Having to maintain both: the current Newsfeed and NC as an X-Pack replacement.

		- `expired_at` if not specified. Kibana allows administrators to set how long notifications will be stored. Falls back to 30 - 90 days be default.
		- Writing new incoming messages to the Notification storage.

		Filtering allows users to hide notifications they aren’t interested in. Kibana UI attaches a list of filters when requesting notifications.
		Kibana stores filter as a part of User settings data to provide a seamless experience between different deployments on Cloud and user sessions.


		# Drawbacks

		- It is not entirely clear what functionality will be supported by UNS and when this will be known.

[RFC] Kibana notification service #90297

[RFC] Kibana notification service #90297

Conversation

mshustov commented Feb 4, 2021 • edited Loading

elasticmachine commented Feb 4, 2021

jowiho commented Feb 5, 2021

jowiho commented Feb 5, 2021

mshustov commented Feb 8, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 18, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 16, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 11, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

afharo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 11, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 13, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 13, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Graceful degradation

Extensibility

Flexible authorization model

Migration to ESS?

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 18, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov Feb 18, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mshustov commented Feb 4, 2021 •

edited

Loading

mshustov commented Feb 8, 2021 •

edited

Loading

mshustov Feb 18, 2021 •

edited

Loading

mshustov Feb 16, 2021 •

edited

Loading

mshustov Feb 11, 2021 •

edited

Loading

mshustov Feb 11, 2021 •

edited

Loading

mshustov Feb 13, 2021 •

edited

Loading

mshustov Feb 13, 2021 •

edited

Loading

mshustov Feb 18, 2021 •

edited

Loading

mshustov Feb 18, 2021 •

edited

Loading

mshustov Feb 18, 2021 •

edited

Loading