Simple Dashboard Launcher UI (To launch old dashboards) #1147

Open
wants to merge 582 commits into base: main
Conversation

Nanthagopal-Eswaran

Items to add to release announcement:

  • Heading: 🚀 TruLens Dashboard Launcher

It provides a simple UI where users can select sqlite files from previous runs and launch the TruLens dashboard. This way, users don't have to worry about logging results for later use; they can open the TruLens-provided Streamlit UI at any time.

🕹️ Usage
Run the following command:

poetry run trulens-eval-dashboard

Features

  1. Simple UI to quickly select different sqlite files and launch TruLens dashboards.

  2. Multiple dashboards can be viewed by assigning a different port number to each.

Open Multiple Dashboards

Currently, the tool can open only one dashboard at a time. To open multiple dashboards, a quick workaround is to launch the tool multiple times, as sketched below.
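For reference, here is a minimal sketch of the same workaround done programmatically with the trulens_eval API rather than through the launcher. File names and ports are illustrative, and since Tru behaves as a per-process singleton, each dashboard needs its own terminal session:

```python
# Run once per terminal session, pointing each session at a different
# results database and a distinct port. Paths and ports are placeholders.
from trulens_eval import Tru

tru = Tru(database_url="sqlite:///run_a.sqlite")  # a previous run's sqlite file
tru.run_dashboard(port=8502)                      # use a different port per dashboard
```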

Other details that are good to know but need not be announced:

🌱 Improvements [For Future]

  1. Open multiple dashboards from a single window.
  2. Allow sharing the Streamlit dashboard with others via a shareable link.
  3. Build an executable so users don't need to install Python and Poetry, making the tool more portable.
  4. UI/UX improvements.

github-actions bot and others added 30 commits December 18, 2023 17:36
joshreini1 and others added 21 commits May 9, 2024 08:56
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label May 21, 2024
@Nanthagopal-Eswaran
Author

We have been using this tool internally since we need to open multiple dashboards side by side and compare them. We thought this might be a common need for TruLens users, so we are sharing it here.

I know the UI might not be very attractive, but the need is real.

@joshreini1
Contributor

@Nanthagopal-Eswaran would love to understand the need more here. Why do you need to compare multiple dashboards rather than logging the different apps to the same sqlite db and thus comparing the apps in the same dashboard?

@Nanthagopal-Eswaran
Author

Nanthagopal-Eswaran commented May 22, 2024

@Nanthagopal-Eswaran would love to understand the need more here. Why do you need to compare multiple dashboards rather than logging the different apps to the same sqlite db and thus comparing the apps in the same dashboard?

Hi @joshreini1,

There are two main reasons:

Evaluating changes across different versions of the app
I get that we can use the same db and append to it, but that is the ideal case, right? What if we want to compare tests executed by different engineers or teams, or automate these tests through GitHub Actions or AzDO pipelines and share the db alone by mail?

Sharing the report / re-opening the report
Since it is a Streamlit UI, we always need a small snippet of Python code to load previous results. I initially thought of an exe that I could share with my stakeholders, so that they don't have to install Python and its dependencies to open the reports; instead, they could launch this tool and open the report.

@joshreini1
Copy link
Contributor

Thanks @Nanthagopal-Eswaran - I hope you don't mind if I drill down a bit more :)

Evaluating changes across different versions of the app

I get that we can use the same db and append to it, but that is the ideal case, right? What if we want to compare tests executed by different engineers or teams, or automate these tests through GitHub Actions or AzDO pipelines and share the db alone by mail?

Comparing tests executed by different engineers, teams, or automation would be better supported by a shared database for the results than by tracking a bunch of different sqlite dbs. Adding a shared database for TruLens to log to only requires passing a database URL compatible with SQLAlchemy (docs). Does this seem reasonable?
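For illustration, a minimal sketch of what this looks like, assuming a reachable shared database (the connection string below is a placeholder; any SQLAlchemy-compatible URL should work):

```python
from trulens_eval import Tru

# Placeholder connection string for a hypothetical shared Postgres instance.
tru = Tru(database_url="postgresql://user:password@shared-host:5432/trulens")
tru.run_dashboard()  # everyone pointing at the same URL sees the same results
```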

Sharing the report / re-opening the report
Since it is a Streamlit UI, we always need a small snippet of Python code to load previous results. I initially thought of an exe that I could share with my stakeholders, so that they don't have to install Python and its dependencies to open the reports; instead, they could launch this tool and open the report.

This use case seems reasonable; however, I'm not sure it makes sense to support it directly in the package - it might be better as an internal tool. I would suggest that hosting the dashboard (on an EC2 instance or similar) might be easier here, for both you and the stakeholders.

@Nanthagopal-Eswaran
Author

Nanthagopal-Eswaran commented May 22, 2024

@joshreini1, Thanks for your insights.

On the first point, I see what you mean; we will have to try it to see how practical it is. I am more worried about the data getting lost or corrupted by mistake, since all the previous reports would then be in a single location and regenerating them would be very costly. For reference, it takes more than $40 to execute one test run in our case.

As for the second point, yes, it would be good to have this as a separate tool. Feel free to skip this PR, but please discuss it internally and add it as a separate repo if it is really needed.

But this conversation actually made me realize the main problem here: you can see the issue with reports being Streamlit apps, right? As a developer, I can use this tool to view previous results whenever I want, but stakeholders might not need the full details. Is there a way to export the report as a standalone HTML file (similar to pytest-html reports)? It doesn't need many features or details; even the leaderboard alone would do. This would also be helpful if we want to run automated tests and send just the leaderboard in an automated mail.

I quickly went through the Streamlit repo and found an issue that clearly shows the importance of standalone HTML reports: streamlit/streamlit#611

@Nanthagopal-Eswaran Nanthagopal-Eswaran changed the title Simple Dashboard Launcher UI (To view launch old dashboards) Simple Dashboard Launcher UI (To launch old dashboards) May 22, 2024
@joshreini1
Contributor

Thanks @Nanthagopal-Eswaran - definitely understand and agree with your points on the importance of standalone reports. I'll continue to discuss with the team and get back to you once we've got a plan here.

BTW - one additional workaround might be to use tru.get_leaderboard() in a notebook and export that to HTML.
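For example, a minimal sketch of that workaround (the database path and output file name are illustrative):

```python
from trulens_eval import Tru

tru = Tru(database_url="sqlite:///default.sqlite")  # existing results db
leaderboard = tru.get_leaderboard(app_ids=[])       # returns a pandas DataFrame
leaderboard.to_html("leaderboard.html")             # standalone file to share by mail
```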
