
tf.summary.text fails keeping summaries #83

Closed
chihuahua opened this issue Jun 16, 2017 · 8 comments

@chihuahua (Member) commented Jun 16, 2017

This issue was migrated from tensorflow/tensorflow#10204:

I get the following issues when I use tf.summary.text and view the summaries in TensorBoard:

  • It shows me text summaries in random order.
  • It randomly removes existing summaries and shows me only a few (is there a configuration for the maximum number of summaries to keep?).
  • I can usually see only around 5 summaries in TensorBoard even though I have added summaries 100+ times.
  • Other summaries work properly when I use them like below.
summary_op = tf.summary.merge(summaries) # Other scalar, distribution, histogram summaries
valid_summary_op = tf.summary.merge([valid_sentence_summary]) # text summary with tf.summary.text

I can reproduce this problem in two different environments.

  1. Ubuntu 14.04 / CUDA 8.0 / Cudnn 5.1 / TF 1.1.0rc2 / Bazel 0.4.5 / GPU TITAN X Pascal (use 0 gpus~4gpus)
  2. macOS Sierra / TF 1.1.0rc2 / Bazel 0.4.5 / No GPU

Below is sample code to reproduce this issue.

import tensorflow as tf

text_list = ['this is the first text', 'this is 2nd text', 'this is random text']
id2sent = {id: sent for id, sent in enumerate(text_list)}
sent2id = {sent: id for id, sent in id2sent.items()}

tf.reset_default_graph()

# A text summary defined outside the inner name scope.
outer_string = tf.convert_to_tensor('This is string outside inner scope.')
outer_summary = tf.summary.text('outside_summary', outer_string)

with tf.name_scope('validation_sentences') as scope:
    id_list = tf.placeholder(tf.int32, shape=[3], name='sent_ids')

    valid_placeholder = tf.placeholder(tf.string, name='valid_summaries')

    # A text summary defined inside the name scope, merged with the outer one.
    inner_summary = tf.summary.text('sent_summary', valid_placeholder)
    summaries = [outer_summary, inner_summary]
    summary_op = tf.summary.merge(summaries)

sess = tf.Session()
summary_writer = tf.summary.FileWriter(logdir='./text_summary', graph=sess.graph)

for step in range(10):

    predicted_sents_ids = sess.run(
        id_list,
        feed_dict={
            id_list: [0, 1, 2]
        })

    # list of strings looked up from the predicted ids
    predicted_sents = [id2sent[id] for id in predicted_sents_ids]

    valid_summary = sess.run(summary_op, feed_dict={
        valid_placeholder: predicted_sents
    })

    # Write the merged text summaries for this step.
    summary_writer.add_summary(valid_summary, global_step=step)
    # summary_writer.flush()
# summary_writer.flush()
# flush() didn't help..

And below is the result in TensorBoard.

[Screenshot: the TensorBoard Text dashboard shows only a few of the logged summaries, out of order.]

@b3nk4n commented Nov 20, 2017

I'm using TensorFlow 1.3 and have the same issue:

  • Like in @chihuahua's screenshot above, TensorBoard's Text tab randomly skips some steps, so I can only see e.g. step 0, step 12000, step 17000, ...
  • Furthermore, when I start my training process, I use tf.summary.text to write out all of the program's argparse parameters, so that I can easily check which hyperparameters I used (roughly the pattern sketched below). Unfortunately, it looks like there is a limit of 10 tf.summary.text() calls per step; at least, only 10 values are displayed in TensorBoard.
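
For reference, the pattern looks roughly like this (a simplified sketch with made-up parameter names, not my actual training script):

import argparse

import tensorflow as tf

parser = argparse.ArgumentParser()
parser.add_argument('--learning_rate', type=float, default=0.001)
parser.add_argument('--batch_size', type=int, default=32)
parser.add_argument('--optimizer', type=str, default='adam')
args = parser.parse_args()

# One tf.summary.text() op per hyperparameter, all written once at step 0.
param_summaries = [
    tf.summary.text('hparams/' + name, tf.convert_to_tensor(str(value)))
    for name, value in sorted(vars(args).items())
]

sess = tf.Session()
writer = tf.summary.FileWriter('./hparams_demo', graph=sess.graph)
writer.add_summary(sess.run(tf.summary.merge(param_summaries)), global_step=0)
writer.flush()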

@nfelt (Collaborator) commented Nov 27, 2017

@bsautermeister - the random skipping is a result of reservoir sampling applied to all the event file data that TensorBoard processes. The sampling process means that for each tag's tensors, TensorBoard displays only a random subsample of up to N values. The value of N varies - it's 1000 for the scalars dashboard charts, for example - but it's set to just 10 by default for the images, audio, and text dashboards (the former two are set explicitly in DEFAULT_TENSOR_SIZE_GUIDANCE while the latter inherits the overall default from DEFAULT_SIZE_GUIDANCE):
https://github.com/tensorflow/tensorboard/blob/0.4.0-rc3/tensorboard/backend/application.py#L50
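
To illustrate the sampling behavior itself, here is a generic sketch of reservoir sampling (not TensorBoard's actual implementation):

import random

def reservoir_sample(stream, k):
    """Keep a uniform random sample of at most k items from a stream."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Each later item replaces a kept item with probability k / (i + 1).
            j = random.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# With 100 steps of text summaries and a reservoir size of 10, only a random
# subset of ~10 steps survives, which is why steps appear to be skipped.
print(reservoir_sample(range(100), 10))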

I'm not sure if there's a good way right now to override those values, others on the project might know better - if not, we can at least create a feature request for that.

In terms of writing out argparse parameters, I'm guessing that the limit of 10 is coming from the reservoir size as discussed above, but I'd expect that to apply to a single tag over all steps, rather than to a single step across tags. How exactly are you calling tf.summary.text()? If you can show a minimal reproduction we might be able to diagnose more closely. If it does turn out to be the reservoir size limit again, one option there might be doing a single tf.summary.text() call and passing in a rank-1 tensor with a list of all your parameters, and a unique tag name (like "parameters") that you don't use for any other summary ops.
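
As a rough sketch of that last suggestion (the parameter values here are just placeholders):

import tensorflow as tf

# All parameters as one rank-1 string tensor under a single, dedicated tag,
# written once, so the text reservoir never has to drop any of them.
params = ['learning_rate=0.001', 'batch_size=32', 'optimizer=adam']
params_summary = tf.summary.text('parameters', tf.convert_to_tensor(params))

sess = tf.Session()
writer = tf.summary.FileWriter('./params_demo', graph=sess.graph)
writer.add_summary(sess.run(params_summary), global_step=0)
writer.flush()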

@b3nk4n commented Nov 28, 2017

@nfelt
Thank you for your detailed answer!

I already did a workaround to concatenate all values into a single comma-separated string, so that I can write the hyperparameters as a single chunk. It is not formatted that nicely, but it's more or less sufficient for me. But your idea of using a rank-1 tensor should also work, I guess, and might be even better ;-)
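
Roughly what my workaround looks like (simplified, with placeholder values):

import tensorflow as tf

# Build one comma-separated string of all hyperparameters and log it with a
# single tf.summary.text() call under one tag.
hparams = {'learning_rate': 0.001, 'batch_size': 32, 'optimizer': 'adam'}
hparams_string = ', '.join('%s=%s' % (k, v) for k, v in sorted(hparams.items()))

hparams_summary = tf.summary.text('hyperparameters',
                                  tf.convert_to_tensor(hparams_string))

sess = tf.Session()
writer = tf.summary.FileWriter('./hparams_string_demo', graph=sess.graph)
writer.add_summary(sess.run(hparams_summary), global_step=0)
writer.flush()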

@nfelt (Collaborator) commented Dec 6, 2017

Sounds good. For now, I'm going to go ahead and close this issue as "working as intended", but I think we'd be open to either bumping up the default text plugin sample count or maybe making it configurable (for Googlers, see also internal bug b/34722493). Or yet another option might be to more clearly highlight on the dashboards when sampling is or isn't in effect, something like a message that says "Showing 10 uniformly sampled values out of 100" to indicate what's going on.

I think our default plan would be to revisit this once we're farther along towards the DB backend since that will change how we do sampling anyway, but if you'd like any of these short-term resolutions, feel free to open a FR as a separate issue and we may be able to get to it.

@sleighsoft

For me personally, it would be great to just select the N steps I want to see, without any sampling applied.

@nfelt (Collaborator) commented Apr 3, 2018

@sleighsoft yes, the database backend should allow us to avoid sampling unless necessary, and once we have that, we'd like to add the ability to select arbitrary steps for viewing.

In the interim, one option is just editing the "SIZE_GUIDANCE" values as described above (sketched below) to bump them above the number of steps you're emitting text summaries for. We could also probably just bump the default value for the text plugin up to 100 or more, since text data is typically not that large (compared to e.g. images), if that would help (while recognizing that you'd still run into the sampling issue, just at a larger number of logged summaries).
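
For concreteness, the local edit would look something like this against the 0.4.x layout of tensorboard/backend/application.py linked above (the 'text' entry is the addition; exact names may differ between versions):

# tensorboard/backend/application.py (local edit; 0.4.x layout)
DEFAULT_TENSOR_SIZE_GUIDANCE = {
    scalar_metadata.PLUGIN_NAME: 1000,
    image_metadata.PLUGIN_NAME: 10,
    audio_metadata.PLUGIN_NAME: 10,
    'text': 1000,  # added: keep up to 1000 text samples per tag
}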

@sleighsoft

My current solution is writing Markdown tables to disk, as TensorBoard is not usable for me with sampling.

@nfelt (Collaborator) commented Jun 15, 2018

FYI, PR #1138 added a --samples_per_plugin flag that can be used to set the number of samples retained on a per-plugin basis. So e.g. --samples_per_plugin=text=100 should set the text dashboard to retain 100 samples for each series.
