Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip highlighting for output lines. #856

Merged
merged 1 commit into from
Mar 26, 2018

Conversation

chriswells0
Copy link
Contributor

Lines marked as output are skipped by the highlighting process to prevent false positives.

@LeaVerou
Copy link
Member

Thanks! Wouldn't it be more usable if there was a character in the beginning of such lines to indicate them instead of having to figure out line numbers for data-output?

@chriswells0
Copy link
Contributor Author

Do you mean some sort of non-code text at the beginning of every line of output? My instinct says that's at least as cumbersome as identifying the output lines in an attribute, but then I haven't tried it. :)

There are 2 other concerns that come to mind:

  1. It seems like a kludge (for lack of a better word) to add fake data when the role of markup is to describe data.
  2. How would we choose character(s) to identify output in any given language and know it's never part of the code/output?

I may have misunderstood what you were suggesting, though. Please correct me if I'm misinterpreting it.

}
}
});

env.element.innerHTML = prompt.outerHTML + env.element.innerHTML;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is not part of this PR, but I've just noticed this line. How about something like this:

pre.insertBefore(prompt, env.element);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better:

env.element.insertBefore(prompt, env.element.firstChild);

@zeitgeist87
Copy link
Collaborator

It is a bit strange, that user has to specify the output lines with data-output. Wouldn't it be more natural to specify the lines that have prompts? I think that could also simplify the code. What do you think @chriswells0 ?

EDIT: I guess it depends on the type of content. If you have more commands or more output lines...

@chriswells0
Copy link
Contributor Author

I guess it depends on the type of content. If you have more commands or more output lines...

Exactly. If the purpose of the plugin is to highlight commands, then the output lines would be the outliers. That's the way I see it, anyway.

@LeaVerou
Copy link
Member

It seems like a kludge (for lack of a better word) to add fake data when the role of markup is to describe data.

Your argument seems like a philosophical concern, whereas the problem I pointed out is very real: Specifying line numbers is not only difficult (how do you even know if you're not using the line number plugin? Do you count them, one by one, with your finger?), but also fragile (all numbers change if new lines are added).
Philosophical purity is NOT above usability. At least not in this project.

How would we choose character(s) to identify output in any given language and know it's never part of the code/output?

Same way as any metacharacter or keyword in any language, ever: You provide an escape sequence to escape it in the odd chance it's actually used as code. We're not treading new ground here :P

@chriswells0
Copy link
Contributor Author

Do you count them, one by one, with your finger?

That seems easier than manually adding text before every line of output, but that could be solely because I've only done it a handful of times using the current approach and no other. From a usability perspective, would you rather count the lines or paste special chars before each line? To simplify that, we could instead insert beginning and ending markers around the output lines, but then we're effectively resorting to marking it up.

You provide an escape sequence to escape it in the odd chance it's actually used as code.

Obviously, we could make it very unlikely that the characters would match the output (e.g., ^\[OUTPUT\]) or go with something simple (e.g., ^_{5}). Then we're asking users to insert those chars in front of every line and also escape the ones that do exist in their output.

@LeaVerou
Copy link
Member

From a usability perspective, I would definitely prefer to add a prefix before these lines than to have to find their line numbers and keep them in sync with the code.
The main issue I see with that approach is that it spoils the fallback in case Prism fails to load/run for whatever reason. But we could make the prefix readable, e.g. something like (Output) or (Out).
Of course, both approaches can co-exist so that you can still use your line numbers and everyone else can use a prefix.

@chriswells0
Copy link
Contributor Author

it spoils the fallback in case Prism fails to load/run for whatever reason

If that's a concern, then it might make sense to add extra data. I'd appreciate retaining the attribute's functionality so I wouldn't have to add the data in my case, but maybe I'll change my mind once I see it in action.

  1. What would you like the prefix to be?
  2. Do you want that in a separate PR? Seems like it should be.
  3. Should I add the same fallback functionality to the Line Highlight plugin?

@zeitgeist87
Copy link
Collaborator

I think it is a good idea to also support a marker character for the output lines. You already have the code lines in the variable codeLines. It would be a simple matter to iterate through them and check if they start with a certain character. A simple loop and regex should do it. You can still support your data-output attribute at the same time:

// using > as line marker and \> to escape it
for (var i = 0; i < codeLines.length; ++i) {
    codeLines[i] = codeLines[i].replace(/(^\s*)(\\?>)(.*)/, function(match, p1, p2, p3) {
        if (p2 == '>') {
            outputLines[i] = p1 + p3;
            return '';
        } else {
            return p1 + '>' + p3;
        }
    });
}

@chriswells0
Copy link
Contributor Author

Is > specifically the character we want to use?

If it's being used to mark the output lines, is there any reason to allow whitespace in front of the marker? I would think it should come at the beginning of the line when used as a marker.

@zeitgeist87
Copy link
Collaborator

Is > specifically the character we want to use?

No, that was just an example. Feel free to use something different. I chose it, because it reminded me of the multiline string marker in bash:

[andreas@terok ~]$ echo "multiline
> 
> 
> string"

If it's being used to mark the output lines, is there any reason to allow whitespace in front of the marker? I would think it should come at the beginning of the line when used as a marker.

You are right, there is no good reason. If we disallow whitespace, then we don't need a regex and we can simplify the code.

@LeaVerou
Copy link
Member

I think > is not a good option exactly because it's like the multiline character, so multiline commands and output would get confused.
Anything works for me, but please choose something that is short to type and works as a fallback too.
I suggested (Out) earlier, but I'm not married to it.

@chriswells0
Copy link
Contributor Author

@zeitgeist87: thoughts on (Out) or other suggestions?

I'd lean toward all lowercase or all uppercase, but we might as well make it case-insensitive anyway.

@chriswells0
Copy link
Contributor Author

@LeaVerou, @zeitgeist87: Would you expect the data-output attribute and the line prefix to be allowed on the same code block or would you prefer to skip the prefix check if the attribute is provided? I could go either way on that.

EDIT: Actually, I'd lean toward skipping the line prefix check if the user provides a data-output attribute. There's no reason to potentially make someone escape their code if we know they specified the output already. Let me know if you disagree.

@zeitgeist87
Copy link
Collaborator

Would you expect the data-output attribute and the line prefix to be allowed on the same code block or would you prefer to skip the prefix check if the attribute is provided?

I think it's better not to allow them at the same time. As you pointed out, there is no need to force the user to escape code that only uses the data-output attribute.

thoughts on (Out) or other suggestions?

We could do it like the diff format used by git diff. Instead of using a long marker like (Out) and escaping it, we could simply add a marker to every line. For example a + for a prompt and a space for an output line. For example this code

ls -al
total 44
drwxr-xr-x 11 root root 4096 Jan 23 23:44 .
drwxr-xr-x  5 root root 4096 Dec 31  2014 ..

would be escaped like this:

+ls -al
 total 44
 drwxr-xr-x 11 root root 4096 Jan 23 23:44 .
 drwxr-xr-x  5 root root 4096 Dec 31  2014 ..

Notice that the output lines also have a space added in front of them. We don't have to use + and . We can use any single character and don't have to worry about escaping things. What do you think about that?

@chriswells0
Copy link
Contributor Author

The only issue I see with such an approach is that would mean a little more effort to prefix every line--even when there are no output lines. Although, we could allow an empty data-output attribute for scenarios where there are no output lines and the user doesn't want to prefix every line.

@chriswells0
Copy link
Contributor Author

chriswells0 commented Mar 25, 2018

Allowed specifying output prefix using data-filter-output attribute, so users can choose their own prefix for the output lines.
+ @Golmote for review.

Copy link
Contributor

@Golmote Golmote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good! Please take a look at my comments.

var outputFilter = pre.getAttribute('data-filter-output');
if (outputSections || outputSections === '') { // The user specified the output lines. -- cwells
outputSections = outputSections.split(',');
for (var i in outputSections) { // Parse the output sections into start/end ranges. -- cwells
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of iterating through Arrays with for/in loops. Could you change this to a regular for loop please?

outputSections = outputSections.split(',');
for (var i in outputSections) { // Parse the output sections into start/end ranges. -- cwells
var range = outputSections[i].split('-');
var outputStart = parseInt(range[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to specify parseInt's second parameter to 10, just to make 012-013 work the same as 12-13.

}
}
} else if (outputFilter) { // Treat lines beginning with this string as output. -- cwells
for (var i in codeLines) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, please use a regular for loop.


// Reinsert the output lines into the highlighted code. -- cwells
var codeLines = env.highlightedCode.split('\n');
for (var i in env.vars['command-line'].outputLines) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.


var pre = env.element.parentNode;
var clsReg = /\s*\bcommand-line\b\s*/;
if (clsReg.test(env.element.className)) { // Remove the class "command-line" from the <code>
env.element.className = env.element.className.replace(clsReg, '');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering: if the class of the <code> is something like language-shell command-line foobar, won't it be replaced with language-shellfoobar here? The replacement string should probably be a space ' '.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't believe I missed that. :( Thanks for catching it.

}

var pre = env.element.parentNode;
var clsReg = /\s*\bcommand-line\b\s*/;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could move that declaration outside of the hooks definitions, since you use it twice.

}
}
// Remove the prompt from the output lines. -- cwells
for (var i in env.vars['command-line'].outputLines) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, please use a regular for loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outputLines should only have an i for lines that are output, so I leveraged that in each loop through its properties. I switched the loops anyway and guarded with hasOwnProperty().

@chriswells0
Copy link
Contributor Author

I've made the updates and squashed it all into 1 commit.

@Golmote
Copy link
Contributor

Golmote commented Mar 26, 2018

Alright, looks good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants