Adds "Coding in Regex" chapter for re-review #9

Open: wants to merge 1 commit into base: master

briandominick
Owner

@briandominick briandominick commented Feb 4, 2018

This whole chapter is fleshed out and re-opened to review.

TL;DR: If you know what you're doing on GitHub, here's a rendered PDF of the chapter. Thanks for checking it out!

I know several people who volunteered to review this chapter expressed an interest in learning Git/GitHub at the same time, which is awesome. All we're going to learn at this point is GitHub, so we can review the contents of this chapter inline. There are enough people participating that hopefully we'll get some good chaos going and folks can see how a lively PR review goes.

What is a PR (review)?

A PR is a pull request, which is a very cool, poorly named concept. Really this is a merge request -- I am submitting a Git branch for consideration, hoping it will get merged into the master branch (or in some cases into a trunk other than master). So a PR or MR review is the stage in the Git workflow at which somebody responsible for reviewing code -- or docs-as-code ;-) -- will approve and/or carry out that final merge.

Which is the long way of saying, this is the step/feature of GitHub that lets us analyze our code diffs before we commit them to master.

So what are we looking at?

This is a chapter of my book Codewriting, which I am writing in this very open-source repo. The book is and always will be free, and you're actually welcome to contribute to it. You are doing that already by giving this chapter a review, for which I thank you.

The book is written in AsciiDoc markup, which is what you'll be looking at if you choose to insert any inline comments. I recommend you first read the rendered chapter (PDF).

So what do we do?

If you have any feedback, you can start a review or pile on to others' comments. Go to the Files changed tab and insert comments inline. Since I write in ventilated prose (one-sentence-per-line), you can comment on any sentence individually.
[Screenshot: adding an inline comment on GitHub]

I'm open to everything from correcting typos to open-ended critiques of my instruction/explanation style to recommending a better chapter title ("Coding Regular Expressions"?). Bonus points for anyone who finds inaccurate regex patterns in my examples!

Finally, if Write the Docs folks want to do more of these, or set up any kind of group learning around Git/AsciiDoc/docs-as-code, I'm very interested in learning how better to convey these concepts, so let me know. You can also comment on the whole PR below without submitting a full review. Thanks for checking out my work!

@mjang

mjang commented Feb 4, 2018

You've done some great work here!

I think your work assumes some level of knowledge of wildcards and variables that your audience may not have. If your intention is to turn this into a longer book, I might start with a chapter on wildcards, with examples that you could use at the command line.

Your intro is a little discouraging, with words like "notorious" and "difficult". If I were afraid of regex, I might run away after seeing that intro.

If I were new to regex, I'd want some examples that I could practice with, starting at a "Hello world" level (e.g., how to find all instances of "Docs" in a file). Then you could congratulate the reader, and tell them that they've started their journey.
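
A "Hello world" exercise along these lines might look like the sketch below. It is only an illustration: the sample text and the `findAll` helper are made up for this example, and a string stands in for the file contents.

```javascript
// Hypothetical sample text standing in for the contents of a file.
const sampleText = "Write the Docs is a community. Docs folks love docs.";

// Hypothetical helper: returns every substring matched by the pattern.
// String.prototype.matchAll requires the g (global) flag on the regex.
function findAll(text, pattern) {
  return Array.from(text.matchAll(pattern), (match) => match[0]);
}

// Case-sensitive search: finds "Docs" but not "docs".
const docsMatches = findAll(sampleText, /Docs/g);
console.log(docsMatches.length); // 2

// Adding the i flag makes the search case-insensitive.
const anyCaseMatches = findAll(sampleText, /docs/gi);
console.log(anyCaseMatches.length); // 3
```

A reader who runs this has already written two working regular expressions and seen what a flag does.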

@briandominick
Owner Author

Thanks for the review, @mjang! I think you're probably right all around. This is Chapter 8, and variables are definitely addressed earlier. I think the book might be for a slightly advanced audience. I think I assume if you're not already a technical writer, you're a developer looking to do more tech writing. But that doesn't mean all of the chapters have to assume that. Certainly not this one.

As for the part about calling it "notorious" up front, I did that intentionally so readers know I'm not just another person trying to instruct it without realizing people find it really hard. I'm curious what others think about that part. I may ask in WtD Slack #general, since I think that's a general enough issue. Thanks again!

@timothymcmackin timothymcmackin left a comment

This is a good start! I've made some suggestions that I think will ease beginners into the topic more smoothly.

@@ -1,3 +1,348 @@
= Coding in Regex

How about a more task-based title here, like "Using regular expressions to automate your work"?

Owner Author

My chapter titles are 2-3-word format. I wonder if I should reconsider that.

@@ -1,3 +1,348 @@
= Coding in Regex

Chapter content removed for re-review (in progress).
The patterning syntax known as _regular expressions_ or _regex_ is as notorious for its alleged steep learning curve as for its confirmed usefulness.

Before you tell me that regex is difficult, tell me what it's useful for, to get me psyched up to deal with cryptic syntax because it's worth it.

Yes, I'd start off with a simple but realistic example of a regex and what it can do. Maybe something like

newText = text.replace(/word/g, 'new word');

to replace instances of 'word' everywhere with 'new word'.

Then point out that regex is more than just substring matching; it also knows about things like word boundaries. So you could make that last expression a bit more robust (so you don't end up with things like "sword" transforming to "snew word").

newText = text.replace(/\bword\b/g, 'new word');

You don't want to go too far down this road in the beginning, of course, but at least it gives a hint of the power without just saying it has a notoriously steep learning curve.
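
To make the difference concrete, the two expressions above can be run side by side. This is a minimal sketch; the sample sentence is invented for the illustration.

```javascript
// Hypothetical sample sentence containing "word" inside another word.
const text = "A sword is not a word, but word is a word.";

// Plain substring match: also rewrites the "word" inside "sword".
const naive = text.replace(/word/g, "new word");
console.log(naive);
// A snew word is not a new word, but new word is a new word.

// \b anchors the match at word boundaries, so "sword" is left alone.
const bounded = text.replace(/\bword\b/g, "new word");
console.log(bounded);
// A sword is not a new word, but new word is a new word.
```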

https://$1codewriting.$2
----

== Why (and When) to Use Regex

Recommend that you move the "Why" of regex up higher, or else the first section is mostly "regex is hard." You might also separate the "Why regex is useful" info from the cost-benefit analysis of automation, because they seem like different topics to me.

Nobody seriously questions the _efficacy_ of this set of scripting rules, but the efficiency of writing regular expressions is sometimes unclear.
Knowing when they're essential takes some familiarity, so we'll consider a few use-case types.

Regular expressions can be used not only to scan for predefined patterns in arbitrary text; expressions can extract contextualized data from within those matched patterns, passing it to an application for further manipulation and storage.

"Not only can regular expressions search for patterns in text, but they can extract data from the matched patterns and pass that data to an application for further manipulation and storage."

At its simplest application, think of regex as a much smarter version of the `*` or `%` wildcard symbols you've probably used in other contexts.
Where a conventional wildcard finds _any and all text_, regex can find shockingly detailed patterns without false positives.

The string `codewriting.org` is matched by the following regex patterns:

Recommend that you either summarize these three regexes or say that you'll explain them later, otherwise beginners aren't going to see why that gibberish of symbols is useful here. Regexes are really hard to read and parse if you didn't write them, and in this case you're writing for complete beginners. I'd run away screaming from that first one as a beginner.


* `codewriting.org`

Probably in most cases of a find-and-replace task, I don't get to use regex.

You're undermining your point by saying that you don't get to use regex often. Maybe say instead that while search fields in some applications can accept regexes, regex is even more useful in programs and scripts.

http\:\/\/([a-z]+\.)?codewriting\.org
----

This regex would efficiently match the following strings:

Maybe add "... but not the following strings:" and some examples?

* `http://git.codewriting.org`
* `http://staging.codewriting.org`
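
In the spirit of the "but not the following strings" suggestion, a quick check of the pattern against a handful of candidates could look like this. Only the first two URLs appear in the list above; the remaining candidates are hypothetical strings chosen for illustration.

```javascript
// The pattern from the chapter excerpt, written as a JavaScript regex literal.
const subdomainPattern = /http\:\/\/([a-z]+\.)?codewriting\.org/;

const candidates = [
  "http://git.codewriting.org",     // from the list above
  "http://staging.codewriting.org", // from the list above
  "http://codewriting.org",         // hypothetical: no subdomain (the group is optional)
  "https://codewriting.org",        // hypothetical: "https" is not "http" followed by "://"
  "http://GIT.codewriting.org",     // hypothetical: [a-z] is case-sensitive
  "http://codewriting-org",         // hypothetical: \. only matches a literal dot
];

// Pair each candidate with whether the pattern matches it.
const results = candidates.map((url) => [url, subdomainPattern.test(url)]);

for (const [url, matched] of results) {
  console.log(`${matched ? "match   " : "no match"}  ${url}`);
}
```

The first three candidates match and the last three do not, which shows both what the optional group allows and what the escaped dot and lowercase character class rule out.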

The single set of parentheses in the regex will capture any “group” it matches and store it as a variable we can later reference to reinsert the captured string.

Need a better summary of the regex code here. Explain more clearly what's within the parens.
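
One possible way to show what sits inside the parentheses is sketched below. The helper names `capturedSubdomain` and `upgradeToHttps`, and the idea of rewriting `http` to `https`, are assumptions made for this illustration rather than the chapter's own example.

```javascript
// Same pattern as in the chapter excerpt. The parentheses wrap "[a-z]+\.",
// i.e. one or more lowercase letters followed by a literal dot; the trailing
// "?" makes that whole group optional.
const sitePattern = /http\:\/\/([a-z]+\.)?codewriting\.org/;

// Hypothetical helper: returns what group 1 captured, or undefined if the
// URL does not match or has no subdomain.
function capturedSubdomain(url) {
  const match = url.match(sitePattern);
  return match ? match[1] : undefined;
}

// Hypothetical helper: reinserts the captured group with $1 while changing
// the scheme. If the group did not participate, $1 becomes an empty string.
function upgradeToHttps(text) {
  const globalPattern = new RegExp(sitePattern.source, "g");
  return text.replace(globalPattern, "https://$1codewriting.org");
}

console.log(capturedSubdomain("http://git.codewriting.org")); // git.
console.log(capturedSubdomain("http://codewriting.org"));     // undefined
console.log(upgradeToHttps("See http://git.codewriting.org and http://codewriting.org"));
// See https://git.codewriting.org and https://codewriting.org
```

Note that the captured string includes the trailing dot, which is why the replacement can place `$1` directly in front of `codewriting`.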

As a bonus, we got to solve a cool little puzzle!
Seriously, what more could a technical documentarian ask for?

Proficient use of regex does not require memorization of all regex patterns -- they can always be looked up.

This is a really good point. You might want to point to your favorite regex resources here. No need to reproduce.

Also, some regex testers, like regexr.com and regex101.com, can be really helpful.

@briandominick
Owner Author

Thanks for those great reviews @evbacher and @timothymcmackin! Much appreciated feedback. The next commit will reflect most if not all of these suggestions.
