Proposal draft: Enhance Amy workshop management tool #36
@pbanaszkiewicz looks good to me.
## Abstract
The number of Software Carpentry's workshops run weekly dynamically grows. All
Maybe “is growing rapidly” instead of “dynamically grows”? I know Greg publishes charts of students-taught in the blog. I don't know if he publishes charts of workshops-held, but linking to either of those from here would be nice.
I was unable to find numbers, so I asked @gvwilson about them.
On Tue, Mar 24, 2015 at 11:38:20AM -0700, Piotr Banaszkiewicz wrote:
> +The number of Software Carpentry's workshops run weekly
> dynamically grows. All
>
> I was unable to find numbers so I inquired @gvwilson about them.

So I'm not sure about workshops, but I think this is the last graph
showing the increase in student numbers [1].
Oh nice, thank you.
I've just got data from Greg, I'm producing an updated chart. Not sure where to publish it, though.
I'm not able to get rid of the jiggling entirely. This may be because my dataset has a non-unique index (the date). Very slight jiggling is visible even after sorting.
(To fix the horrible jiggling in this one, I removed entries with duplicated dates, leaving the latest entry with the biggest number of instructors.)
Anyway, I'll include the two nicest-looking plots I could get in this proposal and later change them to links if they get published. Otherwise I'll link to the plot you pointed out earlier (#36 (comment)).
On Tue, Mar 24, 2015 at 02:21:07PM -0700, Piotr Banaszkiewicz wrote:
> I'm not able to get rid of jiggling entirely - this may be because I
> have a dataset with non-unique index (aka date).

Right, you want to sort by increasing count (which will automatically
sort by increasing date). It looks like you currently have multiple
entries for one date and the higher-count entry is currently landing
before the lower count entry.

> To fix horrible jiggling in this one I removed entries with
> duplicated dates leaving the latest entry with biggest number of
> instructors

That works. I'm not sure how you're creating the plots, but most
plotting packages have something that lets you draw stepped lines.
You should use something like gnuplot's 'steps' [1]. With that
plotting style, it won't matter whether you de-dup the dates (as long
as you've sorted by increasing count).
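
For the Pandas setup mentioned in this thread, the stepped-line suggestion could look roughly like the sketch below. It is only an illustration: the sample data and the column names `date` and `instructors` are made up, and it calls matplotlib's `Axes.step` directly (rather than `df.plot()`) so that the row order chosen by the sort is the order that gets drawn.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical cumulative counts; 2015-01-10 appears twice, out of order.
CSV = """date,instructors
2015-01-05,10
2015-01-10,14
2015-01-10,12
2015-01-20,15
"""

df = pd.read_csv(io.StringIO(CSV), index_col=0, parse_dates=True)

# Sorting by the cumulative count also orders the dates, and puts the
# lower count first where a date is duplicated.
df = df.sort_values("instructors")

fig, ax = plt.subplots()
# where="post" holds each value until the next point, then steps vertically.
(line,) = ax.step(df.index, df["instructors"], where="post")
fig.autofmt_xdate()
fig.savefig("instructors_steps.png")
```

With this drawing style, the two rows for the duplicated date collapse into a single vertical segment, so the duplicates do not need to be removed first.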
I'm using Pandas with Seaborn for nicer colors.
Pandas is very straightforward when it comes to reading and plotting CSV date-indexed files:
```python
import pandas

df = pandas.read_csv("enrolment_workshops_data.csv", index_col=0, parse_dates=True)
df.plot()
```
However, if I understand correctly, Pandas has support for multi-index (multi-indices?) data frames. It would take me 5 minutes to remove duplicate data points from the CSV files, but way longer to figure out how to do that in Pandas.
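
One possible way to drop the duplicated dates inside Pandas, without a multi-index, is to group on the date index and keep the largest value per date. This is a minimal sketch with made-up data; the column names `date` and `instructors` are assumptions, and taking the maximum only matches "keep the entry with the biggest number" if the counts are cumulative.

```python
import io

import pandas as pd

# Hypothetical data standing in for enrolment_workshops_data.csv.
CSV = """date,instructors
2015-01-05,10
2015-01-10,14
2015-01-10,12
2015-01-20,15
"""

df = pd.read_csv(io.StringIO(CSV), index_col=0, parse_dates=True)

# Group rows that share a date (index level 0) and keep the maximum of
# each column; the result has one row per date, sorted by date.
deduped = df.groupby(level=0).max()
print(deduped)
```

If the last row for each date should win instead of the largest, `df[~df.index.duplicated(keep="last")]` is an alternative worth trying.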
On Tue, Mar 24, 2015 at 02:55:38PM -0700, Piotr Banaszkiewicz wrote:
> However, if I understand correctly, Pandas has support for
> multi-index (multi-indices?) data frames. It would take me 5 minutes
> to remove duplicate data points from the CSV files, but way longer
> to figure out how to do that in Pandas.

Or you could load the CSV with the Python stdlib, sort and dedup, and
then pass that to Pandas ;). But still, it may not be worth the
trouble.
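
A rough sketch of that stdlib route, assuming a two-column CSV with hypothetical headers `date` and `instructors` and ISO-formatted dates (the real file layout is not shown in this thread):

```python
import csv
import io

# Hypothetical sample data; 2015-01-10 appears twice.
CSV = """date,instructors
2015-01-05,10
2015-01-10,14
2015-01-10,12
2015-01-20,15
"""


def dedup_rows(lines):
    """Return (date, count) pairs sorted by date, keeping the largest count per date."""
    best = {}
    for row in csv.DictReader(lines):
        date = row["date"]
        count = int(row["instructors"])
        if date not in best or count > best[date]:
            best[date] = count
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return sorted(best.items())


rows = dedup_rows(io.StringIO(CSV))
print(rows)
```

The resulting list of pairs can then be handed to Pandas, for example with `pandas.DataFrame(rows, columns=["date", "instructors"])`.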
I added a handful of copy-edit comments, but the meat of this proposal
looks great to me.
Thanks @wking for your comments and suggestions :) Now I only need to find some source to back up my "SwC is growing rapidly" claim.
I added some plots to the proposal - you can view the rendered proposal with the plots here.
On Tue, Mar 24, 2015 at 02:26:16PM -0700, Piotr Banaszkiewicz wrote:
Probably use a right-side y axis for workshops if you're plotting both
Hey @wking,
Indeed, the workshops line is squashed. I was thinking about a log scale, but then the plot would not be intuitive.
Yes, and that was intended. I wanted to first show how few workshops we run compared to the number of people reached. Then I wanted to show how the workshops line really looks.
On Tue, Mar 24, 2015 at 02:59:42PM -0700, Piotr Banaszkiewicz wrote:
I'd spell that out in the surrounding text then.
@pbanaszkiewicz Thanks for your proposal. You need to submit your proposal to https://www.google-melange.com/gsoc/homepage/google/gsoc2015 as soon as possible. The deadline is March 27th 19:00 UTC.
Ship it!
I'm merging this issue since the student application period is over.
Proposal draft: Enhance Amy workshop management tool
Ref #6