Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ls: Add JSON output support for restic ls cmd #1953

Merged
merged 4 commits into from
Aug 18, 2018
Merged

Conversation

kitone
Copy link
Contributor

@kitone kitone commented Aug 14, 2018

What is the purpose of this change? What does it change?

This PR adds support for JSON output for the restic ls command, using the existing global --json flag.

output truncated of : ./restic ls 1977f34d /doc/_static --json

[{
  "time": "2018-08-10T13:41:51.624627031+02:00",
  "tree": "0d7878997a8317b052105f933c2ff0920f2942ce61c4842de1b1ab067d305b35",
  "paths": ["/Users/kitone/go/src/github.com/restic/restic/doc"],
  "hostname": "hostname.local",
  "username": "kitone",
  "uid": 502,
  "gid": 20,
  "id": "1977f34d970e318bd3249f59ec8e87d5a8b17551cfb49664c0ba5f0d56e9a97c",
  "short_id": "1977f34d",
  "nodes": [{
    "name": "_static",
    "type": "dir",
    "path": "/doc/_static",
    "uid": 502,
    "gid": 20,
    "mode": 2147484141,
    "mtime": "2018-08-10T13:07:54+02:00",
    "atime": "2018-08-10T13:07:54+02:00",
    "ctime": "2018-08-10T13:07:54+02:00",
    "struct_type": "node"
  }, {
    "name": "css",
    "type": "dir",
    "path": "/doc/_static/css",
    "uid": 502,
    "gid": 20,
    "mode": 2147484141,
    "mtime": "2018-08-10T13:07:54+02:00",
    "atime": "2018-08-10T13:07:54+02:00",
    "ctime": "2018-08-10T13:07:54+02:00",
    "struct_type": "node"
  }, {
    "name": "favicon.ico",
    "type": "file",
    "path": "/doc/_static/favicon.ico",
    "uid": 502,
    "gid": 20,
    "size": 1406,
    "mode": 420,
    "mtime": "2018-08-10T13:07:54+02:00",
    "atime": "2018-08-10T13:07:54+02:00",
    "ctime": "2018-08-10T13:07:54+02:00",
    "struct_type": "node"
  }],
  "struct_type": "snapshot"
}]

Was the change discussed in an issue or in the forum before?

Previous PR : #1939

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

@kitone
Copy link
Contributor Author

kitone commented Aug 14, 2018

/cc @fd0 and @mholt

@mholt , I switch to a "struct_type" field.
I will not be available for several days, feel free to make change.

Best regards,

@codecov-io
Copy link

codecov-io commented Aug 14, 2018

Codecov Report

Merging #1953 into master will decrease coverage by 4.33%.
The diff coverage is 36.58%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1953      +/-   ##
==========================================
- Coverage    50.5%   46.16%   -4.34%     
==========================================
  Files         170      170              
  Lines       13266    13304      +38     
==========================================
- Hits         6700     6142     -558     
- Misses       5559     6203     +644     
+ Partials     1007      959      -48
Impacted Files Coverage Δ
cmd/restic/cmd_ls.go 41.83% <36.58%> (-6.5%) ⬇️
internal/backend/b2/b2.go 0% <0%> (-78.54%) ⬇️
internal/backend/swift/swift.go 0% <0%> (-78.45%) ⬇️
internal/backend/gs/gs.go 0% <0%> (-73.61%) ⬇️
internal/backend/azure/azure.go 0% <0%> (-69.46%) ⬇️
internal/backend/swift/config.go 36.95% <0%> (-54.35%) ⬇️
internal/backend/test/tests.go 59.95% <0%> (-0.66%) ⬇️
internal/backend/s3/s3.go 62.4% <0%> (+2.25%) ⬆️
internal/archiver/blob_saver.go 100% <0%> (+2.46%) ⬆️
internal/checker/checker.go 68.4% <0%> (+3.86%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 401a564...46f71f4. Read the comment docs.

Copy link
Contributor

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good! Thanks for working on it during your travel. It's an elegant solution, especially from a developer who's new to Go.

One thing of note is that currently the ls output is streamed to stdout, printing matches as it finds them, without buffering the whole list in memory. This approach, for the JSON output, does buffer each item from all snapshots in memory before encoding them. Most of the time this won't be a problem, but I can imagine memory bloat when a listing has tens of thousands (or more) items.

I'm inclined to accept this change, though, since I think it's a really good one, and leave a change to streaming JSON output "as-needed". Heck, I might even be willing to build that on top of this later.

@fd0
Copy link
Member

fd0 commented Aug 14, 2018

Hey, thank you very much for your contribution! I have not looked at the code yet, but I think it'd be good to not print anything about the snapshot (programs will probably have that information already, or they can call restic snapshots --json to get the info), then we can have restic ls --json stream a list of nodes, one at a time. Programs can either read the whole output at once, or consume one node at a time.

@frankdugan3
Copy link

Really looking forward to this! I built a backup GUI (served via GraphQL API), and it currently relies on some poorly-performing and ugly JS regexes to parse the results of restic ls --long --quiet --no-lock [snapshot ID] to browse the files of a given snapshot.

@mholt
Copy link
Contributor

mholt commented Aug 18, 2018

@frankdugan3 Cool, do you have your GUI anywhere? I'm interested to see it!

@frankdugan3
Copy link

@mholt As of right now it's pretty integrated into a much larger ERP app I'm working on for my employer. But I've been thinking about the best way to release a more generalized version for distribution. Currently torn between making a plugin/GraphQL middleware (in JS) just to help with getting API access, or making a more general GUI webapp binary (in Go).

@mholt
Copy link
Contributor

mholt commented Aug 18, 2018

@frankdugan3 Gotcha -- that sounds like a fun project. (I'm working on one too) I've really enjoyed making restic visually accessible. I think this PR will be a great benefit for that.

@frankdugan3
Copy link

@mholt Wow... That looks nice! Well, in that case I’ll sub to updates and maybe just publish a GraphQL lib!

@fd0
Copy link
Member

fd0 commented Aug 18, 2018

Hm, now I see why we need the info about the snapshots: the ls command still supports printing more than one snapshot at a time. Should we maybe remove this function to sharpen what it should be used for? Users can search for files using the find function...

Copy link
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rebased the commits on the latest master branch and improved the changelog entry a bit, but apart from that it's great! Thanks for your contribution!

@fd0 fd0 merged commit 46f71f4 into restic:master Aug 18, 2018
fd0 added a commit that referenced this pull request Aug 18, 2018
ls: Add JSON output support for restic ls cmd
@mholt
Copy link
Contributor

mholt commented Aug 18, 2018

@fd0

Hm, now I see why we need the info about the snapshots: the ls command still supports printing more than one snapshot at a time. Should we maybe remove this function to sharpen what it should be used for? Users can search for files using the find function...

I have a couple ideas for this, since I'm still interested in streaming the results out:

  1. We could print each node as its own JSON object, like you suggested, but augmented with a little bit of info about the snapshot it's in, even just the snapshot ID and/or "short ID". This is just an extra field or two on the JSON output for the node and would be pretty easy to work with

  2. We could print each snapshot's JSON serialization once as we traverse into that snapshot -- assuming we list entries from each snapshot in groups, rather than mixing them. This allows the consuming program to keep simple state. Each time it encounters a snapshot object, it just updates which snapshot the following listings are for, until the next snapshot object appears. This way we can print the whole snapshot with all its information since it only prints once.

What do you think of either of those options for streaming?

@frankdugan3

Wow... That looks nice! Well, in that case I’ll sub to updates and maybe just publish a GraphQL lib!

Thanks! No doubt there will be users who find a GraphQL lib useful. I saw your reply on Twitter too, we'd be happy to have you help us test it out -- we'll be ready in a couple of weeks!

@fd0
Copy link
Member

fd0 commented Aug 18, 2018

I thought that you'd be interested in it ;)

I think it'd be fine to have a stream of json objects, one per line, starting with a snapshot object, then all nodes, then maybe another snapshot object. If you want to work on that, please go ahead and let's discuss the details in the next PR or issue.

@mholt
Copy link
Contributor

mholt commented Aug 19, 2018

@fd0 Sounds good to me! Take a look at #1962 at your convenience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants