Customizable template for Slurp notes #3

Closed
chrisgrieser opened this issue Apr 9, 2024 · 33 comments

@chrisgrieser

Thanks for this plugin. Looking forward to replacing Advanced URI + MarkDownload with it.

One of the main features missing would be the ability to customize the properties (frontmatter) of the slurped files. Adding the date and the site name is good (already an improvement over MarkDownload!), but things like author, other names, and manually added properties would be really useful.

@inhumantsar
Owner

Good news, author is already supported! It will appear when it can be found during parsing. It currently leaves out properties it doesn't get a value for though.

Would you want to be able to set those properties when you're loading a URL or after the page is saved?

I haven't tried MarkDownload, sounds like I should check it out and look for friction.

@chrisgrieser
Author

It will appear when it can be found during parsing. It currently leaves out properties it doesn't get a value for though.

Ah, I see. I would pretty much prefer it to always add the property, so that I can manually add them when they cannot be parsed.

Would you want to be able to set those properties when you're loading a URL or after the page is saved?

I think just saving everything in a file would be the most straightforward solution, since adding values to properties works fine in a regular Obsidian note already

@inhumantsar
Owner

I've added this as a settings toggle in v0.1.3. it won't affect existing notes though!
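A minimal sketch of what such a toggle could look like, assuming a hypothetical `showEmpty` flag and a flat map of parsed values. None of these names come from the plugin, and values are written verbatim without YAML escaping:

```typescript
// Hypothetical shape; the real plugin's types and setting name may differ.
type ParsedMetadata = { [key: string]: string | undefined };

// Returns frontmatter lines for the given keys. When showEmpty is true,
// keys without a parsed value are kept as blank entries so they can be
// filled in by hand later; otherwise they are skipped.
function buildFrontmatterLines(
  keys: string[],
  parsed: ParsedMetadata,
  showEmpty: boolean
): string[] {
  const lines: string[] = [];
  for (const key of keys) {
    const value = parsed[key];
    if (value !== undefined && value !== "") {
      lines.push(`${key}: ${value}`);
    } else if (showEmpty) {
      lines.push(`${key}:`);
    }
  }
  return lines;
}

const exampleMetadata: ParsedMetadata = { site: "The Verge", author: undefined };
console.log(buildFrontmatterLines(["site", "author"], exampleMetadata, true).join("\n"));
```

With the flag off, the `author` line would simply be left out, which matches the behaviour described before the toggle was added.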

@chrisgrieser
Author

Thanks for the quick implementation!

However, I do not think this issue should be closed (yet), since it has only been partially addressed: it is not possible to assign custom keys or to define which properties are included and in what order.

@inhumantsar
Owner

inhumantsar commented Apr 9, 2024

ah i see! yeah that goes a bit deeper than what i understood from your initial comment. i'll be honest, i'm a bit wary of going deep on customization in these early versions.

automatically populating properties is high on my priority list, but adding new properties after people have been using slurp for a while could introduce a fair bit of friction. key conflicts in particular worry me. maybe it makes sense to establish a relatively comprehensive set of "reserved" keys and data types first, then offer customization options later.

right now i've been looking at adding fields to handle multiple authors, reference IDs (eg: arXiv:1234.56789), named entities (people, places, things), tags, and internal links. are there other fields that come to mind? is there an application using a frontmatter convention you would like to see in use here?

@inhumantsar inhumantsar reopened this Apr 9, 2024
@chrisgrieser
Author

tbh, for me personally, I'd need full customization, since I tend to change up things sometimes.

here is for example what MarkDownload does, with {baseURI} etc. being populated with the respective values.
Pasted image 2024-04-09 at 20 55 43@2x

@inhumantsar
Owner

yeah that makes sense. i was looking at replacing the hardcoded format with a template at some point anyway.

i see that markdownload extracts all <meta> tags from websites, is that something you've found useful in the past?

@chrisgrieser
Author

i see that markdownload extracts all <meta> tags from websites, is that something you've found useful in the past?

Kinda. It does include some information, but as I mentioned, MarkDownload does not include the publication date or the site name, for instance (though it does include the host, which is mostly similar). It also lacks a few quality-of-life features such as removing the "by" in the author-byline, e.g. by Jane Doe.
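The byline clean-up mentioned here could be sketched roughly as follows. This is an illustrative helper, not code from either tool, and it assumes English-language bylines only:

```typescript
// Removes a leading "by" (any casing, optional colon) from an author byline.
// The "by" must be followed by a colon or whitespace, so names that merely
// start with those letters (e.g. "Byron") are left alone.
function stripBylinePrefix(byline: string): string {
  return byline.trim().replace(/^by[:\s]+/i, "").trim();
}

console.log(stripBylinePrefix("by Jane Doe")); // prints "Jane Doe"
```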

@inhumantsar
Owner

seems odd because according to their GitHub, they use the same library as slurp under the hood. I'll look a little more deeply into their codebase and see if they're doing things that I should avoid.

do you have a couple links to pages where that's happened? would be good for testing

@chrisgrieser
Author

the publication year is missing everywhere, it's simply not available in MarkDownload as a token. Author is missing in a lot of places, a simple example could be this article

@inhumantsar
Owner

thanks for the link! that's a great one to know about. what's interesting is that slurp didn't get the author out of that either, even though there is a meta tag for author.

i'm going to open a few issues for supported properties that don't get picked up when they should. if you could throw more links like that verge one into those it would be a huge help.

@inhumantsar inhumantsar changed the title FR: customize properties Customizable template for Slurp notes Apr 10, 2024
@inhumantsar
Owner

i may have got a bit sidetracked adding new metadata fields while fixing the old ones...

image

@inhumantsar
Owner

decided against the raw template since i can't rule out breaking changes to the available properties in the near future. plus, this way it'll be easier to find out about new properties.

Recording.2024-04-11.125004.mp4

@chrisgrieser
Author

looks awesome, makes for a much better UI as well!

A small suggestion: could you add the possibility to add custom properties as well? One use case is to add a property field read: false to all scraped articles, and to toggle the checkbox to true when I get time to actually read them.

@inhumantsar
Owner

inhumantsar commented Apr 12, 2024

yep, that's on the agenda! wanted to nail the existing properties down first.

i've pushed up the initial version of this for testing. i had to abandon the fancy drag and drop functionality as it wouldn't play nice with the rest of the components, but the functionality is all there.

image

it would be awesome if you could help test it out by setting up BRAT.

Edit: Forgot to mention that the ordering won't work yet. it will save the ordering you configure but won't actually write new notes with that ordering.

@chrisgrieser
Author

chrisgrieser commented Apr 13, 2024

it would be awesome if you could help test it out by setting up BRAT.

Would love to, but you haven't created a new beta release for BRAT, so it still installs 0.1.4

@inhumantsar
Owner

oops, sorry about that. should be fixed now

@chrisgrieser
Author

0.1.5b1 throws an error:

Received URL action {url: 'https://www.theverge.com/2024/4/10/24125572/fcc-broadband-nutrition-labels-isp-deadline-today', action: 'slurp'}
plugin:slurp:3915 Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'get')
    at maybe (plugin:slurp:3915:40)
    at SlurpPlugin.createContent (plugin:slurp:3935:18)
    at SlurpPlugin.slurpNewNoteCallback (plugin:slurp:3952:26)
    at async SlurpPlugin.slurp (plugin:slurp:3895:5)
maybe @ plugin:slurp:3915
createContent @ plugin:slurp:3935
slurpNewNoteCallback @ plugin:slurp:3952
await in slurpNewNoteCallback (async)
eval @ plugin:slurp:3786
t @ app.js:1
(anonymous) @ VM283:1
(anonymous) @ node:electron/js2c/renderer_init:2
(anonymous) @ node:electron/js2c/renderer_init:2
emit @ node:events:517
onMessage @ node:electron/js2c/renderer_init:2
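The trace shows `.get` being called on an undefined value inside `maybe`. The plugin source at that line is not shown in this thread, so the following is only a general sketch of guarding such a lookup, with a hypothetical helper name and a `Map` assumed as the container:

```typescript
// Hypothetical helper: returns the value stored under `key`, or `fallback`
// when the map itself is missing or has no entry for that key.
function getOr<V>(
  source: Map<string, V> | undefined,
  key: string,
  fallback: V
): V {
  const value = source?.get(key);
  return value === undefined ? fallback : value;
}

// No map at all: the fallback is returned instead of throwing a TypeError.
console.log(getOr<string>(undefined, "author", ""));
```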

@inhumantsar
Owner

ok so i had to shave a lot of yaks along the way but it should be working now.

i made a lot of changes to the settings file format, so you might run into an issue there. a quick sanity check would be to check for slurped in the properties settings. if it's there and things seem to work, then it's probably fine. if slurped isn't there though:

  1. close obsidian
  2. send me a copy of <obsidian dir>/plugins/slurp/data.json
  3. delete that file and re-open obsidian

@chrisgrieser
Author

chrisgrieser commented Apr 15, 2024

Thanks, b2 seems to work now.

With the default settings, the article gets downloaded correctly. However, with some custom settings, the metadata creation seems to fail:

---
undefined
---

some other issues I noticed:

  • the properties do not seem to accept keys with a hyphen in them (e.g. data-type does not work)
  • the dates always print full dates, there is no possibility of formatting them, e.g. to only add the initial publication year
  • There is no title property
  • there is no option to add a custom property, like for read: false as I mentioned before (though I assume you simply haven't implemented that yet?)
  • the small animation when moving properties up/down is cute, but I feel like they might annoy some users? (I personally have no issue with them though :) )

@inhumantsar
Owner

inhumantsar commented Apr 15, 2024

the properties do not seem to accept keys with a hyphen in them (e.g. data-type does not work)

hmm yes it seems my validation function is a bit overzealous. i'll rework it. the blocked characters are meant to trigger a validation error only if they're at the start or end of the string.
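A rough sketch of the intended rule, where certain characters are rejected only at either end of a key. The list of blocked characters below is a made-up placeholder, since the plugin's actual list is not given in this thread:

```typescript
// Hypothetical set of characters that may appear inside a key but not at
// either end. This is an assumption for illustration only.
const BLOCKED_EDGE_CHARS = "-_:.";

// Returns an error message, or null when the key is acceptable.
function validatePropertyKey(key: string): string | null {
  if (key.length === 0) {
    return "key must not be empty";
  }
  const first = key.charAt(0);
  const last = key.charAt(key.length - 1);
  if (BLOCKED_EDGE_CHARS.indexOf(first) !== -1) {
    return `key must not start with "${first}"`;
  }
  if (BLOCKED_EDGE_CHARS.indexOf(last) !== -1) {
    return `key must not end with "${last}"`;
  }
  return null;
}

console.log(validatePropertyKey("data-type")); // prints null (hyphen is in the middle)
```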

the dates always print full dates, there is no possibility of formatting them

i've implemented the necessary functions for formatting but it hasn't been added as an option yet. i'm still working out how best to expose that through the settings UI.

so while i haven't tested the formatter thoroughly yet, you can try it out by modifying the data.json file directly for now:

    "publishedTime": {
      "id": "publishedTime",
      "key": "year",
      "idx": 2,
      "format": "d|YYYY-MM-DDTHH:mm",
      "enabled": true
    },
    "modifiedTime": {
      "id": "modifiedTime",
      "key": "updated",
      "idx": 3,
      "format": "d|YYYY-MM-DDTHH:mm",
      "enabled": false
    },

the d| tells slurp to pass it through the date formatter, and the format after it is the usual date format syntax used in moment. eg: if you wanted the date to show up as "Wednesday, April 10th 2024 at 1:55pm", you could change d|YYYY-MM-DDTHH:mm to d|dddd, MMMM Do YYYY [at] h:mma (i just made up that format string off the top of my head, it may not work as-is).
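To illustrate the `d|` convention without depending on moment, here is a small stand-in sketch. The function names are hypothetical, only the tokens `YYYY`, `MM`, `DD`, `HH` and `mm` are handled, and UTC is used to keep the output deterministic (moment formats in local time by default and supports far more tokens):

```typescript
// Splits "d|YYYY-MM-DD" into its prefix ("d") and template ("YYYY-MM-DD").
// Returns null when the string has no "<prefix>|" part.
function splitFormat(format: string): { kind: string; template: string } | null {
  const bar = format.indexOf("|");
  if (bar <= 0) return null;
  return { kind: format.slice(0, bar), template: format.slice(bar + 1) };
}

function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Stand-in for a real date formatter: supports only YYYY, MM, DD, HH and mm.
function formatDateSubset(date: Date, template: string): string {
  return template.replace(/YYYY|MM|DD|HH|mm/g, (token) => {
    switch (token) {
      case "YYYY": return String(date.getUTCFullYear());
      case "MM": return pad2(date.getUTCMonth() + 1);
      case "DD": return pad2(date.getUTCDate());
      case "HH": return pad2(date.getUTCHours());
      default: return pad2(date.getUTCMinutes()); // "mm"
    }
  });
}

// Applies a "d|..." format to an ISO date string; anything that is not a
// date format or not a parseable date is returned unchanged.
function applyDateFormat(isoValue: string, format: string): string {
  const parts = splitFormat(format);
  if (parts === null || parts.kind !== "d") return isoValue;
  const date = new Date(isoValue);
  if (isNaN(date.getTime())) return isoValue;
  return formatDateSubset(date, parts.template);
}

console.log(applyDateFormat("2024-04-10T13:55:00Z", "d|YYYY")); // prints "2024"
```

Setting the format to `d|YYYY` is what would reduce a full publication date to just the year, as requested earlier in the thread.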

worth noting too that all string properties can be formatted using the same syntax as well. it's already being used for tags. the format string just needs to start with s| and slurp will replace all instances of {s} in the format string with the property value. this is being used at the moment to convert Twitter usernames into links: s|https://twitter.com/{s}

S| can also be used to replace multiple different placeholders. eg: for tags, S|{prefix}/{tag} is fed into the formatter along with an object {prefix: <tag prefix setting>, tag: <a keyword pulled from site metadata>}. this likely won't be exposed as an option though since it would be very difficult to handle those placeholders automatically and predictably, but it will get used for supported properties.
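The two string conventions described above can be sketched like this. Both functions are hypothetical and take the template with the `s|` or `S|` prefix already removed; leaving unknown placeholders untouched is an assumption of this sketch, not documented plugin behaviour:

```typescript
// "s|" formats: every "{s}" in the template is replaced by the value.
function formatString(template: string, value: string): string {
  return template.split("{s}").join(value);
}

// "S|" formats: each "{name}" is replaced by the matching entry in `values`.
// Placeholders with no matching entry are left as they are.
function formatStrings(
  template: string,
  values: { [placeholder: string]: string }
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) => {
    const replacement = values[key];
    return replacement !== undefined ? replacement : match;
  });
}

console.log(formatString("https://twitter.com/{s}", "example"));
console.log(formatStrings("{prefix}/{tag}", { prefix: "slurp", tag: "broadband" }));
```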

There is no title property

the page title is the parsed title anyway, so i didn't see much of a point to duplicating it in the note properties. is there a use-case you have in mind for that?

there is no option to add a custom property ... though I assume you simply haven't implemented that yet

that is indeed the case :) most of the time i put in this weekend was to ensure that the core properties and custom properties would play nicely together.

there's still more work to do on that front. in particular i need to ensure that settings will be gracefully migrated between plugin updates. right now slurp will just try to slam whatever is there together with the core options without checking for incompatibilities first.
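One possible shape for that compatibility check, sketched under the assumption that each saved entry should look like the `data.json` excerpt above. The type and function names are hypothetical:

```typescript
// Hypothetical shape, modelled on the data.json excerpt in this thread.
interface PropSetting {
  id: string;
  key: string;
  idx: number;
  format?: string;
  enabled: boolean;
}
type PropSettings = { [id: string]: PropSetting };

// Checks that a value loaded from JSON has the expected fields and types.
function isPropSetting(value: unknown): value is PropSetting {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { [field: string]: unknown };
  return (
    typeof v["id"] === "string" &&
    typeof v["key"] === "string" &&
    typeof v["idx"] === "number" &&
    typeof v["enabled"] === "boolean" &&
    (v["format"] === undefined || typeof v["format"] === "string")
  );
}

// Starts from the core defaults, then lets each compatible saved entry
// override or extend them. Incompatible entries are ignored, so the
// default (if there is one) stays in place.
function migrateSettings(
  defaults: PropSettings,
  saved: { [id: string]: unknown }
): PropSettings {
  const merged: PropSettings = {};
  for (const id of Object.keys(defaults)) {
    merged[id] = defaults[id];
  }
  for (const id of Object.keys(saved)) {
    const entry = saved[id];
    if (isPropSetting(entry)) {
      merged[id] = entry;
    }
  }
  return merged;
}
```

Silently dropping incompatible entries is only one option; a real migration might instead log them or try to repair them.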

this will likely get done before i do the custom date format options.

the small animation when moving properties up/down is cute, but I feel like they might annoy some users? (I personally have no issue with them though :) )

anyone who doesn't like it can deal 😉 it's a pretty minor thing and shouldn't interfere with anything else. i found that, without an animation of some kind, i sometimes didn't notice that the items changed order so i'd click the button again only to accidentally flip the ordering back. the animation does seem to be pretty inefficient compute-wise though, so i might simplify it in the future.

@inhumantsar
Owner

inhumantsar commented Apr 15, 2024

also, regarding the undefined you're seeing. would you be willing to reproduce and post your console output? Ctrl+Shift+I will open the console on Windows. as you might have noticed, i started adding a "debug" option in the settings but it's not wired up to anything yet, so unfortunately it has to be a manual copy paste job for now.

i'll try to reproduce on my end as well later tonight, but it would be good to have your logs too for comparison.

edit: i couldn't help myself and ended up reproducing it just now. looks like when there are no tags, slurp doesn't get rid of the set object it uses to store them and js-yaml doesn't like that. should be an easy fix. i'll push that tonight and let you know when it's ready.
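A sketch of the kind of normalization that avoids handing a `Set` to a YAML serializer: sets become arrays and empty collections are dropped. The types and function name are assumptions, and no js-yaml call is made here:

```typescript
type FrontmatterValue = string | number | boolean | string[] | Set<string>;
type PlainValue = string | number | boolean | string[];

// Converts Sets to arrays and drops empty collections so that only plain
// values are left for the serializer.
function normalizeForYaml(
  props: { [key: string]: FrontmatterValue }
): { [key: string]: PlainValue } {
  const out: { [key: string]: PlainValue } = {};
  for (const key of Object.keys(props)) {
    const raw = props[key];
    const value: PlainValue = raw instanceof Set ? Array.from(raw) : raw;
    if (Array.isArray(value) && value.length === 0) continue;
    out[key] = value;
  }
  return out;
}

// The empty tag set disappears; the site string is passed through.
console.log(normalizeForYaml({ tags: new Set<string>(), site: "The Verge" }));
```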

thanks again for your help testing this by the way! i promise i'll add some automated tests soon so the easy stuff won't have to be caught by users like you 😄

@inhumantsar
Owner

hah ok so don't worry about reproducing it!

i fixed the issue. it was actually two issues: one was the empty set breaking the YAML parser, and the other was that disabled properties were forcing an early exit from the metadata parsing function entirely. in testing the fix i also noticed a third issue: disabled properties were re-enabling themselves.
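The early-exit bug is consistent with a `return` inside a loop where a `continue` was intended, although the actual plugin code is not shown in this thread. A minimal sketch of the corrected pattern, with hypothetical names:

```typescript
interface PropConfig {
  key: string;
  enabled: boolean;
}

// Collects values for enabled properties. A disabled property is skipped
// with `continue`; a `return` at that point would abandon every property
// that comes after it.
function collectEnabled(
  configs: PropConfig[],
  parsed: { [key: string]: string }
): { [key: string]: string } {
  const out: { [key: string]: string } = {};
  for (const config of configs) {
    if (!config.enabled) continue;
    const value = parsed[config.key];
    if (value !== undefined) out[config.key] = value;
  }
  return out;
}
```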

i've pushed those fixes up. a new v0.1.5b3 release will be available shortly.

@chrisgrieser
Author

Can confirm, b3 fixes the undefined issue. With the other information / fixes, that leaves only these issues / todos:

  • the properties do not seem to accept keys with a hyphen in them (e.g. data-type does not work)
  • there is no UI for customizing the date format (can only manually be changed in the data.json)
  • there is no title property
  • there is no option to add a custom property, like for instance read: false

the page title is the parsed title anyway, so i didn't see much of a point to duplicating it in the note properties. is there a use-case you have in mind for that?

There are multiple reasons for a title property:

  1. Filenames have various restrictions when it comes to special characters (:, /), and also an os-dependent maximum length. Thus, long titles and titles with special characters cannot be correctly reflected in the file name.
  2. You might want to change filenames for various reasons, while preserving the title information
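The first point can be illustrated with a small sketch: the file name has to be sanitized and shortened, so the original title only survives if it is also stored as a property. The character list covers characters that are unsafe on at least one major OS, and the length cap is an arbitrary illustrative value, not an OS limit:

```typescript
const UNSAFE_FILENAME_CHARS = /[\\/:*?"<>|]/g;
const MAX_FILENAME_LENGTH = 100; // illustrative cap only

// Replaces unsafe characters with spaces, collapses whitespace and truncates.
function toFileName(title: string): string {
  const cleaned = title
    .replace(UNSAFE_FILENAME_CHARS, " ")
    .replace(/\s+/g, " ")
    .trim();
  return cleaned.slice(0, MAX_FILENAME_LENGTH).trim();
}

const originalTitle = "AI: hype/reality?";
console.log(toFileName(originalTitle)); // prints "AI hype reality"
console.log(`title: "${originalTitle}"`); // the property keeps the exact title
```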

@inhumantsar
Owner

inhumantsar commented Apr 17, 2024

alright so i went through and added all of those.

in classic form, i broke some of it while working on state management at the same time. 🙃

enabling/disabling and deletion seem to be affected. adding new fields, adjusting formats, and changing keys should all still be working though.

overhauled the UI to be more obsidian-like as well. new beta should find its way to your machine shortly.

@chrisgrieser
Author

BRAT complains, since there is only a tagged commit for b4, but no pre-release yet

@inhumantsar
Owner

just added it. obsidian's build workflow was complaining and it was too late at night to troubleshoot it

@chrisgrieser
Author

Thanks! the UI is a nice idea. Some issues I've noticed:

  • "Enable Property" is not remembered for any key I tried.
  • the custom properties only accept a string value, I believe? My example where I wanted read: false as custom property results in read: 'false' when slurping. Similarly, changing the initial publication date to year key via d|YYYY, gets me year: "2022" instead of year: 2022
  • minor: when inspecting the frontmatter, in source mode, after the custom property, there is an unneeded blank line added
    Pasted image 2024-04-17 at 18 47 38@2x

@inhumantsar
Owner

I have a fix for the issue with Enabled already, should be able to push it up along with a couple other changes later today.

The data type thing is annoying. I switched to using a yaml parser as part of this and I'm not really a fan of how it wraps everything in quotes like that. YAML doesn't require that and it makes handling things like boolean values particularly tedious.

It does offer a lot of options at least, so I'm hoping I'll be able to configure away that behaviour.
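Independent of serializer options, one way to get unquoted booleans and numbers is to convert the strings to typed values before serializing. A sketch under that assumption, with a hypothetical function name; which strings to coerce is a design choice, and this version deliberately leaves values with leading zeros as strings:

```typescript
// Converts "true"/"false" to booleans and plain integers to numbers;
// everything else stays a string.
function coerceScalar(value: string): string | number | boolean {
  const trimmed = value.trim();
  if (trimmed === "true") return true;
  if (trimmed === "false") return false;
  if (/^-?(0|[1-9]\d*)$/.test(trimmed)) return Number(trimmed);
  return value;
}

console.log(typeof coerceScalar("false")); // prints "boolean"
console.log(typeof coerceScalar("2022")); // prints "number"
```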

@inhumantsar
Owner

alright, things looking better now. keep your eyes peeled for a new release.

@chrisgrieser
Author

Okay, checked out b6, custom properties & b|false work nicely. New issues I noticed:

  • disabling properties only works for one slurp, afterwards, the disabled properties are re-enabled again
  • the order of keys set in the settings is not applied to the yaml frontmatter
  • very minor: there should be a blank line between the frontmatter and the note content
  • spaces are disallowed in the property, even though they are valid in Obsidian (and yaml in general)
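The ordering and blank-line points can be sketched together: sort the properties by their configured index, then write the frontmatter, a blank line, and the body. The `idx` field mirrors the `data.json` excerpt earlier in the thread; everything else is a hypothetical illustration, and values are written verbatim without YAML escaping:

```typescript
interface OrderedProp {
  key: string;
  idx: number;
  value: string;
}

// Renders a note: properties sorted by idx, a blank line, then the body.
function renderNote(props: OrderedProp[], body: string): string {
  const sorted = props.slice().sort((a, b) => a.idx - b.idx);
  const lines = sorted.map((p) => `${p.key}: ${p.value}`);
  return ["---"].concat(lines, ["---", "", body]).join("\n");
}

console.log(
  renderNote(
    [
      { key: "site", idx: 1, value: "The Verge" },
      { key: "date published", idx: 0, value: "2024" },
    ],
    "Article text"
  )
);
```

The key `date published` contains a space, which this sketch passes through unchanged.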

@inhumantsar
Owner

ok! those should be fixed in 0.1.5b7.

thanks for all your help on this!

@chrisgrieser
Author

Thank you! At long last, it seems everything works now – cannot find any issues anymore 🥳

Will migrate my setup to Slurp now. Maybe there will be some minor leftover issues I'll find in daily use, but I guess that would be a new GitHub issue.
