Feature proposal: Slurp multiple pages in website and locally link them #59

Open
eri24816 opened this issue Sep 16, 2024 · 0 comments

I tried to use Obsidian to help me learn from this website, which has hundreds of densely interlinked pages. Given Obsidian's ability to visually represent interlinked documents, it helped me easily grasp the big picture and navigate between topics.

What I did was write a Python script (rough sketch below) to

  1. scrape some pages of the website with BFS from a random starting point
  2. convert HTML to Markdown
  3. convert all internal links (https://ccrma.stanford.edu/~jos/<path>.html in this case) into local md links [<title>](./<path>.md)
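
A rough sketch of such a script (not the actual code I used; the starting page, page limit, and output directory are placeholders, and it assumes the requests, beautifulsoup4, and markdownify packages; it also keeps the original anchor text rather than the page title):

```python
import re
from collections import deque
from pathlib import Path
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from markdownify import markdownify

SITE_ROOT = "https://ccrma.stanford.edu/~jos/"
START_PAGE = SITE_ROOT + "filters/filters.html"  # hypothetical starting point
MAX_PAGES = 200                                  # arbitrary page limit
OUT_DIR = Path("jos-notes")                      # placeholder output folder


def crawl(start_url: str) -> None:
    queue = deque([start_url])
    seen = {start_url}
    downloaded = 0
    while queue and downloaded < MAX_PAGES:
        url = queue.popleft()
        soup = BeautifulSoup(requests.get(url).text, "html.parser")

        # 1. BFS: absolutize every link and enqueue pages under SITE_ROOT
        for a in soup.find_all("a", href=True):
            target = urljoin(url, a["href"])
            a["href"] = target
            if target.startswith(SITE_ROOT) and target.endswith(".html") and target not in seen:
                seen.add(target)
                queue.append(target)

        # 2. convert the HTML (with absolutized links) to Markdown
        md_text = markdownify(str(soup))

        # 3. rewrite internal .html links into local .md links
        md_text = re.sub(
            re.escape(SITE_ROOT) + r"(?P<path>[^)\s]+)\.html",
            lambda m: "./" + m.group("path") + ".md",
            md_text,
        )

        out_file = OUT_DIR / (url[len(SITE_ROOT):].removesuffix(".html") + ".md")
        out_file.parent.mkdir(parents=True, exist_ok=True)
        out_file.write_text(md_text, encoding="utf-8")
        downloaded += 1


crawl(START_PAGE)
```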

The result:

*(screenshot of the resulting notes)*

One problem is that the website has too many pages to download all at once, so sometimes when I navigate to another page by clicking a link (or a node), I land on a blank file. It would be great to have an automation that downloads the missing page as soon as it detects the user opening it.

Then I found this plugin, and I'm wondering whether we could implement this mechanism in it as an optional feature. I'm new to Obsidian plugin development and don't have a clear idea about the implementation, but maybe we could do this:

  • command Slurp: Assign directory to website (local_dir, website_root):
    assign a local directory (local_dir) that serves as a local copy of a target website URL (website_root)
  • command Slurp: Create notes from url and its related pages (bfs_source_url, max_pages, max_distance):
    download a specific page bfs_source_url of the target website and its surrounding pages into local_dir; bfs_source_url must be under website_root
  • when the user clicks a not-yet-created file in local_dir, the plugin automatically downloads the corresponding page to fill in the local file
  • all links of the form website_root/<path>.html are translated into local_dir/<path>.md (see the sketch after this list)
  • all of the above only takes effect inside local_dir; outside of it, everything works as usual
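
For the link-translation rule, here is a minimal sketch of the mapping (translate_link is a hypothetical helper for illustration only, not an existing API of this plugin; the real implementation would presumably live in the plugin's own code):

```python
def translate_link(url: str, website_root: str, local_dir: str) -> str | None:
    """Map website_root/<path>.html to local_dir/<path>.md; return None for external links."""
    if not url.startswith(website_root) or not url.endswith(".html"):
        return None  # outside website_root: leave the link untouched
    path = url[len(website_root):-len(".html")]
    return f"{local_dir}/{path}.md"


# e.g. translate_link("https://ccrma.stanford.edu/~jos/filters/foo.html",
#                     "https://ccrma.stanford.edu/~jos/", "jos")
# -> "jos/filters/foo.md"
```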

I'm not sure whether this proposal fits the purpose of this project. Looking forward to hearing your thoughts!

@eri24816 eri24816 changed the title Feature proposal: Awareness of internal links in website Feature proposal: Slurp multiple pages in website and locally link them Sep 16, 2024