Convert to text #12

Open
fetchinson opened this issue Dec 30, 2020 · 9 comments
Comments

@fetchinson

Hi, rdrview is absolutely fantastic! The fastest and most relevant output I've come across from all the firefox readability based tools I've tried.

One new feature would be a great addition I think: convert the readable html output to text. Right now I'm using rdrview to get the readable html, output it with "-H" and use the links or lynx browser to dump the formatted text with the -dump option.

Would be nice if rdrview would have an option for outputting text.
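For anyone who wants to script the workaround described above, here is a minimal sketch in Python. It assumes `rdrview` and `lynx` are both on the PATH, that `rdrview -H` prints the cleaned-up HTML as described in this post, and that lynx accepts the page on standard input via `-stdin` (with `-force_html` because there is no file extension to go by). The helper names `pipe` and `readable_text` are made up for this illustration.

```python
import subprocess

# Assumed commands: "-H" is the rdrview flag mentioned in this post;
# "-dump", "-force_html" and "-stdin" are lynx options.
RDRVIEW_CMD = ["rdrview", "-H"]
LYNX_CMD = ["lynx", "-dump", "-force_html", "-stdin"]


def pipe(producer, consumer):
    """Run `producer | consumer` and return the consumer's stdout as text."""
    first = subprocess.run(producer, check=True, capture_output=True)
    second = subprocess.run(
        consumer, input=first.stdout, check=True, capture_output=True
    )
    return second.stdout.decode("utf-8", errors="replace")


def readable_text(url):
    """Hypothetical helper: readable HTML from rdrview, rendered to text by lynx."""
    return pipe(RDRVIEW_CMD + [url], LYNX_CMD)
```

Calling `readable_text("https://example.org/article")` would then return the formatted text, provided both tools are installed; any other text-mode browser that can read HTML from standard input could be swapped in by changing `LYNX_CMD`.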

In any case, thanks a lot for rdrview!
(By the way I also had to throw away the sandbox stuff from the code because libseccomp would not compile on my system.)

@eafer
Owner

eafer commented Dec 30, 2020 via email

@fetchinson
Author

fetchinson commented Dec 30, 2020

(By the way I also had to throw away the sandbox stuff from the code because libseccomp would not compile on my system.)

Can you share any more details here? What's your system? If libseccomp is not always available I should do something to simplify the build in those cases. Or maybe just give up and use autoconf.

I have a very old Fedora 17 installation, about 8 years old, and Red Hat no longer provides updates for it. I compile almost everything from source, and the only time I run into trouble is when my glibc is too old and the code I'm trying to compile relies on newer glibc features, which does happen sometimes. I couldn't compile libseccomp, but it wasn't a glibc-related problem; it threw

system.c:461:16: error: ‘__NR_seccomp’ undeclared (first use in this function)

and after looking at the code for a while and googling around I couldn't figure out where __NR_seccomp should come from. So I gave up on libseccomp, but could easily compile your code by simply deleting everything which was sandbox related.
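A note on where `__NR_seccomp` normally comes from: syscall numbers like this one are defined in the kernel headers installed on the system (the `unistd*.h` files), and the seccomp() system call first appeared in Linux 3.17, so headers older than that would not be expected to define it. The sketch below is one way to check an installation; the `/usr/include` root and the `unistd*.h` file pattern are assumptions about the header layout, and the function names are invented for this example.

```python
import glob
import re


def defines_macro(header_text, name):
    """Return True if header_text contains a '#define <name>' line."""
    pattern = r"^\s*#\s*define\s+" + re.escape(name) + r"\b"
    return re.search(pattern, header_text, flags=re.MULTILINE) is not None


def headers_defining(name, root="/usr/include"):
    """List the unistd*.h files under root that define the given macro."""
    hits = []
    for path in glob.glob(root + "/**/unistd*.h", recursive=True):
        try:
            with open(path, encoding="utf-8", errors="replace") as handle:
                if defines_macro(handle.read(), name):
                    hits.append(path)
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
    return sorted(hits)


if __name__ == "__main__":
    found = headers_defining("__NR_seccomp")
    print("\n".join(found) if found else "__NR_seccomp is not defined in any header found")
```

An empty result would be consistent with the compile error above, though it only shows what the installed headers contain, not what the running kernel supports.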

By the way, what's the downside of running it without a sandbox?

@eafer
Owner

eafer commented Dec 30, 2020 via email

@fetchinson
Author

Okay, thanks, you're right, security is not really an issue in my setup. In the Makefile you could introduce a setting to have the sandbox not compile at all if it's a problem for other people too. But like I said, it's easy to just delete those parts of the code which refer to the sandbox, so it's not a big issue.

@eafer
Owner

eafer commented Jan 2, 2021 via email

@sdsddsd1

sdsddsd1 commented Feb 2, 2021

I had the same issue and was looking for an option to output text directly (an easy way to scroll).
Adding the following line to $HOME/.mailcap is a good solution for me:

text/html; /usr/bin/lynx -dump -force_html %s; copiousoutput; description=HTML Text; nametemplate=%s.html

Maybe also add this to the documentation?
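To make the structure of the mailcap entry above easier to read, here is a small Python sketch that splits it into its parts: the MIME type, the view command (where `%s` stands for the file name), bare flags such as `copiousoutput`, and `key=value` fields. It is a simplification that assumes no escaped semicolons in the line, and `parse_mailcap_entry` is a hypothetical helper, not something rdrview provides.

```python
def parse_mailcap_entry(line):
    """Split one mailcap line into MIME type, command, flags and named fields.

    Simplified: splits on every ';' and does not handle '\\;' escapes.
    """
    parts = [part.strip() for part in line.split(";")]
    entry = {"type": parts[0], "command": parts[1], "flags": [], "fields": {}}
    for part in parts[2:]:
        if not part:
            continue
        if "=" in part:
            key, value = part.split("=", 1)
            entry["fields"][key.strip()] = value.strip()
        else:
            entry["flags"].append(part)
    return entry


# The entry quoted in this comment.
ENTRY = (
    "text/html; /usr/bin/lynx -dump -force_html %s; copiousoutput; "
    "description=HTML Text; nametemplate=%s.html"
)

if __name__ == "__main__":
    for key, value in parse_mailcap_entry(ENTRY).items():
        print(key, "=", value)
```

The `copiousoutput` flag marks the command as one that writes a lot of text to standard output, and `nametemplate=%s.html` asks for the temporary file to be given an `.html` name.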

@csehszlovakze

I'd also like an easy option to have plain text output. Right now I have to pipe the outputted HTML into html2text, strip out the formatting marks then use that to print+TTS.

@rjolina

rjolina commented May 19, 2024

But rdrview can easily convert the html into text.

$ rdrview "https://lite.cnn.com/alzheimers-risk-test-sanjay-gupta/index.html" > text.txt

$ cat text.txt

   Updated: 2:00 AM EDT, Sun May 19, 2024

   Source: CNN

   I’ve been reporting on Alzheimer’s disease for more than two decades, and
   any progress in the field has seemed incremental at best, leaving most
   patients and their loved ones with few options. But in the process of
   filming a new documentary, “The Last Alzheimer’s Patient,” I met with
   people all across the country who had been diagnosed with or who are at
   high risk of the disease. With lifestyle changes alone, I saw levels of
   amyloid plaque decrease in their brains, their cognition improve and even
   signs of reversal of the disease.

   It was extraordinary and it also made me start to think about my own
   brain, because I have a family history of Alzheimer’s disease.

   So with some trepidation, I decided to learn more about my risk for
   dementia. It was one of the most personal and revealing experiences I have
   ever gone through.

@eafer
Owner

eafer commented Jun 9, 2024

@rjolina Yes, it works fine out of the box if you use debian or some other distro that sets up mailcap. Rdrview only cleans up the html, then a browser is needed to render that into text (lynx for example). The mailcap files tell us which browser to use, otherwise it can also be picked with the -B option. Of course it would be easier for the users if rdrview did everything by itself, there would be less configuration involved. But this is a small project and rendering html is probably at least somewhat tricky.
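To give a feel for what a built-in text mode would involve, here is a deliberately naive sketch using only Python's standard `html.parser`. It is not how rdrview or lynx work; the `BLOCK_TAGS` list and the `html_to_text` name are choices made for this illustration. It drops script and style content and starts a new line around block-level tags, and it ignores everything that makes real rendering tricky: line wrapping, links, lists, tables, and `<pre>` whitespace.

```python
from html.parser import HTMLParser

# Tags treated as line breaks; an arbitrary, incomplete list for illustration.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "blockquote", "pre",
              "h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collect text nodes, inserting newlines around block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)
```

Even this toy version needs decisions about whitespace and which tags break lines, which is roughly why delegating the rendering to a browser through mailcap or `-B` keeps the project small.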
