[personal profile] andrewducker
Okay, so if I want something to scrape an RSS feed and turn it into a daily
LJ/DW post, how hard is that going to be? Anyone got something handy that
I can kick off on a daily basis?

Or a service they could recommend?

Date: 2011-09-03 07:45 am (UTC)
From: [personal profile] birguslatro
I wrote something like that in [gulp] 2004, but it was done because the blog in question didn't have an RSS feed. Obviously not of use to you (unless you want to learn REBOL). And I wouldn't have a clue if the method used there to post to LJ would still work. Still, REBOL's parse is excellent for sifting data out of text files.

Date: 2011-09-02 12:53 pm (UTC)
From: [identity profile] fub.livejournal.com
I don't know of any tools that do it -- I'd roll my own based on Python, and add an entry to the crontab on my desktop. But that may not be a plausible solution for you.

Date: 2011-09-02 04:04 pm (UTC)
From: [identity profile] hawkida.livejournal.com
I'll invite you to ifttt.com, you might be able to do it via post-by-email using that, not sure.

Date: 2011-09-02 05:13 pm (UTC)
From: [identity profile] johnbobshaun.livejournal.com
Something with Yahoo Pipes?

Date: 2011-09-02 06:14 pm (UTC)
From: [identity profile] johnbobshaun.livejournal.com
Not hard at all. I've used it for a couple of bits and pieces myself.

Date: 2011-09-02 06:15 pm (UTC)
From: [identity profile] johnbobshaun.livejournal.com
Especially if you don't need to faff around with the BigTable datastore.

Date: 2011-10-29 10:38 am (UTC)
From: [identity profile] johnbobshaun.livejournal.com
Good to know. I take it that Objectify is an ORM-style tool? Like Hibernate? Or whatever the .NET equivalent is?

TBH, everything I've used GAE for has had such simple data requirements that I've never needed to dig into that side of things. But I've used MongoDB when dicking around with Rails and am guessing that it's reasonably similar.

Date: 2011-09-03 12:52 pm (UTC)
From: [personal profile] nameandnature
I run it once a day from a cron job. When it runs, it gathers new items from the feed and generates HTML for them. It posts the accumulated HTML at the point where it's been a while since the last post, or when there are more than N items gathered since the last post (I've messed about with the exact figures, but the script should make it obvious what you need to tweak to adjust that).

So, when there are more than N, it posts all of them at the point where it notices, and it has a chance to notice every time it runs.

If you use it, you might want to take out the hack that stops it ever mentioning religion in the post titles, and you'll need to give it your password a different way (mine comes from an XML file I use to configure the backup tool).
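
For anyone who wants the shape of that logic without the script to hand, here is a minimal sketch of the gather-then-maybe-post behaviour described above. It is not the actual script: every name and threshold in it (`MAX_PENDING`, `MAX_AGE_SECONDS`, `run_once`, the state layout) is made up for illustration, and fetching the feed and posting to LJ/DW are left as inputs you'd supply yourself.

```python
import pickle
from pathlib import Path

# Hypothetical thresholds, not the values used in the original script.
MAX_PENDING = 5                # "N": post once more than this many items are waiting
MAX_AGE_SECONDS = 3 * 86400    # "a while": post if the last post is older than this


def load_state(path):
    """Unpickle the saved state, or return a fresh one on the first run."""
    p = Path(path)
    if p.exists():
        with p.open("rb") as f:
            return pickle.load(f)
    # last_post of 0.0 means the first run posts as soon as anything is pending.
    return {"seen": set(), "pending": [], "last_post": 0.0}


def save_state(path, state):
    """Pickle the state so the next cron run can pick up where this one stopped."""
    with Path(path).open("wb") as f:
        pickle.dump(state, f)


def should_post(state, now):
    """Post when items are waiting and either threshold has been crossed."""
    if not state["pending"]:
        return False
    too_many = len(state["pending"]) > MAX_PENDING
    too_old = now - state["last_post"] > MAX_AGE_SECONDS
    return too_many or too_old


def run_once(state, feed_items, now, post):
    """One cron run.

    feed_items: iterable of (item_id, html) pairs from the feed.
    post: callable that takes the accumulated HTML and publishes it.
    Returns True if a post was made.
    """
    for item_id, html in feed_items:
        if item_id not in state["seen"]:
            state["seen"].add(item_id)
            state["pending"].append(html)
    if should_post(state, now):
        post("\n".join(state["pending"]))
        state["pending"] = []
        state["last_post"] = now
        return True
    return False
```

A daily run would then be roughly: `load_state`, call `run_once` with the parsed feed and `time.time()`, then `save_state`. Because the check happens on every run, a backlog of more than N items gets posted on whichever run first notices it.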

Date: 2011-09-03 12:53 pm (UTC)
From: [personal profile] nameandnature
Running from GAE should just be a matter of replacing the pickle/unpickle with use of whatever backend GAE has, as ISTR there's no filesystem on GAE.
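
One way to make that swap painless is to hide the persistence behind a tiny load/save interface. The sketch below is an assumption about how you might structure it, not how the script is actually written: `PickleFileStore` and `MemoryStore` are hypothetical names, and `MemoryStore` is only a stand-in for a non-filesystem backend. A real GAE version would implement the same two methods on top of whatever storage GAE provides, which isn't shown here.

```python
import pickle
from pathlib import Path


class PickleFileStore:
    """Keeps state in a pickle file on disk (the desktop/cron case)."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self, default):
        # First run: no file yet, so fall back to the caller's default.
        if not self.path.exists():
            return default
        with self.path.open("rb") as f:
            return pickle.load(f)

    def save(self, state):
        with self.path.open("wb") as f:
            pickle.dump(state, f)


class MemoryStore:
    """Stand-in for a backend with no filesystem.

    It holds the pickled bytes in memory; a hosted backend would write
    those bytes to its own storage instead.
    """

    def __init__(self):
        self._blob = None

    def load(self, default):
        if self._blob is None:
            return default
        return pickle.loads(self._blob)

    def save(self, state):
        self._blob = pickle.dumps(state)
```

The rest of the script then only ever calls `store.load(default)` and `store.save(state)`, so changing where the state lives means changing one constructor call.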

Date: 2011-09-03 01:30 am (UTC)
From: [identity profile] drjon.livejournal.com
If you get a good answer, I'd certainly be interested. I researched the question some time ago, but didn't come up with a good answer.
