Reliable messaging with Atom feed inbound channel adapter?

  • Reliable messaging with Atom feed inbound channel adapter?

    I'm trying to apply SI Feed to this problem: I need to consume an Atom feed from a remote server that has a fairly high traffic volume. It's very important that I not miss any feed entries.

    Reading through the source for the feed:inbound-channel-adapter, this is how I understand the general process that occurs:
    1) Read all the entries from the given feed URL into memory. This is likely to be the "current page" of an atom feed, yielding the most recent N entries.
    2) Sort the entries by lastModified (if available) or publishedDate.
    3) Discard entries from the head of the list until one is found with a lastModified/publishedDate that is after the "high water mark", which is stored in the MetadataStore.
    4) Put all remaining entries into the queue of entries to be received by SI.
    5) When polled, return the entry at the head of the queue. When we run out of entries, go back to 1).
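A minimal sketch of the five steps above, written as plain Java with no Rome or Spring Integration dependencies. The names `Entry` and `HighWaterMarkSource` are hypothetical, the `highWaterMark` field stands in for the value kept in the MetadataStore, and advancing the mark as each entry is handed out is an assumption rather than something confirmed from the SI source.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a feed entry; not a Rome or Spring Integration type.
final class Entry {
    final String id;
    final long timestamp; // lastModified if available, otherwise publishedDate

    Entry(String id, long timestamp) {
        this.id = id;
        this.timestamp = timestamp;
    }
}

// Sketch of the polling cycle as described in the post, not the actual SI implementation.
final class HighWaterMarkSource {
    private final Queue<Entry> queue = new ArrayDeque<>();
    private long highWaterMark = Long.MIN_VALUE; // stand-in for the MetadataStore value

    // Steps 1-4: take the current page, sort it, skip entries at or below the mark,
    // and queue the rest. Intended to be called only when the queue is empty.
    void refresh(List<Entry> currentPage) {
        List<Entry> sorted = new ArrayList<>(currentPage);
        sorted.sort(Comparator.comparingLong((Entry e) -> e.timestamp));
        for (Entry e : sorted) {
            if (e.timestamp > highWaterMark) {
                queue.add(e);
            }
        }
    }

    // Step 5: return the head of the queue and advance the mark (assumed behaviour).
    Entry receive() {
        Entry next = queue.poll();
        if (next != null) {
            highWaterMark = next.timestamp;
        }
        return next; // null means the queue is empty and the caller should refresh
    }
}
```

Nothing in this cycle looks beyond the page passed to `refresh`, which is where the concern in the next paragraph comes from.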

    Assuming my understanding is correct, I see one glaring problem. On a high-volume feed it seems quite possible, if not likely, that between successive invocations of step 1) the first page will have changed so much that some new entries have already fallen off to the second page, and SI doesn't seem to have a way to account for that.
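To make the gap concrete, here is a small simulation under made-up assumptions: entries carry the integer timestamps 1, 2, 3 and so on, the first page always holds the newest `pageSize` entries, and `PageGapDemo` is a hypothetical name used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a feed whose first page shows only the newest pageSize entries.
final class PageGapDemo {
    // Returns the timestamps visible on the first page, oldest first.
    static List<Integer> currentPage(int latest, int pageSize) {
        List<Integer> page = new ArrayList<>();
        for (int t = Math.max(1, latest - pageSize + 1); t <= latest; t++) {
            page.add(t);
        }
        return page;
    }

    // Returns entries published after the high water mark that are no longer on the first page.
    static List<Integer> missed(int highWaterMark, int latest, int pageSize) {
        List<Integer> page = currentPage(latest, pageSize);
        List<Integer> lost = new ArrayList<>();
        for (int t = highWaterMark + 1; t <= latest; t++) {
            if (!page.contains(t)) {
                lost.add(t);
            }
        }
        return lost;
    }
}
```

With a page size of 3 and a high water mark of 3, `missed(3, 5, 3)` is empty because the page still reaches back to the mark, while `missed(3, 10, 3)` returns 4 through 7: those entries were published between polls and are no longer reachable from the first page alone.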

    Does anyone have a suggestion how to handle this problem? I've forked the SI repo and am going to play around with a couple of ideas of my own.

  • #2
    Hi,

    Are you talking about supporting RFC5005 - Feed Paging and Archiving (http://tools.ietf.org/html/rfc5005)?

    We need to investigate that further. I'm not sure whether Rome (https://rometools.jira.com/wiki/display/ROME/Home), the underlying library, supports it. We know our feed support could use some improvements. Please create a Jira and provide some more details. Even better, if you have some specific ideas, please also consider contributing your code back to the project! We are certainly grateful for any help!

    Thanks!

    Cheers,

    Gunnar


    • #3
      Hi, I got pulled away to other things, but yes: RFC5005 is more or less what I'm looking for. Specifically, I need to consume from several Atom Hopper servers. Atom Hopper uses a paging and linking strategy that, while eternally confusing (to me anyway), assures you won't miss messages if the links are followed properly. I've cooked up my own class for doing this that's heavily influenced by the SI FeedEntryMessageSource and is tightly coupled to SI as a result (which I'm okay with). It uses the same Rome FeedFetcher that the SI Feed component does, so yes, Rome supports it just fine. I've pinged our legal department about contributing my code to SI. Does this sound like something you guys would be interested in?
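A rough sketch of the link-following idea, not the class mentioned above. It assumes each page exposes a link to the next older page; the names `FeedEntry`, `Page`, `olderPageUrl` and `PagedCatchUp` are hypothetical, and an in-memory map stands in for fetching pages over HTTP. Which `rel` value carries the older-page link depends on the server and is not modelled here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical entry type; timestamp is lastModified or publishedDate.
final class FeedEntry {
    final String id;
    final long timestamp;

    FeedEntry(String id, long timestamp) {
        this.id = id;
        this.timestamp = timestamp;
    }
}

// Hypothetical page type: its entries plus a link to the next older page (null if none).
final class Page {
    final List<FeedEntry> entries;
    final String olderPageUrl;

    Page(List<FeedEntry> entries, String olderPageUrl) {
        this.entries = entries;
        this.olderPageUrl = olderPageUrl;
    }
}

final class PagedCatchUp {
    // Walks from the newest page toward older pages until an already-seen entry appears,
    // then returns every unseen entry in chronological order.
    static List<FeedEntry> collectNewEntries(Map<String, Page> pagesByUrl,
                                             String startUrl, long highWaterMark) {
        List<FeedEntry> fresh = new ArrayList<>();
        Set<String> visited = new HashSet<>(); // guards against link cycles
        String url = startUrl;
        while (url != null && visited.add(url)) {
            Page page = pagesByUrl.get(url); // stand-in for an HTTP fetch
            if (page == null) {
                break;
            }
            boolean reachedSeenEntry = false;
            for (FeedEntry e : page.entries) {
                if (e.timestamp > highWaterMark) {
                    fresh.add(e);
                } else {
                    reachedSeenEntry = true;
                }
            }
            url = reachedSeenEntry ? null : page.olderPageUrl;
        }
        fresh.sort(Comparator.comparingLong((FeedEntry e) -> e.timestamp));
        return fresh;
    }
}
```

The walk stops as soon as a page contains an entry at or below the high water mark, so in the common case only one page is read; it only reaches further back when the feed has moved by more than a page since the last poll.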
