  • File Scanner prevent duplicates


    I am using a file inbound channel adapter to scan a folder and read the files in it:
    <file:inbound-channel-adapter channel="filesChannel"
       directory="C:\FilesToProcess" prevent-duplicates="true" scanner="fileScanner">
       <poller fixed-rate="0" />
    </file:inbound-channel-adapter>
    The 'prevent-duplicates' attribute makes sure that files that have already been processed are not processed again, but the record of processed files is held only in memory. Is there a way to persist it to a file server or database, so that files processed before a restart of the Spring container are not processed again afterwards?

    Thanks for your help,

  • #2
    Not out of the box.

    When you set prevent-duplicates to true (or don't specify it at all), a CompositeFileListFilter is created which contains (in your case) a SimplePatternFileListFilter and an AcceptOnceFileListFilter.

    You can create your own CompositeFileListFilter, containing the SimplePatternFileListFilter and a custom 'PersistedAcceptOnceFileListFilter'. Instead of using filename-pattern and prevent-duplicates, pass your CompositeFileListFilter in via the filter attribute.
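
To make the 'PersistedAcceptOnceFileListFilter' idea above more concrete, here is a minimal sketch, not a tested Spring component. To stay self-contained it does not implement Spring's `FileListFilter<File>` interface, although its `filterFiles(File[])` method has the same shape, so adding `implements FileListFilter<File>` should be straightforward. The class name, the one-path-per-line store format, and the use of the absolute path as the "seen" key are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical filter: accepts each file once and remembers it across restarts
// by appending its absolute path to a plain-text store file.
public class PersistedAcceptOnceFileListFilter {

    private final Path store;                         // assumed store: one path per line
    private final Set<String> seen = new HashSet<>();

    public PersistedAcceptOnceFileListFilter(Path store) throws IOException {
        this.store = store;
        // Reload what earlier runs already accepted.
        if (Files.exists(store)) {
            seen.addAll(Files.readAllLines(store, StandardCharsets.UTF_8));
        }
    }

    // Same shape as FileListFilter<File>#filterFiles(File[]).
    public synchronized List<File> filterFiles(File[] files) {
        List<File> accepted = new ArrayList<>();
        if (files == null) {
            return accepted;
        }
        for (File file : files) {
            String key = file.getAbsolutePath();
            if (seen.add(key)) {      // true only the first time this path is seen
                append(key);          // persist before handing the file downstream
                accepted.add(file);
            }
        }
        return accepted;
    }

    private void append(String key) {
        try {
            Files.write(store, Collections.singletonList(key), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that this sketch records a file as seen when it is accepted, not when processing finishes, and it keys on the path only, so a file that is replaced under the same name would not be picked up again. A database table could take the place of the store file with the same load-on-start, append-on-accept structure.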


    • #3
      We process roughly 240 files a day. Is there a way to control or reset the number of files the AcceptOnceFileListFilter keeps in memory, so that we don't run into memory and performance issues?
      Last edited by facct; May 1st, 2013, 12:01 PM.


      • #4
        Yes; it takes a max capacity argument in one of its constructors.

        	/**
        	 * Creates an AcceptOnceFileListFilter that is based on a bounded queue. If the queue overflows,
        	 * files that fall out will be passed through this filter again if passed to the
        	 * {@link #filterFiles(Object[])}
        	 * @param maxCapacity the maximum number of Files to maintain in the 'seen' queue.
        	 */
        	public AcceptOnceFileListFilter(int maxCapacity) {
        		this.seen = new LinkedBlockingQueue<F>(maxCapacity);
        	}
        So, if you set it to 240, it will hold about a day's worth.
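
The effect of the bounded 'seen' queue described in that Javadoc can be illustrated with a small standalone sketch. It is not the Spring class itself: `BoundedAcceptOnce` is a hypothetical name, and the evict-oldest-on-overflow policy is an assumption based on the Javadoc's wording that files "fall out" of the queue.

```java
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for an accept-once filter backed by a bounded queue.
public class BoundedAcceptOnce<T> {

    private final Queue<T> seen;

    public BoundedAcceptOnce(int maxCapacity) {
        this.seen = new LinkedBlockingQueue<>(maxCapacity);
    }

    // Returns true if the item is not currently remembered, and remembers it.
    public synchronized boolean accept(T item) {
        if (seen.contains(item)) {
            return false;             // still in the queue: treated as a duplicate
        }
        if (!seen.offer(item)) {      // queue is full
            seen.poll();              // assumed policy: forget the oldest entry
            seen.add(item);
        }
        return true;
    }
}
```

With a capacity of 2, accepting "a", "b" and then "c" pushes "a" out of the queue, so "a" would be accepted again on the next scan. The same applies at a capacity of 240: a file that is still sitting in the directory after 240 newer files have been seen would be picked up a second time.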