  • Disable duplicate files filter for FTP inbound channel adapter

    I am using SI 3.1 and am trying to find a way to make int-ftp:inbound-channel-adapter process files with the same name. Is there an easy way to disable the duplicate files filter for int-ftp:inbound-channel-adapter?

    Thanks
    Gali

  • #2
    I tried to solve the problem by overriding the default filter. I added a filter to the FTP adapter:

    Code:
    <int-ftp:inbound-channel-adapter id="ftpFolderInboundChannelAdapter"
          local-directory="C/Temp/trigger" channel="monitoredFolderFileChannel" session-factory="ftpSessionFactory"
          remote-directory="#{monitoredFolders.ftpFolder}" delete-remote-files="true"
          auto-startup="false" filter="allowDuplicatesFiler">
    The filter bean is defined:

    Code:
    <bean id="allowDuplicatesFiler" class="mypackage.AllowDuplicatesFiler"/>
    The filter does not do anything - just returns the same list of files:

    Code:
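    import java.util.Arrays;
    import java.util.List;

    import org.apache.commons.net.ftp.FTPFile;
    import org.springframework.integration.file.filters.FileListFilter;

    // Pass-through filter: accepts every remote file, including names seen before.
    public class AllowDuplicatesFiler implements FileListFilter<FTPFile> {
    
        public List<FTPFile> filterFiles(FTPFile[] files) {
            return Arrays.asList(files);
        }
    
    }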
    However, a file with the same name still does not get pulled from the FTP folder a second time.

    Does anyone have a solution to this?

    Thanks
    Gali
    Last edited by oagady; Dec 7th, 2011, 09:43 AM.


    • #3
      Well, apparently my solution of overriding the filter worked, but not 100%. One day it accepts duplicate files, the next day it does not, and the day after it accepts them again. It seems random.


      • #4
        Sorry we are somewhat unavailable this week, but I'll look at it on Monday. If you don't hear anything just ping again.


        • #5
          Sorry, I don't want you to think I forgot about it. I think I know what's going on, but give me until tomorrow and I'll reply.


          • #6
            I also have a use case where I have to poll files with the same name from an FTP server, so I am interested in a solution to this problem.

            I have peeked into the source code, and it seems to me that the configured filter is indeed used in FtpInboundFileSynchronizingMessageSource.synchronizer. However, there is another filter built internally into FtpInboundFileSynchronizingMessageSource.fileSource, which is not overruled by the configured filter. FtpInboundFileSynchronizingMessageSource.fileSource seems to be used to poll the local directory where the file is put after the transfer from the FTP server. This fileSource has a scanner with a composite filter containing an AcceptOnceFileListFilter:

            Code:
            private FileListFilter<File> buildFilter() {
            		Pattern completePattern = Pattern.compile("^.*(?<!" + this.synchronizer.getTemporaryFileSuffix() + ")$");
            		return new CompositeFileListFilter<File>(Arrays.asList(
            				new AcceptOnceFileListFilter<File>(),
            				new RegexPatternFileListFilter(completePattern)));
            	}
            So the file will probably be transferred to the local directory, but won't be propagated any further from there. (I have to test this later...)

            Is this right? Is there a workaround or a fix coming?

            Thanks,
            Maarten
            Last edited by mdond; Feb 23rd, 2012, 11:24 AM.
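
To illustrate why a second file with the same name would stop at the local directory, here is a minimal sketch of accept-once behaviour. It is not the Spring Integration implementation: `AcceptOnceSketch` is a hypothetical class, and it assumes such a filter simply keeps every path it has already passed on in memory.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of an accept-once filter; not the library class.
final class AcceptOnceSketch {

    // Paths that have already been passed on. Held in memory only (assumption).
    private final Set<File> seen = new HashSet<>();

    // Returns only the files whose path has not been seen before.
    List<File> filter(List<File> files) {
        List<File> accepted = new ArrayList<>();
        for (File file : files) {
            // Set.add returns false when an equal path is already present.
            if (seen.add(file)) {
                accepted.add(file);
            }
        }
        return accepted;
    }
}
```

Under that assumption, a file that arrives again under the same local path is rejected, a different local file name counts as a new file, and the remembered paths are lost when the application restarts.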


            • #7
              Ok, I tested my assumption above and indeed the embedded fileSource of the ftp:inbound-channel-adapter doesn't process files with the same name. Also there is no easy way to change the filter of the embedded fileSource.

              However there is a solution: with the attribute local-filename-generator-expression it is possible to make the local filename unique. So the combination of a filter in the ftp:inbound-channel-adapter and a local-filename-generator that generates unique filenames indeed makes it possible to process files with duplicate file names. You need something like this:
              Code:
              <int-ftp:inbound-channel-adapter id="ftpIn"
              	channel="filesIn" session-factory="ftpClientFactory"
              	auto-create-local-directory="true"
              	delete-remote-files="true" remote-directory="." local-directory="file://${baseDirectory}/tmpdir"
              	local-filename-generator-expression="substring(0,lastIndexOf('.')) + '_' + T(java.lang.System).currentTimeMillis() + substring(lastIndexOf('.'))"
              	filter="ftpFileNameFilter"> 
                  <int:poller max-messages-per-poll="-1" fixed-rate="10000"/>
              </int-ftp:inbound-channel-adapter>
              The random behaviour Gali described above is probably the result of a restart of the transfer process. After a restart the files will be processed once again. A local-filename-generator-expression like the one above should solve that problem.

              Hope this helps someone,
              Maarten
              Last edited by mdond; Feb 23rd, 2012, 11:26 AM.
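
For readers less familiar with SpEL, the local-filename-generator-expression above is evaluated against the remote file name and inserts a timestamp before the extension. A plain-Java sketch of the same string manipulation follows. `UniqueLocalName` and `withSuffix` are hypothetical names, and the suffix is passed in as a parameter (instead of calling `System.currentTimeMillis()` directly) only to make the result predictable. Note that `String.substring(0, -1)` throws in Java, so the expression as posted would likely fail for a name that contains no dot; the sketch handles that case as an assumption of its own.

```java
// Plain-Java equivalent of the string manipulation in the SpEL expression.
final class UniqueLocalName {

    // Inserts "_<suffix>" before the last dot of the remote file name.
    static String withSuffix(String remoteName, long suffix) {
        int dot = remoteName.lastIndexOf('.');
        if (dot < 0) {
            // Assumption: a name without an extension just gets the suffix appended.
            return remoteName + "_" + suffix;
        }
        return remoteName.substring(0, dot) + "_" + suffix + remoteName.substring(dot);
    }
}
```

For example, `UniqueLocalName.withSuffix("report.csv", System.currentTimeMillis())` gives a name of the form `report_<millis>.csv`, so each download of `report.csv` lands under a different local name.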


              • #8
                I have the same problem too.

                A 3rd party uploads to an FTP site, but it does not use a temporary prefix during the upload.
                I then poll the FTP site using the FTP inbound channel adapter, with delete-remote-files="true".

                Sometimes the file will be downloaded while it is still being uploaded by the 3rd party. However, in this case the delete won't actually delete the remote file, because it is locked by the uploader.

                The next time the adapter polls, it won't take the now-completed file and overwrite the old local copy.

                This is because of the following in org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer:

                Code:
                private void copyFileToLocalDirectory(String remoteDirectoryPath, F remoteFile, File localDirectory, Session<F> session) throws IOException {
                  //snipped
                  File localFile = new File(localDirectory, localFileName);
                  if (!localFile.exists()) {
                    //download and delete
                  }
                }
                Last edited by shaine; Mar 29th, 2012, 06:53 PM.


                • #9
                  I'd say the solution described by @mdond is the most elegant one. It's a hard problem to deal with if you rely on the file name; however, expressions allow you to modify that name, for example by appending a timestamp or another unique value.


                  • #10
                    This is a classic problem; relying on the rename failing is not a solution because you could consume the file before it's complete, but it's complete by the time you rename.

                    An alternative is a custom filter to look at the last update timestamp - if the timestamp is, say, newer than 5 minutes ago, then filter out the file; it will be picked up when the last update was long enough ago.

                    This still has issues, though, e.g. if the sender pauses for longer than the filter's time limit. It also relies on the operating system updating the timestamp while the file is uploaded (not all OSs do that).
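
A minimal sketch of such an age-based filter is shown below. It is deliberately written without the Spring Integration or Commons Net types so that it stands alone: `MinimumAgeFilter` and `RemoteEntry` are hypothetical names, and the sketch assumes the server's last-modified time is already available as an `Instant` that can be compared with the local clock.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical age-based filter: skips files modified too recently.
final class MinimumAgeFilter {

    // Hypothetical stand-in for one entry of a remote directory listing.
    static final class RemoteEntry {
        final String name;
        final Instant lastModified;

        RemoteEntry(String name, Instant lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    private final Duration minimumAge;

    MinimumAgeFilter(Duration minimumAge) {
        this.minimumAge = minimumAge;
    }

    // Keeps entries last modified at least minimumAge before 'now'.
    // Newer entries are skipped and can be picked up on a later poll.
    List<RemoteEntry> filter(List<RemoteEntry> entries, Instant now) {
        Instant cutoff = now.minus(minimumAge);
        List<RemoteEntry> accepted = new ArrayList<>();
        for (RemoteEntry entry : entries) {
            if (!entry.lastModified.isAfter(cutoff)) {
                accepted.add(entry);
            }
        }
        return accepted;
    }
}
```

With `new MinimumAgeFilter(Duration.ofMinutes(5))`, a file modified one minute ago is held back and a file modified ten minutes ago is accepted. In a real adapter the same check would sit inside a filter for the remote file type.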


                    • #11
                      These are 2 different problems (use cases):

                      1) The solution I described above in #7 solves the problem 'Disable duplicate files filter for FTP inbound channel adapter' as described by oagady.

                      2) The problem described by shaine in #8 is that a file on the FTP server isn't deleted when it is processed while still being uploaded. This has nothing to do with duplicate files; it happens with a unique file too. Gary Russell's post #10 is a good starting point for solving this problem. A similar approach could be to filter by file size instead of timestamp.

                      It seems that shaine needs a solution for problem 2). Solution 1) wouldn't solve this, as it would result in parts of the same file being processed multiple times. In my use case for problem 1) I (quietly) assumed that the file is not polled while it is being uploaded.
                      Last edited by mdond; Mar 30th, 2012, 06:04 AM.
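
The size-based idea could look roughly like the sketch below: a file is accepted only when its size is the same on two consecutive polls. This is one possible reading of "filter by file size", not something spelled out in the thread. `StableSizeFilter` is a hypothetical class, and the listing is modelled as a plain map from file name to size so the example stands alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical filter that accepts a file once its size has stopped changing.
final class StableSizeFilter {

    // Size seen for each file name on the previous poll.
    private final Map<String, Long> previousSizes = new HashMap<>();

    // 'listing' maps file name to the size reported on this poll.
    // Returns the names whose size equals the size seen on the previous poll.
    List<String> accept(Map<String, Long> listing) {
        List<String> stable = new ArrayList<>();
        for (Map.Entry<String, Long> entry : listing.entrySet()) {
            Long before = previousSizes.get(entry.getKey());
            if (before != null && before.equals(entry.getValue())) {
                stable.add(entry.getKey());
            }
        }
        // Remember only what is currently listed, so removed files are forgotten.
        previousSizes.clear();
        previousSizes.putAll(listing);
        return stable;
    }
}
```

This shares the weakness mentioned for the timestamp approach: if the sender pauses for longer than one poll interval, a partial file looks stable. It also keeps accepting a stable file on every poll until the file is removed from the server.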


                      • #12
                        Originally posted by Gary Russell
                        This is a classic problem; relying on the rename failing is not a solution because you could consume the file before it's complete, but it's complete by the time you rename.

                        An alternative is a custom filter to look at the last update timestamp - if the timestamp is, say, newer than 5 minutes ago, then filter out the file; it will be picked up when the last update was long enough ago.

                        This still has issues, though, e.g. if the sender pauses for longer than the filter's time limit. It also relies on the operating system updating the timestamp while the file is uploaded (not all OSs do that).
                        Yes, I think I will go with this solution, filtering out files that are not older than X. This will add complications with timezones, though.


                        • #13
                          FYI

                          Due to issues with timezones, and not being able to reliably know the timezone of the FTP server, I implemented this using AspectJ load-time weaving.

                          Code:
                          @Aspect
                          public class FtpFileAspect {
                          
                          	@Before("execution(* org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.copyFileToLocalDirectory(..)) && args(remoteDirectoryPath, remoteFile, localDirectory, ..)")
                          	public void deleteLocalFile(JoinPoint joinPoint, String remoteDirectoryPath, FTPFile remoteFile, File localDirectory) {
                          		System.out.println("ASPECT: intercepting download");
                          	}
                          
                          }


                          • #14
                            Obviously, I have now realised that I should just do this with a filter.

                            Code:
                            public class FTPFileListFilter implements FileListFilter<FTPFile> {
                            	
                            	private Logger logger = LoggerFactory.getLogger(FTPFileListFilter.class);
                            	private boolean deleteLocalDuplicate = false;
                            	private String localDirectory;
                            	private String filenameRegex;
                            
                            	@Override
                            	public List<FTPFile> filterFiles(FTPFile[] files) {
                            		List<FTPFile> filteredFiles = new ArrayList<FTPFile>();
                            		for (FTPFile ftpFile : files) {
                            			String remoteFileName = getFilename(ftpFile);
                            			if (remoteFileName.matches(getFilenameRegex())) {
                            				logger.debug("including remote file " + remoteFileName);
                            				filteredFiles.add(ftpFile);
                            				if (isDeleteLocalDuplicate()) {
                            					String localFileName = remoteFileName;
                            					File localFile = new File(localDirectory, localFileName);				
                            					if (localFile.exists()) {
                            						logger.info("deleting local duplication file " + localFile.getAbsolutePath());
                            						localFile.delete();
                            					}
                            				}				
                            			} else {
                            				logger.debug("filtered out remote file " +remoteFileName);
                            			}
                            		}
                            		return filteredFiles;		
                            	}
                            	
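                            	private String getFilename(FTPFile file) {
                            		return (file != null ? file.getName() : null);
                            	}

                            	// Accessors used by filterFiles() above and by the bean properties below.
                            	public boolean isDeleteLocalDuplicate() { return deleteLocalDuplicate; }
                            	public void setDeleteLocalDuplicate(boolean deleteLocalDuplicate) { this.deleteLocalDuplicate = deleteLocalDuplicate; }
                            	public String getFilenameRegex() { return filenameRegex; }
                            	public void setFilenameRegex(String filenameRegex) { this.filenameRegex = filenameRegex; }
                            	public void setLocalDirectory(String localDirectory) { this.localDirectory = localDirectory; }
                            }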
                            Code:
                            <bean id="ftpFileListFilter" class="za.co.realmdigital.iol.integration.FTPFileListFilter" abstract="true">
                            	<property name="deleteLocalDuplicate" value="${ftp.delete.remote.files}" />
                            	<property name="filenameRegex" value=".*(?i)\.pdf" />
                            </bean>
                            
                            <int-ftp:inbound-channel-adapter id="ftpChannelAdapter" filter="ftpFileListFilter" />
                            Last edited by shaine; Apr 2nd, 2012, 06:24 PM.
