
  • dm Server undeploys my application

    Hi everyone,

    I have a problem with dm Server: it undeploys my application by itself. The application consists of one .properties file, several .jar files and one final .war file. After successfully deploying the application, I open my browser and try to log in. Sometimes it works and I can get past the login and work normally, but much more often dm Server undeploys my application, everything in pickup, and several other bundles. My logs show no errors; all I see is:

    Code:
    HD0001I Hot deployer processing 'DELETED' event for file 'MRPWeb.war'.
    I don't see any reason for dm Server to do that. I do not touch this file while dm Server is running, I have no antivirus software installed, and I see no strange behaviour in my system. So I can really say that this file wasn't deleted by anyone else; it was dm Server that did this.

    Has anyone experienced similar behaviour? Does anyone have a clue or a hint where to look, or what to look for? I'm really stuck and out of ideas. Maybe you need additional information? I will gladly provide whatever you need to resolve this.

    Best regards

    Jacek Bilski

  • #2
    I presume this is on dm Server 2.0.x. Please confirm.

    I am chasing a somewhat similar bug with a commercial customer, but in their case they undeploy a previously deployed application and then many apparently unrelated bundles are impacted (they are stopped and then restarted).

    Your problem does seem different in one respect: the deletion does not appear to be your doing.

    Please would you take a look in the diagnostic log serviceability/logs/log.log and see if there is any evidence of some activity preceding the HD0001I message which may be initiating a deletion.

    Also, it would be interesting to know if this happens after a clean start (add the "-clean" switch to the startup script invocation).



    • #3
      Hi Glyn,

      I confirm, it's dm Server 2.0.x. We tested both 2.0.0 and 2.0.1, with the same results. Doing a clean start makes no difference either; dm Server can still undeploy our application.

      Our logs show absolutely nothing out of the ordinary; everything looks perfectly OK until "processing 'DELETED' event".

      We will be investigating memory parameters and related aspects. We feel that those areas could give us a clue as to what's happening. I'll post any results we get here.

      Best regards

      Jacek Bilski



      • #4
        Hi Jacek

        Hmmm. Please confirm that you looked in the log.log and not simply in the eventlog.log, which only contains the console messages.

        If so, it seems that the file system watcher thread is somehow getting confused and considering the file to have been deleted. Debugging through that code sequence would be the best way to determine what's going on. I would set a breakpoint on HotDeploymentFileSystemListener.onChange and try to determine the state of the file.

        Unfortunately, if some other part of dm Server is doing the deletion, this won't tell you much. One trick there would be to set the permissions of the file which is deleted to read only and hopefully see file I/O exceptions when the deletion occurs.

        Glyn



        • #5
          Hi Glyn,

          We made our WAR read-only and, so far, it works: dm Server doesn't (can't) undeploy it. We'll try to find out which process deletes our WAR, and if it is dm Server itself, we'll let you know and maybe try to investigate it.

          Best regards

          Jacek Bilski



          • #6
            Hi,

            We found out two things:

            1. The process which deletes files in pickup is java.exe itself. So dm Server is to blame (I assume), since no other Java processes were running at that time.

            2. I looked more closely into my log files. Strangely, it seems that the uninstalling of bundles is caused by an exception thrown in my application, but those exceptions are logged much later than the undeployment. I have two examples: different exceptions, different applications (yet very similar, their architecture and connections are almost the same), same effect. One time the exception (java.lang.IllegalStateException: No WebApplicationContext found: no ContextLoaderListener registered?) is logged 14 seconds after the first undeployment; another time the exception (my custom exception) is logged 7 seconds after the undeployment.

            I would probably have to read a bit more about handling exceptions in OSGi services, but it definitely looks wrong for dm Server to undeploy bundles. In the second example, my custom exception thrown in the business tier was properly handled in the web tier.

            Maybe that would give you some clue as to what happens inside dm Server. Or maybe you already can tell me where to look further?

            Best regards

            Jacek Bilski



            • #7
              Isn't this a juicy problem? :-)

              The most likely reason dm Server would delete an artifact out of pickup is if the artifact was undeployed for some reason.

              Do you see any evidence in serviceability/logs/log.log that the application fails to start?

              You could put a breakpoint on PipelinedApplicationDeployed.deploy(URI location, DeploymentOptions deploymentOptions) and see if the catch block is driven. This class is in the com.springsource.kernel.deployer.core.internal package in the kernel deployer bundle.



              • #8
                Hi Glyn,

                Just in case: I grabbed the dm Server sources from git, but it seems to me that I'm missing something. How can I build dm Server myself? Where can I get spring-build? I got it from some other Spring sources, but I am still unable to build dm Server. Where can I get help?

                I suppose that, when we find this bug, we'll try to prepare a patch, but to do that we would need to build dm Server.

                Best regards

                Jacek Bilski



                • #9
                  Get spring-build by issuing git submodule update --init from the root directory of the cloned git repo. Then cd into the build-xxx directory and, typically, issue "ant clean clean-integration test". I can provide more details next week, including how to "ripple" changes through the repos into a dm Server packaging build.



                  • #10
                    Actually, the simplest way to patch dm Server once you have rebuilt one of its bundles is to copy the updated bundle into all the places in the dm Server directory structure where the original bundle occurred, keeping the same file name as the original. Please come back if this is unclear.
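
The copy step described above could be scripted roughly as follows. This is only a sketch under assumptions: the class name BundlePatchSketch, the method replaceAll and its parameters are made up for illustration, and it assumes the rebuilt bundle lives outside the server directory. It overwrites every file under the server root that has the original bundle's file name, keeping that name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, not part of dm Server. */
final class BundlePatchSketch {

    /**
     * Overwrites every regular file under serverRoot whose name equals
     * originalFileName with the contents of rebuiltBundle.
     * Returns the paths that were replaced.
     */
    static List<Path> replaceAll(Path serverRoot, String originalFileName, Path rebuiltBundle)
            throws IOException {
        List<Path> targets;
        // Collect the targets first so the tree is not modified while it is being walked.
        try (Stream<Path> walk = Files.walk(serverRoot)) {
            targets = walk
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().equals(originalFileName))
                .collect(Collectors.toList());
        }
        for (Path target : targets) {
            // The target keeps the original file name; only the contents change.
            Files.copy(rebuiltBundle, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return targets;
    }
}
```

It is probably wise to stop the server and back up the original bundles before running anything like this.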

                    (I'll document rippling in due course, but it currently depends on having write access to certain Amazon S3 storage and this is something we'll be addressing in Virgo. So rippling is not something you could do right now. There is a manual version of rippling involving running the update dependencies script manually, but it's pretty laborious.)



                    • #11
                      Hi Glyn,

                      We got this far:

                      In class com.springsource.util.io.FileSystemChecker there are two interesting methods: listCurrentDirFiles and check. The second one scans the pickup directory every second to check if there are any changes. It uses listCurrentDirFiles to get all files in the pickup directory, which is done by calling the standard Java method java.io.File.listFiles(FilenameFilter). What we found out is that once in a while this method returns null, even though there are files in pickup. The Java documentation for the listFiles method says:

                      Returns null if this abstract pathname does not denote a directory, or if an I/O error occurs.
                      We dug a bit in the Java sources, but could not see anything interesting. Searching the web reveals that some users had similar problems with listing files in directories. One thread is especially interesting: http://72.5.124.102/thread.jspa?messageID=2455691. We'll try to go that way and increase the number of open file descriptors available.
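
To illustrate the failure mode described above, here is a minimal Java sketch of a polling checker. It assumes that a null result from listFiles ends up being treated like an empty directory, which would make every file look deleted. The names PickupScanSketch, listNames and deletedSince are hypothetical; this is not the real FileSystemChecker code. The sketch simply skips a scan whose listing failed, so no deletions are reported for files that may still be there.

```java
import java.io.File;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical polling checker; not the actual FileSystemChecker from dm Server. */
final class PickupScanSketch {

    /** File names seen in the last successful scan. */
    private Set<String> known = new TreeSet<>();

    /** Lists file names in dir, or returns null when the listing itself failed. */
    static Set<String> listNames(File dir) {
        File[] files = dir.listFiles(); // null means "not a directory" or "I/O error"
        if (files == null) {
            return null;
        }
        Set<String> names = new TreeSet<>();
        for (File f : files) {
            names.add(f.getName());
        }
        return names;
    }

    /** Returns the names that disappeared since the previous successful scan. */
    Set<String> deletedSince(Set<String> current) {
        if (current == null) {
            // Failed listing: keep the old snapshot and report nothing as deleted.
            return Collections.emptySet();
        }
        Set<String> deleted = new TreeSet<>(known);
        deleted.removeAll(current);
        known = current;
        return deleted;
    }
}
```

A caller would pass listNames(pickupDir) to deletedSince on each poll. On Java 7 or later, java.nio.file.Files.newDirectoryStream is an alternative that reports a failed listing as an IOException instead of null, which at least exposes the underlying error.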

                      If you have any other ideas, we're open.

                      Best regards

                      Jacek Bilski



                      • #12
                        Hi Jacek

                        That's a great piece of detective work. Well done!

                        I suppose it's possible that dm Server is leaking some type of file reader by failing to close it. You could wait until the problem manifests itself and then trigger a heap dump and use a heap analysis tool to see if there are a large number of live objects corresponding to readers, input streams, etc.

                        I guess the rate of such a leak would be affected by the number of files that dm Server processes. Do you have a particularly large number of artifacts either in pickup or in repository/usr (which is a watched repository and therefore scanned periodically)? Bit of a long shot, but worth asking.

                        Please come back when you need more ideas. If this could be captured in a testcase, then we could look at it.

                        Regards,
                        Glyn



                        • #13
                          Hello Glyn,

                          I'm trying to resolve the mysterious null problem together with Jacek. Unfortunately, I have to report that we are running out of ideas...

                          We have tried to increase the number of file handles, but it didn't help.
                          We've also searched the code for any input/output streams left open; none found.
                          We've added a double check on null, but the problem still appears.

                          One thing keeps bothering me: I'm testing our application on dm Server on MS Virtual PC, and the error occurs about 1 time in 20, but Jacek has Sun VirtualBox and the undeployment happens 19 times out of 20.

                          The number of artifacts you asked about, it's not that big I think (9 in pickup and 45 in repository/usr).

                          I wish we could capture this in a testcase, but as the reason remains unknown it is nearly impossible.

                          Best regards
                          Karolina Rusin



                          • #14
                            I had a careful trawl through the kernel artifact handling logic, which is the most intensive area of file usage in dm Server, and found that the ServiceScoper class contains two loops which get input streams but do not close them. This is clearly a bug which needs to be fixed, so I have raised DMS-2470 and will fix this in the 2.0.2 release.
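
For readers unfamiliar with this kind of leak, here is a minimal Java sketch of a loop that opens a stream per iteration and closes it before moving on. It is not the ServiceScoper code or the DMS-2470 fix; StreamLoopSketch, StreamSource and countBytes are hypothetical names, and try-with-resources needs Java 7 or later (on older JVMs a try/finally with an explicit close() does the same job).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.List;

/** Hypothetical example of closing streams inside a loop. */
final class StreamLoopSketch {

    /** Stands in for "open this XML file"; an assumption for illustration. */
    interface StreamSource {
        InputStream open() throws IOException;
    }

    /** Reads every source fully and returns the total number of bytes read. */
    static long countBytes(List<StreamSource> sources) throws IOException {
        long total = 0;
        byte[] buffer = new byte[4096];
        for (StreamSource source : sources) {
            // The stream is closed at the end of each iteration,
            // even if read() throws, so no handle outlives its loop pass.
            try (InputStream in = source.open()) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    total += n;
                }
            }
        }
        return total;
    }
}
```

Without the try block, each iteration would leave one open stream behind until the garbage collector happens to finalize it, which is how the number of open handles can grow with the number of files processed.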

                            The number of iterations of these loops depends on the number of XML files present in META-INF/spring directories of bundles of scoped plans and PARs. I wonder if you have a larger than typical number of these XML files?

                            Or are you deploying some scoped plans or PARs multiple times before the problem occurs?

                            Anyway, would you be willing to try out a fix in the form of a patched bundle, as I could produce that sooner than the 2.0.2 release? It's somewhat of a shot in the dark, but it certainly closes a possible source of the problem you are seeing.



                            • #15
                              Hi Glyn,

                              Of course we are willing to try.

                              Best regards
                              Karolina Rusin
