
  • Fixed length File, different record types

    Hi guys,

    I'm writing to a flat file which is fixed length and contains 4 record types.

    There are 4 record types, each with fixed length column definition.
    The total record length is the same for all 4 types, the smaller record type uses a filler at the end to get to the total length.
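To illustrate the filler idea, here is a minimal sketch. The total record length of 40 characters, the space filler and the sample field values are all assumptions, since the real layout is not given in this thread.

```java
// Hypothetical helper: pads a record so that every line has the same length.
final class FixedLengthRecords {

    // Assumed total record length; the real value depends on the file specification.
    static final int RECORD_LENGTH = 40;

    private FixedLengthRecords() {
    }

    // Appends space filler on the right until the record reaches RECORD_LENGTH.
    // Records that are already too long are rejected rather than truncated.
    static String padToRecordLength(String record) {
        if (record.length() > RECORD_LENGTH) {
            throw new IllegalArgumentException(
                    "Record is longer than " + RECORD_LENGTH + " characters: " + record);
        }
        StringBuilder line = new StringBuilder(record);
        while (line.length() < RECORD_LENGTH) {
            line.append(' ');
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Made-up sample content for a short header and a longer detail record.
        String header = padToRecordLength("1PROVINCE01");
        String detail = padToRecordLength("3CANDIDATE0001MATHS80ENGLISH75");
        System.out.println("[" + header + "] length=" + header.length());
        System.out.println("[" + detail + "] length=" + detail.length());
    }
}
```

Both lines come out at the same length, so the shorter record type simply carries more filler.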

    That is, I first write one record of type 1, which is a header record.
    After that I need to write multiple type 2 and type 3 records according to the data retrieved.
    Lastly I need to write 1 line of record type 4, which is a summary of the records written to file.

    Example File:

    RecordType1 (Header)
    RecordType2 (For each type 2 there are multiple type 3)
    RecordType3
    RecordType3
    RecordType3
    RecordType2
    RecordType3
    RecordType2
    RecordType3
    RecordType3
    RecordType4 (Footer)

    My question:

    Can I write these different record types to the same file in the same writer, or should I have a separate writer for each record type? Problem with multiple writers is the CompositeWriter will call each writer for every item, not really what is needed here.

    My gut feeling is to have 1 writer with the necessary intelligence to differentiate between record types and to know when to write which.

    Any opinions would be greatly appreciated

    Jaen

  • #2
    If I understand correctly, it is not a composite writer you need but a composite LineAggregator? There is a similar pattern in the ClassifierCompositeItemWriter (added since RC1), so have a look at that, and if a LineAggregator along the same lines would help, then let us know.



    • #3
      Dave,

      Thanks for the help!

      It seems that I am keeping you busy... :-)

      Do I have to pull that from SVN, or would it be in some kind of release or snapshot since RC1?



      • #4
        There are nightly snapshots if you don't want to mess around with SVN and building it yourself. The easiest way to get one is through Maven, or you can browse the repo (http://s3browse.com/explore/reposito...amework/batch/ or follow links from the Downloads page on the Batch website).



        • #5
          The inverse of PrefixMatchingCompositeLineTokenizer would have worked, but I do not see such an implementation.

          Will have a look at the ClassifierCompositeItemWriter as soon as I have pulled it through our slow South African internet connection :-)



          • #6
            Something like this:

            package com.mypackage.batch;

            import org.springframework.batch.item.file.transform.LineAggregator;
            import org.springframework.batch.support.Classifier;
            import org.springframework.batch.support.ClassifierSupport;

            /**
             * Delegates to the LineAggregator that the classifier selects for each item.
             */
            public class ClassifierLineAggregator<T> implements LineAggregator<T> {

                // The default classifier returns null for every item, so a real one must be set before use.
                private Classifier<T, LineAggregator<? super T>> classifier = new ClassifierSupport<T, LineAggregator<? super T>>(null);

                /**
                 * @param classifier the classifier to set
                 */
                public void setClassifier(Classifier<T, LineAggregator<? super T>> classifier) {
                    this.classifier = classifier;
                }

                public String aggregate(T item) {
                    // Pick the aggregator for this item, then let it build the line.
                    LineAggregator<? super T> aggregator = classifier.classify(item);
                    return aggregator.aggregate(item);
                }

            }



            • #7
              Yes, that's what I was thinking of. Is that your use case? Anything else that would help to make it work better (e.g. special purpose classifiers, extending the ones we already provide)?



              • #8
                Well I'm still playing around to see if it will work and what else I need.

                Do you think I'm going down the right track with this?

                I want to create an enum for my record types to return the correct LineAggregator implementation. something like this:

                package com.magnafs.iecs.batch.certExtract.internal;

                import org.springframework.batch.item.file.transform.LineAggregator;

                public enum LineAggregatorEnum {

                    RECORDTYPE1(new RecordType1LineAggregator()),
                    RECORDTYPE2(new RecordType2LineAggregator()),
                    RECORDTYPE3(new RecordType3LineAggregator()),
                    RECORDTYPE4(new RecordType4LineAggregator());

                    private final LineAggregator lineAggregator;

                    // Constructor
                    LineAggregatorEnum(LineAggregator lineAggregator) {
                        // Assign to the field, not to the parameter of the same name.
                        this.lineAggregator = lineAggregator;
                    }

                    LineAggregator getLineAggregator() {
                        return lineAggregator;
                    }

                }

                One thing that worries me is that I am now creating the LineAggregators manually instead of letting Spring do it. How would I have Spring initialise this enum?

                Second, I wanna have a custom Classifier to return the LineAggregator from the enum within my ClassifierLineAggregator. But my problem is that I only have the 1 domain object coming from my Reader.

                How will I actually differentiate in my Classifier, or do I need to create different objects for the different record types?

                My domain object composition is such:

                I receive from my Reader 1 line which I break down into a Candidate object and a Set of Subjects on that Candidate via a custom Row Mapper.

                The Candidate Object is my domain object which will get passed.

                Problem is that my RecordType1 (Province info) information is also in Candidate, as well as my RecordType2 (School info) information. RecordType3 (Candidate info and all his subjects info) is also in the Candidate domain object, although the School info is obviously in the list of Subject objects on that Candidate.

                RecordType4 is a summary of all the records written to file, e.g. total number of RecordType2 and total number of RecordType3. I still haven't thought about how I'm gonna do this one, being a summary record or actual file footer.

                So I'm now a bit confused as to where I should start breaking into the different record types to write to file.

                For instance, let's say a Candidate object gets passed into a FlatFileWriter.
                I have no clue as to where the logic should sit that keeps track of the following:
                1. Which part of this domain object are we writing? Is it RecordType1, 2 or 3? All the info is within the same domain object.
                2. Should I manually loop through the subjects and call write for each one?
                3. Where do I keep my state, as in where do I keep track of the current school, so that I would know to write a new RecordType2 if the next Candidate has a different school to the previous one?
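One way to picture the state tracking asked about above is a plain control-break loop. The sketch below is not Spring Batch code. `SketchCandidate`, its fields, the `|` separator and the line prefixes are hypothetical stand-ins, and the input is assumed to arrive ordered by school.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for the Candidate domain object in this thread.
final class SketchCandidate {
    final String province;
    final String school;
    final String name;
    final List<String> subjects;

    SketchCandidate(String province, String school, String name, List<String> subjects) {
        this.province = province;
        this.school = school;
        this.name = name;
        this.subjects = subjects;
    }
}

// Builds the lines for the whole file, remembering the current school between candidates.
final class ControlBreakLines {
    private String currentSchool;   // state: school of the last type 2 line written
    private int schoolCount;        // number of type 2 lines, for the footer
    private int candidateCount;     // number of type 3 lines, for the footer
    private final List<String> lines = new ArrayList<String>();

    void add(SketchCandidate candidate) {
        if (lines.isEmpty()) {
            // Type 1 header, written once before anything else.
            lines.add("1|" + candidate.province);
        }
        if (!candidate.school.equals(currentSchool)) {
            // The school changed, so start a new type 2 line.
            currentSchool = candidate.school;
            schoolCount++;
            lines.add("2|" + candidate.school);
        }
        // Type 3: the candidate and all of his subjects on one line.
        candidateCount++;
        lines.add("3|" + candidate.name + "|" + String.join(",", candidate.subjects));
    }

    // Appends the type 4 summary line and returns everything collected so far.
    List<String> finish() {
        lines.add("4|" + schoolCount + "|" + candidateCount);
        return lines;
    }

    public static void main(String[] args) {
        ControlBreakLines builder = new ControlBreakLines();
        builder.add(new SketchCandidate("WC", "School A", "Ann", Arrays.asList("Maths", "English")));
        builder.add(new SketchCandidate("WC", "School A", "Ben", Arrays.asList("Maths")));
        builder.add(new SketchCandidate("WC", "School B", "Cat", Arrays.asList("History")));
        for (String line : builder.finish()) {
            System.out.println(line);
        }
    }
}
```

In this sketch the state (current school and the counters) lives in one object that sees every candidate in order. Where such an object would sit in a real job (writer, aggregator or processor) is exactly the open question of this post.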

                Sorry for the clutter, I have a way of complicating things for myself.



                • #9
                  import org.springframework.batch.item.file.transform.LineAggregator;
                  import org.springframework.batch.support.Classifier;

                  public class CandidateClassifier implements Classifier<Candidate, LineAggregator> {

                      public LineAggregator classify(Candidate candidate) {
                          // Here I need to decide which LineAggregator to return, but I only receive the Candidate domain object.
                          // Should I rather receive an Object type in the param and do an instanceof to see which aggregator to return?
                          return LineAggregatorEnum.RECORDTYPE1.getLineAggregator();
                      }

                  }



                  • #10
                    I think I misunderstood your initial description of your use case: I don't think a classifier helps (or not directly anyway). What you have to do is convert one item (Candidate) into many lines in an output file? There's a sample a bit like this called multilineOrderJob; it uses a composite processor to convert a single item to a collection and then a RecursiveCollectionLineAggregator. Something more along those lines seems more appropriate in your case?



                    • #11
                      I think my biggest concern is this:

                      My Job will only ever run for 1 province, so there will always be only 1 RecordType1.

                      But there will always be 1 or more schools (RecordType2) and definitely more than 1 candidate per school, with his subjects.

                      The use case as we discussed will work if I separate my incoming resultset row into 4 different domain object types, i.e. Province, School, Candidate and Subject.

                      Then I can pass each object type and the ClassifierLineAggregator would know which aggregator to use in order to write the record out to file because it will classify on domain object type.
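A rough sketch of what classifying on the domain object type could look like. To keep it runnable without the framework it uses a hypothetical `LineFormatter` interface in place of Spring Batch's `LineAggregator`, and `ProvinceStub` and `SchoolStub` are illustrative stubs, not the real domain classes.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a line aggregator: turns one item into one line.
interface LineFormatter<T> {
    String format(T item);
}

// Illustrative stubs for two of the four domain object types.
final class ProvinceStub {
    final String code;

    ProvinceStub(String code) {
        this.code = code;
    }
}

final class SchoolStub {
    final String name;

    SchoolStub(String name) {
        this.name = name;
    }
}

// Chooses a formatter by the exact runtime class of the item (subclasses are not matched).
final class TypeBasedFormatter implements LineFormatter<Object> {
    private final Map<Class<?>, LineFormatter<Object>> formatters =
            new HashMap<Class<?>, LineFormatter<Object>>();

    <T> void register(final Class<T> type, final LineFormatter<T> formatter) {
        // Wrap the typed formatter so it can be stored and called with a plain Object.
        formatters.put(type, new LineFormatter<Object>() {
            @Override
            public String format(Object item) {
                return formatter.format(type.cast(item));
            }
        });
    }

    @Override
    public String format(Object item) {
        LineFormatter<Object> formatter = formatters.get(item.getClass());
        if (formatter == null) {
            throw new IllegalArgumentException(
                    "No formatter registered for " + item.getClass().getName());
        }
        return formatter.format(item);
    }

    public static void main(String[] args) {
        TypeBasedFormatter formatter = new TypeBasedFormatter();
        formatter.register(ProvinceStub.class, p -> "1|" + p.code);
        formatter.register(SchoolStub.class, s -> "2|" + s.name);
        System.out.println(formatter.format(new ProvinceStub("WC")));
        System.out.println(formatter.format(new SchoolStub("School A")));
    }
}
```

The same idea carries over to a `Classifier` that returns a `LineAggregator` per type, which is what the ClassifierLineAggregator earlier in the thread would need.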

                      Currently I only build 2 domain object types in my RowMapper, that is Candidate and his Subjects. The reason I am not creating the Province and School domain object type is because I'm not sure how to keep track of them across multiple calls to mapRow in the rowMapper.

                      That is, for each row in the resultset there will be only 1 candidate and his subjects, but the province info and school info for the same school are returned on multiple rows.

                      So my question:

                      Let's say my resultset from the DB contains 10 rows.
                      All 10 rows contain info for the same Province.

                      The first 5 rows are for 1 school and the last 5 for another.

                      How do I construct my School domain object? On the first call to mapRow I will create
                      1 Province object, 1 School object, 1 Candidate object and multiple Subject objects and link them to each other via simple relationships.

                      On the 2nd call to mapRow, how do I now check that this Province and School actually already exist? I only want to create a new School object if it differs from the one created in the previous call to mapRow.

                      Can I keep an instance variable in my custom row mapper? Is that clean?
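On the instance-variable question, here is a sketch of a mapper that remembers the last school it created. It works on a plain `String[]` row with hypothetical classes rather than Spring's `RowMapper` and `ResultSet`, and it assumes that rows arrive ordered by school and that one mapper instance is used by a single thread.

```java
// Hypothetical School object; only the name matters for this sketch.
final class SchoolRef {
    final String name;

    SchoolRef(String name) {
        this.name = name;
    }
}

// Hypothetical Candidate that links back to its School.
final class MappedCandidate {
    final SchoolRef school;
    final String name;

    MappedCandidate(SchoolRef school, String name) {
        this.school = school;
        this.name = name;
    }
}

// Keeps the previously created School in an instance variable between calls.
final class StatefulCandidateMapper {
    private SchoolRef lastSchool;

    // Assumed row layout: row[0] = school name, row[1] = candidate name.
    MappedCandidate map(String[] row) {
        if (lastSchool == null || !lastSchool.name.equals(row[0])) {
            // First row, or the school changed: create a new School object.
            lastSchool = new SchoolRef(row[0]);
        }
        // Otherwise the School created for the previous row is reused.
        return new MappedCandidate(lastSchool, row[1]);
    }

    public static void main(String[] args) {
        StatefulCandidateMapper mapper = new StatefulCandidateMapper();
        MappedCandidate ann = mapper.map(new String[] {"School A", "Ann"});
        MappedCandidate ben = mapper.map(new String[] {"School A", "Ben"});
        MappedCandidate cat = mapper.map(new String[] {"School B", "Cat"});
        System.out.println("Ann and Ben share a School object: " + (ann.school == ben.school));
        System.out.println("Ben and Cat share a School object: " + (ben.school == cat.school));
    }
}
```

The trade-off is that the mapper is no longer stateless: it must not be shared between threads, and the remembered school has to be cleared if the same instance is reused for another run.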

                      I will look into the multilineOrderJob sample again, but last time I had a look it didn't help me out with my province and school info which actually sits in my Candidate domain model?

                      Any further ideas? If you would like a complete description of my use case, please say so and I will post it.



                      • #12
                        You see, my biggest problem is not Candidate, because all the Candidate info along with all his Subjects are written to the same line in the File, this is RecordType3.

                        My concern is the RecordType1 (Province info) and RecordType2 (School info), because I am firstly struggling to model them from my RowMapper.

                        Maybe I should extend FlatFileItemWriter, pass in the Candidate object and just do the logic manually in my writer, calling the write method on another writer (a custom writer for each RecordType) according to my logic?



                        • #13
                          Just remembered why I did not follow the multilineOrderJob use case.

                          In my first step I have the ItemReader and a CompositeItemWriter consisting of 5 other writers. 1 of them is the FlatFileItemWriter.

                          If I use a processor as in the sample, the following happens:

                          In the process method an Order item is received and a List<String> collection is returned.
                          That is, we have now lost our original domain object.

                          This is fine if you only have 1 Writer which expects it like that, but in my case the other Writers are still expecting the Domain Object and not the collection as returned by the processor?
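To make the problem concrete: once the processor returns only a `List<String>`, the other writers have nothing to work with. One conceivable way around it, which nobody in this thread has suggested and which is sketched here purely as an assumption, is to return a small wrapper that carries both the original item and the lines derived from it. `ItemWithLines` is a hypothetical class, not part of Spring Batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical wrapper: keeps the original item next to the lines derived from it.
final class ItemWithLines<T> {
    private final T item;
    private final List<String> lines;

    ItemWithLines(T item, List<String> lines) {
        this.item = item;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.lines = Collections.unmodifiableList(new ArrayList<String>(lines));
    }

    // For writers that still need the domain object.
    T getItem() {
        return item;
    }

    // For the flat file writer, which only needs the lines.
    List<String> getLines() {
        return lines;
    }

    public static void main(String[] args) {
        // A String stands in for the real domain object here.
        ItemWithLines<String> wrapped =
                new ItemWithLines<String>("candidate-0001", Arrays.asList("2|School A", "3|Ann|Maths"));
        System.out.println("item:  " + wrapped.getItem());
        System.out.println("lines: " + wrapped.getLines());
    }
}
```

Each of the 5 writers would then have to unwrap what it needs, so this only moves the work around. Whether that is acceptable depends on how those writers are built.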
