Problem with JpaPagingItemReader and pageSize
  • jprio
    started a topic Problem with JpaPagingItemReader and pageSize

    Problem with JpaPagingItemReader and pageSize

    We have a job configured like this :

    Batch configuration :
    Code:
        <batch:job id="minimal" job-repository="jobRepository"
            restartable="true">
    
            <batch:step id="step1">
                <batch:tasklet>
                    <batch:chunk reader="jpaPersonReader" writer="addAddressToPersonWriter" commit-interval="4" />
                </batch:tasklet>
            </batch:step>
        </batch:job>
    
        <bean id="jpaPersonReader"
            class="org.springframework.batch.item.database.JpaPagingItemReader">
            <property name="entityManagerFactory" ref="entityManagerFactory" />
            <property name="queryString" value="select p from Person p order by p.id" />
            <property name="pageSize" value="4" />
        </bean>
        
        <bean id="addAddressToPersonWriter" class="org.jp.spring_batch_labs.AddAddressToPersonWriter" />
    AddAddressToPersonWriter class :
    Code:
    public class AddAddressToPersonWriter implements ItemWriter<Person> {
    
        private static final Logger LOG = Logger.getLogger(AddAddressToPersonWriter.class);
    
        @Override
        public void write(List<? extends Person> items) throws Exception {
            LOG.info("updating " + items.size() + " persons.");
            for (Person person : items) {
                Address address = new Address();
                address.setCp(Long.toString(System.currentTimeMillis()));
                person.addAddress(address);
            }
        }
    }
    We launch the batch from a test that initializes 15 persons in the @BeforeClass method.
    All 15 persons are read by JpaPagingItemReader and modified by our AddAddressToPersonWriter, but at the end we see 12 addresses in the database instead of 15.

    Our commit-interval is 4, so the 12 committed modifications = 3 pages * 4 items per page, and the last 3 records are not modified.

    Output logs :
    Code:
    --  INFO org.springframework.batch.core.launch.support.SimpleJobLauncher - Job: [FlowJob: [name=minimal]] launched with the following parameters: [{param1=1}]
    --  INFO org.springframework.batch.core.job.SimpleStepHandler - Executing step: [TaskletStep: [name=step1]]
    -- DEBUG org.springframework.batch.item.database.JpaPagingItemReader - Reading page 0
    --  INFO org.jp.spring_batch_labs.AddAddressToPersonWriter - updating 4 persons.
    -- DEBUG org.springframework.batch.item.database.JpaPagingItemReader - Reading page 1
    --  INFO org.jp.spring_batch_labs.AddAddressToPersonWriter - updating 4 persons.
    -- DEBUG org.springframework.batch.item.database.JpaPagingItemReader - Reading page 2
    --  INFO org.jp.spring_batch_labs.AddAddressToPersonWriter - updating 4 persons.
    -- DEBUG org.springframework.batch.item.database.JpaPagingItemReader - Reading page 3
    --  INFO org.jp.spring_batch_labs.AddAddressToPersonWriter - updating 3 persons.
    --  INFO org.springframework.batch.core.launch.support.SimpleJobLauncher - Job: [FlowJob: [name=minimal]] completed with the following parameters: [{param1=1}] and the following status: [COMPLETED]
    --  INFO org.jp.spring_batch_labs.SimpleTestJpa - 12 adresses in the db

    We noticed that JpaPagingItemReader has changed since version 1.1.3: the flush was moved from the end of the doReadPage method to its beginning. If we write our own JpaPagingItemReader with a flush at the end of doReadPage, the problem goes away.

    So:
    - why was the flush moved from the end to the beginning of the method?
    - is there any risk in putting it back at the end?
    - is there a Jira issue that references this problem, and will it be fixed soon?

    Thank you for your answers,

    JP
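
The arithmetic in this post (12 = 3 pages * 4 items) can be reproduced with a tiny plain-Java model. This is only a sketch of one reading of the symptom, not the actual Spring Batch code. It assumes that changes made to one page only reach the database when the flush at the start of the next page read runs, that the writer runs once per page (commit-interval equals pageSize), and that no further page read follows a short last page. The class and method names are made up for illustration.

```java
// Hypothetical model, not Spring Batch code: counts how many modified items
// get flushed if the only flush happens at the START of each page read.
class FlushAtStartSketch {

    static int flushedItems(int totalItems, int pageSize) {
        int flushed = 0;   // modifications that reached the database
        int pending = 0;   // modifications still waiting for a flush
        int read = 0;      // items read so far
        boolean morePages = true;

        while (morePages) {
            // Start of a page read: flush whatever the writer changed before.
            flushed += pending;
            pending = 0;

            // Read the page itself.
            int pageItems = Math.min(pageSize, totalItems - read);
            read += pageItems;

            // The writer modifies every item of the page (assumption:
            // commit-interval == pageSize, so one write per page).
            pending = pageItems;

            // Assumption: a short page means no further page read happens.
            morePages = pageItems == pageSize;
        }
        return flushed;
    }

    public static void main(String[] args) {
        System.out.println(flushedItems(15, 4)); // prints 12
    }
}
```

Under these assumptions the 3 items of the last page stay pending forever, which matches the 12 addresses reported above.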

  • traduz
    replied
    No problem.
    Originally posted by jprio
    Couldn't wait until tomorrow.
    I have my JpaPagingItemReader (pageSize = 8) and 15 persons in my db. I have a processor which adds 2 addresses per person. I use a JpaItemWriter and a commit interval of 8. At the end of my test, I have ... 46 addresses (???) in my db.
    Assuming everything you said is correct, it looks like the first page (8 persons, 16 addresses) is saved by the JpaItemWriter (correct), but when control returns to the ItemReader those addresses are saved again, so the first page produces 32 records. The next page (7 persons, 14 addresses) is saved once by the ItemWriter, giving us 46 records.
    JpaItemWriter already does a merge + flush, so the entities should be in sync and the reader shouldn't update them. So weird.
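
The hypothesis in this reply can be checked against the numbers reported in the thread with a few lines of plain Java. This is a sketch of the arithmetic only. It assumes that every page except the last one is saved twice (once by the writer, once more when the reader loads the next page) and that the last page is saved once. The class and method names are hypothetical.

```java
// Hypothetical arithmetic check, not Spring Batch code.
class DoubleSaveSketch {

    static int expectedAddresses(int persons, int pageSize, int addressesPerPerson) {
        int total = 0;
        for (int start = 0; start < persons; start += pageSize) {
            int pageItems = Math.min(pageSize, persons - start);
            boolean lastPage = start + pageSize >= persons;
            // Assumption: every page but the last has its addresses saved twice.
            int saves = lastPage ? 1 : 2;
            total += pageItems * addressesPerPerson * saves;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(expectedAddresses(15, 8, 2));  // prints 46
        System.out.println(expectedAddresses(15, 4, 2));  // prints 54
        System.out.println(expectedAddresses(15, 16, 2)); // prints 30
    }
}
```

The same assumption gives 46 for pageSize 8, 54 for pageSize 4, and the expected 30 when a single page holds all 15 persons, which agrees with all three observations jprio reported. That supports the double-save reading but does not prove where the second save comes from.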

  • jprio
    replied
    Sorry about the 'code tag'.
    I forgot to answer about the addresses: yes. The db is empty and is only filled with the dataset above (only 'persons', without addresses).

  • traduz
    replied
    Thanks. Next time use the code tag so it will be easier to read, my friend. I'll take a look at it.
    Also, you didn't answer: are you clearing the addresses before testing? That way we'll know how many updates we had.

  • jprio
    replied
    The relevant parts:

        <batch:job id="minimal" job-repository="jobRepository"
            restartable="true">

            <batch:step id="step1">
                <batch:tasklet>
                    <batch:chunk reader="jpaPersonReader" processor="addAdressToPersonProcessor"
                        writer="jpaPersonWriter" commit-interval="4" />
                </batch:tasklet>
            </batch:step>
        </batch:job>

        <bean id="addAdressToPersonProcessor" class="org.jp.spring_batch_labs.AddAddressToPersonProcessor" />

        <bean id="jpaPersonReader"
            class="org.springframework.batch.item.database.JpaPagingItemReader">
            <property name="entityManagerFactory" ref="entityManagerFactory" />
            <property name="queryString" value="select p from Person p" />
            <property name="pageSize" value="4" />
        </bean>

        <bean id="jpaPersonWriter" class="org.springframework.batch.item.database.JpaItemWriter">
            <property name="entityManagerFactory" ref="entityManagerFactory" />
        </bean>

    The processor:

        public class AddAddressToPersonProcessor implements ItemProcessor<Person, Person> {
            private static final Logger LOG = Logger
                    .getLogger(AddAddressToPersonProcessor.class);

            public Person process(Person person) throws Exception {
                LOG.info("processing : " + person);
                Address address = new Address();
                address.setCp("92160");
                address.setStreet("street1");
                person.addAddress(address);

                Address address2 = new Address();
                address2.setCp("92160");
                address2.setStreet("street2");
                person.addAddress(address2);
                return person;
            }
        }


    My unit test:

        public class SimpleTestJpa {

            private static final Logger LOG = Logger.getLogger(SimpleTestJpa.class);
            static ApplicationContext ac = new ClassPathXmlApplicationContext(
                    new String[] { "spring-test-jpa.xml" });

            @BeforeClass
            public static void beforeClass() {
                // Create the schema for Spring Batch
                SimpleJdbcTemplate jdbcTemplate = (SimpleJdbcTemplate) ac
                        .getBean("jdbcTemplate");
                Resource resource = new ClassPathResource(
                        "/create_drop_spring_batch.sql");
                SimpleJdbcTestUtils.executeSqlScript(jdbcTemplate, resource, true);

                // dbunit: populate the tables
                DataSource dataSource = (DataSource) ac.getBean("dataSource");
                Connection con = DataSourceUtils.getConnection(dataSource);
                IDatabaseConnection dbUnitCon = new DatabaseConnection(con);

                try {
                    IDataSet dataSet = new FlatXmlDataSet(new FileInputStream(
                            "./src/test/resources/persons_dbunit.xml"));
                    DatabaseOperation.REFRESH.execute(dbUnitCon, dataSet);
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    DataSourceUtils.releaseConnection(con, dataSource);
                }
            }

            @Test
            public void simpleTest() throws Exception {
                Job job = (Job) ac.getBean("minimal");
                JobLauncher jobLauncher = (JobLauncher) ac.getBean("jobLauncher");
                // Actually launch the job
                JobExecution je = jobLauncher.run(job, new JobParametersBuilder()
                        .addString("param1", "1").toJobParameters());
            }

            @AfterClass
            public static void afterClass() {
                DataSource dataSource = (DataSource) ac.getBean("dataSource");
                int count = 0;
                try {
                    ResultSet rs = dataSource.getConnection().createStatement()
                            .executeQuery("select count(*) from Address");
                    rs.next();
                    count = rs.getInt(1);
                    rs.close();
                } catch (SQLException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
                LOG.info("Number of records : " + count);

            }
        }


    My dataset:

        <?xml version='1.0' encoding='UTF-8'?>
        <dataset>
            <PERSON ID='1' NAME="name1"/>
            <PERSON ID='2' NAME="name2"/>
            <PERSON ID='3' NAME="name3"/>
            <PERSON ID='4' NAME="name3"/>
            <PERSON ID='5' NAME="name3"/>
            <PERSON ID='6' NAME="name3"/>
            <PERSON ID='7' NAME="name3"/>
            <PERSON ID='8' NAME="name3"/>
            <PERSON ID='9' NAME="name3"/>
            <PERSON ID='10' NAME="name3"/>
            <PERSON ID='11' NAME="name3"/>
            <PERSON ID='12' NAME="name3"/>
            <PERSON ID='13' NAME="name3"/>
            <PERSON ID='14' NAME="name3"/>
            <PERSON ID='15' NAME="name3"/>
        </dataset>

    Person:

        @Entity
        public class Person {
            @Id
            @GeneratedValue(strategy = GenerationType.AUTO)
            private Long id;
            private String name;
            @OneToMany(cascade = CascadeType.ALL)
            @JoinTable(name = "PERSON_ADDRESS", joinColumns = { @JoinColumn(name = "PERSON_ID") }, inverseJoinColumns = { @JoinColumn(name = "ADDRESS_ID") })
            private Set<Address> addresses = new HashSet<Address>();
            ...+get/set


    Address:

        @Entity
        public class Address {
            @Id
            @GeneratedValue(strategy = GenerationType.AUTO)
            private Long id;
            private String cp;
            private String street;
            ...+get/set

  • traduz
    replied
    Do you mind posting the code and config you used for JpaItemWriter, please?
    There's a Jira issue about why the flush/clear was moved; it's pretty old: https://jira.springsource.org/browse/BATCH-1110
    And are you sure about those numbers? Are you clearing the addresses before testing? That way we'll know how many updates we had.
    Last edited by traduz; Jul 26th, 2012, 10:45 PM.

  • jprio
    replied
    Couldn't wait until tomorrow.
    I have my JpaPagingItemReader (pageSize = 8) and 15 persons in my db. I have a processor which adds 2 addresses per person. I use a JpaItemWriter and a commit interval of 8. At the end of my test, I have ... 46 addresses (???) in my db.
    If I change the commit interval to 4 and pageSize to 4, I have 54 addresses in my db.
    It works only if pageSize > total number of records in the db.

  • jprio
    replied
    Yep, thanks. I'll try tomorrow (processor + JpaItemWriter) and let you know.

  • traduz
    replied
    Use JpaItemWriter.

  • jprio
    replied
    Hmm, maybe. I'm not sure. But anyway, to do this I need access to the em instantiated by Spring (from the emf) for this transaction (i.e. this page), and I don't know how I could do that (AFAIK there is no way to get the "current em" with JPA).

  • traduz
    replied
    Yes, you are right, but don't you agree that the writer should have the responsibility to update and save those changes?
    So flush it in the writer.

  • jprio
    replied
    Well, what I get in the writer is an object attached to an EntityManager (because it was read by the em). So if I modify this object, the modification should be synchronized to the db on a flush, right? (And it is, for the first pages.)

  • traduz
    replied
    No, I mean: don't rely on the reader to update your data; that isn't the right way to do it. The writer should use an update method.
    That's what the documentation says: http://static.springsource.org/sprin...ndWriters.html
    6.2. ItemWriter

  • jprio
    replied
    Yes, that's what happens... except for the items of the last page.

  • traduz
    replied
    So the writer should update the data in db as well, don't you think? That's what the doc says.
