Just listen to Alex

July 22, 2010

Hibernate data import, big sessions and slow flushing

Filed under: programming - bosmeeuw @ 8:03 pm

Have you ever done something like this with JPA/Hibernate?

for(Map<String,String> line : someCsvDataSource) {
	em.persist(createSomeDataObjectFromLine(line));
	em.flush();
}

If you’re creating more than a few hundred lines that way, you’ll soon see a drastic slowdown, with Hibernate eventually taking more than a second per line to do its work. The reason is that the Hibernate session (the persistence context) keeps growing as you persist more objects, and every em.flush() has to check all objects in the session for changes, not just the one you added last. Each flush therefore gets slower as the import goes on.
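To get a feeling for how fast this adds up, here is a toy model of the effect. It is only a sketch: ToySession is a made-up class, not Hibernate's real implementation, and it assumes that a flush inspects every object the session currently tracks.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a persistence context. Hypothetical, NOT Hibernate code.
class ToySession {
    private final List<Object> managed = new ArrayList<>();
    private long checks = 0;

    // Start tracking a new object.
    void persist(Object entity) {
        managed.add(entity);
    }

    // Assumption: a flush looks at every tracked object, changed or not.
    void flush() {
        checks += managed.size();
    }

    long checks() {
        return checks;
    }

    // Persists `rows` objects with a flush after each one and returns
    // the total number of object checks performed by all flushes.
    static long totalChecks(int rows) {
        ToySession session = new ToySession();
        for (int i = 0; i < rows; i++) {
            session.persist(new Object());
            session.flush();
        }
        return session.checks();
    }

    public static void main(String[] args) {
        System.out.println("1000 rows:  " + totalChecks(1_000));
        System.out.println("10000 rows: " + totalChecks(10_000));
    }
}
```

In this model 1,000 rows cost 500,500 checks and 10,000 rows cost 50,005,000: ten times the data, roughly a hundred times the work.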

The fix for this is clearing your session every once in a while, like this:

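int index = 0;

for(Map<String,String> line : someCsvDataSource) {
	em.persist(createSomeDataObjectFromLine(line));
	// write pending changes to the database before anything gets cleared
	em.flush();
	
	index++;
	
	// every 100 lines, empty the session so flushing stays fast
	if(index % 100 == 0) {
		em.clear();
	}
}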

Of course, this will “detach” any persistent objects you might have kept a reference to. Also note that you absolutely need to flush manually before you clear your persistence context; otherwise any changes that haven’t been flushed yet are simply lost.
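The flush-before-clear rule is easy to get wrong, so here is a small sketch that shows what happens either way. ToyContext and BatchImport are hypothetical names of mine that only mimic the persist/flush/clear part of an EntityManager, so the example runs without a database. Note that this variant flushes once per batch instead of after every line; that is a variation on the loop above, not something Hibernate requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory stand-in for persist/flush/clear. Not a real JPA class.
class ToyContext {
    private final List<String> pending = new ArrayList<>();  // persisted, not flushed yet
    private final List<String> database = new ArrayList<>(); // rows actually written

    void persist(String row) {
        pending.add(row);
    }

    // Writes all pending rows to the "database".
    void flush() {
        database.addAll(pending);
        pending.clear();
    }

    // Forgets everything the context tracks, including unflushed rows.
    void clear() {
        pending.clear();
    }

    int rowsWritten() {
        return database.size();
    }
}

class BatchImport {
    // Persists all rows and clears the context once per batch.
    // Returns how many rows ended up in the "database".
    static int importAll(List<String> rows, int batchSize, boolean flushBeforeClear) {
        ToyContext ctx = new ToyContext();
        int index = 0;
        for (String row : rows) {
            ctx.persist(row);
            index++;
            if (index % batchSize == 0) {
                if (flushBeforeClear) {
                    ctx.flush();
                }
                ctx.clear();
            }
        }
        ctx.flush(); // write the last, partial batch
        return ctx.rowsWritten();
    }

    public static void main(String[] args) {
        List<String> rows = Collections.nCopies(250, "some line");
        System.out.println("flush then clear: " + importAll(rows, 100, true));
        System.out.println("clear only:       " + importAll(rows, 100, false));
    }
}
```

With 250 lines and a batch size of 100, the flush-then-clear run writes all 250 rows, while the clear-only run keeps just the last 50: the two full batches were thrown away before they ever reached the database.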

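By the way: don’t use FlushMode.AUTO, it’s way too slow for practical use. With AUTO, Hibernate may flush the session on its own before executing a query, and with a big session every one of those flushes is expensive. You’ll be mad at yourself for having relied on it when you need to get rid of it to save your server’s performance.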
