JPA Best Practices


Java Persistence API:
Best Practices

Carol McDonald, Java Architect

Agenda

Entity Manager

Persistence Context

Entities

Schema & Queries

Transaction

Persistence Context

EntityManager

persist(), remove(), refresh(), merge(), find(), createQuery(), createNamedQuery(), contains(), flush()

EntityManager
API for managing entities

set of entities managed by the Entity Manager

Servlets, EJBs, Java applications

Catalog Java EE Application

[Diagram: Registration Application — JSF Components → CatalogItem Managed Bean → Catalog Session Bean → Item Entity Class → DB]

EJB EntityManager Example

@Stateless
public class Catalog implements CatalogService {

    @PersistenceContext(unitName="PetCatalogPu")
    EntityManager em;

    @TransactionAttribute(NOT_SUPPORTED)
    public List getItems(int firstItem, int batchSize) {
        Query q = em.createQuery("select i from Item as i");
        q.setMaxResults(batchSize);
        q.setFirstResult(firstItem);
        List items = q.getResultList();
        return items;
    }
}

To obtain a container-managed EntityManager instance, inject the entity manager into the application component: @PersistenceContext EntityManager em;

You don't need to call any container-managed lifecycle methods.

With a container-managed entity manager, an EntityManager instance's persistence context is automatically propagated by the container to all application components that use the EntityManager instance within a single Java Transaction API (JTA) transaction.

JTA transactions usually involve calls across application components. To complete a JTA transaction, these components usually need access to a single persistence context. This occurs when an EntityManager is injected into the application components via the javax.persistence.PersistenceContext annotation. The persistence context is automatically propagated with the current JTA transaction, and EntityManager references that are mapped to the same persistence unit provide access to the persistence context within that transaction. By automatically propagating the persistence context, application components don't need to pass references to EntityManager instances to each other in order to make changes within a single transaction. The Java EE container manages the lifecycle of container-managed entity managers.

Catalog Spring JPA Application

[Diagram: Registration Application — JSF Components → ItemController Managed Bean → Catalog Spring Bean → Item Entity Class → DB]

Spring Framework

Spring with JPA

@Repository
@Transactional
public class CatalogDAO implements CatalogService {

    @PersistenceContext(unitName="PetCatalogPu")
    private EntityManager em;

    @Transactional(readOnly=true)
    public List getItems(int firstItem, int batchSize) {
        Query q = em.createQuery("select object(o) from Item as o");
        q.setMaxResults(batchSize);
        q.setFirstResult(firstItem);
        List items = q.getResultList();
        return items;
    }
}

Component Stereotype

Spring transactions use AOP


Container vs Application Managed

Container managed entity managers (EJB, Spring Bean, Seam component)

Injected into application

Automatically closed

JTA transaction propagated

Application managed entity managers

Used outside of the Java EE 5 platform

Need to be explicitly created

Persistence.createEntityManagerFactory()

RESOURCE_LOCAL transaction not propagated

Need to explicitly close entity manager
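A minimal sketch of the application-managed case, assuming the same "PetCatalogPu" persistence unit and Item entity used in the earlier examples (the RESOURCE_LOCAL transaction and explicit close are the point):

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class Main {
    public static void main(String[] args) {
        // Application-managed: you create and close the EntityManager yourself
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("PetCatalogPu");
        EntityManager em = emf.createEntityManager();
        try {
            em.getTransaction().begin();            // RESOURCE_LOCAL transaction, not propagated
            Item item = em.find(Item.class, 1);     // assumes an Item entity with a numeric id
            em.getTransaction().commit();
        } finally {
            em.close();                             // must be closed explicitly
            emf.close();
        }
    }
}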

Agenda

Entity Manager

Persistence Context

Entities

Queries

Transaction

Persistence Context

EntityManager

persist(), remove(), refresh(), merge(), find(), createQuery(), createNamedQuery(), contains(), flush()

Persistence Context

set of entities managed by the Entity Manager

Persistence context acts as a first level cache for entities

Two types of persistence context

Transaction scoped

Extended scoped persistence context

Level 1 and Level 2 caches

The terms Java Virtual Machine and JVM mean a Virtual Machine for the Java Platform. Source: http://weblogs.java.net/blog/guruwons/archive/2006/09/understanding_t.html

Persistence Context is a Level 1 cache

[Diagram: each transaction has its own Persistence Context (EntityManager); all entity managers for a specific PersistenceUnit on a given Java Virtual Machine (JVM) share the L2 Cache (Shared Cache).]

Persistence Context

[Entity lifecycle diagram: new() creates a New entity; persist() makes it Managed; updates to a Managed entity are tracked and written at transaction commit; remove() marks it Removed; merge() and find() yield Managed entities; when the persistence context ends, entities become Detached.]

Entity Lifecycle

Guaranteed scope of object identity: only one managed entity in the persistence context represents a given database row


Entity Lifecycle Illustrated: The Code

@Stateless
public class ShoppingCartBean implements ShoppingCart {

    @PersistenceContext
    EntityManager entityManager;

    public OrderLine createOrderLine(Product product, Order order) {
        OrderLine orderLine = new OrderLine(order, product);
        entityManager.persist(orderLine);
        return orderLine;
    }
}

New entity

Managed entity

Detached entity

Persistence context

Here is how you could do this with a stateless session bean. We inject a reference to a container-managed EntityManager using the @PersistenceContext annotation. We create a new order line, which has the state of new. We call persist(), making it a managed entity. Because this is a stateless session bean, it uses container-managed transactions by default; when the transaction commits, the order line is made persistent in the database. When we return the serializable OrderLine entity, it is a detached entity.

Scope of Identity

@Stateless
public class ShoppingCartBean implements ShoppingCart {

    @PersistenceContext
    EntityManager entityManager;

    public OrderLine createOrderLine(Product product, Order order) {
        OrderLine orderLine = new OrderLine(order, product);
        entityManager.persist(orderLine);
        OrderLine orderLine2 = entityManager.find(OrderLine.class, orderLine.getId());
        // (orderLine == orderLine2) is TRUE
        return orderLine;
    }
}

Persistence context

Multiple retrievals of the same object return references to the same object instance


Persistence Context

Two types of persistence context

Transaction scoped

Used in stateless components

Typically begins/ends at request entry/exit points respectively

Extended scoped persistence context

Persistence Context Propagation

@Stateless
public class ShoppingCartBean implements ShoppingCart {

    @EJB InventoryService inv;
    @EJB OrderService ord;

    public void checkout(Item item, Product p) {
        ord.createOrder(item);
        inv.updateInventory(p);
    }
}

Persistence context

Persistence Context Propagation

@Stateless
public class OrderServiceBean implements OrderService {

    @PersistenceContext
    EntityManager em1;

    public void createOrder(Item item) {
        em1.persist(new Order(item));
    }
}

@Stateless
public class InventoryServiceBean implements InventoryService {

    @PersistenceContext
    EntityManager em2;

    public void updateInventory(Product p) {
        Product product = em2.merge(p);
        // ...
    }
}

Declarative Transaction Management Example

[Diagram: Check Out → ShoppingCart (TX_REQUIRED, new Persistence Context) → 1. Update Inventory → InventoryService (TX_REQUIRED, Persistence Context propagated) → 2. Create Order → OrderService (TX_REQUIRED, Persistence Context propagated). Transaction attributes control the propagation.]

With a transaction-scoped, container-managed entity manager (the common case in a Java EE environment), JTA transaction propagation and persistence context propagation go together. In other words, container-managed, transaction-scoped entity managers retrieved within a given JTA transaction all share the same persistence context.

AuditServiceBean

@Stateless
public class AuditServiceBean implements AuditService {

    @PersistenceContext
    private EntityManager em;

    @TransactionAttribute(REQUIRES_NEW)
    public void logTransaction2(int id, String action) {
        LogRecord lr = new LogRecord(id, action);
        em.persist(lr);
    }
}

NEW PC!

Declarative Transaction Management Example 2

[Diagram: Check Out → ShoppingCart (REQUIRED, new Persistence Context PC) → 1. Update Inventory → InventoryService (REQUIRED, PC propagated) → 2. log transaction → AuditService (REQUIRES_NEW, NEW PC2!). Transaction attributes control the propagation.]


Persistence Provider PC Transaction Features

Attribute-level change tracking

Only the minimal updates are sent to the database

Orders INSERT, UPDATE and DELETE statements

Minimizes database interactions

The EntityManager flushes SQL just prior to commit

Persistence Context

[Diagram: conversation with a detached entity — during the first request a transaction-scoped persistence context manages the entity; after the response it becomes detached; on the next request merge() returns a managed copy in a new transaction-scoped persistence context. The conversation spans both request/response pairs.]

Conversation with detached entity

@Stateless
public class ShoppingCartBean implements ShoppingCart {

    @PersistenceContext
    EntityManager entityManager;

    public OrderLine createOrderLine(Product product, Order order) {
        OrderLine orderLine = new OrderLine(order, product);
        entityManager.persist(orderLine);
        return orderLine;
    }

    public OrderLine updateOrderLine(OrderLine orderLine) {
        OrderLine orderLine2 = entityManager.merge(orderLine);
        return orderLine2;
    }
}

Managed entity

Detached entity

Managed entity


Types of Persistence Context

Persistence Context

lifetime may be transaction-scoped or extended

Transaction-scoped persistence context

Extended persistence context

spans multiple transactions

A persistence context can be either transaction-scoped or extended. A transaction-scoped persistence context is bound to a JTA transaction; it starts and ends at transaction boundaries, and at the end of the transaction the entities become detached. An extended persistence context spans multiple transactions and exists from the time the EntityManager is created until it is closed.

An extended persistence context can be useful in web applications when you have conversations that span multiple requests: the EntityManager can remain open between requests and be closed when the conversation is done. Components like stateful session beans can hold references to managed instances. This avoids flushing to the database between requests and then re-finding entities, so in some cases it can improve performance. The type of persistence context is defined when you create the entity manager.

Persistence contexts may live as long as a transaction and be closed when the transaction completes; this is called a transaction-scoped persistence context. When the transaction completes, the transaction-scoped persistence context is destroyed and all managed entity instances become detached. Persistence contexts may also be configured to live longer than a transaction; this is called an extended persistence context. Entity instances attached to an extended context remain managed even after a transaction completes. This is extremely useful when you want to have a conversation with your database without keeping a long-running transaction.

Persistence Context

[Diagram: conversation with an extended persistence context — the same extended persistence context keeps the entity managed across both request/response pairs of the conversation.]

Extended Persistence Context

@Stateful
public class OrderMgr {

    // Specify that we want an EXTENDED persistence context
    @PersistenceContext(type=PersistenceContextType.EXTENDED)
    EntityManager em;

    // Cached order
    private Order order;

    // Create and cache the order
    public void createOrder(String itemId) {
        // order remains managed for the lifetime of the bean
        order = new Order(cust);
        em.persist(order);
    }

    public void addLineItem(OrderLineItem li) {
        order.lineItems.add(li);
    }
}

Managed entity

Managed entity

Extended Persistence Context

@Stateful
public class DeptMgr {

    @PersistenceContext(type=PersistenceContextType.EXTENDED)
    EntityManager em;

    private Department dept;

    @TransactionAttribute(NOT_SUPPORTED)
    public void getDepartment(int deptId) {
        dept = em.find(Department.class, deptId);
    }

    @TransactionAttribute(NOT_SUPPORTED)
    public void addEmployee(int empId) {
        Employee emp = em.find(Employee.class, empId);
        dept.getEmployees().add(emp);
        emp.setDepartment(dept);
    }

    @Remove
    @TransactionAttribute(REQUIRES_NEW)
    public void endUpdate(int deptId) {
        dept = em.find(Department.class, deptId);
    }
}

Persistence Context: Transactional vs. Extended

@Stateless
public class OrderMgr implements OrderService {

    @PersistenceContext
    EntityManager em;

    public void addLineItem(OrderLineItem li) {
        // First, look up the order
        Order order = em.find(Order.class, orderID);
        order.lineItems.add(li);
    }
}

@Stateful
public class OrderMgr implements OrderService {

    @PersistenceContext(type = PersistenceContextType.EXTENDED)
    EntityManager em;

    // Order is cached
    Order order;

    public void addLineItem(OrderLineItem li) {
        // No em.find invoked for the order object
        order.lineItems.add(li);
    }
}

look up the order

No em.find invoked

Managed entity

In this example the session bean looks up an order and adds line items. The stateless session bean does the workflow with a lookup and merge in each operation as needed, whereas the stateful session bean does the same by caching the references.

Persistence Context Micro Benchmark

Source: Internal benchmarks

Micro benchmark with lots of lookups

Persistence context is caching entities

SEAM Conversations

@Name("shopper")
@Scope(CONVERSATION)
public class BookingManager {

    @In EntityManager entityManager;

    private Booking booking;
    private Hotel hotel;

    @Begin
    public void selectHotel(Hotel selectedHotel) {
        hotel = entityManager.merge(selectedHotel);
    }

    @End
    public void confirm() {
        entityManager.persist(booking);
    }
}

SEAM injected

SEAM conversation

Concurrency and Persistence Context

[Diagram: User 1's transaction and User 2's transaction each have their own Entity Manager and Persistence Context, both reading the same entity from the data source. Object identity: only one managed entity in a persistence context represents a given row.]

Optimistic versus Pessimistic Concurrency

Optimistic Concurrency

Pros: No database locks are held

Cons: Requires a version attribute in the schema

The user or application must refresh and retry failed updates (see the retry sketch after this slide's notes)

Suitable when application has few parallel updates

Pessimistic Concurrency

Lock the row when data is read in

database locks the row upon a select

(SELECT . . . FOR UPDATE [NOWAIT])

Pros: Simpler application code

Cons: Database locks limit concurrent access to the data, which limits scalability

May cause deadlocks

Not in JPA 1.0 (vendor specific), supported in JPA 2.0

Suitable when application has many parallel updates

Optimistic locking permits all users to read and attempt to update the same data concurrently. It does not prevent others from changing the same data, but it does guarantee the database will not be updated based on stale data.

Pessimistic locking is the most restrictive form of locking but guarantees no changes are performed on the data during your transaction. The database physically locks the row upon a select (SELECT . . . FOR UPDATE [NOWAIT]) and prevents others from altering that row. This reassurance comes at a cost: pessimistic locks cannot be held across transactions, and only one user can access the underlying data at a time. Pessimistic locking should be used carefully, as it limits concurrent access to the data and may cause deadlocks.
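A minimal sketch of the refresh-and-retry pattern mentioned above, using a resource-local transaction for brevity and assuming getRate/setRate accessors on the Employee entity from these slides (the retry count and service class are hypothetical):

import javax.persistence.EntityManager;
import javax.persistence.OptimisticLockException;

public class RaiseService {

    // Re-read, apply, and retry on a version conflict
    public void giveRaise(EntityManager em, int employeeId, int amount) {
        for (int attempt = 0; attempt < 3; attempt++) {
            em.getTransaction().begin();
            Employee e = em.find(Employee.class, employeeId);  // re-reads current state and version
            e.setRate(e.getRate() + amount);                   // assumed accessors
            try {
                em.flush();                                    // UPDATE ... WHERE version = ? happens here
                em.getTransaction().commit();
                return;                                        // success
            } catch (OptimisticLockException ole) {
                em.getTransaction().rollback();                // someone else updated the row
                em.clear();                                    // drop the stale instance before retrying
            }
        }
        throw new IllegalStateException("Raise failed after 3 attempts");
    }
}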

Preventing Parallel Updates

use @Version for optimistic locking

Used by the persistence manager; results in SQL like:
UPDATE Employee SET ..., version = version + 1
WHERE id = ? AND version = readVersion

The version is updated when the transaction commits, when the entity is merged, or when a write lock is acquired

OptimisticLockException if mismatch

Not used by the application!

public class Employee {
    @Id int id;
    @Version int version;
    ...
}

Can be int, Integer, short, Short, long, Long, Timestamp

Preventing Parallel Updates 1

// Transaction 1
tx1.begin();
// Joe's employee id is 5
// e1.version == 1
e1 = findPartTimeEmp(5);

// Joe's current rate is $9
e1.raise(2);

tx1.commit();
// e1.version == 2 in db
// Joe's rate is $11

// Transaction 2 (started concurrently, before tx1 committed)
tx2.begin();
// Joe's employee id is 5
// e1.version == 1
e1 = findPartTimeEmp(5);

// Joe's current rate is $9
if (e1.getRate() < 10)
    e1.raise(5);

// e1.version == 1 in db? No, it is now 2
tx2.commit();
// Without @Version, Joe's rate would become $14;
// with @Version, tx2 fails with an OptimisticLockException

Preventing Stale Data JPA 1.0

Perform read or write locks on entities

Prevents non-repeatable reads in JPA

entityManager.lock( entity, READ);

perform a version check on entity before commit

OptimisticLockException if mismatch

entityManager.lock( entity, WRITE);

perform a version check on entity

OptimisticLockException if mismatch

and increment version before commit

Preventing Stale Data

// Transaction 1
tx1.begin();
d1 = findDepartment(dId);

// d1's original name is "Engrg"
d1.setName("MarketEngrg");

tx1.commit();

// Transaction 2 (running concurrently)
tx2.begin();

e1 = findEmp(eId);
d1 = e1.getDepartment();
em.lock(d1, READ);
if (d1's name is "Engrg")
    e1.raiseByTenPercent();

// Check d1.version in db at commit
tx2.commit();
// Without the READ lock, e1 would get a raise he does not deserve;
// with it, the version check detects the change and the transaction rolls back

Write lock prevents parallel updates

// Transaction 1
tx1.begin();
d1 = findDepartment(dId);

// d1's original name is "Engrg"
d1.setName("MarketEngrg");

tx1.commit();
// tx1 rolls back: d1's version was already incremented by tx2's write lock

// Transaction 2 (running concurrently)
tx2.begin();

e1 = findEmp(eId);
d1 = e1.getDepartment();
em.lock(d1, WRITE);

// version++ for d1
em.flush();
if (d1's name is "Engrg")
    e1.raiseByTenPercent();

tx2.commit();

Preventing Parallel Updates 2


The spec mandates that a persistence application operates at the read-committed isolation level, that is, no dirty (uncommitted) data is read.

Bulk Updates

Updating directly against the database can be faster, but:

Bypasses the EntityManager

@Version will not be updated

Entities in the persistence context are not updated

tx.begin();
int id = 5;                        // Joe's employee id is 5
e1 = findPartTimeEmp(id);          // Joe's current rate is $9

// Double every employee's salary
em.createQuery(
    "Update Employee set rate = rate * 2").executeUpdate();

// Joe's rate is still $9 in this persistence context
if (e1.getRate() < 10)
    e1.raiseByFiveDollar();

tx.commit();
// Joe's salary will be $14 (the bulk doubling was overwritten)
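One way to avoid relying on the stale state shown above is to refresh the entity after the bulk update; a minimal sketch, reusing the slide's PartTimeEmployee and its getRate/raiseByFiveDollar methods (whether to refresh or simply avoid reusing the persistence context is a design choice):

import javax.persistence.EntityManager;

public class BulkUpdateExample {

    // After a bulk update, refresh any entity you still hold before relying on its state
    public void doubleRatesThenRaise(EntityManager em, PartTimeEmployee e1) {
        em.createQuery("Update Employee set rate = rate * 2").executeUpdate();

        em.refresh(e1);                 // reload e1 from the database (its rate is now doubled)

        if (e1.getRate() < 10)          // the decision now uses fresh data, not the stale $9
            e1.raiseByFiveDollar();
    }
}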

JPA 2.0 Locks

JPA 1.0 only supported optimistic locking; JPA 2.0 also supports pessimistic locks

JPA 2.0 LockMode values :

OPTIMISTIC (= READ)

OPTIMISTIC_FORCE_INCREMENT (= WRITE)

PESSIMISTIC

PESSIMISTIC_FORCE_INCREMENT

Multiple places to specify lock

database locks the row (SELECT . . . FOR UPDATE )

// Read, then lock:
Account acct = em.find(Account.class, acctId);

// Decide to withdraw $100, so lock it for update
em.lock(acct, PESSIMISTIC);
int balance = acct.getBalance();
acct.setBalance(balance - 100);

// Read and lock:
Account acct = em.find(Account.class, acctId, PESSIMISTIC);

// Decide to withdraw $100 (already locked)
int balance = acct.getBalance();
acct.setBalance(balance - 100);

JPA 2.0 Locking

Locking earlier holds locks longer, which can cause bottlenecks and deadlocks

Locking after the read risks stale data and can cause an OptimisticLockException


// Read, then lock and refresh:
Account acct = em.find(Account.class, acctId);

// Decide to withdraw $100 - lock and refresh
em.refresh(acct, PESSIMISTIC);
int balance = acct.getBalance();
acct.setBalance(balance - 100);

JPA 2.0 Locking

The right approach depends on your requirements

Trade-offs:

Lock earlier: risk poor scalability and deadlocks

Lock later: risk updating stale data and getting an OptimisticLockException


L2 cache shared across transactions and users

Putting it all together

[Diagram: multiple User Sessions each have their own Persistence Context (EntityManager); all entity managers for a specific PersistenceUnit on a given Java Virtual Machine (JVM), created from one EntityManagerFactory, share the L2 Cache (Shared Cache).]

Second-level Cache

L2 Cache shares entity state across various persistence contexts

If caching is enabled, entities not found in the persistence context will be loaded from the L2 cache, if present there

Best for read-mostly classes

L2 Cache is Vendor specific

Java Persistence API 1.0 does not specify L2 support

Java Persistence API 2.0 has basic cache operations

Most persistence providers (Hibernate, EclipseLink, OpenJPA, ...) provide second-level cache(s)

An application might want to share entity state across various persistence contexts

This is the domain of second level (L2) cache

If caching is enabled, entities not found in persistence context, will be loaded from L2 cache, if found

JPA does not specify support of a second level cache

However, most of the persistence providers provide in-built or integrated support for second level cache(s)

Basic support for second level cache in GlassFish-TopLink Essentials is turned on by default

No extra configuration is needed

L2 Cache

[Diagram: User transaction 1 and User transaction 2 each have their own Persistence Context; the same entity is not shared between them, but the L2 Cache holds shared entity state above the data source. A query that looks up a single object by Id goes first to the persistence context, then to the L2 cache; other queries go to the database or the query cache.]


L2 Caching

Pros:

avoids database access for already loaded entities

faster for reading frequently accessed unmodified entities

Cons

Memory consumption for a large number of objects

Stale data for updated objects

Concurrency for write (optimistic lock exception, or pessimistic lock)

Bad scalability for frequent or concurrently updated entities


L2 Caching

Configure L2 caching for entities that are

read often

modified infrequently

Not critical if stale

protect any data that can be concurrently modified with a locking strategy

Must handle optimistic lock failures on flush/commit

configure expiration, refresh policy to minimize lock failures

Configure the Query cache

Useful for queries that run frequently with the same parameters against tables that are not modified (see the hint sketch below)
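A minimal sketch of enabling a query cache via provider-specific query hints; the hint names below are examples for EclipseLink and Hibernate, not settings from the slide, and unrecognized vendor hints are ignored by other providers:

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.Query;

public class QueryCacheExample {

    public List findAllOrdersCached(EntityManager em) {
        Query q = em.createNamedQuery("findAllOrders");
        q.setHint("eclipselink.query-results-cache", "true");  // EclipseLink result cache
        q.setHint("org.hibernate.cacheable", Boolean.TRUE);    // Hibernate equivalent
        return q.getResultList();
    }
}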

JPA 2.0 Shared Cache API

entity cache shared across persistence unit

Accessible from EntityManagerFactory

Supports only very basic cache operations

Can be extended by vendors

public interface Cache {

    // checks if the object is in the IdentityMap
    public boolean contains(Class cls, Object pk);

    // invalidates the object in the IdentityMap
    public void evict(Class cls, Object pk);

    // invalidates the class in the IdentityMap
    public void evict(Class cls);

    // invalidates all classes in the IdentityMap
    public void evictAll();
}
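A minimal usage sketch, assuming the JPA 2.0 Cache obtained from the EntityManagerFactory and the Employee entity from these slides (the id value is just an example):

import javax.persistence.Cache;
import javax.persistence.EntityManagerFactory;

public class CacheDemo {

    public static void inspectCache(EntityManagerFactory emf) {
        Cache cache = emf.getCache();                         // shared (L2) cache for the persistence unit

        boolean cached = cache.contains(Employee.class, 5);   // is Employee #5 cached?
        System.out.println("Employee 5 cached: " + cached);

        cache.evict(Employee.class, 5);                       // drop one entity from the cache
        cache.evict(Employee.class);                          // drop all Employees
        cache.evictAll();                                     // clear the whole shared cache
    }
}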

EclipseLink Caching Architecture

[Diagram: EclipseLink caching architecture — the EntityManager holds the L1 (persistence context) cache; the EntityManagerFactory/Server session holds the L2 shared cache; cache coordination across nodes via JMS (MDB), RMI, or CORBA/IIOP.]

EclipseLink caches Entities in L2, Hibernate does not

EclipseLink Extensions - L2 Caching

Default: Entities read are L2 cached

Cache Configuration by Entity type or Persistence Unit

You can disable L2 cache

Configuration Parameters

Cache isolation, type, size, expiration, coordination, invalidation,refreshing

Coordination (cluster-messaging)

Messaging: JMS, RMI, RMI-IIOP,

Mode: SYNC, SYNC+NEW, INVALIDATE, NONE

@Entity
@Table(name="EMPLOYEE")
@Cache(
    type=CacheType.WEAK,
    isolated=false,
    expiry=600000,
    alwaysRefresh=true,
    disableHits=true,
    coordinationType=INVALIDATE_CHANGED_OBJECTS
)
public class Employee implements Serializable {
    ...
}


EclipseLink Mapping Extensions

Type=FULL: objects are never flushed unless deleted or evicted; WEAK: objects will be garbage collected if not referenced

@Cache

type, size, isolated, expiry, refresh, cache usage, coordination

Cache usage and refresh query hints

isolated=true disables the shared L2 cache for that entity


Hibernate L2 Cache

Hibernate L2 cache is not configured by default

Hibernate's L2 cache does not cache entity objects; it caches the id and the entity's state

Hibernate L2 cache is pluggable

EHCache, OSCache, SwarmCache providers (JVM)

JBoss TreeCache Cluster

Can plug in others like Terracotta

Cache Concurrency Strategy

Hibernate L2 Cache

net.sf.ehcache.configurationResourceName=/name_of_configuration_resource

@Entity
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
public class Country {

    private String name;
    ...
}

Not configured by default

The cache concurrency strategy must be supported by the cache provider

OpenJPA L2 Caching

OpenJPA L2 caches object data and JPQL query results

Updated when data is loaded from database and after changes are successfully committed

For cluster caches are notified when changes are made

Can plug in implementations, such as Tangosol's Coherence product

several cache eviction strategies:

Time-to-live settings for cache instances

@Entity @DataCache(timeout=5000)

public class Person { ... }

Agenda

Entity Manager

Persistence Context

Entities

Schema and Queries

Transaction

Maintaining Relationship

The application bears the responsibility of maintaining relationships between objects

[Diagram: Before and After views of Department A and Department B containing Employee1, Employee2, Employee3. After calling deptA.getEmployees().add(e3) without also setting e3's department, the two sides of the relationship disagree: Inconsistent!]

deptA.getEmployees().add(e3);

Example Domain Model

@Entity
public class Employee {
    @Id private int id;
    private String firstName;
    private String lastName;
    @ManyToOne(fetch=LAZY)
    private Department dept;
    ...
}

@Entity
public class Department {
    @Id private int id;
    private String name;
    @OneToMany(mappedBy = "dept", fetch=LAZY)
    private Collection emps = new ...;
    ...
}

Example Managing Relationship

public int addNewEmployee(...) {
    Employee e = new Employee(...);
    Department d = new Department(1, ...);

    e.setDepartment(d);
    // Reverse relationship is not set
    em.persist(e);
    em.persist(d);
    return d.getEmployees().size();
}

INCORRECT

Example Managing Relationship

public int addNewEmployee(...) {
    Employee e = new Employee(...);
    Department d = new Department(1, ...);

    e.setDepartment(d);
    d.getEmployees().add(e);
    em.persist(e);
    em.persist(d);
    return d.getEmployees().size();
}

CORRECT

Navigating Relationships

Data fetching strategy

EAGER: loaded immediately

LAZY: loaded only when needed

LAZY is good for large objects and/or with relationships with deep hierarchies

Lazy loading and JPA

Default FetchType is LAZY for 1:m and m:n relationships

benefits large objects and relationships with deep hierarchies

However, for use cases where the related data is needed, it can cause N+1 selects

LAZY N + 1 problem:

@Entity public class Department {
@Id private int id;
@OneToMany(mappedBy = "dept")
private Collection emps ;
... }

SELECT d.id, ... FROM Department d // 1 time
SELECT e.id, ... FROM Employee e
WHERE e.deptId = ? // N times

With JPA, one-to-many and many-to-many relationships lazy load by default, meaning they will be loaded when the relationship is first accessed. Lazy loading is usually good, but if you need to access all of the "many" objects in a relationship, it will cause N+1 selects, where N is the number of "many" objects. You can change the relationship to be loaded eagerly as follows:

public class Employee {
    @OneToMany(mappedBy = "employee", fetch = FetchType.EAGER)
    private Collection addresses;
    ...
}

However, you should be careful with eager loading, which can cause SELECT statements that fetch too much data; it can cause a Cartesian product if you eagerly load entities with several related collections. If you want to temporarily override the LAZY fetch type, you can use a fetch join. For example, this query would eagerly load the employee addresses:

@NamedQueries({
    @NamedQuery(name="getItEarly",
                query="SELECT e FROM Employee e JOIN FETCH e.addresses")
})
public class Employee { ... }

Lazy loading and JPA

Relationship can be Loaded Eagerly

But if you have several related relationships, could load too much !

OR

Temporarily override the LAZY fetch type, use Join Fetch in a query:

@Entity public class Department {
@Id private int id;
@OneToMany(mappedBy = "dept", fetch=EAGER)
private Collection employees ;
... }

@NamedQueries({
    @NamedQuery(name="getItEarly",
                query="SELECT d FROM Department d JOIN FETCH d.employees")
})
public class Department { ... }

Can cause Cartesian product

A Cartesian product is the set of all possible ordered pairs; eagerly loading an entity with several related collections can produce one.

Lazy loading and JPA

Capture the generated SQL

Enable SQL logging in the persistence.xml file (see the sketch after this list)

Test run use cases and examine the SQL statements

optimise the number of SQL statements executed!

only retrieve the data your application needs!

Lazy load large (eg BLOB) attributes and relationships that are not used often

Override to Eagerly load in use cases where needed
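A minimal sketch of turning on SQL logging; the property names are provider-specific examples (EclipseLink and Hibernate), not the slide's exact settings, and can equally be set as <property> entries in persistence.xml:

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class SqlLoggingSetup {

    public static EntityManagerFactory createWithSqlLogging() {
        Map<String, String> props = new HashMap<String, String>();
        // EclipseLink: log the generated SQL (and bind parameters)
        props.put("eclipselink.logging.level.sql", "FINE");
        props.put("eclipselink.logging.parameters", "true");
        // Hibernate equivalents (ignored by other providers):
        props.put("hibernate.show_sql", "true");
        props.put("hibernate.format_sql", "true");

        return Persistence.createEntityManagerFactory("PetCatalogPu", props);
    }
}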

Lazy loading and JPAWith JPA many-to-one and many-to-many relationships lazy load by default , meaning they will be loaded when the entity in the relationship is accessed. Lazy loading is usually good, but if you need to access all of the "many" objects in a relationship, it will cause n+1 selects where n is the number of "many" objects. You can change the relationship to be loaded eagerly as follows :public class Employee{ @OneToMany(mappedBy = "employee", fetch = FetchType.EAGER) private Collection addresses; .....}However you should be careful with eager loading which could cause SELECT statements that fetch too much data. It can cause a Cartesian product if you eagerly load entities with several related collections. If you want to temporarily override the LAZY fetch type, you could use Fetch Join. For example this query would eagerly load the employee addresses: @NamedQueries({ @NamedQuery(name="getItEarly", query="SELECT e FROM Employee e JOIN FETCH e.addresses")})public class Employee{.....}

Navigating Relationships

Accessing a LAZY relationship from a detached entity

If the relationship was not loaded, accessing it causes an exception

Solutions:

Use JOIN FETCH

Or Set Fetch type to EAGER

Or Access the collection before entity is detached:

d.getEmployees().size();


Navigating Relationships

Data fetching strategy

Cascade specifies operations on relationships

ALL, PERSIST, MERGE, REMOVE, REFRESH

The default is to do nothing

Avoid MERGE and ALL with deep hierarchies

If you do use them, limit the scope

Using Cascade

@Entity
public class Employee {
    @Id private int id;
    private String firstName;
    private String lastName;
    @ManyToOne(cascade=ALL, fetch=LAZY)
    private Department dept;
    ...
}

@Entity
public class Department {
    @Id private int id;
    private String name;
    @OneToMany(mappedBy = "dept", cascade=ALL, fetch=LAZY)
    private Collection emps = new ...;
    @OneToMany
    private Collection codes;
    ...
}

[Diagram: Employee — Department — DepartmentCode, with cascade=ALL marked on the Employee/Department relationships and an X on the DepartmentCode relationship (no cascade).]

Agenda

Entity Manager

Persistence Context

Entities

Schema and Queries

Transaction

Database design: the basic foundation of performance

Smaller tables use less disk, less memory, can give better performance

Use data types that are as small as possible

Use primary keys that are as small as possible

Vertical Partition:

split large, infrequently used columns into a separate one-to-one table

Use good indexing

Indexes speed up queries

Indexes slow down updates

Index columns frequently used in query WHERE clauses

Normalization

Normalization eliminates redundant data

Updates are usually faster

There's less data to change

However, a normalized database requires joins for queries

Queries may be slower

Normalize first, then perhaps de-normalize frequently read columns and cache them

In a normalized database, each fact is represented once and only once. Conversely, in a denormalized database, information is duplicated or stored in multiple places.

People who ask for help with performance issues are frequently advised to normalize their schemas, especially if the workload is write-heavy. This is often good advice, for the following reasons: normalized updates are usually faster than denormalized updates; when the data is well normalized there's little or no duplicated data, so there's less data to change; normalized tables are usually smaller, so they fit better in memory and perform better; and the lack of redundant data means there's less need for DISTINCT or GROUP BY queries when retrieving lists of values. Consider the preceding example: it's impossible to get a distinct list of departments from the denormalized schema without DISTINCT or GROUP BY, but if DEPARTMENT is a separate table, it's a trivial query.

The drawbacks of a normalized schema usually have to do with retrieval. Any nontrivial query on a well-normalized schema will probably require at least one join, and perhaps several. This is not only expensive, but it can make some indexing strategies impossible; for example, normalizing may place columns in different tables that would benefit from belonging to the same index.

Mapping Inheritance Hierarchies

Employee: int id, String firstName, String lastName, Department dept

PartTimeEmployee (extends Employee): int rate

FullTimeEmployee (extends Employee): double salary

Single Table Per Class

Benefits

Simple

No joins required

can be fast for Queries

Drawbacks

Not normalized

Wastes space

Requires columns corresponding to subclasses' state to be null

Table can have too many columns

Larger tables = more data, which can negatively affect performance

@Inheritance(strategy=SINGLE_TABLE)

EMPLOYEE: ID int PK, FIRSTNAME varchar(255), LASTNAME varchar(255), DEPT_ID int FK, RATE int NULL, SALARY double NULL, DISCRIM varchar(30)

Single table means that all of those classes are stored in the same table, and all we need is a discriminator column in that table; the discriminator column tells you the type. This leads to some wasted space (if you have a rate, you don't have a salary), but it does make for quick querying, because you can find all of the objects of a type through a single table scan.
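A minimal sketch of the single-table mapping above; the DISCRIM column name matches the table shown, but the discriminator values ("PT", "FT") are illustrative, not from the slide:

import javax.persistence.*;

@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@DiscriminatorColumn(name = "DISCRIM")           // column from the EMPLOYEE table above
public class Employee {
    @Id private int id;
    private String firstName;
    private String lastName;
    @ManyToOne private Department dept;
}

@Entity
@DiscriminatorValue("PT")                         // illustrative discriminator value
class PartTimeEmployee extends Employee {
    private int rate;
}

@Entity
@DiscriminatorValue("FT")                         // illustrative discriminator value
class FullTimeEmployee extends Employee {
    private double salary;
}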

Joined Subclass

Benefits

Normalized database

Better for storage

Database view same as domain model

Easy to evolve domain model

Drawbacks

Queries cause joins

Slower queries

Poor performance for deep hierarchies, polymorphic queries and relationships

@Inheritance(strategy=JOINED)

EMPLOYEE: ID int PK, FIRSTNAME varchar(255), LASTNAME varchar(255), DEPT_ID int FK

PARTTIMEEMPLOYEE: ID int PK FK, RATE int NULL

FULLTIMEEMPLOYEE: ID int PK FK, SALARY double NULL

A joined hierarchy is great for storage, but a little more expensive for querying.

Table Per Class

Benefits

No need for joins to read entities of same type

Faster reads

Drawbacks

Not normalized

Wastes space

Polymorphic queries cause union (all employees)

Poor performance

Support for this strategy is not mandatory for JPA providers

@Inheritance(strategy=TABLE_PER_CLASS)

FULLTIMEEMPLOYEE: ID int PK, FIRSTNAME varchar(255), LASTNAME varchar(255), DEPT_ID int FK, SALARY double NULL

PARTTIMEEMPLOYEE: ID int PK, FIRSTNAME varchar(255), LASTNAME varchar(255), DEPT_ID int FK, RATE int NULL

Finally, table-per-class means that every concrete class gets its own table, repeating all of the inherited state in that table.

vertical partitioning

limit number of columns per table

split large, infrequently used columns into a separate table

CREATE TABLE Customer (
  user_id      INT NOT NULL AUTO_INCREMENT,
  email        VARCHAR(80) NOT NULL,
  display_name VARCHAR(50) NOT NULL,
  password     CHAR(41) NOT NULL,
  first_name   VARCHAR(25) NOT NULL,
  last_name    VARCHAR(25) NOT NULL,
  address      VARCHAR(80) NOT NULL,
  city         VARCHAR(30) NOT NULL,
  province     CHAR(2) NOT NULL,
  postcode     CHAR(7) NOT NULL,
  interests    TEXT NULL,
  bio          TEXT NULL,
  signature    TEXT NULL,
  skills       TEXT NULL,
  PRIMARY KEY (user_id),
  UNIQUE INDEX (email)
) ENGINE=InnoDB;

Less Frequently

referenced,

TEXT data

Frequently

referenced

CREATE TABLE Customer (
  user_id      INT NOT NULL AUTO_INCREMENT,
  email        VARCHAR(80) NOT NULL,
  display_name VARCHAR(50) NOT NULL,
  password     CHAR(41) NOT NULL,
  PRIMARY KEY (user_id),
  UNIQUE INDEX (email)
) ENGINE=InnoDB;

CREATE TABLE CustomerInfo (
  user_id    INT NOT NULL,
  first_name VARCHAR(25) NOT NULL,
  last_name  VARCHAR(25) NOT NULL,
  address    VARCHAR(80) NOT NULL,
  city       VARCHAR(30) NOT NULL,
  province   CHAR(2) NOT NULL,
  postcode   CHAR(7) NOT NULL,
  interests  TEXT NULL,
  bio        TEXT NULL,
  signature  TEXT NULL,
  skills     TEXT NULL,
  PRIMARY KEY (user_id),
  FULLTEXT KEY (interests, skills)
) ENGINE=MyISAM;

An example of vertical partitioning might be a table that contains a number of very wide TEXT or BLOB columns that aren't addressed often being broken into two tables, with the most-referenced columns in one table and the seldom-referenced TEXT or BLOB data in another. Limit the number of columns per table, and split large, infrequently used columns into a separate one-to-one table. By removing the wide VARCHAR/TEXT columns from the design, you actually get a reduction in query response time. Beyond partitioning, this speaks to the effect wide tables can have on queries, and why you should always ensure that all columns defined in a table are actually needed.

Vertical partitioning

@Entity
public class Customer {
    int userid;
    String email;
    String password;
    @OneToOne(fetch=LAZY)
    CustomerInfo info;
}

@Entity
public class CustomerInfo {
    int userid;
    String name;
    String interests;
    @OneToOne(mappedBy = "info")
    Customer customer;
}

split large, infrequently used columns into a separate table

Scalability: Sharding - Application Partitioning

[Diagram: Sharding architecture — Web/App servers route to separate databases partitioned by customer id range: Cust_id 1-999, Cust_id 1000-1999, Cust_id 2000-2999.]

Sharding = horizontal partitioning

Split tables by rows into partitions (see the sketch below)
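A minimal sketch of routing by customer id range; the shard boundaries mirror the diagram above, while the persistence unit names and router class are hypothetical (real sharding usually lives in a dedicated data-access layer or product):

import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class ShardRouter {

    // Hypothetical persistence units, one per shard database
    private final EntityManagerFactory shard0 = Persistence.createEntityManagerFactory("CatalogPu_0");
    private final EntityManagerFactory shard1 = Persistence.createEntityManagerFactory("CatalogPu_1");
    private final EntityManagerFactory shard2 = Persistence.createEntityManagerFactory("CatalogPu_2");

    // Pick the shard that owns this customer id range
    public EntityManagerFactory forCustomer(int custId) {
        if (custId < 1000) return shard0;   // Cust_id 1-999
        if (custId < 2000) return shard1;   // Cust_id 1000-1999
        return shard2;                      // Cust_id 2000-2999 (and up)
    }
}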

Know what SQL is executed

Capture the generated SQL (enable SQL logging in the persistence.xml file, as sketched earlier)

Find and fix problem SQL:

Watch for slow Queries

Use the MySQL slow query log and EXPLAIN

Can reveal problems such as missing indexes

Watch for Queries that execute too often to load needed data

Watch for loading more data than needed

You need to understand the SQL queries your application makes and evaluate their performance. To know how a query is executed by MySQL, you can use the MySQL slow query log and EXPLAIN. Basically you want your queries to access less data: is your application retrieving more data than it needs? Are queries accessing too many rows or columns? Is MySQL analyzing more rows than it needs? Indexes are a good way to reduce data access.

When you precede a SELECT statement with the keyword EXPLAIN, MySQL displays information from the optimizer about the query execution plan. That is, MySQL explains how it would process the SELECT, including information about how tables are joined and in which order. With the help of EXPLAIN, you can see where you should add indexes to tables to get a faster SELECT that uses indexes to find rows. You can also use EXPLAIN to check whether the optimizer joins the tables in an optimal order.

Developers should run EXPLAIN on all SELECT statements that their code executes against the database. This ensures that missing indexes are picked up early in the development process and gives developers insight into how the MySQL optimizer has chosen to execute the query.

MySQL Query Analyser

Find and fix problem SQL:

how long a query took

results of EXPLAIN statements

Historical and real-time analysis

query execution counts, run time

It's not just slow-running queries that are a problem; sometimes it's SQL that executes a lot that kills your system

MySQL Query Analyzer: The MySQL Query Analyzer is designed to save time and effort in finding and fixing problem queries. It gives DBAs a convenient window, with instant updates and easy-to-read graphics. The analyzer can do simple things such as tell you how long a recent query took and how the optimizer handled it (the results of EXPLAIN statements), but it can also give historical information, such as how the current runs of a query compare to earlier runs. Most of all, the analyzer speeds up development and deployment, because sites will use it in conjunction with performance testing and the emulation of user activity to find out where the choke points are in the application and how they can expect it to perform after deployment.

The MySQL Query Analyzer saves time and effort in finding and fixing problem queries by providing: an aggregated view into query execution counts, run time, and result sets across all MySQL servers, with no dependence on MySQL logs or SHOW PROCESSLIST; sortable views by all monitored statistics; searchable and sortable queries by query type, content, server, database, date/time, interval range, and "when first seen"; historical and real-time analysis of all queries across all servers; and drill-downs into sampled query execution statistics, fully qualified with variable substitutions and EXPLAIN results.

The MySQL Query Analyzer was added to the MySQL Enterprise Monitor, and it packs a lot of punch for those wanting to ensure their systems are free of badly running SQL code. Two things stand out from a DBA perspective:

1. It's global: if you have a number of servers, you'll love what Query Analyzer does for you. Even Oracle and other DB vendors only provide single-server views of bad SQL running across their servers. Query Analyzer bubbles the worst SQL across all your servers to the top, which is a much more efficient way to work. No more wondering which servers you need to spend your time on or which have the worst code.

2. It's smart: believe it or not, sometimes it's not slow-running SQL that kills your system, it's SQL that executes way more times than you think it does. You really couldn't see this well before Query Analyzer, but now you can. One customer already shaved double digits off their response time by finding queries that were running much more often than they should have been. And that's just one area Query Analyzer looks at; there's much more intelligence there too, along with other stats you can't get from the general server utilities.

Agenda

Entities

Entity Manager

Persistence Context

Queries

Transaction

Query Types 1

Named query

Like findByXXXX() from EntityBeans

Compiled by persistence engine

Created either with @NamedQuery or externalized in orm.xml (see the sketch at the end of this slide)

Dynamic query

Created dynamically by passing a query string to EntityManager

Query query = em.createQuery("select ...");

Beware of SQL injection; it is better to use named parameters

NOT GOOD (string concatenation, open to injection):
q = em.createQuery("select e from Employee e WHERE e.empId LIKE '" + id + "'");

GOOD (named parameter binding):
q = em.createQuery("select e from Employee e WHERE e.empId LIKE :id");
q.setParameter("id", id);
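A minimal sketch of a named query with a bound parameter, reusing the Item entity from the catalog examples; the query name, field, and "Fish" value are illustrative:

import java.util.List;
import javax.persistence.*;

@Entity
@NamedQuery(name = "Item.findByName",
            query = "select i from Item i where i.name = :name")
public class Item {
    @Id private int id;
    private String name;
}

// Usage, e.g. inside the Catalog session bean shown earlier:
// List items = em.createNamedQuery("Item.findByName")
//                .setParameter("name", "Fish")
//                .getResultList();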

Query Types 2

Native query

Leverage the native database querying facilities

Not portable; ties queries to the database

Flush Mode

Controls whether the state of managed entities are synchronized before a query

Types of flush mode

AUTO: immediate (the default)

COMMIT: flush only when the transaction commits

NEVER: you need to invoke EntityManager.flush() explicitly

// Assume a JTA transaction
Order order = em.find(Order.class, orderNumber);
em.persist(order);
Query q = em.createNamedQuery("findAllOrders");
q.setParameter("id", orderNumber);
q.setFlushMode(FlushModeType.COMMIT);
// The newly added order will NOT be visible
List list = q.getResultList();

Agenda

Entities

Entity Manager

Persistence Context

Queries

Transaction

Transactions

Do not include expensive operations that do not need to be transactional inside a transaction

They hurt performance

E.g. logging: disk writes are expensive, and there is resource contention on the log

Do not use transaction when browsing data

@TransactionAttribute(NOT_SUPPORTED)

Carol McDonald, Java Architect

Sun Microsystems, Inc.


[Chart: Persistence Context micro benchmark, relative tx/sec — Extended persistence context: 1.0; Transactional (transaction-scoped): 0.49]

[Table: Hibernate L2 cache providers — columns: Cache, Type, Read-only, Read-write, Transactional. EHCache (memory, disk): Yes; OSCache (memory, disk): Yes; SwarmCache (clustered): Yes; JBoss Cache (clustered): Yes, Yes]
