Using ClusterJPA (part of MySQL Cluster Connector for Java) – a tutorial

Fig. 1 Java access to MySQL Cluster

This is a follow-up to the earlier post, Using ClusterJ (part of MySQL Cluster Connector for Java) – a tutorial, but it covers the ClusterJPA interface rather than ClusterJ.

JPA is the Java standard for persistence. Different vendors can provide their own implementations of this API and they can (and do) add proprietary extensions. Three of the most common implementations are OpenJPA, Hibernate and TopLink. JPA can be used within server containers or outside of them (i.e. with either J2EE or J2SE).

Typically a JPA implementation accesses the database (for example, MySQL Cluster) using JDBC. JDBC gives a great deal of flexibility to the JPA implementer but it cannot give the best performance with MySQL Cluster: Connector/J converts each request to SQL, and the MySQL Server then translates that SQL into calls to the C++ NDB API. As of MySQL Cluster 7.1, OpenJPA can be configured to use the high-performance NDB API (via ClusterJ) for most operations and to fall back on JDBC for more complex queries.

The first implementation of ClusterJPA is as an OpenJPA BrokerFactory but in the future, it may be extended to work with other JPA implementations.

ClusterJPA overcomes ClusterJ limitations, notably by adding support for:

  • Persistent classes
  • Relationships
  • Joins in queries
  • Lazy loading
  • Table and index creation from object model

Fig. 2 ClusterJPA Performance

Typically users base their selection of a JPA solution on factors such as proprietary extensions, what existing applications already use and (increasingly with ClusterJPA) performance.

Figure 2 compares the performance of ClusterJPA (OpenJPA using ClusterJ) with that of OpenJPA using JDBC. Performance is significantly better when using ClusterJPA (the yellow bar). It is hoped that in the future the performance can be improved even further for finds, updates and deletes.

Adapting an OpenJPA-based application to use ClusterJPA with MySQL Cluster should be fairly straightforward, with the main change being in the definition of the persistence unit in persistence.xml:

<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0">
 <persistence-unit name="clusterdb" transaction-type="RESOURCE_LOCAL">
  <provider> org.apache.openjpa.persistence.PersistenceProviderImpl </provider>
  <class>Employee</class>
  <class>Department</class>
  <properties>
   <property name="openjpa.jdbc.SynchronizeMappings" value="buildSchema" />
   <property name="openjpa.ConnectionDriverName"
    value="com.mysql.jdbc.Driver" />
   <property name="openjpa.ConnectionURL" value="jdbc:mysql://localhost:3306/clusterdb" />
   <property name="openjpa.ConnectionUserName" value="root" />
   <property name="openjpa.ConnectionPassword" value="" />
   <property name="openjpa.BrokerFactory" value="ndb" />
   <property name="openjpa.jdbc.DBDictionary" value="TableType=ndbcluster" />
   <property name="openjpa.ndb.connectString" value="localhost:1186" />
   <property name="openjpa.ndb.database" value="clusterdb" />
  </properties>
 </persistence-unit>
</persistence>

Fig. 3 ClusterJPA Annotations

The object-to-table mappings are defined by annotating the persistent class for the domain object. If the table does not already exist, OpenJPA will create it. The property openjpa.jdbc.DBDictionary tells OpenJPA to create the tables using ndb as the storage engine.

This paper does not go into the use of JPA in great depth – focusing instead on the specifics of using OpenJPA with MySQL Cluster/ClusterJPA. For more information on the use of JPA and OpenJPA, refer to http://openjpa.apache.org/ and in particular, http://openjpa.apache.org/builds/latest/docs/manual/manual.html

This tutorial uses MySQL Cluster 7.1.2a on Fedora 12. If you are using an earlier or more recent version of MySQL Cluster then you may need to change the class-paths as explained in http://dev.mysql.com/doc/ndbapi/en/mccj-using-jpa.html

For this tutorial, it is necessary to have MySQL Cluster up and running. For simplicity all of the nodes (processes) making up the Cluster will be run on the same physical host, along with the application.

Although most of the database access is performed through the NDB API, the Cluster includes a MySQL Server process for OpenJPA to use for complex queries and to allow the user to check the contents of the database manually.

These are the MySQL Cluster configuration files used:

config.ini:

[ndbd default]
noofreplicas=2
datadir=/home/billy/mysql/my_cluster/data

[ndbd]
hostname=localhost
id=3

[ndbd]
hostname=localhost
id=4

[ndb_mgmd]
id = 1
hostname=localhost
datadir=/home/billy/mysql/my_cluster/data

[mysqld]
hostname=localhost
id=101

[api]
hostname=localhost

my.cnf:

[mysqld]
ndbcluster
datadir=/home/billy/mysql/my_cluster/data
basedir=/usr/local/mysql

This tutorial focuses on ClusterJPA rather than on running MySQL Cluster; if you are new to MySQL Cluster then refer to Running a simple Cluster before trying these tutorials.

JPA/OpenJPA/ClusterJPA can be used within or outside a container (i.e. it can be used with J2EE or J2SE) – for simplicity, this tutorial does not use a container (i.e. it is written using J2SE).

Before you can run any ClusterJPA code, you first need to download and install OpenJPA from http://openjpa.apache.org/ – this tutorial uses OpenJPA 1.2.1. Simply extract the contents of the binary tarball to the host you want to run your application on; for this tutorial, I use /usr/local/openjpa.

Additionally, ClusterJPA must sometimes use JDBC to satisfy certain queries and so “JDBC Driver for MySQL (Connector/J)” should also be installed (it can be downloaded from http://dev.mysql.com/downloads/connector/j/). Again, simply extract the contents of the tarball; for this tutorial the files are stored in /usr/local/connectorj and version 5.1.12 is used.

If the ClusterJ tutorial has already been run on this MySQL Cluster database then drop the tables from the cluster so that you can observe them being created automatically – though in a real application, you may prefer to create them manually.

A configuration file is required to indicate how persistence is to be handled for the application. Create a new directory called META-INF in the application source directory and within there create a file called persistence.xml:

<persistence xmlns="http://java.sun.com/xml/ns/persistence"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0">
 <persistence-unit name="clusterdb" transaction-type="RESOURCE_LOCAL">
  <provider>
   org.apache.openjpa.persistence.PersistenceProviderImpl
  </provider>
  <class>Employee</class>
  <class>Department</class>
  <properties>
   <property name="openjpa.jdbc.SynchronizeMappings" value="buildSchema" />
   <property name="openjpa.ConnectionDriverName"
    value="com.mysql.jdbc.Driver" />
   <property name="openjpa.ConnectionURL"
    value="jdbc:mysql://localhost:3306/clusterdb" />
   <property name="openjpa.ConnectionUserName" value="root" />
   <property name="openjpa.ConnectionPassword" value="" />
   <property name="openjpa.BrokerFactory" value="ndb" />
   <property name="openjpa.jdbc.DBDictionary" value="TableType=ndb" />
   <property name="openjpa.ndb.connectString" value="localhost:1186" />
   <property name="openjpa.ndb.database" value="clusterdb" />
  </properties>
 </persistence-unit>
</persistence>

A persistence unit called ‘clusterdb’ is created; the provider (the implementation of the persistence API) is set to OpenJPA (as opposed to, for example, Hibernate). Two classes are specified, ‘Employee’ and ‘Department’, which are the persistent classes that the application will define. Connector/J is defined as the JDBC driver (together with the host and the port of the MySQL Server to be used). The key to having OpenJPA use ClusterJPA is to set the BrokerFactory to ndb and specify the connect string (host:port) for the MySQL Cluster management node. The database is defined to be ‘clusterdb’ for both the JDBC and ClusterJ connections. The engine type when creating tables is set to ndb.
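Because a single typo in persistence.xml only shows up once OpenJPA starts, it can help to check the file first. The following is a minimal sketch, using only the JDK's XML parser, that lists the properties and checks for the three ClusterJPA-specific settings described above. The class name PersistenceXmlCheck and its methods are hypothetical helpers written for this illustration; they are not part of OpenJPA or ClusterJPA.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for this tutorial; not part of OpenJPA or ClusterJPA.
class PersistenceXmlCheck {

    // Parses the XML text and returns every <property name=... value=...> pair.
    // Malformed XML (for example a missing '>') is reported as an
    // IllegalArgumentException that wraps the parser's own error.
    static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                props.put(p.getAttribute("name"), p.getAttribute("value"));
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("persistence.xml could not be parsed", e);
        }
    }

    // True when the settings this tutorial relies on for ClusterJPA are present:
    // the ndb BrokerFactory plus the management node connect string and database.
    static boolean looksLikeClusterJpa(Map<String, String> props) {
        return "ndb".equals(props.get("openjpa.BrokerFactory"))
                && props.containsKey("openjpa.ndb.connectString")
                && props.containsKey("openjpa.ndb.database");
    }

    public static void main(String[] args) throws Exception {
        // Default location assumes the program is run from the source directory.
        String path = args.length > 0 ? args[0] : "META-INF/persistence.xml";
        String xml = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        Map<String, String> props = readProperties(xml);
        props.forEach((name, value) -> System.out.println(name + " = " + value));
        System.out.println("ClusterJPA settings present: " + looksLikeClusterJpa(props));
    }
}
```

Run it from the application source directory (or pass the path to persistence.xml as the first argument). It only checks that the settings are present; it does not verify that the management node or MySQL Server is reachable.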

If you have not already done so, create the ‘clusterdb’ database (if it already contains tables from the ClusterJ tutorial then drop them):

mysql> create database clusterdb;

The next step is to create the persistent class definitions for the Department and Employee entities:

Department.java:

import javax.persistence.*;

@Entity(name = "department")
public class Department {
  private int Id;
  private String Site;

  public Department(){}

  @Id public int getId() {return Id;}
  public void setId(int id) {Id=id;}

  @Column(name="location")    
  public String getSite() {return Site;}
  public void setSite(String site) {Site=site;}

  public String toString() {
    return "Department: " + getId() + " based in " + getSite();
  }
}

Using the @Entity annotation, the entity name is specified to be ‘department’; as no @Table annotation is given, the table name defaults to the entity name. Note that unlike ClusterJ, ClusterJPA uses persistent classes (rather than interfaces) and so it is necessary to define the fields as well as the getter/setter methods. The primary key is defined using the @Id annotation, and the @Column annotation specifies that the column associated with the Site property should be called ‘location’.

As this is a class, it is possible to add other useful methods – in this case toString().
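Because the annotations are placed on the getters, JPA uses property access, and the persistent property names follow JavaBeans naming of the getters rather than the (capitalised) field names. That is why the query later in this tutorial refers to x.department and x.id. The sketch below illustrates the naming rule with the JDK's java.beans.Introspector; this is only a stand-in for illustration, not how OpenJPA itself resolves names, and the nested Department class is a copy of the one above with the JPA annotations removed so that it compiles without the JPA jar.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: shows JavaBeans property naming, not OpenJPA internals.
class PropertyNames {

    // Copy of the tutorial's Department class without the JPA annotations.
    static class Department {
        private int Id;
        private String Site;

        public int getId() { return Id; }
        public void setId(int id) { Id = id; }

        public String getSite() { return Site; }
        public void setSite(String site) { Site = site; }
    }

    // Returns the JavaBeans property names of a class, sorted alphabetically.
    // Object.class is the stop class, so getClass() is not reported.
    static Set<String> propertyNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors()) {
                names.add(pd.getName());
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints [id, site]: derived from getId()/getSite(), not from the fields.
        System.out.println(propertyNames(Department.class));
    }
}
```

The same rule explains the default column names seen later in the employee table (id, first, last and so on), with only the properties carrying an @Column annotation being renamed.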

Employee.java:

import javax.persistence.*;
@Entity(name = "employee") //Name of the table
public class Employee {
 private int Id;
 private String First;
 private String Last;
 private String City;
 private String Started;  
 private String Ended;  
 private int Department;

 public Employee(){}

 @Id public int getId() {return Id;}
 public void setId(int id) {Id=id;}

 public String getFirst() {return First;}
 public void setFirst(String first) {First=first;}

 public String getLast() {return Last;}
 public void setLast(String last) {Last=last;}

 @Column(name="municipality")  
 public String getCity() {return City;}
 public void setCity(String city) {City=city;}

 public String getStarted() {return Started;}
 public void setStarted(String date) {Started=date;}

 public String getEnded() {return Ended;}
 public void setEnded(String date) {Ended=date;}

 public int getDepartment() {return Department;}
 public void setDepartment(int department) {Department=department;}

 public String toString() {
  return getFirst() + " " + getLast() + " (Dept " +
  getDepartment()+ ") from " + getCity() +
  " started on " + getStarted() + " & left on " + getEnded();
 }
}

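The next step is to write the application code, which we step through here block by block. The first block contains the import statements and then creates the EntityManagerFactory, the EntityManager and an EntityTransaction before pausing so that the tables can be checked: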

Main.java (part 1):

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.EntityTransaction;
import javax.persistence.Persistence;
import javax.persistence.Query;
import java.io.*;
public class Main {
public static void main (String[] args) throws java.io.IOException {
 EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory("clusterdb");
 EntityManager em = entityManagerFactory.createEntityManager();
 EntityTransaction userTransaction = em.getTransaction();
 BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
 System.out.println("The tables will now have been created - check through SQL.");
 System.out.println("mysql> use clusterdb;");
 System.out.println("mysql> show tables;");
 System.out.println("Hit return when you are done");
 String ignore = br.readLine();

As part of creating the EntityManagerFactory and EntityManager, OpenJPA creates the tables for the two classes specified for the ‘clusterdb’ persistence unit. While the application waits for the user to press return, this can be checked:

mysql> use clusterdb
mysql> show tables;
+---------------------+
| Tables_in_clusterdb |
+---------------------+
| department          |
| employee            |
+---------------------+

After hitting return, the application can create an Employee object and then persist it – at which point it will be stored in the ‘employee’ table. A second Employee object is then created and populated with the data read back from the database (using a primary key look up on the Id property with a value of 1):

Main.java (part 2):

 userTransaction.begin();
 Employee emp = new Employee();
 emp.setId(1);
 emp.setDepartment(666);
 emp.setFirst("Billy");
 emp.setLast("Fish");
 emp.setStarted("1st February 2009");
 em.persist(emp);
 userTransaction.commit();
 userTransaction.begin();
 Employee theEmployee = em.find(Employee.class, 1);
 userTransaction.commit();
 System.out.println(theEmployee.toString());
 System.out.println("Chance to check the database before City is set");
 System.out.println("Hit return when you are done");
 ignore = br.readLine();

The Employee object read back from the database is displayed:

Billy Fish (Dept 666) from null started on 1st February 2009 & left on null
Chance to check the database before City is set
Hit return when you are done

At this point, the application waits to give the user a chance to confirm that the Employee really has been written to the database:

mysql> select * from employee;
+----+--------------+------------+-------+-------+------+-------------------+
| id | municipality | department | ended | first | last | started           |
+----+--------------+------------+-------+-------+------+-------------------+
|  1 | NULL         |        666 | NULL  | Billy | Fish | 1st February 2009 |
+----+--------------+------------+-------+-------+------+-------------------+

After hitting return, the application continues and an update is made to the persisted Employee object – note that there is no need to explicitly ask for the changes to be persisted, this happens automatically when the transaction is committed:

Main.java (part 3):

 userTransaction.begin();
 theEmployee.setCity("London");
 theEmployee.setDepartment(777);
 userTransaction.commit();
 System.out.println("Chance to check the City is set in the database");
 System.out.println("Hit return when you are done");
 ignore = br.readLine();

At this point, the application again waits while the user has a chance to confirm that the changes did indeed get written through to the database:

mysql> select * from employee;
+----+--------------+------------+-------+-------+------+-------------------+
| id | municipality | department | ended | first | last | started           |
+----+--------------+------------+-------+-------+------+-------------------+
|  1 | London       |        777 | NULL  | Billy | Fish | 1st February 2009 |
+----+--------------+------------+-------+-------+------+-------------------+

When allowed to continue, the application creates and persists an additional 100 Employee and 100 Department entities (numbered 700 to 799). It then creates and executes a query to find all employees with a department number of 777 and, for each one, looks up the location of the site for that department.

Main.java (part 4):

 Department dept;
 userTransaction.begin();
 for (int i=700;i<800;i++) {
  emp = new Employee();
  dept = new Department();
  emp.setId(i+1000);
  emp.setDepartment(i);
  emp.setFirst("Billy");
  emp.setLast("No-Mates-"+i);
  emp.setStarted("1st February 2009");
  em.persist(emp);
  dept.setId(i);
  dept.setSite("Building-"+i);
  em.persist(dept);
 }
 userTransaction.commit();
 userTransaction.begin();
 Query q = em.createQuery("select x from Employee x where x.department=777");
 Query qd;
 for (Employee m : (List<Employee>) q.getResultList()) {
  System.out.println(m.toString());
  qd = em.createQuery("select x from Department x where x.id=777");
  for (Department d : (List<Department>) qd.getResultList()) {
   System.out.println(d.toString());
  }
 }
 userTransaction.commit();

These are the results displayed:

Billy No-Mates-777 (Dept 777) from null started on 1st February 2009 & left on null
Department: 777 based in Building-777
Billy Fish (Dept 777) from London started on 1st February 2009 & left on null
Department: 777 based in Building-777

Note that joins between tables are possible with JPA but that is beyond the scope of this tutorial.

Finally, the EntityManager and EntityManagerFactory are closed:

Main.java (part 5):

  em.close();
  entityManagerFactory.close();
 }
}

Compiling and running the ClusterJPA tutorial code

javac -classpath /usr/local/mysql/share/mysql/java/clusterjpa.jar:/usr/local/openjpa/openjpa-1.2.1.jar:/usr/local/openjpa/lib/geronimo-jpa_3.0_spec-1.0.jar:. Main.java Employee.java Department.java
java -Djava.library.path=/usr/local/mysql/lib/ -classpath /usr/local/mysql/share/mysql/java/clusterjpa.jar:/usr/local/openjpa/openjpa-1.2.1.jar:/usr/local/openjpa/lib/*:/usr/local/connectorj/mysql-connector-java-5.1.12-bin.jar:. Main 
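As the class-paths depend on where (and which versions of) MySQL Cluster, OpenJPA and Connector/J were installed, a mistyped or missing jar is an easy mistake to make. The following is a small sketch, using only the JDK, that reports classpath entries that do not exist on disk. ClasspathCheck is a hypothetical helper written for this illustration; it assumes the ':' separator used in the commands above and treats a trailing "/*" as "all jars in this directory".

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for this tutorial; not part of any MySQL or OpenJPA tool.
class ClasspathCheck {

    // Returns the entries of a ':'-separated classpath that do not exist.
    // The separator is hard-coded to ':' to match the commands in this tutorial.
    static List<String> missingEntries(String classpath) {
        List<String> missing = new ArrayList<>();
        for (String entry : classpath.split(":")) {
            if (entry.isEmpty()) {
                continue; // ignore empty entries such as a doubled separator
            }
            // For a wildcard entry, check that the directory itself exists.
            String path = entry.endsWith("/*")
                    ? entry.substring(0, entry.length() - 2)
                    : entry;
            if (!new File(path).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // With no argument, check the classpath this JVM was started with.
        String classpath = args.length > 0 ? args[0] : System.getProperty("java.class.path");
        List<String> missing = missingEntries(classpath);
        if (missing.isEmpty()) {
            System.out.println("All classpath entries exist");
        } else {
            missing.forEach(entry -> System.out.println("Missing: " + entry));
        }
    }
}
```

Pass the same string that is given to -classpath as the first argument. The check only confirms that the files and directories exist, not that they contain the expected classes.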

Download the source code for this tutorial from here (together with the code for the previous ClusterJ tutorial).