Monday, March 1, 2010

Notes about Groovy's DataSet

I have blogged before about GroovySql and I find that it meets my needs nicely. However, some people prefer to use Groovy's DataSet class which the Groovy 1.7.1 API documentation describes as "an enhancement of Groovy's Sql class providing support for accessing and querying databases using POGO fields and operators rather than JDBC-level API calls and RDBMS column names."

Before getting to the DataSet example, I'll quickly demonstrate using GroovySql used to get the IT (department 60) employees from the HR sample schema in most Oracle databases.


#!/usr/bin/env groovy

// Need to include the DataSource class on the classpath. Might look like this:
//
// groovy -cp C:\app\Dustin\product\11.1.0\db_1\jdbc\lib\ojdbc6.jar getEmployeesViaDataSet.groovy

import groovy.sql.Sql
sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:orcl", "hr", "hr",
"oracle.jdbc.pool.OracleDataSource")
sql.eachRow("SELECT employee_id, last_name, first_name from employees where department_id = 60")
{
println "Employee of Department 200 is ${it.first_name} ${it.last_name}."
}


The IT's department number is hard-coded into the query, but it works for this simple example and the output lists five employees in the IT department.


Employee of Department 60 is Alexander Hunold.
Employee of Department 60 is Bruce Ernst.
Employee of Department 60 is David Austin.
Employee of Department 60 is Valli Pataballa.
Employee of Department 60 is Diana Lorentz.


This same result can be achieved with the Groovy DataSet as shown next.


#!/usr/bin/env groovy

// Need to include the DataSource class on the classpath. Might look like this:
//
// groovy -cp C:\app\Dustin\product\11.1.0\db_1\jdbc\lib\ojdbc6.jar getEmployeesViaDataSet.groovy

import groovy.sql.DataSet
import groovy.sql.Sql
sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:orcl", "hr", "hr",
"oracle.jdbc.pool.OracleDataSource")
def employee = new DataSet(sql, 'EMPLOYEES')
def itEmployees = employee.findAll { it.department_id = 60 }
itEmployees.each{ println "Employee of Department 60 is ${it.first_name} ${it.last_name}" }


Comparing the second example (using DataSet) to the first example (using GroovySql directly) shows that the biggest difference is between using SQL/JDBC statements with GroovySql and using a more Java/Groovy-like syntax with the DataSet approach.

Now, let's replace the hard-coded 60 with an ID passed into the script as a command-line argument. Here is the version using GroovySql, followed by its output.


#!/usr/bin/env groovy

// Need to include the DataSource class on the classpath. Might look like this:
//
// groovy -cp C:\app\Dustin\product\11.1.0\db_1\jdbc\lib\ojdbc6.jar getEmployeesViaDataSet.groovy

if (!args) { println "Enter a Department ID!"; System.exit(-1); }

import groovy.sql.DataSet
import groovy.sql.Sql
departmentId = args[0]
sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:orcl", "hr", "hr",
"oracle.jdbc.pool.OracleDataSource")
sql.eachRow("SELECT employee_id, last_name, first_name from employees where department_id = ${departmentId}")
{
println "Employee of Department ${departmentId} is ${it.first_name} ${it.last_name}."
}




A first cut at using DataSet might look like this:


#!/usr/bin/env groovy

// Need to include the DataSource class on the classpath. Might look like this:
//
// groovy -cp C:\app\Dustin\product\11.1.0\db_1\jdbc\lib\ojdbc6.jar getEmployeesViaDataSet.groovy

if (!args) {println "Enter a Department ID!"; System.exit(-1);}

import groovy.sql.DataSet
import groovy.sql.Sql
departmentId = args[0]
def employee = new DataSet(sql, 'EMPLOYEES')
def itEmployees = employee.findAll { it.department_id = departmentId }
itEmployees.each{ println "Employee of Department ${departmentId} is ${it.first_name} ${it.last_name}" }


This doesn't work so well, as shown in the next screen snapshot:



The error message shown in the above screen snapshot states "Caught: groovy.lang.MissingPropertyException: No such property: sql for class: ..." In an effort to deal with this, I add static typing to the department ID variable as demonstrated in the next code listing.


#!/usr/bin/env groovy

if (!args) {println "Enter a Department ID!"; System.exit(-1);}

// Need to include the DataSource class on the classpath. Might look like this:
//
// groovy -cp C:\app\Dustin\product\11.1.0\db_1\jdbc\lib\ojdbc6.jar getEmployeesViaDataSet.groovy

import groovy.sql.DataSet
import groovy.sql.Sql
int departmentId = args[0]
//sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:orcl", "hr", "hr",
// "oracle.jdbc.pool.OracleDataSource")
//sql.eachRow("SELECT employee_id, last_name, first_name from employees where department_id = ${departmentId}")
//{
// println "Employee of Department ${departmentId} is ${it.first_name} ${it.last_name}."
//}
def employee = new DataSet(sql, 'EMPLOYEES')
def itEmployees = employee.findAll { it.department_id = departmentId }
itEmployees.each{ println "Employee of Department 60 is ${it.first_name} ${it.last_name}" }


Unfortunately, there is still an error as shown in the next screen snapshot.



The error is simply a result of not casting the String argument to an integer. The next code listing addresses that.


#!/usr/bin/env groovy
// Need to include the DataSource class on the classpath. Might look like this:
//
// groovy -cp C:\app\Dustin\product\11.1.0\db_1\jdbc\lib\ojdbc6.jar getEmployeesViaDataSet.groovy

if (!args) {println "Enter a Department ID!"; System.exit(-1);}

import groovy.sql.DataSet
import groovy.sql.Sql
int departmentId = args[0] as Integer
sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:orcl", "hr", "hr",
"oracle.jdbc.pool.OracleDataSource")
def employee = new DataSet(sql, 'EMPLOYEES')
def itEmployees = employee.findAll { it.department_id = departmentId }
itEmployees.each{ println "Employee of Department ${departmentId} is ${it.first_name} ${it.last_name}" }


The output from trying to run this version is shown next. In this case, the error message states: "java.sql.SQLSyntaxErrorException: ORA-00936: missing expression." The departmentId variable is obviously not working in the closure.



One of the limitations of the Groovy DataSet is that constants (numbers and literal strings) must be used in construction of the narrowing clauses. I often need my SQL/JDBC statements to be dynamically populated and in such cases the GroovySql is the better choice. Although Groovy DataSet supports inserts in addition to queries (see its add method), the DataSet does not support updates or deletes.

There is one final disadvantage of DataSet that is fairly well advertised. Code using the DataSet must be included on the classpath whether it is a script or a class. Up until now, I've run the script from the same directory in which the script resides, so it was automatically on the classpath. The next screen snapshot indicates what happens when I run the script from a different directory.



The error states: "groovy.lang.GroovyRuntimeException: Could not find the ClassNode for MetaClass: org.codehaus.groovy.runtime.metaclass.ClosureMetaClass@1a0b53e[class getEmployeesViaDataSet$_run_closure1]". This error is documented in bug GROOVY-2450, bug GROOVY-1877, and in Groovy Goodness: Groovy SQL Dataset. Perhaps most obviously, even the API documentation for DataSet warns: "Currently, the Groovy source code for any accessed POGO must be on the classpath at runtime." The warning applies to code run as script or as a class.

The next screen snapshot demonstrates that including the directory in which the script presides directly on the classpath takes care of the previously encountered "Could not find ClassNode for MetaClass" error.




One of the interesting features of the DataSet is that it provides a getSql() method and a getParameters() method that make it easy to see the SQL statement underlying a DataSet along with the parameters it uses. Similarly the rows() method returns the result rows from a query.


Conclusion

Groovy's DataSet provides some nice advantages such as a more object friendly syntax for those who are SQL-adverse and providing easy access to the SQL that underlies it. However, the DataSet also has some disadvantages from the small nuisance of needing to include the code using DataSet on the classpath to more significant drawbacks of not being able to use it for updates/deletes and not being able to use dynamic values to narrow searches. I find GroovySql easy enough to use and appreciate its consistency in satisfying different needs. Therefore, I generally prefer GroovySql, but I also understand that others may find DataSet's advantages make it more attractive.

1 comment:

Unknown said...

Agreed, update/delete would make dataset far more useful.

As a getter, thankful for its existence, but am grumpy over lack of update/delete, will have to hack those methods in...