How to integrate Groovy and Talend

Talend provides two components for using Groovy in jobs: tGroovy and tGroovyFile.

tGroovy executes inline Groovy code, and tGroovyFile launches an external Groovy script.

However, both components have a severe drawback: they run at the beginning of the job and cannot process rows in the middle of a job.

Recently, I wanted to implement fairly complicated logic to process a row. Moreover, I wanted to be able to apply that logic to rows with different schemas.

I could have used tJavaFlex, but then I would have had to recode the logic for every job because of the schema differences.

I could have used a custom Java method, but the logic was much simpler to code in Groovy. I needed dynamic expressions like row.${attribute}, and this is far easier to do in Groovy than with the Java reflection API.
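To give an idea of the difference, here is a minimal sketch of dynamic attribute access with the Java reflection API. The Row class and its fields are hypothetical stand-ins for a Talend row struct (assumed here to expose public fields), and the get and set helpers are names made up for this illustration.

```java
import java.lang.reflect.Field;

public class ReflectionAccess {
    // Hypothetical stand-in for a Talend row struct with public fields.
    public static class Row {
        public String firstName = "John";
        public String name = "Doe";
        public int counter = 0;
    }

    // Reads a public field whose name is only known at runtime.
    public static Object get(Object row, String attribute) {
        try {
            Field field = row.getClass().getField(attribute);
            return field.get(row);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot read " + attribute, e);
        }
    }

    // Writes a public field whose name is only known at runtime.
    public static void set(Object row, String attribute, Object value) {
        try {
            Field field = row.getClass().getField(attribute);
            field.set(row, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot write " + attribute, e);
        }
    }

    public static void main(String[] args) {
        Row row = new Row();
        // Equivalent of row.counter++ when "counter" is a runtime string.
        set(row, "counter", (Integer) get(row, "counter") + 1);
        // Equivalent of row.name = row.firstName + row.name.
        set(row, "name", (String) get(row, "firstName") + get(row, "name"));
        System.out.println(row.counter + " " + row.name);
    }
}
```

Every read and write goes through a lookup, a cast, and exception handling, which is the overhead the Groovy version avoids.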

I started thinking about a way to integrate Groovy and Talend, and I finally ended up with the following.

The main idea is to put a Groovy object that performs the requested logic into the globalMap. Then, at the appropriate step in the job, one calls the right method on the Groovy object to do the work.

Putting the Groovy object in the globalMap is done using tGroovyFile. The component can be placed anywhere in the job, since it is processed during the BEGIN phase. The content of the Groovy script is as follows:

globalMap.put("GroovyObject", new MyGroovyObject())

class MyGroovyObject {
    def myMethod(row) {
        // Do anything you want with the row,
        // for example update it directly:
        row.counter++
        row.name = row.firstName + row.name
    }
}

The globalMap is bound to the Groovy context.

At the point where one wants to process the row with the Groovy method, just use a tJavaFlex component with the following code:

groovy.lang.GroovyObject groovyObj = (groovy.lang.GroovyObject)globalMap.get("GroovyObject");
groovyObj.invokeMethod("myMethod", rowInput);

where myMethod is the name of the method to call and rowInput is the name of the input row.

The output row of the tJavaFlex will be the row passed to the Groovy method.

Java Swing application on Mac OS X

I spent a few hours trying to get a Java Swing application to work on a MacBook.

The issue was that some Swing components behaved oddly: menu labels disappeared, the content of JComboBox was not rendered correctly, …

Looking in the Java console, I found the following exception:

Exception in thread 

Roller 4.0

I just migrated my blog to Roller 4.0.

Updated

After the migration, I couldn’t log in anymore.

In fact, Roller 3.x does not use password encryption by default, whereas Roller 4 does.

I updated my roller-custom.properties with the following property and everything was back in order.

passwds.encryption.enabled=false 

Packaging a Java Web Start application with Maven 2

There is a Maven plugin for packaging Java Web Start applications: the Webstart Maven plugin.

To create the JNLP file, the documentation suggests running:

mvn webstart:jnlp

However, I sometimes got the following error message:

<jnlp> configuration element missing.

To work around this, just run the same goal with the fully qualified plugin name:

mvn org.codehaus.mojo.webstart:webstart-maven-plugin:jnlp

This solves the error.

I guess this is due to a mismatch between Webstart plugins. I should reset my plugin repository and try again.

Required vs RequiredString, or how to lose an hour

Struts 2 has two types of required validator.

  • required: checks that the value is not null
  • requiredstring: checks that the value is a string and that the string is not empty.

It is important to note the difference between these two validators.

For example, when a text box is bound to a String attribute, Struts sets an empty string when the field is left blank. If the required validator is used to check the form values, the form passes validation because the value is not null.

With the requiredstring validator, the form does not pass validation.
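The difference can be sketched in plain Java. This is only an illustration of the two semantics described above, not the Struts implementation; the class and method names are made up, and trimming whitespace before the emptiness check is an assumption.

```java
public class RequiredChecks {
    // Roughly what "required" checks: the value is not null.
    public static boolean required(Object value) {
        return value != null;
    }

    // Roughly what "requiredstring" checks: a non-null, non-empty string.
    // Trimming before the emptiness check is an assumption of this sketch.
    public static boolean requiredString(Object value) {
        return value instanceof String && !((String) value).trim().isEmpty();
    }

    public static void main(String[] args) {
        String submitted = ""; // what a blank text box yields, per the text above
        System.out.println("required:       " + required(submitted));
        System.out.println("requiredstring: " + requiredString(submitted));
    }
}
```

With a blank text box, the first check accepts the value and the second rejects it, which is exactly the trap described here.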

In conclusion, I strongly recommend not using the required validator for String fields.

Garbage collection tuning on AIX

On the latest IBM JVM for AIX (1.5), the default garbage collection algorithm is the “Mark and Sweep” algorithm.

On a current project, running a web application on JBoss, I had some performance issues. The CPU usage was abnormally high. Even during low load (during the night), the Java process was using nearly 20% of the CPU.

After analyzing the garbage collection trace output, I found that the garbage collection was running every 10-12 s and taking about 1.2 s. Of those 1.2 s, about 1.1 s was spent in the mark phase. The IBM Pattern Modeling and Analysis Tool showed an overall garbage collection overhead of 9%.

I tried different garbage collection policies and finally found that the “gencon” policy seemed to be the best in our case. The overall garbage collection overhead fell below 2% and the average CPU usage dropped by more than 15%.
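For a quick cross-check of such overhead figures from inside the application, the standard java.lang.management API can be used. The sketch below is not how the IBM tool computes its number; it simply divides the accumulated collection time by the JVM uptime, and the class and method names are made up for this illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    // Sums the accumulated collection time (in ms) over all collectors.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    // GC time as a percentage of elapsed time.
    public static double overheadPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead since startup: %.2f%%%n",
                overheadPercent(totalGcMillis(), uptime));
    }
}
```

As a rough sanity check on the numbers above, overheadPercent(1200, 12000) gives 10.0, the same order of magnitude as the 9% the tool reported. On IBM JVMs the policy is normally selected with the -Xgcpolicy option (for example -Xgcpolicy:gencon); the exact command line used on this project is not shown here.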

My explanation is that there was a large number of long-lived objects, some of them very long-lived (maybe the connection pools). At every run, the mark and sweep algorithm had to mark every object, and that phase was very time-consuming. The generational algorithm (the “gencon” policy) did not process the long-lived objects until the old heap size was exceeded (in my case, once every 10 min). So the number of objects to process at each GC run was smaller and the garbage collection ran more quickly.

I still have one question: in which case would the default setting (the mark and sweep algorithm) be suitable for a server application?

A problem with garbage collection of class definitions?

Yesterday, I talked about the IBM JVM bug I encountered with the XML decoder.

I didn’t mention the type of the bug. In fact, it was quite similar to the bug with the XML encoder we had a few weeks ago.

target instanceof Class was returning false while (target.getClass().getName().equals(Class.class.getName())) was returning true.

This is quite strange. Moreover, it only happened under heavy load.
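The two checks can be put side by side in a small diagnostic. This is only a sketch with made-up class and method names; on a healthy JVM both methods always agree, and a disagreement is the symptom described above.

```java
public class ClassIdentityCheck {
    // Type-based check: what the JDK code relies on.
    public static boolean isClassByType(Object target) {
        return target instanceof Class;
    }

    // Name-based check: compares the runtime class name with "java.lang.Class".
    public static boolean isClassByName(Object target) {
        return target != null
                && target.getClass().getName().equals(Class.class.getName());
    }

    public static void main(String[] args) {
        Object target = String.class;
        boolean byType = isClassByType(target);
        boolean byName = isClassByName(target);
        System.out.println("byType=" + byType + " byName=" + byName);
        if (byType != byName) {
            // Should never happen; this is the anomaly seen under heavy load.
            System.err.println("Inconsistent class identity for " + target);
        }
    }
}
```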

Is there some issue with the garbage collection of class definitions in the IBM JVM for AIX?

Another bug with the IBM JVM

A few weeks ago, I discovered a bug in the IBM JVM during XML encoding with the XML encoder.

Last week, I discovered an equivalent bug in the XML decoding process.

Like a few weeks ago, I decided to work around the bug by “rewriting” the decoding process. I looked at the JVM code and found that the real parser used during XML decoding was not included in the JVM source (as of JDK 1.5).

Fortunately, for a few months now, it has been possible to find more JDK-related source through the OpenJDK.

I just wanted to thank Sun for opening up that source. It helped me work around IBM JVM issues.

Really a nasty bug

It has been a long time since I wrote a post on my blog. I just had a really heavy workload period.

Last week, I spent my whole Sunday chasing a really nasty bug related to the XMLEncoder running on a 64-bit JVM 1.5 on AIX.

The symptoms were as follows: after a certain amount of time, an XML dump done with the XMLEncoder threw a StackOverflowError. What was really strange is that the same object had been dumped correctly a few seconds earlier, and a new dump then threw the error. After some investigation, I came to the conclusion that the problem was due to the PersistenceDelegate used for the class (namely java_lang_Class_PersistenceDelegate). I tried to bypass this default PersistenceDelegate for java.lang.Class objects, but then I got an IncompatibleClassChangeError.

Eventually, I decided to rewrite the XMLEncoder and I found the issue.

In the java_lang_Class_PersistenceDelegate, there is the following code:

protected Expression instantiate(Object oldInstance, Encoder out) {
    Class c = (Class)oldInstance;
    // As of 1.3 it is not possible to call Class.forName("int"),
    // so we have to generate different code for primitive types.
    // This is needed for arrays whose subtype may be primitive.
    if (c.isPrimitive()) {
        Field field = null;
        try {
            field = ReflectionUtils.typeToClass(c).getDeclaredField("TYPE");
        } catch (NoSuchFieldException ex) {
            System.err.println("Unknown primitive type: " + c);
        }
        return new Expression(oldInstance, field, "get", new Object[]{null});
    }
    else if (oldInstance == String.class) {
        return new Expression(oldInstance, "", "getClass", new Object[]{});
    }
    else if (oldInstance == Class.class) {
        return new Expression(oldInstance, String.class, "getClass", new Object[]{});
    }
    else {
        return new Expression(oldInstance, Class.class, "forName", new Object[]{c.getName()});
    }
}

The issue was due to the code:

else if (oldInstance == Class.class) {

Before calling instantiate, oldInstance.toString() returned class org.acme.Foo. However, within the instantiate method, oldInstance == Class.class returned true, which to me is totally wrong.

As a workaround, I rewrote the PersistenceDelegate and replaced the if statement with:

else if (((Class) oldInstance).getName().equals(Class.class.getName())) {

And now it’s working properly.

It was a Sunday I would have preferred to spend outside.

Please don’t set traps for developers

Recently, I had to develop an add-in for Outlook. Of course, I have to admit that I’m not an expert in C#. However, being used to Java, I quickly got comfortable with this new language.

I just found something that made me lose a lot of time.

My add-in had to delete all the contacts from an Outlook contact folder. So, naively, I wrote the following:

foreach (ContactItem contact in contactFolder.Items)
{
    contact.Delete();
}

Simple, isn’t it? But the result is that it doesn’t work. No compilation error, no runtime error, just that only half of the contacts were deleted. After some Google searches to find out what could be wrong, I ended up on the following note: Working with Members of an Items Collection.

In this note, you can read that the elements of an Outlook collection should not be deleted using a classic loop. To delete them, you have to walk the collection in reverse order.

Even if I can’t understand the reason (it seems to be close to what causes a ConcurrentModificationException in Java), I find it really frustrating that nothing warns you about such behavior. For me, a good API should be implemented in such a way that, when a normal-looking use cannot behave as expected, it at least throws an exception.

To finish, here is the code that works:

// Items is indexed from 1, so walk from Count down to 1.
for (int i = contactFolder.Items.Count; i > 0; i--)
{
    ContactItem contact = (ContactItem) contactFolder.Items[i];
    contact.Delete();
}
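The effect can be reproduced in Java with an index-based loop over a list. This is an analogy only: it assumes that the Outlook collection re-indexes its remaining items after each deletion, which the note suggests but which I have not verified, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReverseDelete {
    // Forward deletion: each removal shifts the remaining items down by one,
    // so the next index skips an element and only half of the list is removed.
    public static List<String> deleteForward(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        for (int i = 0; i < copy.size(); i++) {
            copy.remove(i);
        }
        return copy;
    }

    // Backward deletion: removing from the end never moves the items
    // that are still waiting to be visited.
    public static List<String> deleteBackward(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        for (int i = copy.size() - 1; i >= 0; i--) {
            copy.remove(i);
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> contacts = Arrays.asList("a", "b", "c", "d", "e", "f");
        System.out.println("forward leaves:  " + deleteForward(contacts));
        System.out.println("backward leaves: " + deleteBackward(contacts));
    }
}
```

With six entries, the forward loop leaves three of them behind, while the backward loop empties the list. Note that a Java for-each loop over the same list would throw a ConcurrentModificationException instead of failing silently.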