Sunday, March 11, 2007

1952 Vincent Black Lightning

The theme for this weekend: 1952 Vincent Black Lightning. That's because I heard Richard Thompson's song for the first time. It immediately appealed to me, what with its protagonist sharing my name and his girl being red-headed, not to mention some pretty tasty finger picking throughout... but who's this cat, Vincent?

And so began my journey of not only learning what the heck a Vincent is -- it's a famous motorcycle -- but also about Richard Thompson, his technique, and a new open tuning I'd never tried before.

I also learned about Rollie Free and decided that being famous for this photo is darn cool.

And even though I've previously cautioned against it, here is why YouTube is so freaking great:



So I ignored all the stuff I had wanted to get done this weekend and attempted to learn to play the tune. My youngest daughter thinks I'm a bit obsessed with it. Maybe so, but it's fun to sing, and especially funny to hear her sing it.

I ended up with a decent approximation... and a newfound respect for both Richard Thompson and Phil Vincent.

Sunday, March 04, 2007

Testing EJB3 (JPA) with Maven2

This tutorial describes how to test EJB3 beans as a part of a Maven2 build, i.e. without an EJB container but with an in-memory, transactional database. I make the following assumptions:
  • JDK 1.5
  • EJB 3.0, i.e. Java Persistence API (JPA) 1.0
  • Maven 2 (2.0.5)
I don't assume much more than that. Our application contains no explicit references to Spring or Hibernate, as EJB3 itself borrows much of their goodness. We avoid all the XML config usually associated with Spring/Hibernate by using the JPA persistence annotations in our entity beans. To facilitate testing, any injected dependency references, e.g. @PersistenceContext or @EJB, are declared with package access so that the JUnit test classes can set them directly.
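To make that concrete, here's a minimal sketch of what such a bean might look like (ServiceBean, Service, something(), and the Thing entity are hypothetical names used for illustration):

```java
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class ServiceBean implements Service
{
    // Package access, not private: the container still injects it,
    // but a JUnit test in the same package can simply assign it.
    @PersistenceContext
    EntityManager em;

    public boolean something()
    {
        return !em.createQuery ("select t from Thing t").getResultList().isEmpty();
    }
}
```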

As with many enterprise applications, much of our business logic is embedded within database queries, most written in JPQL, some quite complex. It's vital that we're able to test them outside of a container as part of our normal, continuously-integrated build. For that, we use the Hypersonic (hsqldb) in-memory database. We'll need a JPA implementation, of course, and because we deploy to JBoss 4.0 in production, which uses Hibernate's JPA, we choose the same. Of course, any certified JPA implementation should work just as well.

pom.xml

So our first step is to declare our testing dependencies in our POM:

<!-- For in-memory sql testing -->
<dependency>
  <groupId>org.hibernate</groupId>
  <artifactId>hibernate-entitymanager</artifactId>
  <version>3.2.0.ga</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>hsqldb</groupId>
  <artifactId>hsqldb</artifactId>
  <version>1.8.0.7</version>
  <scope>test</scope>
</dependency>

Aside: I toyed with using the new H2 database instead of Hypersonic, but the version I tested, 1.0.20061217, included only table locks, causing my app to deadlock. Hypersonic supports row locking.

Although not strictly necessary, I recommend adding this to your POM, too:

<testResources>
  <testResource>
    <directory>src/test/resources</directory>
    <filtering>true</filtering>
  </testResource>
</testResources>

You'll see why in the next section.

Note that we didn't need to change the POM much. In particular, no special configuration of the Surefire plugin (the testcase invoker) is required.

src/test/resources/META-INF/persistence.xml

All JPA apps need a persistence.xml file. Your app will likely already have one beneath src/main/resources. Both will be in your CLASSPATH when the tests are run. The JPA provider will aggregate them, so you need to make sure all your persistence-units, both real and test, are uniquely named. How you organize your persistence-units is up to you. It's probably easiest to have just one for testing, or you might, for example, prefer to create one per test class.

Here's what I recommend:

<persistence>
  <persistence-unit name="hibernate-hsqldb">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>
    <jar-file>${project.build.directory}/classes</jar-file>
    <properties>
      <property name="hibernate.dialect" value="org.hibernate.dialect.HSQLDialect"/>
      <property name="hibernate.connection.driver_class" value="org.hsqldb.jdbcDriver"/>
      <property name="hibernate.connection.url" value="jdbc:hsqldb:."/>
      <property name="hibernate.connection.username" value="sa"/>
      <property name="hibernate.connection.password" value=""/>
      <property name="hibernate.hbm2ddl.auto" value="create-drop"/>
    </properties>
  </persistence-unit>
</persistence>

Note the jar-file element. The ${...} expansion only works when filtering is turned on in the testResources element of the POM (see above). By setting the jar-file this way, we're telling the persistence provider to find our entities by searching for persistence annotations in our classes. From these, the required DDL is generated and the schema is created. Alternatively, you could use one or more class elements instead. This would give you finer-grained control over which classes and/or packages are involved in a test. This strategy would probably also lead to defining multiple persistence-unit elements.
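If you go the class-element route, a persistence-unit might look something like this instead (the unit name and entity class names here are hypothetical):

```xml
<persistence-unit name="hibernate-hsqldb-orders">
  <provider>org.hibernate.ejb.HibernatePersistence</provider>
  <class>org.yours.Order</class>
  <class>org.yours.Customer</class>
  <properties>
    <!-- same hsqldb properties as above -->
  </properties>
</persistence-unit>
```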

One thing to keep in mind about Hibernate: there's a known bug in which it won't create a schema before creating tables for your entities, so you're in trouble if you've set the schema attribute in your @Table annotations. I recommend removing this attribute from your entities anyway: it's a violation of the DRY principle. A better solution is to set the hibernate.default_schema property in your persistence-unit.
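For example (the schema name here is made up):

```xml
<property name="hibernate.default_schema" value="myschema"/>
```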

One debugging tip: you may want to include the following property in your persistence-unit:

<property name="hibernate.show_sql" value="${hibernate.show_sql}"/>

That way, you could add something along these lines in your POM:

<properties>
  <hibernate.show_sql>false</hibernate.show_sql>
</properties>
<profiles>
  <profile>
    <id>debug</id>
    <properties>
      <hibernate.show_sql>true</hibernate.show_sql>
    </properties>
  </profile>
</profiles>

So when you want to see the SQL produced by your unit tests...

$ mvn -Pdebug test

src/test/resources/import.sql

Some apps make use of Hibernate's ability to run some SQL after a schema is created. If it finds a file named import.sql on the classpath, it'll run the SQL within it. This can be a problem if, for example, your import.sql is exploiting Oracle PL/SQL commands that wouldn't make any sense to Hypersonic. Fortunately, you can easily solve the problem by simply putting a different -- possibly empty -- import.sql file beneath src/test/resources, because Hibernate will find that one first in the CLASSPATH.

Aside: I experienced some problems with a test-specific import.sql file: strange stack traces generated by Hibernate's SchemaExport class were dumped to stdout without actually failing the tests. I ended up truncating src/test/resources/import.sql completely.

src/test/java/org/yours/SomeTest.java

Speaking of unit tests, here's an obviously-contrived example:

package org.yours;

import javax.persistence.*;
import junit.framework.*;
import org.apache.log4j.Logger;

public class SomeTest extends TestCase
{
    public void testService() throws Exception
    {
        log.info ("testService");
        EntityManager em = emf.createEntityManager();
        TestData.build (em);
        ServiceBean slsb = new ServiceBean();
        slsb.em = em;
        assertTrue (slsb.something());
    }

    protected void setUp() throws Exception
    {
        log.debug ("setUp");
        emf = Persistence.createEntityManagerFactory ("hibernate-hsqldb");
    }

    protected void tearDown() throws Exception
    {
        log.debug ("tearDown");
        emf.close();
    }

    private Logger log = Logger.getLogger (getClass());
    private EntityManagerFactory emf;
}

There is a lot of room for artistic freedom within this structure, but the essential point is the create/close of the EntityManagerFactory in the setUp/tearDown methods. This provides a clean database for each testXXX method.

If your transactional requirements are minimal, you may want to create/close an EntityManager member variable in setUp/tearDown, too.

If your seed data requirements are complex, you may want to look into something like DBUnit, but in my experience, it's often easier to construct "builder" objects that can model various situations for your tests.
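To illustrate the builder idea, here's a minimal, hypothetical sketch. In a real test suite, build() would also persist the entity through the EntityManager; plain objects are used here so the example stands alone:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical test-data builder. In real tests, build() would call
// em.persist() on the entity it constructs; here the "entity" is a
// plain object to keep the sketch self-contained.
public class CustomerBuilder
{
    public static class Customer
    {
        public final String name;
        public final List<String> orders;

        Customer (String name, List<String> orders)
        {
            this.name = name;
            this.orders = orders;
        }
    }

    private String name = "anonymous";
    private final List<String> orders = new ArrayList<String>();

    public CustomerBuilder named (String name)
    {
        this.name = name;
        return this;
    }

    public CustomerBuilder withOrder (String sku)
    {
        orders.add (sku);
        return this;
    }

    public Customer build()
    {
        return new Customer (name, new ArrayList<String> (orders));
    }
}
```

One readable line like `new CustomerBuilder().named ("fred").withOrder ("sku-1").build()` can then model a whole situation for a test.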

Transactions

For a lot of test cases, you can probably safely ignore transactions. Within one test, letting each of your beans use the same EntityManager without ever beginning or committing its transaction may go a long way toward testing your app sufficiently, especially if all of your beans are simply propagating the transaction anyway.

But transactions can get icky quickly, especially when you have multiple cooperating session beans with their transaction attribute set to REQUIRES_NEW. I'm going to show you one way of solving the problem using Java's dynamic proxies, but there's probably a more elegant solution.

Here's something I call a TransactionProxy. Others might call it "a good argument for AOP". It assumes your target bean class has an EntityManager member named em.

import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;

public class TransactionProxy implements InvocationHandler
{
    private EntityManagerFactory emf;
    private Object target;
    private Field field;

    private TransactionProxy (Object target, EntityManagerFactory emf)
    {
        this.emf = emf;
        this.target = target;
        try {
            this.field = target.getClass().getDeclaredField ("em");
            this.field.setAccessible (true);
        } catch (Exception e) {
            throw new RuntimeException (e);
        }
    }

    public static Object newInstance (Object target, EntityManagerFactory emf)
    {
        return Proxy.newProxyInstance (target.getClass().getClassLoader(),
                                       target.getClass().getInterfaces(),
                                       new TransactionProxy (target, emf));
    }

    public Object invoke (Object proxy, Method m, Object[] args)
        throws Throwable
    {
        EntityManager em = emf.createEntityManager();
        try {
            field.set (target, em);
            em.getTransaction().begin();
            Object result = m.invoke (target, args);
            em.getTransaction().commit();
            return result;
        } catch (InvocationTargetException e) {
            em.getTransaction().rollback();
            throw e.getTargetException();
        } catch (Exception e) {
            throw new RuntimeException (e);
        } finally {
            em.close();
        }
    }
}

The idea is that you create it with an instance of, say, a stateless session bean (SLSB) and an EntityManagerFactory; subsequent invocations on the proxy will wrap calls to the real bean inside a transaction obtained from the EntityManagerFactory. For example:

Service service = (Service) TransactionProxy.newInstance (new ServiceBean(), emf);

This allows you to combine your beans in any number of transactional contexts. But it's not optimal since it's not actually using the @TransactionAttribute annotation of the classes under test. Hopefully, that's what the more elegant solution mentioned above is doing.

I'll update this tutorial after I've confirmed that. Until then, happy hacking!

From Gentoo+Postfix+Courier to Debian+Exim+Dovecot

Ok, let's say -- hypothetically, of course -- you chose Gentoo for your home email server a few years ago. You found one of Gentoo's great docs on setting up Postfix, Courier and SpamAssassin. You accepted all the reasonable default options during the install, and things seemed to work great.

Over time, you began to take the little email server for granted. Oh sure, you'd occasionally login and "emerge" some security updates, but the dang thing just worked. Why bother it?

Of course, entropy is a cold bitch, and "for granted taking" turned to neglect. After a while, you began to forget all the arcane portage commands, having to use rpms at work and happily experimenting with Ubuntu's adopted dpkg on your desktop. In the beginning, spam was under control, but it's a constant battle, and forgetting your distro's packaging commands won't exactly help you in that fight.

And then terror struck: the Gentoo maintainers decided to release a portage update that was incompatible with the version running on your little email server. And at that precise version did your server remain for the rest of its life. The prospect of recompiling your entire system was just too much to bear. You decided it was a good time to look for another distro, maybe in a couple weeks or so, when you're not so busy...

Well, a couple of weeks turned into a couple of years. And over that time, you came to realize the benefits of the maildir storage format over the traditional mbox one, the latter being one of those reasonable defaults mentioned above. More importantly, you began to appreciate apt/dpkg as "portage without all the compiling", and you became aware of Debian's reputation for server stability.

One day when the spam/ham ratio was particularly high, you decided the time had finally come to switch the ol' home server from Gentoo to Debian. You decided you'd build a new box, transfer the family email over and convert it to the Maildir format. Just because you're a total dumbass-glutton-for-punishment, you also decided to switch MTAs (Postfix to Exim) and IMAP daemons (Courier to Dovecot). "Why the hell not?" you wondered.

Um... I'm not sure I can keep up the hypothetical perspective any longer. I'm thinking you were seeing right through it, anyway. Sorry.

So you -- I mean, I -- found a cheap Dell P3 on eBay for $33+s/h, installed Debian Etch, and then fired up aptitude...

Exim

By default, Debian installs exim4-daemon-light, which is fine for most purposes, but exim4-daemon-heavy is required for integration with most spam/virus detection packages and, despite its name, it's really not that heavy.

# aptitude install exim4-daemon-heavy

Configuring Exim is easy enough, though it took me a while to grok the split vs non-split formats. I wasted a lot of time googling for docs/howtos: I wish I had just started with /usr/share/doc/exim4/README.Debian.gz like the maintainers suggest. I don't know why Google results seem so much more glamorous than the packaged docs.

I went through the normal routine:

# dpkg-reconfigure exim4-config

I chose non-split, mail sent by smarthost, received via SMTP, and accepted mail for *.internal.domain;internal.domain;external.domain. Your -- my? -- domain names will differ, of course. I chose to relay mail for my local network, set the name of the outgoing smarthost, hid the local mail names making only my external domain visible, and made sure to set Maildir as my delivery format.

I then created user accounts for all my family members, and sent them some test mail from a box configured to relay outgoing mail to my new server. In addition to testing the above config, it also caused the Maildir structures to be created in the users' home directories. Easy-peasy-lemon-squeezy.

Something else cool: I can add entries to /etc/aliases and exim will pick them up immediately without restarting or rerunning newaliases or some such. I do that for extended relatives so I don't have to remember their real email addresses, only their alias, e.g. unclebob@crossleys.org. Let me know if you want one.
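For instance, a couple of entries in /etc/aliases might look like this (the names and destination addresses here are made up):

```
# local alias    real destination
unclebob: bob@example.com
auntsue: sue@example.com
```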

Dovecot

So delivery is working. Now I need an IMAP server for retrieval.

# aptitude install dovecot-imapd

Configuring Dovecot was simple: I just needed to enable plaintext logins (I'm not exposed to the bad ol' Internet) and the imap protocol. Here's a diff of my changes to /etc/dovecot/dovecot.conf:

21c21
< protocols =
---
> protocols = imap
46c46
< #disable_plaintext_auth = yes
---
> disable_plaintext_auth = no

After saving those changes, a restart is required:

# /etc/init.d/dovecot restart

imapsync

Now comes the hard part: how to move my email from the old server to the new one? After googling a bit, I came across a utility called mb2md, but its interface is a little weird: it seemed to expect files to be in certain places and seemed geared more toward a single user than a group. Somewhat discouraged, I restarted my search and came across imapsync. This discovery simplified my approach: instead of copying mbox files (both spooled and saved) from the old server to the new and *then* converting them, I could simply encapsulate the format differences behind the IMAP interface, i.e. imapsync allows me to copy email from one IMAP server to another, irrespective of the underlying format of each! Neat-Oh!

Installing imapsync is easy:

# aptitude install imapsync

Using it takes some practice. Fortunately it provides a --dry option for just that purpose. Here's an example of my Courier-to-Dovecot imapsync incantations for a particular user. I needed to invoke it twice: once for the main inbox, and again for the categories of saved mail I kept beneath the ~/mail directory on the old server:

imapsync --host1 oldbox --user1 jim --password1 oldpass \
         --host2 newbox --user2 jim --password2 newpass \
         --folder INBOX
imapsync --host1 oldbox --user1 jim --password1 oldpass \
         --host2 newbox --user2 jim --password2 newpass \
         --include "^mail/.+"

The nice thing about imapsync is that it's a true sync. You can run it multiple times and it only copies over what's changed. This allowed me to be pretty sloppy about when I actually updated my firewall to send SMTP requests to the new server.

ClamAV

Ok, so we've (we, as in me, not you) accomplished a lot. The only thing remaining is to reject viruses and spam. ClamAV will handle the viruses:

# aptitude install clamav-daemon

To integrate with Exim, two changes to /etc/exim4/exim4.conf.template were required. I set av_scanner to this:

av_scanner = clamd:/var/run/clamav/clamd.ctl

And uncommented these lines (presumably put there by exim4-daemon-heavy):

deny
  malware = *
  message = This message was detected as possible malware ($malware_name).

After running update-exim4.conf, restarting exim and sending a fake virus (as described here), I noticed some "lstat() failed" errors in the mainlog. Smells like a perms problem. To fix that, I did this:

# adduser clamav Debian-exim
# /etc/init.d/clamav-daemon restart

SpamAssassin, er... greylistd

At this point in our story, I expected to tell you how easy it was to integrate SpamAssassin with our (my) new setup, but frankly I haven't done it yet.

Instead, I discovered greylisting. This has eliminated 95% of my spam. For this reason, I'm holding off on SpamAssassin integration for the time being.

The greylistd package provides a convenient script that completely idiot-proofs exim4 integration:

# aptitude install greylistd
# greylistd-setup-exim4 add

Simple, huh?

Conclusion

That's it. We're done. Whew!

Now I have a stable home email server that I can actually update and try to secure. Sweet.