Extending the simple Java/Jetty/Guice/Jersey/Jackson web stack with automatic Jersey resource method metrics

Several months ago I put up a tutorial on using Jetty, Jersey, Jackson and Guice to make a small JSON web service stack. The code is more of a proof of concept than anything else, but we’re going to be giving it a few more real-world features today by adding the use of the Metrics library.

As a quick review, the project is a web service that allows accessing peanut butter & jelly sandwiches over HTTP. It has a primitive “stats” resource to allow you to see how many sandwiches have been made and how much peanut butter & jelly was used. This stats facility is more of a demo of how to wire stuff together with Guice and how requests are routed to JAX-RS resources than a useful way to monitor usage, though.

We’re going to do two things to improve the situation. First, we’ll re-implement the number of sandwiches and amount of PB&J tracking using individual Counters and Gauges. Second, we’ll add code to automatically create various useful metrics for every resource method via Jersey ResourceFilter, ResourceFilterFactory, ContainerResponseFilter, ResourceMethodDispatchAdapter, and ResourceMethodDispatchProvider implementations. Fun! :)

As before, I’ll outline the code changes in this article, but the complete code is in the repo for your browsing pleasure.

Reimplementing Sandwich stats

The SandwichStats object (a singleton) keeps track of the number of sandwiches and the amount of PB & J used to make them. This data is exposed through SandwichStatsResource, which responds to a GET of /sandwich/stats with data like {"sandwichesMade":2,"gramsOfJam":200,"gramsOfPeanutButter":400}. We want to keep track of those three numbers with metrics now as well. Metrics (the library) offers many different types of metrics (the concept). For these three numbers, there are two ways we could do it. We could use a Gauge to expose the numbers that SandwichStats is already tracking internally, or we could use a Counter that we increment whenever we increment the fields that SandwichStats already has. To demonstrate the use of Gauges and Counters, we’ll do some of both.

When creating a metric object, you can use a MetricsRegistry instance or the static methods on the Metrics class. If you have multiple applications running inside the same JVM, or need that separation of metrics for other reasons, use MetricsRegistry. If you aren’t doing that, you can use the static methods on Metrics, but we’ll go ahead and use a MetricsRegistry because it’s just as easy as using the Metrics class when we have Guice to help us.

First, add a binding for MetricsRegistry in a Guice module:

    bind(MetricsRegistry.class).toInstance(Metrics.defaultRegistry());

We’ll inject a MetricsRegistry into the SandwichStats ctor:

private final Counter jamCounter;
private final Counter pbCounter;
 
@Inject
SandwichStats(MetricsRegistry metricsRegistry) {
    metricsRegistry.newGauge(SandwichStats.class, "sandwich-count", new Gauge<Integer>() {
        @Override
        public Integer value() {
            return sandwichesMade;
        }
    });
    jamCounter = metricsRegistry.newCounter(SandwichStats.class, "grams-of-jam");
    pbCounter = metricsRegistry.newCounter(SandwichStats.class, "grams-of-pb");
}

Then, we can update the existing recordSandwich method:

synchronized void recordSandwich(Sandwich sandwich) {
    sandwichesMade++;
    gramsOfJam += sandwich.getGramsOfJam();
    gramsOfPeanutButter += sandwich.getGramsOfPeanutButter();
 
    jamCounter.inc(sandwich.getGramsOfJam());
    pbCounter.inc(sandwich.getGramsOfPeanutButter());
}

We don’t have to do anything for the “sandwich-count” metric because it’s a Gauge that references the existing field. The other two metrics are Counters, and therefore need to be updated separately.
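To make the Gauge/Counter difference concrete without pulling in the Metrics library, here is a minimal JDK-only sketch. Everything in it is a stand-in chosen for illustration: SandwichStatsSketch is a hypothetical class (not the one in the repo), LongSupplier plays the role of a Gauge, and AtomicLong plays the role of a Counter.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical stand-in for SandwichStats. LongSupplier acts like a gauge and
// AtomicLong acts like a counter; neither is the Metrics library's API.
final class SandwichStatsSketch {
    private int sandwichesMade;

    // Gauge-like: reads the existing field whenever someone asks for the value.
    final LongSupplier sandwichCountGauge = () -> {
        synchronized (this) {
            return sandwichesMade;
        }
    };

    // Counter-like: separate state that has to be incremented explicitly.
    final AtomicLong jamCounter = new AtomicLong();

    synchronized void recordSandwich(int gramsOfJam) {
        sandwichesMade++;                 // the gauge sees this with no extra work
        jamCounter.addAndGet(gramsOfJam); // the counter has to be told
    }

    public static void main(String[] args) {
        SandwichStatsSketch stats = new SandwichStatsSketch();
        stats.recordSandwich(100);
        stats.recordSandwich(100);
        System.out.println(stats.sandwichCountGauge.getAsLong() + " sandwiches, "
                + stats.jamCounter.get() + "g of jam");
    }
}
```

The trade-off is the same as in the real code: a gauge adds no bookkeeping to the write path, while a counter duplicates state but does not need access to the object's internals.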

Start up the service’s main method and open JConsole. When you connect to your process, open the MBeans tab and look in the com.teamlazerbeez.http.sandwich package. You should see some SandwichStats MBeans. Expand the grams-of-jam MBean, select the attribute and double click the numeric value to open a graph. Re-run the /sandwich/create request a few times and note that the graph will periodically update.

Automatic Jersey Request Metrics

Using and creating metrics by hand is already useful, but we can do better. We’re going to hook into the Jersey request dispatch mechanism to automatically create metrics for per-status code counts, etc. for every JAX-RS resource method.

In Jersey, resource methods are represented by subclasses of AbstractMethod. AbstractMethod is an abstract class representing a method that is invoked by Jersey. This could be a setter that’s invoked by Jersey’s IoC mechanism or a resource method or other types of methods. We only care about two subclasses: AbstractResourceMethod and AbstractSubResourceMethod (a subclass of AbstractResourceMethod). In JAX-RS, a resource method is one that uses the @Path annotation of its containing class, and a sub-resource method is one that has a @Path annotation of its own.

There are two mechanisms that we’re going to use in the Jersey request invocation pipeline to do this. We’ll do the simpler one first.

Per-resource method counters with ResourceFilterFactory

Think of a Jersey ResourceFilter as filling a subset of the role of a javax.servlet.Filter. It can observe and modify the input and the output of a JAX-RS method invocation, but it does not wrap the actual invocation the way a servlet Filter does. This is still useful, though, since we can observe the HTTP status codes being output and create metrics to expose them.

A ResourceFilter itself doesn’t do much of anything except provide getters for ContainerRequestFilter and ContainerResponseFilter. Those two do the actual work. In our case (observing status codes) we only care about the latter. However, ResourceFilter objects are created by ResourceFilterFactory implementations, so we’ll start there. Here’s part of an implementation of a ResourceFilterFactory (check the source for the full details):

@Override
public List<ResourceFilter> create(AbstractMethod am) {
    // documented to only be AbstractSubResourceLocator, AbstractResourceMethod, or AbstractSubResourceMethod
    if (am instanceof AbstractSubResourceLocator) {
        // not actually invoked per request, nothing to do
        logger.debug("Ignoring AbstractSubResourceLocator " + am);
        return null;
    } else if (am instanceof AbstractResourceMethod) {
        String metricBaseName = getMetricBaseName((AbstractResourceMethod) am);
        Class<?> resourceClass = am.getResource().getResourceClass();
 
        return Lists.<ResourceFilter>newArrayList(
                new HttpStatusCodeMetricResourceFilter(metricBaseName, resourceClass));
    } else {
        logger.warn("Got an unexpected instance of " + am.getClass().getName() + ": " + am);
        return null;
    }
}

What we see here is that if the AbstractMethod is an AbstractResourceMethod, we generate a ‘metricBaseName’ (a prefix common to all metrics tied to that resource method) and return a list of ResourceFilter objects containing a HttpStatusCodeMetricResourceFilter, which is one of our classes. Here’s what that class does (again, check the source).

final class HttpStatusCodeMetricResourceFilter implements ResourceFilter, ContainerResponseFilter {
 
    ....
 
    @Override
    public ContainerResponse filter(ContainerRequest request, ContainerResponse response) {
        Integer status = response.getStatus();
 
        Counter counter = counters.get(status);
        if (counter == null) {
            // despite the method name, this actually will return a previously created metric with the same name
            Counter newCounter = metricsRegistry.newCounter(resourceClass, metricBaseName + " " + status + " counter");
            Counter otherCounter = counters.putIfAbsent(status, newCounter);
            if (otherCounter != null) {
                // we lost the race to set that counter, but shouldn't create a duplicate since Metrics.newCounter will do the right thing
                counter = otherCounter;
            } else {
                counter = newCounter;
            }
        }
 
        counter.inc();
 
        return response;
    }
}

This filter doesn’t modify the response; instead, it just gets the HTTP status, gets the appropriate Counter (creating it if need be), and increments it.
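The get-then-putIfAbsent dance in that filter is easier to see in isolation. Here is a JDK-only sketch of the same lookup pattern, with AtomicLong standing in for a Metrics Counter; StatusCounterSketch and its method names are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical JDK-only analogue of the per-status counter lookup.
final class StatusCounterSketch {
    private final ConcurrentMap<Integer, AtomicLong> counters =
            new ConcurrentHashMap<Integer, AtomicLong>();

    void record(int status) {
        AtomicLong counter = counters.get(status);
        if (counter == null) {
            AtomicLong newCounter = new AtomicLong();
            // putIfAbsent returns the existing value if another thread won the race
            AtomicLong otherCounter = counters.putIfAbsent(status, newCounter);
            counter = (otherCounter != null) ? otherCounter : newCounter;
        }
        counter.incrementAndGet();
    }

    long count(int status) {
        AtomicLong counter = counters.get(status);
        return counter == null ? 0 : counter.get();
    }

    public static void main(String[] args) {
        StatusCounterSketch sketch = new StatusCounterSketch();
        sketch.record(200);
        sketch.record(200);
        sketch.record(404);
        System.out.println("200s: " + sketch.count(200) + ", 404s: " + sketch.count(404));
    }
}
```

On Java 8 and later, counters.computeIfAbsent(status, s -> new AtomicLong()) collapses the lookup into a single call, at the cost of not matching the Java 6 era code shown above.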

We have enough now to be able to hook our status-code-counting filter into Jersey. This is a little ungainly, but basically we’re telling GuiceContainer (the Jersey code that interacts with Guice and Guice Servlet) which ResourceFilterFactory to use. You can specify more than one (comma-separated strings, Class[], etc.; check the docs on ResourceConfig.PROPERTY_RESOURCE_FILTER_FACTORIES), but we have just one here.
Before:

serve("/*").with(GuiceContainer.class);

After:

Map<String, String> guiceContainerConfig = new HashMap<String, String>();
guiceContainerConfig.put(ResourceConfig.PROPERTY_RESOURCE_FILTER_FACTORIES,
    HttpStatusCodeMetricResourceFilterFactory.class.getCanonicalName());
serve("/*").with(GuiceContainer.class, guiceContainerConfig);

Start up the service and check the MBeans. You’ll see SandwichStats, but nothing for the SandwichMakerResource until you use the “/sandwich/create” endpoint. Once you do that, you’ll see a metric named “/sandwich/create GET 200 counter” appear, and it will update appropriately as you make more requests.

Per-resource method timing with ResourceMethodDispatchAdapter and friends

Status code counters are great, but we really want to be able to generate timing info for request invocation, and to get timing information we’d have to use a ThreadLocal in a ContainerRequestFilter/ContainerResponseFilter pair, and that’s not how we roll.

The Metrics library has built-in support for timing method invocations if you annotate the methods you wish to be timed with @Timed. This uses Guice AOP under the hood, which has a few limitations (no final classes or methods, instances must be created by Guice, etc.), so let’s see if we can do it a little less invasively.

We’ll hook into the Jersey method invocation mechanism via ResourceMethodDispatchAdapter and ResourceMethodDispatchProvider. A ResourceMethodDispatchProvider is responsible for creating RequestDispatcher objects for AbstractResourceMethods, and a ResourceMethodDispatchAdapter is a Jersey extension mechanism to allow us to control what ResourceMethodDispatchProvider gets used. Therefore, by registering our ResourceMethodDispatchAdapter, we can wrap the default ResourceMethodDispatchProvider with one that gathers timing info. The code you’ll see here does timing for all resource methods, but you could trivially make it conditional on an annotation on the resource method or class or anything you wish.

We’ll start with our custom RequestDispatcher implementation TimingRequestDispatcher. We want to use whatever dispatcher Jersey provides under the hood, so we wrap that dispatcher and just capture timing info.

@Override
public void dispatch(Object resource, HttpContext context) {
    final TimerContext time = timer.time();
 
    try {
        wrappedDispatcher.dispatch(resource, context);
    } finally {
        time.stop();
    }
}
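The wrap-and-time pattern is not specific to Jersey. As a rough JDK-only analogue (an illustration, not the code in the repo), the hypothetical TimingRunnableSketch below wraps a Runnable instead of a RequestDispatcher and uses System.nanoTime() in place of a Metrics Timer.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical analogue of TimingRequestDispatcher: a decorator that times
// whatever it wraps and records the result even if the wrapped call throws.
final class TimingRunnableSketch implements Runnable {
    private final Runnable wrapped;
    final AtomicLong invocations = new AtomicLong();
    final AtomicLong totalNanos = new AtomicLong();

    TimingRunnableSketch(Runnable wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public void run() {
        long start = System.nanoTime();
        try {
            wrapped.run();
        } finally {
            // runs whether the wrapped call returns normally or throws
            totalNanos.addAndGet(System.nanoTime() - start);
            invocations.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        TimingRunnableSketch timed = new TimingRunnableSketch(new Runnable() {
            @Override
            public void run() {
                System.out.println("doing the work");
            }
        });
        timed.run();
        System.out.println(timed.invocations.get() + " call(s), "
                + timed.totalNanos.get() + " ns total");
    }
}
```

The try/finally is the important part: requests that fail with an exception still get timed, which is usually what you want when hunting for slow error paths.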

We then can use this RequestDispatcher in TimingResourceMethodDispatchProvider. If you wanted to have conditional logic to sometimes not create timers, this would be a good place to do it.

@Override
public RequestDispatcher create(AbstractResourceMethod abstractResourceMethod) {
    String metricBaseName = getMetricBaseName(abstractResourceMethod);
    Timer timer = metricsRegistry.newTimer(abstractResourceMethod.getResource().getResourceClass(),
        metricBaseName + " timer");
    return new TimingRequestDispatcher(wrappedProvider.create(abstractResourceMethod), timer);
}
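If you do want the conditional version, one way to decide is to look for a marker annotation on the resource method or its class. The sketch below is JDK-only and everything in it is hypothetical: @TimeMe is an annotation invented for this example (it is not part of Jersey or Metrics), and ExampleResource exists only to exercise the check. In create(), the java.lang.reflect.Method to inspect should be available from the AbstractResourceMethod via getMethod().

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class ConditionalTimingSketch {
    // Hypothetical marker annotation; must be retained at runtime to be visible
    // through reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.METHOD, ElementType.TYPE})
    @interface TimeMe {}

    // True if the method itself or its declaring class carries the marker.
    static boolean shouldTime(Method method) {
        return method.isAnnotationPresent(TimeMe.class)
                || method.getDeclaringClass().isAnnotationPresent(TimeMe.class);
    }

    // Example class used only to demonstrate shouldTime().
    static class ExampleResource {
        @TimeMe
        public String timed() { return "timed"; }

        public String untimed() { return "untimed"; }
    }

    public static void main(String[] args) throws NoSuchMethodException {
        System.out.println(shouldTime(ExampleResource.class.getMethod("timed")));
        System.out.println(shouldTime(ExampleResource.class.getMethod("untimed")));
    }
}
```

When shouldTime() says no, the provider can simply return the dispatcher from the wrapped provider without adding a timer.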

We can use that ResourceMethodDispatchProvider in a TimingResourceMethodDispatchAdapter.

@Override
public ResourceMethodDispatchProvider adapt(ResourceMethodDispatchProvider provider) {
    return new TimingResourceMethodDispatchProvider(metricsRegistry, provider);
}

And then bind your Adapter class (which must also be annotated with com.google.inject.Singleton and javax.ws.rs.ext.Provider):

bind(TimingResourceMethodDispatchAdapter.class);

Start up the service and you should see Timer metrics being generated for all endpoints. It’s a little bit more convoluted to use the ResourceMethodDispatchAdapter approach, but it’s probably only really needed for timing; the ResourceFilterFactory approach should work for most everything else. Either way, now you know how to use the basics of the Metrics library, and how to execute custom logic before (ContainerRequestFilter), after (ContainerResponseFilter), and around (ResourceMethodDispatchAdapter) Jersey resource method invocation.

Java 2-way TLS/SSL (Client Certificates) and PKCS12 vs JKS KeyStores

There’s some confusion on the Internet about how to control which certificates are used for server (and non-server) TLS sockets and why client certs just don’t seem to work right (see here, here, here, here, etc.). I’ll explain how the process is supposed to happen, explain why it doesn’t necessarily work easily with Java, and how to work around the problem. Though this article generally applies to SSL as well as TLS, I’ll refer to just TLS from now on. Also, the testing and bug hunting in this article were done against Sun/Oracle’s JDK 6u25. I have not confirmed whether or not these issues are fixed in Java 7.

Basic terminology

Certificate or cert
The public half of a public/private key pair, though it’s not generally referred to as a key. This part is freely given to anyone.
Private key
A private key is never given out publicly. It is used to sign or encrypt data. A private key can be used to verify that its corresponding certificate was used to sign or encrypt things and vice versa.
Certificate Signing Request or CSR
A file that you generate with your private key. You can send just the CSR to your CA and they will create a signed certificate for you.
Certificate Authority or CA
These are places like Thawte that you pay in order to get a certificate that browsers will accept. You can also use someone like CACert.org to get a free certificate that browsers will not accept. You can also generate your own simple CA using openssl. A CA uses its private key to digitally sign a CSR and create a signed cert so that browsers can use the CA’s cert to tell that your cert is approved by that CA.
Distinguished Name or DN
This is defined by LDAP. It’s a grouping of RDNs (Relative Distinguished Names). A RDN is something like “CN=your name”. This one means that the Common Name is set to the string “your name”. A DN would be something like “CN=your name,OU=Engineering,O=Initech”. In this case, OU means Organizational Unit and O means Organization.
X509
A specification governing the format and usage of certificates.
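Since DNs show up repeatedly below, it helps to be able to pick one apart. The JDK can parse them with javax.naming.ldap.LdapName; this is a small sketch (DnSketch and rdnValue are names made up for this example) that pulls a single RDN value out of the sample DN used above.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

final class DnSketch {
    // Returns the value of an RDN with the given type (e.g. "CN"), or null if
    // the DN has no RDN of that type.
    static String rdnValue(String dn, String type) throws InvalidNameException {
        for (Rdn rdn : new LdapName(dn).getRdns()) {
            if (rdn.getType().equalsIgnoreCase(type)) {
                return String.valueOf(rdn.getValue());
            }
        }
        return null;
    }

    public static void main(String[] args) throws InvalidNameException {
        String dn = "CN=your name,OU=Engineering,O=Initech";
        System.out.println("CN = " + rdnValue(dn, "CN"));
        System.out.println("OU = " + rdnValue(dn, "OU"));
        System.out.println("O  = " + rdnValue(dn, "O"));
    }
}
```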

How client certs are supposed to work

This is a greatly simplified explanation of the TLS 1.0 protocol. Check out the RFC for more details at around section 7.4. (There are newer versions of TLS, but 1.0 is what Java 6 supports.) I’m showing the case where client certificates have been configured on the server side, so this isn’t exactly what happens when you do normal server-only TLS.

  1. ClientHello: client informs server what ciphers and compression methods it supports
  2. ServerHello
    • Server picks a cipher and compression that both it and the client support and tells the client about its choices, as well as some other things like a session id
    • It presents its certificate (this is what the client needs to validate as being signed by a trusted CA)
    • It presents a list of certificate authority DNs that client certs may be signed by
  3. Client response
    • The client continues the key exchange protocol necessary to set up a TLS session
    • The client presents a certificate that was signed by one of the CAs described in the Server hello
  4. The server accepts the cert that the client presented and all is well

Note that the ServerHello does not ask for specific client certificates. It just provides info about CAs (in the form of DNs) and expects the client to figure out an appropriate cert that was signed by one of those CAs.

File formats for certs and keys

There are many different formats used for storing keys and certs on-disk, but the most common ones are probably PEM, PKCS12, and JKS. The formats are not treated equally by Java, so it’s important to understand the different formats. It’s not obvious how to manipulate these formats, so I’ve also included sample commands for working with them.

PEM

PEM is just DER that’s been Base64 encoded. It looks like this for a certificate:

-----BEGIN CERTIFICATE-----
(base 64 encoded stuff)
-----END CERTIFICATE-----

or for a private key:

-----BEGIN PRIVATE KEY-----
(base 64 encoded stuff)
-----END PRIVATE KEY-----

Sometimes PEM files will have a human-readable block of text above the Base64 encoded block. You can safely remove this human-readable text. (It confuses Java’s keytool.)
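To show that there is nothing magic about PEM, here is a sketch that recovers the DER bytes from a PEM string in plain Java. It assumes a single well-formed block and ignores anything outside the BEGIN/END markers (such as the human-readable text mentioned above); PemSketch and firstBlockToDer are names invented for this example.

```java
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PemSketch {
    // Matches "-----BEGIN X-----", the body, and the matching "-----END X-----".
    private static final Pattern BLOCK = Pattern.compile(
            "-----BEGIN ([A-Z0-9 ]+)-----(.*?)-----END \\1-----", Pattern.DOTALL);

    // Returns the DER bytes of the first PEM block, ignoring any text around it.
    static byte[] firstBlockToDer(String pem) {
        Matcher matcher = BLOCK.matcher(pem);
        if (!matcher.find()) {
            throw new IllegalArgumentException("no PEM block found");
        }
        // the MIME decoder skips the line breaks inside the Base64 body
        return Base64.getMimeDecoder().decode(matcher.group(2));
    }

    public static void main(String[] args) {
        // Not a real certificate: just a few bytes wrapped in PEM armor.
        String body = Base64.getEncoder().encodeToString(new byte[] {1, 2, 3, 4, 5});
        String pem = "-----BEGIN CERTIFICATE-----\n" + body + "\n-----END CERTIFICATE-----\n";
        System.out.println(firstBlockToDer(pem).length + " DER bytes");
    }
}
```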

Some applications prefer the cert PEM and the private key PEM to be in one file. (Apache httpd is one, if I remember correctly.) Since PEM files are just plain text, you can do that with cat: cat cert.pem key.pem > cert-with-key.pem. While you could arbitrarily combine as many PEM blocks as you wanted into one file, typically they are kept separate except for this one case.

You can get a human-readable description of a cert in PEM format with openssl x509 -in cert.pem -noout -text. (Certs are X509 formatted, hence the ‘x509’ subcommand to openssl.) For keys, the command is openssl rsa -in key.pem -text -noout.

Private keys can also be encrypted, in which case the marker block will say BEGIN ENCRYPTED PRIVATE KEY. You can create the decrypted form of the key with openssl rsa -in key-encrypted.pem -out key-decrypted.pem.

PKCS12

PKCS12 is a password-protected format that can contain multiple certificates and keys.

You can view the contents of a PKCS12 file (typically .p12 is used for PKCS12 files) with openssl pkcs12 -in file.p12. Add -info for a little bit more metadata. Note that if the file includes a private key, openssl will ask you for another password after asking for the decryption password for the PKCS12 file. This second password is used to encrypt the private key before displaying its PEM data to you. You could put this data in a separate file and decrypt it as shown above if you want the decrypted form.

You can create PKCS12 files with or without private keys or CA certs.
Cert and key:
openssl pkcs12 -export -out cert-and-key.p12 -in cert.pem -inkey key.pem
Cert and key that includes the CA cert that signed the cert:
openssl pkcs12 -export -out cert-and-key-with-ca.p12 -in cert.pem -inkey key.pem -CAfile /path/to/cacert.pem -chain
Cert without key (useful for CA certs):
openssl pkcs12 -export -out cacert.p12 -in cacert.pem -nokeys

JKS

A JKS keystore stores multiple certs and keys like PKCS12, but it’s just a Java thing, not a widespread standard like PKCS12. The tool to manage JKS files is ‘keytool’ which ships with the JDK. Entries in a JKS file have an “alias” that must be unique. If you don’t specify an alias, it will use “mykey” by default. This is fine if you’re only putting one thing in a keystore, but if you add another thing you’ll get an error because it will try to use the same (default) alias twice. JKS keystores also have a password, just like PKCS12.

You can use keytool to add PEM and PKCS12 files.
Create JKS with cert, key and CA cert from PKCS12:
keytool -importkeystore -destkeystore cert-and-key-with-ca.jks -srckeystore cert-and-key-with-ca.p12 -srcstoretype PKCS12
Add a CA cert, then add the cert without a key:
keytool -keystore cacert-added-then-cert-nokey.jks -import -file cacert.pem -alias cacert (Say yes when it asks if you want to trust the CA)
keytool -keystore cacert-added-then-cert-nokey.jks -import -file cert.pem -alias cert
Add a CA cert, then add the cert with a key:
keytool -keystore cacert-added-then-cert-withkey.jks -import -file cacert.pem -alias cacert (Say yes when it asks if you want to trust the CA)
keytool -destkeystore cacert-added-then-cert-withkey.jks -importkeystore -srckeystore cert-and-key.p12 -srcstoretype PKCS12

TLS with Java

There are going to be a lot of certs involved in configuring TLS, so let’s first decide on some names.

  • server-cert: the cert that the server presents to clients
  • server-key: the private key that corresponds to server-cert
  • server-ca-cert: the cert of the CA that signed server-cert
  • client-cert: the cert that the client presents to the server (if asked to)
  • client-key: the private key that corresponds to client-cert
  • client-ca-cert: the cert of the CA that signed client-cert

When client certs aren’t enabled, the server presents server-cert to the client. If the client has server-ca-cert designated as a trusted CA, the connection can proceed. When client certs are enabled, the server presents server-cert as its cert and also sends the DN of client-ca-cert. The client checks server-cert against its trusted CA certs as before. It then looks for any certs that it has that are signed by client-ca-cert, finds client-cert, and sends that back. The server-key and client-key keys are used in other parts of the TLS protocol. To be as general as possible, I’m going to assume that client-ca-cert and server-ca-cert are not the same cert and are also not in the system-wide set of trusted CAs.

Oracle’s JSSE Reference Guide is useful when figuring out how all these crypto classes fit together, so you may want to have that open as a reference.

If you want to examine the internals of the SSLSocket/SSLServerSocket implementation, set the system property javax.net.debug to “all” for maximum verbosity. You can download the OpenJDK code and step through it by looking for where the debug statements are printed. It’s not as good as a debugger, but there’s not much code and the debug statements are frequent enough that it’s not hard to follow. Wireshark is also extremely useful.

In Java, you use SSLSocketFactory to get SSLSockets and SSLServerSocketFactory to get SSLServerSocket instances. The simplest usage looks like this:

/* 
 * The static getDefault() methods return the non-SSL
 * factory classes, so they have to be cast.
 */
SSLServerSocketFactory serverSocketFactory = 
    (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
SSLServerSocket serverSocket = 
    (SSLServerSocket) serverSocketFactory.createServerSocket(8443);
 
SSLSocketFactory socketFactory = 
    (SSLSocketFactory) SSLSocketFactory.getDefault();
SSLSocket socket = 
    (SSLSocket) socketFactory.createSocket("localhost", 8443); 
 
// do the standard socket stuff with byte streams, etc.

This isn’t really going to work, though, since we haven’t told the server socket what cert to use. Time for more terminology!

  • A keystore has certs and keys in it and defines what is going to be presented to the other end of a connection.
  • A truststore has just certs in it and defines which certs presented by the other end are to be trusted. You could put keys in a truststore, but they wouldn’t be used for anything.

Confusingly, the Java class java.security.KeyStore is used in the process of creating both keystores and truststores. I will be careful to capitalize as KeyStore when I mean the class as opposed to the conceptual items.

For the server socket, we need to specify a keystore containing server-cert and server-key. We also need a truststore containing client-ca-cert. For the client socket, we need a keystore containing the client cert and key and a truststore containing the server-ca-cert.

To get these keystores and truststores, we need to construct KeyStore instances with the appropriate certificate and key data. KeyStores can be created for JKS or PKCS12 files. This code creates a KeyStore and loads data from an input stream. After load() has been called, the KeyStore is ready for use.

// keyStoreType is either "JKS" or "PKCS12"
KeyStore keyStore = KeyStore.getInstance(keyStoreType);
keyStore.load(inputStream, keyStorePassword.toCharArray());
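When things go wrong later, it is useful to know exactly what ended up in a KeyStore after loading. This sketch wraps the two lines above in a helper and adds a method that lists each alias along with whether it is a key entry or a bare cert entry. KeyStoreSketch, load and describe are hypothetical names for this example; the KeyStore calls themselves are standard JDK API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class KeyStoreSketch {
    // type is "JKS" or "PKCS12"; the caller owns (and should close) the stream.
    static KeyStore load(String type, InputStream in, char[] password)
            throws GeneralSecurityException, IOException {
        KeyStore keyStore = KeyStore.getInstance(type);
        keyStore.load(in, password);
        return keyStore;
    }

    // One line per alias, saying whether it is a key entry or a bare cert entry.
    static List<String> describe(KeyStore keyStore) throws GeneralSecurityException {
        List<String> lines = new ArrayList<String>();
        for (String alias : Collections.list(keyStore.aliases())) {
            String kind = keyStore.isKeyEntry(alias) ? "key entry" : "cert entry";
            lines.add(alias + ": " + kind);
        }
        return lines;
    }
}
```

A typical call would be describe(load("JKS", new FileInputStream(path), password)) inside a try-with-resources block, where path and password are whatever your setup uses.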

A KeyStore is just an intermediate step, though. Once we have a KeyStore with the keystore data and a KeyStore with the truststore data, the next step is a TrustManager (for a truststore) and a KeyManager (for a keystore).

TrustManagerFactory trustManagerFactory = 
    TrustManagerFactory.getInstance("PKIX", "SunJSSE");
trustManagerFactory.init(trustStore);

Now we have a TrustManagerFactory instance. JSSE is fairly agnostic towards cryptosystems, so it can, at least in theory, support things beyond X509. In practice, X509 is all we care about, and looking in the OpenJDK source code will give the impression that X509 is all it’s built to support anyway. The “PKIX” algorithm implements cert-chain validation for X509 certs. A TrustManagerFactory can create an array of TrustManagers, one for each type of “trust material”. We only care about the X509TrustManager instance.

X509TrustManager x509TrustManager = null;
for (TrustManager trustManager : trustManagerFactory.getTrustManagers()) {
    if (trustManager instanceof X509TrustManager) {
        x509TrustManager = (X509TrustManager) trustManager;
        break;
    }
}
 
if (x509TrustManager == null) {
    // fail with a descriptive message rather than a bare NullPointerException
    throw new IllegalStateException("No X509TrustManager found");
}

Now we have the X509TrustManager instance we want. A similar approach will get you the X509KeyManager.

KeyManagerFactory keyManagerFactory = 
    KeyManagerFactory.getInstance("SunX509", "SunJSSE");
keyManagerFactory.init(keyStore, password.toCharArray());
 
X509KeyManager x509KeyManager = null;
for (KeyManager keyManager : keyManagerFactory.getKeyManagers()) {
    if (keyManager instanceof X509KeyManager) {
        x509KeyManager = (X509KeyManager) keyManager;
        break;
    }
}
 
if (x509KeyManager == null) {
    // fail with a descriptive message rather than a bare NullPointerException
    throw new IllegalStateException("No X509KeyManager found");
}

Now you can construct an SSLContext. Here’s the code to create a SSLServerSocket.

// load in the appropriate keystore and truststore for the server
// get the X509KeyManager and X509TrustManager instances
 
SSLContext sslContext = SSLContext.getInstance("TLS");
// the final null means use the default secure random source
sslContext.init(new KeyManager[]{x509KeyManager}, 
    new TrustManager[]{x509TrustManager}, null);
 
SSLServerSocketFactory serverSocketFactory = 
    sslContext.getServerSocketFactory();
SSLServerSocket serverSocket = 
    (SSLServerSocket) serverSocketFactory.createServerSocket(PORT);
 
serverSocket.setNeedClientAuth(true);
// prevent older protocols from being used, especially SSL2 which is insecure
serverSocket.setEnabledProtocols(new String[]{"TLSv1"});
 
// you can now call accept() on the server socket, etc
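Hardcoding "TLSv1" matches what Java 6 supports, but the set of available protocols differs between JDK versions. As an alternative sketch (my own variation, not what the code above does), you could filter whatever the socket reports as supported and keep only the TLS entries. ProtocolFilterSketch and tlsOnly are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ProtocolFilterSketch {
    // Keeps only protocol names starting with "TLS", which drops entries such
    // as SSLv2Hello and SSLv3 while preserving the original order.
    static String[] tlsOnly(String[] supportedProtocols) {
        List<String> keep = new ArrayList<String>();
        for (String protocol : supportedProtocols) {
            if (protocol.startsWith("TLS")) {
                keep.add(protocol);
            }
        }
        return keep.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] sample = {"SSLv2Hello", "SSLv3", "TLSv1"};
        for (String protocol : tlsOnly(sample)) {
            System.out.println(protocol);
        }
    }
}
```

With a real socket this would be used as serverSocket.setEnabledProtocols(ProtocolFilterSketch.tlsOnly(serverSocket.getSupportedProtocols())).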

And here’s how to construct an SSLSocket. Make sure you don’t use the same keystore and truststore that you did for the server! They almost certainly need to be different.

// load in the appropriate keystore and truststore for the client
// get the X509KeyManager and X509TrustManager instances
 
SSLContext sslContext = SSLContext.getInstance("TLS");
 
sslContext.init(new KeyManager[]{x509KeyManager}, 
    new TrustManager[]{x509TrustManager}, null);
 
SSLSocketFactory socketFactory = sslContext.getSocketFactory();
SSLSocket socket = 
    (SSLSocket) socketFactory.createSocket("localhost", SslServer.PORT);
 
socket.setEnabledProtocols(new String[]{"TLSv1"});
 
// read from the socket, etc

It should now work. Unfortunately, Java’s default KeyStore implementation has some bugs, so depending on how you set up your server-side trust store, it may or may not work.

But it doesn’t work when I try that!

There’s a bug in the way cert chains are handled for X509TrustManager objects in the Sun/Oracle implementation. A quick look in the OpenJDK code in sun.security.ssl.X509TrustManagerImpl and sun.security.validator.KeyStores shows that the logic used to get issuers is simply wrong. When an entry in a KeyStore is a key entry (not a bare cert), it unconditionally uses the first cert in the chain of certs for that key, regardless of whether it is a CA cert or even the actual issuing cert in the chain. In fact, the documentation for KeyStore.getCertificateChain() says that the root cert is the last cert in the chain, not the first. This code was probably tested using self-signed certs (which only have one cert in the chain, so it will always work) and not using separate CA certs.

There’s also a separate bug in the way PKCS12 files are loaded. It looks like maybe the PKCS12 parsing code can’t figure out what to do when there isn’t a private key. I haven’t looked into the source of that bug yet. The SunJSSE description in the JSSE Reference Guide has this terse note: “Storing trusted anchors in PKCS12 is not supported. Users should store trust anchors in JKS format and save private keys in PKCS12 format.” It’s unfortunate that this isn’t broadcast more clearly in the documentation.

If you load a PKCS12 file containing just client-ca-cert, you get nothing in the server’s X509TrustManager KeyStore. This is the PKCS12 loading bug. When you connect a client, you get java.net.SocketException: Broken pipe on the client side and javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: No trusted certificate found on the server. This happens because the server has sent a CertificateRequest in the ServerHello, but has not included any DNs for the client to look up certs by.

Now construct a PKCS12 file containing client-cert, client-key, and client-ca-cert and then import that in one step (using the -importkeystore option to keytool). If you load that PKCS12 file or the resulting JKS, you get a chain of client-cert and client-ca-cert in the KeyStore. This is correct. However, X509TrustManager.getAcceptedIssuers() mistakenly returns client-cert as an issuer (which it is not) and does not return client-ca-cert. The correct behavior would be to return only client-ca-cert. (An “issuer” is a CA.) In this case, you get javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate on the client and javax.net.ssl.SSLHandshakeException: null cert chain on the server. The server sends the DN of client-cert in the CertificateRequest part of the ServerHello. The client (correctly) does not find any certs signed by that cert, so it returns no certificates. The server rejects the connection with error code 42 for “bad certificate” (see the TLS RFC section A.3 for error codes) and dies with its own error that (accurately) says there is a null certificate chain from the client.

If you load a JKS file that was created by separately adding client-ca-cert in step 1 (using -import -file client-ca-cert.pem) and then the client-cert (with -import -file client-cert.pem or -importkeystore with a p12 containing client-cert and client-key) in step 2, the JKS will have two separate entries instead of one entry containing a chain of certs. This will lead to getAcceptedIssuers() returning both client-cert and client-ca-cert, so both certs will have their DNs in the ServerHello. This is technically incorrect (client-cert is not a CA cert) but does work.

The good news

Fortunately, the JKS implementation of KeyStore does not suffer from the same bug as the PKCS12 code. A KeyStore loaded from a JKS containing only client-ca-cert does end up with a cert in it. Since it’s just a cert, not a cert in a chain attached to a key, it avoids the buggy code path in KeyStores, so the correct DN gets sent to the client in the ServerHello and all proceeds normally.

KeyStore contents                                         | KeyStore type | Result of getAcceptedIssuers()
----------------------------------------------------------+---------------+--------------------------------
client-cert, client-key, client-ca-cert                   | PKCS12        | client-cert
client-cert, client-key, client-ca-cert                   | JKS           | client-cert
client-cert, client-key                                   | PKCS12        | client-cert
client-cert, client-key                                   | JKS           | client-cert
client-cert, client-ca-cert                               | PKCS12        | (empty)
client-ca-cert                                            | PKCS12        | (empty)
client-ca-cert added first, then client-cert & client-key | JKS           | client-cert and client-ca-cert
client-ca-cert added first, then client-cert              | JKS           | client-cert and client-ca-cert
client-ca-cert                                            | JKS           | client-ca-cert (what you want)

The result of all this analysis:
For your SSLServerSocket’s TrustManager’s KeyStore, use a JKS containing only the CA cert for the client certs. If you use a PKCS12, you’ll get no certs. If you include the cert and key that you’re looking for as well as the CA cert and you create the JKS keystore from one PKCS12 containing all three entities, you’ll get the wrong cert. If you create a JKS keystore using the CA cert and then add the client cert (with or without key) later, you’ll get too many certs.
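If you want to check what a particular keystore will advertise on your own JDK before wiring it into a server, you can ask JSSE directly. Here is a minimal sketch using only standard JDK classes. The class name IssuerDump and its command-line arguments are my own invention, and note that the method on X509TrustManager is actually named getAcceptedIssuers(). Results may differ between JDK versions, so treat the table above as what I observed rather than a guarantee.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

final class IssuerDump {

    /** Loads a keystore of the given type ("JKS" or "PKCS12"); a null path yields an empty keystore. */
    static KeyStore load(String type, String path, char[] password) {
        try {
            KeyStore keyStore = KeyStore.getInstance(type);
            if (path == null) {
                keyStore.load(null, password);
            } else {
                try (InputStream in = new FileInputStream(path)) {
                    keyStore.load(in, password);
                }
            }
            return keyStore;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Could not load keystore", e);
        }
    }

    /** Returns the subject DNs that a default TrustManager built from this keystore reports as accepted issuers. */
    static List<String> issuerNames(KeyStore keyStore) {
        List<String> names = new ArrayList<>();
        try {
            TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(keyStore);
            for (TrustManager tm : tmf.getTrustManagers()) {
                if (tm instanceof X509TrustManager) {
                    for (X509Certificate cert : ((X509TrustManager) tm).getAcceptedIssuers()) {
                        names.add(cert.getSubjectX500Principal().getName());
                    }
                }
            }
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not build a TrustManager", e);
        }
        return names;
    }

    // usage: java IssuerDump <JKS|PKCS12> <keystore file> <password>
    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("usage: IssuerDump <JKS|PKCS12> <keystore file> <password>");
            return;
        }
        for (String name : issuerNames(load(args[0], args[1], args[2].toCharArray()))) {
            System.out.println(name);
        }
    }
}
```

Run it against each keystore you are considering. The one you want to deploy is the one that prints only the DN of client-ca-cert.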

Embedded MySQL in Java with Connector/MXJ and 64-bit Linux

MySQL’s Connector/MXJ is a tool that exposes the ability to start and stop an embedded MySQL server through a Java API. You can have the MySQL JDBC driver start up a server just by appropriately configuring your JDBC url or you can programmatically control the server through the MysqldResource class. It does this by bundling precompiled versions of the mysql (client) and mysqld (server) binaries and invoking the correct ones based on the os.name and os.arch system properties.

This sounds great for spinning up an instance of mysqld for testing, but there are a few issues to be solved. MySQL hasn’t published any recent versions of MXJ to the central Maven repository, so you’ll have to install the files by hand into your local ~/.m2 (or repo server, if you have one set up). Also, they don’t include 64-bit Linux binaries.
Update: I have submitted artifacts for Connector/MXJ 5.0.12 to Maven Central and they have been approved, so you need only add the following dependency to your code and you’re good to go.

<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-mxj</artifactId>
  <version>5.0.12</version>
  <scope>test</scope>
</dependency>

If you’d like to build the artifacts yourself, carry on with the instructions…

Download the zip distribution yourself from MySQL’s download page (5.0.12 is current as of this article). When unzipped, you’ll find two jars: mysql-connector-mxj-gpl-5-0-12-db-files.jar (the ‘db-files’ jar) and mysql-connector-mxj-gpl-5-0-12.jar (the ‘mxj’ jar). The mxj jar is quite small and just contains some Java classes. The db-files jar is pretty hefty because it contains the actual binaries for x86 32-bit FreeBSD, Linux, Mac OS X, Windows and Solaris (as well as Solaris on SPARC in case anyone still uses that…).

Alas, no 64-bit Linux… If this doesn’t affect you, skip to the end of the article to see how to install the jars correctly into your local ~/.m2 repo. You’ll know that you’re hitting this problem if you get output like this:

[MysqldResource] launching mysqld (driver_launched_mysqld_1)
/tmp/test-mxj/bin/mysqld: error while loading shared libraries: libaio.so.1: cannot open shared object file: No such file or directory

This is (correctly) saying that the 32-bit mysqld binary cannot find libaio. You should have libaio installed since the 64-bit version we’ll be building will need it, though!

Adding 64-bit Linux binaries to the db-files jar

Fortunately, MySQL provides instructions on how to modify the db-files jar. If you’d rather use my pre-built jar with Linux 64-bit binaries, go to the end of the article. Download the mysql 5.5.9 source tarball from the MySQL archives. 5.5.9 isn’t the latest version, but all the other binaries in MXJ 5.0.12 were built with 5.5.9 so we’ll use that to stay consistent. The source installation manual shows the steps necessary to perform a complete installation. We don’t need to do all of that since we just need the compiled code.

cmake .
make

The mysql binary is in client/mysql and the mysqld binary is in sql/mysqld.

Extract mysql-connector-mxj-gpl-5-0-12-db-files.jar into a directory of your choosing so we can add in proper 64-bit versions of mysql and mysqld.

The actual binaries that are run are determined by the platform-map.properties file in the root of the db-files jar. Replace the following section

Linux-x86_64=Linux-i386
Linux-amd64=Linux-i386

with

Linux-x86_64=Linux-x86_64
Linux-amd64=Linux-x86_64

This means that when x86_64 or amd64 are seen in the os.arch Java system property, the contents of the Linux-x86_64 directory will be used. First, we’ll need the directory to exist inside the 5-5-9 directory:

mkdir 5-5-9/Linux-x86_64
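If you’re not sure which entry applies to your machine, a few lines of Java will tell you. This is only a sketch of the lookup as I understand it: it assumes the key is os.name and os.arch joined with a hyphen (which matches the Linux entries above), and PlatformMapCheck is a name I made up. It is not MXJ’s own code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class PlatformMapCheck {

    /**
     * Looks up "<os.name>-<os.arch>" in the given platform-map.properties text.
     * Returns null when there is no entry for that key.
     */
    static String resolve(String platformMapText, String osName, String osArch) {
        Properties map = new Properties();
        try {
            map.load(new StringReader(platformMapText));
        } catch (IOException e) {
            // Not expected for an in-memory reader
            throw new IllegalStateException(e);
        }
        // Assumption: the key format is os.name + "-" + os.arch
        return map.getProperty(osName + "-" + osArch);
    }

    public static void main(String[] args) {
        // The two entries after the edit described above
        String edited = "Linux-x86_64=Linux-x86_64\nLinux-amd64=Linux-x86_64\n";
        String osName = System.getProperty("os.name");
        String osArch = System.getProperty("os.arch");
        System.out.println(osName + "-" + osArch + " -> " + resolve(edited, osName, osArch));
    }
}
```

On a 64-bit Linux JVM you should see your platform mapped to Linux-x86_64. With the unmodified file it would have been Linux-i386, which is where the libaio error comes from.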

Copy your newly compiled mysql and mysqld binaries into the Linux-x86_64 directory, and add a version.txt file containing appropriate contents. I followed the template of the Linux-i386 version, modifying for my newer kernel (3.0.4) and platform (x86_64).

mysql-5.5.9-linux3.0.4-x86_64/bin/mysqld

Go back to the root of the extracted jar, then create a new jar.

jar -cf ../mysql-connector-mxj-gpl-5-0-12-db-files.jar *

Installing the Maven artifacts

Now we’ve constructed a mysql-connector-mxj-gpl-5-0-12-db-files.jar. To use it, you’ll need to install it into your ~/.m2 repo. I’ve built a very simple pom.xml for it to save you the trouble. Download the pom and put it in the same directory as your newly built jar. Or, if you’d rather not go to the trouble of building your own db-files jar, download mine.

mvn install:install-file -Dfile=mysql-connector-mxj-gpl-5-0-12-db-files.jar -DpomFile=mysql-connector-mxj-gpl-5-0-12-db-files.pom

You still need the mxj jar, though, so I built a pom for that jar as well.

mvn install:install-file -Dfile=mysql-connector-mxj-gpl-5-0-12.jar -DpomFile=mysql-connector-mxj-gpl-5-0-12.pom

MXJ will re-use a deployment of mysql that it has made previously, so make sure to remove any temporary directories left over by mysql. Otherwise, you’ll keep on using the old (broken) mysqld binaries. This will clear both the default mxj directory and the one used by ConnectorMXJUrlTestExample, in case you ran that previously. Make sure to also include any directories you’ve configured yourself (e.g. by setting server.basedir in the JDBC url).

rm -rf /tmp/mysql-c.mxj
rm -rf /tmp/test-mxj

MXJ should now be ready to use. To depend on it from a Maven project, this is the dependency you need (you probably don’t want to use MXJ for production deployments, so it’s set to test scope). If you’re using Maven 3, you will also need to put the artifacts in your repo server since Maven 3 complains if it can only find artifacts in the local ~/.m2 repo.

<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-mxj</artifactId>
  <version>5.0.12</version>
  <scope>test</scope>
</dependency>
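For completeness, here is roughly what the JDBC-url route looks like from a test. Treat it as a sketch under assumptions: the jdbc:mysql:mxj:// scheme and the createDatabaseIfNotExist and server.initialize-user parameters are written from my memory of ConnectorMXJUrlTestExample, the port, database name and credentials are arbitrary placeholders, and you need both the MXJ jars and the MySQL JDBC driver on the classpath for main to do anything useful.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class MxjUrlExample {

    /** Builds a JDBC url that asks the driver to launch mysqld under baseDir (assumed url format). */
    static String mxjUrl(int port, String dbName, String baseDir) {
        return "jdbc:mysql:mxj://localhost:" + port + "/" + dbName
            + "?server.basedir=" + baseDir
            + "&createDatabaseIfNotExist=true"
            + "&server.initialize-user=true";
    }

    public static void main(String[] args) throws SQLException {
        String baseDir = System.getProperty("java.io.tmpdir") + "/test-mxj";
        String url = mxjUrl(3336, "test", baseDir);

        // Placeholder credentials for illustration only
        try (Connection conn = DriverManager.getConnection(url, "testuser", "testpass");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT VERSION()")) {
            rs.next();
            System.out.println("Running mysqld " + rs.getString(1));
        }
    }
}
```

If the url parameters have changed in your version of MXJ, the Connector/MXJ documentation is the authority, not this snippet.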

A simple Java web stack with Guice, Jetty, Jersey and Jackson

Another Java web framework? Not really…

There are dozens of web technologies for Java (Struts, Stripes, Tapestry, Wicket, GWT, Spring MVC, Vaadin, Play, plain old servlets and JSP, Dropwizard, etc.) and they all have their advantages and disadvantages. This article isn’t about comparing frameworks. Instead, I’ll describe how to serve HTTP requests from Java without using a monolithic framework in a way that’s a lot more pleasant than using just HttpServletRequests: Jetty for handling low-level HTTP things, Jersey for request routing, Jackson for serialization and Guice to tie it all together.

Update: If you actually want to write something quickly using stuff shown in this article, Dropwizard has since come out, and looks like a solid choice. If you want to learn more about how to build such a stack, check out the second article in this series.

If you want to write your web UI on the server side (whether it’s translated to JavaScript ala GWT or simply HTML markup like you might put in a JSP), this probably isn’t an approach that will be very attractive. You won’t have a web framework already there to provide pre-canned view helpers or login forms or “is this a phone number” validation or any of that.

If you are transitioning towards making the server be more of a data store and putting display logic entirely in the client (regardless of whether the client is a browser or a native mobile app), the no-big-framework approach can make a lot of sense. You don’t need portlets or JavaScript components when you’re simply shoving JSON or XML or what-have-you over HTTP.

I’ll be including code samples as new concepts are introduced, but if you want to browse the finished example project, it’s on the Team Lazer Beez GitHub site.

Jetty

Jetty is an HTTP server and servlet container that’s been designed expressly for easy embedding. I think it’s easier to do the edit-restart-test cycle with a simple main method that starts a Jetty server (no IDE I’ve used has ever had integration with a servlet container that works as fast or as reliably as running a main method). If you prefer to run and debug inside a separate container, or if you need container-managed security or JTA or XA or any of that stuff, feel free to bundle your code as a war instead of using Jetty.

Here’s how you start a Jetty server listening on localhost:8080.

Server server = new Server(8080);
// handlers go here
server.start();

Pretty easy! But this server doesn’t do anything since we haven’t told it what to do for incoming requests. So, before we call start(), we need to give the server a Handler. Handlers are a Jetty concept that can do pretty much anything in response to an HTTP request. You might have a Handler that only gathers performance metrics or only does logging. In this case, we want a Handler that can invoke servlets.

ServletContextHandler handler = new ServletContextHandler();
handler.setContextPath("/");
// set up handler
server.setHandler(handler);

One odd thing about Jetty’s ServletContextHandler is that it likes to have one servlet always, even though in our case we won’t actually end up needing any to run our business logic. So, we’ll give the handler a servlet:

// jetty always needs one servlet
handler.addServlet(new ServletHolder(new InvalidRequestServlet()), "/*");

InvalidRequestServlet is a simple HttpServlet that always 404s.

@Override
protected void service(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
    resp.setStatus(HttpServletResponse.SC_NOT_FOUND);
    resp.setContentType("text/plain");
    resp.setCharacterEncoding("UTF-8");
    resp.getWriter().append("404");
}

If you create a main method with the code to start up Jetty and run it, you should now be able to go to http://localhost:8080 in your browser and see the result of your InvalidRequestServlet. Before we can start adding some actual logic, though, I’ll need to explain Guice and its servlet layer.

Guice

Guice is a dependency injection framework. Though it accomplishes more or less the same job that Spring does, it has a slightly different philosophy. Whereas Spring tends to think in terms of bean ids when gluing code together, Guice tends to think in terms of types. They’re both good tools with different strengths, but I am only going to show how to use Guice (or this article will never end!). I’ll give a whirlwind tutorial in Guice and Guice Servlet and then we’ll get back to building our stack.

Making some PB&J Sandwiches

One of the problems (but not the only one) that DI/IoC frameworks like Guice and Spring are trying to solve is the problem of passing an instance through many layers of constructors just because some deep-down class is structured to take an instance of some interface. No doubt you’ve seen this anti-pattern: Class1 instantiates Class2 which instantiates Class3, but Class3 takes a Foo instance in its constructor, so somewhere higher up (perhaps in Class1, or even higher up than that) you create the correct Foo instance and then it gets passed around for a while until it’s finally used in Class3. This sucks, but it’s a worthy goal to have Class3 take a Foo instance in its constructor rather than directly instantiating some concrete implementation: this benefits testability, separation of concerns, etc. Guice can help make this easier.

As an example, let’s suppose that you have a SandwichMaker class that takes a PeanutButter implementation in its constructor. For testing your SandwichMaker, you may wish to use a mock PeanutButter implementation, but in an actual deployment you might wish to use an OrganicCrunchyValenciaPeanutButter (OCVPB for short).

class SandwichMaker {
// ...
    SandwichMaker(PeanutButter peanutButter) {
        this.peanutButter = peanutButter;
    }
// ...
}
interface PeanutButter {
    void applyToSandwich(Sandwich sandwich, int grams);
}
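To make the testing benefit concrete before bringing in any framework, here is a small self-contained sketch. The Sandwich class, the single-argument makeSandwich method and RecordingPeanutButter are simplified stand-ins I made up for this illustration; the real project’s classes have more to them.

```java
final class Sandwich {
    int gramsOfPeanutButter;
}

interface PeanutButter {
    void applyToSandwich(Sandwich sandwich, int grams);
}

final class SandwichMaker {
    private final PeanutButter peanutButter;

    // The dependency arrives through the constructor instead of being hardcoded
    SandwichMaker(PeanutButter peanutButter) {
        this.peanutButter = peanutButter;
    }

    Sandwich makeSandwich(int gramsOfPeanutButter) {
        Sandwich sandwich = new Sandwich();
        peanutButter.applyToSandwich(sandwich, gramsOfPeanutButter);
        return sandwich;
    }
}

/** Test double: records how much was requested instead of using real peanut butter. */
final class RecordingPeanutButter implements PeanutButter {
    int totalGramsRequested = 0;

    @Override
    public void applyToSandwich(Sandwich sandwich, int grams) {
        totalGramsRequested += grams;
        sandwich.gramsOfPeanutButter += grams;
    }
}

final class SandwichMakerDemo {
    public static void main(String[] args) {
        RecordingPeanutButter fake = new RecordingPeanutButter();
        SandwichMaker maker = new SandwichMaker(fake);
        maker.makeSandwich(200);
        maker.makeSandwich(150);
        System.out.println(fake.totalGramsRequested); // prints 350
    }
}
```

Because SandwichMaker only knows about the interface, the test never touches a real implementation. The remaining question is who hands the real one over in production.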

Now the question is where in your actual app do you instantiate your OCVPB? It’s not ideal to do it in the class that creates a SandwichMaker since that class will now be using actual peanut butter when it’s run during tests as it’s hardcoded the concrete implementation. It’s also not ideal to create the OCVPB in the top-level main method and pass the instance around through however many levels of constructors are needed to get to the SandwichMaker. To explain how we can use Guice to solve this problem, there are a few concepts to explain.

Guice concepts

Rather than explicitly creating instances with the new keyword, Guice lets you define how classes depend on each other and takes care of wiring things together for you. In our case, we want our SandwichMaker to have a PeanutButter implementation provided to it. We inform Guice of this by annotating the ctor with @Inject. (There are versions of Inject and some other classes from both com.google.inject and from the standardized JSR 330 javax.inject packages. They work almost exactly the same; check the Guice documentation for details. Just pick one and stick with it.)

class SandwichMaker {
// ...
    @Inject
    SandwichMaker(PeanutButter peanutButter) {
        this.peanutButter = peanutButter;
    }
// ...
}

Now that we’ve said that SandwichMaker needs a PeanutButter, we need to say which PeanutButter to use. We do this with a binding in a module. Modules are intended to represent functional boundaries in your application. The decision of when to put things in separate modules is like learning when to put things in separate packages: you’ll simply need to get a feel for it by trying it out. For now, let’s create a module that sets up everything you need to make sandwiches.

class SandwichModule extends AbstractModule {
    @Override
    protected void configure() {
        bind(PeanutButter.class).to(OrganicCrunchyValenciaPeanutButter.class);
    }
}

This informs Guice to create a new instance of OCVPB whenever it sees a request to inject a PeanutButter instance. PeanutButter seems like the sort of thing that should be a singleton since it represents a real-life resource, though, so let’s go ahead and fix that:

bind(PeanutButter.class).to(OrganicCrunchyValenciaPeanutButter.class).in(Scopes.SINGLETON);

We could also have annotated our OCVPB implementation class with @Singleton to achieve the same effect. Now, no matter how many SandwichMakers we get, only one PeanutButter will be used.

We create an Injector with all of the modules necessary (in this case, just the one module) and use it to get an instance of SandwichMaker. It will inspect the bindings and injection targets and create an OCVPB and pass it to the SandwichMaker ctor. We don’t actually need to get a standalone SandwichMaker instance, but if we did, this is how we would do it.

Injector injector = Guice.createInjector(new SandwichModule());
 
SandwichMaker maker = injector.getInstance(SandwichMaker.class);
// use the maker
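If it helps to demystify what an injector does, here is a toy version built on plain reflection. This is emphatically not how Guice is implemented. ToyInjector and its methods are names I made up, it only handles classes with exactly one constructor, and it has no scopes, no @Inject check and no cycle detection. It just shows the core idea: look up the binding, find the constructor, and recursively build the arguments.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

interface PeanutButter { }

final class CrunchyPeanutButter implements PeanutButter { }

final class SandwichMaker {
    final PeanutButter peanutButter;

    SandwichMaker(PeanutButter peanutButter) {
        this.peanutButter = peanutButter;
    }
}

final class ToyInjector {
    // type -> implementation, loosely like bind(X.class).to(Y.class)
    private final Map<Class<?>, Class<?>> bindings = new HashMap<>();

    <T> ToyInjector bind(Class<T> type, Class<? extends T> implementation) {
        bindings.put(type, implementation);
        return this;
    }

    <T> T getInstance(Class<T> type) {
        // Unbound types are constructed directly
        Class<?> target = bindings.getOrDefault(type, type);
        Constructor<?>[] ctors = target.getDeclaredConstructors();
        if (ctors.length != 1) {
            // Interfaces have no constructors, so an unbound interface ends up here
            throw new IllegalStateException("Cannot construct " + target.getName());
        }
        Constructor<?> ctor = ctors[0];
        Class<?>[] paramTypes = ctor.getParameterTypes();
        Object[] args = new Object[paramTypes.length];
        for (int i = 0; i < paramTypes.length; i++) {
            // Each constructor parameter is itself resolved through the injector
            args[i] = getInstance(paramTypes[i]);
        }
        try {
            ctor.setAccessible(true);
            return type.cast(ctor.newInstance(args));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not construct " + target.getName(), e);
        }
    }
}

final class ToyInjectorDemo {
    public static void main(String[] args) {
        ToyInjector injector = new ToyInjector()
            .bind(PeanutButter.class, CrunchyPeanutButter.class);
        SandwichMaker maker = injector.getInstance(SandwichMaker.class);
        System.out.println(maker.peanutButter.getClass().getSimpleName()); // prints CrunchyPeanutButter
    }
}
```

Note that nothing here ever calls new SandwichMaker(...) by hand, which is the whole point.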

This is just the very most basic usage of Guice. Check out their documentation for more details.

Guice Servlet

Now that we know how to bind instances (whether as singletons or new-one-every-time) and inject them, we can move on to Guice Servlet. This extension to Guice allows us to get rid of web.xml entirely (if embedding Jetty) or make it much simpler (if using a war).

Suppose you have a FooServlet servlet. In the old days of web.xml, there would be servlet tags and servlet-mapping tags to wire it up. Here’s how it is with Guice Servlet:

class FooServletModule extends ServletModule {
    @Override
    protected void configureServlets() {
        bind(FooServlet.class);
        serve("/foo").with(FooServlet.class);
 
        // other servlets
    }
}

Since Guice is instantiating your FooServlet class instead of relying on the servlet container to invoke the 0-args ctor, this also means that you can use @Inject on the ctor and get the objects you need that way instead of pulling them out of init params or servlet context.

To feed incoming requests into the servlets you’ve laid out with Guice Servlet, you need to set up GuiceFilter as a filter for all requests. You can do this in web.xml if you’re using a war (it’s the only thing you’ll actually need in web.xml) or just configure Jetty directly.

Injector injector = Guice.createInjector(new SandwichModule(), new AbstractModule() {
    @Override
    protected void configure() {
        binder().requireExplicitBindings();
        bind(GuiceFilter.class);
    }
});
 
FilterHolder guiceFilter = new FilterHolder(injector.getInstance(GuiceFilter.class));
handler.addFilter(guiceFilter, "/*", EnumSet.allOf(DispatcherType.class));

Now, you can stop here and bind your servlets in a ServletModule and get the benefits of being able to initialize your servlets with Guice, but it gets even easier with Jersey.

Jersey and Jackson

Jersey is the reference implementation of JAX-RS. It lets you do things like POST to the URL “http://localhost:8080/foo?bar=baz&quux=1234” and handle the request with this class:

@Path("/foo")
@Produces(MediaType.APPLICATION_JSON)
class FooResource {
    @POST
    public String handleFooPost(@QueryParam("bar") String bar, @QueryParam("quux") int quux) {
       return "{\"yay\":\"hooray\"}"; 
    }
}

The class, method and parameter names are not magic; only the annotations are what ties this code to handling that request. Jersey will set the status code, content length, etc. appropriately. It can also do void methods (returns status 204 for you), streaming responses, and entirely custom responses (pick the status code, the entity, etc), so check the documentation for the details.

If you use Jersey to handle actually calling your business logic instead of writing plain old servlets directly, you only need to have one servlet binding in your Guice Servlet module:

public class SandwichServletModule extends ServletModule {
    @Override
    protected void configureServlets() {
        // bind resource classes here
 
        // hook Jersey into Guice Servlet
        bind(GuiceContainer.class);
 
        // hook Jackson into Jersey as the POJO <-> JSON mapper
        bind(JacksonJsonProvider.class).in(Scopes.SINGLETON);
 
        serve("/*").with(GuiceContainer.class);
    }
}

GuiceContainer is a class that ships with Jersey (in the jersey-guice module). When you use GuiceContainer, all you need to do is bind the resource classes (like FooResource above) with Guice. The JacksonJsonProvider binding makes it easier to return objects as JSON; you’ll see what it does a bit later. To go back to our sandwich example, let’s suppose we have two resources: one to make sandwiches and one to say how many sandwiches have been made.

Let’s start with the stats one (at /sandwich/stats) since it is simpler (doesn’t need to modify state). Supposing I have a SandwichStats class that can provide StatsSnapshot objects that represent the total amount of jam, peanut butter and sandwiches, the resource is very simple:

@Singleton
@Produces(MediaType.APPLICATION_JSON)
@Path("/sandwich/stats")
public class SandwichStatsResource {
 
    private final SandwichStats sandwichStats;
 
    @Inject
    SandwichStatsResource(SandwichStats sandwichStats) {
        this.sandwichStats = sandwichStats;
    }
 
    @GET
    public SandwichStats.StatsSnapshot getStats() {
        return sandwichStats.getStats();
    }
}

You’ll also need to bind the resource class so that GuiceContainer can find it. When you have a lot of resources, you might want to put their bindings in a module just for that purpose, but for now just add a binding in SandwichServletModule:

bind(SandwichStatsResource.class);

The resource requests a SandwichStats object in its ctor so that it can ask it for the latest info every time a request is made. The SandwichStats object should also be a singleton since we want it to live as long as the app is up. (Guice is easy to use for non-singletons too — it just happens that in this contrived example I seem to be ending up with lots of singletons.)

Note that the @GET method returns an object, not a string. This is where the JacksonJsonProvider binding in SandwichServletModule comes in. It lets Jersey map the POJO returned from the method to JSON. To do this, we use the Jackson JSON library.

Jackson is a fast JSON serialization/deserialization library, and one of its features is easy annotation-based configuration for POJO mapping. It’s probably easiest to just show what it looks like in practice:

public class StatsSnapshot {
    // ...
    @JsonProperty
    public int getSandwichesMade() {
        return sandwichesMade;
    }
 
    @JsonProperty
    public int getGramsOfJam() {
        return gramsOfJam;
    }
 
    @JsonProperty
    public int getGramsOfPeanutButter() {
        return gramsOfPeanutButter;
    }
}

When serialized by Jackson, an object of that class will end up as a JSON object with three keys (“gramsOfJam”, “gramsOfPeanutButter” and “sandwichesMade”). Since Jersey has been configured to automatically use Jackson to serialize objects, you can go to /sandwich/stats in your browser and get something like this: {"gramsOfJam":150,"gramsOfPeanutButter":200,"sandwichesMade":1}

Now let’s add a resource that lets you make sandwiches. Normally I wouldn’t have something that modifies state be a GET but I want this to be easy to test in a browser, so let’s say that a GET to http://localhost:8080/sandwich/create will make a sandwich (with optional jam and peanutButter query parameters to specify how many grams of jam and peanut butter to use) and that the sandwich created should be returned as JSON. This is what such a resource might look like:

@Singleton
@Produces(MediaType.APPLICATION_JSON)
@ThreadSafe
@Path("/sandwich/create")
public class SandwichMakerResource {
 
    private final SandwichMaker sandwichMaker;
 
    private final SandwichStats sandwichStats;
 
    @Inject
    SandwichMakerResource(SandwichMaker sandwichMaker, SandwichStats sandwichStats) {
        this.sandwichMaker = sandwichMaker;
        this.sandwichStats = sandwichStats;
    }
 
    @GET
    public Sandwich makeSandwich(@QueryParam("jam") @DefaultValue("100") int gramsOfJam,
                                 @QueryParam("peanutButter") @DefaultValue("200") int gramsOfPeanutButter) {
        Sandwich sandwich = sandwichMaker.makeSandwich(gramsOfPeanutButter, gramsOfJam);
        sandwichStats.recordSandwich(sandwich);
 
        return sandwich;
    }
}

The resource (yet another singleton) gets SandwichMaker and SandwichStats instances injected, and whenever it makes a sandwich it updates the SandwichStats object. Try accessing it via your browser a few times and check the stats page to see that the counts do update correctly.
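If you’d rather poke at it from code than from a browser, a few lines of plain JDK HTTP client will do. This is a sketch that assumes the server from this article is already running on localhost:8080; SandwichClient is a made-up name, and readAllBytes() needs Java 9 or newer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class SandwichClient {

    /** Builds the url for the create resource with both optional query parameters set. */
    static String createUrl(String base, int gramsOfJam, int gramsOfPeanutButter) {
        return base + "/sandwich/create?jam=" + gramsOfJam
            + "&peanutButter=" + gramsOfPeanutButter;
    }

    /** Performs a GET and returns the response body as a string. */
    static String get(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Accept", "application/json");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String base = "http://localhost:8080";
        // Make one sandwich, then look at the totals
        System.out.println(get(createUrl(base, 150, 200)));
        System.out.println(get(base + "/sandwich/stats"));
    }
}
```

Run main a few times and the numbers in the second line of output should keep climbing.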

I’ve only scratched the surface of what these libraries can do with this simple example. Let me know if you think there’s some cool feature I missed (as long as it won’t complicate things too much to work it in).

A new Java Salesforce API Library

I’ve posted a lot of information about Salesforce over the years (Partner API Gotchas Part 1, Part 2, Part 3, Part 4, JAX-WS Tutorial Part 1, Part 2, Part 3, Part 4). I’m supplementing that with the release of an open source Java library for using the REST, Partner, Metadata and Apex APIs. You can get the source from the Team Lazer Beez open source project. The code is under active development but it is stable and ready for use. When I do cut a release, I’ll update this blog post to point to it.

Features

  • Easy-to-use wrappers around the APIs. The classes that tools generate for the SOAP APIs are clumsy and unintuitive (and not thread safe), so I’ve written better versions.
  • Limits concurrent API calls for Partner, Metadata and Apex connectors. This helps you avoid ‘concurrent request limit exceeded’ errors. (The REST API is harder to hit the limit with. I will add it to the request limiting system in the future.)
  • Transparent handling of INVALID_SESSION_ID for the Partner API. If someone else has closed the session ID you were using, the library will automatically re-login as needed.
  • Designed to be used with many different organizations simultaneously. If your app needs to talk to the orgs of many different customers all at the same time, it’s easy to do so. Of course, it’s also easy to work with just one org if that’s what you need.
  • Connections are reconfigurable at any time. You can update the login and password (for SOAP connections) or OAuth token (for REST connections) and the next API call you make will seamlessly use the updated information, even if it’s in another thread.
  • Designed with thread-safety in mind. Where practical, classes are thread-safe or immutable.
  • HTTP communication is gzip-compressed.
  • Well-tested, robust code.
  • Business-friendly Apache License (no GPL issues)

Getting started

First you’ll need to check out and build the code.

% git clone git://github.com/teamlazerbeez/sf-api-connector.git
% cd sf-api-connector
% mvn clean install -DskipTests

Maven will probably need to download a bunch of plugins if you’re not a Maven user yet. Once it’s done, you’ll have installed the jars into your local Maven repository. Tests are skipped because some of them use Salesforce’s real API and are therefore only allowed from certain blocks of IP addresses, so those tests will fail unless you happen to be using my ISP. Also, the tests take quite some time to run.

The SOAP-based APIs (Partner, Metadata and Apex) are in the sf-soap-api-connector module and the REST API is in the sf-rest-api-connector module. The dependencies to use for your pom.xml are shown below. You only need to include the dependency for the API that you’ll be using (that is, you don’t need both unless you’re using both). The dependencies are both SNAPSHOT versions because that’s what’s declared in the project’s pom.xml right now.

<dependency>
  <groupId>com.teamlazerbeez</groupId>
  <artifactId>sf-soap-api-connector</artifactId>
  <version>trunk-SNAPSHOT</version>
</dependency>
<dependency>
  <groupId>com.teamlazerbeez</groupId>
  <artifactId>sf-rest-api-connector</artifactId>
  <version>trunk-SNAPSHOT</version>
</dependency>

Using the REST API

Most people can probably do what they need with the REST API, so I’ll start with that since it’s simpler. RestConnectionPool is the starting point for using the REST API. The class has a generic parameter to allow you to use any class you want to identify organizations. You might use the Salesforce organization id (the id that starts with ’00D’) or perhaps the primary key for a table in your DB that has a row for each organization. Make sure you use an immutable class with a useful hashCode()/equals() implementation since these objects will be used as keys in a Map. In this example I’ll use Integer.

// in your app setup, create a pool that's shared for the whole app
RestConnectionPool<Integer> pool = new RestConnectionPoolImpl<Integer>();
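If Integer doesn’t fit your app and you want your own identifier type, this is the kind of class I mean by “immutable with a useful hashCode()/equals()”. OrgKey is a hypothetical name and the id in the demo is a made-up value, not a real organization.

```java
import java.util.HashMap;
import java.util.Map;

final class OrgKey {
    // final field and no setters, so instances never change after construction
    private final String salesforceOrgId;

    OrgKey(String salesforceOrgId) {
        if (salesforceOrgId == null) {
            throw new NullPointerException("salesforceOrgId");
        }
        this.salesforceOrgId = salesforceOrgId;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof OrgKey && salesforceOrgId.equals(((OrgKey) o).salesforceOrgId);
    }

    @Override
    public int hashCode() {
        return salesforceOrgId.hashCode();
    }

    @Override
    public String toString() {
        return "OrgKey(" + salesforceOrgId + ")";
    }
}

final class OrgKeyDemo {
    public static void main(String[] args) {
        Map<OrgKey, String> hostsByOrg = new HashMap<>();
        hostsByOrg.put(new OrgKey("00D000000000001"), "example-host");
        // A different instance with the same id finds the same entry
        System.out.println(hostsByOrg.get(new OrgKey("00D000000000001"))); // prints example-host
    }
}
```

A class like this could then be used as the type parameter in place of Integer.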

Before you can use the pool to get a connection for a specific org, you need to inform the pool about that org.

pool.configureOrg(orgId, host, oauthToken);

The host is the HTTP endpoint to use. Different orgs may be on different Salesforce clusters, so this is a per-org setting. You can extract the host from the partner server URL (see ConnectionBundle.getBindingConfig() in the SOAP API module or the tip later on in this article) if you don’t already have it. The OAuth token is something you’ll get by following the steps outlined in Salesforce’s OAuth docs. You can also use a session Id from the SOAP API as the OAuth token.

You can configure a pool with as many orgs as you want. The pool is shared between all of them for efficiency. You can also reconfigure the data for an org in a pool at any time.

Now that the pool is configured, let’s actually use a connection.

RestConnection connection = pool.getRestConnection(orgId);
 
SObject contact = connection.retrieve("Contact", new Id("0035000000kmAZJ"),
    Arrays.asList("FirstName", "LastName", "Email"));
 
System.out.println("Got a " + contact.getType() + 
    " object with id " + contact.getId());
 
for (Map.Entry<String,String> entry: contact.getAllFields()) {
    System.out.println("Field <" + entry.getKey() + "> has value <" + 
        entry.getValue() + ">");
}
 
SObject newLead = RestSObjectImpl.getNew("Lead");
newLead.setField("LastName", "Smith");
newLead.setField("Company", "FooCorp");
 
SaveResult result = connection.create(newLead);
 
if (result.isSuccess()) {
    System.out.println("Created a lead with id " + result.getId());
} else {
    System.out.println("Creation failed with errors " + result.getErrors());
}

Check the RestConnection interface to see what else you can do with it.

Using the SOAP APIs

The SOAP APIs are a little more complicated because there are three different WSDLs that share some common configuration. First you’ll need to pick a “partner key” or “client id”. This is a fairly arbitrary identifier that will be used to identify what application made an individual API call. Choose something at least mildly human-readable.

As with RestConnectionPool, you can choose any class you like to be your org identifier. Here I’ll use Integer again.

ConnectionPool<Integer> pool = new ConnectionPoolImpl<Integer>(yourPartnerKey);
 
pool.configureOrg(orgId, username, password, maxConcurrentApiCalls);

Username and password are the Salesforce username and password of the person you want to log in as. The last parameter sets the max number of concurrent calls you want to happen per org. Fortunately in a recent update (API version 19 or 20, I think) Salesforce greatly relaxed the limits on concurrent connection usage to only apply to calls that take longer than 20 seconds, so you can set this limit fairly high unless you’re going to be doing long-running calls. See Salesforce’s API Usage Metering docs for more details.
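To give a feel for what a per-org concurrency cap means in practice, here is the general idea expressed with a java.util.concurrent.Semaphore. I’m not claiming this is how the library implements it internally. CallLimiter is a made-up class for illustration, and you would keep one per org.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class CallLimiter {
    private final Semaphore permits;

    CallLimiter(int maxConcurrentApiCalls) {
        // Fair semaphore so waiting callers proceed in arrival order
        this.permits = new Semaphore(maxConcurrentApiCalls, true);
    }

    /** Blocks until a slot is free, runs the call, and always gives the slot back. */
    <T> T call(Supplier<T> apiCall) {
        permits.acquireUninterruptibly();
        try {
            return apiCall.get();
        } finally {
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}

final class CallLimiterDemo {
    public static void main(String[] args) {
        CallLimiter limiter = new CallLimiter(2);
        // While the call is in flight, one of the two slots is taken
        int duringCall = limiter.call(limiter::availablePermits);
        System.out.println(duringCall);                 // prints 1
        System.out.println(limiter.availablePermits()); // prints 2
    }
}
```

With a cap of N, the (N+1)th simultaneous caller simply waits instead of triggering a ‘concurrent request limit exceeded’ error on the Salesforce side.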

You can now get a ConnectionBundle for an org. A ConnectionBundle represents the per-org configuration needed for all SOAP connections.

ConnectionBundle bundle = pool.getConnectionBundle(orgId);
 
PartnerConnection partnerConn = bundle.getPartnerConnection();
ApexConnection apexConn = bundle.getApexConnection();
MetadataConnection metadataConn = bundle.getMetadataConnection();
 
// partner connection
System.out.println("User id is " + partnerConn.getUserInfo().getUserId());
System.out.println("First contact's email is " + 
    partnerConn.query("SELECT Email FROM Contact LIMIT 1")
        .getSObjects().get(0).getField("Email"));
 
// metadata connection
List<FileProperties> fileProps = metadataConn
    .listMetadata(Arrays.asList(new ListMetadataQuery("CustomField")));
for (FileProperties fp: fileProps) {
    System.out.println("Custom field " + fp.getFullName() + " has id " + fp.getId());
}
 
// apex connection
ExecuteAnonResult result = 
    apexConn.executeAnonymous("System.debug('test debug statement');");
 
System.out.println("Compile succeeded: " + result.isCompiled());
System.out.println("Debug log output: " + result.getDebugLog());

Note that the Metadata API example shows how to get custom field Ids, something that is impossible via the Partner API.

The ConnectionBundle interface also lets you get the current configuration data for an org. You can use this to get the hostname that you need for RestConnectionPool. Using OAuth to get the token is recommended if that’s an option, but if not you can also get the current session Id from the configuration data and use that as the OAuth token.

BindingConfig bindingConfig = soapPool.getConnectionBundle(orgId).getBindingConfig();
String host = new URL(bindingConfig.getPartnerServerUrl()).getHost();
String token = bindingConfig.getSessionId();
restPool.configureOrg(orgId, host, token);

The library doesn’t support every single possible call in all of the APIs, but it does support most of them. I only implemented the calls I had occasion to use, so the Partner, Metadata and REST APIs have fairly complete support while the Apex API only supports a fraction of the available calls. If there’s a call that you use that the library doesn’t support, let me know and I’ll see what I can do. Similarly, if the library’s API doesn’t seem intuitive or doesn’t fit well with what you’re doing, I’d like to hear about that as well.

Using dynamically generated configs with puppet

After using Puppet with an external node classifier for a while, one starts to wonder what else it could generate besides the YAML that feeds the puppetmaster. When supervisor was being rolled out, a large number of near-identical config files needed to be generated, but the details specific to each config really had no place in Puppet. The solution was to have the Django app generate the config files and then have Puppet pull them down with a custom parser function.

In /var/lib/puppet/lib/puppet/parser/functions lives the file webcontent.rb which has the following contents:

require 'open-uri'
module Puppet::Parser::Functions
    newfunction(:webcontent, :type => :rvalue) do |args|
        server = args[0]
        configpath = args[1]
        config = ""
        begin
            open( "http://#{server}/#{configpath}/" ) do |f|
                f.each_line do |line|
                    config = "#{config}#{line}"
                end
            end
        rescue OpenURI::HTTPError => e
            raise Puppet::ParseError, "404 for http://#{server}/#{configpath}/"
        rescue Exception => e
            raise Puppet::ParseError, "content string is http://#{server}/#{configpath}/ #{e}"
        end
        return config
    end
end

Using the Ruby module open-uri, the puppetmaster fetches the content and places it into the catalog. With the following Django model, view and template, a config file is easily generated and passed along to Puppet:

class SupervisorProgram(models.Model):
    name = models.CharField(max_length=128)
    command = models.CharField(max_length=512)
    autostart = models.BooleanField(default=True)
    autorestart = models.CharField(max_length=32,choices=(('false','false'),('true','true'),('unexpected','unexpected')))
    startsecs = models.IntegerField(default=10)
    startretries = models.IntegerField(default=3)
    exitcodes = models.CharField(max_length=64,default="0,2")
    stopsignal = models.CharField(max_length=5,choices=(('TERM','TERM'),('HUP','HUP'),('INT','INT'),('QUIT','QUIT'),('KILL','KILL'),('USR1','USR1'),('USR2','USR2')),default="TERM")
    stopwaitsecs = models.IntegerField(default=10)
    user = models.CharField(max_length=16 ,default="nagios")
    redirect_stderr = models.BooleanField(default=False)
    stdout_logfile = models.CharField(max_length=256,default="AUTO")
    stdout_logfile_maxbytes = models.CharField( max_length = 8,default="50MB")
    stdout_logfile_backups = models.IntegerField(default=10)
    stderr_logfile = models.CharField(max_length=256,default="AUTO")
    stderr_logfile_maxbytes = models.CharField( max_length = 8,default="50MB")
    stderr_logfile_backups = models.IntegerField(default=10)
    environment = models.CharField( max_length=512,blank=True,null=True)
    directory = models.CharField(max_length=128,default="/")
    umask = models.IntegerField(blank=True,null=True)
    priority = models.IntegerField(default=999)
 
    def __unicode__(self):
        return self.name
    class Meta:
        ordering = ('name',)
 
class SupervisorProgramAdmin(admin.ModelAdmin):
    list_display = ('name','command','autorestart','stopsignal','exitcodes','user','stdout_logfile','stderr_logfile')

The following is the view used:

def getSupervisorConfig(request,service):
    print "getSupervisorConfig has been called for %s" % service
    service = get_object_or_404(SupervisorProgram,name=service)
    directives = {}
    directives["command"] = str(service.command)
    directives["process_name"] = str(service.name)
    directives["priority"] = int(service.priority)
    directives["autostart" ] = service.autostart
    directives["autorestart"] = service.autorestart
    directives["startsecs"] = int(service.startsecs)
    directives["startretries"] = int(service.startretries)
    directives["exitcodes"] = str(service.exitcodes)
    directives["stopsignal"] = str(service.stopsignal)
    directives["stopwaitsecs"] = int(service.stopwaitsecs)
    directives["user"] = str(service.user)
    directives["redirect_stderr"] = service.redirect_stderr
    directives["stdout_logfile"] = str(service.stdout_logfile)
    directives["stdout_logfile_maxbytes"] = str(service.stdout_logfile_maxbytes)
    directives["stdout_logfile_backups"] = int(service.stdout_logfile_backups)
    directives["stderr_logfile"] = str(service.stderr_logfile)
    directives["stderr_logfile_maxbytes"] = str(service.stderr_logfile_maxbytes)
    directives["stderr_logfile_backups"] = int(service.stderr_logfile_backups)
    directives["directory"] = str(service.directory)
    if service.environment:
        directives["environment"] = str(service.environment)
 
    return render_to_response("sock/supervisor.conf",directives)

With 20 configuration options per supervisord-controlled process, there are far too many options to sanely pass to the puppetmaster from the external node classifier.
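Since the view above mostly copies fields one at a time, the same dictionary could also be built by looping over field names. The following is only a sketch of that alternative, not the code in use: the helper name `build_directives` and the field-group constants are hypothetical, and a `SimpleNamespace` with made-up values stands in for a `SupervisorProgram` instance so the example runs without Django.

```python
from types import SimpleNamespace

# Field groups mirror the conversions done in the view above.
STRING_FIELDS = (
    "command", "exitcodes", "stopsignal", "user", "directory",
    "stdout_logfile", "stdout_logfile_maxbytes",
    "stderr_logfile", "stderr_logfile_maxbytes",
)
INT_FIELDS = (
    "priority", "startsecs", "startretries", "stopwaitsecs",
    "stdout_logfile_backups", "stderr_logfile_backups",
)
PASSTHROUGH_FIELDS = ("autostart", "autorestart", "redirect_stderr")


def build_directives(service):
    """Hypothetical helper: build the template context from a model instance."""
    directives = {"process_name": str(service.name)}
    for field in STRING_FIELDS:
        directives[field] = str(getattr(service, field))
    for field in INT_FIELDS:
        directives[field] = int(getattr(service, field))
    for field in PASSTHROUGH_FIELDS:
        directives[field] = getattr(service, field)
    # environment is optional, so it is only included when set.
    if service.environment:
        directives["environment"] = str(service.environment)
    return directives


# Stand-in for a SupervisorProgram row; every value here is made up.
service = SimpleNamespace(
    name="myworker", command="/usr/bin/myworker", autostart=True,
    autorestart="unexpected", startsecs=10, startretries=3,
    exitcodes="0,2", stopsignal="TERM", stopwaitsecs=10, user="nagios",
    redirect_stderr=False, stdout_logfile="AUTO",
    stdout_logfile_maxbytes="50MB", stdout_logfile_backups=10,
    stderr_logfile="AUTO", stderr_logfile_maxbytes="50MB",
    stderr_logfile_backups=10, environment=None, directory="/",
    priority=999,
)
directives = build_directives(service)
```

The resulting dictionary can be passed to `render_to_response` exactly as in the view. The trade-off is that adding a model field then means updating one of the tuples rather than adding another assignment line.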

Here is the Django template:

#generated config
[program:{{ process_name }}]
command={{ command }}
process_name=%(program_name)s
priority={{ priority }}
autostart={{ autostart }}
autorestart={{ autorestart }}
startsecs={{ startsecs }}
startretries={{ startretries }}
exitcodes={{ exitcodes }}
stopsignal={{ stopsignal }}
stopwaitsecs={{ stopwaitsecs }}
user={{ user }}
redirect_stderr={{ redirect_stderr }}
stdout_logfile={{ stdout_logfile }}
stdout_logfile_maxbytes={{ stdout_logfile_maxbytes }}
stdout_logfile_backups={{ stdout_logfile_backups }}
stderr_logfile={{ stderr_logfile }}
stderr_logfile_maxbytes={{ stderr_logfile_maxbytes }}
stderr_logfile_backups={{ stderr_logfile_backups }}
{% if environment %}
environment={{ environment }}
{% endif %}

Finally all of this can be referenced with a custom define as follows:

define supervisorconfig(
        $program,
        $server = "${rserver}"
) {
    file {
        "${name}.conf":
            owner => root,
            group => root,
            mode => 0644,
            path => "/etc/supervisord.d/${name}.conf",
            content => webcontent( $server, "dpuppet/sock3/getsupervisorconfig/$program")
    }
}
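
To see how the pieces connect, it can help to work out which URL the puppetmaster ends up requesting for a given program. The short Python sketch below mirrors the string interpolation done in webcontent.rb and the path used in the define. The function names, the hostname `puppet.example.com` and the program name `myworker` are made up for illustration, and the Django URL configuration that routes this path to the view is not shown in this article.

```python
def webcontent_url(server, configpath):
    """Mirror the URL built by the webcontent parser function."""
    return "http://{0}/{1}/".format(server, configpath)


def supervisorconfig_url(server, program):
    # The path prefix is the one passed to webcontent() in the define.
    return webcontent_url(
        server, "dpuppet/sock3/getsupervisorconfig/{0}".format(program))


# Hypothetical server and program names.
print(supervisorconfig_url("puppet.example.com", "myworker"))
# prints http://puppet.example.com/dpuppet/sock3/getsupervisorconfig/myworker/
```

Requesting that URL by hand (with curl, for example) is a quick way to check what content Puppet will place into the file before running the agent.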