Multiple endpoints using CXF and Spring Boot

I recently ran into a problem while porting a couple of old SOAP services that currently run on JBoss into a Spring Boot application. As a start, https://blog.codecentric.de/en/2016/02/spring-boot-apache-cxf/ provided a good introduction on how to use CXF and Spring Boot, but the example only shows one endpoint.

For some reason I couldn’t find any good resources on how to create two independent endpoints using CXF and Spring Boot, so I’m writing it down in case somebody else runs into the same problem.

The solution is quite simple: to get multiple independent endpoints, create one SpringBus and one ServletRegistrationBean per base URL.

As an example, say you want to have two versions of an API:

/api/v1/orders/get
/api/v1/orders/create
/api/v2/orders/get
/api/v2/orders/create
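
Each of these endpoints is backed by an ordinary JAX-WS service bean. As a minimal sketch of what one of them could look like (the operation and its return value here are just placeholders):

import javax.jws.WebService;

import org.springframework.stereotype.Service;

// Minimal sketch: a JAX-WS service bean that the configuration below
// can inject; the operation and return type are placeholders.
@Service
@WebService(serviceName = "GetOrderServiceV1")
public class GetOrderServiceV1 {

    public String getOrder(String orderId) {
        return "order " + orderId;
    }
}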

Using Spring Boot we can define the following two configuration classes.

import org.apache.cxf.bus.spring.SpringBus;
import org.apache.cxf.jaxws.EndpointImpl;
import org.apache.cxf.transport.servlet.CXFServlet;
import org.springframework.boot.web.servlet.ServletRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SomeApiV1Config {

    // A dedicated bus for the v1 endpoints, separate from the v2 bus
    @Bean
    public SpringBus springBusV1() {
        SpringBus bus = new SpringBus();
        bus.setId("v1");
        return bus;
    }

    // A dedicated CXFServlet tied to the v1 bus, serving /api/v1/*
    @Bean
    public ServletRegistrationBean v1Servlet() {
        CXFServlet cxfServlet = new CXFServlet();
        cxfServlet.setBus(springBusV1());

        ServletRegistrationBean servletBean = new ServletRegistrationBean(cxfServlet, "/api/v1/*");
        servletBean.setName("v1");
        return servletBean;
    }

    // Published relative to the servlet mapping, i.e. /api/v1/orders/get
    @Bean
    public EndpointImpl getOrderV1(GetOrderServiceV1 service) {
        EndpointImpl endpoint = new EndpointImpl(springBusV1(), service);
        endpoint.publish("/orders/get");
        return endpoint;
    }

    // Published relative to the servlet mapping, i.e. /api/v1/orders/create
    @Bean
    public EndpointImpl createOrderV1(CreateOrderServiceV1 service) {
        EndpointImpl endpoint = new EndpointImpl(springBusV1(), service);
        endpoint.publish("/orders/create");
        return endpoint;
    }
}

import org.apache.cxf.bus.spring.SpringBus;
import org.apache.cxf.jaxws.EndpointImpl;
import org.apache.cxf.transport.servlet.CXFServlet;
import org.springframework.boot.web.servlet.ServletRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SomeApiV2Config {

    // A dedicated bus for the v2 endpoints
    @Bean
    public SpringBus springBusV2() {
        SpringBus bus = new SpringBus();
        bus.setId("v2");
        return bus;
    }

    // A dedicated CXFServlet tied to the v2 bus, serving /api/v2/*
    @Bean
    public ServletRegistrationBean v2Servlet() {
        CXFServlet cxfServlet = new CXFServlet();
        cxfServlet.setBus(springBusV2());

        ServletRegistrationBean servletBean = new ServletRegistrationBean(cxfServlet, "/api/v2/*");
        servletBean.setName("v2");
        return servletBean;
    }

    // Published relative to the servlet mapping, i.e. /api/v2/orders/get
    @Bean
    public EndpointImpl getOrderV2(GetOrderServiceV2 service) {
        EndpointImpl endpoint = new EndpointImpl(springBusV2(), service);
        endpoint.publish("/orders/get");
        return endpoint;
    }

    // Published relative to the servlet mapping, i.e. /api/v2/orders/create
    @Bean
    public EndpointImpl createOrderV2(CreateOrderServiceV2 service) {
        EndpointImpl endpoint = new EndpointImpl(springBusV2(), service);
        endpoint.publish("/orders/create");
        return endpoint;
    }
}
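
With this in place each version gets its own bus and its own CXFServlet, so the two APIs are completely independent: the v1 services are reachable under /api/v1 (for example /api/v1/orders/get, with the WSDL at /api/v1/orders/get?wsdl) and the v2 services under /api/v2.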

Tuning the Logstash multiline filter

I’m currently working on a project where we are using Apache Mesos and Marathon as our platform to run and provision our applications. We use the ELK stack to receive, store and analyze our logs. On each of the nodes in the cluster we have Filebeat (the successor of the Logstash Forwarder) installed, which sends the logs to Logstash.

At first everything looked fine; all our tasks’ logs showed up in Kibana. However, as we added more and more applications to the cluster we started seeing a problem where log lines from one task would show up in the same event as lines from another task in Kibana.

At first we suspected problems with how we had configured the Filebeat prospector path which contained multiple wildcards:

/tmp/mesos/slaves/*/frameworks/*/executors/*/runs/latest/stdout
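
In a Filebeat 1.x configuration the prospector part would look roughly like this (a sketch; input_type is Filebeat’s default, spelled out here for clarity):

filebeat:
  prospectors:
    -
      paths:
        - /tmp/mesos/slaves/*/frameworks/*/executors/*/runs/latest/stdout
      input_type: log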

The source field contained multiple paths, indicating that this message came from two different tasks. A log message should never have more than one path, so how could this be?

It turned out that the problem was caused by the multiline filter we were using to collapse multiline messages into one Logstash event.

Looking through the documentation of the multiline filter we found the stream_identity setting, which determines which stream an event belongs to. By default the stream identity is %{host}.%{path}.%{type}, but for some reason this was not enough in our case. After appending %{source} to the stream identity the problem disappeared. This is what we added to our multiline filter:

 stream_identity => "%{host}.%{path}.%{type}.%{source}"
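
In context, the whole filter ended up looking roughly like this (the pattern is illustrative; the point is that any line not starting a new timestamped message gets appended to the previous one):

filter {
  multiline {
    # Illustrative pattern: a line that does not start with a
    # timestamp belongs to the previous event
    pattern => "^%{TIMESTAMP_ISO8601}"
    negate => true
    what => "previous"
    # Include the source path so streams from different tasks
    # are never merged into the same event
    stream_identity => "%{host}.%{path}.%{type}.%{source}"
  }
}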

With that fixed we noticed a different problem: some messages were being tagged with _grokparsefailure, indicating that the log parsing failed. Looking closer, this happened when the log message contained long output such as stack traces from Java exceptions. The start of a message would be parsed correctly, timestamp and all, but the message would be cut off. The following message would not contain a timestamp and would continue where the previous one stopped.

This seemed strange, as this is exactly the job of the multiline filter: it should keep collapsing lines into one message as long as they do not match the specified pattern; in our case, it should wait until a line matches a timestamp indicating a new message.

More reading of the documentation revealed that there is a periodic_flush setting which is true by default. That is, the multiline filter will periodically flush events based on the max_age setting, which defaults to 5 seconds. Our theory is that, due to processing delay and load on the ELK server, we hit this timeout, causing our messages to be split.

By increasing the max_age to 60 seconds the problem seems to have disappeared.

max_age => 60

Being a time-based setting this isn’t bulletproof, as the results will vary with the load on the server, but for now it seems to do the trick. Another option would be to disable the periodic flush, but that means potentially waiting a very long time for messages from tasks that don’t log often.

The problems mentioned in this post are some of the things being addressed by the new multiline codec, which aims to be the successor of the now deprecated multiline filter. Unfortunately, changing from the filter to the codec is not trivial in all cases, but I’ll leave that for another post.

Undefined symbols for architecture x86_64

I recently upgraded from OS X Mavericks to Yosemite and in the process broke GCC. Suddenly all I got when trying to compile with g++ was this error message:

~/dev/tmp$ g++ asdf.cpp
Undefined symbols for architecture x86_64:
  "start", referenced from:
     implicit entry/start for main executable
ld: symbol(s) not found for architecture x86_64
collect2: error: ld returned 1 exit status

Just to be clear, the code I was trying to compile was nothing fancy:

int main()
{
    return 0;
}

In reality it wasn’t really the compilation that was the problem, but rather the linking. After spending hours on Google without finding any similar cases, I nearly gave up.
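
In hindsight, a quick first check when the linker misbehaves like this is to see which toolchain binaries are actually being picked up, for example:

$ which -a g++   # list every g++ on the PATH, in resolution order
$ which -a ld    # the linker matters just as much as the compiler
$ g++ --version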

Before the upgrade I had installed GCC with Homebrew, and it was working fine. On my system g++ is symlinked to /usr/local/Cellar/gcc/4.9.2/bin/g++-4.9. I tried reinstalling GCC, recompiling from source and upgrading Xcode, but nothing worked. It was driving me crazy. Then I discovered the doctor command for Homebrew, which gave me a lead:

$ brew doctor
[...]
Warning: You have MacPorts or Fink installed:
  /opt/local/bin/port

This can cause trouble. You don't have to uninstall them, but you may want to
temporarily move them out of the way, e.g.

  sudo mv /opt/local ~/macports
[...]

I didn’t even remember that I had MacPorts installed, but there it was, potentially causing problems. To uninstall it I ran the following command from their website:

sudo port -fp uninstall installed

And what do you know: it turned out that there was another version of GCC installed with MacPorts. After getting rid of that, everything suddenly worked as expected!

I don’t know why this suddenly happened after upgrading to Yosemite or why it worked back in Mavericks, but at least I can finally compile stuff again. If you encounter the same problem, hopefully this helps you out.