Caching in Gradle

Setting up a CI pipeline is nowadays standard in software development. In our case, we use Jenkins as a CI server with one master and one slave. The master builds our artifacts and uploads them to a repository, and the slave is used to execute regression tests. As Wikipedia puts it, a regression test verifies that software which was previously developed and tested still performs correctly after it was changed. As also mentioned in that article, regression tests can be used when a feature is redesigned, to ensure that the mistakes made in the original implementation are not repeated in the redesign. This applies quite well to my current situation.

Reducing legacy costs

In the past few weeks, I replaced the source of our input data, moving from a database-based approach to a file-based one which provides more flexibility and is also a bit faster than the old one. When developing simulation software there is one basic rule: if you do not change the input or the implemented semantics, the output must stay the same.

So replacing the mechanism that loads the input data should not change the output. Doing this in a legacy environment means that we do not have good test coverage. Therefore, to preserve the behaviour, a couple of simple regression tests are used which basically compare the simulation output of the old implementation with that of the new one. Each iteration of the regression tests takes roughly 55 minutes to complete, so it is possible to run it 7 to 8 times a day.

Our CI pipeline handles this for us. The code is built and uploaded to a file-based repository on our file server. Afterwards the regression tests are triggered and Gradle uses the latest artifacts for the tests. Nothing special here, it looks like a normal CI pipeline.
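
On the regression-test side, such a file-based repository is simply declared next to the remote ones in the build script. A minimal sketch of how that might look; the mount point and the artifact coordinates are made-up examples, not our actual setup:

repositories {
  // remote repository, cached by Gradle in its local dependency cache
  mavenCentral()
  // file-based repository on the mounted file server (hypothetical path)
  maven {
    url "file:///mnt/fileserver/repository"
  }
}

dependencies {
  // the artifact built and uploaded by the Jenkins master (hypothetical coordinates)
  compile 'com.example:simulation:1.0-SNAPSHOT'
}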

Problems with the repository

During development it happened from time to time that the regression tests failed with a NoClassDefFoundError pointing to our main class. I did not understand this at first, because the class had not been changed and it was definitely there.

Keeping an eye on this phenomenon revealed that builds during lunch always succeeded unless there was a real bug, while builds during working hours sometimes succeeded and sometimes failed. It looked as if a regression test failed whenever it happened to start at the very moment the artifacts were being built and uploaded to the repository.

Not all local resources are local

As mentioned earlier, Gradle is used for the build. Gradle has built-in support for caching dependencies in a local folder after downloading them from a remote repository. Gradle can also display where each dependency is taken from during a build, see stackoverflow.

// print the path each runtime dependency is resolved from
task printDeps {
  doLast {
    println "Dependencies:"
    configurations.runtime.each { println it }
  }
}

Adding the above task to the build file and executing it shows all dependencies and the path to each of them. In my case, for most of the dependencies the path pointed to the local Gradle cache. For the artifacts located in the file-based repository, however, the path pointed directly to the file server instead of the local cache.
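
For illustration only, a run of the task might look roughly like this; the paths are shortened and made up, but they show the pattern of cached versus directly referenced artifacts:

$ gradle printDeps
Dependencies:
/home/jenkins/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-api/1.7.25/.../slf4j-api-1.7.25.jar
/mnt/fileserver/repository/com/example/simulation/1.0-SNAPSHOT/simulation-1.0-SNAPSHOT.jar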

Digging deeper into this revealed that Gradle considers all file-based repositories to be local. Local repositories are not considered worth caching, so the dependency is used directly from that location, even if the location points to a server. As mentioned at gradle.org, this is hard-coded into Gradle. There were also some feature requests to make this behaviour configurable for Ivy and Maven repositories. Sadly, however, they did not survive the migration to GitHub.

Finding our way out

This behaviour is hard-coded only for file repositories. So one solution could be to switch to a binary repository like Artifactory or Nexus. Nevertheless, this has the drawback of another server to maintain, which in our case provides little added value compared to the file-server solution.

Another solution is to download and cache the dependencies manually in the build script. This can be done with a dedicated task that copies the file-based repository into a local folder, as shown below. It always copies the whole repository, which can increase build time and network load. One could add some caching logic on top, but that would just reinvent the wheel.

task syncDependencies(type: Sync) {
  group = 'build setup'
  // copy the complete file-based repository from the file server ...
  from project.ext["mobitopp.repository.url"]
  // ... into a local folder that acts as our cache
  into project.ext["local.cache.path"] as File
}
// make sure the local copy exists before compilation starts
compileJava.dependsOn syncDependencies
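
With the sync in place, the repository declaration can then point at the local cache folder instead of the file server, so a running build no longer resolves artifacts that are being replaced remotely. A sketch using the same property as above; the exact declaration in our build is simplified here:

repositories {
  // resolve artifacts from the locally synced copy instead of the file server
  maven {
    url uri(project.ext["local.cache.path"])
  }
}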

In our case, the build time did not increase significantly, and compared to the maintenance cost of another server this approach is easier for us to handle.

Conclusion

Watch where your build tool loads its artifacts from, and make sure that concurrent builds cannot affect each other.
