2016-10-15
What will we learn in this post?
sbt-assembly
pom
SBT (Simple Build Tool) is a build tool for Scala and Java. Build definitions are written in Scala, so in principle we can do anything the language allows. The flip side is that we sometimes have to do things ourselves. Another downside is that SBT's configuration keys are sparsely documented, so we have to dig to find its hidden capabilities.
As a Scala project, Apache Gearpump (incubating) has been using SBT from the beginning. In this post, I'll share our experience working on the shaded libraries / dependencies of Gearpump.
Gearpump uses popular libraries like guava and puts them on the classpath at launch. Because guava is so popular, a user application may well depend on it too, but on a different version. At runtime both versions end up on the classpath, and the system guava may be loaded instead of the user's, breaking the application in unexpected ways. That's where shading comes in. It allows us, for example, to rename the system guava classes from com.google.* to org.apache.gearpump.google.*. User code keeps importing com.google.*, while Gearpump system code refers to org.apache.gearpump.google.*, so the system classes are never accidentally loaded into a user application.
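To make the renaming concrete, here is a minimal sketch of the name mapping only. The `rename` helper is hypothetical and written for illustration; a real shading tool rewrites the class files and every bytecode reference to them, not just strings.

```scala
object ShadeRenameSketch {
  // Hypothetical helper: maps a fully qualified class name from one package
  // prefix to another, leaving classes outside the prefix untouched.
  // This only mimics the effect of a rename rule on names; it does not
  // touch bytecode.
  def rename(className: String, fromPrefix: String, toPrefix: String): String =
    if (className.startsWith(fromPrefix + ".")) toPrefix + className.substring(fromPrefix.length)
    else className

  def main(args: Array[String]): Unit = {
    // A guava class is moved under the Gearpump namespace.
    println(rename("com.google.common.collect.ImmutableList", "com.google", "org.apache.gearpump.google"))
    // A class outside com.google is left alone.
    println(rename("java.util.List", "com.google", "org.apache.gearpump.google"))
  }
}
```

After shading, the system copy of guava lives under a package name that user code never imports, so the two versions cannot collide.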
SBT has no dedicated shade plugin comparable to the Maven Shade Plugin. In the past, we shaded system libraries with Maven and hosted them at gearpump-shaded-repo as external dependencies. When Gearpump moved to Apache, they were required to be hosted in Gearpump's Apache repo. The sbt-assembly plugin came to the rescue: it has supported shading since its 0.14.0 release. sbt-assembly creates a fat JAR of a project with all of its dependencies (akin to the Maven Assembly Plugin). In Gearpump, we usually use it to assemble example projects, but it can be used like the Maven Shade Plugin as well.
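For readers who have not used the plugin before, enabling it is typically a one-line addition to project/plugins.sbt. The version below is only the earliest release with shading support mentioned above; which version Gearpump actually pins is not stated here, so treat it as an assumption.

```scala
// project/plugins.sbt
// Any sbt-assembly release from 0.14.0 onwards should support shading;
// the exact version here is an assumption, not Gearpump's actual setting.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.0")
```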
We create empty SBT shaded projects with the corresponding original libraries as dependencies and define a ShadeRule (check out more shading rules) inside assemblyShadeRules, e.g.
val shaded = Project(
  id = "gearpump-shaded",
  base = file("shaded")
).aggregate(shaded_akka_kryo, shaded_gs_collections, shaded_guava, shaded_metrics_graphite)

lazy val shaded_guava = Project(
  id = "gearpump-shaded-guava",
  base = file("shaded/guava"),
  settings = shadeAssemblySettings ++
    addArtifact(Artifact("gearpump-shaded-guava"), sbtassembly.AssemblyKeys.assembly) ++
    Seq(
      assemblyShadeRules in assembly := Seq(
        ShadeRule.rename("com.google.**" -> "org.apache.gearpump.google.@1").inAll
      )
    ) ++
    Seq(
      libraryDependencies ++= Seq(
        "com.google.guava" % "guava" % guavaVersion
      )
    )
)
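Besides rename(...).inAll, sbt-assembly offers a few other ways to build and scope a rule. The fragment below is a sketch of those variants in the same sbt 0.13 syntax; the package names and the guava version in it are placeholders chosen for illustration, not values taken from the Gearpump build.

```scala
// build.sbt fragment (illustrative only; package names and versions are placeholders)
assemblyShadeRules in assembly := Seq(
  // Apply the rename only to classes coming from one specific library.
  ShadeRule.rename("com.google.**" -> "org.apache.gearpump.google.@1")
    .inLibrary("com.google.guava" % "guava" % "16.0.1"),
  // Apply a rename only to the project's own classes.
  ShadeRule.rename("com.example.internal.**" -> "shaded.example.internal.@1").inProject,
  // Drop matching classes from the assembled jar altogether.
  ShadeRule.zap("com.example.unused.**").inAll
)
```

In the pattern, `**` matches any valid package substring and `@1` refers to the text matched by the first wildcard, which is why `com.google.**` can be mapped to `org.apache.gearpump.google.@1`.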
Note that we declare this project's assembled jar as a published artifact with addArtifact(Artifact("gearpump-shaded-guava"), sbtassembly.AssemblyKeys.assembly) (please refer to Defining custom artifacts).
With shadeAssemblySettings we add more assembly options. The assemblyJarName and target in assembly settings are crucial for other projects to pick up the dependency.
val shadeAssemblySettings = Build.commonSettings ++ Seq(
  scalaVersion := Build.scalaVersionNumber,
  test in assembly := {},
  assemblyOption in assembly ~= {
    _.copy(includeScala = false)
  },
  assemblyJarName in assembly := {
    s"${name.value}_$scalaVersionMajor-${version.value}.jar"
  },
  target in assembly := baseDirectory.value.getParentFile / "target" / scalaVersionMajor
)
Remember that other projects must depend on the assembled shaded jar, so an inter-project source dependency like the following won't work.
lazy val core = Project(
  id = "gearpump-core",
  base = file("core"),
  settings = commonSettings ++ javadocSettings ++ coreDependencies
).dependsOn(shaded_guava)
SBT provides an unmanagedJars setting so that we can add an arbitrary jar to the dependencies. The jar path must match the one defined for the shaded project.
lazy val core = Project(
  id = "gearpump-core",
  base = file("core"),
  settings = commonSettings ++ javadocSettings ++ coreDependencies ++ Seq(
    // Scope to the Compile configuration so the jar lands on the compile classpath.
    unmanagedJars in Compile ++= Seq(
      shaded.base / "target" / scalaVersionMajor / s"gearpump-shaded-guava_$scalaVersionMajor-$gearpumpVersion.jar"
    )
  )
)
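Because the jar name and directory are spelled out twice (once in the shaded project's assembly settings, once in the consumer's unmanagedJars), they can drift apart. One way to keep them in sync is to compute both from a single helper. The sketch below uses plain java.io.File so that it runs outside SBT; the object name, the helper names and the Scala version value are all hypothetical.

```scala
import java.io.File

object ShadedJarPath {
  // Assumed full Scala version; the real build reads it from its own settings.
  val scalaVersionNumber = "2.11.8"

  // Binary ("major") version, e.g. "2.11", derived from the full version.
  val scalaVersionMajor: String = scalaVersionNumber.split('.').take(2).mkString(".")

  // Mirrors the naming scheme used for assemblyJarName in the post.
  def shadedJarName(projectName: String, version: String): String =
    s"${projectName}_$scalaVersionMajor-$version.jar"

  // Mirrors the directory used for `target in assembly`:
  // <shaded base>/target/<scala major version>/<jar name>
  def shadedJarFile(shadedBase: File, projectName: String, version: String): File =
    new File(new File(new File(shadedBase, "target"), scalaVersionMajor),
      shadedJarName(projectName, version))
}
```

With such a helper, both the shaded project and its consumers would refer to the same function instead of repeating the string interpolation.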
Now we can build with sbt assembly.
pom
The unmanagedJars approach doesn't add the shaded project to the published Maven pom of the gearpump-core project. Although the shaded jars are transitive dependencies, users would have to add them explicitly to their own build files, which is cumbersome. Are we at a dead end?
Not yet. It's time to dig into SBT configuration keys! The nugget I found is pomPostProcess, which "Transforms the generated POM". We traverse the XML tree and append the shaded dependency to the <dependencies></dependencies> element.
lazy val core = Project(
  id = "gearpump-core",
  base = file("core"),
  settings = commonSettings ++ javadocSettings ++ coreDependencies ++ Seq(
    // Append the shaded dependency to the generated POM.
    pomPostProcess := {
      (node: xml.Node) => addShadedDeps(List(
        <dependency>
          <groupId>{organization.value}</groupId>
          <artifactId>{shaded_guava.id}</artifactId>
          <version>{version.value}</version>
        </dependency>
      ), node)
    },
    // Scope to the Compile configuration so the jar lands on the compile classpath.
    unmanagedJars in Compile ++= Seq(
      shaded.base / "target" / scalaVersionMajor / s"gearpump-shaded-guava_$scalaVersionMajor-$gearpumpVersion.jar"
    )
  )
)
private def addShadedDeps(deps: Seq[xml.Node], node: xml.Node): xml.Node = {
  node match {
    case elem: xml.Elem =>
      val child = if (elem.label == "dependencies") {
        elem.child ++ deps
      } else {
        elem.child.map(addShadedDeps(deps, _))
      }
      xml.Elem(elem.prefix, elem.label, elem.attributes, elem.scope, false, child: _*)
    case _ =>
      node
  }
}
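To see what the transformation does, it can be tried on a tiny hand-written POM outside SBT. The sketch below repeats addShadedDeps so that it is self-contained; the POM content, the group id and the version are made-up sample values, and on Scala 2.11 or later it assumes the scala-xml module is on the classpath.

```scala
import scala.xml.{Elem, Node}

object PomPostProcessDemo {
  // Same logic as addShadedDeps above: append deps to every <dependencies> element.
  def addShadedDeps(deps: Seq[Node], node: Node): Node = node match {
    case elem: Elem =>
      val child =
        if (elem.label == "dependencies") elem.child ++ deps
        else elem.child.map(addShadedDeps(deps, _))
      Elem(elem.prefix, elem.label, elem.attributes, elem.scope, false, child: _*)
    case _ => node
  }

  def main(args: Array[String]): Unit = {
    // Sample POM with one existing dependency (made-up content).
    val pom =
      <project><dependencies><dependency><artifactId>akka-actor</artifactId></dependency></dependencies></project>
    // Sample shaded dependency (group id and version are placeholders).
    val shaded =
      <dependency><groupId>org.apache.gearpump</groupId><artifactId>gearpump-shaded-guava</artifactId><version>0.0.0-SNAPSHOT</version></dependency>

    val result = addShadedDeps(List(shaded), pom)
    // The <dependencies> element now holds the original entry plus the shaded one.
    println((result \ "dependencies" \ "dependency").map(d => (d \ "artifactId").text).mkString(", "))
  }
}
```

Note that the function appends to every element named dependencies, so a POM that also contains a dependencyManagement section with its own dependencies element would receive the extra entries there as well.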
There is not much to be found when googling for the above problems. I hope this post can serve as a handy guide for SBT users. Please check out the Gearpump build files for a complete example.
gearpump-core and gearpump-streaming are the projects that depend on the shaded projects.