SBT extract

Why SBT

  1. Short, concise DSL, can be extended by pure Scala code
  2. Interactivity
  3. Background execution
  4. Default parallel execution (restriction on CPU, network and disk can be specified)
  5. Scala REPL integration
  6. Incremental compilation
  7. Default folder structure (can be adjusted)
  8. Defined workflow (can be adjusted or redefined)
  9. Type safety
  10. Direct dataflow between tasks is supported
  11. Simple hierarchy of build entities: just tasks and settings; some are predefined, and custom ones are easy to add
  12. Cross-building (against several Scala versions)
  13. Extensible through plugins

Folder structure

  • <root>/project/plugins.sbt
  • <root>/project/build.properties
  • <root>/build.sbt

SBT task – an executable unit. It can depend on other tasks (by using their return values inside its body) and can accept user input.

  • Declare key: val keyName = taskKey[keyType]("key description")
  • Assign value: keyName := …
  • Get value: keyName.value

SBT setting – just a named value. It can depend only on literals or on the values of other settings. Its exact value is determined when the build is loaded, so it cannot depend on a task's return value.

  • settingName := settingValue – assigns a value (redefines it, if already defined)
  • settingName += settingValue – appends a single value to a Seq
  • settingName ++= settingValue – appends a Seq to a Seq

Scopes

  • project
  • configurations – namespaces for keys (default: Compile, Test, Runtime, IntegrationTest)
  • task
  • global – default, if not specified

Multiproject – a build can be declared in a single sbt file or in multiple files (one per project). An abstract parent project can hold common settings, which concrete child projects extend or redefine. dependsOn – defines a dependency between projects.

Sources (compile/test configurations):

  • location settings: javaSource, resourceDirectory, scalaSource.
  • filtering: includeFilter, excludeFilter.
  • Managed: autogenerated by SBT or added explicitly into build.
  • Unmanaged: created outside of SBT, written by the developer.

Dependencies (compile/test/runtime):

  • internal (between projects) or external (on some lib outside – maven / ivy)
  • external can be: managed (maven / ivy) or unmanaged (jars from lib folder)
  • resolvers – a setting to which additional external Maven/Ivy repositories can be appended.

Dependency format: ModuleID – "groupID/organisation" % (or %%) "artifactID/product" % "version" (optionally followed by % "test" or % "provided"). %% appends the Scala binary version to the artifact ID.

  • exclude – the specified dependency will be omitted (additionally, exclusion rules can be applied)
  • classifier – additional parameters, like JDK version
  • intransitive or notTransitive – do not fetch transitive dependencies
  • withSources
  • withJavadoc
  • externalPom
  • externalIvy

Forking – running Test or Run in a separate JVM; custom JVM settings can be applied.

Session – in-memory SBT configuration changes; they are lost after reload unless saved to an SBT file.

SBT script troubleshooting: streams.value.log

Extending SBT: commands and plugins

Publishing artifact: publishTo

 

Java testing notes

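Test objects

  • Dummy – passed only to satisfy a signature, returns default values
  • Stub – returns a valid predefined value
  • Mock – contains some logic: records calls and verifies expectations
  • Fake – working implementation with valid but generated data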
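The kinds of test objects listed above can be illustrated with hand-written doubles. This is only a minimal sketch: the interfaces Mailer, RateSource and AuditLog and the class PriceNotifier are hypothetical names invented for the example, and a real project would more likely use a mocking library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborators, invented for this illustration.
interface Mailer { void send(String to, String body); }
interface RateSource { double rateFor(String currency); }
interface AuditLog { void record(String event); }

// Hypothetical code under test: converts an amount and notifies the customer.
class PriceNotifier {
    private final RateSource rates;
    private final Mailer mailer;
    private final AuditLog audit;

    PriceNotifier(RateSource rates, Mailer mailer, AuditLog audit) {
        this.rates = rates;
        this.mailer = mailer;
        this.audit = audit;
    }

    double convertAndNotify(String customer, double amount, String currency) {
        if (amount < 0) {
            audit.record("negative amount for " + customer); // only on the error path
            throw new IllegalArgumentException("amount must not be negative");
        }
        double converted = amount * rates.rateFor(currency);
        mailer.send(customer, "Price: " + converted);
        return converted;
    }
}

// Dummy: required by the constructor, does nothing.
class DummyAuditLog implements AuditLog {
    public void record(String event) { }
}

// Stub: always returns the same valid predefined value.
class StubRateSource implements RateSource {
    public double rateFor(String currency) { return 2.0; }
}

// Mock: records interactions and can verify an expectation about them.
class MockMailer implements Mailer {
    final List<String> recipients = new ArrayList<>();

    public void send(String to, String body) { recipients.add(to); }

    void verifySentOnceTo(String expected) {
        if (recipients.size() != 1 || !recipients.get(0).equals(expected)) {
            throw new AssertionError("expected one mail to " + expected + ", got " + recipients);
        }
    }
}
```

In a test, PriceNotifier would be built from StubRateSource, MockMailer and DummyAuditLog, then the returned value and mailer.verifySentOnceTo(...) would be checked.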

Testing persistence -> stubbing/mocking the entity manager or faking data with an in-memory database
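One way to fake persistence without any database is to hide it behind an interface and back the test implementation with a map. The sketch below assumes a hypothetical UserRepository boundary and GreetingService; in a real application the production implementation might delegate to an entity manager.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical persistence boundary, invented for this illustration.
interface UserRepository {
    void save(long id, String name);
    Optional<String> findName(long id);
}

// Fake: a working implementation that keeps data in memory instead of a database.
class InMemoryUserRepository implements UserRepository {
    private final Map<Long, String> rows = new HashMap<>();

    public void save(long id, String name) { rows.put(id, name); }

    public Optional<String> findName(long id) {
        return Optional.ofNullable(rows.get(id));
    }
}

// Hypothetical code under test; it depends only on the interface.
class GreetingService {
    private final UserRepository users;

    GreetingService(UserRepository users) { this.users = users; }

    String greet(long id) {
        return users.findName(id)
                .map(name -> "Hello, " + name)
                .orElse("Hello, stranger");
    }
}
```

Because GreetingService sees only UserRepository, the same test can later run against a real in-memory database by swapping the implementation.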

BDD -> Cucumber

Useful tools: EqualsVerifier, Testing RxJava, Hamcrest matchers

Spock – high-level Groovy-based framework, can work over JUnit and Cucumber, nice and concise syntax

Test coverage analysis – Cobertura

Web testing – Selenium, Arquillian

HtmlUnit – simple HTML client for testing web apps

PageUnit – checking that a site is up

Static code analysis – javac's -Xlint option, PMD (command line, Eclipse add-on, Maven plugin), Checkstyle, CPD

Bytecode analysis – FindBugs, Fortify

Java performance extract

Benchmarks:

  • micro
  • meso
  • macro
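A microbenchmark measures one small piece of code, and it needs a warm-up phase so that the JIT has compiled the code before timing starts. The sketch below is a hand-rolled illustration with hypothetical names (MicroBenchmarkSketch, sumOfSquares); for real measurements a harness such as JMH is the usual choice.

```java
class MicroBenchmarkSketch {
    // Workload under measurement; it returns a value so the loop is not dead code.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Runs warm-up calls first, then returns the average nanoseconds per measured call.
    static double averageNanos(int warmups, int measured, int n) {
        long sink = 0;
        for (int i = 0; i < warmups; i++) {
            sink += sumOfSquares(n); // gives the JIT a chance to compile the method
        }
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            sink += sumOfSquares(n);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.println(sink); // keeps the accumulated result observable
        }
        return (double) elapsed / measured;
    }
}
```

A call such as MicroBenchmarkSketch.averageNanos(10_000, 10_000, 1_000) gives a rough figure only; results vary between runs and machines.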

Tools

  • non Java: CPU, disk, network
  • Java tools: flags, heap, GC

Profilers: sampling, instrumentation, native

JIT: client compiler (default compile threshold is 1500), server compiler (default compile threshold is 10000), tiered compilation

Tuning JIT:

  • code cache
  • compilation thresholds – how many times code is interpreted before it gets compiled
  • print compilation process logs
  • compilation threads (their number can be adjusted)
  • inlining (size limit for inlining hot methods – 325 bytes of bytecode by default)
  • escape analysis – very efficient, but can break improperly synchronized code
  • de-optimisation
  • tiered compilation levels
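Most of the JIT settings above are set with HotSpot command-line flags (for example -XX:ReservedCodeCacheSize, -XX:CompileThreshold, -XX:+PrintCompilation, -XX:CICompilerCount). From inside a running JVM, the standard java.lang.management API can at least report which compiler is active and how full the code cache is. A minimal sketch follows; the class name JitInfo is hypothetical, and matching pool names on "Code" is an assumption about how HotSpot names its code cache pools.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class JitInfo {
    // Collects a few human-readable facts about the JIT compiler and the code cache.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();

        // Returns null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null) {
            lines.add("JIT compiler: " + jit.getName());
            if (jit.isCompilationTimeMonitoringSupported()) {
                lines.add("Total compilation time, ms: " + jit.getTotalCompilationTime());
            }
        }

        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: code cache pools contain "Code" in their name on HotSpot.
            if (name.contains("Code")) {
                MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
                if (usage != null) {
                    lines.add(name + ": used " + usage.getUsed() + " bytes");
                }
            }
        }
        return lines;
    }
}
```

Printing JitInfo.describe() before and after a load test gives a quick view of whether the code cache is close to its limit.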

GC: serial, throughput (parallel), concurrent (CMS), G1

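GC generations: young (eden, survivor spaces), old

All collectors stop the world while collecting the young generation. For the old generation, CMS and G1 can either run a stop-the-world collection (lower CPU consumption) or do most of the work concurrently, without stopping the application (higher CPU consumption).

Serial GC (default for 32-bit JVMs on Windows or on single-core machines) – for client-class machines: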

  • single threaded
  • stop-the-world for new or old generation processing

Throughput (default on multi-core Unix machines and 64-bit JVMs):

  • multi-threaded
  • stop-the-world for new or old generation processing

CMS:

  • multiple threads for the new generation
  • for the old generation, one thread scans for objects to free in the background, without stopping the application, but the old generation remains fragmented; a stop-the-world pause still happens, though quite rarely, to defragment the old generation heap, usually when there is no space left to allocate a new object

G1:

  • for large heaps (more than 4 GB); divides the heap into regions

By default, System.gc() triggers a stop-the-world full collection for all types of GC.
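The collector is chosen at startup (HotSpot flags such as -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC on JVMs that still ship CMS, or -XX:+UseG1GC). To check which collectors a running JVM actually uses and how much work they have done, the standard management beans can be queried. A small sketch, with the hypothetical class name GcReport:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcReport {
    // One line per collector: its name, number of collections and accumulated time.
    static List<String> collectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() return -1 when undefined.
            lines.add(gc.getName()
                    + ": " + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
        return lines;
    }
}
```

The collector names in the output depend on the JVM and the selected GC, so they are best treated as informational.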

Tuning GC:

  • sizing the heap (too small – GC runs too often; too big – the OS may start swapping memory to disk)
  • sizing generations
  • Permgen/Metaspace – holds information about loaded classes; resizing it is expensive, so it is better to set its size at startup
  • Controlling the number of GC threads
  • adaptive sizing (should be turned on)
  • large object
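Heap and generation sizes are set with flags such as -Xms, -Xmx, -Xmn and -XX:MaxMetaspaceSize. Before tuning them it helps to see the limits the JVM is actually running with. The sketch below (hypothetical class name HeapReport) reads them through Runtime and the memory pool beans; pool names differ between collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapReport {
    // Maximum heap the JVM will try to use (Long.MAX_VALUE if there is no limit).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // One line per memory pool: heap pools are the generations, non-heap pools include Metaspace.
    static List<String> pools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            if (usage == null) {
                continue;
            }
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            // getMax() returns -1 when the maximum is undefined.
            lines.add(kind + " " + pool.getName()
                    + ": used=" + usage.getUsed() + ", max=" + usage.getMax());
        }
        return lines;
    }
}
```

Comparing used against max for each pool after a typical workload shows which generation, if any, is undersized.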

Tuning threads:

  • pool size
  • thread stack size
  • avoid synchronization
  • thread priorities
  • adjusting spinning
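Two of the thread settings above can be shown directly in code: pool size and thread stack size. The sketch below is illustrative only; ThreadTuningSketch and its method names are hypothetical, sizing the pool to the number of available processors is a common starting point rather than a rule, and the JVM is free to treat the requested stack size as a hint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadTuningSketch {
    // Sums the squares of 1..tasks on a fixed pool sized to the available processors.
    static int sumOfSquaresWithPool(int tasks) {
        int poolSize = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 1; i <= tasks; i++) {
                final int n = i;
                // Each task returns its own result, so no shared state needs synchronization.
                Callable<Integer> task = () -> n * n;
                results.add(pool.submit(task));
            }
            int sum = 0;
            for (Future<Integer> result : results) {
                sum += result.get();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Creates (but does not start) a thread with a requested stack size in bytes.
    static Thread withStackSize(Runnable body, long stackBytes) {
        return new Thread(null, body, "small-stack-worker", stackBytes);
    }
}
```

The stack size for all threads can also be set globally with the -Xss flag instead of per thread.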

Persistence

  • choose right driver (try different)
  • prepared statement and statement pooling
  • connection pools
  • transaction pools
  • cached queries

Other optimisations:

  • reuse Random
  • JNI is not a solution for performance problems
  • Exceptions are not always an issue
  • Single-line string concatenation is faster than concatenation spread over multiple lines
  • Lambdas and anonymous classes have the same performance, but lambdas load faster
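Two of the points above, reusing Random and keeping string concatenation in one place, can be sketched as follows. The class and method names are hypothetical, and the use of ThreadLocalRandom for multi-threaded code is an addition not mentioned in the notes.

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

class SmallOptimisations {
    // Created once and reused, instead of calling new Random() on every request.
    private static final Random SHARED = new Random();

    static int rollDie() {
        return SHARED.nextInt(6) + 1; // value in 1..6
    }

    // For code called from many threads, this avoids contention on one shared Random.
    static int rollDieConcurrent() {
        return ThreadLocalRandom.current().nextInt(1, 7); // value in 1..6
    }

    // A single concatenation expression: the compiler handles it as one operation.
    static String oneLine(String a, String b, String c) {
        return a + ", " + b + ", " + c;
    }

    // In a loop, one explicit StringBuilder avoids creating a new builder per iteration.
    static String joinAll(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```

Both string methods produce the same text; the difference is only in how many intermediate objects are created along the way.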