Composing Avro Schemas from Subtypes

While working with Avro schemas, one can quickly reach the point where schema definitions for multiple entities start to overlap and schema files grow in length. As with object-oriented design of classes in a program, the same principle can be applied to the design of your Avro schema collection. Unfortunately, the Avro Schema Definition language does not have a native require or import syntax.

One possible solution is to rewrite all the schemas in the Avro Interface Definition Language (IDL), which has an import feature (see [1]).

If you do not want to rewrite all the schemas, or simply prefer the JSON schema definitions, this article introduces a mechanism to:

  • design small schema file units, each containing Avro named types
  • programmatically compose the files into large Avro schemas, one file per type

The article is accompanied by a full usage example and the source code of Avro Compose, an automatic schema composition tool.
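
As a rough sketch of the underlying idea (using the plain Avro Java API rather than the tool itself): the Schema.Parser remembers every named type it has parsed, so parsing the small units first lets a later schema refer to them by name. The Address and User schemas below are made-up examples, not taken from the article:

    import org.apache.avro.Schema;

    public class ComposeExample {

        public static void main(String[] args) {
            // A small schema unit defining a named type.
            String addressSchema =
                "{\"type\": \"record\", \"name\": \"Address\", \"namespace\": \"example\"," +
                " \"fields\": [" +
                "   {\"name\": \"street\", \"type\": \"string\"}," +
                "   {\"name\": \"city\",   \"type\": \"string\"}" +
                " ]}";

            // A larger schema that reuses the named type by its full name.
            String userSchema =
                "{\"type\": \"record\", \"name\": \"User\", \"namespace\": \"example\"," +
                " \"fields\": [" +
                "   {\"name\": \"name\",    \"type\": \"string\"}," +
                "   {\"name\": \"address\", \"type\": \"example.Address\"}" +
                " ]}";

            // The parser remembers every named type it has seen, so parsing the
            // dependency first lets the dependent schema resolve the reference.
            Schema.Parser parser = new Schema.Parser();
            parser.parse(addressSchema);
            Schema user = parser.parse(userSchema);

            // toString(true) prints the fully composed, self-contained schema.
            System.out.println(user.toString(true));
        }
    }

A tool like Avro Compose essentially automates this kind of dependency ordering across a whole directory of schema files.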

Move Flink Savepoint to a different S3 location

Users of Apache Flink are familiar with creating a savepoint and restarting a job from a savepoint.

The tricky part is how to move a savepoint to a different location and still be able to start a Flink job from it. The problem lies in the savepoint's _metadata file, which contains absolute URIs (see the documentation on moving savepoints).

In this article, we go step by step through moving a Flink savepoint from one S3 bucket to another and safely (without corruption) altering the _metadata file at the destination, so that the Flink job starts smoothly from the new savepoint location. The setup is tested with S3 and the filesystem state backend.
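
As a very rough illustration of the kind of edit involved (not necessarily the procedure used in the article), the sketch below rewrites the absolute S3 prefix inside a copied _metadata file at the byte level. It assumes the new prefix has exactly the same length as the old one, so the binary layout of the file is unchanged; the bucket names and paths are placeholders:

    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class RewriteSavepointMetadata {

        public static void main(String[] args) throws Exception {
            // Placeholder paths and prefixes; old and new prefix must have the
            // same length so the strings inside the binary file keep their size.
            Path metadata = Paths.get("savepoint-copy/_metadata");
            byte[] oldPrefix = "s3://old-bucket/savepoints".getBytes(StandardCharsets.US_ASCII);
            byte[] newPrefix = "s3://new-bucket/savepoints".getBytes(StandardCharsets.US_ASCII);
            if (oldPrefix.length != newPrefix.length) {
                throw new IllegalArgumentException("Prefixes must have equal length");
            }

            byte[] data = Files.readAllBytes(metadata);
            int replaced = 0;
            for (int i = 0; i + oldPrefix.length <= data.length; i++) {
                if (matchesAt(data, i, oldPrefix)) {
                    System.arraycopy(newPrefix, 0, data, i, newPrefix.length);
                    replaced++;
                }
            }
            Files.write(metadata, data);
            System.out.println("Replaced " + replaced + " occurrence(s)");
        }

        private static boolean matchesAt(byte[] data, int offset, byte[] pattern) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[offset + j] != pattern[j]) {
                    return false;
                }
            }
            return true;
        }
    }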

NiFi Registry behind nginx proxy with (client) SSL/TLS and basic auth

Running NiFi Registry behind an nginx proxy with SSL/TLS and basic auth (handled by nginx) is a bit tricky. In this article, we go step by step through creating this hybrid setup:

  1. NiFi Registry listening on plain HTTP on port 18080, without authentication
  2. nginx reverse proxy listening on port 18443 with a server-side SSL/TLS certificate and optional client SSL/TLS authentication
  3. nginx reverse proxy falling back to basic auth for clients that do not present a valid client SSL/TLS certificate
  4. Apache NiFi configured with a pre-baked keystore and truststore to authenticate itself to nginx using client SSL/TLS
  5. NiFi Registry web UI accessible from a browser using basic auth

In this setup, NiFi does not authenticate against NiFi Registry (we still use anonymous access), but the communication between NiFi and nginx is encrypted. By using two-way SSL between NiFi and nginx, we can be sure that only a NiFi instance holding the supplied private key and certificate can talk to our NiFi Registry. By falling back to basic auth when no client-side SSL certificate is presented, we can be sure that only browser users who know the correct username and password can access the NiFi Registry web UI.

We will prepare the certificates and truststores so that nginx can verify the authenticity of the NiFi client and vice versa (using our own CA, though you can buy commercial certificates if you prefer).
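
To sanity-check a setup like this, a small Java client that presents the NiFi keystore and trusts our CA can call the proxy over two-way SSL. The following sketch is only an illustrative check with placeholder store names, passwords and host name; it is not part of the setup itself:

    import java.io.FileInputStream;
    import java.net.URL;
    import java.security.KeyStore;
    import javax.net.ssl.HttpsURLConnection;
    import javax.net.ssl.KeyManagerFactory;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.TrustManagerFactory;

    public class MutualTlsCheck {

        public static void main(String[] args) throws Exception {
            // Placeholder stores and passwords: keystore.jks holds the NiFi
            // client key/certificate, truststore.jks holds our own CA.
            KeyStore keyStore = KeyStore.getInstance("JKS");
            try (FileInputStream in = new FileInputStream("keystore.jks")) {
                keyStore.load(in, "keystorePassword".toCharArray());
            }
            KeyStore trustStore = KeyStore.getInstance("JKS");
            try (FileInputStream in = new FileInputStream("truststore.jks")) {
                trustStore.load(in, "truststorePassword".toCharArray());
            }

            KeyManagerFactory kmf =
                KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
            kmf.init(keyStore, "keystorePassword".toCharArray());
            TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(trustStore);

            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);

            // Call the nginx proxy; a 200 response means the two-way SSL
            // handshake succeeded and the request reached NiFi Registry.
            HttpsURLConnection conn = (HttpsURLConnection)
                new URL("https://nifi-registry.example.com:18443/nifi-registry-api/buckets")
                    .openConnection();
            conn.setSSLSocketFactory(ctx.getSocketFactory());
            System.out.println("HTTP " + conn.getResponseCode());
        }
    }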

NiFi Registry in Docker with Git auto-cloning on startup

tl;dr

Running NiFi Registry with Git and auto-cloning on startup is possible with three authentication options:

  1. HTTPS user and password
  2. git+ssh (~/.ssh bind mount)
  3. git+ssh (SSH keys as environment variables)

Publishing Docker image to Docker Hub using Automated builds

Although there are resources on how to create, label and publish images to hub.docker.com, I found myself needing to put all those pieces together. This article is a complete example of how to create, build, test, label and publish a Docker image using Automated Builds, with the source code hosted on github.com.

In this article we will:

  • create a small Docker image, build it and test it locally
  • create a hub.docker.com repository and link it with the GitHub repository
  • set up automated builds of the image
  • alter the automated builds setup to include build-time labels
  • verify the build

Using Apache NiFi to ingest SNMP tables into Avro

SNMP is a very old protocol, dating back to 1988 and further elaborated in the 1990s. When it comes to hardware monitoring, however, nearly every device supports at least SNMPv2, which makes it a “good enough” choice for basic device monitoring.

In this article, we create an Apache NiFi data flow that collects SNMP tables and converts them into the Avro format.

Although an SNMP processor is available in NiFi, we take a different approach, since the processor does not support querying SNMP tables. Steps in this article include:

  • Preparing a sandbox environment – an SNMP server simulator and command-line tools to grab SNMP tables (ifTable is used as an example)
  • Designing the NiFi data flow – step-by-step instructions for querying the SNMP server for a table and parsing the CSV-like output into the Apache Avro format, one record per CSV line (a sketch of this conversion follows the list).
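
The conversion the flow performs can be pictured with the plain Avro Java API: each CSV-like line of the SNMP table becomes one Avro record. The sketch below uses a hypothetical two-column subset of ifTable and only illustrates the target format; the article builds this with NiFi processors, not Java code:

    import java.io.ByteArrayOutputStream;
    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;

    public class IfTableToAvro {

        public static void main(String[] args) throws Exception {
            // Hypothetical schema covering just two ifTable columns.
            Schema schema = new Schema.Parser().parse(
                "{\"type\": \"record\", \"name\": \"IfTableRow\", \"fields\": [" +
                " {\"name\": \"ifIndex\", \"type\": \"int\"}," +
                " {\"name\": \"ifDescr\", \"type\": \"string\"}]}");

            // CSV-like lines as they might come out of an SNMP table query.
            String[] lines = {"1,lo", "2,eth0"};

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (DataFileWriter<GenericRecord> writer =
                     new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
                writer.create(schema, out);
                for (String line : lines) {
                    String[] cols = line.split(",");
                    GenericRecord row = new GenericData.Record(schema);
                    row.put("ifIndex", Integer.parseInt(cols[0]));
                    row.put("ifDescr", cols[1]);
                    writer.append(row);   // one Avro record per CSV line
                }
            }
            System.out.println("Avro container size: " + out.size() + " bytes");
        }
    }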

How to add PBKDF2 password hashing to a Spring Security based project

Although several secure password hashing algorithms are available, PBKDF2 is not yet implemented in Spring Security. Only BCryptPasswordEncoder, NoOpPasswordEncoder and StandardPasswordEncoder are available in versions 4.0.0.RC1 and 3.2.5.RELEASE.

In this article, we create and use our own PBKDF2 implementation of the PasswordEncoder.
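
A minimal sketch of what such an encoder can look like is shown below; the iteration count, key length and salt/hash encoding are illustrative choices, not necessarily those used in the article:

    import java.security.MessageDigest;
    import java.security.SecureRandom;
    import java.util.Base64;
    import javax.crypto.SecretKeyFactory;
    import javax.crypto.spec.PBEKeySpec;
    import org.springframework.security.crypto.password.PasswordEncoder;

    public class Pbkdf2Encoder implements PasswordEncoder {

        private static final int ITERATIONS = 10000;
        private static final int KEY_LENGTH = 256;   // bits
        private static final int SALT_LENGTH = 16;   // bytes

        private final SecureRandom random = new SecureRandom();

        @Override
        public String encode(CharSequence rawPassword) {
            byte[] salt = new byte[SALT_LENGTH];
            random.nextBytes(salt);
            byte[] hash = pbkdf2(rawPassword, salt);
            // Store salt and hash together, separated by '$'.
            return Base64.getEncoder().encodeToString(salt) + "$"
                 + Base64.getEncoder().encodeToString(hash);
        }

        @Override
        public boolean matches(CharSequence rawPassword, String encodedPassword) {
            String[] parts = encodedPassword.split("\\$");
            if (parts.length != 2) {
                return false;
            }
            byte[] salt = Base64.getDecoder().decode(parts[0]);
            byte[] expected = Base64.getDecoder().decode(parts[1]);
            byte[] actual = pbkdf2(rawPassword, salt);
            // Constant-time comparison to avoid timing attacks.
            return MessageDigest.isEqual(expected, actual);
        }

        private byte[] pbkdf2(CharSequence rawPassword, byte[] salt) {
            try {
                PBEKeySpec spec = new PBEKeySpec(
                    rawPassword.toString().toCharArray(), salt, ITERATIONS, KEY_LENGTH);
                return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                        .generateSecret(spec).getEncoded();
            } catch (Exception e) {
                throw new IllegalStateException("PBKDF2 hashing failed", e);
            }
        }
    }

The PBKDF2 derivation itself comes from the JDK's SecretKeyFactory, so the encoder only has to generate a salt, store it next to the hash and compare hashes in constant time when matching.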