Docker Azcopy image
In this article we introduce our michalklempa/azcopy-all image, which we publish and maintain. It includes az-cli, azcopy, and kubectl, all with bash completion.
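A quick taste of using the image (a minimal sketch; the default command and tool versions depend on the image tag):

    # Start an interactive shell in the container (latest tag assumed):
    docker run -it --rm michalklempa/azcopy-all bash
    # Inside, the bundled tools are on the PATH:
    #   az login
    #   azcopy --help
    #   kubectl version --client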
Apache Spark on Kubernetes - Publishing Spark UIs on Kubernetes (Part 3)
In this article we go through the process of publishing the Spark master, worker, and driver UIs in our Kubernetes setup.
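As a quick local stand-in for publishing, the UIs can be reached with kubectl port-forward; the Service name below is an assumption (the article covers exposing the UIs properly):

    # Forward the Spark master UI (default port 8080) to localhost:
    kubectl port-forward service/spark-master 8080:8080
    # Then browse http://localhost:8080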
Apache Spark on Kubernetes - Submitting a job to Spark on Kubernetes (Part 2)
This is Part 2 of the article series (see Part 1), in which we prepare a Spark Docker image for submitting a job to Spark on Kubernetes.
In this article we explain two options for submitting a job to the cluster (a sketch of the second option follows the list):
- Create a custom Docker image, inherited from the original, with the job jar and submit capability baked in.
- Mount a volume with the job jar into the original image.
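A minimal sketch of the second option, assuming a pod named spark-client runs the original image with the job jar mounted at /opt/spark-jobs (pod name, paths, and job class are illustrative):

    # Submit the mounted jar against the standalone master (default port 7077):
    kubectl exec -it spark-client -- \
      /opt/spark/bin/spark-submit \
        --master spark://spark-master:7077 \
        --class com.example.MyJob \
        /opt/spark-jobs/my-job.jar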
Apache Spark on Kubernetes - Docker image for Spark Standalone cluster (Part 1)
In this series of articles we create an Apache Spark on Kubernetes deployment. Spark will run in standalone cluster mode, not using the native Spark Kubernetes support, as we do not want spark-submit to spin up new pods for us.
This is the Docker image for Spark Standalone cluster (Part 1), where we create a custom Docker image with our Spark distribution and scripts to start up the Spark master and Spark workers.
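For orientation, the commands underneath such start-up scripts look roughly like this; the /opt/spark install path and the spark-master hostname are assumptions:

    # Run the master in the foreground, suitable as a container command:
    /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master
    # Run a worker in the foreground, registering with the master:
    /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker spark://spark-master:7077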
The complete guide comprises 3 parts:
- Docker image for Spark Standalone cluster (Part 1)
- Submitting a job to Spark on Kubernetes (Part 2)
- Publishing Spark UIs on Kubernetes (Part 3)
Running Ansible from inside Docker image for CI/CD pipeline
In this article we prepare a simple Docker image packed with our Ansible roles, ready-made for provisioning just by running a container from the image. We describe the process of encapsulating the ansible executable, Ansible roles, dependent Galaxy roles, SSH key material, and group variables into a Docker image for CI/CD use. We also present a way to run the prepared image from the command line without installing Ansible.
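With everything baked in, provisioning boils down to a single command; the image name, inventory path, and playbook below are placeholders:

    # No Ansible on the host: roles, Galaxy dependencies, SSH keys and
    # group variables all live inside the image.
    docker run --rm my-ansible-image ansible-playbook -i inventory/production site.yml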
tt: Simplest and fastest time tracking
When it comes to tracking time on activities precisely, one has to start coping with time tracking applications. I was not able to find a simple solution: all the options were UI-based, feature-packed, and annoying to use. Then I started searching for command-line options, where the situation is much cleaner, but there were still too many features.
Finally I came up with tt: a simple script of 6 lines of code. In this article I name some other options and inspirations you may use, and present the script itself with installation steps.
tt hello ... some time passes ... tt writing blog ... some time passes ... tt lunch
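The gist of such a script can be sketched in one line of bash: append a timestamp and the activity name to a log file, and read durations off the differences between consecutive lines (the log location is my assumption; the actual script in the article may differ):

    #!/bin/bash
    # tt: log the moment an activity starts; arguments name the activity.
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "${HOME}/.tt.log"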
Composing Avro Schemas from Subtypes
While working with Avro schemas, one can quickly come to the point where schema definitions for multiple entities start to overlap and schema files grow in number of lines. As with object-oriented design of classes in your program, the same principle can be applied to the design of your Avro schema collection.
Unfortunately, the Avro schema definition language does not have a native include mechanism for referencing types defined in other schema files (Avro IDL does, but that means rewriting schemas in IDL). If you do not want to rewrite all the schemas, or simply like the JSON schema definitions more, in this article we introduce a mechanism to:
- design small schema file units, containing Avro named types
- programmatically compose the files into large Avro schemas, one file per type
The article is accompanied by a full usage example and the source code of Avro Compose, an automatic schema composition tool.
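To illustrate the first point, a named type defined in one small file can be referenced by its full name from another file; the composition step then inlines the definition (file names and namespace below are illustrative):

    address.avsc:
    {
      "type": "record",
      "name": "Address",
      "namespace": "com.example",
      "fields": [
        {"name": "street", "type": "string"},
        {"name": "city", "type": "string"}
      ]
    }

    user.avsc:
    {
      "type": "record",
      "name": "User",
      "namespace": "com.example",
      "fields": [
        {"name": "name", "type": "string"},
        {"name": "address", "type": "com.example.Address"}
      ]
    }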
Move Flink Savepoint to a different S3 location
The issue with savepoints is how to move a savepoint to a different location and still be able to start a Flink job from the new location. The problem lies in the _metadata file of the savepoint, which contains absolute URIs (see the documentation on moving savepoints).
In this article, we go step by step through moving a Flink savepoint from one S3 bucket to another, and through safely (without corruption) altering the _metadata file in the destination, so that the Flink job starts smoothly from the new savepoint location. The setup is tested with S3 and the filesystem state backend.
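As a rough sketch of the copy step, with bucket names and savepoint id as placeholders (the _metadata rewrite itself is the subject of the article):

    # Copy the whole savepoint directory to the new bucket:
    aws s3 sync s3://old-bucket/savepoints/savepoint-abc123 s3://new-bucket/savepoints/savepoint-abc123
    # After the absolute URIs inside _metadata are fixed,
    # resume the job from the new location:
    flink run -s s3://new-bucket/savepoints/savepoint-abc123 my-job.jar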