Michal Klempa

Knowledge and experience

Experience with modern deployment frameworks such as Docker, Vagrant and Kubernetes, and with CI/CD automation tools such as Maven, Jenkins, Artifactory and Ansible. I have used AWS (S3, VPC, Route 53 and EC2) and Google Cloud (GCE, GKE).

Experience with Big Data on Hadoop (Spark ML, Spark SQL, NiFi, Hive, Pig): developing data pipelines with Apache NiFi and working on data stewardship with Apache Zeppelin, Scala and Spark.

Knowledge of Hadoop administration using Apache Ambari and Cloudera Manager, Hadoop installation (using Ansible), Hadoop tuning (HDFS, YARN) and Hadoop security (Kerberos and AD integration). Familiar with the Cloudera and Hortonworks Hadoop distributions.

Knowledge of Apache Kafka and Confluent Platform: installing, configuring and securing Kafka and Kafka Connect (Kerberos), and integrating Kafka with other frameworks (NiFi, Flink).

Experience programming a custom OS kernel (C, assembler) running on a simulated CPU (MIPS R3000). This project was part of my studies at Charles University.

Good knowledge of Java server-side technologies (Hibernate, MyBatis, Spring), major RDBMSs (MySQL, PostgreSQL, MSSQL) and Java frontend frameworks (Vaadin, Spring MVC, Swing, NetBeans Platform, Eclipse SWT).


Apache NiFi – I participate in development, fix bugs, and am a fan of this project (NiFi JIRA list of issues). My current uses of NiFi include:

  • ASN.1 and related data encodings (BER/DER/PER) parsing and transformation pipeline, including development of custom NiFi Processors
  • SNMP data collection, transformation and storage
  • SMS notification system based on NiFi, MySQL and gammu (and a Nokia 105 phone)
  • Unity3D model conversion automation, using S3 buckets as the source and destination of models. NiFi runs on Windows in this project.

Jinfer – XML Schema inference from a set of positive XML examples, using the MDL (Minimum Description Length) principle to avoid overfitting the model. (jinfer.sourceforge.net, Master Thesis)

UnifiedViews – we developed a custom ETL framework to leverage RDF data (unifiedviews.eu, github) and OpenDataNode – a standalone platform for crawling, transferring and publishing open data (opendatanode.org)


Public profiles