The purpose of this article is to show how to integrate Minio buckets with Kafka and Nuclio serverless functions. To achieve this, I used the following components. I will not explain each of them individually, because plenty of information about them is available on their official pages and through Google.
Minio — Object Storage with wonderful API
Kafka — Open-source distributed event streaming platform
Nuclio — Serverless Platform for Automated Data Science
Our topology looks like the following:
To understand the whole flow, imagine that somebody uploads a document with the PDF extension to Minio (the document could also be changed or deleted, and it can be a file with any extension). A trigger then fires and calls our serverless function. The flow is as follows:
User (sends a PDF to Minio) -> Minio nodes (send an event to Kafka) -> Kafka nodes <- Nuclio (listens to the Kafka nodes).
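To make the last hop of this flow more concrete, here is a minimal sketch of what a Nuclio Python function consuming these events might look like. It assumes that the Kafka trigger delivers the Minio notification as JSON in `event.body`, and that the notification follows Minio's S3-style layout with a `Records` list. The `PDF_SUFFIX` filter and the `parse_minio_event` helper are hypothetical names used only for illustration; this is not the code from the repo.

```python
import json
from urllib.parse import unquote_plus

# Hypothetical filter: only react to PDF objects.
PDF_SUFFIX = ".pdf"


def parse_minio_event(body):
    """Return a list of (event_name, bucket, key) tuples from a Minio notification.

    Accepts the body as bytes, str, or an already decoded dict.
    """
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    if isinstance(body, str):
        body = json.loads(body)

    results = []
    for record in body.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name", "")
        # Object keys arrive URL-encoded in S3-style notifications.
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        results.append((record.get("eventName", ""), bucket, key))
    return results


def handler(context, event):
    """Nuclio entry point: log every PDF-related event and report what was handled."""
    handled = []
    for event_name, bucket, key in parse_minio_event(event.body):
        if not key.lower().endswith(PDF_SUFFIX):
            continue
        context.logger.info(f"{event_name}: {bucket}/{key}")
        handled.append(key)
    return json.dumps({"handled": handled})
```

Keeping the parsing in a separate function makes it possible to check the logic locally with a sample payload before deploying the function to Nuclio.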
Unfortunately, I do not have time right now to automate all of this with Vagrant. However, I will upload all the source files to a GitHub repo and share them with you. To prepare the environment, clone the repo, then configure Kafka first, followed by Minio and Nuclio. The main README defines the steps, and each component has its own README for deploying it. Don't forget to change the IP addresses and hostnames to match your environment.
Documentation from Minio and Nuclio:
To test everything, we can use this Python code and watch the logs of the serverless container in real time to see what is happening.
In the following video, I explain everything in detail and test it in a real environment:
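A minimal sketch of such a test upload, assuming the third-party `minio` Python client (`pip install minio`) and a bucket that already has the Kafka notification configured. The endpoint, credentials, and bucket name below are hypothetical placeholders, so replace them with the values from your environment.

```python
import os

# Hypothetical placeholders: replace with the values from your environment.
MINIO_ENDPOINT = "minio.example.local:9000"
ACCESS_KEY = "YOUR_ACCESS_KEY"
SECRET_KEY = "YOUR_SECRET_KEY"
BUCKET = "documents"


def object_name_for(path):
    """Use the file name (without directories) as the object name."""
    return os.path.basename(path)


def upload(path, bucket=BUCKET):
    """Upload a local file to Minio, which should make Minio emit an event to Kafka."""
    # Imported here so the helper above can be used without the client installed.
    from minio import Minio

    client = Minio(
        endpoint=MINIO_ENDPOINT,
        access_key=ACCESS_KEY,
        secret_key=SECRET_KEY,
        secure=False,  # assumption: plain HTTP in a lab setup
    )
    name = object_name_for(path)
    client.fput_object(
        bucket_name=bucket,
        object_name=name,
        file_path=path,
        content_type="application/pdf",
    )
    return name
```

Calling `upload("sample.pdf")` while following the logs of the function container should show the corresponding event arriving shortly after the upload completes.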
I hope it will be useful to everyone.