Application compatibility for different Spark versions

Knoldus Blogs

Spark 2.1 was released recently, and there are significant differences between the two versions.

Spark 1.6 has DataFrame and SparkContext, while 2.1 has Dataset and SparkSession.

The question then arises: how do we write code so that both versions of Spark are supported?

Fortunately, Maven provides the ability to build your application with different profiles.
In this blog I will show you how to make your application compatible with different Spark versions.

Let's start by creating an empty Maven project.
You can use the maven-archetype-quickstart archetype to set up your project.
Archetypes provide a basic template for your project, and Maven has a rich collection of these templates for all your needs.

Once the project setup is done, we need to create three modules. Let's name them core, spark and spark2, setting the artifactId of each module to its respective name.
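The parent POM can then select which Spark module to build via profiles. Below is a minimal, hypothetical sketch of the profiles section of the parent pom.xml; the Spark versions and profile ids are illustrative, not taken from the original post.

```xml
<!-- Hypothetical fragment of the parent pom.xml.
     Build against Spark 1.6 with:  mvn package -P spark
     Build against Spark 2.1 with:  mvn package -P spark2 -->
<profiles>
  <profile>
    <id>spark</id>
    <properties>
      <spark.version>1.6.3</spark.version>
    </properties>
    <modules>
      <module>core</module>
      <module>spark</module>
    </modules>
  </profile>
  <profile>
    <id>spark2</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <spark.version>2.1.0</spark.version>
    </properties>
    <modules>
      <module>core</module>
      <module>spark2</module>
    </modules>
  </profile>
</profiles>
```

Each Spark-specific module then declares its own spark dependencies, while shared code lives in core.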

View original post 557 more words

Knoldus Bags the Prestigious Huawei Partner of the Year Award


Knoldus was humbled to receive the prestigious partner of the year award from Huawei at a recently held ceremony in Bangalore, India.

[Image: Huawei partner of the year award presented to Knoldus]

It means a lot to us and is a validation of the quality and focus that we put into the Scala and Spark ecosystem. Huawei recognized Knoldus for its expertise in Scala and Spark, along with the excellent software development process practices under the Knolway™ umbrella. Knolway™ is the Knoldus way of developing software, which we have curated and refined over the past 6 years of building Reactive and Big Data products.

Our heartiest thanks to Mr. V.Gupta, Mr. Vadiraj and Mr. Raghunandan for this honor.


About Huawei

Huawei is a leading global information and communications technology (ICT) solutions provider. Driven by responsible operations, ongoing innovation, and open collaboration, we have established a competitive ICT portfolio of end-to-end solutions in telecom and enterprise networks, devices, and cloud computing…

View original post 170 more words

Twitter’s tweets analysis using Lambda Architecture


Hello Folks,

In this blog I will explain Twitter tweet analysis using the Lambda Architecture. First we need to understand what the Lambda Architecture is, along with its components and usage.

According to Wikipedia, Lambda architecture is a data processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream processing methods.

Now let us look at the Lambda Architecture's components in detail.

[Image: Lambda Architecture diagram]

This architecture is divided into three layers:

Batch Layer: The batch layer precomputes the master dataset (the core component of the Lambda Architecture, containing all the data) into batch views so that queries can be resolved with low latency.

Speed Layer: In the speed layer we basically do two things: store the real-time views, and process the incoming data stream so as to update those views. It fills the delta gap left by the batch layer; that means combining the speed-layer views with the batch views gives us…
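The "combine batch view with speed view" step can be sketched as a simple merge of two precomputed views. This is a hypothetical, language-agnostic sketch in plain Java (not the post's actual Spark code); the hashtag counts are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class LambdaMerge {
    // Merge a precomputed batch view with the speed layer's delta view.
    // Both views map hashtag -> tweet count; the speed view covers only
    // the data that arrived after the last batch recomputation.
    static Map<String, Long> merge(Map<String, Long> batchView, Map<String, Long> speedView) {
        Map<String, Long> result = new HashMap<>(batchView);
        // Add each delta count onto the batch count (summing on key collision).
        speedView.forEach((tag, count) -> result.merge(tag, count, Long::sum));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> batch = new HashMap<>();
        batch.put("#scala", 100L);
        batch.put("#spark", 40L);
        Map<String, Long> speed = new HashMap<>();
        speed.put("#spark", 5L);
        speed.put("#akka", 2L);
        Map<String, Long> merged = LambdaMerge.merge(batch, speed);
        System.out.println(merged.get("#spark")); // 45: batch 40 + delta 5
    }
}
```

The query layer answers questions against this merged view, which is why the batch layer can afford high-latency recomputation.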

View original post 497 more words

Document generation of Akka-HTTP using Swagger


Hello All,

In this blog we use Swagger to generate documentation for an Akka-HTTP service.

For Swagger, you can get more information from Introduction to Swagger.

For Akka, you can get more information from Introduction to Akka-Http.

You can find complete code here.

The build.sbt file of the project contains all the dependencies.

[Image: build.sbt]
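Since the build.sbt screenshot is not recoverable here, the following is a hedged sketch of what the relevant dependency lines might look like. The library coordinates and version numbers are assumptions, not taken from the original post; check the linked repository for the actual build file.

```scala
// Hypothetical build.sbt fragment; versions are illustrative only.
libraryDependencies ++= Seq(
  "com.typesafe.akka"            %% "akka-http"         % "10.0.5",
  "com.github.swagger-akka-http" %% "swagger-akka-http" % "0.9.1"
)
```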

SwaggerDocService.scala

[Image: SwaggerDocService.scala]

AkkaSwagger.scala

[Image: AkkaSwagger.scala route]

Get the JSON response by hitting localhost:8080/api-docs/swagger.json.

[Image: swagger.json response]

You can render the documentation by pasting the JSON into http://editor.swagger.io/#/.

[Image: generated documentation in the Swagger editor]

Resources

Swagger

Thanks !!!


View original post

Introduction to database migrations using Flyway


https://flywaydb.org/assets/logo/flyway-logo-tm-sm.png

Let us first understand: why are database migrations necessary?

Assume that we have a project called Shiny whose primary deliverable is a piece of software called Shiny Soft that connects to a database called Shiny DB. We not only have to deal with one copy of our environment, but with several.

So the simplest view of our problem will translate to:

https://flywaydb.org/assets/balsamiq/Environments.png
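Flyway solves this by applying versioned SQL scripts, conventionally named V<version>__<description>.sql and placed under db/migration on the classpath, in order on every environment. A hypothetical first migration for Shiny DB might look like this (the table and columns are invented for illustration):

```sql
-- src/main/resources/db/migration/V1__Create_user_table.sql
CREATE TABLE app_user (
    id   INT          NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);
```

Flyway records each applied version in its schema history table, so running the migration again on an up-to-date database is a no-op.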

View original post 413 more words

Handling HTTPS requests with Akka-HTTP Server


Hi guys,

In my last blogs I explained how one can create a self-signed certificate and a KeyStore in PKCS12 format. You can go through the previous blogs, as we'll need the certificate and keystore for handling HTTPS requests.

  1. https://blog.knoldus.com/2016/10/18/create-a-self-signed-ssl-certificate-using-openssl/
  2. https://blog.knoldus.com/2016/10/26/how-to-create-a-keystore-in-pkcs12-format/

Akka-HTTP provides both Server-Side and Client-Side HTTPS support.

In this blog I’ll be covering the Server-Side HTTPS support.

Let’s start with “why do we need server-side HTTPS support?”

If we want the communication between the browser and the server to be encrypted, we need to handle HTTPS requests. HTTPS is often used to protect highly confidential online transactions such as online banking and online shopping order forms.

Akka-HTTP supports TLS (Transport Layer Security).

To handle HTTPS requests we need the SSL certificate and the KeyStore. Once you have generated both, you can go through the example.
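Building the server's SSL context from the PKCS12 keystore uses only standard JDK APIs, so it can be sketched independently of Akka. This is a hedged sketch, not the post's actual code: the keystore path and password are placeholders for the files generated in the earlier blogs. In Akka-HTTP you would then wrap the resulting SSLContext in an HTTPS connection context and use it when binding the server.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.security.SecureRandom;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;

public class HttpsSetup {
    // Build an SSLContext from a PKCS12 keystore stream.
    static SSLContext createSslContext(InputStream keystoreData, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance("PKCS12");
        ks.load(keystoreData, password);
        KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(ks, password);
        SSLContext ctx = SSLContext.getInstance("TLS");
        // No custom trust managers: server-side only needs its own key material.
        ctx.init(kmf.getKeyManagers(), null, new SecureRandom());
        return ctx;
    }

    public static void main(String[] args) throws Exception {
        // "server.p12" and "changeit" are placeholders for your generated
        // keystore and its password.
        try (InputStream in = new FileInputStream("server.p12")) {
            SSLContext ctx = createSslContext(in, "changeit".toCharArray());
            System.out.println(ctx.getProtocol());
        }
    }
}
```

On the Akka side the context is passed when binding, so plain-HTTP routes stay unchanged and only the transport is encrypted.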

In this example, you will see how easily you can handle…

View original post 219 more words

AWS | Cleaning up your Amazon ECS resources


In my previous blog posts on AWS (Introduction to Amazon ECS | Launch Amazon ECS cluster | Scaling with Amazon ECS | Deploy updated Task definition/Docker image), I gave an overview of what Amazon ECS is, walked through launching an ECS cluster and deploying a sample app by creating a task definition, scheduling tasks and configuring a cluster, showed how to scale it in and out, and explained how to create a new revision of an existing task definition to deploy an updated Docker image.

In this post we will look at cleaning up the Amazon ECS resources we have created so far. Once you have launched the Amazon ECS cluster, if you try to terminate the container instances in order to clean up the resources, you won't be able to…

View original post 281 more words

AWS | Amazon ECS – Deploy Updated Task Definition/Docker Images


In my previous blog posts on AWS (Launch Amazon ECS cluster and Scaling with Amazon ECS), I explained how to deploy a sample app by creating a task definition, scheduling tasks and configuring a cluster on Amazon ECS, and how to scale it in and out.

In this post I will show you how to create a new revision of an existing task definition to use the latest Docker image, and then run the service with the new image and updated revision.

You can update the running service simply by changing the task definition revision. When a deployment is triggered by updating the task definition of a service, the service scheduler uses the deployment configuration parameters, minimumHealthyPercent and maximumPercent, to determine the deployment strategy.

If minimumHealthyPercent is below 100%, the scheduler can temporarily ignore the desired count during a deployment. For example, if your…
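The bounds the scheduler works with can be sketched as simple arithmetic: per the ECS deployment semantics, the lower bound on healthy tasks is desiredCount × minimumHealthyPercent / 100 rounded up, and the upper bound on running tasks is desiredCount × maximumPercent / 100 rounded down. The service values below are illustrative, not from the original post.

```java
public class DeploymentBounds {
    // Minimum number of tasks that must stay healthy during a deployment
    // (rounded up), per the minimumHealthyPercent parameter.
    static int minHealthy(int desiredCount, int minimumHealthyPercent) {
        return (int) Math.ceil(desiredCount * minimumHealthyPercent / 100.0);
    }

    // Maximum number of tasks allowed to run at once during a deployment
    // (rounded down), per the maximumPercent parameter.
    static int maxRunning(int desiredCount, int maximumPercent) {
        return (int) Math.floor(desiredCount * maximumPercent / 100.0);
    }

    public static void main(String[] args) {
        // A service with desiredCount = 4, minimumHealthyPercent = 50 and
        // maximumPercent = 200: the scheduler may stop two old tasks before
        // starting new ones, and may run up to eight tasks at once.
        System.out.println(minHealthy(4, 50));  // 2
        System.out.println(maxRunning(4, 200)); // 8
    }
}
```

With minimumHealthyPercent = 100 and maximumPercent = 200, the scheduler instead starts new tasks first and stops old ones only after the new ones are healthy.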

View original post 387 more words

AWS | Scaling with Amazon ECS


In my last post regarding AWS, I explained how to launch an Amazon ECS cluster, covering CloudFormation, VPC and subnet creation, ELB and ECS security group creation, the auto scaling group, the launch configuration, and elastic load balancer creation, with the help of a sample app deployed by creating a task definition, scheduling tasks and configuring a cluster through the Amazon ECS First Run Wizard.

In this blog, I will talk about auto scaling, that is, how to:

  • Scale in / scale out EC2 instance in a cluster.
  • Scale in / scale out containers (tasks) for a particular service.

Auto scaling is very helpful, as manually configuring EC2 instances in an auto scaling group, or deploying and managing different containers of the same microservice, is complicated. It can take a lot of time and effort, but Amazon EC2 Container Service makes it easier by providing one-click auto scaling…

View original post 269 more words

Walk through – Amazon ECS First Run Wizard


In my last post, I explained what Amazon ECS is, its features, and the main components required to start using Amazon EC2 Container Service.

In this blog, I will give you a walk-through of launching EC2 container instances through the Amazon ECS First Run Wizard, in which I will deploy the sample application provided by Amazon. We can start using Amazon ECS by creating a task definition, scheduling tasks, and configuring a cluster through the First Run Wizard.

1. Create a task definition

The task definition is a text file, in JSON format, describing the containers that together form an application. Within a Task Definition you can specify one or more containers required for your task, including the Docker repository and image, memory and CPU requirements, shared data volumes, and how the containers are linked to each other.
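A minimal, hypothetical task definition illustrating the fields described above might look like this (the family name, image, and resource values are invented for illustration, not the wizard's actual sample):

```json
{
  "family": "sample-webapp",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "httpd:2.4",
      "cpu": 256,
      "memory": 512,
      "essential": true,
      "portMappings": [
        { "containerPort": 80, "hostPort": 80 }
      ]
    }
  ]
}
```

Each container in containerDefinitions gets its own image, resource limits, and port mappings, while the family groups successive revisions of the same definition.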

Task definitions created in the…

View original post 427 more words