Tuesday, September 26, 2017

Apache Kafka with SSL Configuration

Kafka with SSL:

- Generate SSL key and certificate for each Kafka broker
keytool -keystore kafka.server.keystore.jks -alias localhost -validity 365 -genkey
output: kafka.server.keystore.jks

- Creating your own CA
openssl req -new -x509 -keyout ca-key -out ca-cert -days 365
output: ca-cert, ca-key

- The next step is to add the generated CA to the clients' truststore so that the clients can trust this CA:
keytool -keystore kafka.client.truststore.jks -alias CARoot -import -file ca-cert

- If you configure the Kafka brokers to require client authentication by setting ssl.client.auth to requested or required in the broker config, then you must also provide a truststore for the Kafka brokers, and it should contain all the CA certificates that client keys were signed by:
keytool -keystore kafka.server.truststore.jks -alias CARoot -import -file ca-cert


- The next step is to sign all certificates in the keystore with the CA we generated.
  1. First, you need to export the certificate from the keystore:
    keytool -keystore kafka.server.keystore.jks -alias localhost -certreq -file cert-file
  2. Then sign it with the CA:
openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file -out cert-signed -days 365 -CAcreateserial -passin pass:test1234

- Finally, you need to import both the certificate of the CA and the signed certificate into the keystore:
keytool -keystore kafka.server.keystore.jks -alias CARoot -import -file ca-cert
keytool -keystore kafka.server.keystore.jks -alias localhost -import -file cert-signed
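To sanity-check the result, the following Java snippet (a hedged sketch, not one of the original steps; the path and the password "test1234" are assumed from the commands and configs in this post) lists the keystore entries and confirms that both the CA certificate and the signed broker certificate were imported:

import java.io.FileInputStream;
import java.security.KeyStore;
import java.util.Enumeration;

public class KeystoreCheck {
    public static void main(String[] args) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS");
        try (FileInputStream in = new FileInputStream("kafka.server.keystore.jks")) {
            ks.load(in, "test1234".toCharArray()); // keystore password chosen during -genkey
        }
        Enumeration<String> aliases = ks.aliases();
        while (aliases.hasMoreElements()) {
            String alias = aliases.nextElement();
            // expect "caroot" (trusted certificate entry) and "localhost" (private key entry)
            System.out.println(alias + " keyEntry=" + ks.isKeyEntry(alias));
        }
    }
}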


server.properties:
listeners=PLAINTEXT://node1.openstacklocal:9092,SSL://node1.openstacklocal:9094
security.protocol=SSL
ssl.keystore.location=/home/kafka/ssl/kafka.server.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234
ssl.truststore.location=/home/kafka/ssl/kafka.server.truststore.jks
ssl.truststore.password=test1234
security.inter.broker.protocol=SSL


producer.properties:
bootstrap.servers=node1.openstacklocal:9094
security.protocol=SSL
ssl.truststore.location=/home/kafka/ssl/kafka.client.truststore.jks
ssl.truststore.password=test1234
ssl.keystore.location=/home/kafka/ssl/kafka.client.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234


consumer.properties:
security.protocol=SSL
ssl.truststore.location=/home/kafka/ssl/kafka.client.truststore.jks
ssl.truststore.password=test1234
ssl.keystore.location=/home/kafka/ssl/kafka.client.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234
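For reference, here is a minimal, hedged sketch (not from the original post) of a Java producer that applies the same client SSL settings shown above. It assumes kafka-clients 0.9 or later on the classpath, the SSL listener on port 9094 from server.properties, and a topic named "test-topic":

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SslProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "node1.openstacklocal:9094"); // SSL listener
        props.put("security.protocol", "SSL");
        props.put("ssl.truststore.location", "/home/kafka/ssl/kafka.client.truststore.jks");
        props.put("ssl.truststore.password", "test1234");
        props.put("ssl.keystore.location", "/home/kafka/ssl/kafka.client.keystore.jks");
        props.put("ssl.keystore.password", "test1234");
        props.put("ssl.key.password", "test1234");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
        producer.send(new ProducerRecord<String, String>("test-topic", "key", "hello over SSL"));
        producer.close(); // blocks until buffered records are sent
    }
}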

Kafka Cluster Installation Steps


Before you start:
- Don't use the local ZooKeeper bundled with Kafka; use the ZooKeeper quorum from Cloudera Manager or your Big Data distribution.
- The Kafka version should be the same on all nodes.
- The Java version should be the same on all nodes.

Installation steps:
Step 1: Start Kafka broker 0 on node-1 (assuming Kafka is already installed).
Go to the directory where Kafka is installed:
ex: cd /usr/share/kafka_2.10-0.8.2.0
Verify config/server.properties; it should have broker.id=0 (the default).

Step 2: Log in to node-2.
Go to the directory where Kafka is installed:
ex: cd /usr/share/kafka_2.10-0.8.2.0
Modify the server.properties file to define a second Kafka broker with id 1, as below:
config/server.properties:
broker.id=1
port=9093
log.dir=/tmp/kafka-logs-1

Step 3: Log in to node-3.
Go to the directory where Kafka is installed:
ex: cd /usr/share/kafka_2.10-0.8.2.0
Modify the server.properties file to define a third Kafka broker with id 2, as below:
config/server.properties:
broker.id=2
port=9094
log.dir=/tmp/kafka-logs-2

Step 4: If you have a ZooKeeper cluster, then on every node of the cluster modify the zookeeper.connect property in kafka/config/server.properties:
zookeeper.connect=zNode01:2181,zNode02:2181,zNode03:2181

Run the command below on all nodes to start the Kafka servers:
/kafka/bin/kafka-server-start.sh /kafka/config/server.properties
----------------------
To confirm the brokers are running correctly, use this command on each node: ps -ef |grep server.properties
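As a quick smoke test (not part of the original steps), a small Java producer using the new producer API that ships with kafka_2.10-0.8.2.0 can write a message against the three-broker cluster. The topic name "repl-test" and the host names are assumptions; it also assumes the topic exists or auto topic creation is enabled:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClusterSmokeTest {
    public static void main(String[] args) {
        Properties props = new Properties();
        // list all three brokers started above (adjust host names to your nodes)
        props.put("bootstrap.servers", "node-1:9092,node-2:9093,node-3:9094");
        props.put("acks", "all"); // wait for all in-sync replicas to acknowledge
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        producer.send(new ProducerRecord<String, String>("repl-test", "key1", "hello cluster"));
        producer.close(); // flushes pending records before exit
    }
}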

Struts Action Classes

Struts
Action classes
--------------
1) Action
2) DispatchAction
3) EventDispatchAction
4) LookupDispatchAction
5) MappingDispatchAction
6) DownloadAction
7) LocaleAction
8) ForwardAction
9) IncludeAction
10) SwitchAction
EventDispatchAction
-EventDispatchAction is a subclass of DispatchAction
-The developer must derive the Action class from EventDispatchAction and implement functions with the same signatures as in a DispatchAction class
-In DispatchAction, the parameter attribute in the struts-config.xml file holds a request parameter name (like "method"); each form submitted to Struts must send that request parameter, with the name of the function to be executed in the DispatchAction class as its value
-In EventDispatchAction, after implementing all the functions in the Action class, the developer registers the function names in the parameter attribute, separated by commas. This indicates that the action class is registering all its functions as events in struts-config.xml. A JSP requesting the same action path must use a request parameter whose name matches one of the event names registered in struts-config.xml; requests that do not satisfy this condition fail with an error
-Configuring EventDispatchAction subclasses in struts-config.xml: during a client request, if no event name is specified as a request parameter, the default event is taken; otherwise the function registered for that event name is executed (see the sketch below)
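A hedged sketch of such an EventDispatchAction subclass (the class name, method names and the "success" forward are illustrative, not from these notes). In struts-config.xml the action would register something like parameter="add,update,default=add", and the JSP submits a request parameter named "add" or "update":

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;
import org.apache.struts.actions.EventDispatchAction;

public class UserEventAction extends EventDispatchAction {

    public ActionForward add(ActionMapping mapping, ActionForm form,
                             HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // insert logic here
        return mapping.findForward("success");
    }

    public ActionForward update(ActionMapping mapping, ActionForm form,
                                HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // update logic here
        return mapping.findForward("success");
    }
}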
LookupDispatchAction
--------------------
-The developer must subclass the Action class from LookupDispatchAction and must implement one function called getKeyMethodMap()
-The getKeyMethodMap() function returns a Map object
-The Map object maps message keys to function names
-During a client request the form must submit a request parameter named "function" (this name is configured in struts-config.xml), and its value must match one of the message values in ApplicationResources.properties
-The moment the RequestProcessor receives the client request, it reads the request parameter value, looks up that message value in ApplicationResources.properties and retrieves the corresponding message key
-The RequestProcessor then instantiates the Action class, invokes getKeyMethodMap(), looks up the message key in the Map object, retrieves the corresponding value and treats that value as the business logic function to invoke on the Action class
public class UserAction extends LookupDispatchAction
{
    protected Map getKeyMethodMap()
    {
        Map m = new HashMap();
        m.put("button.add", "insert");
        m.put("button.update", "update");
        return m;
    }
    // insert() and update() functions are implemented as usual (DispatchAction-style signatures)
}
ApplicationResources.properties
-------------------------------
button.add=Add User
button.update=Update User
s-c.xml
-------
InsertUser.jsp
MappingDispatchAction
-The Action class must be subclassed from MappingDispatchAction, with the functions implemented the same way as before
-The same Action class is mapped in struts-config.xml under different action paths, and each such configuration's parameter attribute holds the name of the function to execute on the Action class (see the sketch below)
UserInsert.jsp
UserUpdate.jsp
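A hedged sketch of a MappingDispatchAction subclass matching the description above (class, method and forward names are illustrative). In struts-config.xml, one action path (e.g. /insertUser) would map to this class with parameter="insert" and another (e.g. /updateUser) with parameter="update":

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;
import org.apache.struts.actions.MappingDispatchAction;

public class UserMappingAction extends MappingDispatchAction {

    public ActionForward insert(ActionMapping mapping, ActionForm form,
                                HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // insert logic here
        return mapping.findForward("success");
    }

    public ActionForward update(ActionMapping mapping, ActionForm form,
                                HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // update logic here
        return mapping.findForward("success");
    }
}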
ForwardAction
-------------
-This Action is used to connect to Java web components & web documents through the Struts framework
-Functioning is the same as the RequestDispatcher.forward() operation in Servlets & the <jsp:forward> tag of JSP
-When using this Action class the developer does not have to derive any class from ForwardAction
-When the client requests the action path, the path must be mapped to org.apache.struts.actions.ForwardAction in struts-config.xml, and its parameter can point to a JSP, a Servlet (old web components if any exist) or an action path of another Struts module within the same web application
IncludeAction
-------------
Same as ForwardAction, except the include() operation is used on the JSP, Servlet or action paths of other modules
LocaleAction
------------
-This action class is used to change the user's locale
-When a request comes to an action path in struts-config.xml, map it to the LocaleAction class
-This class implicitly reads two request parameters, "language" and "country", and changes the Locale object information @ session level

Hibernate (Model Framework / ORM Framework)


Hibernate Framework
(Model Framework / ORM Framework; ORM = Object Relational Mapping)
------------------------------
Features
1) Makes persistence (INSERT/UPDATE/DELETE) operations transparent (invisible) to the developer
In a traditional JDBC program the developer used to:
--obtain a connection via the DriverManager.getConnection() method (or) via connection pooling
--decide whether to use a Statement or PreparedStatement object
--manage the SQL statements
--consume the ResultSet
--ensure Atomicity & Consistency
In Hibernate:
--we write one XML document named "hibernate.cfg.xml" that contains the JDBC driver information
--we write one JavaBean class for each table, whose properties match the table columns and their types
--optionally we write one XML document that maps each JavaBean to a table and its properties, saved for example as Student.hbm.xml
--both the cfg.xml and hbm.xml files are input to the Hibernate classes, which internally create the Connection, Statement etc. objects for SQL operations
--we instantiate the JavaBean, assign values to all its instance variables and invoke only one function on the Hibernate class, passing the bean as the argument: save(bean). Hibernate reads the JavaBean values, prepares one dynamic SQL statement and inserts the record into the DB (see the sketch after this list)
--atomicity and consistency are managed by Hibernate implicitly
--this way Hibernate reduces the JDBC code needed to do SQL operations on the DB
--in other words, in Hibernate the developer only has to map one Java class to a table; Hibernate maps each Java object to one entity / record in the DB
-Hibernate does object-entity relation management
-Hibernate does not only do persistence operations; it also caches all the objects stored via Hibernate, and as and when a record is modified in the DB, Hibernate updates the state of the JavaBean as well. This implicitly avoids inconsistency problems
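A minimal sketch of the save(bean) flow described above, using the classic Hibernate API. The Student class, its properties and the values are assumptions for illustration:

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

public class SaveStudentDemo {
    public static void main(String[] args) {
        // reads hibernate.cfg.xml (which in turn references Student.hbm.xml)
        SessionFactory factory = new Configuration().configure().buildSessionFactory();
        Session session = factory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            Student s = new Student();   // JavaBean mapped to the STUDENT table (illustrative)
            s.setId(101);
            s.setName("Ravi");
            session.save(s);             // Hibernate builds and executes the INSERT
            tx.commit();                 // atomicity/consistency handled here
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
            factory.close();
        }
    }
}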
1) Transparent persistence operations
2) Object level relationship instead of maintaining relationship @ DB level. This is to facilitate portable relationships across all the DBs
3) Instead of fetching records from the DB using SQL and operating on ResultSet, we can fetch objects directly from Hibernate using HQL
4) Caching
-Memory level caching

-Disk level caching

Spring AOP (Aspect Oriented Programming)

- The purpose of AOP is to separate secondary / cross-cutting concerns (middleware service implementations - TX, security, logging, session management, state persistence etc. services) from primary / core concerns (business logic implementation).
- AOP recommends writing one Advice for each service implementation
- The types of Advices are (see the sketch after this list):
i) Method before Advice - implements the MethodBeforeAdvice interface
ii) After Advice - implements the AfterReturningAdvice interface
iii) Around Advice - implements the MethodInterceptor interface (from the AOP Alliance, a 3rd party vendor)
iv) Throws Advice - implements the ThrowsAdvice interface
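For example, a hedged sketch of a Method before Advice that keeps logging (a cross-cutting concern) out of the business classes; the class name is illustrative:

import java.lang.reflect.Method;
import org.springframework.aop.MethodBeforeAdvice;

public class LoggingBeforeAdvice implements MethodBeforeAdvice {
    @Override
    public void before(Method method, Object[] args, Object target) throws Throwable {
        // runs before every advised business method
        System.out.println("About to call " + target.getClass().getSimpleName()
                + "." + method.getName());
    }
}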

Flow of Spring Web MVC



1) While developing a Spring Web MVC application, the DispatcherServlet class must be configured in web.xml with the url-pattern *.htm
2) Each form submission made from the client must contain an action path suffixed with .htm
3) Such requests are submitted to the DispatcherServlet
4) The DispatcherServlet instantiates the Spring container (from dispatcher-servlet.xml)
5) It looks up the bean named "urlMapping" and resolves the action path to a bean id
6) It looks up the bean with that id
7) It instantiates the Controller class, invokes the setter injection methods and finally invokes the handleRequest method
8) The Spring Controller returns a ModelAndView object
9) The DispatcherServlet resolves the prefix and suffix names of the view, prepares the fully qualified name of the view and sends the respective document to the browser
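A minimal sketch of such a Controller class (class, view and model names are assumptions for illustration); the returned view name "hello" would be combined with the view resolver's prefix and suffix, e.g. /WEB-INF/jsp/hello.jsp:

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.mvc.Controller;

public class HelloController implements Controller {
    @Override
    public ModelAndView handleRequest(HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // view name "hello", model attribute "message"
        return new ModelAndView("hello", "message", "Hello from Spring Web MVC");
    }
}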

life cycle of Spring object creation



The lifecycle of object creation is:
- As soon as we request beanFactory.getBean("h2"), the Spring container instantiates the bean with the default / no-arg constructor (for setter injection) or with a parameterized constructor (during constructor injection)
- If setter injection is configured, the set method(s) are called on the bean instance
- The user-defined init-method is invoked
- The container then gives the bean reference to the client
- Once the client's calls on the bean are completed and the container is shutting down, before discarding the bean instance the container invokes the user-defined destroy-method (see the sketch below)
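A hedged sketch of a bean that makes each lifecycle step visible. The class and method names are illustrative and assume a bean definition like <bean id="h2" class="HelloBean" init-method="init" destroy-method="cleanup"> with one property wired via setter injection:

public class HelloBean {
    private String message;

    public HelloBean() {
        System.out.println("1. constructor (no-arg, used for setter injection)");
    }

    public void setMessage(String message) {
        System.out.println("2. setter injection");
        this.message = message;
    }

    public void init() {
        System.out.println("3. user-defined init-method");
    }

    public void cleanup() {
        System.out.println("4. user-defined destroy-method (on container shutdown)");
    }
}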


Wednesday, September 6, 2017

Flume sink data consumption - Kafka LAG check

To see the Flume sink's data consumption, check the LAG:

Just by looking at the metadata of a consumer group, we can determine a few key metrics: how many messages are being written to the partitions within the group, how many messages are being read from those partitions, and what the difference is. The difference is called lag, and it represents how far the consumer group application is behind the producers.

/opt/kafka/bin/kafka-consumer-groups.sh --describe --bootstrap-server localhost:9092 --group hive-sinks-group |grep -E "^GROUP|demo"
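The same lag (log end offset minus committed offset) can also be computed programmatically. The sketch below is an assumption-heavy example: it uses the AdminClient consumer-group offset API from newer kafka-clients releases (2.0+), which is newer than the 2017-era setup in this post:

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class ConsumerLagCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "lag-check"); // throwaway group, only used to read end offsets
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (AdminClient admin = AdminClient.create(props);
             KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<byte[], byte[]>(props)) {
            // committed offsets of the Flume sink group
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets("hive-sinks-group")
                         .partitionsToOffsetAndMetadata().get();
            // current log end offsets of the same partitions
            Map<TopicPartition, Long> end = consumer.endOffsets(committed.keySet());
            committed.forEach((tp, om) ->
                    System.out.println(tp + " lag=" + (end.get(tp) - om.offset())));
        }
    }
}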


Name node restart best way - Manual Namenode Failover

Below are the steps to restart a NameNode with a NameNode failover:

1. Make the necessary changes on the standby (node2-openstacklocal), restart the standby NameNode and wait until it reports that it is in standby state
service hadoop-hdfs-namenode stop
service hadoop-hdfs-namenode start

2. Get the NameNode service IDs from hdfs-site.xml:

    property: dfs.ha.namenodes.mas
    value: node1-openstacklocal,node2-openstacklocal

3. Run the command below to fail over:
sudo -u hdfs hdfs haadmin -failover <serviceId of current active> <serviceId of new active>
[root@node2 ~]# sudo -u hdfs hdfs haadmin -failover node1-openstacklocal node2-openstacklocal
Failover to NameNode at node2.openstacklocal/10.194.10.42:8020 successful

4. Now the standby (node2-openstacklocal) will become active and the previously active (node1-openstacklocal) will become standby

5. Make the necessary changes (heap, etc.) on the new standby (node1-openstacklocal) and restart it:
service hadoop-hdfs-namenode stop
service hadoop-hdfs-namenode start

REF: https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#haadmin

Monday, September 4, 2017

Avro Serialize and Deserialize using Java

// Needs the Avro jar on the classpath: org.apache.avro.Schema, org.apache.avro.generic.*, org.apache.avro.io.*

List<Map<String, Object>> recordList = ...; // list of maps, populated elsewhere; each map represents one row

String strSc = "{\"type\":\"record\",\"name\":\"event\",\"doc\":\"event record\",\"fields\":["
        + "{\"name\":\"event_id\",\"type\":\"string\"},"
        + "{\"name\":\"host\",\"type\":\"string\"},"
        + "{\"name\":\"source_address\",\"type\":\"string\"}]}";

Schema avroSchema = new Schema.Parser().parse(strSc);
GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<GenericRecord>(avroSchema);
GenericDatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(avroSchema);

Map<String, Object> record = recordList.get(0); // one row; loop over recordList to handle all rows

// Serialize: map -> GenericRecord -> Avro binary (throws IOException; handling omitted)
ByteArrayOutputStream out = new ByteArrayOutputStream();
Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
GenericData.Record genericRecord = new GenericData.Record(avroSchema);
avroSchema.getFields().stream()
        .map(Schema.Field::name)
        .filter(field -> record.get(field) != null)
        .forEach(field -> genericRecord.put(field, record.get(field)));
writer.write(genericRecord, encoder);
encoder.flush();
byte[] bytes = out.toByteArray();

// Deserialize: Avro binary -> GenericRecord
BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
GenericRecord result = reader.read(null, decoder);
System.out.println("event id: " + result.get("event_id"));