Friday, May 26, 2017

HBASE Tools

Apache phoenix 

Apache phoenix is an SQL interface on the top of HBASE. For more information click here
You can connect by executing the following command under bin.
python sqlline.py zookeeper_url 
The Phoenix client driver can be found here. You can  download and setup SQuirrel as your SQL client so you can issue adhoc SQL against your HBase cluster

HBASE Stargate  

HBASE startgate is the rest interface of HBASE. for more information click here.

Apache Phoenix Bulk Loading 

Here is an example:
Create us_population.sql file
CREATE TABLE IF NOT EXISTS us_population ( state CHAR(2) NOT NULL, city VARCHAR NOT NULL, population BIGINT CONSTRAINT my_pk PRIMARY KEY (state, city));
Create us_population.csv file
NY,New York,8143197 CA,Los Angeles,3844829 IL,Chicago,2842518 TX,Houston,2016582 PA,Philadelphia,1463281 AZ,Phoenix,1461575 TX,San Antonio,1256509 CA,San Diego,1255540 TX,Dallas,1213825 CA,San Jose,912332
Create us_population_queries.sql file
SELECT state as “State”,count(city) as “City Count”,sum(population) as “Population Sum” FROM us_population GROUP BY state ORDER BY sum(population) DESC;
Execute the following command from a command terminal
/opt//psql.py us_population.sql us_population.csv us_population_queries.sql
Output:
csv columns from database.
CSV Upsert complete. 10 rows upserted
Time: 0.181 sec(s)
St City Count Population Sum
-- ---------------------------------------- ----------------------------------------
NY 1 8143197
CA 3 6012701
TX 3 4486916
IL 1 2842518
PA 1 1463281
AZ 1 1461575
Time: 0.03 sec(s)
For more information click here.

HBASE ImportTSV 

ImportTsv is a utility that will load data in TSV format into HBase. It can be used for the HBASE tables those are not managed by phoenix.
Please use this link for more information.