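I. Overview
(https://hbase.apache.org/)
1. HBase is an open-source, non-relational database from Apache.
2. HBase uses Hadoop (HDFS) as its underlying storage. It is a distributed, scalable database designed for very large data volumes.
3. HBase supports real-time reads and writes over large amounts of data.
4. HBase is a NoSQL (Not Only SQL) database.
5. HBase is modeled on Google's paper "Bigtable: A Distributed Storage System for Structured Data". Because it follows that design, its principles are essentially the same as Bigtable's. The main difference is the implementation language: Bigtable is written in C++, HBase in Java.
6. To delete a table in HBase, you must first disable it and only then drop it.
7. HBase has no column data types: every value is stored as a byte array. Values written through the shell, numbers included, are stored as the bytes of their string form.
8. Data stored in HBase is sparse.
9. Every value written to HBase carries an extra field, a timestamp. When reading, if no timestamp is specified, the most recent value is returned by default. This means that an "update" in HBase does not modify the original data in place; it appends a new entry. Because reads return the newest entry, the result looks like a modification. This timestamp is called the version (VERSION) of the data.
10. Unless specified otherwise, each column family keeps only one version of the data, and a read returns only one version. To read multiple versions, specify the number of versions to retain when creating the table.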
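The timestamp and version behavior described above can be sketched in a few lines of Python. This is a toy model for intuition only, not how HBase is implemented: the class `VersionedStore` and its `max_versions` parameter are hypothetical names, and real HBase discards surplus versions later rather than at write time.

```python
from collections import defaultdict


class VersionedStore:
    """Toy model of timestamped cells (hypothetical, for illustration only)."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        # (row key, column) -> list of (timestamp, value), newest first
        self.cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        versions = self.cells[(row, column)]
        versions.append((timestamp, value))      # append, never edit in place
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]         # keep only the newest N versions

    def get(self, row, column, versions=1):
        # Without an explicit request, only the newest version is returned
        return self.cells[(row, column)][:versions]


# One retained version: the second put looks like an in-place update
store = VersionedStore(max_versions=1)
store.put("p1", "basic:name", "Amy", timestamp=1)
store.put("p1", "basic:name", "Amelia", timestamp=2)
print(store.get("p1", "basic:name", versions=3))   # [(2, 'Amelia')]

# Three retained versions: older values stay readable on request
multi = VersionedStore(max_versions=3)
multi.put("p1", "basic:name", "Amy", timestamp=1)
multi.put("p1", "basic:name", "Amelia", timestamp=2)
print(multi.get("p1", "basic:name"))               # [(2, 'Amelia')]
print(multi.get("p1", "basic:name", versions=3))   # [(2, 'Amelia'), (1, 'Amy')]
```

Asking for three versions from the first store still yields one entry, because the number of retained versions is fixed when the store (in HBase, the column family) is created.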
Comparing row-oriented and column-oriented databases
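II. Basic concepts
1. Row key (rowkey): analogous to the primary key in an RDBMS. Every row in HBase must have a row key. Row keys must be unique when writing data: data written under the same row key is treated as belonging to the same row. HBase sorts rows by row key, in lexicographic order by default.
2. Column family: the basic unit of data storage in HBase. Every table contains at least one column family. In theory the number of column families is unlimited, but in practice it is usually recommended to keep it to 3 or fewer. Queries that span column families are generally discouraged.
3. Column: HBase places little emphasis on individual columns. A column family may contain zero or more columns, and columns can be added or removed dynamically at any time, so the number of columns is not fixed.
4. Namespace: analogous to a database in MySQL. Its main purpose is to separate tables. Note that if no namespace is specified, the namespace named default is used.
5. Cell: a single value, uniquely identified by row key + column (family:qualifier) + timestamp (version).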
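Lexicographic ordering of row keys often surprises people used to numeric primary keys. The sketch below sorts a few sample keys by their bytes, which is how the ordering described above behaves. The sample keys and the `padded_key` helper are made up for illustration, and zero-padding is a common convention rather than something the source prescribes.

```python
# Sample row keys (hypothetical); compare them as raw bytes
row_keys = ["p1", "p2", "p10", "pa", "P3"]
ordered = sorted(row_keys, key=lambda k: k.encode("utf-8"))
print(ordered)   # ['P3', 'p1', 'p10', 'p2', 'pa']


def padded_key(n, width=4):
    """Zero-pad the numeric part so byte order matches numeric order."""
    return f"p{n:0{width}d}"


padded = sorted(padded_key(n) for n in (10, 2, 1))
print(padded)    # ['p0001', 'p0002', 'p0010']
```

Note that "p10" sorts before "p2", and uppercase letters sort before lowercase ones, because the comparison works byte by byte.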
Downloading HBase
https://archive.apache.org/dist/hbase/0.98.17/ hbase-0.98.17-hadoop2-bin.tar.gz
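Extract and install
Extract the archive. The rest of this walkthrough uses /home/presoftware/hbase-0.98.17-hadoop2 as the installation directory.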
Configuration
[root@hadoop01 conf]# vim hbase-env.sh
export JAVA_HOME=/home/presoftware/jdk1.8.0_181
# Disable the ZooKeeper bundled with HBase
export HBASE_MANAGES_ZK=false
[root@hadoop01 conf]# source hbase-env.sh
[root@hadoop01 conf]# vim hbase-site.xml
<configuration>
  <!-- Storage path for HBase data on HDFS -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop01:9000/hbase</value>
  </property>
  <!-- Enable HBase distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- ZooKeeper connection addresses and ports -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
</configuration>
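A typo in hbase-site.xml is easy to miss, so it can help to check the file before copying it to the other nodes. The following is a minimal sketch that parses the same three properties with Python's standard library. The helper names `load_properties` and `missing_properties`, and the choice of which properties count as required, are assumptions made for this example; the XML is inlined here, whereas in practice you would read the real file.

```python
import xml.etree.ElementTree as ET

# Same properties as in the configuration above, inlined for the example
HBASE_SITE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop01:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
</configuration>"""

# Properties this walkthrough relies on (an assumption of this sketch)
REQUIRED = ("hbase.rootdir", "hbase.cluster.distributed", "hbase.zookeeper.quorum")


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def missing_properties(props):
    """List required properties that are absent or empty."""
    return [name for name in REQUIRED if not props.get(name)]


props = load_properties(HBASE_SITE)
print(missing_properties(props))                    # []
print(props["hbase.zookeeper.quorum"].split(","))   # the three ZooKeeper hosts
```

An empty list from `missing_properties` only means the keys are present; it does not verify that HDFS or ZooKeeper is actually reachable at those addresses.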
[root@hadoop01 conf]# vim regionservers
hadoop01
hadoop02
hadoop03
Start ZooKeeper on all three nodes
Copy HBase from the first machine to the other two machines
[root@hadoop01 presoftware]# scp -r hbase-0.98.17-hadoop2 root@hadoop02:/home/presoftware/
[root@hadoop01 presoftware]# scp -r hbase-0.98.17-hadoop2 root@hadoop03:/home/presoftware/
Start HBase (on the first machine)
[root@hadoop01 bin]# sh start-hbase.sh
starting master, logging to /home/presoftware/hbase-0.98.17-hadoop2/bin/../logs/hbase-root-master-hadoop01.out
hadoop02: starting regionserver, logging to /home/presoftware/hbase-0.98.17-hadoop2/bin/../logs/hbase-root-regionserver-hadoop02.out
hadoop03: starting regionserver, logging to /home/presoftware/hbase-0.98.17-hadoop2/bin/../logs/hbase-root-regionserver-hadoop03.out
hadoop01: starting regionserver, logging to /home/presoftware/hbase-0.98.17-hadoop2/bin/../logs/hbase-root-regionserver-hadoop01.out
[root@hadoop01 bin]# jps
4368 QuorumPeerMain
4770 HRegionServer
4643 HMaster
On the other two machines
[root@hadoop02 presoftware]# jps
1577 QuorumPeerMain
1774 Jps
Accessing the HBase web UI
192.168.253.129:60010
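Entering the HBase shell
[root@hadoop01 bin]# sh hbase shell
Deleting characters in the HBase shell: press Ctrl + Backspace. After changing the terminal configuration, Backspace alone works.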
Basic commands
hbase(main):011:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load
hbase(main):012:0> version
0.98.17-hadoop2, rd5f8300c082a75ce8edbbe08b66f077e7d663a4a, Fri Jan 15 22:46:43 PST 2016
hbase(main):014:0> whoiam
NameError: undefined local variable or method `whoiam' for #<Object:0x2a2ef072>
(The NameError is caused by a typo: the shell command is whoami.)
Creating a table
hbase(main):015:0> create 'person', {NAME=>'basic'}, {NAME=>'info'}
0 row(s) in 4.1870 seconds
=> Hbase::Table - person
Listing tables
hbase(main):027:0> list
TABLE
person
1 row(s) in 0.1170 seconds
=> ["person"]
Describing a table
hbase(main):029:0> desc 'person'
Table person is ENABLED
person
COLUMN FAMILIES DESCRIPTION
{
NAME => 'basic', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KE
EP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COM
PRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '6553
6', REPLICATION_SCOPE => '0'}
{
NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEE
P_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMP
RESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536
', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2910 seconds
Shorthand for creating a table
hbase(main):030:0> create 'student','basic','info'
0 row(s) in 1.5010 seconds
=> Hbase::Table - student
Dropping a table (this fails: the table must be disabled first)
hbase(main):004:0> drop 'student'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/presoftware/hbase-0.98.17-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/presoftware/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
ERROR: Table student is enabled. Disable it first. Here is some help for this command:
Drop the named table. Table must first be disabled:
hbase> drop 't1'
hbase> drop 'ns1:t1'
Correct way to drop a table (disable it first)
hbase(main):005:0> disable 'student'
0 row(s) in 2.4740 seconds
hbase(main):006:0> drop 'student'
0 row(s) in 2.2510 seconds
hbase(main):009:0> exists 'student'
Table student does not exist
0 row(s) in 0.1620 seconds
hbase(main):011:0> is_enabled 'person'
true
0 row(s) in 0.0790 seconds
Creating a namespace
hbase(main):012:0> create_namespace 'hbasedemo'
0 row(s) in 0.6230 seconds
hbase(main):013:0> list_namespace
NAMESPACE
default
hbase
hbasedemo
3 row(s) in 0.1390 seconds
hbase(main):014:0> list_namespace_tables 'default'
TABLE
person
1 row(s) in 0.0960 seconds
Creating a table in a custom namespace
hbase(main):016:0> create 'hbasedemo:person','basic','info'
0 row(s) in 4.7610 seconds
=> Hbase::Table - hbasedemo:person
Dropping a namespace (only possible once the namespace contains no tables)
hbase(main):017:0> disable 'hbasedemo:person'
0 row(s) in 1.5760 seconds
hbase(main):018:0> drop 'hbasedemo:person'
0 row(s) in 0.5080 seconds
hbase(main):019:0> drop_namespace 'hbasedemo'
0 row(s) in 0.4460 seconds
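The namespace rules used above (an unqualified table name falls into default, and a namespace can only be dropped once it is empty) can be modeled with two small functions. This is a toy sketch for intuition only: `split_table_name` and `can_drop_namespace` are hypothetical helpers, not part of any HBase API.

```python
def split_table_name(name, default_namespace="default"):
    """Split 'namespace:table' into its parts; bare names use the default namespace."""
    namespace, sep, table = name.partition(":")
    if not sep:
        return default_namespace, name
    return namespace, table


def can_drop_namespace(namespace, table_names):
    """A namespace may be dropped only if none of the tables belong to it."""
    return all(split_table_name(t)[0] != namespace for t in table_names)


print(split_table_name("person"))             # ('default', 'person')
print(split_table_name("hbasedemo:person"))   # ('hbasedemo', 'person')

tables = ["person", "hbasedemo:person"]
print(can_drop_namespace("hbasedemo", tables))   # False: it still holds a table

tables.remove("hbasedemo:person")                # mirrors disable + drop
print(can_drop_namespace("hbasedemo", tables))   # True
```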
Adding data: row key p1, column family basic
hbase(main):031:0> put 'person','p1','basic:name','Amy'
0 row(s) in 0.8850 seconds
hbase(main):032:0> put 'person','p1','basic:age',15
0 row(s) in 0.0920 seconds
hbase(main):033:0> put 'person','p1','info:addr','beijing'
0 row(s) in 0.2320 seconds
hbase(main):054:0> get 'person', 'p1'
COLUMN CELL
basic:age timestamp=1658618923507, value=15
basic:name timestamp=1658618824327, value=Amy
info:addr timestamp=1658618960873, value=beijing
3 row(s) in 0.1570 seconds
hbase(main):055:0> get 'person', 'p1', {COLUMN => 'basic'}
COLUMN CELL
basic:age timestamp=1658618923507, value=15
basic:name timestamp=1658618824327, value=Amy
2 row(s) in 0.0440 seconds
hbase(main):056:0> get 'person', 'p1', {COLUMN => 'basic:age'}
COLUMN CELL
basic:age timestamp=1658618923507, value=15
1 row(s) in 0.0350 seconds
hbase(main):057:0> get 'person', 'p1', 'basic:age'
COLUMN CELL
basic:age timestamp=1658618923507, value=15
1 row(s) in 0.1030 seconds
hbase(main):058:0> get 'person', 'p1', 'basic:name','basic:age','info:addr'
COLUMN CELL
basic:age timestamp=1658618923507, value=15
basic:name timestamp=1658618824327, value=Amy
info:addr timestamp=1658618960873, value=beijing
3 row(s) in 0.0790 seconds
hbase(main):059:0> put 'person','p2','basic:name','Sam'
0 row(s) in 0.4240 seconds
hbase(main):060:0> get 'person','p2','basic:name'
COLUMN CELL
basic:name timestamp=1658619944413, value=Sam
1 row(s) in 0.0470 seconds
hbase(main):061:0> put 'person','p2','basic:gender','male'
0 row(s) in 0.0490 seconds
hbase(main):062:0> put 'person','p2','info:phone',182336698
0 row(s) in 0.0560 seconds
Delete commands (deleting a single column in a column family, and a whole column family)
hbase(main):063:0> delete 'person','p2','info:phone'
0 row(s) in 0.2670 seconds
hbase(main):064:0> delete 'person','p2','info'
0 row(s) in 0.0360 seconds
Deleting all data for row key p2
hbase(main):068:0> deleteall 'person','p2'
0 row(s) in 0.2760 seconds
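Modifying the value of a column in a column family (put is used for updates as well as inserts; here it writes basic:name for row key pa)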
hbase(main):069:0> put 'person','pa','basic:name','Tom'
0 row(s) in 0.2550 seconds
Scanning the whole person table
hbase(main):071:0> scan 'person'
ROW COLUMN+CELL
p1 column=basic:age, timestamp=1658618923507, value=15
p1 column=basic:name, timestamp=1658618824327, value=Amy
p1 column=info:addr, timestamp=1658618960873, value=beijing
pa column=basic:name, timestamp=1658620504909, value=Tom
2 row(s) in 0.3560 seconds
hbase(main):075:0> scan 'person', {COLUMNS=>['basic']}
ROW COLUMN+CELL
p1 column=basic:age, timestamp=1658618923507, value=15
p1 column=basic:name, timestamp=1658618824327, value=Amy
pa column=basic:name, timestamp=1658620504909, value=Tom
2 row(s) in 0.2150 seconds
hbase(main):076:0> scan 'person', {COLUMNS=>['basic:name']}
ROW COLUMN+CELL
p1 column=basic:name, timestamp=1658618824327, value=Amy
pa column=basic:name, timestamp=1658620504909, value=Tom
2 row(s) in 0.1090 seconds
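The way a column specification narrows a scan can be imitated in plain Python, which may help when reading the outputs above: a bare family name such as 'basic' selects every column in that family, while 'basic:name' selects one column. The sketch below is a toy model using the cell values from this walkthrough. The functions `matches` and `scan` are hypothetical and unrelated to the HBase client API.

```python
# Cells from the walkthrough as (row key, column, value)
cells = [
    ("p1", "basic:age", "15"),
    ("p1", "basic:name", "Amy"),
    ("p1", "info:addr", "beijing"),
    ("pa", "basic:name", "Tom"),
]


def matches(column, spec):
    """'family' matches any column in that family; 'family:qualifier' must match exactly."""
    if ":" in spec:
        return column == spec
    return column.split(":", 1)[0] == spec


def scan(cells, columns=None):
    """Group matching cells by row key; columns=None selects everything."""
    result = {}
    for row, column, value in cells:
        if columns is None or any(matches(column, s) for s in columns):
            result.setdefault(row, {})[column] = value
    return result


print(len(scan(cells)))               # 2 rows, although there are 4 cells
print(scan(cells, ["basic:name"]))    # one column from each of the two rows
print(scan(cells, ["info"]))          # only p1 has data in the info family
```

Row pa does not appear in the last result at all, which illustrates the sparseness mentioned in the overview: a row has no entry for a column it never wrote.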
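Viewing recent versions (at most one version is returned here, because the column family was created with VERSIONS => '1')
hbase(main):079:0> get 'person', 'p1', {COLUMN => 'basic:name', VERSIONS => 3}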
COLUMN CELL
basic:name timestamp=1658618824327, value=Amy
1 row(s) in 0.1860 seconds
Dropping the person table