Previously we covered how to set up an HBase cluster:
HBase 2.1.3 Cluster Setup Guide
https://www.cndba.cn/dave/article/3322
This post introduces the HBase Shell and its most common operations.
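1 HBase Shell Overview
HBase ships with a shell for communicating with HBase. HBase stores its data on the Hadoop file system (HDFS). It has a master server and region servers. Table data is stored in regions; the regions are split and distributed across the region servers.
The master server manages the region servers, and all of the underlying data lives on HDFS. The commands below are some of those supported by the HBase Shell.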
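1.1 General Commands
- status: Reports the status of HBase, for example the number of servers.
- version: Reports the HBase version in use.
- table_help: Shows help for table-reference commands.
- whoami: Shows information about the current user.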
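1.2 Data Definition Language
These commands operate on tables in HBase.
1) create: Creates a table.
2) list: Lists all tables in HBase.
3) disable: Disables a table.
4) is_disabled: Checks whether a table is disabled.
5) enable: Enables a table.
6) is_enabled: Checks whether a table is enabled.
7) describe: Shows the description of a table.
8) alter: Alters a table.
9) exists: Checks whether a table exists.
10) drop: Drops a table from HBase.
11) drop_all: Drops all tables whose names match the given regex.
12) Java Admin API: Besides the shell commands above, Java offers an API for performing DDL operations programmatically. HBaseAdmin (in the org.apache.hadoop.hbase.client package) and HTableDescriptor (in the org.apache.hadoop.hbase package) are the two key classes that provide DDL functionality. In HBase 2.x, the Admin interface and TableDescriptorBuilder are the preferred replacements.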
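1.3 Data Manipulation Language
1) put: Writes a cell value at the specified column of the specified row in a given table.
2) get: Fetches the contents of a row or a cell.
3) delete: Deletes a cell value in a table.
4) deleteall: Deletes all cells in a given row.
5) scan: Scans a table and returns its data.
6) count: Counts and returns the number of rows in a table.
7) truncate: Disables, drops, and recreates the specified table.
8) Java client API: Besides the shell commands above, Java offers a client API in the org.apache.hadoop.hbase.client package for performing DML and CRUD (create, retrieve, update, delete) operations programmatically. HTable, Put, and Get are the key classes in this package. In HBase 2.x, tables are accessed through the Table interface obtained from a Connection.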
1.4 Starting the HBase Shell
If the environment variables were configured when HBase was installed, simply run: hbase shell.
To leave the interactive shell at any time, type exit.
[hadoop@hadoopMaster ~]$ hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
Took 0.0027 seconds
hbase(main):001:0> status
1 active master, 1 backup masters, 3 servers, 0 dead, 0.6667 average load
Took 0.6609 seconds
hbase(main):002:0> exit
[hadoop@hadoopMaster ~]$
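The interactive session above can also be scripted. The following is a minimal Python sketch, assuming that hbase is on the PATH and that the shell's non-interactive mode (hbase shell -n) reads commands from standard input. The helper names build_script and run_script are hypothetical and exist only for this illustration.

```python
import shutil
import subprocess


def build_script(commands):
    """Join shell commands into one script and make sure it ends with `exit`."""
    lines = [c.strip() for c in commands if c.strip()]
    if not lines or lines[-1] != "exit":
        lines.append("exit")
    return "\n".join(lines) + "\n"


def run_script(script):
    """Pipe the script into `hbase shell -n`; return None if hbase is not installed."""
    if shutil.which("hbase") is None:
        return None
    result = subprocess.run(
        ["hbase", "shell", "-n"],  # assumption: -n selects non-interactive mode
        input=script,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Build the same commands that were typed interactively above.
script = build_script(["status", "version", "whoami"])
print(script, end="")
```

Calling run_script(script) on a machine with HBase installed should return the shell output as text; on any other machine it returns None instead of failing.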
2 Common HBase Commands
The general-purpose HBase commands are status, version, table_help, and whoami.
2.1 status
This command returns the status of the system, including details of the servers running on it.
[hadoop@hadoopMaster ~]$ hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
Took 0.0027 seconds
hbase(main):001:0> status
1 active master, 1 backup masters, 3 servers, 0 dead, 0.6667 average load
Took 0.5203 seconds
hbase(main):002:0>
2.2 version
This command returns the HBase version in use.
hbase(main):002:0> version
2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
Took 0.0005 seconds
hbase(main):003:0>
2.3 table_help
This command explains how to use table-reference commands.
hbase(main):003:0> table_help
Help for table-reference commands.
You can either create a table via 'create' and then manipulate the table via commands like 'put', 'get', etc.
See the standard help information for how to use each of these commands.
However, as of 0.96, you can also get a reference to a table, on which you can invoke commands.
For instance, you can get create a table and keep around a reference to it via:
hbase> t = create 't', 'cf'
Or, if you have already created the table, you can get a reference to it:
hbase> t = get_table 't'
You can do things like call 'put' on the table:
hbase> t.put 'r', 'cf:q', 'v'
which puts a row 'r' with column family 'cf', qualifier 'q' and value 'v' into table t.
To read the data out, you can scan the table:
hbase> t.scan
which will read all the rows in table 't'.
Essentially, any command that takes a table name can also be done via table reference.
Other commands include things like: get, delete, deleteall,
get_all_columns, get_counter, count, incr. These functions, along with
the standard JRuby object methods are also available via tab completion.
For more information on how to use each of these commands, you can also just type:
hbase> t.help 'scan'
which will output more information on how to use that command.
You can also do general admin actions directly on a table; things like enable, disable,
flush and drop just by typing:
hbase> t.enable
hbase> t.flush
hbase> t.disable
hbase> t.drop
Note that after dropping a table, your reference to it becomes useless and further usage
is undefined (and not recommended).
Took 0.0004 seconds
hbase(main):004:0>
2.4 whoami
This command returns details about the current HBase user.
hbase(main):004:0> whoami
hadoop (auth:SIMPLE)
groups: hadoop
Took 0.0130 seconds
hbase(main):005:0>
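3 HBase Object Operations
3.1 Creating a Table
Creating a table requires a table name and at least one column family name. The HBase shell syntax is:
create '<table name>', '<column family>'
The example below creates an emp table with two column families, "personal data" and "professional data".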
hbase(main):001:0> create 'emp', 'personal data', 'professional data'
Created table emp
Took 2.9473 seconds
=> Hbase::Table - emp
hbase(main):002:0>
The list command lists all tables in HBase and can be used to check that the table was created:
hbase(main):002:0> list
TABLE
emp
1 row(s)
Took 0.0392 seconds
=> ["emp"]
hbase(main):003:0>
3.2 Checking Whether a Table Exists
Besides list, the exists command can also verify whether a table exists:
hbase(main):001:0> exists 'emp'
Table emp does exist
Took 0.5415 seconds
=> true
hbase(main):002:0> exists 'cndba'
Table cndba does not exist
Took 0.0105 seconds
=> false
hbase(main):003:0>
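3.3 Disabling a Table
A table must be disabled with the disable command before it can be dropped. Older guides also require this before changing its settings, although in this 2.1.3 environment alter works on an enabled table, as section 3.5.2 shows. The enable command brings the table back online.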
hbase(main):003:0> disable 'emp'
Took 0.8610 seconds
A disabled table is still visible to list and exists, but it cannot be scanned; attempting to scan it produces the error below.
hbase(main):004:0> list 'emp'
TABLE
emp
1 row(s)
Took 0.0262 seconds
=> ["emp"]
hbase(main):005:0> exists 'emp'
Table emp does exist
Took 0.0131 seconds
=> true
hbase(main):006:0> scan 'emp'
ROW COLUMN+CELL
org.apache.hadoop.hbase.TableNotEnabledException: emp is disabled.
at org.apache.hadoop.hbase.client.ConnectionImplementation.relocateRegion(ConnectionImplementation.java:732)
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:328)
at org.apache.hadoop.hbase.client.ScannerCallable.prepare(ScannerCallable.java:139)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.prepare(ScannerCallableWithReplicas.java:399)
at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
ERROR: Table emp is disabled!
For usage try 'help "scan"'
Took 0.8882 seconds
hbase(main):007:0>
3.3.1 is_disabled
This command checks whether a table is disabled. It prints true if the table is disabled and false otherwise.
hbase(main):007:0> is_disabled 'emp'
true
Took 0.0174 seconds
=> 1
hbase(main):008:0>
3.3.2 disable_all
This command disables every table whose name matches the given regular expression. Its syntax is:
hbase> disable_all 'r.*'
Example:
hbase(main):009:0> disable_all 'em.*'
emp
Disable the above 1 tables (y/n)?
y
1 tables successfully disabled
Took 3.0422 seconds
hbase(main):010:0>
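Before confirming a disable_all (or drop_all) prompt, it helps to know which names a pattern will select. The sketch below approximates the selection in Python, assuming the pattern must match the whole table name. HBase evaluates Java regular expressions, so Python's re module is only an approximation, and the function name matching_tables and the extra table names are hypothetical.

```python
import re


def matching_tables(pattern, table_names):
    """Return the names that the pattern matches in full (assumed semantics)."""
    compiled = re.compile(pattern)
    return [name for name in table_names if compiled.fullmatch(name)]


# "emp" and "cndba" appear in this article; the other names are made up.
tables = ["emp", "employee", "cndba", "temp"]
print(matching_tables("em.*", tables))
```

Under the full-match assumption, 'em.*' selects emp and employee but not temp, because temp does not start with "em".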
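3.4 Enabling a Table
The enable command brings a disabled table back online, after which it can be scanned again.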
hbase(main):010:0> enable 'emp'
Took 1.2872 seconds
hbase(main):011:0> scan 'emp'
ROW COLUMN+CELL
0 row(s)
Took 0.1057 seconds
hbase(main):012:0>
3.4.1 is_enabled
This command checks whether a table is enabled. Its syntax is:
hbase> is_enabled 'table name'
It prints true if the table is enabled and false otherwise.
hbase(main):012:0> is_enabled 'emp'
true
Took 0.0164 seconds
=> true
hbase(main):013:0>
3.5 Describing and Altering a Table
3.5.1 describe
This command returns the description of a table. Its syntax is:
hbase> describe 'table name'
Example:
hbase(main):013:0> describe 'emp'
Table emp is ENABLED
emp
COLUMN FAMILIES DESCRIPTION
{NAME => 'personal data', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCOD
ING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false'
, PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'professional data', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_E
NCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'fa
lse', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.0681 seconds
hbase(main):014:0>
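3.5.2 Altering a Table
The alter command changes an existing table. It can change the maximum number of cell versions a column family keeps, set and remove table-scope attributes, and delete a column family from a table.
3.5.2.1 Changing the Maximum Number of Cell Versions in a Column Family
The syntax for changing the maximum number of versions a column family keeps is:
hbase> alter 't1', NAME => 'f1', VERSIONS => 5
Example: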
hbase(main):014:0> alter 'emp', NAME => 'personal data', VERSIONS => 5
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.1366 seconds
hbase(main):015:0>
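To make the effect of VERSIONS concrete, here is a toy Python model of a single cell that retains at most a fixed number of timestamped values. It is a sketch for intuition only, not HBase code: the class name VersionedCell is hypothetical, and real HBase discards surplus versions lazily (for example during compaction) rather than on every write.

```python
class VersionedCell:
    """Toy model of one cell that retains at most `max_versions` values."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.versions = []  # (timestamp, value) pairs, newest first

    def put(self, timestamp, value):
        # A write with an existing timestamp replaces that version.
        self.versions = [tv for tv in self.versions if tv[0] != timestamp]
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        # Keep only the newest `max_versions` entries.
        del self.versions[self.max_versions:]

    def latest(self):
        return self.versions[0][1] if self.versions else None


one = VersionedCell(max_versions=1)
five = VersionedCell(max_versions=5)
for cell in (one, five):
    cell.put(1551805675526, "hyderabad")
    cell.put(1551805892261, "Delhi")

print(one.versions)   # only the newest value is retained
print(five.versions)  # both values are retained
```

With VERSIONS => 1 an update leaves nothing behind, while with VERSIONS => 5 the previous value stays available underneath the newest one.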
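3.5.3 Table-Scope Attributes
With alter you can set and remove table-scope attributes such as MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, and DEFERRED_LOG_FLUSH (newer releases use DURABILITY in place of the last one).
3.5.4 Setting a Table Read-Only
The following syntax marks a table as read-only:
hbase> alter 't1', READONLY => 'true'
Note that the transcript below passes the bare READONLY constant. The shell may interpret a bare string argument as a column family name rather than as a table attribute, so confirm the result with describe.
Example: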
hbase(main):015:0> alter 'emp', READONLY
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.1613 seconds
hbase(main):016:0>
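3.5.5 Removing Table-Scope Attributes
A table-scope attribute can also be removed with alter. The syntax is:
hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'
3.5.6 Deleting a Column Family
alter can also delete a column family. The syntax is:
hbase> alter 'table name', 'delete' => 'column family'
Example: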
hbase(main):017:0> alter 'emp','delete'=>'personal'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.2108 seconds
hbase(main):018:0>
3.6 Dropping a Table
The drop command deletes a table. The table must be disabled before it can be dropped.
hbase(main):035:0> drop 'cndba'
ERROR: Table cndba is enabled. Disable it first.
For usage try 'help "drop"'
Took 0.0135 seconds
hbase(main):036:0> disable 'cndba'
Took 0.4607 seconds
hbase(main):037:0> exists 'cndba'
Table cndba does exist
Took 0.0062 seconds
=> true
hbase(main):038:0> drop 'cndba'
Took 0.4850 seconds
hbase(main):039:0> exists 'cndba'
Table cndba does not exist
Took 0.0057 seconds
=> false
hbase(main):040:0>
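The disable-then-drop rule shown in the transcript can be summarized as a small state model. The Python sketch below is a toy illustration under that assumption, not HBase code; the class name ToyCatalog and its methods are hypothetical, and the error text simply mirrors the shell message above.

```python
class ToyCatalog:
    """Toy model of table states; each table is ENABLED or DISABLED."""

    def __init__(self):
        self.state = {}  # table name -> "ENABLED" or "DISABLED"

    def _check(self, name):
        if name not in self.state:
            raise KeyError(f"Table {name} does not exist")

    def create(self, name):
        if name in self.state:
            raise ValueError(f"Table {name} already exists")
        self.state[name] = "ENABLED"  # new tables start out enabled

    def exists(self, name):
        return name in self.state

    def disable(self, name):
        self._check(name)
        self.state[name] = "DISABLED"

    def enable(self, name):
        self._check(name)
        self.state[name] = "ENABLED"

    def drop(self, name):
        self._check(name)
        if self.state[name] != "DISABLED":
            raise RuntimeError(f"Table {name} is enabled. Disable it first.")
        del self.state[name]


# Replay the sequence from the transcript above.
catalog = ToyCatalog()
catalog.create("cndba")
try:
    catalog.drop("cndba")
except RuntimeError as err:
    print("ERROR:", err)
catalog.disable("cndba")
print(catalog.exists("cndba"))  # still exists while disabled
catalog.drop("cndba")
print(catalog.exists("cndba"))
```

As in the shell session, the table remains visible to exists after disable and only disappears once drop succeeds.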
3.6.1 drop_all
This command drops every table whose name matches the given regex. Its syntax is:
hbase> drop_all 't.*'
Note: tables must be disabled before they can be dropped.
hbase(main):002:0> disable_all 'cndba.*'
hbase(main):018:0> drop_all 'cndba.*'
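4 HBase Data Operations
4.1 Creating Data
Data can be written to an HBase table with any of the following:
1) the put shell command,
2) the add() method of the Put class (addColumn() in HBase 2.x),
3) the put() method of the HTable class (the Table interface in HBase 2.x).
As an example, we will populate the emp table created earlier.
The put command inserts a row into a table. Its syntax is:
put '<table name>', 'row1', '<colfamily:colname>', '<value>'
Insert the first row: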
hbase(main):040:0> put 'emp','1','personal data:name','raju'
Took 0.0844 seconds
hbase(main):041:0> put 'emp','1','personal data:city','hyderabad'
Took 0.0089 seconds
hbase(main):042:0> put 'emp','1','professional data:designation','manager'
Took 0.0097 seconds
hbase(main):043:0> put 'emp','1','professional data:salary','50000'
Took 0.0101 seconds
hbase(main):044:0> scan 'emp'
ROW COLUMN+CELL
1 column=personal data:city, timestamp=1551805675526, value=hyderabad
1 column=personal data:name, timestamp=1551805641439, value=raju
1 column=professional data:designation, timestamp=1551805687669, value=manager
1 column=professional data:salary, timestamp=1551805696264, value=50000
1 row(s)
Took 0.0399 seconds
hbase(main):045:0>
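The timestamp values in the scan output are milliseconds since the Unix epoch. The short Python sketch below converts one of them to a readable UTC time; the function name cell_timestamp_to_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cell_timestamp_to_utc(ms):
    """Convert an HBase cell timestamp (epoch milliseconds) to an aware UTC datetime."""
    seconds, millis = divmod(ms, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(milliseconds=millis)


# Timestamp of the personal data:city cell written above.
stamp = cell_timestamp_to_utc(1551805675526)
print(stamp.isoformat(timespec="milliseconds"))  # 2019-03-05T17:07:55.526+00:00
```

That UTC time corresponds to 01:07:55 on 2019-03-06 in the CST (UTC+8) zone shown in the shell banner.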
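4.2 Updating Data
The put command also updates an existing cell: use the same syntax and supply the new value.
put '<table name>', '<row>', '<column family:column name>', '<new value>'
The new value replaces the current one for that cell.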
hbase(main):047:0> scan 'emp'
ROW COLUMN+CELL
1 column=personal data:city, timestamp=1551805675526, value=hyderabad
1 column=personal data:name, timestamp=1551805641439, value=raju
1 column=professional data:designation, timestamp=1551805687669, value=manager
1 column=professional data:salary, timestamp=1551805696264, value=50000
Took 0.0165 seconds
hbase(main):049:0> put 'emp','1','personal data:city','Delhi'
Took 0.0083 seconds
hbase(main):050:0> scan 'emp'
ROW COLUMN+CELL
1 column=personal data:city, timestamp=1551805892261, value=Delhi
1 column=personal data:name, timestamp=1551805641439, value=raju
1 column=professional data:designation, timestamp=1551805687669, value=manager
1 column=professional data:salary, timestamp=1551805696264, value=50000
Took 0.0123 seconds
hbase(main):051:0>
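4.3 Reading Data
The get command and the get() method of the HTable class read data from an HBase table. The get command fetches one row at a time. Its syntax is:
get '<table name>', 'row1'
Example: read row 1 of the emp table.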
hbase(main):051:0> get 'emp','1'
COLUMN CELL
personal data:city timestamp=1551805892261, value=Delhi
personal data:name timestamp=1551805641439, value=raju
professional data:designation timestamp=1551805687669, value=manager
professional data:salary timestamp=1551805696264, value=50000
1 row(s)
Took 0.0523 seconds
hbase(main):052:0>
get can also read a specific column. The syntax is:
hbase> get 'table name', 'rowid', {COLUMN => 'column family:column name'}
Example:
hbase(main):052:0> get 'emp','1',{COLUMN=>'professional data:salary'}
COLUMN CELL
professional data:salary timestamp=1551805696264, value=50000
1 row(s)
Took 0.0100 seconds
hbase(main):053:0>
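The COLUMN value combines a column family and a qualifier, separated by a colon. The Python sketch below shows one way to split such a string, assuming the split happens at the first colon (family names cannot contain one, while qualifiers may). The function name split_column is hypothetical.

```python
def split_column(column):
    """Split 'family:qualifier' at the first colon; qualifier is None if absent."""
    family, sep, qualifier = column.partition(":")
    if not family:
        raise ValueError("column family must not be empty")
    return family, (qualifier if sep else None)


print(split_column("professional data:salary"))
print(split_column("personal data"))
```

Spaces in the family name, as in "professional data", are preserved as-is, and a string without a colon is treated as a bare family name.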
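4.4 Deleting Data
4.4.1 Deleting a Specific Cell
The delete command removes a specific cell from a table. Its syntax is (the timestamp is optional):
delete '<table name>', '<row>', '<column name>', <timestamp>
In the example below, delete removes only the newest version of personal data:city (Delhi), so the older value hyderabad becomes visible again. This is consistent with the column family having been altered earlier to keep up to 5 versions.
Example: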
hbase(main):054:0> scan 'emp'
ROW COLUMN+CELL
1 column=personal data:city, timestamp=1551805892261, value=Delhi
1 column=personal data:name, timestamp=1551805641439, value=raju
1 column=professional data:designation, timestamp=1551805687669, value=manager
1 column=professional data:salary, timestamp=1551805696264, value=50000
row1 column=personal data:city, timestamp=1551805856605, value=Delhi
2 row(s)
Took 0.0154 seconds
hbase(main):055:0> delete 'emp','1','personal data:city'
Took 0.0130 seconds
hbase(main):056:0> scan 'emp'
ROW COLUMN+CELL
1 column=personal data:city, timestamp=1551805675526, value=hyderabad
1 column=personal data:name, timestamp=1551805641439, value=raju
1 column=professional data:designation, timestamp=1551805687669, value=manager
1 column=professional data:salary, timestamp=1551805696264, value=50000
row1 column=personal data:city, timestamp=1551805856605, value=Delhi
2 row(s)
Took 0.0189 seconds
hbase(main):057:0>
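The reappearance of hyderabad after the delete can be reproduced with a toy Python model. This is a sketch under two assumptions: delete removes only the newest version of a column, and deleteall removes every cell in the row. Real HBase writes delete markers rather than removing data immediately, and the function names here are hypothetical.

```python
def visible(row):
    """Newest value per column, similar to the default view of scan or get."""
    return {col: versions[0][1] for col, versions in row.items() if versions}


def delete_newest(row, column):
    """Toy `delete`: drop only the newest version of one column."""
    if row.get(column):
        row[column] = row[column][1:]


def delete_all(row):
    """Toy `deleteall`: drop every cell in the row."""
    row.clear()


# Versions are (timestamp, value) pairs kept newest first.
row = {
    "personal data:city": [(1551805892261, "Delhi"), (1551805675526, "hyderabad")],
    "personal data:name": [(1551805641439, "raju")],
}
delete_newest(row, "personal data:city")
print(visible(row))  # the city column falls back to the older value
delete_all(row)
print(visible(row))  # nothing left in the row
```

If the column family kept only one version, there would be no older value to fall back to and the column would simply disappear after the delete.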
4.4.2 Deleting All Cells in a Row
The deleteall command removes all cells in a row. Its syntax is:
deleteall '<table name>', '<row>'
Example:
hbase(main):056:0> scan 'emp'
ROW COLUMN+CELL
1 column=personal data:city, timestamp=1551805675526, value=hyderabad
1 column=personal data:name, timestamp=1551805641439, value=raju
1 column=professional data:designation, timestamp=1551805687669, value=manager
1 column=professional data:salary, timestamp=1551805696264, value=50000
row1 column=personal data:city, timestamp=1551805856605, value=Delhi
2 row(s)
Took 0.0189 seconds
hbase(main):057:0> deleteall 'emp','1'
Took 0.0076 seconds
hbase(main):058:0> scan 'emp'
ROW COLUMN+CELL
row1 column=personal data:city, timestamp=1551805856605, value=Delhi
1 row(s)
Took 0.0131 seconds
hbase(main):059:0>
4.5 Scanning a Table
The scan command displays the data in a table. Its syntax is:
scan '<table name>'
4.6 Counting and Truncating
4.6.1 count
The count command counts the rows in a table. Its syntax is:
count '<table name>'
Example:
hbase(main):059:0> count 'emp'
1 row(s)
Took 0.0564 seconds
=> 1
hbase(main):060:0>
4.6.2 truncate
This command disables, drops, and recreates a table. Its syntax is:
hbase> truncate 'table name'
Example:
hbase(main):060:0> truncate 'emp'
Truncating 'emp' table (it may take a while):
Disabling table...
Truncating table...
Took 3.6143 seconds
hbase(main):061:0> scan 'emp'
ROW COLUMN+CELL
0 row(s)
Took 0.1170 seconds
hbase(main):062:0>
Copyright notice: This is an original article by the author and may not be reproduced without the author's permission.