Image source: https://hub.docker.com/r/apache/hive

docker pull apache/hive:3.1.3

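The configuration under /opt/apache-hive-3.1.2-bin/hive313 was obtained by first starting a container with docker run and then copying the config directory out of it with docker cp hive4:/opt/hive/conf ./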
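
The extraction step described above can be sketched as a small shell function. This is only an illustration under a few assumptions: the function name `extract_hive_conf` and the throwaway container name `hive-conf-tmp` are made up here, and the default target directory is simply the host path used in the `-v` mounts of this guide.

```shell
# Sketch: copy the image's default config directory to the host.
# The function name and the temporary container name are assumptions.
extract_hive_conf() {
    conf_dir="${1:-/opt/apache-hive-3.1.2-bin/hive313}"
    tmp_name="hive-conf-tmp"
    mkdir -p "$conf_dir" || return 1
    # Start a throwaway container so there is something to copy from.
    docker run -d --name "$tmp_name" --env SERVICE_NAME=metastore apache/hive:3.1.3 || return 1
    # A source path ending in "/." copies the directory's contents,
    # not the directory itself.
    docker cp "$tmp_name:/opt/hive/conf/." "$conf_dir"
    status=$?
    # Remove the throwaway container whether or not the copy succeeded.
    docker rm -f "$tmp_name"
    return "$status"
}
```

Calling `extract_hive_conf` with no arguments uses the path from this guide; pass another directory as the first argument to copy the files elsewhere.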

Edit hive-site.xml and add the following properties inside the <configuration> element to enable Hive Metastore event notifications:

<property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
</property>
<property>
    <name>hive.metastore.notifications.add.thrift.objects</name>
    <value>true</value>
</property>
<property>
    <name>hive.metastore.alter.notifications.basic</name>
    <value>false</value>
</property>
<property>
    <name>hive.metastore.dml.events</name>
    <value>true</value>
</property>
<property>
    <name>hive.metastore.transactional.event.listeners</name>
    <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
</property>
<property>
    <name>hive.metastore.event.db.listener.timetolive</name>
    <value>172800s</value>
</property>
<property>
    <name>hive.metastore.server.max.message.size</name>
    <value>858993459</value>
</property>
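
Before starting the containers, it can help to confirm that every property listed above actually made it into the file. The following is a minimal sketch; the helper name `check_hms_event_props` and the default file path are assumptions for illustration, and the check only looks for the property names, not their values.

```shell
# Sketch: check that hive-site.xml mentions every property listed above.
# The helper name and the default path are assumptions for illustration.
check_hms_event_props() {
    site_file="${1:-/opt/apache-hive-3.1.2-bin/hive313/hive-site.xml}"
    missing=0
    for name in \
        hive.metastore.event.db.notification.api.auth \
        hive.metastore.notifications.add.thrift.objects \
        hive.metastore.alter.notifications.basic \
        hive.metastore.dml.events \
        hive.metastore.transactional.event.listeners \
        hive.metastore.event.db.listener.timetolive \
        hive.metastore.server.max.message.size
    do
        # -F treats the dots literally; the <name> tags avoid prefix matches.
        if ! grep -qF "<name>$name</name>" "$site_file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

The function prints one line per missing property and returns a non-zero status if any is absent.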

Start hive-metastore

docker run -d -p 9083:9083 -v /opt/apache-hive-3.1.2-bin/hive313:/opt/hive/conf --env SERVICE_NAME=metastore --name hive4 apache/hive:3.1.3

docker exec -it -u 0 hive4 mkdir -p /user/hive/warehouse/ && docker exec -it -u 0 hive4 chown hive: /user/hive/warehouse/

This directory must be created; otherwise creating a database fails with the following error:
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/hi.db, failed to create database hi) (state=08S01,code=1)

Start hive-server2

docker run -d -p 10000:10000 -p 10002:10002  \
-v /opt/apache-hive-3.1.2-bin/hive313:/opt/hive/conf --env SERVICE_NAME=hiveserver2 \
--env SERVICE_OPTS="-Dhive.metastore.uris=thrift://127.0.0.1:9083" \
--env IS_RESUME="true" \
--name hive5 apache/hive:3.1.3

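# 127.0.0.1:9083 can be replaced with the IP address of the machine hosting the hive-metastore container.
# With Docker's default bridge networking, 127.0.0.1 inside hive5 refers to the hive5 container itself, so the host IP is usually needed.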

docker exec -it -u 0 hive5 mkdir /home/hive && docker exec -it -u 0 hive5 chown hive: /home/hive
# This directory must be created; without it, beeline cannot connect.
docker exec -it hive5 beeline -u 'jdbc:hive2://localhost:10000/'

Create the Hive catalog in StarRocks

CREATE EXTERNAL CATALOG hive4
PROPERTIES(
    "type" = "hive",
    "hive.metastore.uris" = "thrift://127.0.0.1:9083",
    "enable_hms_events_incremental_sync" = "true"
);
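
With the catalog in place, one way to spot-check that database and table metadata is visible is to list it from StarRocks through the MySQL client. This is a sketch under stated assumptions: the function name `check_catalog_metadata` is made up, the FE is assumed to listen on 127.0.0.1:9030 with user root, and `hi` is the database name that appears in the error messages of this guide.

```shell
# Sketch: list what StarRocks sees through the external catalog.
# Host, port, user and the function name are assumptions.
check_catalog_metadata() {
    catalog="${1:-hive4}"
    database="${2:-hi}"
    # Both statements are sent in a single client invocation.
    mysql -h 127.0.0.1 -P 9030 -u root \
        -e "SHOW DATABASES FROM $catalog; SHOW TABLES FROM $catalog.$database;"
}
```

For example, `check_catalog_metadata hive4 hi` should list the databases in the catalog and the tables in `hi` if the metastore is reachable from StarRocks.
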
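  1. Database and table metadata synchronization can be verified.

  2. Data queries cannot be verified. Hive started this way is not backed by HDFS; its data lives on the container's local filesystem (file:/user/hive/warehouse), so StarRocks BE cannot obtain information about the underlying files and reports the following error: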

    mysql> select count(*) from hive5.hi.a;
    
    ERROR 1064 (HY000): Failed to get remote files, msg: com.starrocks.connector.exception.StarRocksConnectorException: Failed to get hive remote file's metadata on path: RemotePathKey{path='file:/user/hive/warehouse/hi.db/a', isRecursive=true}. msg: File /user/hive/warehouse/hi.db/a does not exist