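0. Related articles
1. Environment preparation
1.1. Setting up the server environment
1.2. Creating the Maven project and writing data
2. Maven dependencies
3. Core code
3.1. Direct query
3.2. Conditional query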
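0. Related articles
Data Lake: Article Index
1. Environment preparation
1.1. Setting up the server environment
To set up the server environment that Spark uses to write data into Hudi (including installing HDFS on CentOS 7), see my other post: Data Lake Hudi (6): Installing and Using Hudi Integrated with Spark and HDFS
1.2. Creating the Maven project and writing data
This post demonstrates how to query data in a Hudi table with Spark code, so the table must already contain data. First create a Maven project and insert some mock data into Hudi, as described in my other post: Data Lake Hudi (9): Inserting Data into Hudi with Spark
2. Maven dependencies
The Maven dependencies were already listed in the other post, but they are repeated here for completeness.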
<repositories>
    <repository>
        <id>aliyun</id>
        <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    </repository>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
    <repository>
        <id>jboss</id>
        <url>http://repository.jboss.com/nexus/content/groups/public</url>
    </repository>
</repositories>

<properties>
    <scala.version>2.12.10</scala.version>
    <scala.binary.version>2.12</scala.binary.version>
    <spark.version>3.0.0</spark.version>
    <hadoop.version>2.7.3</hadoop.version>
    <hudi.version>0.9.0</hudi.version>
</properties>

<dependencies>
    <!-- Scala library -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <!-- Spark Core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- Spark SQL -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- Hadoop Client -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <!-- hudi-spark3 -->
    <dependency>
        <groupId>org.apache.hudi</groupId>
        <artifactId>hudi-spark3-bundle_2.12</artifactId>
        <version>${hudi.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-avro_2.12</artifactId>
        <version>${spark.version}</version>
    </dependency>
</dependencies>

<build>
    <outputDirectory>target/classes</outputDirectory>
    <testOutputDirectory>target/test-classes</testOutputDirectory>
    <resources>
        <resource>
            <directory>${project.basedir}/src/main/resources</directory>
        </resource>
    </resources>
    <!-- Maven compiler plugins -->
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.0</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.2.0</version>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
3. Core code
3.1. Direct query
Use a snapshot query to read data from the Hudi table, then write DSL code to analyze the data according to business needs.
package com.ouyang.hudi.crud

import org.apache.spark.sql.{DataFrame, SparkSession}

/**
 * @ date: 2022/2/23
 * @ author: yangshibiao
 * @ desc: Snapshot query of Hudi data, written with the DSL
 */
object Demo02_SnapshotQuery {

  def main(args: Array[String]): Unit = {

    // Create the SparkSession instance and set its properties
    val spark: SparkSession = {
      SparkSession.builder()
        .appName(this.getClass.getSimpleName.stripSuffix("$"))
        .master("local[4]")
        // Serialization: Kryo
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .getOrCreate()
    }

    // Variables: table name and storage path
    val tableName: String = "tbl_trips_cow"
    val tablePath: String = "/hudi-warehouse/tbl_trips_cow"

    // Load the Hudi table as a DataFrame and print its schema and first rows
    import spark.implicits._
    val tripsDF: DataFrame = spark.read.format("hudi").load(tablePath)
    tripsDF.printSchema()
    tripsDF.show(10, truncate = false)

    // Trips whose fare is between 20 and 50 (both bounds inclusive)
    tripsDF
      .filter($"fare" >= 20 && $"fare" <= 50)
      .select($"driver", $"rider", $"fare", $"begin_lat", $"begin_lon", $"partitionpath", $"_hoodie_commit_time")
      .orderBy($"fare".desc, $"_hoodie_commit_time".desc)
      .show(100, truncate = false)
  }
}
Running the code above queries all data under the table path and prints the schema and the first rows, as shown below:
root
|-- _hoodie_commit_time: string (nullable = true)
|-- _hoodie_commit_seqno: string (nullable = true)
|-- _hoodie_record_key: string (nullable = true)
|-- _hoodie_partition_path: string (nullable = true)
|-- _hoodie_file_name: string (nullable = true)
|-- begin_lat: double (nullable = true)
|-- begin_lon: double (nullable = true)
|-- driver: string (nullable = true)
|-- end_lat: double (nullable = true)
|-- end_lon: double (nullable = true)
|-- fare: double (nullable = true)
|-- rider: string (nullable = true)
|-- ts: long (nullable = true)
|-- uuid: string (nullable = true)
|-- partitionpath: string (nullable = true)
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key |_hoodie_partition_path |_hoodie_file_name |begin_lat |begin_lon |driver |end_lat |end_lon |fare |rider |ts |uuid |partitionpath |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|20220223222328 |20220223222328_1_33 |bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5655712287397079 |0.8032800489802543 |driver-213|0.18240785532240533|0.869159296395892 |92.0536330577404 |rider-213|1645625676345|bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_34 |99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.6626987497394154 |0.22504711188369042|driver-213|0.35712946224267583|0.244841817279154 |10.72756362186601 |rider-213|1645326839179|99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_35 |bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.11488393157088261 |0.6273212202489661 |driver-213|0.7454678537511295 |0.3954939864908973 |27.79478688582596 |rider-213|1645094601577|bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_36 |59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5751612868373159 |0.46940431249093517|driver-213|0.6855658616896665 |0.12686440203574556|11.212022663263122|rider-213|1645283606578|59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_37 |5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.1856488085068272 |0.9694586417848392 |driver-213|0.38186367037201974|0.25252652214479043|33.92216483948643 |rider-213|1645133755620|5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_38 |d64b94ec-d8e8-44f3-a5c0-e205e034aa5d|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5731835407930634 |0.4923479652912024 |driver-213|0.08988581780930216|0.42520899698713666|64.27696295884016 |rider-213|1645298902122|d64b94ec-d8e8-44f3-a5c0-e205e034aa5d|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_39 |f0d208fb-b5aa-4236-acbc-a6ec283c5693|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.30057620949299213 |0.3883212395069259 |driver-213|0.8529563766655098 |0.18417876489592633|57.62896261799536 |rider-213|1645483784517|f0d208fb-b5aa-4236-acbc-a6ec283c5693|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_40 |61602de6-6839-4eb2-88ed-75fdf28bbd1f|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.023755167724156978|0.6322099740212305 |driver-213|0.2171902015800108 |0.2132173852420407 |15.330847537835645|rider-213|1645026565110|61602de6-6839-4eb2-88ed-75fdf28bbd1f|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_41 |6b8c7cdd-0302-4110-bced-a996d56828e8|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5692544178629111 |0.610843492129245 |driver-213|0.366234158145209 |0.2051302267345806 |77.05976291070496 |rider-213|1645519660912|6b8c7cdd-0302-4110-bced-a996d56828e8|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_42 |3732e4e6-2095-4eb8-903b-8daf3d307607|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|9.544772278234914E-4|0.7150696027624646 |driver-213|0.4142563844059821 |0.1214902298018885 |24.65031205441023 |rider-213|1645112245071|3732e4e6-2095-4eb8-903b-8daf3d307607|americas/united_states/san_francisco|
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
only showing top 10 rows
The result can then be filtered and projected with Spark's DSL syntax. The printed result is shown below:
+----------+---------+------------------+--------------------+-------------------+------------------------------------+-------------------+
|driver |rider |fare |begin_lat |begin_lon |partitionpath |_hoodie_commit_time|
+----------+---------+------------------+--------------------+-------------------+------------------------------------+-------------------+
|driver-213|rider-213|49.899171213436844|0.49054633351061006 |0.8716474406347761 |americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|49.57985534250222 |0.13036108279724024 |0.2365242449257826 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|49.121690071563506|0.3880100101379198 |0.8750494376540229 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|46.971815642308016|0.6325393869124881 |0.7723215898397776 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|46.65992353549729 |0.9924142645535157 |0.3157934820865995 |americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|44.839244944180244|0.6372504913279929 |0.04241635032425073|americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|43.4923811219014 |0.6100070562136587 |0.8779402295427752 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|42.76921664939422 |0.20404106962358204 |0.41452263884832685|americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|42.46412330377599 |0.8918316400031095 |0.11580010866153201|americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|41.076686078636236|0.5712378196458244 |0.4559336764388273 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|41.06290929046368 |0.651058505660742 |0.8192868687714224 |asia/india/chennai |20220223222328 |
|driver-213|rider-213|40.211140833035394|0.9090538095331541 |0.8801105093619153 |asia/india/chennai |20220223222328 |
|driver-213|rider-213|39.31163975206524 |0.7548086309564753 |0.9049457113019617 |asia/india/chennai |20220223222328 |
|driver-213|rider-213|38.697902072535484|0.9199515909032545 |0.2895800693712469 |americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|38.61457381408665 |0.39253605282983284 |0.5761097193536119 |asia/india/chennai |20220223222328 |
|driver-213|rider-213|34.158284716382845|0.4726905879569653 |0.46157858450465483|americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|33.92216483948643 |0.1856488085068272 |0.9694586417848392 |americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|31.32477949501916 |0.7267793086410466 |0.2202009625132143 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|30.80177695413958 |0.3613216010259426 |0.8750683366449247 |asia/india/chennai |20220223222328 |
|driver-213|rider-213|30.47844781909017 |0.10509642405359532 |0.07682825311613706|asia/india/chennai |20220223222328 |
|driver-213|rider-213|30.24821012722806 |0.6437496229932878 |0.3259549255934986 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|28.874644702723472|0.04316839215753254 |0.49689215534636744|americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|28.53709038726113 |0.132849613764075 |0.2370254092732652 |asia/india/chennai |20220223222328 |
|driver-213|rider-213|27.911375263393268|0.9461601725825765 |0.07097928915812768|americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|27.79478688582596 |0.11488393157088261 |0.6273212202489661 |americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|27.66236301605771 |0.7527035644196625 |0.7525032121800279 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|25.216729525590676|0.48687190581855855 |0.03482702091010481|americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|24.65031205441023 |9.544772278234914E-4|0.7150696027624646 |americas/united_states/san_francisco|20220223222328 |
|driver-213|rider-213|22.991770617403628|0.699025398548803 |0.8105360506582145 |americas/brazil/sao_paulo |20220223222328 |
|driver-213|rider-213|22.85729206746916 |0.5378950285504629 |0.14011059922351543|americas/brazil/sao_paulo |20220223222328 |
+----------+---------+------------------+--------------------+-------------------+------------------------------------+-------------------+
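The same filter can also be written in Spark SQL instead of the DSL. The following is only a minimal sketch: the object name `SnapshotQuerySqlSketch`, the helper `faresInRange`, and the temporary view name `trips_snapshot` are hypothetical names chosen for this illustration, and it assumes the table path used above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SnapshotQuerySqlSketch {

  // Registers the DataFrame as a temporary view (hypothetical name) and
  // runs the same fare filter, projection and ordering as the DSL version.
  def faresInRange(spark: SparkSession, trips: DataFrame): DataFrame = {
    trips.createOrReplaceTempView("trips_snapshot")
    spark.sql(
      """SELECT driver, rider, fare, begin_lat, begin_lon, partitionpath, _hoodie_commit_time
        |FROM trips_snapshot
        |WHERE fare >= 20 AND fare <= 50
        |ORDER BY fare DESC, _hoodie_commit_time DESC""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SnapshotQuerySqlSketch")
      .master("local[4]")
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .getOrCreate()

    // Assumes the Hudi table written in the earlier post exists at this path
    val tripsDF = spark.read.format("hudi").load("/hudi-warehouse/tbl_trips_cow")
    faresInRange(spark, tripsDF).show(100, truncate = false)

    spark.stop()
  }
}
```

Whether to use the DSL or SQL is mostly a matter of taste; both go through the same Spark optimizer, so the result should be identical.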
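3.2. Conditional query
When querying a Hudi table you can also filter by time. Set the option "as.of.instant" to a value in the format "20220223222328" or "2022-02-23 22:23:28"; the query then returns the table as it was at that instant, that is, only data committed at or before it.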
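The two value formats describe the same point in time. As a small illustration, the sketch below converts between them with `java.time`; the object `InstantFormats` and its two methods are hypothetical helpers written for this post, not part of Hudi.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object InstantFormats {
  // Compact form as it appears in _hoodie_commit_time, e.g. 20220223222328
  private val compact = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")
  // Readable form, e.g. 2022-02-23 22:23:28
  private val readable = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // "2022-02-23 22:23:28" -> "20220223222328"
  def toCompact(value: String): String =
    LocalDateTime.parse(value, readable).format(compact)

  // "20220223222328" -> "2022-02-23 22:23:28"
  def toReadable(value: String): String =
    LocalDateTime.parse(value, compact).format(readable)
}
```

For example, `InstantFormats.toCompact("2022-02-23 22:23:28")` gives the same string that appears in the `_hoodie_commit_time` column of the output in section 3.1.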
The complete code is as follows:
package com.ouyang.hudi.crud

import org.apache.spark.sql.SparkSession

/**
 * @ date: 2022/2/23
 * @ author: yangshibiao
 * @ desc: Query Hudi data as of a given instant (option "as.of.instant")
 */
// Note: this object has the same name as the one in section 3.1;
// rename one of them if both are kept in the same package.
object Demo02_SnapshotQuery {

  def main(args: Array[String]): Unit = {

    // Create the SparkSession instance and set its properties
    val spark: SparkSession = {
      SparkSession.builder()
        .appName(this.getClass.getSimpleName.stripSuffix("$"))
        .master("local[4]")
        // Serialization: Kryo
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .getOrCreate()
    }

    // Variables: table name and storage path
    val tableName: String = "tbl_trips_cow"
    val tablePath: String = "/hudi-warehouse/tbl_trips_cow"

    import org.apache.spark.sql.functions._

    // Option 1: pass the instant as a compact string (yyyyMMddHHmmss)
    val df1 = spark.read
      .format("hudi")
      .option("as.of.instant", "20220223222328")
      .load(tablePath)
      .sort(col("_hoodie_commit_time").desc)
    df1.printSchema()
    df1.show(numRows = 5, truncate = false)

    println("==================== 分割线 ====================")

    // Option 2: pass the instant as a date-time string (yyyy-MM-dd HH:mm:ss)
    val df2 = spark.read
      .format("hudi")
      .option("as.of.instant", "2022-02-23 22:23:28")
      .load(tablePath)
      .sort(col("_hoodie_commit_time").desc)
    df2.printSchema()
    df2.show(numRows = 5, truncate = false)
  }
}
The printed schema and sample rows are shown below:
root
|-- _hoodie_commit_time: string (nullable = true)
|-- _hoodie_commit_seqno: string (nullable = true)
|-- _hoodie_record_key: string (nullable = true)
|-- _hoodie_partition_path: string (nullable = true)
|-- _hoodie_file_name: string (nullable = true)
|-- begin_lat: double (nullable = true)
|-- begin_lon: double (nullable = true)
|-- driver: string (nullable = true)
|-- end_lat: double (nullable = true)
|-- end_lon: double (nullable = true)
|-- fare: double (nullable = true)
|-- rider: string (nullable = true)
|-- ts: long (nullable = true)
|-- uuid: string (nullable = true)
|-- partitionpath: string (nullable = true)
+-------------------+--------------------+------------------------------------+----------------------+---------------------------------------------------------------------+-------------------+------------------+----------+--------------------+-------------------+-----------------+---------+-------------+------------------------------------+------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key |_hoodie_partition_path|_hoodie_file_name |begin_lat |begin_lon |driver |end_lat |end_lon |fare |rider |ts |uuid |partitionpath |
+-------------------+--------------------+------------------------------------+----------------------+---------------------------------------------------------------------+-------------------+------------------+----------+--------------------+-------------------+-----------------+---------+-------------+------------------------------------+------------------+
|20220223222328 |20220223222328_2_43 |c7c3c014-0dc4-42e3-a674-020ffc29a028|asia/india/chennai |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.03154543220118411|0.2887009329948117|driver-213|0.7883536904111458 |0.629523587592623 |86.92639065900747|rider-213|1645123906580|c7c3c014-0dc4-42e3-a674-020ffc29a028|asia/india/chennai|
|20220223222328 |20220223222328_2_45 |c59fa19a-b76a-4477-8015-a49615305292|asia/india/chennai |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.4805271604136475 |0.8630157667444018|driver-213|0.3272256283194892 |0.6298100777642365 |99.46343958295148|rider-213|1645259758661|c59fa19a-b76a-4477-8015-a49615305292|asia/india/chennai|
|20220223222328 |20220223222328_2_47 |1c73e11f-19f0-48cf-ba76-b79a75af9fd7|asia/india/chennai |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.7413486368980094 |0.9417400045187958|driver-213|0.03903494276309427 |0.12892252065489862|5.585015784895486|rider-213|1645511312485|1c73e11f-19f0-48cf-ba76-b79a75af9fd7|asia/india/chennai|
|20220223222328 |20220223222328_2_49 |80e12a32-f802-469a-a072-f92d1ed1ca11|asia/india/chennai |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.132849613764075 |0.2370254092732652|driver-213|0.012105237836192995|0.9180654821797201 |28.53709038726113|rider-213|1645556382792|80e12a32-f802-469a-a072-f92d1ed1ca11|asia/india/chennai|
|20220223222328 |20220223222328_2_50 |bb60dcb8-618c-444b-98ad-c22d0a128f33|asia/india/chennai |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.770028447157646 |0.730140741480257 |driver-213|0.2776410021076544 |0.02677801967450366|8.123010514625829|rider-213|1645461203317|bb60dcb8-618c-444b-98ad-c22d0a128f33|asia/india/chennai|
+-------------------+--------------------+------------------------------------+----------------------+---------------------------------------------------------------------+-------------------+------------------+----------+--------------------+-------------------+-----------------+---------+-------------+------------------------------------+------------------+
only showing top 5 rows
==================== 分割线 ====================
root
|-- _hoodie_commit_time: string (nullable = true)
|-- _hoodie_commit_seqno: string (nullable = true)
|-- _hoodie_record_key: string (nullable = true)
|-- _hoodie_partition_path: string (nullable = true)
|-- _hoodie_file_name: string (nullable = true)
|-- begin_lat: double (nullable = true)
|-- begin_lon: double (nullable = true)
|-- driver: string (nullable = true)
|-- end_lat: double (nullable = true)
|-- end_lon: double (nullable = true)
|-- fare: double (nullable = true)
|-- rider: string (nullable = true)
|-- ts: long (nullable = true)
|-- uuid: string (nullable = true)
|-- partitionpath: string (nullable = true)
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key |_hoodie_partition_path |_hoodie_file_name |begin_lat |begin_lon |driver |end_lat |end_lon |fare |rider |ts |uuid |partitionpath |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|20220223222328 |20220223222328_1_33 |bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5655712287397079 |0.8032800489802543 |driver-213|0.18240785532240533|0.869159296395892 |92.0536330577404 |rider-213|1645625676345|bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_34 |99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.6626987497394154 |0.22504711188369042|driver-213|0.35712946224267583|0.244841817279154 |10.72756362186601 |rider-213|1645326839179|99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_35 |bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.11488393157088261|0.6273212202489661 |driver-213|0.7454678537511295 |0.3954939864908973 |27.79478688582596 |rider-213|1645094601577|bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_36 |59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5751612868373159 |0.46940431249093517|driver-213|0.6855658616896665 |0.12686440203574556|11.212022663263122|rider-213|1645283606578|59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|
|20220223222328 |20220223222328_1_37 |5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.1856488085068272 |0.9694586417848392 |driver-213|0.38186367037201974|0.25252652214479043|33.92216483948643 |rider-213|1645133755620|5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
only showing top 5 rows
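To decide which value to pass to "as.of.instant", it can help to look at the commit instants that appear in the table. The sketch below collects the distinct values of the `_hoodie_commit_time` column; `CommitInstants` and `list` are hypothetical names for this illustration. Note the assumption: this only sees commits that still own records in the loaded snapshot, so it is not a full view of the Hudi timeline.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object CommitInstants {

  // Returns the distinct commit instants found in the DataFrame, oldest first.
  // The yyyyMMddHHmmss strings sort chronologically when sorted as text.
  def list(trips: DataFrame): Seq[String] =
    trips
      .select(col("_hoodie_commit_time"))
      .distinct()
      .orderBy(col("_hoodie_commit_time"))
      .collect()
      .map(_.getString(0))
      .toSeq
}
```

With the session and path from the code above, a call could look like `CommitInstants.list(spark.read.format("hudi").load(tablePath))`.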
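This Hudi series is a record of what I learned from the official Hudi website, with some personal understanding added. If anything falls short, I ask for the reader's understanding ☺☺☺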