Category "hive"

How can I concatenate all values in descending order that have the same primary key in HIVE?

I am using HIVE and I have a table like this: S.no ID applicant_num f_name l_name Primary Key 1 123 202201A1 akhil yadav 123~&~akhil~&~yadav 2 123 2022

Spark-SQL plug in on HIVE

HIVE has a metastore and HIVESERVER2 listens for SQL requests; with the help of metastore, the query is executed and the result is passed back. The Thrift frame

Getting duplicate records while querying Hudi table using Hive on Spark Engine in EMR 6.3.1

I am querying a Hudi table using Hive, which is running on the Spark engine in an EMR 6.3.1 cluster. The Hudi version is 0.7. I have inserted a few records and then updated t

Hive/Impala GROUP BY query for total successful and failed records

I am trying to add a GROUP BY clause to a query on an Impala/Hive table, but it's not working. I have a job details table that has job name and status columns.

Hive adapter generation does not produce any output

I am having an issue generating the Hive type adapter. No output is generated by the command flutter packages pub run build_runner build --delete

Impala/Hive: show file format

How can I have Impala or Hive return the file format of the underlying files on HDFS for a table? I tried: SHOW FILES database.table_name This lists the files,

Hive subquery with Lateral View

I am trying to run the code below. I'm trying to filter on the jira_label field. I'm getting the error below. I know this means I should add aliases and I have

Adding a Flutter dependency in the terminal shows "Expected to find project root in current working directory."

In my project, when I add a Flutter dependency in the terminal, it shows "Expected to find project root in current working directory."

How to use dbt seed properly with dbt-spark[PyHive] running in EMR?

Problem: I am trying to implement a new process using dbt seeds. When I use it in a Redshift connection there is no problem, but when I try to use it with dbt-sp

How to subtract two query results in Hive

I have tried this code in SQL and it works fine, but in Hive it does not work: select((select sum(price) from apart where construction_year=2020) - (select sum(

What does the HiveWarehouseConnector executeUpdate() function return?

I can't believe I have to ask this here but there seems to be no documentation on what the HWC actually does. All I can find is that it returns a boolean: publi

How to select all columns except 2 of them from a large table on pyspark sql?

In joining two tables, I would like to select all columns except 2 of them from a large table with many columns on pyspark sql on databricks. My pyspark sql: %

How to use Apache Spark to query Hive table with Kerberos?

I am attempting to use Scala with Apache Spark locally to query a Hive table that is secured with Kerberos. I have no issues connecting and querying the data pro

How to extract the query result from a Hive job output logs using DataprocHiveOperator?

I am trying to build a data migration pipeline using Airflow, source being a Hive table on a Dataproc cluster and the destination is BigQuery. I'm using Datapro

How to split HBase row key into 2 columns in Hive table

HBase Table rowkey: 2020-02-02^ghfgewr3434555, cf:1 timestamp=1604405829275, value=true rowkey: 2020-02-02^ghfgewr3434555, cf:2 timestamp=1604405829275, value=

Hive query to find conversion ratio

I am trying this query in Hive and it's not working. select ( ( select count(*) from click_streaming where page_

Import MongoDB data into Hive Error: Splitter implementation is incompatible

I'm trying to import MongoDB data into Hive. The jar versions that I have used are: ADD JAR /root/HDL/mongo-java-driver-3.4.2.jar; ADD JAR /root/HDL/mongo-hado

Hive record is inserted, but then I get an error

I created a table in Hive: CREATE TABLE `test3`.`shop_dim` ( `shop_id` bigint, `shop_name` string, `shop_company_id`

Hive SQL regexp_extract (number)_(number)

I'm new to Hive SQL and I'm trying to extract a value from the column col_a in the data df, which is in this format: \\\"id\\\":\\\"101_12345\\\" I only need to

HIVE CBO. Wrong results with Hive SQL query with MULTIPLE IN conditions in where clause

I am running one SQL query in Hive and it gives different results with CBO enabled and disabled. The results are wrong when CBO is enabled (set hive.cbo.enable=