I am having a very simple problem with Spark, but there is very little information on the web. I have encountered this problem using both PySpark and Scala. The
We have some data in our MySQL RDS which slows down our application but is no longer needed. So we want to remove old records but keep them somewhere so our
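One way to approach this kind of archiving (a rough sketch, not a recommendation tied to this exact setup) is to read the stale rows out of MySQL with Spark's JDBC source and park them as Parquet before deleting them from RDS. The table name, date column, JDBC URL, credentials, and S3 path below are all placeholders.

    # Minimal PySpark sketch; every name here is a placeholder assumption.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("archive-old-rows").getOrCreate()

    jdbc_url = "jdbc:mysql://my-rds-host:3306/mydb"  # placeholder RDS endpoint

    # Read only the rows to be archived; the predicate runs on the MySQL side.
    old_rows = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("query", "SELECT * FROM orders WHERE created_at < '2020-01-01'")
        .option("user", "app_user")
        .option("password", "secret")
        .option("driver", "com.mysql.cj.jdbc.Driver")
        .load()
    )

    # Keep a queryable copy as Parquet (e.g. on S3) before deleting from MySQL.
    old_rows.write.mode("overwrite").parquet("s3://my-archive-bucket/orders/pre-2020/")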
Assume I have a table like this:

    id  cnt  tier
    1   100  gold
    2   200  silver
    3   300  bronze
    4   400  bronze
    5   500  bronze
    6   600  gold
    7   700  silver
    8   800  silver
    9   900  silver
    10
I am using Hive and I have a table like this:

    S.no  ID   applicant_num  f_name  l_name  Primary Key
    1     123  202201A1       akhil   yadav   123~&~akhil~&~yadav
    2     123  2022
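The Primary Key column in the excerpt looks like ID, f_name, and l_name joined with a ~&~ separator. A minimal sketch of producing such a key with concat_ws, assuming a hypothetical applicants table (the S.no column is renamed sno here, since dots are awkward in column names):

    # Sketch only: table name `applicants` and column `sno` are assumptions.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    spark.sql("""
        SELECT sno, id, applicant_num, f_name, l_name,
               concat_ws('~&~', cast(id AS STRING), f_name, l_name) AS primary_key
        FROM applicants
    """).show(truncate=False)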
Hive has a metastore, and HiveServer2 listens for SQL requests; with the help of the metastore, the query is executed and the result is passed back. The Thrift frame
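For illustration, a minimal client-side sketch of that flow: PyHive speaks the Thrift protocol that HiveServer2 exposes, HiveServer2 consults the metastore to plan the query, and the rows come back over the same connection. The host, port, username, and query are placeholders.

    # Minimal PyHive sketch; connection details and table names are assumptions.
    from pyhive import hive

    conn = hive.Connection(host="hiveserver2-host", port=10000, username="hive_user")
    cursor = conn.cursor()

    # HiveServer2 plans the query with the help of the metastore and streams
    # the result rows back over the Thrift connection.
    cursor.execute("SELECT * FROM some_db.some_table LIMIT 10")
    for row in cursor.fetchall():
        print(row)

    cursor.close()
    conn.close()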
I am querying a Hudi table using Hive running on the Spark engine in an EMR 6.3.1 cluster. The Hudi version is 0.7. I have inserted a few records and then updated t
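For context, a hedged sketch of the insert-then-update (upsert) pattern being described, written against the Hudi Spark DataSource API; the table name, record key field, precombine field, and base path are assumptions, not taken from the original post.

    # Hudi upsert sketch for roughly the 0.7-era DataSource API; names are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hudi-upsert-sketch").getOrCreate()

    hudi_options = {
        "hoodie.table.name": "my_hudi_table",
        "hoodie.datasource.write.recordkey.field": "id",
        "hoodie.datasource.write.precombine.field": "ts",
        "hoodie.datasource.write.operation": "upsert",
    }

    updates = spark.createDataFrame(
        [(1, "updated_value", 1700000000)], ["id", "value", "ts"]
    )

    # Appending with operation=upsert rewrites records whose keys already exist.
    (updates.write.format("hudi")
        .options(**hudi_options)
        .mode("append")
        .save("s3://my-bucket/hudi/my_hudi_table/"))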
I am trying to add a GROUP BY clause on the Impala/Hive table, but it's not working. I have a job details table which has job name and status columns.
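A minimal sketch of the kind of grouping being attempted, assuming a hypothetical job_details table with job_name and status columns:

    # Count runs per job and status; table and column names are assumptions.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    spark.sql("""
        SELECT job_name,
               status,
               COUNT(*) AS run_count
        FROM job_details
        GROUP BY job_name, status
    """).show()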
I am having an issue generating the Hive type adapter, but no such output is generated by the command line: flutter packages pub run build_runner build --delete
How can I have Impala or Hive return the file format of the underlying files on HDFS for a table? I tried: SHOW FILES database.table_name. This lists the files,
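SHOW FILES only lists paths and sizes; the storage format is recorded in the table metadata instead. A hedged sketch using DESCRIBE FORMATTED (supported by both Hive and Impala), whose output includes the SerDe, InputFormat, and OutputFormat rows. The host and table names are placeholders, and note that a partitioned table can use different formats per partition.

    # Inspect the table's declared storage format via the metadata; names are placeholders.
    from pyhive import hive

    cursor = hive.Connection(host="hiveserver2-host", port=10000).cursor()
    cursor.execute("DESCRIBE FORMATTED database_name.table_name")
    for row in cursor.fetchall():
        # Look for the "SerDe Library:", "InputFormat:" and "OutputFormat:" rows.
        print(row)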
I am trying to run the code below, filtering on the jira_label field, and I'm getting the error below. I know this means I should add aliases, and I have
In my project, when adding a Flutter dependency in the terminal, it shows "Expected to find project root in current working directory."
Problem: I am trying to implement a new process using dbt seeds. When I use it with a Redshift connection there is no problem, but when I try to use it with dbt-sp
I have tried this code in SQL and it works fine, but in Hive it does not work: select ((select sum(price) from apart where construction_year=2020) - (select sum(
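Hive has historically rejected scalar subqueries in the SELECT list, and a common workaround is conditional aggregation over a single scan of the table. Since the second subquery is cut off in the excerpt, the 2019 filter below is purely an assumption used to illustrate the rewrite:

    # Conditional-aggregation rewrite; the 2019 condition is an assumed stand-in
    # for whatever the truncated second subquery filtered on.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    spark.sql("""
        SELECT SUM(CASE WHEN construction_year = 2020 THEN price ELSE 0 END)
             - SUM(CASE WHEN construction_year = 2019 THEN price ELSE 0 END) AS price_diff
        FROM apart
    """).show()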
I can't believe I have to ask this here but there seems to be no documentation on what the HWC actually does. All I can find is that it returns a boolean: publi
In joining two tables, I would like to select all columns except 2 of them from a large table with many columns using PySpark SQL on Databricks. My PySpark SQL: %
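One DataFrame-side way to do this, where joined_df stands in for the join result and col_a / col_b are placeholder names for the two unwanted columns:

    # Drop two columns from a wide join result; all names here are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Stand-in for the joined result.
    joined_df = spark.createDataFrame(
        [(1, "x", "y", 10)], ["id", "col_a", "col_b", "amount"]
    )

    # Keep everything except the two unwanted columns.
    keep_cols = [c for c in joined_df.columns if c not in {"col_a", "col_b"}]
    result = joined_df.select(*keep_cols)

    # Equivalent shortcut:
    result = joined_df.drop("col_a", "col_b")
    result.show()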
I am attempting to use Scala with Apache Spark locally to query a Hive table which is secured with Kerberos. I have no issues connecting and querying the data pro
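A very rough sketch of the kind of session setup involved (shown in PySpark for consistency with the other examples), assuming a Kerberos ticket has already been obtained with kinit and that hive-site.xml / krb5.conf are reachable; the metastore URI, principal, and table names are placeholders, and the exact configuration needed varies by cluster.

    # Rough sketch of pointing a local Spark session at a Kerberized metastore;
    # all values are placeholders and a valid Kerberos ticket is assumed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("kerberized-hive-sketch")
        .config("hive.metastore.uris", "thrift://metastore-host:9083")
        .config("hive.metastore.sasl.enabled", "true")
        .config("hive.metastore.kerberos.principal", "hive/_HOST@EXAMPLE.COM")
        .enableHiveSupport()
        .getOrCreate()
    )

    spark.sql("SELECT * FROM secured_db.secured_table LIMIT 5").show()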
I am trying to build a data migration pipeline using Airflow, the source being a Hive table on a Dataproc cluster and the destination being BigQuery. I'm using Datapro
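One common shape for such a pipeline (a hedged sketch, not necessarily what the original setup used) is a two-step DAG: a Dataproc Hive job exports the table to GCS as Parquet, then a GCS-to-BigQuery load pulls it into the destination table. Every name below (project, region, cluster, bucket, dataset, table) is a placeholder.

    # Two-step Hive-to-BigQuery DAG sketch; all resource names are placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
    from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

    EXPORT_QUERY = """
        INSERT OVERWRITE DIRECTORY 'gs://my-bucket/exports/my_table/'
        STORED AS PARQUET
        SELECT * FROM my_db.my_table
    """

    with DAG("hive_to_bigquery", start_date=datetime(2024, 1, 1), schedule_interval=None) as dag:
        # Run a Hive job on the Dataproc cluster to export the table to GCS.
        export_hive = DataprocSubmitJobOperator(
            task_id="export_hive_table",
            project_id="my-gcp-project",
            region="us-central1",
            job={
                "placement": {"cluster_name": "my-dataproc-cluster"},
                "hive_job": {"query_list": {"queries": [EXPORT_QUERY]}},
            },
        )

        # Load the exported Parquet files into BigQuery.
        load_bq = GCSToBigQueryOperator(
            task_id="load_into_bigquery",
            bucket="my-bucket",
            source_objects=["exports/my_table/*"],
            destination_project_dataset_table="my-gcp-project.my_dataset.my_table",
            source_format="PARQUET",
            write_disposition="WRITE_TRUNCATE",
        )

        export_hive >> load_bq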
HBase table:

    rowkey: 2020-02-02^ghfgewr3434555, cf:1 timestamp=1604405829275, value=true
    rowkey: 2020-02-02^ghfgewr3434555, cf:2 timestamp=1604405829275, value=
I am trying this query in Hive and it's not working: select ( ( select count(*) from click_streaming where page_
I'm trying to import MongoDB data into Hive. The jar versions that I have used are: ADD JAR /root/HDL/mongo-java-driver-3.4.2.jar; ADD JAR /root/HDL/mongo-hado
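The usual mongo-hadoop wiring (a sketch under the assumption that the driver and connector jars from the excerpt are already added to the Hive session) is an external Hive table declared with the MongoStorageHandler; the database, collection, columns, column mapping, and mongo.uri below are placeholders.

    # Declare a MongoDB-backed external Hive table via mongo-hadoop; all names
    # and the connection URI are placeholders.
    from pyhive import hive

    cursor = hive.Connection(host="hiveserver2-host", port=10000).cursor()

    cursor.execute("""
        CREATE EXTERNAL TABLE IF NOT EXISTS mongo_users (
            id STRING,
            name STRING
        )
        STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
        WITH SERDEPROPERTIES ('mongo.columns.mapping' = '{"id":"_id","name":"name"}')
        TBLPROPERTIES ('mongo.uri' = 'mongodb://mongo-host:27017/mydb.users')
    """)

    # The table can then be queried like any other Hive table.
    cursor.execute("SELECT * FROM mongo_users LIMIT 10")
    print(cursor.fetchall())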