Hive as underfs with append

Hive as underfs with append

Davis Varghese
Hi,
I followed https://www.alluxio.org/docs/1.6/en/Running-Hive-with-Alluxio.html#use-alluxio-for-existing-tables-stored-in-hdfs to integrate Alluxio with our Hive data warehouse. Everything works fine when the data is static. If we append data to an existing Hive table, Alluxio does not reflect it; it still returns results based on what it collected when it first cached the metadata. Is there a way to flush this information from Alluxio using an API?

At times we change the schema of a Hive table, and this also causes Alluxio to return information and data based on the old table definition and Parquet file location.

The other issue I faced is that the approach mentioned above does not work for partitioned tables, because we have to run "alter location" for every partition, and in our case the partitions are dynamic.
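
The per-partition "alter location" step can be scripted rather than typed by hand. Below is a minimal Python sketch, assuming the partition specs and their current HDFS locations have already been collected (for example from the metastore), and that the HDFS root is mounted at the Alluxio root so paths carry over unchanged. The table name, host names, and ports are hypothetical, and the statements would have to be regenerated whenever dynamic partitioning creates new partitions.

```python
from urllib.parse import urlparse


def to_alluxio_uri(hdfs_uri, alluxio_authority):
    """Rewrite hdfs://namenode:port/path as alluxio://master:port/path.

    Assumes the HDFS root is mounted at the Alluxio root, so the path
    component is reused as-is.
    """
    parsed = urlparse(hdfs_uri)
    if parsed.scheme != "hdfs":
        raise ValueError("expected an hdfs:// URI, got: " + hdfs_uri)
    return "alluxio://" + alluxio_authority + parsed.path


def alter_location_statements(table, partitions, alluxio_authority):
    """Build one ALTER TABLE ... SET LOCATION statement per partition.

    partitions maps a partition spec such as "dt='2017-12-13'" to the
    partition's current HDFS location.
    """
    statements = []
    for spec, location in sorted(partitions.items()):
        new_location = to_alluxio_uri(location, alluxio_authority)
        statements.append(
            "ALTER TABLE {} PARTITION ({}) SET LOCATION '{}';".format(
                table, spec, new_location
            )
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table, partitions, and hosts, for illustration only.
    example = {
        "dt='2017-12-13'": "hdfs://namenode:9000/warehouse/events/dt=2017-12-13",
        "dt='2017-12-14'": "hdfs://namenode:9000/warehouse/events/dt=2017-12-14",
    }
    for stmt in alter_location_statements("events", example, "alluxio-master:19998"):
        print(stmt)
```

The generated statements can be saved to a file and run through the Hive CLI; nothing here talks to Hive or Alluxio directly.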

Any suggestion is appreciated.




Xurmo Technologies Pvt. Ltd., Silver Software Tech Park, Plot No. 23 & 24, II Floor, EPIP 1st Phase, KIADB, Whitefield,
Bangalore - 560 066 | +918040314000 [T]

www.xurmo.com

 Disclaimer: This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake, please delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission.

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Hive as underfs with append

Davis Varghese
Running "refresh table table_name" solved the first issue.
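
For reference, here is a minimal Python sketch of issuing that refresh programmatically. It assumes the tables are queried through Spark SQL (the thread does not say which engine ran the command) and that `spark` is an existing SparkSession. `spark.catalog.refreshTable` is the PySpark counterpart of the REFRESH TABLE statement; the helper name `refresh_tables` is hypothetical.

```python
def refresh_tables(spark, table_names):
    """Invalidate Spark's cached metadata and data for each named table.

    spark is expected to be a pyspark.sql.SparkSession (assumption: the
    tables are read through Spark SQL). Returns the names that were refreshed.
    """
    refreshed = []
    for name in table_names:
        # Same effect as running: REFRESH TABLE <name>
        spark.catalog.refreshTable(name)
        refreshed.append(name)
    return refreshed
```

A call such as `refresh_tables(spark, ["table_name"])` would be made after each append, before the next query against the table.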

On Wednesday, December 13, 2017 at 12:37:44 PM UTC+5:30, Davis Varghese wrote:
Hi,
I followed https://www.alluxio.org/docs/1.6/en/Running-Hive-with-Alluxio.html#use-alluxio-for-existing-tables-stored-in-hdfs to integrate Alluxio with our Hive data warehouse. Everything works fine when the data is static. If we append data to an existing Hive table, Alluxio does not reflect it; it still returns results based on what it collected when it first cached the metadata. Is there a way to flush this information from Alluxio using an API?

At times we change the schema of a Hive table, and this also causes Alluxio to return information and data based on the old table definition and Parquet file location.

The other issue I faced is that the approach mentioned above does not work for partitioned tables, because we have to run "alter location" for every partition, and in our case the partitions are dynamic.

Any suggestion is appreciated.





Re: Hive as underfs with append

binfeng
Hi Davis,

Did you solve the second issue? Have you tried "MSCK REPAIR TABLE table_name" to rediscover the partitions?
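
A minimal Python sketch of running that repair over several tables follows. It assumes a caller-supplied `run_sql` function that executes one statement (for example a wrapper around a Hive or Spark SQL connection); the helper names are hypothetical. MSCK REPAIR TABLE registers partition directories found under the table's location, so it relies on the table location already pointing where the partitions live.

```python
import re

# Plain or database-qualified identifiers only, to avoid building
# statements from arbitrary strings.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def msck_repair_statement(table_name):
    """Return the MSCK REPAIR TABLE statement for one table."""
    if not _IDENTIFIER.match(table_name):
        raise ValueError("unexpected table name: " + repr(table_name))
    return "MSCK REPAIR TABLE {}".format(table_name)


def repair_partitions(run_sql, table_names):
    """Run MSCK REPAIR TABLE for each table through the supplied executor."""
    for name in table_names:
        run_sql(msck_repair_statement(name))
```

With Spark, for instance, `repair_partitions(spark.sql, ["table_name"])` would pass each statement to the session.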

On Thursday, December 14, 2017 at 7:02:00 PM UTC-8, Davis Varghese wrote:
Running "refresh table table_name" solved the first issue.
