how to use the standalone alluxio cluster with spark on yarn?

how to use the standalone alluxio cluster with spark on yarn?

alexbaijw
I have read the document "Running Spark on Alluxio" (https://www.alluxio.org/docs/1.7/en/Running-Spark-on-Alluxio.html#running-spark-on-yarn), but I am not sure I have understood it. I use alluxio-1.7.0 with hadoop-2.7.3, and I have a Spark SQL application that runs on YARN in yarn-client mode. The app mainly runs queries on Hive tables and also saves the results as tables using Spark SQL.

Now I want to use the Alluxio cluster to improve my app's performance, and I have set hdfs:///user/hive/warehouse/test.db/ as the underlying storage of my Alluxio cluster (alluxio.underfs.address). Is it possible to use the Alluxio cluster to improve my app without changing any code? If yes, how do I do it? And how can I verify that the app actually uses the Alluxio cluster?

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: how to use the standalone alluxio cluster with spark on yarn?

Andrew Audibert
Hi Alex,

To change the application so that it reads from and writes to Alluxio, update its URIs to use the "alluxio://" scheme instead of "hdfs://", for example:

> val s = sc.textFile("alluxio://localhost:19998/LICENSE")
> val double = s.map(line => line + line)
> double.saveAsTextFile("alluxio://localhost:19998/LICENSE2")

Depending on how the application is written this might require some code changes.
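When paths are built as strings in application code, the scheme change described above can be isolated in one helper. The following is a minimal sketch, not part of the Alluxio API: `toAlluxioUri` is a hypothetical name, the default master address `localhost:19998` is taken from the example above, and the sketch assumes the HDFS path maps to the same path in the Alluxio namespace, which depends on how the under storage is mounted.

```scala
import java.net.URI

object AlluxioPaths {
  // Hypothetical helper (an assumption for illustration, not an Alluxio API):
  // rewrites an hdfs:// URI to the alluxio:// scheme and keeps the path as-is.
  // It assumes the same path exists in the Alluxio namespace; if the under
  // storage is mounted at a subdirectory, the path needs adjusting too.
  def toAlluxioUri(hdfsUri: String, master: String = "localhost", port: Int = 19998): String = {
    val uri = new URI(hdfsUri)
    // Reject anything that is not an HDFS URI instead of rewriting it silently.
    require(uri.getScheme == "hdfs", s"not an hdfs:// URI: $hdfsUri")
    new URI("alluxio", null, master, port, uri.getPath, null, null).toString
  }
}
```

With such a helper, a call like `sc.textFile(AlluxioPaths.toAlluxioUri(path))` would keep the scheme switch in one place, so it can be reverted or made configurable later.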

You can check the web UI (master_hostname:19999) or the CLI (bin/alluxio fs) to verify that your application is writing to Alluxio.

Hope that helps,
Andrew

In reply to alex's message of Monday, June 4, 2018 at 7:47:19 AM UTC-7.

Re: how to use the standalone alluxio cluster with spark on yarn?

alexbaijw
Thanks Andrew. I am afraid I cannot do that, since there are no HDFS URIs like "hdfs://xx" in my app; it just uses the tables in Hive. It is all Spark SQL.

In reply to the message from and...@alluxio.com of Tuesday, June 5, 2018 at 4:35:53 AM UTC+8.

Re: how to use the standalone alluxio cluster with spark on yarn?

Andrew Audibert
I see. Is this page what you're looking for? https://www.alluxio.org/docs/1.7/en/Running-Hive-with-Alluxio.html


In reply to alex's message of Mon, Jun 4, 2018 at 6:44 PM.

Re: how to use the standalone alluxio cluster with spark on yarn?

alexbaijw

I have tried it out, but it has no effect on my app.

In reply to Andrew Audibert's message of Wednesday, June 6, 2018 at 1:55:23 AM UTC+8.

Re: how to use the standalone alluxio cluster with spark on yarn?

Andrew Audibert
Are you running into error messages, or is the data just not getting written to Alluxio even after configuring Hive to use Alluxio? If you can describe the steps you tried, we can get to the bottom of the issue.

In reply to alex's message of Thu, Jun 14, 2018 at 3:39 AM.

Re: how to use the standalone alluxio cluster with spark on yarn?

alexbaijw
There are no error messages; everything runs fine. Following the doc https://www.alluxio.org/docs/1.7/en/Running-Hive-with-Alluxio.html, I mapped existing tables stored in HDFS to Alluxio. When I use the Hive shell, the table is loaded into Alluxio successfully, but when I use spark-sql to query the tables, the data is not loaded into Alluxio. The same thing happens when I save a DataFrame into Hive tables. I have added the alluxio-client jars to my app through spark.driver.extraClassPath and spark.executor.extraClassPath.

Is there something that I have forgotten to do?


In reply to Andrew Audibert's message of Friday, June 15, 2018 at 12:50:08 AM UTC+8.

Re: how to use the standalone alluxio cluster with spark on yarn?

Andrew Audibert
Does spark-shell fail with an error message, or does it succeed but without loading data into Alluxio? In the first case we can debug based on the error message; otherwise it sounds like something went wrong with mapping the tables to Alluxio.
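One way to narrow this down is to look at which file locations Spark resolves for a table. The sketch below is only an illustration, not an official Alluxio or Spark diagnostic: `nonAlluxioLocations` is a hypothetical helper that takes a list of location strings and returns those that do not use the "alluxio://" scheme.

```scala
object LocationCheck {
  // Hypothetical helper (an assumption for illustration): returns the
  // locations that are NOT served through Alluxio, judged only by whether
  // the string starts with the alluxio:// scheme.
  def nonAlluxioLocations(locations: Seq[String]): Seq[String] =
    locations.filterNot(_.startsWith("alluxio://"))
}
```

In a Spark session, the input could come from `spark.table("test.some_table").inputFiles`, which returns a best-effort list of the files or table location behind a Dataset (`test.some_table` is a placeholder name). If any hdfs:// entries come back, Spark is probably still resolving the table to HDFS rather than to Alluxio.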

In reply to alex's message of Thursday, June 14, 2018 at 6:52:30 PM UTC-7.