Seeking tuning advice for Spark SQL on Alluxio

wayasxxx
Hi,
    I am running performance tests of Spark SQL on Alluxio and Hive SQL on Alluxio.
    My Alluxio cluster is small (5 nodes, 100 GB of memory each).
    For some large queries (100 GB+ of input data), Spark on Alluxio performs no better than Spark on HDFS. Is that related to the cluster size? (The HDFS cluster is large.)
    I also consistently get these two exceptions:
           alluxio.exception.status.UnavailableException: Failed to connect to FileSystemMasterClient
           alluxio.exception.status.UnavailableException: Failed to connect to BlockMasterClient
    There seems to be a bottleneck in RPC connections to the master.
    Is there any tuning advice?
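
For anyone hitting the same two exceptions, it can help to count how often each client fails before tuning anything. The following is a minimal Python sketch, not an Alluxio tool: it scans log lines for the exception text quoted above and tallies failures per client name. The function name `tally_connect_failures` is made up for illustration, and the sample lines are just the two messages from this post.

```python
import re
from collections import Counter

# Matches the exception text quoted in this thread and captures the client name
# (e.g. FileSystemMasterClient or BlockMasterClient).
PATTERN = re.compile(
    r"alluxio\.exception\.status\.UnavailableException: "
    r"Failed to connect to (\w+)"
)


def tally_connect_failures(lines):
    """Count 'Failed to connect to <client>' exceptions per client name.

    `lines` is any iterable of strings, for example an open log file.
    This helper is hypothetical and not part of Alluxio.
    """
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input: the two exception lines from the post above.
sample = [
    "alluxio.exception.status.UnavailableException: Failed to connect to FileSystemMasterClient",
    "alluxio.exception.status.UnavailableException: Failed to connect to BlockMasterClient",
]
for client, n in tally_connect_failures(sample).most_common():
    print(f"{client}: {n}")
```

Passing an open Spark executor or driver log instead of `sample` gives a per-client count, which shows whether the failures are concentrated on the file system master client, the block master client, or both.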

Thanks,
Anyang 

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Seeking tuning advice for Spark SQL on Alluxio

binfan
Administrator
Hi,

To verify whether your master is RPC-bottlenecked, can you check out this recently added FAQ (https://github.com/Alluxio/alluxio/pull/7239/files)?
Sorry, it takes a while for the web server to pick up the change.

Once that is confirmed, we can go from there.
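
Separately from the FAQ, a quick client-side sanity check is to see whether plain TCP connections to the master's RPC port succeed under some concurrency. The Python sketch below is only an assumption-laden probe, not the procedure from the FAQ: it opens several connections in parallel and reports how many failed and the slowest connect time. Port 19998 is the default Alluxio master RPC port; the helper names and the host `alluxio-master.example.com` in the usage note are hypothetical. A clean result does not rule out an RPC bottleneck, since a master with saturated RPC threads may still accept TCP connections.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor


def connect_once(host, port, timeout=3.0):
    """Return the TCP connect time in seconds, or None if the connect fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # refused, unreachable, or timed out
        return None


def probe_master(host, port=19998, attempts=20, parallelism=10):
    """Open `attempts` TCP connections, `parallelism` at a time, and summarize.

    This only tests TCP-level reachability of the RPC port; it does not
    issue any Alluxio RPC.
    """
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        results = list(pool.map(lambda _: connect_once(host, port), range(attempts)))
    succeeded = [r for r in results if r is not None]
    return {
        "attempts": attempts,
        "failed": attempts - len(succeeded),
        "max_connect_seconds": max(succeeded) if succeeded else None,
    }
```

For example, `probe_master("alluxio-master.example.com")` run from a Spark executor node returns a small dictionary; a nonzero `failed` count or a large `max_connect_seconds` points at network or connection-level trouble rather than at query planning.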

- Bin

In reply to the message sent on Sunday, May 6, 2018 at 6:44:54 PM UTC-7.