Spark local java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found

6 messages
Spark local java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found

jason
Hi -

I'm getting the error:
java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found
when trying to run Alluxio alongside a Spark local instance.

I have looked at the page here:
https://www.alluxio.org/docs/master/en/Debugging-Guide.html#q-why-do-i-see-exceptions-like-javalangruntimeexception-javalangclassnotfoundexception-class-alluxiohadoopfilesystem-not-found

which details the steps to take in this situation, but I'm still not having any success.

According to the Spark documentation, I can instantiate a local Spark session like so:
SparkSession.builder
  .appName("App")
  .getOrCreate()

Then I can add the Alluxio client library like so:
sparkSession.conf.set("spark.driver.extraClassPath", ALLUXIO_SPARK_CLIENT)
sparkSession.conf.set("spark.executor.extraClassPath", ALLUXIO_SPARK_CLIENT)

I have verified that these settings point to the proper jar file by logging them:
logger.error(sparkSession.conf.get("spark.driver.extraClassPath"))
logger.error(sparkSession.conf.get("spark.executor.extraClassPath"))

But I still get the error. Is there anything else I can do to figure out why Spark is not picking the library up?

As an FYI, there is another application in the cluster that connects to my Alluxio instance using the fs client, and that all works fine.
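One more data point - a quick probe I can run in the driver before touching Spark at all, to see whether the class is visible to the JVM's classloader in the first place (the helper name is mine):

```scala
// Returns true if the named class can be loaded by the current JVM's classloader.
def classVisible(name: String): Boolean =
  try { Class.forName(name); true }
  catch { case _: ClassNotFoundException => false }

// Called early in the driver, this tells me whether the Alluxio client
// is on the application classpath at all:
// println(classVisible("alluxio.hadoop.FileSystem"))
```

If this returns false in the driver itself, the jar never made it onto the sbt/application classpath, regardless of what the Spark settings say.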

Thanks

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Re: Spark local java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found

Gene Pang
Hi Jason,

Do you know if the error is happening on the driver or the executors? In my experience, if you are using client deploy mode for Spark, you have to use the "--driver-class-path" flag instead of setting "spark.driver.extraClassPath" at runtime. You may also be able to get it to pick up the classpath if you set that in the spark-defaults config file.
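For example, something along these lines - the jar path here is just a placeholder for your actual Alluxio client jar:

```shell
# Client deploy mode: the driver JVM is already running by the time your
# app code executes, so the classpath has to be supplied at launch:
spark-submit \
  --driver-class-path /path/to/alluxio-client.jar \
  --conf spark.executor.extraClassPath=/path/to/alluxio-client.jar \
  your-app.jar

# Or equivalently in conf/spark-defaults.conf:
#   spark.driver.extraClassPath    /path/to/alluxio-client.jar
#   spark.executor.extraClassPath  /path/to/alluxio-client.jar
```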

Thanks,
Gene

Re: Spark local java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found

Gene Pang
Hi Jason,

Could you show the exception stack trace? That should be able to show if the exception is in the driver or executor.

Also, I think even if the cluster is in local mode, it should read the spark defaults file.

Thanks,
Gene

On Fri, Apr 13, 2018 at 7:58 AM, Jason Boorn <[hidden email]> wrote:
Hi Gene -

Thanks for the reply. It appears to me that the error is happening on the executors. The error is thrown by this code:
sparkSession.read.format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .option("delimiter", csvPreviewRecipe.charsep.replace("\\", ""))
  .load(url)
and I'm loading an alluxio:// URL.

What's important to keep in mind is that I'm running Spark in local (non-distributed) mode, meaning my driver program creates the Spark instance itself rather than being submitted to a cluster. My understanding is that this means two things:

- The spark-defaults config file applies to an actual cluster (even one running on my local machine), not to a local, non-distributed Spark instance created within my driver.
- I am unable to pass "--driver-class-path" to the driver program, because the driver program is not being submitted to a cluster; it is run directly through sbt.

Am I wrong about either of these two assumptions?







--

Jason Boorn
Founder & CEO
Roobricks
[hidden email]
+1-303-532-6066


Re: Spark local java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found

jason
Ah ok that might help -

I understand there is some ambiguity in the term "local". When I say local, I mean that my application creates the Spark instance with master=local. This is effectively a testing instance within the same JVM, not a real cluster. I am not starting an external standalone cluster that happens to be local to my machine, and I am not using spark-submit. My understanding is that the only way to set properties on an instance of this type is through the runtime configuration.

If this is not true, can you point me to the place in the documentation where master=local configuration is detailed?  I should be able to poke through that to get what I need.
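In case it helps, here is roughly what I have in mind for setting things at builder time instead of via conf.set afterwards (the jar path is illustrative - the real one comes from my ALLUXIO_SPARK_CLIENT setting); my understanding is that classpath settings are JVM launch options, so setting them after getOrCreate is too late:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative path, not my real config value.
val alluxioJar = "/opt/alluxio/client/alluxio-client.jar"

val spark = SparkSession.builder
  .master("local[*]")
  .appName("App")
  // Supplied before the session is created, not via conf.set afterwards.
  .config("spark.driver.extraClassPath", alluxioJar)
  .config("spark.executor.extraClassPath", alluxioJar)
  .getOrCreate()
```

Although with master=local everything runs inside the sbt JVM anyway, so perhaps the only thing that really matters is whether the jar is on sbt's own classpath.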

Thanks again for the help.


Re: Spark local java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found

Gene Pang
Hi Jason,

Were you able to get the classpath correct for your JVM? If so, could you provide information on how you got it to work?

Thanks,
Gene

Re: Spark local java.lang.ClassNotFoundException: Class alluxio.hadoop.FileSystem not found

Gene Pang
Thanks for the update and the pointer to your resolution!

-Gene

On Thu, May 24, 2018 at 10:26 AM, Jason Boorn <[hidden email]> wrote:
I did - but the core issue here was not what I thought it was. I didn't realize that the client library had been split into an "fs" version and an "hdfs" version. I included the fs version without thinking about it, and then couldn't figure out what was going on. Basically my bad.
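For anyone who hits the same thing: the fix amounted to depending on the Hadoop-compatible client rather than the native-FS one. In sbt terms it looks something like this (artifact name and version are from the 1.x line - double-check them against the Alluxio docs for your release):

```scala
// build.sbt - the hdfs client provides alluxio.hadoop.FileSystem for Spark;
// the fs client exposes only the native Alluxio API and does not.
libraryDependencies += "org.alluxio" % "alluxio-core-client-hdfs" % "1.7.1"
```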




