Alluxio class not found error in Spark EMR cluster

Jais Sebastian
Hi,
We are setting "alluxio" as the default file system scheme when running a Spark job on EMR. To do this, we added the following properties to the core-site EMR configuration:
core-site fs.AbstractFileSystem.alluxio.impl alluxio.hadoop.AlluxioFileSystem
core-site fs.alluxio.impl alluxio.hadoop.FileSystem
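
For readers who supply EMR settings as a configurations JSON, the two core-site properties above might be expressed roughly as follows. This is only a sketch: the classification/properties layout mirrors the JSON posted later in this thread, and the helper name core_site_classification is made up for illustration.

```python
import json


def core_site_classification():
    """Return a core-site classification holding the two Alluxio properties."""
    return {
        "classification": "core-site",
        "properties": {
            # Property names and class names are copied from the post above.
            "fs.AbstractFileSystem.alluxio.impl": "alluxio.hadoop.AlluxioFileSystem",
            "fs.alluxio.impl": "alluxio.hadoop.FileSystem",
        },
    }


if __name__ == "__main__":
    # Print the classification as a one-element configurations list.
    print(json.dumps([core_site_classification()], indent=2))
```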


We also added the Alluxio client jar (with dependencies) to the Spark driver and executor classpaths. In addition, we downloaded the client jar with dependencies in a bootstrap action and added it to the YARN application classpath:
core-site yarn.application.classpath $HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*, $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*, $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*, $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*, /usr/lib/hadoop-lzo/lib/*, /home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar, /usr/share/aws/emr/emrfs/conf, /usr/share/aws/emr/emrfs/lib/*, /usr/share/aws/emr/emrfs/auxlib/*, /usr/share/aws/emr/lib/*, /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar, /usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar, /usr/lib/spark/yarn/lib/datanucleus-api-jdo.jar, /usr/lib/spark/yarn/lib/datanucleus-core.jar, /usr/lib/spark/yarn/lib/datanucleus-rdbms.jar, /usr/share/aws/emr/cloudwatch-sink/lib/*, /usr/share/aws/aws-java-sdk/*

We still get a class-not-found error. Our suspicion is that, because we set the defaultFS to alluxio, all the dependent
jars are copied into Alluxio and referenced with the alluxio scheme, yet the Alluxio client jar itself is somehow not on the classpath,
which causes the error.
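
One quick sanity check for this kind of error is to confirm that the jar being distributed really contains the classes named in core-site. The sketch below relies only on the fact that a jar is a zip archive in which a class such as alluxio.hadoop.FileSystem is stored as alluxio/hadoop/FileSystem.class; the function name jar_contains_class is hypothetical, and the check says nothing about whether the jar is actually on the runtime classpath.

```python
import sys
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar (a zip archive) has an entry for class_name."""
    # A fully qualified class name maps to a path inside the archive.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_jar.py <path-to-client-jar>
    for name in ("alluxio.hadoop.FileSystem", "alluxio.hadoop.AlluxioFileSystem"):
        print(name, jar_contains_class(sys.argv[1], name))
```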

The logs are attached for your reference.

Regards,
Jais

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Attachment: alluxio-spark-emr-error.txt (12K)

Re: Alluxio class not found error in Spark EMR cluster

Andrew Audibert
Hi Jais,

Could you try setting
spark.driver.extraClassPath /<PATH_TO_ALLUXIO>/client/alluxio-1.8.0-client.jar
spark.executor.extraClassPath /<PATH_TO_ALLUXIO>/client/alluxio-1.8.0-client.jar

in spark-defaults.conf? It looks like the Spark application doesn't have the client on its classpath.
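
As a rough illustration of the suggestion above, the two properties can be rendered in the "key value" form that spark-defaults.conf uses. The function name spark_defaults_lines is invented for this sketch, and the jar path is the placeholder from the message, not a real location.

```python
def spark_defaults_lines(client_jar):
    """Format the two classpath properties as spark-defaults.conf lines."""
    props = {
        "spark.driver.extraClassPath": client_jar,
        "spark.executor.extraClassPath": client_jar,
    }
    # spark-defaults.conf takes one "key value" pair per line.
    return [f"{key} {value}" for key, value in props.items()]


if __name__ == "__main__":
    # Placeholder path copied from the message; substitute the real location.
    for line in spark_defaults_lines("/<PATH_TO_ALLUXIO>/client/alluxio-1.8.0-client.jar"):
        print(line)
```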

- Andrew


Re: Alluxio class not found error in Spark EMR cluster

binfan
Administrator
Hi Jais,

Can you take a look at this article and see if it helps?

http://www.alluxio.org/docs/master/en/Debugging-Guide.html#q-why-do-i-see-exceptions-like-javalangruntimeexception-javalangclassnotfoundexception-class-alluxiohadoopfilesystem-not-found

- Bin


Re: Alluxio class not found error in Spark EMR cluster

Jais Sebastian
Hi,

We added the following configurations and the issue is solved:

  {
    "classification": "spark-defaults",
    "properties": {
      "spark.eventLog.dir": "alluxio://<alluxio>/var/log/spark/apps",
      "spark.history.fs.logDirectory": "<alluxio>/var/log/spark/apps",
      "spark.jars": "<path>/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar",
      "spark.sql.warehouse.dir": "alluxio://<alluxio>/user/spark/warehouse",
      "spark.driver.extraClassPath": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar",
      "spark.executor.extraClassPath": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar"
    }
  },
  {
    "configurations": [
      {
        "classification": "export",
        "properties": {
          "HADOOP_CLASSPATH": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar:${HADOOP_CLASSPATH}"
        }
      }
    ],
    "classification": "hadoop-env",
    "properties": {}
  }
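
For anyone who prefers to generate this JSON rather than edit it by hand, a minimal sketch follows. It covers only the classpath-related entries from the configuration above (the event log, history and warehouse settings are left out for brevity), and the names JAR and build_configurations are hypothetical.

```python
import json

# Jar location as posted in this thread; adjust for your own cluster.
JAR = "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar"


def build_configurations(jar=JAR):
    """Build the spark-defaults and hadoop-env entries that reference the jar."""
    spark_defaults = {
        "classification": "spark-defaults",
        "properties": {
            "spark.driver.extraClassPath": jar,
            "spark.executor.extraClassPath": jar,
        },
    }
    hadoop_env = {
        "classification": "hadoop-env",
        "properties": {},
        "configurations": [
            {
                "classification": "export",
                # Prepend the jar and keep whatever HADOOP_CLASSPATH already holds.
                "properties": {"HADOOP_CLASSPATH": jar + ":${HADOOP_CLASSPATH}"},
            }
        ],
    }
    return [spark_defaults, hadoop_env]


if __name__ == "__main__":
    print(json.dumps(build_configurations(), indent=2))
```

Keeping the jar path in one variable avoids the three copies of it drifting apart when the client version changes.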



Re: Alluxio class not found error in Spark EMR cluster

Bin Fan
Thanks for the update.

Could you help us by clarifying which part you added to solve the problem? Was it spark.jars, or

"properties": {
  "HADOOP_CLASSPATH": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar:${HADOOP_CLASSPATH}"
}

or something else?

Best,

- Bin



Re: Alluxio class not found error in Spark EMR cluster

Jais Sebastian
Both were required for us. HADOOP_CLASSPATH resolved the issues that occurred while downloading the dependent jars. But Spark tasks were still failing later, so I added spark.driver.extraClassPath as well.
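
Since both pieces turned out to be needed, a small pre-flight check over a configurations list can catch the case where one of them is missing. This is a sketch under assumptions: the function name missing_classpath_settings is hypothetical, the list is assumed to have the same nested classification/properties/configurations shape as the JSON posted earlier in the thread, and the three keys checked are simply the classpath settings that appear in that JSON.

```python
def missing_classpath_settings(configurations, jar):
    """Return the classpath settings from this thread that do not mention jar."""
    found = {}

    def walk(items):
        # Collect properties at every nesting level (hadoop-env nests "export").
        for item in items:
            found.update(item.get("properties", {}))
            walk(item.get("configurations", []))

    walk(configurations)
    required = (
        "HADOOP_CLASSPATH",
        "spark.driver.extraClassPath",
        "spark.executor.extraClassPath",
    )
    return [key for key in required if jar not in found.get(key, "")]


if __name__ == "__main__":
    only_hadoop_env = [
        {
            "classification": "hadoop-env",
            "properties": {},
            "configurations": [
                {
                    "classification": "export",
                    "properties": {"HADOOP_CLASSPATH": "/a.jar:${HADOOP_CLASSPATH}"},
                }
            ],
        }
    ]
    # With only hadoop-env present, the two Spark settings are reported missing.
    print(missing_classpath_settings(only_hadoop_env, "/a.jar"))
```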


Re: Alluxio class not found error in Spark EMR cluster

binfan
Administrator
Thanks for letting us know.
I will see how to address this in the Alluxio documentation.

- Bin

On Wednesday, September 19, 2018 at 5:24:09 AM UTC-7, Jais Sebastian wrote:
Both were required of us. HADOOP_CLASSPATH resolved the issues while downloading dependent jars. But later spark tasks were failing, so I added spark.driver.classpath as well.

On Wednesday, September 19, 2018 at 10:47:51 AM UTC+5:30, Bin Fan wrote:
Thanks for the update.

Could you help us clarify which part you added to solve the problem: spark.jars, or

"properties": {
  "HADOOP_CLASSPATH": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar:${HADOOP_CLASSPATH}"
}

or something else?

Best,

- Bin


On Tuesday, September 18, 2018 at 7:32:31 AM UTC-7, Jais Sebastian wrote:
Hi,

We added the following configurations and the issue was solved:

  {
    "classification": "spark-defaults",
    "properties": {
      "spark.eventLog.dir": "alluxio://<alluxio>/var/log/spark/apps",
      "spark.history.fs.logDirectory": "<alluxio>/var/log/spark/apps",
      "spark.jars": "<path>/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar",
      "spark.sql.warehouse.dir": "alluxio://<alluxio>/user/spark/warehouse",
      "spark.driver.extraClassPath": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar",
      "spark.executor.extraClassPath": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar"
    }
  },
  {
    "configurations": [
      {
        "classification": "export",
        "properties": {
          "HADOOP_CLASSPATH": "/home/hadoop/contents/alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar:${HADOOP_CLASSPATH}"
        }
      }
    ],
    "classification": "hadoop-env",
    "properties": {}
  }
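
Since the same jar path appears in three places in the configuration above, here is a minimal sketch of generating the classpath-related entries from a single variable so the values cannot drift apart. It covers only the two extraClassPath settings and the HADOOP_CLASSPATH export; the helper name build_classpath_configurations is hypothetical, and the jar path is simply the one quoted in this thread.

```python
import json

# Jar location quoted in this thread; adjust to wherever your bootstrap action puts it.
ALLUXIO_CLIENT_JAR = (
    "/home/hadoop/contents/"
    "alluxio-core-client-runtime-1.8.1-SNAPSHOT-jar-with-dependencies.jar"
)


def build_classpath_configurations(jar_path):
    """Return EMR configuration objects that reference jar_path.

    Hypothetical helper for illustration; not part of any EMR or Alluxio API.
    """
    spark_defaults = {
        "classification": "spark-defaults",
        "properties": {
            # Same jar for the driver and the executors.
            "spark.driver.extraClassPath": jar_path,
            "spark.executor.extraClassPath": jar_path,
        },
    }
    hadoop_env = {
        "classification": "hadoop-env",
        "configurations": [
            {
                "classification": "export",
                "properties": {
                    # Prepend the jar and keep whatever HADOOP_CLASSPATH already held.
                    "HADOOP_CLASSPATH": jar_path + ":${HADOOP_CLASSPATH}",
                },
            }
        ],
        "properties": {},
    }
    return [spark_defaults, hadoop_env]


# Print the JSON array that would be supplied as the cluster configuration.
print(json.dumps(build_classpath_configurations(ALLUXIO_CLIENT_JAR), indent=2))
```

The other spark-defaults properties from the post (event log directory, warehouse directory, spark.jars) would have to be added by hand, because their <alluxio> and <path> placeholders are not filled in here.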


On Tuesday, September 18, 2018 at 3:47:07 AM UTC+5:30, Bin Fan wrote:
Hi Jais,

Can you take a look at this article and see if it helps?

http://www.alluxio.org/docs/master/en/Debugging-Guide.html#q-why-do-i-see-exceptions-like-javalangruntimeexception-javalangclassnotfoundexception-class-alluxiohadoopfilesystem-not-found

- Bin

On Monday, September 17, 2018 at 11:19:50 AM UTC-7, Andrew Audibert wrote:
Hi Jais,

Could you try setting
spark.driver.extraClassPath /<PATH_TO_ALLUXIO>/client/alluxio-1.8.0-client.jar
spark.executor.extraClassPath /<PATH_TO_ALLUXIO>/client/alluxio-1.8.0-client.jar

in spark-defaults.conf? It looks like the spark application doesn't have the client on its classpath.
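
Before changing classpath settings, it may be worth confirming that the jar you point extraClassPath at actually contains the class named in the error. The sketch below assumes the jar is readable on the node where you run it and relies only on the fact that a jar is a zip archive; jar_contains_class is a hypothetical helper, not an Alluxio or Spark API.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has an entry for the fully qualified class_name."""
    # A class such as alluxio.hadoop.FileSystem is stored in the jar
    # as the entry alluxio/hadoop/FileSystem.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, calling jar_contains_class with the client jar path and "alluxio.hadoop.FileSystem" should return True if the jar is the right one. If it returns False, the problem is the jar itself rather than the classpath configuration.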

- Andrew

--
Andrew Audibert