problem in Alluxio with secure HDFS

problem in Alluxio with secure HDFS

张文歆
Alluxio version: 1.7.0
UFS: secure HDFS
OS version: CentOS Linux release 7.1.1503
Java version: 1.8.0_131

Problem:
I use secure HDFS as the under file system.
The master and workers share the same HDFS configuration files.
I use crontab on all Alluxio nodes to renew the TGT.
However, this only works on the master node. The worker node throws the exception below when I execute copyFromLocal in the shell.

2018-08-29 11:10:42,993 INFO  RetryInvocationHandler - Exception while invoking ClientNamenodeProtocolTranslatorPB.create over instance-91gj2ol0.novalocal/192.168.0.5:8020 after 2 failover attempts. Trying to failover immediately.
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "instance-0mgb5shh.novalocal/192.168.16.4"; destination host is: "instance-91gj2ol0.novalocal":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1485)
        at org.apache.hadoop.ipc.Client.call(Client.java:1427)
        at org.apache.hadoop.ipc.Client.call(Client.java:1337)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy46.create(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:293)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
        at com.sun.proxy.$Proxy47.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:246)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1257)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1199)
        at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:472)
        at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:469)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:469)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:410)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:928)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:806)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:795)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:596)
        at alluxio.underfs.hdfs.HdfsUnderFileSystem.createDirect(HdfsUnderFileSystem.java:184)
        at alluxio.underfs.hdfs.HdfsUnderFileSystem.create(HdfsUnderFileSystem.java:172)
        at alluxio.underfs.UnderFileSystemWithLogging$5.call(UnderFileSystemWithLogging.java:121)
        at alluxio.underfs.UnderFileSystemWithLogging$5.call(UnderFileSystemWithLogging.java:118)
        at alluxio.underfs.UnderFileSystemWithLogging.call(UnderFileSystemWithLogging.java:556)
        at alluxio.underfs.UnderFileSystemWithLogging.create(UnderFileSystemWithLogging.java:118)
        at alluxio.worker.netty.UfsFileWriteHandler$UfsFilePacketWriter.createUfsFile(UfsFileWriteHandler.java:155)
        at alluxio.worker.netty.UfsFileWriteHandler$UfsFilePacketWriter.completeRequest(UfsFileWriteHandler.java:110)
        at alluxio.worker.netty.UfsFileWriteHandler$UfsFilePacketWriter.completeRequest(UfsFileWriteHandler.java:89)
        at alluxio.worker.netty.AbstractWriteHandler$PacketWriter.runInternal(AbstractWriteHandler.java:312)
        at alluxio.worker.netty.AbstractWriteHandler$PacketWriter.run(AbstractWriteHandler.java:244)
        at alluxio.worker.netty.UfsFileWriteHandler$UfsFilePacketWriter.run(UfsFileWriteHandler.java:89)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
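
One thing worth checking on each node is whether the account that runs the Alluxio worker actually holds a usable ticket cache after the cron job has run. The sketch below is plain Python, not part of Alluxio, and assumes MIT Kerberos, where `klist -s` prints nothing and exits with status 0 only if the credentials cache can be read and has not expired. The function name `has_valid_ticket` is made up for illustration.

```python
import subprocess


def has_valid_ticket(command=("klist", "-s")):
    """Return True if `command` exits with status 0.

    Assumption: with MIT Kerberos, `klist -s` is silent and exits 0 only
    when the credentials cache is readable and not expired.
    """
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The command is not installed or not on PATH for this user.
        return False
    return result.returncode == 0
```

Calling `has_valid_ticket()` as the same user that runs the worker process would show whether the renewal reaches that user's cache. It says nothing about which credentials the worker JVM itself uses, so a passing check narrows the problem down but does not rule it out.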

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: problem in Alluxio with secure HDFS

Gene Pang
Hi,

Do the workers and masters share the same alluxio configuration? How many workers and masters do you have deployed, and where are they deployed?

Thanks,
Gene


Re: problem in Alluxio with secure HDFS

张文歆
Hi,

The workers and the master share the same Alluxio configuration.
I deployed 1 master and 2 workers on a 3-machine cluster.

In reply to Gene Pang's post of Friday, August 31, 2018, 10:16:22 PM UTC+8.

Re: problem in Alluxio with secure HDFS

Gene Pang
What are the Kerberos principals you are using on each host?

If you try starting an Alluxio worker on the same node as the Alluxio master (the node that works), will that worker be able to connect to HDFS? To test this out, you can stop all the workers, and just start a single worker on that node with "alluxio-start.sh worker".

Thanks,
Gene

Re: problem in Alluxio with secure HDFS

Deema Yatsyuk
Hello,
I do not use Kerberos at all.
I can also successfully open Alluxio via
from any host.
But when I run this sample spark-shell script:
val s = sc.textFile("alluxio://hn0-nsd-hd.yawj1ew5rq4e1biarf4nr5ngsc.ax.internal.cloudapp.net:19998/LICENSE")
val double = s.map(line => line + line)
double.saveAsTextFile("alluxio://hn0-nsd-hd.yawj1ew5rq4e1biarf4nr5ngsc.ax.internal.cloudapp.net:19998/LICENSE2")

I have the same issue.



In reply to Gene Pang's post of Tue, Sep 4, 2018, 5:20 PM.

Re: problem in Alluxio with secure HDFS

Gene Pang
The error is an HDFS error, which suggests that the HDFS client cannot log into Kerberos to connect to HDFS. Have you consulted the documentation for secure HDFS here? http://www.alluxio.org/docs/1.8/en/Configuring-Alluxio-with-secure-HDFS.html

Thanks,
Gene

Re: problem in Alluxio with secure HDFS

Deema Yatsyuk
Hello,
But I do not have Kerberos at all.

In reply to Gene Pang's post of Wed, Sep 5, 2018, 5:16 PM.

Re: problem in Alluxio with secure HDFS

Gene Pang
Hi,

Could you paste in the error you are seeing?

Thanks,
Gene

Re: problem in Alluxio with secure HDFS

张文歆
In reply to this post by Gene Pang
Hi,
I started a worker on the same node as the master, and the result is the same. It works well right after I restart it. But after one day, "mkdir" still works, while copying from local fails. The error is:

Channel to instance-0mgb5shh.novalocal/192.168.16.4:29999: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "instance-0mgb5shh.novalocal/192.168.16.4"; destination host is: "instance-91gj2ol0.novalocal":8020;

In reply to Gene Pang's post of Tuesday, September 4, 2018, 10:20:15 PM UTC+8.

Re: problem in Alluxio with secure HDFS

Gene Pang
If you have a setup with only 1 worker (on the same master node), does it work with copy from local?

In the error message I see multiple hosts: instance-0mgb5shh.novalocal and instance-91gj2ol0.novalocal . Can you try with only 1 worker, and everything on the master?

Thanks,
Gene

Re: problem in Alluxio with secure HDFS

张文歆
I have only 1 worker. The host "instance-91gj2ol0.novalocal":8020 is the HDFS NameNode.
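
The two hosts in the exception text can be pulled apart mechanically, which makes it easier to see that the first is the node making the call and the second is the HDFS endpoint it is calling. This is a minimal Python sketch, assuming the message keeps the "Host Details" layout shown in this thread; `parse_host_details` is a hypothetical helper name.

```python
import re

# Matches the tail of the Hadoop IPC exception message quoted in this thread.
HOST_DETAILS = re.compile(
    r'local host is: "(?P<local>[^"]+)"; '
    r'destination host is: "(?P<dest_host>[^"]+)":(?P<dest_port>\d+)'
)


def parse_host_details(message):
    """Return the calling host and the destination endpoint, or None."""
    match = HOST_DETAILS.search(message)
    if match is None:
        return None
    return {
        "local": match.group("local"),
        "destination": match.group("dest_host"),
        "port": int(match.group("dest_port")),
    }
```

Applied to the error above, the local host is the Alluxio worker node and the destination on port 8020 is the NameNode, so the message describes a single worker talking to HDFS rather than two workers.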

In reply to Gene Pang's post of Tuesday, September 18, 2018, 1:10:28 AM UTC+8.

Re: problem in Alluxio with secure HDFS

Gene Pang
After the expired ticket, is the master still able to connect to HDFS?

What is your alluxio-site.properties configuration?

Thanks,
Gene
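
Before posting the configuration, it may help to check whether the Kerberos login properties are set for both roles. The Python sketch below assumes the four property names given in the Alluxio secure HDFS documentation linked earlier in the thread (`alluxio.master.keytab.file`, `alluxio.master.principal`, `alluxio.worker.keytab.file`, `alluxio.worker.principal`); please verify them against the docs for your version. The parser is deliberately simplified and only understands `key=value` lines, and `missing_kerberos_keys` is a hypothetical helper name.

```python
# Assumed property names; confirm against the Alluxio docs for your version.
REQUIRED_KEYS = (
    "alluxio.master.keytab.file",
    "alluxio.master.principal",
    "alluxio.worker.keytab.file",
    "alluxio.worker.principal",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_kerberos_keys(text):
    """Return the required keys that are absent or set to an empty value."""
    props = parse_properties(text)
    return [key for key in REQUIRED_KEYS if not props.get(key)]
```

Passing in the contents of alluxio-site.properties from each node and comparing the results would show quickly whether the worker entries differ from the master entries.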
