Delete behavior


Pan
Hello,
    I am using Alluxio 1.2.0 on the Cloudera QuickStart VM as a YARN application, with HDFS as the under filesystem.

In my design, an application writes to HDFS (I can't make it write to Alluxio, as a lot of older modules depend on it). I read the data using Alluxio. The HDFS file is then deleted using the HDFS API. However, it is still visible when browsing files in Alluxio. So I have the following queries:

  • How long will the file reside in Alluxio after it has been deleted in the underlying FS?
  • Is it necessary for me to manually delete it from Alluxio?
  • Is there a way to explicitly evict a file from Alluxio (even though it is not deleted from HDFS)?

Regards,

Pranav Nakhe

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Delete behavior

Chaomin Yu
Hi Pranav,

- Files that have been deleted in the underlying FS will remain in Alluxio until you explicitly delete or evict them from the Alluxio namespace.
- Yes, you can use Alluxio's "free" command to evict a file or a directory from Alluxio while NOT deleting it from the under storage system.
- Alternatively, you can set a TTL (time to live) on an Alluxio file. The file is automatically deleted once the current time is greater than the file's creation time plus the TTL. This delete affects both Alluxio and the under storage system.
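For reference, the two operations above might look like this from the Alluxio shell (a sketch; the path and TTL value are illustrative, so check the Command-Line-Interface docs for your version):

```
# Evict a file (or directory) from Alluxio storage without deleting it from HDFS:
./bin/alluxio fs free /log.txt

# Set a one-day TTL (in milliseconds); once creation time + TTL has passed,
# the file is deleted from both Alluxio and the under storage:
./bin/alluxio fs setTtl /log.txt 86400000
```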

Hope this helps,
Chaomin

On Fri, Aug 12, 2016 at 3:16 AM, Pan <[hidden email]> wrote:



--
Cheers,
Chaomin


Re: Delete behavior

Pan
Hello Chaomin,
   Thanks for your prompt response.

I am using the Hadoop API itself to delete a file in Alluxio. I have made changes to core-site.xml and hdfs-site.xml as specified in this link:

http://www.alluxio.org/docs/master/en/Running-Hadoop-MapReduce-on-Alluxio.html

I have the following code to delete a file in Alluxio:

import org.apache.hadoop.conf.{Configuration, Configured}
import org.apache.hadoop.fs.{FileSystem, Path}

object DeleteMain {
  def main(args: Array[String]): Unit = {
    val op: HDFSOperations = new HDFSOperations
    op.deleteHDFSFile("10.65.22.211:19998", "/log.txt")
  }
}

class HDFSOperations extends Configured {

  def deleteHDFSFile(fs: String, path: String): Boolean = {
    val conf = new Configuration
    conf.set("fs.defaultFS", fs)
    // Recursive delete of the given path
    FileSystem.get(conf).delete(new Path(path), true)
  }
}

I have the Alluxio client jar on the classpath on my client. I get the following error:

Exception in thread "main" java.io.EOFException: End of File Exception between local host is: "INW00004651/10.65.22.138"; destination host is: "quickstart.cloudera":19998; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.delete(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.delete(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1929)
at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:638)
at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:634)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:634)
at HDFSOperations.deleteHDFSFile(DeleteMain.scala:22)
at DeleteMain$.main(DeleteMain.scala:10)
at DeleteMain.main(DeleteMain.scala)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1071)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)

Is there something I am missing when using the native Hadoop API to delete Alluxio files?

Regards,
Pranav Nakhe


On Friday, August 12, 2016 at 10:59:44 PM UTC+5:30, Chaomin Yu wrote:

Re: Delete behavior

Chaomin Yu
Hi,

Can you please try replacing "10.65.22.211:19998" with "alluxio://10.65.22.211:19998"?

To access Alluxio files through the Hadoop API, you need to replace the "hdfs://" prefix with "alluxio://", so that the actual FileSystem calls go through the Alluxio code path. Otherwise, as you can see in the log, they go into the Hadoop client FileSystem code path.
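Applied to the snippet from the earlier message, the fix might look like this (a sketch; host, port, and path are the ones from that message):

```
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object DeleteMain {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration
    // "alluxio://" routes the FileSystem call through the Alluxio client
    // instead of the Hadoop HDFS client code path.
    conf.set("fs.defaultFS", "alluxio://10.65.22.211:19998")
    FileSystem.get(conf).delete(new Path("/log.txt"), true) // recursive delete
  }
}
```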

Hope this helps,
Chaomin

On Wed, Aug 17, 2016 at 4:03 AM, Pan <[hidden email]> wrote:



--
Cheers,
Chaomin


Re: Delete behavior

Pan
Thanks. That works :)

On Wed, Aug 17, 2016 at 11:51 PM, Chaomin Yu <[hidden email]> wrote:



Re: Delete behavior

Chaomin Yu
Glad to hear that the problem is solved!

On Wed, Aug 17, 2016 at 8:10 PM, Pranav Nakhe <[hidden email]> wrote:





--
Cheers,
Chaomin
