Using IB as Alluxio connections(modified)


Jessica Ren

For higher speed, I tried to deploy Alluxio on InfiniBand (40 Gb/s). I changed the configuration file “alluxio-site.properties” as follows:

# Master node (I also changed the “workers” file under the conf directory accordingly)


1_master.jpg


# Worker node

2_worker.png


In the above configuration, yz*-ib0 stands for the IB addresses, which are specified in “/etc/hosts”.

3_hosts.png



My question is this: when the hostname is set to the Ethernet address, such as yz1, Alluxio works well over Ethernet. When I change it to the IB address yz1-ib0, it also works well as long as the Ethernet stays connected. (There were two kinds of network connections, IB and Ethernet, between these servers.) Then I disconnected the Ethernet on yz1-ib0 (master) and yz3-ib0 (worker). That is, the master is on yz1-ib0 (IB connection only), the worker is on yz3-ib0 (IB connection only), and the client is on yz2-ib0 (both IB and Ethernet connections, the latter for login purposes). Alluxio does not work on either the master or the worker, showing “network unreachable”, but it works on the client, which still has Ethernet. (Please see the following pictures.) In addition, it does not work through FUSE on the client, showing “Input/output error”.

4_tests-1.png

4_tests-2.png

4_tests-3.png


I do not quite understand. It seems that Alluxio still tries to use the Ethernet even though the IB address is reachable (I tested it with ping). Is this the right configuration? Is there some underlying mechanism that I overlooked in this scenario? Could you please give some suggestions? Thanks very much.


--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Using IB as Alluxio connections(modified)

binfan
Administrator

Hi Jessica,

Here is my hypothesis:
for some reason, the local address that was obtained is considered invalid by the check here.

I would suggest enabling the following debug log here:
LOG.debug("address: {} isLoopbackAddress: {}, with host {} {}", address,
          address.isLoopbackAddress(), address.getHostAddress(), address.getHostName());

See how to enable debug logging on a worker or server in Alluxio here,
then go to the workers and check logs/worker.out.
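
As a quick cross-check outside Alluxio, a small standalone program can print the same fields as the debug line above for every address the JVM sees on a node. This is only a sketch: the class name `ListLocalAddresses` and its helper methods are made up for illustration and are not part of Alluxio, and it does not reproduce Alluxio's own address selection logic. It uses only `java.net.NetworkInterface` and `java.net.InetAddress` from the JDK.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of Alluxio: lists every local address the JVM can see.
class ListLocalAddresses {
  // Formats one address with the same fields as the debug line in the reply.
  static String describe(InetAddress address) {
    return String.format("address: %s isLoopbackAddress: %s, with host %s %s",
        address, address.isLoopbackAddress(), address.getHostAddress(),
        address.getHostName());
  }

  // Walks all network interfaces and describes each address bound to them.
  static List<String> describeAll() throws SocketException {
    List<String> lines = new ArrayList<>();
    Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
    if (nics == null) {
      return lines; // some JDK versions return null when no interface is found
    }
    for (NetworkInterface nic : Collections.list(nics)) {
      for (InetAddress address : Collections.list(nic.getInetAddresses())) {
        lines.add(nic.getName() + " up=" + nic.isUp() + " " + describe(address));
      }
    }
    return lines;
  }

  public static void main(String[] args) throws SocketException {
    describeAll().forEach(System.out::println);
  }
}
```

Running it on the master and the worker after the Ethernet is unplugged should show whether the ib0 interface is reported as up and which host name its address maps to. Note that `getHostName()` may perform a reverse lookup, so the call can be slow on a node with no reachable DNS server.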

On Thursday, November 15, 2018 at 7:51:49 AM UTC-8, Jessica Ren wrote:
