Spark Thrift Server (Spark 2.0) show table has value with NULL in all fields

Spark Thrift Server (Spark 2.0) show table has value with NULL in all fields

Chanh Le
Hi everyone,

I have a problem when I create an external table in Spark Thrift Server (STS) and query the data.

Scenario:
Spark 2.0
Alluxio 1.2.0 
Zeppelin 0.7.0
STS start script:
/home/spark/spark-2.0.0-bin-hadoop2.6/sbin/start-thriftserver.sh \
  --master mesos://zk://master1:2181,master2:2181,master3:2181/mesos \
  --conf spark.driver.memory=5G \
  --conf spark.scheduler.mode=FAIR \
  --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 \
  --jars /home/spark/spark-2.0.0-bin-hadoop2.6/jars/alluxio-core-client-spark-1.2.0-jar-with-dependencies.jar \
  --total-executor-cores 35 \
  spark-internal \
  --hiveconf hive.server2.thrift.port=10000 \
  --hiveconf hive.metastore.warehouse.dir=/user/hive/warehouse \
  --hiveconf hive.metastore.metadb.dir=/user/hive/metadb \
  --conf spark.sql.shuffle.partitions=20

I have a file stored in Alluxio at alluxio://master2:19998/etl_info/TOPIC

Then I create a table in STS with:
CREATE EXTERNAL TABLE topic (topic_id int, topic_name_vn String, topic_name_en String, parent_id int, full_parent String, level_id int)
STORED AS PARQUET LOCATION 'alluxio://master2:19998/etl_info/TOPIC';

To compare STS with Spark, I create a temp table named topics:
spark.sqlContext.read.parquet("alluxio://master2:19998/etl_info/TOPIC").registerTempTable("topics")

Then I run the same query against both and compare the results.

The results are different: the table in STS returns NULL in every field.
Is that a bug, or did I do something wrong?
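
For anyone reproducing this, the comparison can also be run entirely inside STS (for example from a single beeline session). The following is a minimal sketch, not the exact queries used in this post: the view name topic_files is hypothetical, and it assumes the external table topic from the DDL above already exists and that the Alluxio client jar is on the STS classpath as in the start script.

```sql
-- Schema as recorded in the metastore for the external table.
DESCRIBE topic;

-- Read the same Parquet files directly, so the schema comes from the files
-- themselves rather than from the CREATE EXTERNAL TABLE statement.
-- topic_files is a hypothetical name used only for this sketch.
CREATE TEMPORARY VIEW topic_files
USING parquet
OPTIONS (path 'alluxio://master2:19998/etl_info/TOPIC');

DESCRIBE topic_files;

-- Same query against both read paths; compare the output side by side.
SELECT * FROM topic LIMIT 10;
SELECT * FROM topic_files LIMIT 10;

-- Count rows of the external table in which every declared column is NULL.
SELECT COUNT(*) AS all_null_rows
FROM topic
WHERE topic_id IS NULL AND topic_name_vn IS NULL AND topic_name_en IS NULL
  AND parent_id IS NULL AND full_parent IS NULL AND level_id IS NULL;
```

Comparing the two DESCRIBE outputs shows whether the declared columns line up with what is stored in the files, which is useful detail to attach to a bug report.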

Regards,
Chanh

--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Spark Thrift Server (Spark 2.0) show table has value with NULL in all fields

Gene Pang
Hi Chanh,

Were you able to resolve this issue? It looks like it works with Alluxio without the Thrift Server?

Thanks,
Gene

Re: Spark Thrift Server (Spark 2.0) show table has value with NULL in all fields

Chanh Le
Hi Gene,
It's a Spark 2.0 issue.
I switched to Spark 1.6.1 and it works now.

Thanks.

On Thursday, July 28, 2016 at 4:25:48 PM UTC+7, Chanh Le wrote:
Hi everyone,

I have a problem when I create an external table in Spark Thrift Server (STS) and query the data.
[...]

Re: Spark Thrift Server (Spark 2.0) show table has value with NULL in all fields

Gene Pang
Thanks for the update.

-Gene
