
scala - Creating an RDD to access Cassandra DB using the Spark Hadoop API

coder 2024-01-06 (original post)

I am running a single-node Cassandra 2.0.3 and Apache Spark 0.8.0-incubating.

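I wrote a Scala program that uses the Spark Hadoop API to create an RDD for accessing Cassandra DB.

Which other environment variables should I set in bashrc for Spark? I am currently using the following configuration in spark-env.sh: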

export SPARK_MASTER_IP="10.0.3.15"

export SPARK_MASTER_PORT="7077"

export SCALA_HOME="/home/Desktop/CD/scala-2.9.3"

export SPARK_WORKER_MEMORY=1g

export SPARK_WORKER_INSTANCES=1

export SPARK_WORKER_DIR="/home/Desktop/CD/spark-0.8.0-incubating/sparkdata"
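
For reference, the driver program has to connect to the standalone master using a URL of the form spark://host:port that matches these settings. The following is a minimal sketch, not part of the original program: the MasterUrl object and its fromEnv helper are hypothetical names, and the "localhost" fallback is only an illustrative default. It avoids string interpolation so that it also compiles with the Scala 2.9.3 named in SCALA_HOME.

```scala
// Hypothetical helper (not from the original post): builds the standalone
// master URL from spark-env.sh-style settings.
object MasterUrl {
  // The fallbacks are illustrative assumptions; 7077 is the port used in the question.
  def fromEnv(env: Map[String, String]): String = {
    val host = env.getOrElse("SPARK_MASTER_IP", "localhost")
    val port = env.getOrElse("SPARK_MASTER_PORT", "7077")
    "spark://" + host + ":" + port
  }

  def main(args: Array[String]): Unit = {
    // sys.env exposes the process environment as an immutable Map.
    println(fromEnv(sys.env))
  }
}
```

The resulting string is what would be passed as the master argument when the SparkContext is constructed.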

My sample Scala code is as follows:

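    import java.nio.ByteBuffer
    import org.apache.cassandra.hadoop.{ColumnFamilyInputFormat, ConfigHelper}
    import org.apache.cassandra.hadoop.cql3.{CqlConfigHelper, CqlPagingInputFormat}
    import org.apache.hadoop.mapreduce.Job

    // sc is assumed to be an existing SparkContext
    val job = new Job()
    job.setInputFormatClass(classOf[ColumnFamilyInputFormat])
    val host: String = "localhost"
    val port: String = "9160"
    // Point the input format at the Cassandra RPC endpoint and the demodb.emp table
    ConfigHelper.setInputInitialAddress(job.getConfiguration(), host)
    ConfigHelper.setInputRpcPort(job.getConfiguration(), port)
    ConfigHelper.setInputColumnFamily(job.getConfiguration(), "demodb", "emp")
    ConfigHelper.setInputPartitioner(job.getConfiguration(), "Murmur3Partitioner")
    // Restrict the columns and rows that are read through CQL
    CqlConfigHelper.setInputColumns(job.getConfiguration(), "empid,deptid,first_name,last_name")
    CqlConfigHelper.setInputWhereClauses(job.getConfiguration(), "empid=104")

    // Make a new Hadoop RDD; CqlPagingInputFormat yields java.util.Map keys and values
    val casRdd = sc.newAPIHadoopRDD(job.getConfiguration(),
                            classOf[CqlPagingInputFormat],
                            classOf[java.util.Map[String, ByteBuffer]],
                            classOf[java.util.Map[String, ByteBuffer]])

    println(casRdd.count())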
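
Each element of casRdd is a pair of maps from column name to raw ByteBuffer, so the values have to be decoded before they are useful. The following is a minimal sketch using only the JDK. The ColumnDecoder object is a hypothetical name, and it assumes empid is a CQL int (4 bytes, big-endian) and first_name is CQL text (UTF-8), which the post does not confirm.

```scala
import java.nio.ByteBuffer
import java.nio.charset.Charset

// Hypothetical helper (not from the original post) for decoding column values.
object ColumnDecoder {
  private val Utf8 = Charset.forName("UTF-8")

  // Absolute read of a 4-byte big-endian int; the buffer's position is not moved.
  def asInt(buf: ByteBuffer): Int = buf.getInt(buf.position())

  // Decodes the remaining bytes as UTF-8; duplicate() leaves the original buffer untouched.
  def asString(buf: ByteBuffer): String = Utf8.decode(buf.duplicate()).toString

  def main(args: Array[String]): Unit = {
    // Stand-in for one row's value map; a real row would come from casRdd.
    val row: java.util.Map[String, ByteBuffer] = new java.util.HashMap[String, ByteBuffer]()
    row.put("empid", ByteBuffer.allocate(4).putInt(0, 104))
    row.put("first_name", ByteBuffer.wrap("Jane".getBytes(Utf8)))
    println(asInt(row.get("empid")) + " " + asString(row.get("first_name")))
  }
}
```

Reading through duplicate() or an absolute index keeps the buffers reusable, which matters if the same row is decoded more than once.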

However, when I run this job on the Spark master, it does not complete and produces the following log.

14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException
java.lang.RuntimeException
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:661)
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.<init>(CqlPagingRecordReader.java:297)
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:163)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:86)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:74)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
    at org.apache.spark.scheduler.ResultTask.run(ResultTask.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 12 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 2 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 1]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 13 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 1 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 2]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 14 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 5 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 3]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 15 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 11 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 4]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 16 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 8 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 5]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 17 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 3 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 6]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 18 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 6 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 7]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 19 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 9 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 8]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 20 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 7 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 9]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 21 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 10 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 10]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 22 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 4 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 11]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 23 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 12 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 12]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 24 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 13 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 13]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 25 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 14 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 14]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 26 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 15 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 15]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 27 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 16 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 16]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 28 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 17 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 17]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 29 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 18 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 18]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 30 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 19 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 19]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 31 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 20 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 20]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 32 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 21 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 21]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 33 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 22 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 22]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 34 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 23 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 23]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 35 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 24 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 24]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 36 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 25 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 25]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 37 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 26 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 26]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 38 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 27 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 27]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 39 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 29 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 28]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 40 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 28 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 29]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 41 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 30 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 30]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 42 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 31 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 31]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 43 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 32 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 32]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 44 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 33 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 33]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 45 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 34 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 34]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 46 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 35 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 35]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 47 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 36 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 36]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 48 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 37 (task 0.0:2)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 37]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 49 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 38 (task 0.0:1)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 38]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:1 as TID 50 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:1 as 2144 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 39 (task 0.0:5)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 39]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:5 as TID 51 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:5 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 40 (task 0.0:8)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 40]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:8 as TID 52 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:8 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 41 (task 0.0:11)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 41]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:11 as TID 53 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:11 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 42 (task 0.0:3)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 42]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:3 as TID 54 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:3 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 43 (task 0.0:6)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 43]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:6 as TID 55 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:6 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 44 (task 0.0:9)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 44]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:9 as TID 56 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:9 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 45 (task 0.0:7)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 45]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:7 as TID 57 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:7 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 46 (task 0.0:10)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 46]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:10 as TID 58 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:10 as 2146 bytes in 1 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 47 (task 0.0:4)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 47]
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Starting task 0.0:4 as TID 59 on executor 0: 10.0.0.100 (PROCESS_LOCAL)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:4 as 2146 bytes in 0 ms
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Lost TID 48 (task 0.0:0)
14/01/03 16:34:16 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.RuntimeException [duplicate 48]
14/01/03 16:34:16 ERROR cluster.ClusterTaskSetManager: Task 0.0:0 failed more than 4 times; aborting job
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Remove TaskSet 0.0 from pool 
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 51 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 49 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 50 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 52 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 53 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 54 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 52 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 51 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 55 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 53 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 56 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 54 because its task set is gone
14/01/03 16:34:16 INFO scheduler.DAGScheduler: Failed to run count at myown.scala:75
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 57 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 55 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 56 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 58 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 59 because its task set is gone
[error] (run-main) org.apache.spark.SparkException: Job failed: Task 0.0:0 failed more than 4 times
org.apache.spark.SparkException: Job failed: Task 0.0:0 failed more than 4 times
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:379)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
    at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 59 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 57 because its task set is gone
14/01/03 16:34:16 INFO cluster.ClusterScheduler: Ignoring update from TID 58 because its task set is gone

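So basically I am confused and struggling to solve this, because I cannot tell whether it is a problem in my Scala code, in Spark master/worker communication, or in the Spark environment configuration.

Please guide me.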

Best answer

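This answer may help you: Spark with Cassandra input/output

DataStax has announced an official Cassandra driver for Spark. With this solution you do not need to implement the Hadoop interfaces, because it is a direct bridge between Cassandra and Spark.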
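
As a rough illustration of that approach, the sketch below reads the same demodb.emp table through the DataStax spark-cassandra-connector instead of the Hadoop input formats. It is a sketch under assumptions, not the answer's own code: it needs the connector on the classpath and a Spark release the connector supports, which is newer than the 0.8.0-incubating build named in the SPARK_WORKER_DIR path. The master URL, application name, and the int type of empid are also assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object ConnectorSketch {
  def main(args: Array[String]): Unit = {
    // Master URL and Cassandra host are taken from the question's settings;
    // the application name is arbitrary.
    val conf = new SparkConf()
      .setMaster("spark://10.0.3.15:7077")
      .setAppName("cassandra-connector-sketch")
      .set("spark.cassandra.connection.host", "localhost")
    val sc = new SparkContext(conf)

    // cassandraTable returns an RDD of CassandraRow; select limits the columns fetched.
    val emp = sc.cassandraTable("demodb", "emp")
      .select("empid", "deptid", "first_name", "last_name")
      // Filtering on the Spark side; assumes empid is stored as an int.
      .filter(row => row.getInt("empid") == 104)

    println(emp.count())
    sc.stop()
  }
}
```

Rows come back as typed CassandraRow objects, so no manual ByteBuffer decoding or Hadoop job configuration is involved.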

Regarding scala - Creating an RDD to access Cassandra DB using the Spark Hadoop API, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20920776/
