草庐IT

has_ended

全部标签

hadoop - Ended Job = job_local644049657_0014 with errors Error during job, 获取调试信息

如何找到日志文件请指导我已经检查了资源管理器的url。但是我没有找到任何日志文件这是完整的错误QueryID=hadoop_20170325120040_d54d136a-1904-4af9-8f8d-4167343db072Totaljobs=1LaunchingJob1outof1Numberofreducetasksissetto0sincethere'snoreduceoperatorJobrunningin-process(localHadoop)2017-03-2512:00:42,954Stage-0map=0%,reduce=0%EndedJob=job_local64

hadoop - OOZIE : Connection exception has occurred [ java.net.ConnectException 连接被拒绝(连接被拒绝)]

我正在尝试在以下工具的帮助下执行Oozie作业网址:https://www.safaribooksonline.com/library/view/apache-oozie/9781449369910/ch05.html执行时ooziejob-run-configtarget/example/job.properties获取错误为:Connectionexceptionhasoccurred[java.net.ConnectExceptionConnectionrefused(Connectionrefused)].Tryingafter1sec.Retrycount=1Connecti

java - hadoop java : how to know that end of reducer input is reached?

我的reducer是这样的publicstaticclassReduceextendsMapReduceBaseimplementsReducer{ListallRecords=newArrayList();publicvoidreduce(IntWritablekey,Iteratorvalues,OutputCollectoroutput,Reporterreporter)throwsIOException{allRecords.add(values.next());Text[]outputValues=newText[7];for(inti=1;i>=7;i++){outputV

hadoop - 运行 pig 脚本给出错误 : job has failed. Stop running all dependent jobs

我需要帮助来了解为什么在运行pig脚本时出现错误。但是当我在较小的数据中尝试相同的脚本时,它会成功执行。有几个类似问题的问题,但没有一个有解决方案。我的脚本是这样的:A=load‘test.txt’usingTextLoader();B=foreachAgenerateSTRSPLIT($0,’”,”’)ast;C=FILTERBBY(t.$1==2andt.$2matches‘.*xxx.*’);StoreCintotemp;错误是:org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLaunch

xml - 配置单元-site.xml : The element type "configuration" must be terminated by the matching end-tag "</configuration>"

为了练习/学习,我正在尝试在Ubuntu系统上安装Hive。我正在遵循一组预先编写的说明。它说通过转到$HIVE_HOME并运行bin/hive来测试Hive安装。当我这样做时,我得到了相当大的文本转储,但我认为最重要的一点如下:**[FatalError]hive-site.xml:2787:3:Theelementtype"configuration"mustbeterminatedbythematchingend-tag"".17/05/0610:46:12FATALconf.Configuration:errorparsingconffile:/usr/local/hive/c

hadoop - pig 错误 0 : Scalar has more than one row in the output

我有两个文件,我试图在模式匹配的基础上加入这两个文件。File1:weather.bbc.co.uk,112ads.facebook.com,113ads.amazon.co.uk,114www.sky.com,115news.bbc.co.uk,116pics.facebook.com,117File2:facebook.com,facebookbbc.co.uk,bbcnetflix.com,netflixflipkart.com,flipkartoutput:weather.bbc.co.uk,112,bbc.co.uk,bbcads.facebook.com,113,faceb

hadoop - org.apache.hive.jdbc.HiveDriver : HiveBaseResultSet has not implemented absolute()?

我刚开始使用驱动org.apache.hive.jdbc.HiveDriver(版本1.2.1forspark2)与SparkThrift服务器(STS)(引用here)java.sql.ResultSet定义方法absolute()(JavaDochere)但是HiveBaseResultSet似乎选择了不实现该方法(源码here)现在我的应用程序(构建在SmartGWT之上)正在执行一个简单的操作,我收到以下错误消息:===2017-05-1318:06:16,980[3-47]WARNRequestContext-dsRequest.execute()failed:java.sq

python - PySpark 加载 CSV AttributeError : 'RDD' object has no attribute '_get_object_id'

我正在尝试将CSV文件加载到sparkDataFrame中。这是我到目前为止所做的:#scisanSparkContext.appName="testSpark"master="local"conf=SparkConf().setAppName(appName).setMaster(master)sc=SparkContext(conf=conf)sqlContext=sql.SQLContext(sc)#csvpathtext_file=sc.textFile("hdfs:///path/to/sensordata20171008223515.csv")df=sqlContext.l

hadoop - 如何解决 YARN 日志中的 Log aggregation has not completed or is not enabled 错误

我正在使用EMR5.4并将spark作业提交给Yarn当我尝试使用yarnlogs-applicationIdapplication_1528461193301_0001检索日志时,出现以下错误:18/06/0812:38:01INFOclient.RMProxy:ConnectingtoResourceManageratip-10-0-182-144.eu-west-1.compute.internal/10.0.182.144:8032s3://xxx/apps/root/logs/application_1528461193301_0001doesnotexist.Logaggr

scala - Spark : Calculate event end time on 30-minute intervals based on start time and duration values in previous rows

我有一个带有event_time字段的文件,每条记录每30分钟生成一次,并指示事件持续了多少秒。示例:Event_time|event_duration_seconds09:00|80009:30|180010:00|270012:00|100013:00|1000我需要将连续的事件转换为一个具有持续时间的事件。输出文件应如下所示:Event_time_start|event_time_end|event_duration_seconds09:00|11:00|530012:00|12:30|100013:00|13:30|1000ScalaSpark中是否有一种方法可以将数据帧记录与