started

amazon-ec2 - Hadoop on Amazon EC2: Job tracker not starting properly

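We run Hadoop on an Amazon EC2 cluster. We start the master and the slaves, attach EBS volumes, and finally wait for the Hadoop JobTracker, TaskTracker, and so on to start, with a timeout of 3600 seconds. We have noticed that about 50% of the time the JobTracker fails to start before the timeout. The reason is that HDFS has not initialized properly and is still in safe mode, so the JobTracker cannot start. When I tried to ping the slaves manually, I noticed a few connectivity issues between the nodes on EC2. Has anyone run into a similar problem and knows how to solve it? Best answer: I am not sure this problem is related to Amazon EC2. I run into it often as well, even though I have a pseudo-distributed installation on my own machine. In these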
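
The question describes waiting up to 3600 seconds while HDFS is still in safe mode. A minimal sketch of that wait, polling the safe-mode state until it clears or the timeout expires, might look like the following. It assumes the `hadoop` launcher is on `PATH` and that `hadoop dfsadmin -safemode get` prints a line containing "Safe mode is OFF" once HDFS is ready (newer releases use `hdfs dfsadmin` instead); the function names are hypothetical and not taken from the original post.

```python
import subprocess
import time


def safe_mode_is_off(report: str) -> bool:
    """Return True if the dfsadmin report says safe mode is off."""
    return "safe mode is off" in report.lower()


def query_safe_mode(command=("hadoop", "dfsadmin", "-safemode", "get")) -> str:
    """Run the safe-mode query and return its standard output.

    Assumption: the command exists on this machine. On newer Hadoop
    releases pass ("hdfs", "dfsadmin", "-safemode", "get") instead.
    """
    result = subprocess.run(list(command), capture_output=True, text=True, check=False)
    return result.stdout


def wait_for_hdfs(query=query_safe_mode, timeout_s=3600.0, interval_s=10.0,
                  sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll until HDFS leaves safe mode; return False if the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if safe_mode_is_off(query()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A startup script could call `wait_for_hdfs()` before launching the JobTracker and log the last safe-mode report when it returns `False`. Hadoop also ships a blocking variant, `dfsadmin -safemode wait`, which may be enough when no custom timeout handling is needed.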

hadoop - start-all.sh fails in Apache Hadoop single-node setup

I installed Apache Hadoop 1.0.4 on Ubuntu 12.04. I followed the instructions at http://hadoop.apache.org/docs/stable/single_node_setup.html and reached the "Execution" section. I failed at $ bin/start-all.sh with the error messages below. My username is anson. $ start-all.sh mkdir: cannot create directory `/var/log/hadoop/anson': Permission denied chown: cannot access `/var/log/hadoop/anson': No such file or dir

start-all hadoop anson directory
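
The `mkdir: ... Permission denied` message above suggests that the user running `start-all.sh` cannot create the per-user log directory. As a rough diagnostic sketch (not a fix from the original thread), the check below reports whether the current user could create a given directory by finding its nearest existing ancestor and testing write access. The helper names are hypothetical; pointing the `HADOOP_LOG_DIR` variable in `hadoop-env.sh` at a directory that passes this check is one common way around such errors.

```python
import os


def nearest_existing_ancestor(path: str) -> str:
    """Walk upward from `path` until an existing filesystem entry is found."""
    current = os.path.abspath(path)
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def can_create_dir(path: str) -> bool:
    """Return True if `path` is, or could be created as, a writable directory.

    Note: os.access reports the permissions of the calling user, so the
    result differs between a normal account and root.
    """
    ancestor = nearest_existing_ancestor(path)
    return os.path.isdir(ancestor) and os.access(ancestor, os.W_OK | os.X_OK)
```

For the case in the question, `can_create_dir("/var/log/hadoop/anson")` would be expected to return `False` when run as `anson`, matching the error.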

hadoop - HDP: unable to start Phoenix sqlline.py

I am using Sandbox HDP 2.2. I ran yum install phoenix (the version is 4.2), but when I run these: ./sqlline.py localhost:2181 ./sqlline.py localhost ./sqlline.py sandbox.hortonworks.com:2181 ./sqlline.py sandbox.hortonworks.com I get the error: 15/07/03 08:26:31 ERROR client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It shoul

Phoenix sqlline code hbase hadoop hortonworks-data-platform apache-phoenix

hadoop - HbaseTestingUtility : could not start my mini-cluster

I am trying to test my HBase code with HbaseTestingUtility. Every time I start my mini-cluster with the code snippet below, I get an exception. public void startCluster() { File workingDirectory = new File("./"); Configuration conf = new Configuration(); System.setProperty("test.build.data", workingDirectory.getAbsolutePath()); conf.set("test.build.data", new File(workingDirectory, "zooke

HbaseTestingUtility mini-cluster hbase apache hadoop

Hadoop: start-dfs.sh connection refused

I have a Vagrant box running debian/stretch64. I am trying to install Hadoop 3 following the documentation at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.htm When I run start-dfs.sh I get this message: vagrant@stretch:/opt/hadoop$ sudo sbin/start-dfs.sh Starting namenodes on [localhost] pdsh@stretch: localhost: connect: Connection refused Starting d

start-dfs Hadoop code pdsh ssh debian hadoop3
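
A "Connection refused" from pdsh usually means that nothing is listening on the port it tried to reach. The sketch below is a generic TCP probe for narrowing that down, assuming the relevant service is SSH on port 22 or, if pdsh is falling back to rsh, port 514. Which one applies depends on the local pdsh configuration (for example the `PDSH_RCMD_TYPE` environment variable), so treat the ports as assumptions; the function name is hypothetical.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `port_accepts_connections("localhost", 22)` on the Vagrant box would show whether an SSH daemon is reachable before start-dfs.sh is retried.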

sockets - Permission denied error when running start-dfs.sh

I get this error when running start-dfs.sh: Starting namenodes on [localhost] pdsh@Gaurav: localhost: rcmd: socket: Permission denied Starting datanodes pdsh@Gaurav: localhost: rcmd: socket: Permission denied Starting secondary namenodes [Gaurav] pdsh@Gaurav: Gaurav: rcmd: socket: Permission denied 2017-03-13 09:39:29,559 WARN util.NativeCodeLoade

start-dfs sockets code section pdsh hadoop hdfs hadoop-yarn hadoop2

Hadoop 2.6.0 : Basic error "starting MRAppMaster" after installing

I have just started using Hadoop 2. After installing it with a basic configuration, I can never run any of the examples. Has anyone seen this problem? Please help. The error is: Error starting MRAppMaster java.lang.RuntimeException: java.lang.reflect.InvocationTargetException Here is the log: 2015 2015-01-06 11:56:23,194 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1420510526926

MRAppMaster installing hadoop java apache mapreduce hadoop-yarn

python - Apache Spark : Error while starting PySpark

On a CentOS machine with Python v2.6.6 and Apache Spark v1.2.1, I get the following error when trying to run ./pyspark. It looks like a Python problem, but I cannot figure it out. 15/06/18 08:11:16 INFO spark.SparkContext: Successfully stopped SparkContext Traceback (most recent call last): File "/usr/lib/spark_1.2.1/spark-1.2.1-bin-hadoop2.4/python/pyspark/shell.py", line 45, in <module> sc = SparkContext(appName="PyS

starting PySpark python section spark hadoop apache-spark

hadoop - "start-all.sh" and "start-dfs.sh" on the master node do not start the slave node services?

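I have updated the /conf/slaves file on the Hadoop master node with the hostnames of my slave nodes, but I cannot start the slaves from the master. I have to start each slave individually, and only then is my 5-node cluster up and running. How can I start the whole cluster with a single command from the master node? Also, a SecondaryNameNode is running on all of the slave nodes. Is that a problem? If so, how can I remove it from the slaves? I think a cluster should have only one SecondaryNameNode and one NameNode, right? Thanks! Best answer: In Apache Hadoop 3.0, use the $HADOOP_HOME/etc/hadoop/workers file, adding on each line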

amp start section SecondaryNameNode stackoverflow hadoop hdfs namenode hadoop3

hadoop - "Starting flush of map output" takes a long time in a Hadoop map task

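I am running a map task on a small file (3-4 MB), but the map output is relatively large (150 MB). After the map shows 100%, it takes a long time to finish spilling. Please suggest how I can reduce this time. Here are some sample logs... 13/07/10 17:45:31 INFO mapred.MapTask: Starting flush of map output 13/07/10 17:45:32 INFO mapred.JobClient: map 98% reduce 0% 13/07/10 17:45:34 INFO mapred.LocalJobRunner: 13/07/10 17:45:35 INFO mapred.JobClient: map 100% reduce 0%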

long-running hadoop mapred LocalJobRunner INFO map flush
