java - Edge value differs between two calls to Vertex.getEdgeValue()

coder 2024-01-08 Original post

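I am trying to implement the Spinner graph partitioning algorithm in Giraph. In the first step, my program adds edges to the given input graph so that it becomes undirected, and every vertex picks a random partition. (This partition integer is stored in the VertexValue.) At the end of this initialization step, every vertex sends a message along all of its outgoing edges containing the vertex ID (a LongWritable) and the partition the vertex picked.

All of this works fine. In the step where I run into the problem, each vertex iterates over the received messages and saves the received partition in the EdgeValue of the corresponding edge. (VertexValue is the V in Vertex<I,V,E>, and EdgeValue is the E in Edge<I,E>.)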

Here are the important parts of my code:

The wrapper classes:

public class EdgeValue implements Writable {
    private int weight;
    private int partition;
    // Getters and setters for weight and partition

    public EdgeValue() {
        this.weight = -2;
        this.partition = -1;
    }
    // Constructors taking 1 and 2 ints and setting weight/partition to the given value

    @Override
    public void readFields(DataInput in) throws IOException {
        this.weight = in.readInt();
        this.partition = in.readInt();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(this.weight);
        out.writeInt(this.partition);
    }
}

public class SpinnerMessage implements Writable, Configurable {
    private long senderId;
    private int updatePartition;

    public SpinnerMessage() {
        this.senderId = -1;
        this.updatePartition = -1;
    }
    // Constructors taking int and/or LongWritable and setting the fields
    // Getters and setters for senderId and updatePartition

    @Override
    public void readFields(DataInput in) throws IOException {
        this.senderId = in.readLong();
        this.updatePartition = in.readInt();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(this.senderId);
        out.writeInt(this.updatePartition);
    }
}

The compute method from the previous step (ran is a Random object):

public void compute(Vertex<LongWritable, VertexValue, EdgeValue> vertex, Iterable<LongWritable> messages) {
    int initialPartition = this.ran.nextInt(GlobalInformation.numberOfPartitions);
    vertex.getValue().setPartition(initialPartition);
    sendMessageToAllEdges(vertex, new SpinnerMessage(vertex.getId(),initialPartition));
}

The compute method in the step where the error occurs:

public void compute(Vertex<LongWritable, VertexValue, EdgeValue> vertex, Iterable<SpinnerMessage> messages) throws IOException {
    for (SpinnerMessage m : messages) {
        vertex.getEdgeValue(new LongWritable(m.getSenderWritable().get())).setPartition(m.getUpdatePartition());
    }
    // ... some other code, e.g. initializing the amountOfNeighbors array.
    // Here I get an ArrayIndexOutOfBoundsException since the partition is -1:
    for (Edge<LongWritable, EdgeValue> edge : vertex.getEdges()) {
        EdgeValue curValue = edge.getValue();
        amountOfNeighbors[curValue.getPartition()] += curValue.getWeight();
    }
}

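However, when I iterate over the edges with, for example,

for(Edge<LongWritable, EdgeValue> e : vertex.getEdges())

or access one via

vertex.getEdgeValue(someVertex)

the returned EdgeValue has weight -2 and partition -1 (the defaults from the no-argument constructor).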

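My ideas about what might cause the error:

getEdgeValue(new LongWritable(someLong)) might not work because the argument is a different object from another new LongWritable(someLong) with the same value. However, I have seen it used this way in Giraph code, so this does not seem to be the problem; only the long stored inside the LongWritable seems to matter.
(Most likely cause) Hadoop serialization and deserialization somehow changes my EdgeValue objects. Since Hadoop is meant for very large graphs, they may not fit into RAM. For that reason VertexValue and EdgeValue have to implement Writable. However, after looking at some Giraph code online, I implemented readFields() and write() in a way that seems correct to me (writing and reading the important fields in the same order). (I think this is somehow connected to the problem, since the EdgeValue returned by the second call has the field values from the default constructor.)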
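One way to rule out the serialization hypothesis is to round-trip a value through write() and readFields() outside of Giraph. The following is a minimal sketch using only java.io; the EdgeValue here is a stand-in with the same two fields as the class in the question and does not implement Hadoop's Writable, and RoundTripCheck and roundTrip are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RoundTripCheck {
    /** Stand-in for the question's EdgeValue (assumption: same fields, no Hadoop dependency). */
    static class EdgeValue {
        int weight = -2;
        int partition = -1;

        void write(DataOutput out) throws IOException {
            out.writeInt(weight);
            out.writeInt(partition);
        }

        void readFields(DataInput in) throws IOException {
            // Fields are read in the same order in which write() emits them.
            weight = in.readInt();
            partition = in.readInt();
        }
    }

    /** Serializes the value to a byte array and reads it back into a fresh instance. */
    static EdgeValue roundTrip(EdgeValue original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            EdgeValue copy = new EdgeValue();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        EdgeValue original = new EdgeValue();
        original.weight = 7;
        original.partition = 3;
        EdgeValue copy = roundTrip(original);
        System.out.println(copy.weight + " " + copy.partition); // prints "7 3"
    }
}
```

If the round trip preserves both fields, as it does here, the write()/readFields() pair itself is unlikely to be what resets the values to their defaults.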

I also looked at the documentation:

E getEdgeValue(I targetVertexId) Return the value of the first edge with the given target vertex id, or null if there is no such edge. Note: edge value objects returned by this method may be invalidated by the next call. Thus, keeping a reference to an edge value almost always leads to undesired behavior.

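However, this does not apply to me, since I only have a single EdgeValue variable, right?

Thanks in advance to everyone who takes the time to help me. (I am using Hadoop 1.2.1 and Giraph 1.2.0.)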

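Best answer

After looking at more Giraph code examples, I found the solution: the Vertex.getEdgeValue() method basically creates a copy of the vertex's EdgeValue. If you change the object it returns, those changes are not written back to disk. To save information in an EdgeValue or VertexValue, you have to use setVertexValue() or setEdgeValue().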
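The behavior the answer describes can be illustrated with a toy model. This is a sketch only, not Giraph's implementation: ToyVertex, its int[] storage, and the simplified EdgeValue are hypothetical stand-ins, and only the method names getEdgeValue and setEdgeValue mirror Giraph's Vertex interface. In Giraph itself, the write-back calls are, as far as I know, setEdgeValue(targetVertexId, edgeValue) for edges and setValue(value) for the vertex value.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: an edge store that hands out copies, as the answer describes. */
class CopyOnReadDemo {
    static class EdgeValue {
        int weight;
        int partition;

        EdgeValue(int weight, int partition) {
            this.weight = weight;
            this.partition = partition;
        }
    }

    static class ToyVertex {
        // Stored state per target vertex id: {weight, partition}.
        private final Map<Long, int[]> stored = new HashMap<>();

        void addEdge(long targetId, EdgeValue value) {
            stored.put(targetId, new int[] {value.weight, value.partition});
        }

        /** Returns a fresh copy; changes made to it never reach the store. */
        EdgeValue getEdgeValue(long targetId) {
            int[] s = stored.get(targetId);
            return s == null ? null : new EdgeValue(s[0], s[1]);
        }

        /** Writes the given value back into the store if the edge exists. */
        void setEdgeValue(long targetId, EdgeValue value) {
            if (stored.containsKey(targetId)) {
                stored.put(targetId, new int[] {value.weight, value.partition});
            }
        }
    }

    public static void main(String[] args) {
        ToyVertex vertex = new ToyVertex();
        vertex.addEdge(42L, new EdgeValue(-2, -1));

        // Pattern from the question: mutate the returned object only.
        vertex.getEdgeValue(42L).partition = 5;
        System.out.println(vertex.getEdgeValue(42L).partition); // prints -1

        // Pattern from the answer: read, modify, then write back.
        EdgeValue value = vertex.getEdgeValue(42L);
        value.partition = 5;
        vertex.setEdgeValue(42L, value);
        System.out.println(vertex.getEdgeValue(42L).partition); // prints 5
    }
}
```

Applied to the question's loop, this suggests fetching the edge value for the sender, calling setPartition on it, and then passing it to setEdgeValue with the same target vertex ID instead of only mutating the returned object.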

A similar question about "java - Edge value differs between two calls to Vertex.getEdgeValue()" can be found on Stack Overflow: https://stackoverflow.com/questions/30384947/
