java - 如何在 Mapper 中更新 MapReduce 作业参数

coder 2024-01-10 原文

我想更新我在 Mapper 类中工作时设置的参数(在 Driver 类中)。

我试过，

context.getConfiguration().set("arg", "updatedvalue")

映射器内部。它确实更新了它，但 reducer 的输出全为零。

请帮忙。

映射器:-

public class RecMap extends Mapper<LongWritable, Text, Text, Text> {
    public static TreeMap<String,Integer> co_oc_mat=new TreeMap<String,Integer>();
    public static HashMap<String,Float> user_scoring_mat=new HashMap<String,Float>();
    public static TreeMap<String,Float> sorted_user_scoring_mat=new TreeMap<String,Float>();
    public static ArrayList<String> vals=new ArrayList<String>();
    public static ArrayList<Integer> unique_items=new ArrayList<Integer>();
    public static ArrayList<Integer> unique_users=new ArrayList<Integer>();
    public static int a=0;
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        ++a;
        String b=value.toString();
        vals.add(b);
        String[] parts=b.split("\\,");
        user_scoring_mat.put(parts[0]+","+parts[1], Float.parseFloat(parts[2]));
    }
    @Override
    public void cleanup(Context context) throws IOException, InterruptedException{
        co_oc_mat.putAll(new get_co_oc_mat().get(vals, a));
        unique_users.addAll(new get_unique_users().get(vals, a));
        unique_items.addAll(new get_unique_items().get(vals, a));
        for(int i=0;i<unique_users.size();i++){
            for(int j=0;j<unique_items.size();j++){
                if(!user_scoring_mat.containsKey(unique_users.get(i)+","+unique_items.get(j))){
                    user_scoring_mat.put(unique_users.get(i)+","+unique_items.get(j), 0.0f);
                }
            }
        }
        sorted_user_scoring_mat.putAll(user_scoring_mat);
        String prev="null";int row_num=-1;String value="A";
        String prev2="null";int col_num=-1;String value2="B";

        //Transmitting co_oc_mat
        for(Entry<String, Integer> entry: co_oc_mat.entrySet()){
            String check_val=entry.getKey().split("\\,")[0];
            if(!prev.contentEquals(check_val)){
                if(row_num==-1){
                    prev=check_val;
                    ++row_num;
                }
                else{
                    for(int i=0;i<unique_users.size();i++){
                        String key=row_num+","+i;
                        context.write(new Text(key), new Text(value));
                    }
                    value="A";
                    prev=check_val;
                    ++row_num;
                }
            }
            value=value+","+entry.getValue();
        }
        for(int i=0;i<unique_users.size();i++){
            String key=row_num+","+i;
            context.write(new Text(key), new Text(value));
        }

        //Transmitting sorted_user_scoring_mat
        for(Entry<String, Float> entry: sorted_user_scoring_mat.entrySet()){
            //context.write(new Text(entry.getKey()), new Text(String.valueOf(entry.getValue())));
            String check_val=entry.getKey().split("\\,")[0];
            if(!prev2.contentEquals(check_val)){
                if(col_num==-1){
                    prev2=check_val;
                    ++col_num;
                }
                else{
                    for(int i=0;i<unique_items.size();i++){
                        String key=i+","+col_num;
                        context.write(new Text(key), new Text(value2));
                    }
                    value2="B";
                    prev2=check_val;
                    ++col_num;
                }
            }
            value2=value2+","+entry.getValue();
        }
        for(int i=0;i<unique_items.size();i++){
            String key=i+","+col_num;
            context.write(new Text(key), new Text(value2));
        }
        context.getConfiguration().setInt("n", unique_items.size());
    }
}

reducer :-

import java.io.IOException;
import java.util.HashMap;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;


public class RecReduce extends
Reducer<Text, Text, Text, Text> {
    public static int n=0;
    @Override
    public void setup(Context context) throws IOException, InterruptedException{
        n=context.getConfiguration().getInt("n", 1);
    }
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String[] value;
        HashMap<Integer, Float> hashA = new HashMap<Integer, Float>();
        HashMap<Integer, Float> hashB = new HashMap<Integer, Float>();
        for (Text val : values) {
            value = val.toString().split(",");
            if (value[0].equals("A")) {
                for(int z=1;z<=n;z++){
                    hashA.put(z, Float.parseFloat(value[z]));}
            } else{
                for(int a=1;a<=n;a++){
                    hashB.put(a, Float.parseFloat(value[a]));}
            }
        }
        float result = 0.0f;
        float a_ij;
        float b_jk;
        for (int j=1;j<=n;j++) {
            a_ij = hashA.containsKey(j) ? hashA.get(j) : 0.0f;
            b_jk = hashB.containsKey(j) ? hashB.get(j) : 0.0f;
            result +=a_ij*b_jk;
        }
        context.write(null, new Text(key.toString() + "," + Float.toString(result)));
    }
}

司机:-

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;


public class RecDriver {
    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        conf.setInt("n", 0);
        Job job = new Job(conf, "Recommendations_CollaborativeFiltering");
        job.setJarByClass(RecDriver.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapperClass(RecMap.class);
        //job.setNumReduceTasks(0);
        //Don't use combiner if there is no scope of combining the output. Otherwise the job will get stuck.
        //job.setCombinerClass(RecReduce.class);
        job.setReducerClass(RecReduce.class);

        FileInputFormat.addInputPath(job, new Path("/home/gts1/Desktop/recommendation.txt"));

        FileOutputFormat.setOutputPath(job, new Path("/home/gts1/Desktop/rec1_out"));
        System.exit(job.waitForCompletion(true)?0:1);
    }
}

这是我得到的输出:-

0,0,0.0
0,1,0.0
0,2,0.0
0,3,0.0
0,4,0.0
1,0,0.0
1,1,0.0
1,2,0.0
1,3,0.0
1,4,0.0
2,0,0.0
2,1,0.0
2,2,0.0
2,3,0.0
2,4,0.0
3,0,0.0
3,1,0.0
3,2,0.0
3,3,0.0
3,4,0.0

最佳答案

如 Hadoop API 文档中所述 JobContext提供一个在任务运行时提供给任务的作业的只读 View 。因此，应该可以在 mapper/reducer 方法的上下文中获取参数值，但是不设置它们。

当必须在不同的过程机器之间使用这种协调时，那么 Apache ZooKeeper必须用于在 mapper 中设置值并在 reducer 中获取相同的值。

关于java - 如何在 Mapper 中更新 MapReduce 作业参数，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/31021635/

何在 MapReduce String 34 Text java hadoop

有关java - 如何在 Mapper 中更新 MapReduce 作业参数的更多相关文章

ruby - 如何在 Ruby 中顺序创建 PI - 2
出于纯粹的兴趣，我很好奇如何按顺序创建PI，而不是在过程结果之后生成数字，而是让数字在过程本身生成时显示。如果是这种情况，那么数字可以自行产生，我可以对以前看到的数字实现垃圾收集，从而创建一个无限系列。结果只是在Pi系列之后每秒生成一个数字。这是我通过互联网筛选的结果:这是流行的计算机友好算法，类机器算法:defarccot(x,unity)xpow=unity/xn=1sign=1sum=0loopdoterm=xpow/nbreakifterm==0sum+=sign*(xpow/n)xpow/=x*xn+=2sign=-signendsumenddefcalc_pi(digits
ruby-on-rails - 如何验证 update_all 是否实际在 Rails 中更新 - 2
给定这段代码defcreate@upgrades=User.update_all(["role=?","upgraded"],:id=>params[:upgrade])redirect_toadmin_upgrades_path,:notice=>"Successfullyupgradeduser."end我如何在该操作中实际验证它们是否已保存或未重定向到适当的页面和消息？最佳答案在Rails3中，update_all不返回任何有意义的信息，除了已更新的记录数(这可能取决于您的DBMS是否返回该信息)。http://ar.ru
ruby - 如何在 buildr 项目中使用 Ruby 代码？ - 2
如何在buildr项目中使用Ruby？我在很多不同的项目中使用过Ruby、JRuby、Java和Clojure。我目前正在使用我的标准Ruby开发一个模拟应用程序，我想尝试使用Clojure后端(我确实喜欢功能代码)以及JRubygui和测试套件。我还可以看到在未来的不同项目中使用Scala作为后端。我想我要为我的项目尝试一下buildr(http://buildr.apache.org/)，但我注意到buildr似乎没有设置为在项目中使用JRuby代码本身!这看起来有点傻，因为该工具旨在统一通用的JVM语言并且是在ruby中构建的。除了将输出的jar包含在一个独特的、仅限ruby
ruby - 什么是填充的 Base64 编码字符串以及如何在 ruby 中生成它们？ - 2
我正在使用的第三方API的文档状态:"[O]urAPIonlyacceptspaddedBase64encodedstrings."什么是“填充的Base64编码字符串”以及如何在Ruby中生成它们。下面的代码是我第一次尝试创建转换为Base64的JSON格式数据。xa=Base64.encode64(a.to_json) 最佳答案他们说的padding其实就是Base64本身的一部分。它是末尾的“=”和“==”。Base64将3个字节的数据包编码为4个编码字符。所以如果你的输入数据有长度n和n%3=1=>"=="末尾用于填充n%
ruby-on-rails - 如何在 ruby 中使用两个参数异步运行 exe？ - 2
exe应该在我打开页面时运行。异步进程需要运行。有什么方法可以在ruby中使用两个参数异步运行exe吗？我已经尝试过ruby命令-system()、exec()但它正在等待过程完成。我需要用参数启动exe，无需等待进程完成是否有任何rubygems会支持我的问题？最佳答案您可以使用Process.spawn和Process.wait2:pid=Process.spawn'your.exe','--option'#Later...pid,status=Process.wait2pid您的程序将作为解释器的子进程执行。除
ruby - 如何在续集中重新加载表模式？ - 2
鉴于我有以下迁移:Sequel.migrationdoupdoalter_table:usersdoadd_column:is_admin,:default=>falseend#SequelrunsaDESCRIBEtablestatement,whenthemodelisloaded.#Atthispoint,itdoesnotknowthatusershaveais_adminflag.#Soitfails.@user=User.find(:email=>"admin@fancy-startup.example")@user.is_admin=true@user.save!ende
ruby - RSpec - 使用测试替身作为 block 参数 - 2
我有一些Ruby代码，如下所示:Something.createdo|x|x.foo=barend我想编写一个测试，它使用double代替block参数x，这样我就可以调用:x_double.should_receive(:foo).with("whatever").这可能吗？最佳答案 specify'something'dox=doublex.should_receive(:foo=).with("whatever")Something.should_receive(:create).and_yield(x)#callthere
ruby - 如何在 Ruby 中拆分参数字符串 Bash 样式？ - 2
我正在为一个项目制作一个简单的shell，我希望像在Bash中一样解析参数字符串。foobar"helloworld"fooz应该变成:["foo","bar","helloworld","fooz"]等等。到目前为止，我一直在使用CSV::parse_line，将列分隔符设置为""和.compact输出。问题是我现在必须选择是要支持单引号还是双引号。CSV不支持超过一个分隔符。Python有一个名为shlex的模块:>>>shlex.split("Test'helloworld'foo")['Test','helloworld','foo']>>>shlex.split('Test"
ruby - 检查方法参数的类型 - 2
我不确定传递给方法的对象的类型是否正确。我可能会将一个字符串传递给一个只能处理整数的函数。某种运行时保证怎么样？我看不到比以下更好的选择:defsomeFixNumMangler(input)raise"wrongtype:integerrequired"unlessinput.class==FixNumother_stuffend有更好的选择吗？最佳答案使用Kernel#Integer在使用之前转换输入的方法。当无法以任何合理的方式将输入转换为整数时，它将引发ArgumentError。defmy_method(number)
ruby - 如何在 Lion 上安装 Xcode 4.6，需要用 RVM 升级 ruby - 2
我实际上是在尝试使用RVM在我的OSX10.7.5上更新ruby，并在输入以下命令后:rvminstallruby我得到了以下回复:Searchingforbinaryrubies,thismighttakesometime.Checkingrequirementsforosx.Installingrequirementsforosx.Updatingsystem.......Errorrunning'requirements_osx_brew_update_systemruby-2.0.0-p247',pleaseread/Users/username/.rvm/log/138121

java - 如何在 Mapper 中更新 MapReduce 作业参数

有关java - 如何在 Mapper 中更新 MapReduce 作业参数的更多相关文章

随机推荐