下面的代码演示了问题明确 , 即:
The exact same block of code becomes slower after a busy spin pause.
Thread.sleep .另请注意,没有条件导致 HotSpot/JIT 去优化,因为我正在使用数学运算而不是 IF 更改暂停。 .$ java -server -cp . JvmPauseLatency
Sat Apr 29 10:34:28 EDT 2017 => Please wait 75 seconds for the results...
Sat Apr 29 10:35:43 EDT 2017 => Calculation: 4.0042328611017236E11
Results:
215
214
215
214
215
214
217
215
216
214
216
213
215
214
215
2343 <----- FIRST MEASUREMENT AFTER PAUSE CHANGE
795
727
942
778
765
856
762
801
708
692
765
776
780
754
import java.util.Arrays;
import java.util.Date;
import java.util.Random;
public class JvmPauseLatency {
private static final int WARMUP = 20000;
private static final int EXTRA = 15;
private static final long PAUSE = 5 * 1000000000L; // in nanos
private final Random rand = new Random();
private int count;
private double calculation;
private final long[] results = new long[WARMUP + EXTRA];
private long interval = 1; // in nanos
private long busyPause(long pauseInNanos) {
final long start = System.nanoTime();
long until = Long.MAX_VALUE;
while(System.nanoTime() < until) {
until = start + pauseInNanos;
}
return until;
}
public void run() {
long testDuration = ((WARMUP * 1) + (EXTRA * PAUSE)) / 1000000000L;
System.out.println(new Date() +" => Please wait " + testDuration + " seconds for the results...");
while(count < results.length) {
double x = busyPause(interval);
long latency = System.nanoTime();
calculation += x / (rand.nextInt(5) + 1);
calculation -= calculation / (rand.nextInt(5) + 1);
calculation -= x / (rand.nextInt(6) + 1);
calculation += calculation / (rand.nextInt(6) + 1);
latency = System.nanoTime() - latency;
results[count++] = latency;
interval = (count / WARMUP * (PAUSE - 1)) + 1; // it will change to PAUSE when it reaches WARMUP
}
// now print the last (EXTRA * 2) results so you can compare before and after the pause change (from 1 to PAUSE)
System.out.println(new Date() + " => Calculation: " + calculation);
System.out.println("Results:");
long[] array = Arrays.copyOfRange(results, results.length - EXTRA * 2, results.length);
for(long t: array) System.out.println(t);
}
public static void main(String[] args) {
new JvmPauseLatency().run();
}
}
最佳答案
TL; 博士
http://www.brendangregg.com/activebenchmarking.html
casual benchmarking: you benchmark A, but actually measure B, and conclude you've measured C.
System.out.println("WARMUP = " + busyPause(5000000000L));
-XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+TraceNMethodInstalls 运行您的基准测试.我已经修改了您的代码 - 现在它在每次调用之前将时间间隔打印到系统输出中:interval = 1
interval = 1
interval = 5000000000
689 145 4 JvmPauseLatency::busyPause (19 bytes) made not entrant
689 146 3 JvmPauseLatency::busyPause (19 bytes)
Installing method (3) JvmPauseLatency.busyPause(J)J
698 147 % 4 JvmPauseLatency::busyPause @ 6 (19 bytes)
Installing osr method (4) JvmPauseLatency.busyPause(J)J @ 6
702 148 4 JvmPauseLatency::busyPause (19 bytes)
705 146 3 JvmPauseLatency::busyPause (19 bytes) made not entrant
Installing method (4) JvmPauseLatency.busyPause(J)J
interval = 5000000000
interval = 5000000000
interval = 5000000000
interval = 5000000000
-XX:-TieredCompilation禁用分层编译 -XX:-TieredCompilation -XX:TieredStopAtLevel=3禁用分层编译到 4 级 -XX:+TieredCompilation -XX:TieredStopAtLevel=4 -XX:-UseOnStackReplacement禁用 OSR nanoTime() 小调用(见 What is the cost of volatile write?)import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.VerboseMode;
import java.util.Random;
import java.util.concurrent.TimeUnit;
@State(Scope.Benchmark)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 2, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 3, timeUnit = TimeUnit.SECONDS)
@Fork(value = 2)
public class LatencyTest {
public static final long LONG_PAUSE = 5000L;
public static final long SHORT_PAUSE = 1L;
public Random rand;
@Setup
public void initI() {
rand = new Random(0xDEAD_BEEF);
}
private long busyPause(long pauseInNanos) {
Blackhole.consumeCPU(pauseInNanos);
return pauseInNanos;
}
@Benchmark
@BenchmarkMode({Mode.AverageTime})
public long latencyBusyPauseShort() {
return busyPause(SHORT_PAUSE);
}
@Benchmark
@BenchmarkMode({Mode.AverageTime})
public long latencyBusyPauseLong() {
return busyPause(LONG_PAUSE);
}
@Benchmark
@BenchmarkMode({Mode.AverageTime})
public long latencyFunc() {
return doCalculation(1);
}
@Benchmark
@BenchmarkMode({Mode.AverageTime})
public long measureShort() {
long x = busyPause(SHORT_PAUSE);
return doCalculation(x);
}
@Benchmark
@BenchmarkMode({Mode.AverageTime})
public long measureLong() {
long x = busyPause(LONG_PAUSE);
return doCalculation(x);
}
private long doCalculation(long x) {
long calculation = 0;
calculation += x / (rand.nextInt(5) + 1);
calculation -= calculation / (rand.nextInt(5) + 1);
calculation -= x / (rand.nextInt(6) + 1);
calculation += calculation / (rand.nextInt(6) + 1);
return calculation;
}
public static void main(String[] args) throws RunnerException {
Options options = new OptionsBuilder()
.include(LatencyTest.class.getName())
.verbosity(VerboseMode.NORMAL)
.build();
new Runner(options).run();
}
}
Benchmark Mode Cnt Score Error Units
LatencyTest.latencyBusyPauseLong avgt 4 15992.216 ± 106.538 ns/op
LatencyTest.latencyBusyPauseShort avgt 4 6.450 ± 0.163 ns/op
LatencyTest.latencyFunc avgt 4 97.321 ± 0.984 ns/op
LatencyTest.measureLong avgt 4 16103.228 ± 102.338 ns/op
LatencyTest.measureShort avgt 4 100.454 ± 0.041 ns/op
-XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation ):<branch prob="0.0408393" not_taken="40960" taken="1744" cnt="42704" target_bci="42"/>
vdso clock_gettime,我们无法分析此代码)import java.util.Arrays;
public class JvmPauseLatency {
private static final int WARMUP = 2000 ;
private static final int EXTRA = 10;
private static final long PAUSE = 70000L; // in nanos
private static volatile long consumedCPU = System.nanoTime();
//org.openjdk.jmh.infra.Blackhole.consumeCPU()
private static void consumeCPU(long tokens) {
long t = consumedCPU;
for (long i = tokens; i > 0; i--) {
t += (t * 0x5DEECE66DL + 0xBL + i) & (0xFFFFFFFFFFFFL);
}
if (t == 42) {
consumedCPU += t;
}
}
public void run(long warmPause) {
long[] results = new long[WARMUP + EXTRA];
int count = 0;
long interval = warmPause;
while(count < results.length) {
consumeCPU(interval);
long latency = System.nanoTime();
latency = System.nanoTime() - latency;
results[count++] = latency;
if (count == WARMUP) {
interval = PAUSE;
}
}
System.out.println("Results:" + Arrays.toString(Arrays.copyOfRange(results, results.length - EXTRA * 2, results.length)));
}
public static void main(String[] args) {
int totalCount = 0;
while (totalCount < 100) {
new JvmPauseLatency().run(0);
totalCount ++;
}
}
}
Results:[62, 66, 63, 64, 62, 62, 60, 58, 65, 61, 127, 245, 140, 85, 88, 114, 76, 199, 310, 196]
Results:[61, 63, 65, 64, 62, 65, 82, 63, 67, 70, 104, 176, 368, 297, 272, 183, 248, 217, 267, 181]
Results:[62, 65, 60, 59, 54, 64, 63, 71, 48, 59, 202, 74, 400, 247, 215, 184, 380, 258, 266, 323]
new JvmPauseLatency().run(0)与 new JvmPauseLatency().run(PAUSE);结果如下:Results:[46, 45, 44, 45, 48, 46, 43, 72, 50, 47, 46, 44, 54, 45, 43, 43, 43, 48, 46, 43]
Results:[44, 44, 45, 45, 43, 46, 46, 44, 44, 44, 43, 49, 45, 44, 43, 49, 45, 46, 45, 44]
while(count < results.length) {
consumeCPU(interval);
long latency = System.nanoTime();
latency = System.nanoTime() - latency;
results[count++] = latency;
if (count >= WARMUP) {
interval = PAUSE;
} else {
interval = rnd.nextBoolean() ? PAUSE : 0;
}
}
sudo perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles,branch-misses java -Xint JvmPauseLatency结果是: Performance counter stats for 'java -Xint JvmPauseLatency':
272,822,274,275 cycles
723,420,125,590 instructions # 2.65 insn per cycle
26,994,494 cache-references
8,575,746 cache-misses # 31.769 % of all cache refs
2,060,138,555 bus-cycles
2,930,155 branch-misses
86.808481183 seconds time elapsed
Performance counter stats for 'java -Xint JvmPauseLatency':
2,812,949,238 cycles
7,267,497,946 instructions # 2.58 insn per cycle
6,936,666 cache-references
1,107,318 cache-misses # 15.963 % of all cache refs
21,410,797 bus-cycles
791,441 branch-misses
0.907758181 seconds time elapsed
Performance counter stats for 'java -Xint JvmPauseLatency':
126,157,793 cycles
158,845,300 instructions # 1.26 insn per cycle
6,650,471 cache-references
909,593 cache-misses # 13.677 % of all cache refs
1,635,548 bus-cycles
775,564 branch-misses
0.073511817 seconds time elapsed
关于java - 为什么 JVM 在繁忙的自旋暂停后显示相同代码块的更多延迟?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43696948/
类classAprivatedeffooputs:fooendpublicdefbarputs:barendprivatedefzimputs:zimendprotecteddefdibputs:dibendendA的实例a=A.new测试a.foorescueputs:faila.barrescueputs:faila.zimrescueputs:faila.dibrescueputs:faila.gazrescueputs:fail测试输出failbarfailfailfail.发送测试[:foo,:bar,:zim,:dib,:gaz].each{|m|a.send(m)resc
我有一个模型:classItem项目有一个属性“商店”基于存储的值,我希望Item对象对特定方法具有不同的行为。Rails中是否有针对此的通用设计模式?如果方法中没有大的if-else语句,这是如何干净利落地完成的? 最佳答案 通常通过Single-TableInheritance. 关于ruby-on-rails-Rails-子类化模型的设计模式是什么?,我们在StackOverflow上找到一个类似的问题: https://stackoverflow.co
我得到了一个包含嵌套链接的表单。编辑时链接字段为空的问题。这是我的表格:Editingkategori{:action=>'update',:id=>@konkurrancer.id})do|f|%>'Trackingurl',:style=>'width:500;'%>'Editkonkurrence'%>|我的konkurrencer模型:has_one:link我的链接模型:classLink我的konkurrancer编辑操作:defedit@konkurrancer=Konkurrancer.find(params[:id])@konkurrancer.link_attrib
如何在buildr项目中使用Ruby?我在很多不同的项目中使用过Ruby、JRuby、Java和Clojure。我目前正在使用我的标准Ruby开发一个模拟应用程序,我想尝试使用Clojure后端(我确实喜欢功能代码)以及JRubygui和测试套件。我还可以看到在未来的不同项目中使用Scala作为后端。我想我要为我的项目尝试一下buildr(http://buildr.apache.org/),但我注意到buildr似乎没有设置为在项目中使用JRuby代码本身!这看起来有点傻,因为该工具旨在统一通用的JVM语言并且是在ruby中构建的。除了将输出的jar包含在一个独特的、仅限ruby
我正在使用的第三方API的文档状态:"[O]urAPIonlyacceptspaddedBase64encodedstrings."什么是“填充的Base64编码字符串”以及如何在Ruby中生成它们。下面的代码是我第一次尝试创建转换为Base64的JSON格式数据。xa=Base64.encode64(a.to_json) 最佳答案 他们说的padding其实就是Base64本身的一部分。它是末尾的“=”和“==”。Base64将3个字节的数据包编码为4个编码字符。所以如果你的输入数据有长度n和n%3=1=>"=="末尾用于填充n%
我主要使用Ruby来执行此操作,但到目前为止我的攻击计划如下:使用gemsrdf、rdf-rdfa和rdf-microdata或mida来解析给定任何URI的数据。我认为最好映射到像schema.org这样的统一模式,例如使用这个yaml文件,它试图描述数据词汇表和opengraph到schema.org之间的转换:#SchemaXtoschema.orgconversion#data-vocabularyDV:name:namestreet-address:streetAddressregion:addressRegionlocality:addressLocalityphoto:i
在rails源中:https://github.com/rails/rails/blob/master/activesupport/lib/active_support/lazy_load_hooks.rb可以看到以下内容@load_hooks=Hash.new{|h,k|h[k]=[]}在IRB中,它只是初始化一个空哈希。和做有什么区别@load_hooks=Hash.new 最佳答案 查看rubydocumentationforHashnew→new_hashclicktotogglesourcenew(obj)→new_has
为什么4.1%2返回0.0999999999999996?但是4.2%2==0.2。 最佳答案 参见此处:WhatEveryProgrammerShouldKnowAboutFloating-PointArithmetic实数是无限的。计算机使用的位数有限(今天是32位、64位)。因此计算机进行的浮点运算不能代表所有的实数。0.1是这些数字之一。请注意,这不是与Ruby相关的问题,而是与所有编程语言相关的问题,因为它来自计算机表示实数的方式。 关于ruby-为什么4.1%2使用Ruby返
它不等于主线程的binding,这个toplevel作用域是什么?此作用域与主线程中的binding有何不同?>ruby-e'putsTOPLEVEL_BINDING===binding'false 最佳答案 事实是,TOPLEVEL_BINDING始终引用Binding的预定义全局实例,而Kernel#binding创建的新实例>Binding每次封装当前执行上下文。在顶层,它们都包含相同的绑定(bind),但它们不是同一个对象,您无法使用==或===测试它们的绑定(bind)相等性。putsTOPLEVEL_BINDINGput
我可以得到Infinity和NaNn=9.0/0#=>Infinityn.class#=>Floatm=0/0.0#=>NaNm.class#=>Float但是当我想直接访问Infinity或NaN时:Infinity#=>uninitializedconstantInfinity(NameError)NaN#=>uninitializedconstantNaN(NameError)什么是Infinity和NaN?它们是对象、关键字还是其他东西? 最佳答案 您看到打印为Infinity和NaN的只是Float类的两个特殊实例的字符串