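Starting with .NET Core 1.0 (or .NET Framework 4.5, .NET Standard 1.0), .NET can use vector types with SIMD hardware acceleration.
Among them, vectors with a hardware-dependent size are the most useful. They consist of the readonly struct Vector<T> and the auxiliary static class Vector.
The readonly struct Vector<T> mainly provides ordinary arithmetic through operators, so its functionality is limited. The static class Vector supplies a large number of operations for vector types, which greatly broadens where vector types can be used.
However, the static class Vector exposes more than a hundred methods and its documentation is terse, which makes it hard to learn.
So I wrote a demo program with test code for every one of the hundred-plus vector methods that the static class Vector provides. Comparing the test code and its output against the official documentation makes the methods much easier to understand.
The solution currently contains 3 projects.
To make testing against different target frameworks easier, the common test code lives in a shared project, which allows code reuse and keeps the console projects simple. For example, Program.cs in VectorClassDemo50 is: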
using System;
using System.IO;
using VectorClassDemo;

namespace VectorClassDemo50 {
    class Program {
        static void Main(string[] args) {
            string indent = "";
            TextWriter tw = Console.Out;
            tw.WriteLine("VectorClassDemo50");
            tw.WriteLine();
            VectorDemo.OutputEnvironment(tw, indent);
            tw.WriteLine();
            VectorDemo.Run(tw, indent);
        }
    }
}
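Several platforms were tested this time, and each reports different environment information, so a dedicated function prints that information. The source is as follows.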
/// <summary>
/// Is release make.
/// </summary>
public static readonly bool IsRelease =
#if DEBUG
false
#else
true
#endif
;
/// <summary>
/// Output Environment.
/// </summary>
/// <param name="tw">Output <see cref="TextWriter"/>.</param>
/// <param name="indent">The indent.</param>
public static void OutputEnvironment(TextWriter tw, string indent) {
if (null == tw) return;
if (null == indent) indent = "";
//string indentNext = indent + "\t";
tw.WriteLine(indent + string.Format("IsRelease:\t{0}", IsRelease));
tw.WriteLine(indent + string.Format("EnvironmentVariable(PROCESSOR_IDENTIFIER):\t{0}", Environment.GetEnvironmentVariable("PROCESSOR_IDENTIFIER")));
tw.WriteLine(indent + string.Format("Environment.ProcessorCount:\t{0}", Environment.ProcessorCount));
tw.WriteLine(indent + string.Format("Environment.Is64BitOperatingSystem:\t{0}", Environment.Is64BitOperatingSystem));
tw.WriteLine(indent + string.Format("Environment.Is64BitProcess:\t{0}", Environment.Is64BitProcess));
tw.WriteLine(indent + string.Format("Environment.OSVersion:\t{0}", Environment.OSVersion));
tw.WriteLine(indent + string.Format("Environment.Version:\t{0}", Environment.Version));
//tw.WriteLine(indent + string.Format("RuntimeEnvironment.GetSystemVersion:\t{0}", System.Runtime.InteropServices.RuntimeEnvironment.GetSystemVersion())); // Same Environment.Version
tw.WriteLine(indent + string.Format("RuntimeEnvironment.GetRuntimeDirectory:\t{0}", System.Runtime.InteropServices.RuntimeEnvironment.GetRuntimeDirectory()));
#if (NET47 || NET462 || NET461 || NET46 || NET452 || NET451 || NET45 || NET40 || NET35 || NET20) || (NETSTANDARD1_0)
#else
tw.WriteLine(indent + string.Format("RuntimeInformation.FrameworkDescription:\t{0}", System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription));
#endif
tw.WriteLine(indent + string.Format("BitConverter.IsLittleEndian:\t{0}", BitConverter.IsLittleEndian));
tw.WriteLine(indent + string.Format("IntPtr.Size:\t{0}", IntPtr.Size));
tw.WriteLine(indent + string.Format("Vector.IsHardwareAccelerated:\t{0}", Vector.IsHardwareAccelerated));
tw.WriteLine(indent + string.Format("Vector<byte>.Count:\t{0}\t# {1}bit", Vector<byte>.Count, Vector<byte>.Count * sizeof(byte) * 8));
//tw.WriteLine(indent + string.Format("Vector<float>.Count:\t{0}\t# {1}bit", Vector<float>.Count, Vector<float>.Count * sizeof(float) * 8));
//tw.WriteLine(indent + string.Format("Vector<double>.Count:\t{0}\t# {1}bit", Vector<double>.Count, Vector<double>.Count * sizeof(double) * 8));
Assembly assembly;
//assembly = typeof(Vector4).GetTypeInfo().Assembly;
//tw.WriteLine(string.Format("Vector4.Assembly:\t{0}", assembly));
//tw.WriteLine(string.Format("Vector4.Assembly.CodeBase:\t{0}", assembly.CodeBase));
assembly = typeof(Vector<float>).GetTypeInfo().Assembly;
tw.WriteLine(string.Format("Vector<T>.Assembly.CodeBase:\t{0}", assembly.CodeBase));
OutputIntrinsics(tw, indent);
}
/// <summary>
/// Output Intrinsics.
/// </summary>
/// <param name="tw">Output <see cref="TextWriter"/>.</param>
/// <param name="indent">The indent.</param>
public static void OutputIntrinsics(TextWriter tw, string indent) {
if (null == tw) return;
if (null == indent) indent = "";
#if NETCOREAPP3_0_OR_GREATER
tw.WriteLine();
tw.WriteLine(indent + "[Intrinsics.X86]");
WriteLineFormat(tw, indent, "Aes.IsSupported:\t{0}", System.Runtime.Intrinsics.X86.Aes.IsSupported);
WriteLineFormat(tw, indent, "Aes.X64.IsSupported:\t{0}", System.Runtime.Intrinsics.X86.Aes.X64.IsSupported);
WriteLineFormat(tw, indent, "Avx.IsSupported:\t{0}", Avx.IsSupported);
WriteLineFormat(tw, indent, "Avx.X64.IsSupported:\t{0}", Avx.X64.IsSupported);
WriteLineFormat(tw, indent, "Avx2.IsSupported:\t{0}", Avx2.IsSupported);
WriteLineFormat(tw, indent, "Avx2.X64.IsSupported:\t{0}", Avx2.X64.IsSupported);
#if NET6_0_OR_GREATER
WriteLineFormat(tw, indent, "AvxVnni.IsSupported:\t{0}", AvxVnni.IsSupported);
WriteLineFormat(tw, indent, "AvxVnni.X64.IsSupported:\t{0}", AvxVnni.X64.IsSupported);
#endif
WriteLineFormat(tw, indent, "Bmi1.IsSupported:\t{0}", Bmi1.IsSupported);
WriteLineFormat(tw, indent, "Bmi1.X64.IsSupported:\t{0}", Bmi1.X64.IsSupported);
WriteLineFormat(tw, indent, "Bmi2.IsSupported:\t{0}", Bmi2.IsSupported);
WriteLineFormat(tw, indent, "Bmi2.X64.IsSupported:\t{0}", Bmi2.X64.IsSupported);
WriteLineFormat(tw, indent, "Fma.IsSupported:\t{0}", Fma.IsSupported);
WriteLineFormat(tw, indent, "Fma.X64.IsSupported:\t{0}", Fma.X64.IsSupported);
WriteLineFormat(tw, indent, "Lzcnt.IsSupported:\t{0}", Lzcnt.IsSupported);
WriteLineFormat(tw, indent, "Lzcnt.X64.IsSupported:\t{0}", Lzcnt.X64.IsSupported);
WriteLineFormat(tw, indent, "Pclmulqdq.IsSupported:\t{0}", Pclmulqdq.IsSupported);
WriteLineFormat(tw, indent, "Pclmulqdq.X64.IsSupported:\t{0}", Pclmulqdq.X64.IsSupported);
WriteLineFormat(tw, indent, "Popcnt.IsSupported:\t{0}", Popcnt.IsSupported);
WriteLineFormat(tw, indent, "Popcnt.X64.IsSupported:\t{0}", Popcnt.X64.IsSupported);
WriteLineFormat(tw, indent, "Sse.IsSupported:\t{0}", Sse.IsSupported);
WriteLineFormat(tw, indent, "Sse.X64.IsSupported:\t{0}", Sse.X64.IsSupported);
WriteLineFormat(tw, indent, "Sse2.IsSupported:\t{0}", Sse2.IsSupported);
WriteLineFormat(tw, indent, "Sse2.X64.IsSupported:\t{0}", Sse2.X64.IsSupported);
WriteLineFormat(tw, indent, "Sse3.IsSupported:\t{0}", Sse3.IsSupported);
WriteLineFormat(tw, indent, "Sse3.X64.IsSupported:\t{0}", Sse3.X64.IsSupported);
WriteLineFormat(tw, indent, "Sse41.IsSupported:\t{0}", Sse41.IsSupported);
WriteLineFormat(tw, indent, "Sse41.X64.IsSupported:\t{0}", Sse41.X64.IsSupported);
WriteLineFormat(tw, indent, "Sse42.IsSupported:\t{0}", Sse42.IsSupported);
WriteLineFormat(tw, indent, "Sse42.X64.IsSupported:\t{0}", Sse42.X64.IsSupported);
WriteLineFormat(tw, indent, "Ssse3.IsSupported:\t{0}", Ssse3.IsSupported);
WriteLineFormat(tw, indent, "Ssse3.X64.IsSupported:\t{0}", Ssse3.X64.IsSupported);
#if NET5_0_OR_GREATER
WriteLineFormat(tw, indent, "X86Base.IsSupported:\t{0}", X86Base.IsSupported);
WriteLineFormat(tw, indent, "X86Base.X64.IsSupported:\t{0}", X86Base.X64.IsSupported);
#endif // NET5_0_OR_GREATER
#if NET7_0_OR_GREATER
WriteLineFormat(tw, indent, "X86Serialize.IsSupported:\t{0}", X86Serialize.IsSupported);
WriteLineFormat(tw, indent, "X86Serialize.X64.IsSupported:\t{0}", X86Serialize.X64.IsSupported);
#endif // NET7_0_OR_GREATER
#endif // NETCOREAPP3_0_OR_GREATER
#if NET5_0_OR_GREATER
tw.WriteLine();
tw.WriteLine(indent + "[Intrinsics.Arm]");
WriteLineFormat(tw, indent, "AdvSimd.IsSupported:\t{0}", AdvSimd.IsSupported);
WriteLineFormat(tw, indent, "AdvSimd.Arm64.IsSupported:\t{0}", AdvSimd.Arm64.IsSupported);
WriteLineFormat(tw, indent, "Aes.IsSupported:\t{0}", System.Runtime.Intrinsics.Arm.Aes.IsSupported);
WriteLineFormat(tw, indent, "Aes.Arm64.IsSupported:\t{0}", System.Runtime.Intrinsics.Arm.Aes.Arm64.IsSupported);
WriteLineFormat(tw, indent, "ArmBase.IsSupported:\t{0}", ArmBase.IsSupported);
WriteLineFormat(tw, indent, "ArmBase.Arm64.IsSupported:\t{0}", ArmBase.Arm64.IsSupported);
WriteLineFormat(tw, indent, "Crc32.IsSupported:\t{0}", Crc32.IsSupported);
WriteLineFormat(tw, indent, "Crc32.Arm64.IsSupported:\t{0}", Crc32.Arm64.IsSupported);
WriteLineFormat(tw, indent, "Dp.IsSupported:\t{0}", Dp.IsSupported);
WriteLineFormat(tw, indent, "Dp.Arm64.IsSupported:\t{0}", Dp.Arm64.IsSupported);
WriteLineFormat(tw, indent, "Rdm.IsSupported:\t{0}", Rdm.IsSupported);
WriteLineFormat(tw, indent, "Rdm.Arm64.IsSupported:\t{0}", Rdm.Arm64.IsSupported);
WriteLineFormat(tw, indent, "Sha1.IsSupported:\t{0}", Sha1.IsSupported);
WriteLineFormat(tw, indent, "Sha1.Arm64.IsSupported:\t{0}", Sha1.Arm64.IsSupported);
WriteLineFormat(tw, indent, "Sha256.IsSupported:\t{0}", Sha256.IsSupported);
WriteLineFormat(tw, indent, "Sha256.Arm64.IsSupported:\t{0}", Sha256.Arm64.IsSupported);
#endif // NET5_0_OR_GREATER
}
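Because vector types are closely tied to intrinsic functions (intrinsics), this function also prints which intrinsics are supported.
During development I found that new .NET versions keep adding intrinsics. For example, .NET 5.0 added a large number of Arm intrinsics as well as X86Base.
Conditional compilation lets you safely use only the classes that the current .NET version provides.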
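The idea can be reduced to a small helper. This is a minimal sketch, not part of the demo: the class name IntrinsicsProbeSketch and its two methods are hypothetical, and it assumes the project is built with an SDK that defines the NETCOREAPP3_0_OR_GREATER and NET5_0_OR_GREATER symbols (the same symbols the demo code above relies on).

```csharp
using System;

// Hypothetical helper (not in the demo): each method compiles on every target
// framework because the intrinsic class is only referenced inside the #if branch.
public static class IntrinsicsProbeSketch {
    // True only when the target framework has X86 intrinsics and the CPU supports AVX2.
    public static bool IsAvx2Supported() {
#if NETCOREAPP3_0_OR_GREATER
        return System.Runtime.Intrinsics.X86.Avx2.IsSupported;
#else
        return false; // The X86 intrinsic classes do not exist on this target.
#endif
    }

    // True only when the target framework has Arm intrinsics and the CPU supports AdvSimd.
    public static bool IsAdvSimdSupported() {
#if NET5_0_OR_GREATER
        return System.Runtime.Intrinsics.Arm.AdvSimd.IsSupported;
#else
        return false; // The Arm intrinsic classes do not exist on this target.
#endif
    }
}
```

Callers can then write `if (IntrinsicsProbeSketch.IsAvx2Supported()) { ... }` without any `#if` of their own.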
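The Vector<T> constructors can only create a value that repeats a single number, or take every element from an array (or Span). The former is too rigid and the latter too tedious, because the length of Vector<T> differs from one processor to another.
Currently, Vector<T> is 256 bits wide on machines that support the AVX2 instruction set and 128 bits wide otherwise. For example, a 128-bit Vector<T> holds 4 Single values while a 256-bit Vector<T> holds 8. In the future Vector<T> may well grow to 512 bits or more.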
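The width can be queried at run time instead of being assumed. The following is a small sketch; the class name VectorWidthSketch and its methods are hypothetical names chosen for illustration, and only Vector<T>.Count comes from the library.

```csharp
using System.Numerics;

// Hypothetical helper (not in the demo) that derives sizes from Vector<T>.Count.
public static class VectorWidthSketch {
    // Width of Vector<T> in bits on the current machine.
    // Vector<byte>.Count is the vector size in bytes.
    public static int VectorBitSize() {
        return Vector<byte>.Count * 8;
    }

    // Size of one element of Vector<T> in bits.
    public static int ElementBitSize<T>() where T : struct {
        return (Vector<byte>.Count / Vector<T>.Count) * 8;
    }
}
```

On the AVX2 machine used for the output later in this article, Vector<byte>.Count is 32, so VectorBitSize() would report 256 there.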
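For testing, a batch of cyclically repeated numbers is often enough: for example "a,b,c,d" at 128 bits and "a,b,c,d,a,b,c,d" at 256 bits.
So I wrote a function that cyclically fills every vector element from a limited list of values. It takes the values through a params parameter, which makes it very convenient to call. The code is as follows.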
/// <summary>
/// Create a <see cref="Vector{T}"/> by cyclically repeating the given values.
/// </summary>
/// <typeparam name="T">Vector element type.</typeparam>
/// <param name="list">Source value list.</param>
/// <returns>Returns a <see cref="Vector{T}"/> filled with the repeated values.</returns>
static Vector<T> CreateVectorUseRotate<T>(params T[] list) where T : struct {
    if (null == list || list.Length <= 0) return Vector<T>.Zero;
    T[] arr = new T[Vector<T>.Count];
    int idx = 0;
    for (int i = 0; i < arr.Length; ++i) {
        arr[i] = list[idx];
        ++idx;
        if (idx >= list.Length) idx = 0; // Wrap around to the start of the list.
    }
    Vector<T> rt = new Vector<T>(arr);
    return rt;
}
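To show what the cyclic fill produces, here is a self-contained variant that uses the modulo operator instead of a separate index. It is a sketch for illustration only: RotateFillSketch.Fill is a hypothetical name, not the demo's function, although it is intended to return the same values.

```csharp
using System.Numerics;

// Hypothetical stand-alone variant of the cyclic fill (not the demo's own code).
public static class RotateFillSketch {
    // Element i of the result is list[i % list.Length].
    public static Vector<T> Fill<T>(params T[] list) where T : struct {
        if (list == null || list.Length == 0) return Vector<T>.Zero;
        T[] arr = new T[Vector<T>.Count];
        for (int i = 0; i < arr.Length; ++i) {
            arr[i] = list[i % list.Length];
        }
        return new Vector<T>(arr);
    }
}
```

For example, `RotateFillSketch.Fill<int>(1, 2, 3)` starts with 1, 2, 3, 1 and keeps repeating for as many elements as Vector<int>.Count provides on the current machine.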
With CreateVectorUseRotate building the test data, setting up the skeleton of the test program is easy. The code is as follows:
public static void Run(TextWriter tw, string indent) {
RunType(tw, indent, CreateVectorUseRotate(float.MinValue, float.PositiveInfinity, float.NaN, -1.2f, 0f, 1f, 2f, 4f), new Vector<float>(2.0f));
RunType(tw, indent, CreateVectorUseRotate(double.MinValue, double.PositiveInfinity, -1.2, 0), new Vector<double>(2.0));
RunType(tw, indent, CreateVectorUseRotate<sbyte>(sbyte.MinValue, sbyte.MaxValue, -1, 0, 1, 2, 3, 4), new Vector<sbyte>(2));
RunType(tw, indent, CreateVectorUseRotate<short>(short.MinValue, short.MaxValue, -1, 0, 1, 2, 3, 4, 127, 128), new Vector<short>(2));
RunType(tw, indent, CreateVectorUseRotate<int>(int.MinValue, int.MaxValue, -1, 0, 1, 2, 3, 32768), new Vector<int>(2));
RunType(tw, indent, CreateVectorUseRotate<long>(long.MinValue, long.MaxValue, -1, 0, 1, 2, 3), new Vector<long>(2));
RunType(tw, indent, CreateVectorUseRotate<byte>(byte.MinValue, byte.MaxValue, 0, 1, 2, 3, 4), new Vector<byte>(2));
RunType(tw, indent, CreateVectorUseRotate<ushort>(ushort.MinValue, ushort.MaxValue, 0, 1, 2, 3, 4, 255, 256), new Vector<ushort>(2));
RunType(tw, indent, CreateVectorUseRotate<uint>(uint.MinValue, uint.MaxValue, 0, 1, 2, 3, 65536), new Vector<uint>(2));
RunType(tw, indent, CreateVectorUseRotate<ulong>(ulong.MinValue, ulong.MaxValue, 0, 1, 2, 3), new Vector<ulong>(2));
}
RunType is a generic function that tests each numeric type in turn. The main code is as follows.
/// <summary>
/// Run type demo.
/// </summary>
/// <typeparam name="T">Vector type.</typeparam>
/// <param name="tw">Output <see cref="TextWriter"/>.</param>
/// <param name="indent">The indent.</param>
/// <param name="srcT">Source temp value.</param>
/// <param name="src2">Source 2.</param>
static void RunType<T>(TextWriter tw, string indent, Vector<T> srcT, Vector<T> src2) where T : struct {
Vector<T> src0 = Vector<T>.Zero;
Vector<T> src1 = Vector<T>.One;
Vector<T> srcAllOnes = ~Vector<T>.Zero;
int elementBitSize = (Vector<byte>.Count / Vector<T>.Count) * 8;
tw.WriteLine(indent + string.Format("-- {0}, Vector<{0}>.Count={1} --", typeof(T).Name, Vector<T>.Count));
WriteLineFormat(tw, indent, "srcT:\t{0}", srcT);
//WriteLineFormat(tw, indent, "src2:\t{0}", src2);
WriteLineFormat(tw, indent, "srcAllOnes:\t{0}", srcAllOnes);
// -- Methods --
#region Methods
//Abs<T>(Vector<T>) Returns a new vector whose elements are the absolute values of the given vector's elements.
WriteLineFormat(tw, indent, "Abs(srcT):\t{0}", Vector.Abs(srcT));
WriteLineFormat(tw, indent, "Abs(srcAllOnes):\t{0}", Vector.Abs(srcAllOnes));
//Add<T>(Vector<T>, Vector<T>) Returns a new vector whose values are the sum of each pair of elements from two given vectors.
WriteLineFormat(tw, indent, "Add(srcT, src1):\t{0}", Vector.Add(srcT, src1));
WriteLineFormat(tw, indent, "Add(srcT, src2):\t{0}", Vector.Add(srcT, src2));
//AndNot<T>(Vector<T>, Vector<T>) Returns a new vector by performing a bitwise And Not operation on each pair of corresponding elements in two vectors.
WriteLineFormat(tw, indent, "AndNot(srcT, src1):\t{0}", Vector.AndNot(srcT, src1));
WriteLineFormat(tw, indent, "AndNot(srcT, src2):\t{0}", Vector.AndNot(srcT, src2));
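The parameter list has 2 test vectors: srcT and src2.
The head of the method defines some commonly used vector values: src0 (every element is 0), src1 (every element is 1), and srcAllOnes (every bit is 1). It then prints srcT and srcAllOnes so that the results are easy to check by hand.
After that, each method of the static class Vector is tested in turn.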
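As an example of checking one method by hand, Vector.AndNot(left, right) computes left & ~right for each element. The sketch below compares the method with the operator form; AndNotSketch and MatchesOperatorForm are hypothetical names used only for this illustration.

```csharp
using System.Numerics;

// Hypothetical check (not in the demo): AndNot(left, right) versus left & ~right.
public static class AndNotSketch {
    public static bool MatchesOperatorForm(uint left, uint right) {
        Vector<uint> a = new Vector<uint>(left);   // every element is left
        Vector<uint> b = new Vector<uint>(right);  // every element is right
        Vector<uint> viaMethod = Vector.AndNot(a, b);
        Vector<uint> viaOperators = a & ~b;
        // Vector<T> == returns true only when all elements are equal.
        return viaMethod == viaOperators && viaMethod[0] == (left & ~right);
    }
}
```

With left = 0xFF and right = 0x0F, each element of the result is 0xF0: the bits of left that are not set in right.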
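Most methods of the static class Vector are generic, so they are convenient to call inside a generic method such as RunType.
Some methods, however, are not generic and are provided as overloads for each type instead. These are more awkward to use and need branches on typeof. The code is as follows.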
//ConvertToDouble(Vector<Int64>) Converts a Vector<Int64> to a Vector<Double>.
//ConvertToDouble(Vector<UInt64>) Converts a Vector<UInt64> to a Vector<Double>.
//ConvertToInt32(Vector<Single>) Converts a Vector<Single> to a Vector<Int32>.
//ConvertToInt64(Vector<Double>) Converts a Vector<Double> to a Vector<Int64>.
//ConvertToSingle(Vector<Int32>) Converts a Vector<Int32> to a Vector<Single>.
//ConvertToSingle(Vector<UInt32>) Converts a Vector<UInt32> to a Vector<Single>.
//ConvertToUInt32(Vector<Single>) Converts a Vector<Single> to a Vector<UInt32>.
//ConvertToUInt64(Vector<Double>) Converts a Vector<Double> to a Vector<UInt64>.
if (typeof(T) == typeof(Double)) {
WriteLineFormat(tw, indent, "ConvertToInt64(srcT):\t{0}", Vector.ConvertToInt64(Vector.AsVectorDouble(srcT)));
WriteLineFormat(tw, indent, "ConvertToUInt64(srcT):\t{0}", Vector.ConvertToUInt64(Vector.AsVectorDouble(srcT)));
} else if (typeof(T) == typeof(Single)) {
WriteLineFormat(tw, indent, "ConvertToInt32(srcT):\t{0}", Vector.ConvertToInt32(Vector.AsVectorSingle(srcT)));
WriteLineFormat(tw, indent, "ConvertToUInt32(srcT):\t{0}", Vector.ConvertToUInt32(Vector.AsVectorSingle(srcT)));
} else if (typeof(T) == typeof(Int32)) {
WriteLineFormat(tw, indent, "ConvertToSingle(srcT):\t{0}", Vector.ConvertToSingle(Vector.AsVectorInt32(srcT)));
} else if (typeof(T) == typeof(UInt32)) {
WriteLineFormat(tw, indent, "ConvertToSingle(srcT):\t{0}", Vector.ConvertToSingle(Vector.AsVectorUInt32(srcT)));
} else if (typeof(T) == typeof(Int64)) {
WriteLineFormat(tw, indent, "ConvertToDouble(srcT):\t{0}", Vector.ConvertToDouble(Vector.AsVectorInt64(srcT)));
} else if (typeof(T) == typeof(UInt64)) {
WriteLineFormat(tw, indent, "ConvertToDouble(srcT):\t{0}", Vector.ConvertToDouble(Vector.AsVectorUInt64(srcT)));
}
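The branches above use two different kinds of methods. The AsVector... methods reinterpret the bits without changing them (inside a branch where T already is that type, this is effectively a cast that satisfies the compiler), while the ConvertTo... methods convert the numeric values. The sketch below contrasts the two; ReinterpretVsConvertSketch is a hypothetical name, and it assumes a runtime where Vector.ConvertToInt32 is available.

```csharp
using System.Numerics;

// Hypothetical illustration (not in the demo) of reinterpreting versus converting.
public static class ReinterpretVsConvertSketch {
    // Returns the raw IEEE 754 bit pattern of the value, read as an Int32.
    public static int ReinterpretFirst(float value) {
        return Vector.AsVectorInt32(new Vector<float>(value))[0];
    }

    // Returns the value converted numerically to Int32.
    public static int ConvertFirst(float value) {
        return Vector.ConvertToInt32(new Vector<float>(value))[0];
    }
}
```

For 1.0f, ReinterpretFirst returns 0x3F800000 (the same bit pattern that appears in the hexadecimal output later in this article), whereas ConvertFirst returns 1.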
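Some methods take a control parameter, such as ShiftLeft, which shifts each element left. For these it is best to write a loop that tests several control values. The code is as follows.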
#if NET7_0_OR_GREATER
//ShiftLeft(Vector<Byte>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<Int16>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<Int32>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<Int64>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<IntPtr>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<SByte>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<UInt16>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<UInt32>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<UInt64>, Int32) Shifts each element of a vector left by the specified amount.
//ShiftLeft(Vector<UIntPtr>, Int32) Shifts each element of a vector left by the specified amount.
int[] shiftCounts = new int[] { 1, elementBitSize - 1, elementBitSize, elementBitSize + 1, -1 };
foreach (int shiftCount in shiftCounts) {
if (typeof(T) == typeof(Byte)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorByte(srcT), shiftCount));
} else if (typeof(T) == typeof(Int16)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorInt16(srcT), shiftCount));
} else if (typeof(T) == typeof(Int32)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorInt32(srcT), shiftCount));
} else if (typeof(T) == typeof(Int64)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorInt64(srcT), shiftCount));
} else if (typeof(T) == typeof(IntPtr)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorNInt(srcT), shiftCount));
} else if (typeof(T) == typeof(SByte)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorSByte(srcT), shiftCount));
} else if (typeof(T) == typeof(UInt16)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorUInt16(srcT), shiftCount));
} else if (typeof(T) == typeof(UInt32)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorUInt32(srcT), shiftCount));
} else if (typeof(T) == typeof(UInt64)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorUInt64(srcT), shiftCount));
} else if (typeof(T) == typeof(UIntPtr)) {
WriteLineFormat(tw, indent, "ShiftLeft(srcT, " + shiftCount + "):\t{0}", Vector.ShiftLeft(Vector.AsVectorNUInt(srcT), shiftCount));
}
}
#endif // NET7_0_OR_GREATER
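Some methods return multiple values through out parameters, such as Widen, which widens the data. The if blocks can be used to limit the scope of the variables of each type. The code is as follows.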
//Widen(Vector<Byte>, Vector<UInt16>, Vector<UInt16>) Widens a Vector<Byte> into two Vector<UInt16> instances.
//Widen(Vector<Int16>, Vector<Int32>, Vector<Int32>) Widens a Vector<Int16> into two Vector<Int32> instances.
//Widen(Vector<Int32>, Vector<Int64>, Vector<Int64>) Widens a Vector<Int32> into two Vector<Int64> instances.
//Widen(Vector<SByte>, Vector<Int16>, Vector<Int16>) Widens a Vector<SByte> into two Vector<Int16> instances.
//Widen(Vector<Single>, Vector<Double>, Vector<Double>) Widens a Vector<Single> into two Vector<Double> instances.
//Widen(Vector<UInt16>, Vector<UInt32>, Vector<UInt32>) Widens a Vector<UInt16> into two Vector<UInt32> instances.
//Widen(Vector<UInt32>, Vector<UInt64>, Vector<UInt64>) Widens a Vector<UInt32> into two Vector<UInt64> instances.
if (typeof(T) == typeof(Single)) {
Vector<Double> low, high;
Vector.Widen(Vector.AsVectorSingle(srcT), out low, out high);
WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
} else if (typeof(T) == typeof(SByte)) {
Vector<Int16> low, high;
Vector.Widen(Vector.AsVectorSByte(srcT), out low, out high);
WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
} else if (typeof(T) == typeof(Int16)) {
Vector<Int32> low, high;
Vector.Widen(Vector.AsVectorInt16(srcT), out low, out high);
WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
} else if (typeof(T) == typeof(Int32)) {
Vector<Int64> low, high;
Vector.Widen(Vector.AsVectorInt32(srcT), out low, out high);
WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
} else if (typeof(T) == typeof(Byte)) {
Vector<UInt16> low, high;
Vector.Widen(Vector.AsVectorByte(srcT), out low, out high);
WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
} else if (typeof(T) == typeof(UInt16)) {
Vector<UInt32> low, high;
Vector.Widen(Vector.AsVectorUInt16(srcT), out low, out high);
WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
} else if (typeof(T) == typeof(UInt32)) {
Vector<UInt64> low, high;
Vector.Widen(Vector.AsVectorUInt32(srcT), out low, out high);
WriteLineFormat(tw, indent, "Widen(srcT).low:\t{0}", low);
WriteLineFormat(tw, indent, "Widen(srcT).high:\t{0}", high);
}
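To see which half of the source ends up in which output, a ramp of increasing values is a convenient input. The sketch below is an illustration only; WidenSketch and WidenRamp are hypothetical names.

```csharp
using System.Numerics;

// Hypothetical illustration (not in the demo) of how Widen splits its input.
public static class WidenSketch {
    // Widens the byte ramp 0, 1, 2, ... into two UInt16 vectors.
    public static void WidenRamp(out Vector<ushort> low, out Vector<ushort> high) {
        byte[] ramp = new byte[Vector<byte>.Count];
        for (int i = 0; i < ramp.Length; ++i) {
            ramp[i] = (byte)i;
        }
        // low receives the lower-index half of the source, high the upper-index half.
        Vector.Widen(new Vector<byte>(ramp), out low, out high);
    }
}
```

After the call, low starts at 0 and high starts at Vector<ushort>.Count, which matches the Widen(srcT).low and Widen(srcT).high lines in the output later in this article.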
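The readonly struct Vector<T> supports ToString, which prints the value of each element. But in many cases (for example, when doing bitwise operations with functions such as AndNot) we need to inspect the binary data, so the data should be shown in hexadecimal, and Vector<T> does not support the hexadecimal format specifier (X).
So I wrote a dedicated function for Vector<T> that outputs its hexadecimal value, together with a WriteLineFormat overload that appends it to each line.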
/// <summary>
/// Get hex string.
/// </summary>
/// <typeparam name="T">Vector value type.</typeparam>
/// <param name="src">Source value.</param>
/// <param name="separator">The separator.</param>
/// <param name="noFixEndian">No fix endian.</param>
/// <returns>Returns hex string.</returns>
private static string GetHex<T>(Vector<T> src, string separator, bool noFixEndian) where T : struct {
Vector<byte> list = Vector.AsVectorByte(src);
int unitCount = Vector<T>.Count;
int unitSize = Vector<byte>.Count / unitCount;
bool fixEndian = false;
if (!noFixEndian && BitConverter.IsLittleEndian) fixEndian = true;
StringBuilder sb = new StringBuilder();
if (fixEndian) {
// IsLittleEndian.
for (int i=0; i < unitCount; ++i) {
if ((i > 0)) {
if (!string.IsNullOrEmpty(separator)) {
sb.Append(separator);
}
}
int idx = unitSize * (i+1) - 1;
for(int j = 0; j < unitSize; ++j) {
byte by = list[idx];
--idx;
sb.Append(by.ToString("X2"));
}
}
} else {
for (int i = 0; i < Vector<byte>.Count; ++i) {
byte by = list[i];
if ((i > 0) && (0 == i % unitSize)) {
if (!string.IsNullOrEmpty(separator)) {
sb.Append(separator);
}
}
sb.Append(by.ToString("X2"));
}
}
return sb.ToString();
}
/// <summary>
/// WriteLine with format.
/// </summary>
/// <typeparam name="T">Vector value type.</typeparam>
/// <param name="tw">The TextWriter.</param>
/// <param name="indent">The indent.</param>
/// <param name="format">The format.</param>
/// <param name="src">Source value</param>
private static void WriteLineFormat<T>(TextWriter tw, string indent, string format, Vector<T> src) where T : struct {
if (null == tw) return;
string line = indent + string.Format(format, src);
string hex = GetHex(src, " ", false);
line += "\t# (" + hex +")";
tw.WriteLine(line);
}
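The hexadecimal groups printed by GetHex can be cross-checked against a single scalar value. The following sketch does that for Single; ScalarHexSketch and SingleToHex are hypothetical names, and only BitConverter calls from the base class library are used.

```csharp
using System;

// Hypothetical cross-check (not in the demo): hex bit pattern of one float.
public static class ScalarHexSketch {
    // Returns the 32-bit pattern of the value as 8 uppercase hex digits,
    // most significant digit first.
    public static string SingleToHex(float value) {
        byte[] bytes = BitConverter.GetBytes(value); // bytes in machine byte order
        int bits = BitConverter.ToInt32(bytes, 0);   // same bytes read back as Int32
        return bits.ToString("X8");
    }
}
```

For -1.2f this returns "BF99999A", the same group that the demo prints for that element of srcT.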
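Because the Vector class provides a large number of vector methods, multiplied by 10 primitive types, the output of this program is long: more than 90 KB.
To keep the article from growing too long, only the main output is excerpted here.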
VectorClassDemo50
IsRelease: False
EnvironmentVariable(PROCESSOR_IDENTIFIER): Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
Environment.ProcessorCount: 8
Environment.Is64BitOperatingSystem: True
Environment.Is64BitProcess: True
Environment.OSVersion: Microsoft Windows NT 10.0.19044.0
Environment.Version: 7.0.0
RuntimeEnvironment.GetRuntimeDirectory: C:\Program Files\dotnet\shared\Microsoft.NETCore.App\7.0.0\
RuntimeInformation.FrameworkDescription: .NET 7.0.0
BitConverter.IsLittleEndian: True
IntPtr.Size: 8
Vector.IsHardwareAccelerated: True
Vector<byte>.Count: 32 # 256bit
Vector<T>.Assembly.CodeBase: file:///C:/Program Files/dotnet/shared/Microsoft.NETCore.App/7.0.0/System.Private.CoreLib.dll
[Intrinsics.X86]
Aes.IsSupported: True
Aes.X64.IsSupported: True
Avx.IsSupported: True
Avx.X64.IsSupported: True
Avx2.IsSupported: True
Avx2.X64.IsSupported: True
AvxVnni.IsSupported: False
AvxVnni.X64.IsSupported: False
Bmi1.IsSupported: True
Bmi1.X64.IsSupported: True
Bmi2.IsSupported: True
Bmi2.X64.IsSupported: True
Fma.IsSupported: True
Fma.X64.IsSupported: True
Lzcnt.IsSupported: True
Lzcnt.X64.IsSupported: True
Pclmulqdq.IsSupported: True
Pclmulqdq.X64.IsSupported: True
Popcnt.IsSupported: True
Popcnt.X64.IsSupported: True
Sse.IsSupported: True
Sse.X64.IsSupported: True
Sse2.IsSupported: True
Sse2.X64.IsSupported: True
Sse3.IsSupported: True
Sse3.X64.IsSupported: True
Sse41.IsSupported: True
Sse41.X64.IsSupported: True
Sse42.IsSupported: True
Sse42.X64.IsSupported: True
Ssse3.IsSupported: True
Ssse3.X64.IsSupported: True
X86Base.IsSupported: True
X86Base.X64.IsSupported: True
X86Serialize.IsSupported: False
X86Serialize.X64.IsSupported: False
[Intrinsics.Arm]
AdvSimd.IsSupported: False
AdvSimd.Arm64.IsSupported: False
Aes.IsSupported: False
Aes.Arm64.IsSupported: False
ArmBase.IsSupported: False
ArmBase.Arm64.IsSupported: False
Crc32.IsSupported: False
Crc32.Arm64.IsSupported: False
Dp.IsSupported: False
Dp.Arm64.IsSupported: False
Rdm.IsSupported: False
Rdm.Arm64.IsSupported: False
Sha1.IsSupported: False
Sha1.Arm64.IsSupported: False
Sha256.IsSupported: False
Sha256.Arm64.IsSupported: False
-- Single, Vector<Single>.Count=8 --
srcT: <-3.4028235E+38, ∞, NaN, -1.2, 0, 1, 2, 4> # (FF7FFFFF 7F800000 FFC00000 BF99999A 00000000 3F800000 40000000 40800000)
srcAllOnes: <NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN> # (FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF)
Abs(srcT): <3.4028235E+38, ∞, NaN, 1.2, 0, 1, 2, 4> # (7F7FFFFF 7F800000 7FC00000 3F99999A 00000000 3F800000 40000000 40800000)
Abs(srcAllOnes): <NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN> # (7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF 7FFFFFFF)
Add(srcT, src1): <-3.4028235E+38, ∞, NaN, -0.20000005, 1, 2, 3, 5> # (FF7FFFFF 7F800000 FFC00000 BE4CCCD0 3F800000 40000000 40400000 40A00000)
Add(srcT, src2): <-3.4028235E+38, ∞, NaN, 0.79999995, 2, 3, 4, 6> # (FF7FFFFF 7F800000 FFC00000 3F4CCCCC 40000000 40400000 40800000 40C00000)
AndNot(srcT, src1): <-3.9999998, 2, -3, -2.350989E-39, 0, 0, 2, 2> # (C07FFFFF 40000000 C0400000 8019999A 00000000 00000000 40000000 40000000)
AndNot(srcT, src2): <-0.99999994, 1, -1.5, -1.2, 0, 1, 0, 1.1754944E-38> # (BF7FFFFF 3F800000 BFC00000 BF99999A 00000000 3F800000 00000000 00800000)
BitwiseAnd(srcT, src1): <0.5, 1, 1, 1, 0, 1, 0, 1.1754944E-38> # (3F000000 3F800000 3F800000 3F800000 00000000 3F800000 00000000 00800000)
BitwiseAnd(srcT, src2): <2, 2, 2, 0, 0, 0, 2, 2> # (40000000 40000000 40000000 00000000 00000000 00000000 40000000 40000000)
BitwiseOr(srcT, src1): <NaN, ∞, NaN, -1.2, 1, 1, ∞, ∞> # (FFFFFFFF 7F800000 FFC00000 BF99999A 3F800000 3F800000 7F800000 7F800000)
BitwiseOr(srcT, src2): <-3.4028235E+38, ∞, NaN, NaN, 2, ∞, 2, 4> # (FF7FFFFF 7F800000 FFC00000 FF99999A 40000000 7F800000 40000000 40800000)
...
Widen(srcT).low: <-3.4028234663852886E+38, ∞, NaN, -1.2000000476837158> # (C7EFFFFFE0000000 7FF0000000000000 FFF8000000000000 BFF3333340000000)
Widen(srcT).high: <0, 1, 2, 4> # (0000000000000000 3FF0000000000000 4000000000000000 4010000000000000)
...
-- Double, Vector<Double>.Count=4 --
srcT: <-1.7976931348623157E+308, ∞, -1.2, 0> # (FFEFFFFFFFFFFFFF 7FF0000000000000 BFF3333333333333 0000000000000000)
srcAllOnes: <NaN, NaN, NaN, NaN> # (FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF)
Abs(srcT): <1.7976931348623157E+308, ∞, 1.2, 0> # (7FEFFFFFFFFFFFFF 7FF0000000000000 3FF3333333333333 0000000000000000)
Abs(srcAllOnes): <NaN, NaN, NaN, NaN> # (7FFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF 7FFFFFFFFFFFFFFF)
Add(srcT, src1): <-1.7976931348623157E+308, ∞, -0.19999999999999996, 1> # (FFEFFFFFFFFFFFFF 7FF0000000000000 BFC9999999999998 3FF0000000000000)
Add(srcT, src2): <-1.7976931348623157E+308, ∞, 0.8, 2> # (FFEFFFFFFFFFFFFF 7FF0000000000000 3FE999999999999A 4000000000000000)
AndNot(srcT, src1): <-3.9999999999999996, 2, -4.4501477170144E-309, 0> # (C00FFFFFFFFFFFFF 4000000000000000 8003333333333333 0000000000000000)
AndNot(srcT, src2): <-0.9999999999999999, 1, -1.2, 0> # (BFEFFFFFFFFFFFFF 3FF0000000000000 BFF3333333333333 0000000000000000)
BitwiseAnd(srcT, src1): <0.5, 1, 1, 0> # (3FE0000000000000 3FF0000000000000 3FF0000000000000 0000000000000000)
BitwiseAnd(srcT, src2): <2, 2, 0, 0> # (4000000000000000 4000000000000000 0000000000000000 0000000000000000)
BitwiseOr(srcT, src1): <NaN, ∞, -1.2, 1> # (FFFFFFFFFFFFFFFF 7FF0000000000000 BFF3333333333333 3FF0000000000000)
BitwiseOr(srcT, src2): <-1.7976931348623157E+308, ∞, NaN, 2> # (FFEFFFFFFFFFFFFF 7FF0000000000000 FFF3333333333333 4000000000000000)
...
-- UInt64, Vector<UInt64>.Count=4 --
srcT: <0, 18446744073709551615, 0, 1> # (0000000000000000 FFFFFFFFFFFFFFFF 0000000000000000 0000000000000001)
srcAllOnes: <18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615> # (FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF)
Abs(srcT): <0, 18446744073709551615, 0, 1> # (0000000000000000 FFFFFFFFFFFFFFFF 0000000000000000 0000000000000001)
Abs(srcAllOnes): <18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615> # (FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF FFFFFFFFFFFFFFFF)
Add(srcT, src1): <1, 0, 1, 2> # (0000000000000001 0000000000000000 0000000000000001 0000000000000002)
Add(srcT, src2): <2, 1, 2, 3> # (0000000000000002 0000000000000001 0000000000000002 0000000000000003)
AndNot(srcT, src1): <0, 18446744073709551614, 0, 0> # (0000000000000000 FFFFFFFFFFFFFFFE 0000000000000000 0000000000000000)
AndNot(srcT, src2): <0, 18446744073709551613, 0, 1> # (0000000000000000 FFFFFFFFFFFFFFFD 0000000000000000 0000000000000001)
BitwiseAnd(srcT, src1): <0, 1, 0, 1> # (0000000000000000 0000000000000001 0000000000000000 0000000000000001)
BitwiseAnd(srcT, src2): <0, 2, 0, 0> # (0000000000000000 0000000000000002 0000000000000000 0000000000000000)
BitwiseOr(srcT, src1): <1, 18446744073709551615, 1, 1> # (0000000000000001 FFFFFFFFFFFFFFFF 0000000000000001 0000000000000001)
BitwiseOr(srcT, src2): <2, 18446744073709551615, 2, 3> # (0000000000000002 FFFFFFFFFFFFFFFF 0000000000000002 0000000000000003)
...
ShiftLeft(srcT, 1): <0, 18446744073709551614, 0, 2> # (0000000000000000 FFFFFFFFFFFFFFFE 0000000000000000 0000000000000002)
ShiftLeft(srcT, 63): <0, 9223372036854775808, 0, 9223372036854775808> # (0000000000000000 8000000000000000 0000000000000000 8000000000000000)
ShiftLeft(srcT, 64): <0, 18446744073709551615, 0, 1> # (0000000000000000 FFFFFFFFFFFFFFFF 0000000000000000 0000000000000001)
ShiftLeft(srcT, 65): <0, 18446744073709551614, 0, 2> # (0000000000000000 FFFFFFFFFFFFFFFE 0000000000000000 0000000000000002)
ShiftLeft(srcT, -1): <0, 9223372036854775808, 0, 9223372036854775808> # (0000000000000000 8000000000000000 0000000000000000 8000000000000000)
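In the UInt64 output above, shifting by 64, 65, and -1 gives the same results as shifting by 0, 1, and 63. That is consistent with the shift count being reduced to its low 6 bits, which is the rule C# applies to scalar 64-bit shifts. The sketch below shows only the scalar rule; ShiftCountSketch is a hypothetical name, and this article does not confirm that Vector.ShiftLeft follows the same rule for every element type.

```csharp
// Hypothetical illustration (not in the demo) of C# scalar shift-count masking.
public static class ShiftCountSketch {
    // For a ulong operand, C# uses only the low 6 bits of count (count & 0x3F).
    public static ulong ShiftLeftScalar(ulong value, int count) {
        return value << count;
    }
}
```

So ShiftLeftScalar(1, 64) is 1, ShiftLeftScalar(1, 65) is 2, and ShiftLeftScalar(1, -1) is 0x8000000000000000, mirroring the last element of the three corresponding output lines.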
For the complete test results, run the program.
Source code:
https://github.com/zyl910/BenchmarkVector/tree/main/VectorClassDemo
"Vector<T> Struct" (Microsoft documentation). https://docs.microsoft.com/zh-cn/dotnet/api/system.numerics.vector-1?view=netcore-1.0
"Accelerating floating-point array summation in C# with SIMD vector types (1): using Vector4 and Vector<T>". https://www.cnblogs.com/zyl910/p/dotnet_simd_BenchmarkVector1.html