ES Aggregation Statistics

hdn_kb 2024-01-06

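1. Counting the total after de-duplicating on a combination of fields

Note: this requires Elasticsearch 7.x. The runtime_mappings section used below is supported in search requests from 7.11 onward.

Example: count the number of distinct interfaces, where an interface is identified by class name + method name. Every document has a class name and a method name; the same pair appears in many documents, and the data contains many different pairs. The goal is to count each distinct class name + method name pair exactly once across all documents.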

{
  "query": {
    "bool": {
      "filter": [
        {
          "wildcard": {
            "systemCode.keyword": {
              "wildcard": "hdn-test",
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "aggregations": {
    "interface_count": {
      "cardinality": {
        "field": "className_and_methodName"
      }
    }
  },
  "runtime_mappings": {
    "className_and_methodName": {
      "type": "keyword",
      "script": "emit(doc['className.keyword'] + ' ' + doc['methodName.keyword']);"
    }
  }
}

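Java implementation of the cardinality DSL above:

    // Imports used by this snippet (Elasticsearch 7.x high-level REST client)
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.index.query.BoolQueryBuilder;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.metrics.CardinalityAggregationBuilder;
    import org.elasticsearch.search.aggregations.metrics.ParsedCardinality;
    import org.elasticsearch.search.builder.SearchSourceBuilder;

    /**
     * Counts the distinct interfaces (className + methodName) for a systemCode.
     *
     * @param systemCode system code to filter on
     * @return number of distinct className + methodName combinations
     * @throws IOException if the search request fails
     */
    @Override
    public Long queryServerCountByAgentId(String systemCode) throws IOException {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Build the query conditions
        SysLogRequestParamDTO sysLogRequestParamDTO = new SysLogRequestParamDTO();
        sysLogRequestParamDTO.setSystemCode(systemCode);
        BoolQueryBuilder queryBuilder = getQueryBuilder(sysLogRequestParamDTO);
        searchSourceBuilder.query(queryBuilder);

        // Count using className + methodName as the unique key
        return getCountByClassNameAndMethodName(searchSourceBuilder);
    }

    private Long getCountByClassNameAndMethodName(SearchSourceBuilder searchSourceBuilder) throws IOException {
        // Replace INDEX_NAME with the name of your index
        SearchRequest searchRequest = new SearchRequest("INDEX_NAME");

        // runtime_mappings: a keyword field that joins className and methodName
        Map<String, Object> runtimeMappings = new HashMap<>();
        Map<String, Object> classNameAndMethodName = new HashMap<>();
        String runtimeScript = "emit(doc['className.keyword'] + ' ' + doc['methodName.keyword']);";
        classNameAndMethodName.put("script", runtimeScript);
        classNameAndMethodName.put("type", "keyword");
        runtimeMappings.put("className_and_methodName", classNameAndMethodName);
        searchSourceBuilder.runtimeMappings(runtimeMappings);

        // aggregations: cardinality (distinct count) on the runtime field
        CardinalityAggregationBuilder cardinalityAggregationBuilder =
            AggregationBuilders.cardinality("interface_count").field("className_and_methodName");
        searchSourceBuilder.aggregation(cardinalityAggregationBuilder);

        searchRequest.source(searchSourceBuilder);
        log.info("SysLogStatisticsServiceImpl->invokeCountStatistics->DSL:{}", searchRequest.source());

        // Execute the search
        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // Read the result
        ParsedCardinality interfaceCount = response.getAggregations().get("interface_count");
        return interfaceCount.getValue();
    }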
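
As a cross-check on a small data sample, the value that the cardinality aggregation estimates can be computed exactly in plain Java. The sketch below is illustrative only: the class DistinctInterfaceCount and its row layout (a two-element array per document) are assumptions, not part of the original service. Keep in mind that Elasticsearch's cardinality aggregation is an approximate count (it is based on HyperLogLog++ and can be tuned with precision_threshold), so on large indexes its result may differ slightly from an exact count like this one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Exact, in-memory counterpart of a distinct count on className + methodName. */
final class DistinctInterfaceCount {

    /**
     * Counts distinct (className, methodName) pairs.
     * Each row is a two-element array: {className, methodName}.
     */
    static long countDistinct(List<String[]> rows) {
        Set<String> keys = new HashSet<>();
        for (String[] row : rows) {
            // Same idea as the runtime field: join both values into one key.
            // A space is a safe separator because Java identifiers cannot contain one.
            keys.add(row[0] + ' ' + row[1]);
        }
        return keys.size();
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"UserService", "login"},
            new String[] {"UserService", "login"},   // duplicate pair, counted once
            new String[] {"UserService", "logout"},
            new String[] {"OrderService", "login"});
        System.out.println(countDistinct(rows)); // prints 3
    }
}
```

The same method name under a different class counts as a separate interface, which is why the key has to combine both fields rather than use either one alone.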

2. Daily average over the last 15 days

{
  "query": {
    "bool": {
      "filter": [
        {
          "wildcard": {
            "systemCode.keyword": {
              "wildcard": "hdn-test",
              "boost": 1.0
            }
          }
        },
        {
          "range": {
            "createdDate": {
              "from": "2022-11-17",
              "to": "2022-12-02",
              "include_lower": true,
              "include_upper": true,
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "aggregations": {
    "createdDateGroup": {
      "date_histogram": {
        "field": "createdDate",
        "format": "yyyy-MM-dd",
        "fixed_interval": "1d",
        "offset": 0,
        "order": {
          "_key": "asc"
        },
        "keyed": false,
        "min_doc_count": 0
      },
      "aggregations": {
        "average_of_executeTime": {
          "avg": {
            "field": "executeTime"
          }
        }
      }
    }
  }
}
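
Conceptually, the date_histogram with an avg sub-aggregation groups documents by calendar day and averages executeTime within each group. A minimal plain-Java sketch of that grouping follows; the LogEntry class and the DailyAverageSketch name are hypothetical and exist only for this illustration. Unlike the DSL with min_doc_count set to 0, this sketch produces no entry for days that have no documents.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class DailyAverageSketch {

    /** Hypothetical log document: creation day and execution time. */
    static final class LogEntry {
        final LocalDate createdDate;
        final long executeTime;

        LogEntry(LocalDate createdDate, long executeTime) {
            this.createdDate = createdDate;
            this.executeTime = executeTime;
        }
    }

    /** Groups entries by day (ascending, like order _key asc) and averages executeTime per day. */
    static Map<LocalDate, Double> dailyAverage(List<LogEntry> entries) {
        return entries.stream().collect(Collectors.groupingBy(
            e -> e.createdDate,
            TreeMap::new,
            Collectors.averagingLong(e -> e.executeTime)));
    }

    public static void main(String[] args) {
        List<LogEntry> entries = Arrays.asList(
            new LogEntry(LocalDate.of(2022, 11, 17), 100),
            new LogEntry(LocalDate.of(2022, 11, 17), 300),
            new LogEntry(LocalDate.of(2022, 11, 18), 50));
        // One line per day: the day and its average executeTime
        dailyAverage(entries).forEach((day, avg) -> System.out.println(day + " " + avg));
    }
}
```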

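Java implementation of the date_histogram DSL above:

    // ES imports used in this section (7.x high-level REST client);
    // JSONObject comes from the JSON library already used by the project.
    import org.elasticsearch.search.aggregations.Aggregation;
    import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregationBuilder;
    import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;
    import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
    import org.elasticsearch.search.aggregations.metrics.AvgAggregationBuilder;
    import org.elasticsearch.search.aggregations.metrics.ParsedAvg;

    /**
     * Daily average execution time.
     *
     * @param searchSourceBuilder source builder that already carries the query conditions
     * @return one JSON object per day bucket: createDate, doc_count, avgExeTime
     */
    private List<JSONObject> getDayAvgExecuteTime(SearchSourceBuilder searchSourceBuilder) throws IOException {
        List<JSONObject> list = new ArrayList<>();

        SearchRequest searchRequest = new SearchRequest(comOperateLogIndex);

        // Sub-aggregation: average of executeTime
        AvgAggregationBuilder executeTimeAvgAggregationBuilder =
            AggregationBuilders.avg("average_of_executeTime").field("executeTime");

        // date_histogram: one bucket per day
        DateHistogramAggregationBuilder dateHistogramAggregationBuilder =
            AggregationBuilders.dateHistogram("createdDateGroup").field("createdDate")
                .fixedInterval(DateHistogramInterval.DAY).format("yyyy-MM-dd")
                .subAggregation(executeTimeAvgAggregationBuilder);

        searchSourceBuilder.aggregation(dateHistogramAggregationBuilder);
        searchRequest.source(searchSourceBuilder);

        log.info("SysLogStatisticsServiceImpl->invokeCountStatistics->DSL:{}", searchRequest.source());

        // Execute the search
        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // Read the result
        Histogram createdDateGroup = response.getAggregations().get("createdDateGroup");
        for (Histogram.Bucket bucket : createdDateGroup.getBuckets()) {
            JSONObject jsonObject = new JSONObject();
            String createDate = bucket.getKeyAsString();
            jsonObject.put("createDate", createDate);            // day, formatted yyyy-MM-dd
            jsonObject.put("doc_count", bucket.getDocCount());   // number of documents in the bucket
            Aggregation average_of_executeTime = bucket.getAggregations().get("average_of_executeTime");
            if (average_of_executeTime instanceof ParsedAvg) {
                ParsedAvg avg = (ParsedAvg)average_of_executeTime;
                jsonObject.put("avgExeTime", avg.getValue());
            }
            list.add(jsonObject);
        }
        return list;
    }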
    
    public List<DayAvgExecuteTimeVO> getDayAvgExecuteTime(String systemCode) throws IOException {
        List<DayAvgExecuteTimeVO> list = new ArrayList<>();

        SearchSourceBuilder searchSourceBuilder = buildCondition(systemCode);

        // Daily average execution time, one entry per histogram bucket
        List<JSONObject> dayAvgExecuteTimeList = getDayAvgExecuteTime(searchSourceBuilder);

        DateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        // 15L forces long arithmetic, so a longer window cannot overflow int
        String startTime = format.format(new Date(System.currentTimeMillis() - 15L * 24 * 60 * 60 * 1000));
        String endTime = format.format(new Date());
        Date dStart;
        Date dEnd;
        try {
            // Round-trip through the formatter to truncate both dates to midnight
            dStart = format.parse(startTime);
            dEnd = format.parse(endTime);
        } catch (ParseException e) {
            // Fail loudly instead of passing null dates on to findDates
            throw new IllegalStateException(e);
        }

        List<Date> dateList = DateUtil.findDates(dStart, dEnd);
        for (Date date : dateList) {
            String day = format.format(date);
            DayAvgExecuteTimeVO dayAvgExecuteTimeVO = new DayAvgExecuteTimeVO();
            dayAvgExecuteTimeVO.setDate(day);
            // Default to 0.0; overwrite only when a bucket for this day exists
            dayAvgExecuteTimeVO.setDayAvgExecuteTime(0.0);
            for (JSONObject avgTime : dayAvgExecuteTimeList) {
                if (StringUtils.equals(day, avgTime.getString("createDate"))) {
                    dayAvgExecuteTimeVO.setDayAvgExecuteTime(avgTime.getDouble("avgExeTime"));
                    break;
                }
            }
            list.add(dayAvgExecuteTimeVO);
        }
        return list;
    }

    /**
     * Builds the query conditions.
     *
     * @param systemCode system code to filter on
     * @return source builder with the systemCode filter and the 15-day date range applied
     */
    private SearchSourceBuilder buildCondition(String systemCode) {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Build the query conditions
        SysLogRequestParamDTO sysLogRequestParamDTO = new SysLogRequestParamDTO();
        sysLogRequestParamDTO.setSystemCode(systemCode);
        BoolQueryBuilder queryBuilder = getQueryBuilder(sysLogRequestParamDTO);

        // Restrict to the last 15 days (both ends inclusive)
        DateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        String startTime = format.format(new Date(System.currentTimeMillis() - 15L * 24 * 60 * 60 * 1000));
        String endTime = format.format(new Date());
        queryBuilder.filter(QueryBuilders.rangeQuery("createdDate").gte(startTime).lte(endTime));

        searchSourceBuilder.query(queryBuilder);

        return searchSourceBuilder;
    }

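    /**
     * Returns every date in a time span, one entry per day.
     *
     * @param dStart start date (included)
     * @param dEnd   end date (included when both dates are truncated to midnight)
     * @return the list of dates from dStart to dEnd
     */
    public static List<Date> findDates(Date dStart, Date dEnd) {
        Calendar cStart = Calendar.getInstance();
        cStart.setTime(dStart);

        List<Date> dateList = new ArrayList<>();
        // Add the start date itself first
        dateList.add(dStart);
        // Keep going while the end date is still after the current date
        while (dEnd.after(cStart.getTime())) {
            // Advance the calendar by one day, following calendar rules
            cStart.add(Calendar.DAY_OF_MONTH, 1);
            dateList.add(cStart.getTime());
        }
        return dateList;
    }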
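
The merge step above (list every day in the window, then fill in 0.0 for days that have no bucket) can also be sketched with java.time, which avoids the Date/SimpleDateFormat round trip. This is a minimal illustration under stated assumptions: the class FillMissingDays, its method names, and the Map keyed by yyyy-MM-dd strings are hypothetical stand-ins for DateUtil.findDates and the JSONObject list, not the author's code.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FillMissingDays {

    /** Returns every day from start to end, both inclusive. */
    static List<LocalDate> daysBetween(LocalDate start, LocalDate end) {
        List<LocalDate> days = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            days.add(d);
        }
        return days;
    }

    /** One value per day in the window; days without a bucket get 0.0. */
    static List<Double> fill(LocalDate start, LocalDate end, Map<String, Double> avgByDay) {
        List<Double> out = new ArrayList<>();
        for (LocalDate d : daysBetween(start, end)) {
            // LocalDate.toString() yields ISO yyyy-MM-dd, matching the histogram key format
            out.add(avgByDay.getOrDefault(d.toString(), 0.0));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> avgByDay = new HashMap<>();
        avgByDay.put("2022-11-18", 12.5);
        System.out.println(fill(LocalDate.of(2022, 11, 17), LocalDate.of(2022, 11, 19), avgByDay));
        // prints [0.0, 12.5, 0.0]
    }
}
```

Looking each day up in a map keeps the merge linear in the number of days, instead of scanning the whole bucket list once per day.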

3. Overall average over the last 15 days

{
  "query": {
    "bool": {
      "filter": [
        {
          "wildcard": {
            "systemCode.keyword": {
              "wildcard": "hdn-test",
              "boost": 1.0
            }
          }
        },
        {
          "range": {
            "createdDate": {
              "from": "2022-11-17",
              "to": "2022-12-02",
              "include_lower": true,
              "include_upper": true,
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "aggregations": {
    "average_of_executeTime": {
      "avg": {
        "field": "executeTime"
      }
    }
  }
}

Java implementation of the avg DSL above:

    public Double getAvgExecuteTime(String systemCode) throws IOException {
        // Build the query conditions (last 15 days for the given systemCode)
        SearchSourceBuilder searchSourceBuilder = buildCondition(systemCode);
        // Overall average execution time
        JSONObject avgExecuteTime = getAvgExecuteTime(searchSourceBuilder);
        return avgExecuteTime.getDouble("avgExeTime");
    }

    // buildCondition(String systemCode) is the same helper as in section 2.

    /**
     * Computes the overall average.
     *
     * @param searchSourceBuilder source builder that already carries the query conditions
     * @return JSON object with the key avgExeTime
     */
    private JSONObject getAvgExecuteTime(SearchSourceBuilder searchSourceBuilder) throws IOException {
        SearchRequest searchRequest = new SearchRequest(comOperateLogIndex);

        // Aggregation: average of executeTime over all matching documents
        AvgAggregationBuilder executeTimeAvgAggregationBuilder =
            AggregationBuilders.avg("average_of_executeTime").field("executeTime");
        searchSourceBuilder.aggregation(executeTimeAvgAggregationBuilder);
        searchRequest.source(searchSourceBuilder);

        log.info("SysLogStatisticsServiceImpl->invokeCountStatistics->DSL:{}", searchRequest.source());

        // Execute the search
        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // Read the result
        JSONObject jsonObject = new JSONObject();
        Aggregation average_of_executeTime = response.getAggregations().get("average_of_executeTime");
        if (average_of_executeTime instanceof ParsedAvg) {
            ParsedAvg avg = (ParsedAvg)average_of_executeTime;
            jsonObject.put("avgExeTime", avg.getValue());
        }
        return jsonObject;
    }
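
The overall average computed here is generally not the same as the mean of the daily averages from section 2: the two agree only when every day has the same number of documents. To recover the overall value from daily buckets, each daily average has to be weighted by that day's doc_count. The sketch below shows the arithmetic in plain Java; the class AverageOfAverages and its array-based inputs are assumptions made for the illustration.

```java
final class AverageOfAverages {

    /** Overall average recovered from daily buckets: sum(avg_i * count_i) / sum(count_i). */
    static double overallAverage(double[] dailyAvg, long[] dailyCount) {
        double weightedSum = 0.0;
        long total = 0;
        for (int i = 0; i < dailyAvg.length; i++) {
            weightedSum += dailyAvg[i] * dailyCount[i];
            total += dailyCount[i];
        }
        // No documents at all: report 0.0 rather than divide by zero
        return total == 0 ? 0.0 : weightedSum / total;
    }

    /** Unweighted mean of the daily averages, which treats busy and quiet days alike. */
    static double meanOfDailyAverages(double[] dailyAvg) {
        if (dailyAvg.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double avg : dailyAvg) {
            sum += avg;
        }
        return sum / dailyAvg.length;
    }

    public static void main(String[] args) {
        double[] dailyAvg = {100.0, 200.0};
        long[] dailyCount = {1, 3};
        System.out.println(overallAverage(dailyAvg, dailyCount)); // prints 175.0
        System.out.println(meanOfDailyAverages(dailyAvg));        // prints 150.0
    }
}
```

Running a separate avg aggregation, as this section does, sidesteps the weighting entirely because Elasticsearch averages over the matching documents directly.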
