I know that with Xeger we can get a random value matching a specified pattern.
String regex = "[0-9]{2}";
Xeger generator = new Xeger(regex);
String result = generator.generate();
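I want to know whether there is a way to return all the valid strings for a specified regex. For example, for the pattern [0-9]{2}, we could get all the values from 00 to 99.
Thanks
Edit:
Here we are not considering infinite outputs such as + and *; how can we get all the values of a finite regex?
Final edit:
Thanks, everyone! In the end I did not generate all possible values, since there can be thousands of them. I cap the number of generated values at a specific limit to reduce the volume.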
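A minimal sketch of that capped enumeration in plain Java, assuming the pattern has already been expanded by hand into one alphabet per position (so [0-9]{2} becomes two copies of the digits). BoundedEnumerator and enumerate are hypothetical names used only for illustration; they are not part of Xeger:

```java
import java.util.ArrayList;
import java.util.List;

final class BoundedEnumerator {
    /**
     * Builds strings by picking one character from each position's alphabet,
     * in order, and stops as soon as `limit` strings have been collected.
     */
    static List<String> enumerate(List<String> alphabets, int limit) {
        List<String> out = new ArrayList<>();
        build(alphabets, 0, new StringBuilder(), limit, out);
        return out;
    }

    private static void build(List<String> alphabets, int pos, StringBuilder prefix,
                              int limit, List<String> out) {
        if (out.size() >= limit) {
            return; // cap reached, stop exploring
        }
        if (pos == alphabets.size()) {
            out.add(prefix.toString()); // every position has been filled
            return;
        }
        String alphabet = alphabets.get(pos);
        for (int i = 0; i < alphabet.length() && out.size() < limit; i++) {
            prefix.append(alphabet.charAt(i));
            build(alphabets, pos + 1, prefix, limit, out);
            prefix.setLength(prefix.length() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        String digits = "0123456789";
        // [0-9]{2} expanded by hand into two digit positions, capped at 5 values
        System.out.println(enumerate(List.of(digits, digits), 5));
    }
}
```

With a limit of 100 or more, the same call yields every value from 00 to 99; a smaller limit simply truncates the list.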
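Best answer
Since a regular expression is defined by a finite state machine, I wondered whether there was something able to reason automatically over such machines that would be a good fit to repurpose for this job... and clojure.core.logic delivered.
So I looked at this definition of the regexp grammar (unfortunately it lacks the {} quantifiers, but they should be fairly easy to add to my code), adapted it to Java escapes, and worked out this 110-line Clojure program: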
(ns regexp-unfolder.core
  (:require [instaparse.core :as insta])
  (:require [clojure.core.logic :as l])
  (:require [clojure.set :refer [union difference]])
  (:gen-class :methods [#^{:static true} [unfold [String] clojure.lang.LazySeq]]))
(def parse-regexp (insta/parser
"re = union | simple-re?
union = re '|' simple-re
simple-re = concat | base-re
concat = simple-re base-re
base-re = elementary-re | star | plus
star = elementary-re '*'
plus = elementary-re '+'
elementary-re = group | char | '$' | any | set
any = '.'
group = '(' re ')'
set = positive-set | negative-set
positive-set = '[' set-items ']'
negative-set = '[^' set-items ']'
set-items = set-item*
set-item = range | char
range = char '-' char
char = #'[^\\\\\\-\\[\\]]|\\.'" ))
(def printables (set (map char (range 32 127))))
(declare fns handle-first)
;; Each handler returns a list of [state symbols next-state] triplets.
(defn handle-tree [q qto [type & nodes]]
  (if (nil? nodes)
    [[q [""] qto]]
    ((fns type handle-first) q qto nodes)))
(defn star [q qto node & _]
  (cons [q [""] qto]
        (handle-tree q q (first node))))
(defn plus [q qto node & _]
  (concat (handle-tree q qto (first node))
          (handle-tree qto qto (first node))))
(defn any-char [q qto & _] [[q (vec printables) qto]])
(defn char-range [[c1 _ c2]]
  (let [extract-char (comp int first seq second)]
    (set (map char (range (extract-char c1) (inc (extract-char c2)))))))
(defn items [nodes]
  (union (mapcat
           (fn [[_ [type & ns]]]
             (if (= type :char)
               #{(first ns)}
               (char-range ns)))
           (rest (second nodes)))))
(defn handle-set [q qto node & _] [[q (vec (items node)) qto]])
(defn handle-negset [q qto node & _] [[q (vec (difference printables (items node))) qto]])
(defn handle-range [q qto & nodes] [[q (vec (char-range nodes)) qto]])
(defn handle-char [q qto node & _] [[q (vec node) qto]])
(defn handle-concat [q qto nodes]
  (let [syms (for [x (rest nodes)] (gensym q))]
    (mapcat handle-tree (cons q syms) (concat syms [qto]) nodes)))
(defn handle-first [q qto [node & _]] (handle-tree q qto node))
(def fns {:concat handle-concat, :star star, :plus plus, :any any-char, :positive-set handle-set, :negative-set handle-negset, :char handle-char})
(l/defne transition-membero
  [state trans newstate otransition]
  ([_ _ _ [state trans-set newstate]]
   (l/membero trans trans-set)))
(defn transitiono [state trans newstate transitions]
  (l/conde
    [(l/fresh [f]
       (l/firsto transitions f)
       (transition-membero state trans newstate f))]
    [(l/fresh [r]
       (l/resto transitions r)
       (transitiono state trans newstate r))]))
(declare transitions)
;; Recognize a regexp finite state machine encoded in triplets [state, transition, next-state], adapted from a snippet made by Peteris Erins
(defn recognizeo
  ([input]
   (recognizeo 'q0 input))
  ([q input]
   (l/matche [input] ; start pattern matching on the input
     (['("")]
      (l/== q 'ok)) ; accept the empty string if we are in an accepting state
     ([[i . nput]]
      (l/fresh [qto]
        (transitiono q i qto transitions) ; assert it must be what we transition to qto from q with input symbol i
        (recognizeo qto nput)))))) ; recognize the remainder
(defn -unfold [regex]
  (def transitions
    (handle-tree 'q0 'ok (parse-regexp regex)))
  (map (partial apply str) (l/run* [q] (recognizeo q))))
Since it is written with core.logic, it should be fairly easy to adapt it to also work as a regexp matcher.
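For readers who do not use Clojure, here is a rough Java sketch of the same idea: a machine encoded as [state, symbols, next-state] triplets, walked from q0 until the accepting state ok is reached. It swaps core.logic's relational search for a plain breadth-first walk, so it is not a port of the program above. TransitionWalker, Transition, and the intermediate state names s1 and s2 are hypothetical, and the sketch assumes the machine has no cycles made only of empty-symbol transitions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TransitionWalker {
    /** One triplet: from --(any symbol in symbols)--> to. The symbol "" consumes nothing. */
    static final class Transition {
        final String from;
        final List<String> symbols;
        final String to;

        Transition(String from, List<String> symbols, String to) {
            this.from = from;
            this.symbols = symbols;
            this.to = to;
        }
    }

    /** Breadth-first walk from start; collects the text of paths reaching accept, up to limit results. */
    static List<String> unfold(List<Transition> transitions, String start, String accept, int limit) {
        List<String> results = new ArrayList<>();
        Deque<String[]> queue = new ArrayDeque<>(); // each entry is {state, text so far}
        queue.add(new String[] {start, ""});
        while (!queue.isEmpty() && results.size() < limit) {
            String[] current = queue.poll();
            if (current[0].equals(accept)) {
                results.add(current[1]);
            }
            for (Transition t : transitions) {
                if (!t.from.equals(current[0])) {
                    continue;
                }
                for (String symbol : t.symbols) {
                    queue.add(new String[] {t.to, current[1] + symbol});
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Hand-built machine for "ba[rz]"
        List<Transition> machine = List.of(
                new Transition("q0", List.of("b"), "s1"),
                new Transition("s1", List.of("a"), "s2"),
                new Transition("s2", List.of("r", "z"), "ok"));
        System.out.println(unfold(machine, "q0", "ok", 10));
    }
}
```

Because the walk is breadth-first and capped by limit, it also terminates on machines with loops, such as one built for a+b+, returning shorter strings first.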
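I limited the printable characters to ASCII 32 through 126, otherwise regexps such as [^c] would be too cumbersome to deal with, but you can extend that quite easily... Also, I haven't implemented unions, optional patterns, or the \w, \s, etc. escapes for character classes yet.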
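To make the effect of that restriction concrete, here is a small Java sketch of how a negated set can be computed as the complement within ASCII 32 through 126. PrintableSets and its methods are hypothetical names for illustration; the Clojure program does the equivalent with clojure.set/difference.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class PrintableSets {
    /** The universe used for negation: ASCII 32 (space) through 126 (~), inclusive. */
    static Set<Character> printables() {
        Set<Character> all = new LinkedHashSet<>();
        for (char c = 32; c <= 126; c++) {
            all.add(c);
        }
        return all;
    }

    /** All characters from lo to hi, inclusive, like the range a-z inside a set. */
    static Set<Character> range(char lo, char hi) {
        Set<Character> chars = new LinkedHashSet<>();
        for (char c = lo; c <= hi; c++) {
            chars.add(c);
        }
        return chars;
    }

    /** Complement of excluded within the printable universe, like [^...]. */
    static Set<Character> negate(Set<Character> excluded) {
        Set<Character> result = printables();
        result.removeAll(excluded);
        return result;
    }

    public static void main(String[] args) {
        // [^A-z]: everything printable outside the code points 65..122
        System.out.println(negate(range('A', 'z')));
    }
}
```

Without a bounded universe, the complement of a set would cover the whole character range, which is why the negated sets are only practical with a restriction like this one.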
This is the biggest thing I have written in Clojure so far, but it seems to cover the basics... some examples:
regexp-unfolder.core=> (-unfold "ba[rz]")
("bar" "baz")
regexp-unfolder.core=> (-unfold "[a-z3-7]")
("a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" "3" "4" "5" "6" "7")
regexp-unfolder.core=> (-unfold "[a-z3-7][01]")
("a0" "a1" "b0" "b1" "c0" "c1" "d0" "d1" "e0" "e1" "f0" "f1" "g0" "g1" "h0" "h1" "i0" "i1" "j0" "j1" "k0" "k1" "l0" "l1" "m0" "m1" "n0" "n1" "o0" "o1" "p0" "p1" "q0" "q1" "r0" "r1" "s0" "s1" "t0" "t1" "u0" "u1" "v0" "v1" "w0" "w1" "x0" "x1" "y0" "y1" "z0" "z1" "30" "31" "40" "41" "50" "51" "60" "70" "61" "71")
regexp-unfolder.core=> (-unfold "[^A-z]")
(" " "@" "!" "\"" "#" "$" "%" "&" "'" "(" ")" "*" "+" "," "-" "." "/" "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" ":" ";" "{" "<" "|" "=" "}" ">" "~" "?")
regexp-unfolder.core=> (take 20 (-unfold "[abc]*"))
("" "a" "b" "c" "aa" "ab" "ac" "ba" "ca" "aaa" "bb" "cb" "aab" "bc" "cc" "aac" "aba" "aca" "baa" "caa")
regexp-unfolder.core=> (take 20 (-unfold "a+b+"))
("ab" "aab" "abb" "abbb" "aaab" "abbbb" "aabb" "abbbbb" "abbbbbb" "aabbb" "abbbbbbb" "abbbbbbbb" "aaaab" "aabbbb" "aaabb" "abbbbbbbbb" "abbbbbbbbbb" "aabbbbb" "abbbbbbbbbbb" "abbbbbbbbbbbb")
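Since I started out this way, I implemented infinite outputs as well :)
If anyone is interested, I uploaded it here
And of course, here is an example of how to call unfold from plain old Java: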
import static regexp_unfolder.core.unfold;
public class UnfolderExample {
    public static void main(String[] args) {
        @SuppressWarnings("unchecked")
        Iterable<String> strings = unfold("a+b+");
        for (String s : strings) {
            System.out.println(s);
        }
    }
}
Regarding java - generating all valid values for a regular expression, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15950113/