


Details on regular expressions as a data source

Revision as of 04:42, 16 August 2007

The message batch generator can populate fields by creating text strings that match regular expressions. A regular expression is a compact syntax for describing a certain set of strings. For example, the regular expression cat|dog describes the two strings cat and dog, while the regular expression cats? describes the two strings cat and cats. Here are some examples of the kinds of strings you can generate from regular expressions:

>cat
cat cat cat cat cat 
>cat|dog
cat dog cat dog cat 
>cats?
cat cats cat cats cat 
>'grrr*'
'grr' 'grrr' 'grrrr' 'grrrrr' 'grrrrrr' 
>(mewl)+
mewl mewlmewl mewlmewlmewl mewlmewlmewlmewl mewlmewlmewlmewlmewl 
>hot{3,4}
hottt hotttt hottt hotttt hottt 
>[a-z]
a b c d e 
>[0-4]
0 1 2 3 4 
>[ac-e]
a c d e a 

A preference page, accessed through Window->Preferences->OHF H3Et->Batch Generator->Regex Batch Data Source, sets some required options.


[Image: RegexPreferences.PNG]


The 'regex choice strategy' option determines the order in which strings are generated from the regular expression; that is, how the generator behaves when it encounters a choice in a regular expression, such as alternation ('|') or a quantifier (such as '*' or '{2,3}'). If 'Random' is selected, the choice is made randomly. If 'Increasing' is selected, the first time a string is generated from the regular expression the first available option is taken, the second time the second option, and so on; after the last option it starts over from the first. 'Decreasing' works like 'Increasing', but starts at the last available option and works backwards. Here is a sample of the different behaviors, using the same regular expression to generate 5 strings:

Random:
>a|b|c|d
b d a c d

Increasing:
>a|b|c|d
a b c d a 

Decreasing:
>a|b|c|d
d c b a d
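
The three strategies can be imitated for a single alternation with a few lines of Python. This is only a minimal sketch of the behavior described above, not the generator's actual code: the function name choose, its parameters, and the seed argument are all hypothetical, and a real expression with nested choices would need more bookkeeping.

```python
import random
from itertools import cycle, islice


def choose(options, strategy, count, seed=None):
    """Return `count` picks from `options` under the named strategy (hypothetical helper)."""
    if strategy == "increasing":
        # First option first, wrapping around after the last one.
        return list(islice(cycle(options), count))
    if strategy == "decreasing":
        # Last option first, wrapping around after the first one.
        return list(islice(cycle(reversed(options)), count))
    if strategy == "random":
        rng = random.Random(seed)
        return [rng.choice(options) for _ in range(count)]
    raise ValueError(f"unknown strategy: {strategy}")


options = "a|b|c|d".split("|")
print(" ".join(choose(options, "increasing", 5)))  # a b c d a
print(" ".join(choose(options, "decreasing", 5)))  # d c b a d
print(" ".join(choose(options, "random", 5)))      # varies from run to run
```

The 'increasing' and 'decreasing' lines reproduce the sample output shown above; the 'random' line differs on every run unless a seed is supplied.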

The second option on the preference page, 'Upper bound for infinite closures', puts an upper limit on the number of repetitions produced by unbounded quantifiers like '*' or '+', since these could otherwise generate arbitrarily long strings. For example, with an upper bound of 3, the longest string that could be generated from a regular expression like ab* is abbb.
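
To make the effect of the bound concrete, the following Python sketch lists every string that a capped closure can produce. It assumes the bound is a maximum repetition count, which is consistent with the ab* example above; the function expand_closure and its parameters are hypothetical and are not part of the generator.

```python
def expand_closure(prefix, repeated, minimum, upper_bound):
    """List the strings for prefix + repeated{minimum,upper_bound} (hypothetical helper)."""
    return [prefix + repeated * n for n in range(minimum, upper_bound + 1)]


# ab* with an upper bound of 3: zero to three copies of 'b'.
print(expand_closure("a", "b", 0, 3))  # ['a', 'ab', 'abb', 'abbb']

# ab+ with the same bound: at least one copy of 'b'.
print(expand_closure("a", "b", 1, 3))  # ['ab', 'abb', 'abbb']
```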

Finally, when regular expressions are used as a source of data, the message batch generator requires the total number of files to be limited: on the final page of the wizard, you must select a maximum number of files to create instead of opting to use the entire data source.
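
One plausible reason for this requirement is that a regular expression source starts over after its last choice and so never runs out of values. The Python sketch below illustrates that idea under this assumption; generated_values and max_files are hypothetical names standing in for the data source and the wizard setting.

```python
from itertools import count, islice


def generated_values(options):
    """Endless stream of values, cycling through `options` in order (hypothetical)."""
    for i in count():
        yield options[i % len(options)]


max_files = 5  # hypothetical stand-in for the wizard's maximum number of files
batch = list(islice(generated_values(["cat", "dog"]), max_files))
print(batch)  # ['cat', 'dog', 'cat', 'dog', 'cat']
```

Without the islice cap, consuming the whole stream would never finish, which is why an explicit maximum is needed in this sketch.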


Reference

The generator supports the following special operators for generating sample strings:

Expression   Description
()           groups the expressions inside the parentheses
?            0 or 1 of the preceding expression
*            0 or more of the preceding expression
+            1 or more of the preceding expression
{n,}         n or more of the preceding expression
{n,m}        between n and m of the preceding expression (n must not be greater than m)
[xyz]        any character inside the brackets
[a-n]        any character in the range
\            treat whatever comes next as an ordinary character, and not as a special operator
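
The operators above have the same meaning in most regular expression engines, so sample strings can be sanity-checked outside the tool. The sketch below uses Python's standard re module, which is a different engine from the one in the generator, to confirm that strings like those in the examples on this page match their expressions; the helper all_match is a hypothetical name.

```python
import re

# Expressions paired with strings they should be able to generate.
samples = {
    r"cat|dog": ["cat", "dog"],
    r"cats?": ["cat", "cats"],
    r"grrr*": ["grr", "grrr", "grrrr"],
    r"(mewl)+": ["mewl", "mewlmewl"],
    r"hot{3,4}": ["hottt", "hotttt"],
    r"[ac-e]": ["a", "c", "d", "e"],
    r"a\+b": ["a+b"],  # backslash makes '+' an ordinary character
}


def all_match(pattern, strings):
    """True if every string matches the whole pattern (hypothetical helper)."""
    return all(re.fullmatch(pattern, s) is not None for s in strings)


for pattern, strings in samples.items():
    print(pattern, all_match(pattern, strings))
```

Each line should print True; a string outside the described set, such as catss for cats?, fails the same check.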

Copyright © Eclipse Foundation, Inc. All Rights Reserved.