
Hi

my @groups = split /\n\n/, $body;

runs the risk that the "\n\n" separator may not exist, or may be "\n", "\n\n\n", etc.

So instead of relying on the whitespace separator, I am thinking of matching each paragraph that sits between the separators.

I want each item in @groups to contain the following:
The first line contains 3 letters, maybe followed by "/", followed by 3 letters. There may be other text on this line, either before or after the pattern just stated, followed by end of line.
This may be followed by another line just like the first, followed by end of line.

This regex needs to run on ActivePerl.
thx

asked 12/17/2011 05:19


samj ♦♦


8 Answers:
risk of "\n\n" may not exist or be "\n" or be "\n\n\n" or ...etc

Would /\n+/ satisfy this concern?
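One caveat: /\n+/ also splits on every single newline, so a two-line group would be broken into two items. A minimal sketch of an alternative, assuming the groups are separated by at least one blank (or whitespace-only) line; the $body below is made-up sample data, not the asker's:

```perl
use strict;
use warnings;

# Hypothetical sample data: groups separated by one or more blank
# (or whitespace-only) lines.
my $body = "abc/def first\nghi/jkl second\n\nmno/pqr third\n\n\n \nstu/vwx fourth\n";

# Split on a newline followed by one or more blank lines, so that
# "\n\n", "\n\n\n", and "\n \n" all count as a single separator.
# The grep drops any empty leading item.
my @groups = grep { /\S/ } split /\n(?:[ \t]*\n)+/, $body;

print scalar(@groups), " groups\n";
print "[$_]\n" for @groups;
```

This still cannot help when the separator is a single "\n", because nothing then distinguishes a separator from an ordinary line ending. In that case matching the lines themselves, as discussed further down, is the safer route.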


answered


ozo

First line contains 3 letters, maybe followed by "/", followed by 3 letters, there maybe other string on this line
either before or after the pattern just stated, followed by end of line.
maybe followed by another line just like the first, followed by end of line.

my @groups = $body =~ /(?:.*\w{3}\/\w.*\n){1,2}/g;

answered 2011-12-17 at 13:28:21


ozo

maybe followed by "/",

answered 2011-12-17 at 13:37:17


samj

\/\w

maybe followed by \n
followed by 3 letters, not more than 3.

answered 2011-12-17 at 14:10:53


samj

ozo.

I tried to fix your regex, to no avail:

my @groups =  $body =~ /(?:.*\w{3}\/?\w(3).*\n){1,2}/g;

maybe you can.
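In the attempt above, \w(3) matches one word character followed by a literal "3"; a repetition count is written with braces, as in \w{3}. A sketch of one way to express the stated requirements, assuming "letters" means ASCII letters and that "not more than 3" means the pattern must not sit inside a longer run of letters. The sample data and the $pair variable are hypothetical, since no real input was posted:

```perl
use strict;
use warnings;

# Hypothetical sample data; the real input was not posted in the thread.
my $body = <<'END';
title
abc/def 1.2345
ghi/jkl 2.3456
some other text
mnopqr 3.4567
abcd/efgh not a match
END

# Three letters, an optional "/", three letters, not embedded in a
# longer run of letters (the lookarounds enforce "not more than 3").
my $pair = qr{(?<![A-Za-z])[A-Za-z]{3}/?[A-Za-z]{3}(?![A-Za-z])};

# One or two consecutive lines that each contain the pattern.
# There are no capturing groups, so /g returns each whole match.
my @groups = $body =~ /(?:^.*$pair.*\n){1,2}/mg;

print "[$_]" for @groups;
```

Because the "/" is optional, any standalone six-letter word also satisfies the pattern, so real data may need a stricter rule. The final line must also end in "\n" to be matched.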

answered 2011-12-17 at 14:15:25


samj

Please post an example of the data you are working with. It doesn't have to be the exact data, but should mirror the structure. Also post what you expect the output to look like.

answered 2011-12-17 at 14:28:08


kaufmed

The ":" after the "(?": is it literal?
If not, what is it?

thx

answered 2011-12-17 at 15:16:25


samj

perldoc perlre
...
       Extended Patterns

       Perl also defines a consistent extension syntax for features not found
       in standard tools like awk and lex.  The syntax is a pair of
       parentheses with a question mark as the first thing within the
       parentheses.  The character after the question mark indicates the
       extension.
...
       "(?:pattern)"
       "(?imsx-imsx:pattern)"
                 This is for clustering, not capturing; it groups
                 subexpressions like "()", but doesn't make backreferences as
                 "()" does.  So
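In practice, the difference shows up in what a match returns. A small sketch with made-up input, comparing a capturing group with the non-capturing "(?:" form in list context:

```perl
use strict;
use warnings;

# Hypothetical input: two lines of the "abc/def" shape.
my $text = "abc/def\nghi/jkl\n";

# Capturing group: in list context, /g returns what each "()" captured.
my @captured = $text =~ /(\w{3})\/\w{3}\n/g;

# Non-capturing group: there are no captures, so /g returns each
# whole match instead.
my @whole = $text =~ /(?:\w{3})\/\w{3}\n/g;

print "captured: @captured\n";
print scalar(@whole), " whole matches\n";
```

This is why the regex earlier in the thread uses "(?:...)" around the repeated line: the grouping is needed for the {1,2} count, but a plain "()" would make the match return only the captured piece rather than the full one- or two-line group.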

answered 2011-12-18 at 12:08:48


ozo


Seen: 320 times

Last updated: 12/17/2011 06:09