Clicky

I am working on my computer project, and I am not sure of the forum's position on aiding students with work. It's just that I have been stuck on this question for a long time now and need to move forward, since the deadline is approaching. I will upload the script I have thus far and the question I am trying to solve. If anyone can point me in the right direction I will be grateful.

Thanks

#!/usr/bin/perl

use File::Basename;

#------------------------------------------------------------------------------#
#  Global variables that control the program action and output.                #
#------------------------------------------------------------------------------#

$NUM_RECS_TO_PRINT = 10;   # num of output recs to print per section

#---------------------------------------------------------------------#
#  Change this array to include index filenames used on your system.  #
#---------------------------------------------------------------------#

@indexFilenames = ('index.htm', 'index.html', 'index.shtml');


#----------------------------------------------------------------------#
# don't change anything below here unless you're comfortable with Perl #
#----------------------------------------------------------------------#

sub usage {
   print STDERR "\n\tUsage:  log2.pl access_log > output_file\n";
}


#----------------------------------------------------------#
#  These are two helper routines for the 'sort' function.  #
#----------------------------------------------------------#

sub fileNumericAscending {
   $numFileRequests{$a} <=> $numFileRequests{$b};
}

sub fileNumericDescending {
   $numFileRequests{$b} <=> $numFileRequests{$a};
}

sub trim($)
{
   my $string = shift;
   $string =~ s/^\s+//;
   $string =~ s/\s+$//;
   return $string;
}


#----------------------------<<   main   >>-----------------------------#

   #--------------------------------------------------------------------#
   #  Start by making sure the user is invoking this program properly.  #
   #--------------------------------------------------------------------#

   $numArgs = $#ARGV + 1;

   if ($numArgs != 1) {
      &usage;
      exit 1;
   }

   $logFile = $ARGV[0];

   open (LOGFILE,"access_log") || die "  Error opening log file $logFile.\n";

   #------------------------------------------------------------------#
   #  Start reading and processing the access_log file in this loop.  #
   #------------------------------------------------------------------#

   #printf "<pre>\n";
   while(<LOGFILE>)
   {


	if (/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)
	{
		$REMOTE_IP{$1}++
	}
   



	 #if (/[^(s)]*)$|([^(]+?)s*((.*)/)
#(/([^(]+?)s*((.*)|[^(s)]*)$/)
  #(/([^(s)]*)$|([^(]+?)s*((.*))/)
 	#{

	#$USER_AGENT{$1}++
	#}



      chomp;

      #----------------------------------------------#
        #  condense one or more whitespace character   #
      #  to one single space                         #
      #----------------------------------------------#

      s/\s+/ /go;

      #----------------------------------------------------------#
      #  the next line breaks each line of the access_log into   #
      #  nine variables                                          #
      #----------------------------------------------------------#

      ($clientAddress,    $rfc1413,      $username, 
      $localTime,         $httpRequest,  $statusCode, 
      $bytesSentToClient, $referer,      $clientSoftware) =
      /^(\S+) (\S+) (\S+) \[(.+)\] "(.+)" (\S+) (\S+) "(.*)" "(.*)"/o;

      #--------------------------------------------------------------------#
      # take care of problem where the $httpRequest may simply be a hyphen #
      #--------------------------------------------------------------------#

      next if ($httpRequest =~ '^-$');

      #-----------------------------------------#
      #  Determine the value of $fileRequested  #
      #-----------------------------------------#

      ($getPost, $fileRequested, $junk) = split(' ', $httpRequest, 6);
	 ($getPost, $clientAddress, $junk) = split(' ', $clientAddress, 1);
     

      #-----------------------------------------------------------------#
      #  if the base filename is something like index.htm, index.html,  #
      #  or index.shtml, interpret this to be the same as the path by   #
      #  itself.  This way, '/java/' is the same as '/java/index.html'. #
      #-----------------------------------------------------------------#

      foreach $indexFile (@indexFilenames) {
        chomp($fileRequested);
        $fileRequested = trim($fileRequested);
        if ($fileRequested =~ /^\s+$/) {
           next;
        }
        if ($fileRequested =~ /^$/) {
           next;
        }
        if (basename($fileRequested) =~ /$indexFile/i) {
           $fileRequested = dirname($fileRequested);
           last;
        }
      }

      #----------------------------------------------------------------#
      #  If the last character in $fileRequested is a '/', remove it.  #
      #  This makes /perl/ equal to /perl.                             #
      #----------------------------------------------------------------#

      if (length($fileRequested) > 1) 
      {
        if (substr($fileRequested,length($fileRequested)-1,1) eq '/') 
        {
          chop($fileRequested);
        }
      }

      #-----------------------------------------------------#
      #  here's where we count the number of hits per file  #
      #-----------------------------------------------------#

      $numFileRequests{$fileRequested}++;



   }#end first while loop

   close (LOGFILE);






   #--------------------------------------#
   #  Output the number of hits per IP    #
   #--------------------------------------#

   print "TOP $NUM_RECS_TO_PRINT IP ADDRESSES:\n";
   print "-----------------------------\n\n";
   $count=1;
   foreach my $ip (sort {$REMOTE_IP{$b} <=> $REMOTE_IP{$a}} (keys(%REMOTE_IP))) {
      last if ($count > $NUM_RECS_TO_PRINT);
      print "$count\t$ip = $REMOTE_IP{$ip}  \n";

      $count++;
   }
   print "\n\n";

   printf "</pre>\n";




   #--------------------------------------#
   #  Output the top user agents          #
   #--------------------------------------#

   print "TOP $NUM_RECS_TO_PRINT USER AGENTS:\n";
   print "-----------------------------\n\n";
   $count=1;
   foreach my $agent (sort {$USER_AGENT{$b} <=> $USER_AGENT{$a}} (keys(%USER_AGENT))) {
      last if ($count > $NUM_RECS_TO_PRINT);
      print "$count\t$agent= $USER_AGENT{$agent}  \n";

      $count++;
   }
   print "\n\n";

   printf "</pre>\n";




   #--------------------------------------#
   #  Output the number of hits per file  #
   #--------------------------------------#

   print "TOP $NUM_RECS_TO_PRINT CONNECT REQUESTS:\n";
   print "-----------------------------\n\n";
   $count=1;
   foreach $key (sort fileNumericDescending (keys(%numFileRequests))) {
      last if ($count > $NUM_RECS_TO_PRINT);
      print "$count\t$numFileRequests{$key},$httpRequest{$key} \t\t $key\n";

      $count++;
   }
   print "\n\n";

   printf "</pre>\n";





open (LOGFILE,"audit_log") || die "  Error opening log file $logFile.\n";
   #printf "<pre>\n";
   while (<LOGFILE>) {

if (/mod_security-message:.*\./)
{
$MOD_SEC{$1}++
}

}
 close (LOGFILE);



   #--------------------------------------#
   #  Output the number of hits per file  #
   #--------------------------------------#

   print "TOP $NUM_RECS_TO_PRINT PATTERN MATCH:\n";
   print "-----------------------------\n\n";
   $count=1;
   foreach my $modsec (sort {$MOD_SEC{$b} <=> $MOD_SEC{$a}} (keys(%MOD_SEC))) {
      last if ($count > $NUM_RECS_TO_PRINT);
      print "$count\t$agent= $MOD_SEC{$modsec}  \n";

      $count++;
   }
   print "\n\n";

   printf "</pre>\n";

     



This is the question I am stuck on:
4. Search Logs for mod_security-message which is access denied by mod_security

When Mod_Security identifies a problem with a request due to a security violation, it will do two things – 1) Add in some additional client request headers stating why mod_security is taking action, and 2) Log this data to the audit_log and error_log files.  These error messages can be triggered by Mod_Security special checks such as the SecFilterCheckURLEncoding directive, basic filters such as “..” to prevent directory traversals and advanced filters based on converted snort rules.

Search Logic: Search the audit_log entries that have the mod_security-message header, then sort the results, then only show unique entries with a total count of each type in reverse order from highest to lowest, then remove the mod_security-message data at the beginning of each line and list the Top 10 results.

      Your output will be similar to:

   1 51746 Pattern match "Basic" at HEADER.
   2 6138 Pattern match "passwd=" at THE_REQUEST.
   3 5852 Pattern match "/search" at THE_REQUEST.
   4 5368 Pattern match "passwd=" at THE_REQUEST.
   5 4826 Pattern match ".asp" at THE_REQUEST.
   6 3694 Pattern match "login.icq.com" at THE_REQUEST.
   7 1971 mod_security-message: Invalid character detected
   8 1935 Pattern match "/smartsearch.cgi" at THE_REQUEST.
   9 1887 Pattern match "cmd.exe" at THE_REQUEST.
  10 1387 Pattern match "/sh" at THE_REQUEST.

asked 11/03/2011 12:32


SheldonC ♦♦


18 Answers:
Always use

use warnings;
use strict;

This will help you troubleshoot and enforce good practices.

Line 39 looks wrong
sub trim($)

It should not have ($)

It is fine to use File::Basename but not needed. You can use a simple command like
$filename =~ s/\..*//;
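
As a rough illustration of the two points above, here is a minimal sketch of a trim routine written without the ($) prototype and with strict and warnings enabled. The sample string is made up for the example; only the sub name trim comes from the posted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove leading and trailing whitespace from a string.
# No prototype is declared; the argument is simply read from @_.
sub trim {
    my ($string) = @_;
    $string =~ s/^\s+//;   # strip leading whitespace
    $string =~ s/\s+$//;   # strip trailing whitespace
    return $string;
}

# Hypothetical input, just to show the effect.
print "[", trim("   /perl/index.html \t"), "]\n";
```

With strict enabled, every variable has to be declared with my (or our), so a misspelled or leftover variable name is reported at compile time instead of silently printing an empty string.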

farzanj

thanks for the instructions but the main part I am stuck on is this. Everything else works except this.
It doesn't output the mod_security-message header

 
open (LOGFILE,"audit_log") || die "  Error opening log file $logFile.\n";
   #printf "<pre>\n";
   while (<LOGFILE>) {

if (/mod_security-message:.*\./)
{
$MOD_SEC{$1}++
}

}
 close (LOGFILE);



   #--------------------------------------#
   #  Output the number of hits per file  #
   #--------------------------------------#

   print "TOP $NUM_RECS_TO_PRINT PATTERN MATCH:\n";
   print "-----------------------------\n\n";
   $count=1;
   foreach my $modsec (sort {$MOD_SEC{$b} <=> $MOD_SEC{$a}} (keys(%MOD_SEC))) {
      last if ($count > $NUM_RECS_TO_PRINT);
      print "$count\t$agent= $MOD_SEC{$modsec}  \n";

      $count++;
   }
   print "\n\n";

   printf "</pre>\n";

SheldonC

What kind of line are you trying to match?
Try:
/mod_security-message[:].*\.

farzanj

The following statement needs to be changed from
if (/mod_security-message:.*\./)

to
if (/(mod_security-message:.*\.)/)

or better yet
if (/mod_security-message:(.*)\./)

The parentheses tell Perl to put the value matched between them into $1.
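
To make the capture idea concrete, here is a small sketch of how the captured text might be counted and listed from highest to lowest. It is only one possible approach: the sub name top_messages and the sample lines are invented for the example, and stripping a leading "Access denied with code NNN." clause is an assumption based on the expected output shown in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count mod_security-message values and return the top $limit entries
# as [message, count] pairs, highest count first.
# (top_messages is a hypothetical helper, not part of the posted script.)
sub top_messages {
    my ($lines, $limit) = @_;
    my %count;
    for my $line (@$lines) {
        # The parentheses capture the text after the header name into $1.
        # A plain ':' is not a regex metacharacter, so it needs no escaping.
        next unless $line =~ /^mod_security-message:\s*(.*\S)/;
        my $msg = $1;
        # Assumption: drop a leading "Access denied with code NNN." clause
        # when more text follows it.
        $msg =~ s/^Access denied with code \d+\.\s+(?=\S)//;
        $count{$msg}++;
    }
    # Sort by count, descending; break ties alphabetically.
    my @sorted = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    splice(@sorted, $limit) if @sorted > $limit;
    return map { [ $_, $count{$_} ] } @sorted;
}

# Hypothetical sample lines standing in for audit_log.
my @sample = (
    'mod_security-message: Access denied with code 200. Pattern match "passwd=" at THE_REQUEST.',
    'Accept: */*',
    'mod_security-message: Access denied with code 200. Pattern match "cmd.exe" at THE_REQUEST.',
    'mod_security-message: Access denied with code 200. Pattern match "passwd=" at THE_REQUEST.',
);

my $rank = 1;
for my $pair (top_messages(\@sample, 10)) {
    printf "%4d %d %s\n", $rank++, $pair->[1], $pair->[0];
}
```

In the real script the lines would come from reading audit_log in a while loop rather than from an array; the counting and sorting would stay the same.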

schubach

Correctly stated by schubach
But you still need [:] instead of :

farzanj

Works great, only I don't necessarily need the "Access denied with code 200." part.

Also, I have this regex (/([^(]+?)\s*(\(.*\)|\b[^(\s)]*)$/) to extract the USER AGENT,

e.g. Mozilla/4.0(compatible;MSIE 6.0: Windows NT 5.1)

However, when I include this regex in my code, the script takes a very long time to complete.

thanks again guys for your help

SheldonC

Try:
/([^(]+)\s*([^;]*);([^:]*)\W*([^)]*)/

If it is not what you want, please tell me what you need to extract

farzanj

This is what I am looking for
example:
Mozilla/4.0(compatible;MSIE 6.0: Windows NT 5.1)

SheldonC

This is the text to be parsed?
What do you want to extract from it?

farzanj

I want to extract the following string from apache access_log

What is user’s browser type? Ex: Mozilla/4.0(compatible;MSIE 6.0: Windows NT 5.1)

SheldonC

Ok, I was thinking it to be the starting point.  But I need a sample Apache Access_log

farzanj

If you have access to POSIX::Regex CPAN module (since you're a student I'm not sure if you have permission to install specific CPAN modules on your box), then try the example POSIX regex found here.  http://www.texsoft.it/index.php?m=sw.php.useragent.  You can try from the command line
perl -e 'use POSIX::Regex';
to see if it is installed.  Google is my friend.

schubach

The regex you gave me /([^(]+)\s*([^;]*);([^:]*)\W*([^)]*)/ extracts
221.233.65.147 - - [13/Mar/2004:10:13:44 -0500] "CONNECT register.livesupportonthenet.com:443 HTTP/1.0" 200 - "-" "Mozilla/4.0 = 20  

The original regex that I have is if (/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) and extracts
compatible; MSIE 6.0; Windows NT 5.1) Opera 7.21  [

The output is similar to what I am looking for, but it takes forever when I run it.
I uploaded a sample access_log.
 
sample access_log
 

SheldonC

Please try this:
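#!/usr/bin/perl
use strict;
use warnings;

# Print the last double-quoted field (the user agent) of each log line.
open(FI, "<", "log.txt") or die "Cannot open log.txt: $!\n";
while (<FI>) {
    chomp;
    if (/["].*["].*["].*["].*["](.*)["]$/)
    {
        print "$1\n";
    }
}
close(FI);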

Your example log file gives this as output from my code:
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
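
If the goal is a count per browser rather than a plain listing, something along these lines might work. This is a sketch under assumptions: it supposes the log is in the usual combined format with the user agent as the last quoted field, the sub name user_agent is made up, and the sample lines are invented rather than taken from the uploaded log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the last double-quoted field of a log line, or undef if none.
# [^"]* cannot run past a quote, so the match has little room to backtrack.
# (user_agent is a hypothetical helper name.)
sub user_agent {
    my ($line) = @_;
    return $line =~ /"([^"]*)"\s*$/ ? $1 : undef;
}

# Hypothetical sample lines in combined log format.
my @sample = (
    '10.0.0.1 - - [13/Mar/2004:10:13:44 -0500] "GET / HTTP/1.0" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '10.0.0.2 - - [13/Mar/2004:10:13:45 -0500] "GET /a HTTP/1.0" 200 128 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"',
    '10.0.0.1 - - [13/Mar/2004:10:13:46 -0500] "GET /b HTTP/1.0" 404 - "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
);

# Tally each user agent, mirroring the %USER_AGENT hash in the posted script.
my %USER_AGENT;
for my $line (@sample) {
    my $agent = user_agent($line);
    $USER_AGENT{$agent}++ if defined $agent;
}

# List agents from most to least frequent; ties are broken alphabetically.
for my $agent (sort { $USER_AGENT{$b} <=> $USER_AGENT{$a} || $a cmp $b } keys %USER_AGENT) {
    print "$USER_AGENT{$agent}\t$agent\n";
}
```

Anchoring the pattern at the end of the line and using a negated character class is one way to avoid the long run times seen with the earlier unanchored pattern, though the actual cause of the slowdown would need to be confirmed on the real log.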

schubach

Thanks. That worked great as well. Regular expressions can be somewhat challenging.

This one seems a bit tricky: I have to extract brute force attacks from the audit log, for example:
attacker (24.168.72.174) was trying to login using username: exodus, password: HELL
username: exodus9971, password: christ

This is a sample of the audit_log:

========================================
Request: 24.168.72.174 - - [Tue Mar  9 22:27:46 2004] "GET http://sbc2.login.dcn.yahoo.com/config/login?.redir_from=PROFILES?&.tries=1&.src=jpg&.last=&promo=&.intl=us&.bypass=&.partner=&.chkP=Y&.done=http://jpager.yahoo.com/jpager/pager2.shtml&login=exodusc&passwd=HELL HTTP/1.0" 200 566
Handler: proxy-server
Error: mod_security: pausing [http://sbc2.login.dcn.yahoo.com/config/login?.redir_from=PROFILES?&amp;.tries=1&amp;.src=jpg&amp;.last=&amp;promo=&amp;.intl=us&amp;.bypass=&amp;.partner=&amp;.chkP=Y&amp;.done=http://jpager.yahoo.com/jpager/pager2.shtml&amp;login=exodusc&amp;passwd=HELL] for 50000 ms
----------------------------------------
GET http://sbc2.login.dcn.yahoo.com/config/login?.redir_from=PROFILES?&.tries=1&.src=jpg&.last=&promo=&.intl=us&.bypass=&.partner=&.chkP=Y&.done=http://jpager.yahoo.com/jpager/pager2.shtml&login=exodusc&passwd=HELL HTTP/1.0
Accept: */*
Accept-Language: en
Connection: Keep-Alive
mod_security-message: Access denied with code 200. Pattern match "passwd=" at THE_REQUEST.
mod_security-action: 200

HTTP/1.0 200 OK
Connection: close

I tried the following regex but it only returned 1      = 3643818
if (/(\|||system\(|eval\(|`|\\)/i)

SheldonC

What do you mean by:
I tried the following regex but it only returned 1      = 3643818
if (/(\|||system\(|eval\(|`|\\)/i)


How would that possibly extract a username, password, and IP address from this big string?  Please explain better what you want to do, and post a better example using the code tag.  Also, I think maybe this should be a new question and this question should be closed.  It seems like your original code snippet has now been fixed.

schubach

Ok. I will open a new post with a more detailed explanation.

SheldonC

excellent feedback

SheldonC


Asked: 11/03/2011 12:32

Seen: 307 times

Last updated: 11/03/2011 10:05