Python file copying using regex
I have a large log file. I want to extract the lines containing java/javax/org/com followed by "." or ":". For every line like this, I also want to extract the corresponding stack trace lines, each of which starts with "at". For example:
line 1: java.line.something.somethingexception
line 2: at something
line 3: at something
line 4: at something
line 5-20: junk I don't want to extract.
line 21: javax.line.something.somethingexception
line 22: at something
line 23: at something
line 24: at something
and so on...
Here I want to copy lines 1-4 and then lines 21-24. So far my code collects every line that contains the keywords, but I'm unable to figure out how to write a specific number of lines after that, skip a few lines, and then start writing again. These traces start at random positions, i.e. the next one can be 100 lines later or 250 lines later; there is no pattern.
Here's my code:
    import re
    import sys
    from itertools import islice

    file = open(sys.argv[1], "r")
    file1 = open(sys.argv[2], "w")
    a = 0
    for line in file:
        if re.search(r'[java|javax|org|com]+?[\.|:]+?', line, re.I) and not (
            re.search(r'at\s', line, re.I)
            or re.search(r'mdcloginid:|webcontainer|c\.h\.i\.h\.p\.u\.e|threadpooltaskexecutor|caused\sby', line, re.I)
        ):
            file1.write(line)
This code extracts the lines containing the keywords, but I'm stuck on how to do the next part, i.e. copy the following lines that contain "at", write them to the new file, and stop when the "at" lines end. Then it should search for the next line containing the keywords and do the same again.
This can be solved with a flag that is set or cleared when a line matches specific conditions:
    java_regex = re.compile(...)  # java
    at_regex = re.compile(...)    # at

    copy = False  # flag that controls whether to copy a line to the output

    for line in file_in:
        if re.search(java_regex, line):
            # start copying if "java" is in the input
            copy = True
        else:
            if copy and not re.search(at_regex, line):
                # stop copying if "at" is not in the input
                copy = False

        if copy:
            file_out.write(line)
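For reference, here is a minimal runnable sketch of the same flag approach. The two patterns are assumptions, since the answer leaves them unspecified: they expect the exception name and the "at" frames to sit at the start of the line, after optional whitespace, so they would need adjusting if the log prefixes lines with timestamps or thread names. The function name `extract_traces` and the sample lines are hypothetical and only serve to illustrate.

```python
import re

# Assumed patterns (not given in the answer above).
# A trace header: java/javax/org/com followed by "." or ":" at the start of the line.
JAVA_RE = re.compile(r'^\s*(?:javax?|org|com)[.:]')
# A stack frame: the word "at" followed by whitespace at the start of the line.
AT_RE = re.compile(r'^\s*at\s')


def extract_traces(lines):
    """Yield each header line plus the "at" lines that directly follow it."""
    copy = False  # True while we are inside a stack trace
    for line in lines:
        if JAVA_RE.search(line):
            copy = True          # a new trace starts here
        elif copy and not AT_RE.search(line):
            copy = False         # the run of "at" lines has ended
        if copy:
            yield line


# Hypothetical sample input, shaped like the example in the question.
sample = [
    "java.lang.IllegalStateException: boom\n",
    "    at com.example.Foo.bar(Foo.java:10)\n",
    "    at com.example.Main.main(Main.java:5)\n",
    "INFO some unrelated log line\n",
    "more junk\n",
    "javax.servlet.ServletException: oops\n",
    "    at org.example.Servlet.service(Servlet.java:42)\n",
    "trailing junk\n",
]
result = list(extract_traces(sample))
```

To use it on real files, pass the open input file as `lines` and write each yielded line to the output file. Note that the alternatives are written as a group, `(?:javax?|org|com)`, rather than inside square brackets: `[java|javax|org|com]` is a character class that matches any single one of those characters, not the whole words.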