Remove consecutive lines containing the same pattern

I'd like to create a sed (or equivalent) expression that removes consecutive lines containing a specific character. For instance, I have a list of IPs, each followed by a colon. If an IP has a value, the line(s) following it do not contain a colon. If there are consecutive lines with colons, all but the last should be removed (since they're empty), like so:
+159.0.0.0:
+159.0.0.1:
+443/tcp open https
+159.0.0.2:
+159.0.0.3:
+159.0.0.4:
+159.0.0.5:
+80/tcp open http
+443/tcp open https
Desired Result:
+159.0.0.1:
+443/tcp open https
+159.0.0.5:
+80/tcp open http
+443/tcp open https
4 Answers
This might work for you (GNU sed):
sed 'N;/:.*\n.*:/!P;D' file
Keep a moving window of two lines and if both lines contain a :
do not print the first.
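The same two-line window can be written as a commented multi-line script, which may make each step easier to follow. This is only a sketch: the function name `drop_empty_hosts` and the sample lines are made up for illustration, and it uses `$!N` instead of `N` so that it does not depend on how a particular sed treats `N` on the last line (that substitution is an assumption of this sketch, not part of the answer).

```shell
# Hypothetical wrapper name, chosen for illustration only.
drop_empty_hosts() {
  sed '
    # Append the next input line to the pattern space (skipped on the last line).
    $!N
    # Print the first line of the window unless both lines contain a colon.
    /:.*\n.*:/!P
    # Delete the first line of the window and restart with what is left.
    D
  ' "$@"
}

# Made-up sample: the first host has no ports and should be dropped.
printf '%s\n' '10.0.0.1:' '10.0.0.2:' '443/tcp open https' | drop_empty_hosts
```

With this input the sketch prints `10.0.0.2:` followed by `443/tcp open https`. A colon line at the very end of the input is still printed, because there is no following line to compare it with.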
@Karimi thank you. Maybe I have not understood the problem. If I create a file with only the first two lines of the given example and run the above solution, the result is only the second line. Is this correct?
– potong
Jul 3 at 6:10
Another awk:
$ awk '/:/ { p = $0 } $0 !~ /:/ {if (p) {print p} print $0; p = ""} ' file
+159.0.0.1:
+443/tcp open https
+159.0.0.5:
+80/tcp open http
+443/tcp open https
This solution also worked for me. I ran this against a large data set and it handled edge cases for colons in the final line. It also removed unexpected blank lines. Thanks for this.
– Karimi
Jul 2 at 21:19
sed is for s/old/new, THAT IS ALL. This will work with any awk in any shell on any UNIX box:
$ awk '/:/{s=$0 ORS;next} {print s $0; s=""}' file
+159.0.0.1:
+443/tcp open https
+159.0.0.5:
+80/tcp open http
+443/tcp open https
and is trivial to enhance for anything else you might want to do. For example, to handle the final line ending in a colon, just add an END section that prints the last saved colon-ending line, if any:
$ cat file
+159.0.0.0:
+159.0.0.1:
+443/tcp open https
+159.0.0.2:
+159.0.0.3:
+159.0.0.4:
+159.0.0.5:
+80/tcp open http
+443/tcp open https
+159.0.0.6:
$ awk '/:/{s=$0 ORS;next} {print s $0; s=""} END{printf "%s", s}' file
+159.0.0.1:
+443/tcp open https
+159.0.0.5:
+80/tcp open http
+443/tcp open https
+159.0.0.6:
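As another example of the kind of enhancement mentioned above, the marker can be passed in as a variable rather than hard-coded. This is a minimal sketch under assumptions: the function name `keep_nonempty` and the awk variable `marker` are hypothetical, and the sample lines are made up.

```shell
# Hypothetical helper: the marker regex is the first argument, files (if any) follow.
keep_nonempty() {
  marker=$1
  shift
  awk -v marker="$marker" '
    # A marker line is only remembered; a later marker line overwrites it.
    $0 ~ marker { saved = $0 ORS; next }
    # Any other line is printed, preceded by the remembered marker line if there is one.
    { print saved $0; saved = "" }
    # A marker line left over at the end of the input is printed as well.
    END { printf "%s", saved }
  ' "$@"
}

# Made-up sample: same logic as the command above, with ":" supplied as the marker.
printf '%s\n' '10.0.0.1:' '10.0.0.2:' '443/tcp open https' '10.0.0.3:' | keep_nonempty ':'
```

Called this way it prints `10.0.0.2:`, `443/tcp open https` and `10.0.0.3:`. The marker is treated as a regular expression, so something like `keep_nonempty '^#' file` would apply the same rule to lines starting with `#`.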
EDIT: To check whether the final line has a colon, I have made a small change to the code, as follows.
awk '!/:/ && prev{print prev ORS $0;prev="";next} {prev=$0} END{if(prev && prev !~ /:/){print prev}}' Input_file
Fully tested on your provided sample. Could you please try the following and let me know if it helps?
awk '!/:/ && prev{print prev ORS $0;prev="";next} {prev=$0} END{if(prev){print prev}}' Input_file
Adding a non-one-liner form of the solution too.
awk '
!/:/ && prev{
print prev ORS $0;
prev="";
next
}
{
prev=$0
}
END{
if(prev){
print prev}
}' Input_file
Explanation: an annotated version of the above code.
awk '
!/:/ && prev{ ##If the line does NOT contain a colon and the variable prev is NOT empty, do the following.
print prev ORS $0; ##Print prev, then ORS (a newline by default), then the current line ($0).
prev=""; ##Reset prev to an empty string.
next ##Skip all further statements for this line and move on to the next line.
}
{
prev=$0 ##Store the current line in prev.
}
END{ ##END block, executed after all of Input_file has been read.
if(prev){ ##If prev is NOT empty, do the following.
print prev} ##Print the value of prev.
}' Input_file ##Input_file is the name of the file to process.
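The sample input has its only multi-port host at the very end of the file, so it does not show what happens when a host with two port lines is followed by another host: in that case the script above stores the second port line in prev, and the next colon line then overwrites it before it is printed. Below is a minimal sketch of a variant that uses prev for colon lines only, so every port line is printed as soon as it is read. It is not the answer as posted; the function name and the sample lines are made up for illustration.

```shell
# Hypothetical wrapper name; a variant of the script above, not the answer as posted.
print_hosts_with_ports() {
  awk '
    # Colon line: remember it, replacing any earlier colon line that had no ports.
    /:/ { prev = $0; next }
    # Port line: print the remembered host first (once), then the port line itself.
    {
      if (prev != "") print prev
      prev = ""
      print
    }
  ' "$@"
}

# Made-up sample: a host with two ports in the middle of the input.
printf '%s\n' '10.0.0.1:' '80/tcp open http' '443/tcp open https' '10.0.0.2:' '22/tcp open ssh' |
  print_hosts_with_ports
```

With this input all five lines are printed. A colon line left over at the end of the input is dropped in this variant, since it has no ports.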
+1 for the excellent line-by-line explanation. Thanks for sharing this alternative. The only shortcoming is that it doesn't take into consideration the final line having a colon with nothing after it.
– Karimi
Jul 2 at 21:19
@Karimi, sure. Please check my EDIT solution now and let me know how it goes; it should work.
– RavinderSingh13
Jul 3 at 0:16
This was the most elegant solution. The only shortcoming is that it doesn't take into consideration the final line having a colon with nothing after. Since this was the most elegant, I'm awarding this as best answer. Thank you!
– Karimi
Jul 2 at 21:19