@giuliano108
Created May 3, 2015 14:48
blockgrep.pl
#!/usr/bin/perl
use strict;
use warnings;

=head1 SYNOPSIS

    blockgrep.pl <start_of_block_regexp> <match_regexp> file1 file2 ... fileN

Input files are split into blocks. Each block starts at a line that matches
`start_of_block_regexp` and runs up to the next such line.
Only the blocks that match `match_regexp` are printed (/match_regexp/ms is used,
so the pattern can span several lines of a block).

=cut
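
=head1 EXAMPLE

A hypothetical invocation, not taken from the original author. It assumes a
log file named app.log (an illustrative name) in which every entry starts with
a YYYY-MM-DD date and continuation lines are indented:

    blockgrep.pl '^\d{4}-\d\d-\d\d ' 'ERROR' app.log

With the following assumed contents of app.log:

    2015-05-03 10:00:00 INFO starting
      detail a
    2015-05-03 10:00:01 ERROR failed
      Traceback:
        boom
    2015-05-03 10:00:02 INFO done
      detail b

only the second entry should be printed, together with its two continuation
lines, because it is the only block whose text matches ERROR.

=cut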
use Pod::Usage;
use constant MAX_BLOCK_SIZE => 1048576;
if (@ARGV < 3) {
    pod2usage(1);
    exit 1;
}
my $start_of_block_regexp = shift;
my $match_regexp = shift;
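# Compile both patterns once instead of re-interpolating them for every line.
my $start_re = qr/$start_of_block_regexp/;
my $match_re = qr/$match_regexp/ms;

# Text of the block being accumulated; undef until the first line is read.
# Lines that precede the first start-of-block line form a block of their own.
my $block;

while (my $line = <>) {
    # A start-of-block line closes the current block and opens a new one.
    # This also holds for single-line blocks (two start lines in a row).
    if (defined $block and $line =~ $start_re) {
        print $block if $block =~ $match_re;
        $block = $line;
        next;
    }
    $block .= $line;
    die "Current block is getting too big, giving up...\n"
        if length($block) > MAX_BLOCK_SIZE;
}

# Flush the last block: no start-of-block line follows it to trigger the check.
print $block if defined $block and $block =~ $match_re;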