
Re: QOTD (question of the day)



Ken Hagan wrote:
> cat <file> | cut -c 5-80
> or something like that will give you columns 5 through 80.

> Quoting dsavage@peaknet.net:
>>hundreds of files with line number prefixes. I'm looking for a grinder
>>script that will remove the first four (or five) characters of every line
>>in these files:
>>
>>NNN sourcecodetextsourcecodetextsourcecodetext
>>NNN sourcecodetextsourcecodetextsourcecodetext
>>^^^^
>>||||
>>I want to strip away these line numbers. Depending on the number of lines
>>in a file, there are either three digits or four followed by a space.
>>
>>Can anyone suggest a sed/awk/perl/whatever quickie to do the job? Nothing
>>fancy. The only criteria is that it be faster than my fingers typing
>>endless cycles of '4 X downarrow' in vi. :-P

In 'vi', the proper keys would be ':%s/^....//', i.e. % => 1,$ => all 
lines, and s/^....// replaces the first four characters of each line with 
the empty string (i.e. deletes them).

But that requires you to 'vi' every file. A quicker way would be to use 
'sed' to edit the file, but sed writes the converted text to standard 
output, so you'd have to redirect it to a temporary file and then rename 
that over the original. Perl can take care of that easily with the 
in-place option.
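For comparison, here is a rough sketch of the sed route with the temporary file handled by hand. The demo directory, the file name and its contents are made up for illustration; only the sed expression comes from the vi command above.

```shell
# Hypothetical demo file; the name and contents are invented for this sketch.
demo=$(mktemp -d)
printf '001 int main(void)\n002 {\n003 }\n' > "$demo/sample.c"

# Strip the first four characters of every line, write the result to a
# temporary file, then rename it over the original (only if sed succeeded).
for f in "$demo"/*.c; do
    sed 's/^....//' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat "$demo/sample.c"
```

It works, but the loop and the rename are exactly the bookkeeping that the in-place option saves you.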

So, a PerlKwiki would be:

perl -p -i -e 's/^\d{3} //' file1 [file2 ...]

This will replace ONLY three digits followed by a space at the start of 
a line, just in case there are some files that don't need this to 
happen, or these types of lines are mixed with ones that don't need the 
"adjustment". Since your files have either three or four digits, 
'\d{3,4}' covers both. You can use '\d+' if it's some unknown number of 
digits, or use '\s' if you don't know if tabs or spaces were used, and 
'\s+' if you want to trim off whatever whitespace exists after the 
digits. But that's a regex question for another post.

So where do we get the list of files from? 'find' is your friend.

find . -type f -print0 | xargs -0 perl -p -i -e 's/^\d{3} //'

Presto! We're done.

Mike/
