Using awk grep sed

From Free Knowledge Base- The DUCK Project: information for everyone

Revision as of 18:45, 25 February 2020

grep does not alter a file; it only finds and prints matching lines. awk and sed are text processors that can also transform the text they read.

awk is mostly used for data extraction and reporting. sed is a stream editor. Each one of them has its own functionality and specialties.

sed

Things that you can accomplish using RegEx within the Vi editor on text files can also be accomplished at the command line with sed.

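The most basic use of sed is a simple search and replace. The command below reads standard input and replaces the first occurrence of "windows" on each line with "linux".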

sed 's/windows/linux/'
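Without a trailing flag, the substitution stops after the first match on each line. The sketch below shows the difference the g flag makes. The input line is made up for illustration.

```shell
# Sample input line (made up for illustration).
line='windows here, windows there'

# Without the g flag only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/windows/linux/')

# With the g flag every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/windows/linux/g')

printf '%s\n' "$first_only" "$all_matches"
```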

example: process a text file by removing blank lines, unwanted lines, and duplicates (the -i option edits the file in place, so work on a copy)

Get rid of all lines of text containing numerical stats (any line that contains a digit is deleted)

sed -i '/[0-9]/d' Razor-Fen.txt

Get rid of all empty lines, including lines that contain only whitespace

sed -i '/^\s*$/d' Razor-Fen.txt
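To preview the result before editing a file in place, the two deletions can be chained without -i so the output goes to the terminal instead. This is a minimal sketch: the sample file is a hypothetical stand-in for Razor-Fen.txt, and [[:space:]] is used as the portable spelling of \s, which is a GNU extension.

```shell
# Hypothetical sample file standing in for Razor-Fen.txt.
sample=$(mktemp)
printf 'Quest log\nLevel 25 boss\n\n   \nLoot table\n' > "$sample"

# Chain both deletions with -e. Without -i the file itself is untouched
# and the cleaned text is written to standard output.
cleaned=$(sed -e '/[0-9]/d' -e '/^[[:space:]]*$/d' "$sample")

printf '%s\n' "$cleaned"
rm -f "$sample"
```

Once the preview looks right, the same expressions can be run with -i on the real file.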

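Get rid of consecutive duplicate lines (duplicates that are not adjacent are kept, so sort the file first if every repeat should go)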

sed -i '$!N; /^\(.*\)\n\1$/!P; D' Razor-Fen.txt
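The sed command compares each line only with the line that follows it, so a repeat that shows up later in the file survives. A possible alternative, sketched here with awk rather than sed, drops every repeat while keeping the original order. The array name seen is arbitrary and the input is made up.

```shell
# Made-up input: one adjacent duplicate and one non-adjacent duplicate.
input=$(printf 'alpha\nalpha\nbeta\nalpha')

# The sed command from above removes only the adjacent repeat.
adjacent_only=$(printf '%s\n' "$input" | sed '$!N; /^\(.*\)\n\1$/!P; D')

# awk prints a line only the first time it is seen, wherever it occurs.
all_unique=$(printf '%s\n' "$input" | awk '!seen[$0]++')

printf '%s\n' "--- sed ---" "$adjacent_only" "--- awk ---" "$all_unique"
```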

grep

example: rgrep

rgrep is grep -r or recursive grep

If you want to search all text files within all subfolders for a particular matching string, the syntax might not be what you would think

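For example, rgrep string *.txt will not search through all text files under the current directory, because the shell expands *.txt to the matching names in the current directory before rgrep runs. The correct syntax escapes the pattern and passes it with --include: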

rgrep -s string --include \*.txt
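The difference can be checked in a throwaway directory. This sketch uses grep -r, since rgrep is not installed on every system, and names the starting directory (.) explicitly. The file names and the search string are hypothetical.

```shell
# Throwaway directory tree (hypothetical file names).
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
printf 'needle\n' > "$tmp/top.txt"
printf 'needle\n' > "$tmp/sub/deep.txt"
printf 'needle\n' > "$tmp/sub/notes.log"
cd "$tmp" || exit 1

# The shell expands *.txt before grep runs, so only top.txt is searched.
glob_hits=$(grep -r -l needle *.txt | LC_ALL=C sort)

# A quoted pattern reaches grep intact and is applied in every subfolder.
include_hits=$(grep -r -l -s needle --include '*.txt' . | LC_ALL=C sort)

printf '%s\n' "--- glob ---" "$glob_hits" "--- include ---" "$include_hits"
cd / && rm -rf "$tmp"
```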

Here is an example that searches for multiple specific file types, ignoring case (-i), in the directories ~/path1 through ~/path5

rgrep -i --include \*.h --include \*.cpp CP_Image ~/path[12345]

awk

The awk utility is useful for extracting and transforming data from files and for generating reports.

With the action {print} and no pattern, awk behaves like 'cat' in that it prints every line of data from the specified file.

awk '{print}' inventory.txt

picking out specific lines in the data or text file

We might have a space-separated data file or a text file with lines of composition. Either way, we can pick out and print entire lines of text whenever a single word in the line is matched.

awk '/plumbing/ {print}' inventory.txt 

In the inventory file there is a second column of text that specifies the department of the listed supply item. The pattern /plumbing/ is tested against the whole line, so items in the plumbing department are printed, along with any other line that happens to contain 'plumbing' in another column. If the file is a composition, only lines where the word 'plumbing' is matched will be printed.
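To match the department column only, the pattern can be replaced by a comparison on the second field. This is a sketch that assumes a made-up inventory layout of item, department, and quantity separated by spaces. The real inventory.txt may be laid out differently.

```shell
# Hypothetical inventory: item, department, quantity.
inventory=$(mktemp)
printf 'wrench plumbing 4\nplumbing-tape hardware 12\nhammer tools 7\n' > "$inventory"

# /plumbing/ matches the word anywhere on the line.
anywhere=$(awk '/plumbing/ {print}' "$inventory")

# $2 == "plumbing" matches only when the second column is exactly that word.
by_department=$(awk '$2 == "plumbing" {print}' "$inventory")

printf '%s\n' "--- anywhere ---" "$anywhere" "--- second column ---" "$by_department"
rm -f "$inventory"
```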