How to speed up tail and head in bash
I have a giant text file called stock_messages that looks like this:
H: TSLA
A: id1, 100
E: id1, 20
F: id2, 250
...
H: AAPL
A: id1, 100
A: id2, 20
E: id1, 80
A: id2, 10
...
What I want to do is to create a separate text file with messages for each stock (e.g. AAPL.txt, TSLA.txt, etc).
I wrote a bash script to do this:
file="stock_messages"
start=-1
stock_name=""
# List every header line as "<line number> <stock name>".
grep -n -i '^H' "$file" | awk -F "[:,]" '{print $1, $NF}' | while read -r line; do
    line_number=$(echo "$line" | awk '{print $1}')
    if [[ "$start" -gt 0 ]]
    then
        # Copy the lines from the previous header up to (not including) this one.
        tail -n "+$start" "$file" | head -n "$((line_number - start))" > "./data/${stock_name}.txt"
        echo "saved $stock_name data!"
    fi
    start=$line_number
    stock_name=$(echo "$line" | awk '{print $2}')
done
Basically, I'm taking the line numbers of the H lines and using tail and head to extract the lines between them and save each block to a separate file.
The script runs pretty fast initially but it gets really slow very quickly, and I'm not sure why.
Any suggestion would be much appreciated!
Solution 1:[1]
If awk is an option, the file can be split in a single pass:
$ awk '/^H:/ {close(stock_message); stock_message=$2".txt"} {print > stock_message}' input_file
$ cat AAPL.txt
H: AAPL
A: id1, 100
A: id2, 20
E: id1, 80
A: id2, 10
...
$ cat TSLA.txt
H: TSLA
A: id1, 100
E: id1, 20
F: id2, 250
...
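A likely reason the original loop degrades is that every `tail -n +N` has to read the file from the top to find line N, so each new header costs more than the last; the awk approach reads the input only once. Below is a minimal sketch of the same one-pass idea that writes into a `data/` directory, as the script in the question does. The sample input, the `out` variable name, and the guard against lines appearing before the first header are assumptions added for illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample input standing in for the real stock_messages file.
cat > stock_messages <<'EOF'
H: TSLA
A: id1, 100
E: id1, 20
F: id2, 250
H: AAPL
A: id1, 100
A: id2, 20
E: id1, 80
A: id2, 10
EOF

mkdir -p data

# Single pass over the input:
#  - on a header line, close the previous output file and pick a new one
#    named after the stock (second whitespace-separated field);
#  - write every line, header included, to the currently selected file;
#  - skip any lines that appear before the first header.
awk '
    /^H:/ {
        if (out != "") close(out)
        out = "data/" $2 ".txt"
    }
    out != "" { print > out }
' stock_messages

# Show how many lines ended up in each per-stock file.
wc -l data/*.txt
```

Closing each file before switching keeps the number of open file descriptors at one, which matters when the input contains many stocks. One caveat: if the same stock header can appear more than once in the input, `>` truncates the file each time it is reopened after `close`, so `>>` (with the `data/` directory emptied beforehand) would be the safer choice in that case.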
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
