Sort and format lines [duplicate]
I have the following lines:
a;http://example.com/
b;http://qwerty.com/
a;http://example2.com/
c;http://example2.com/
c;http://example2.com/
a;http://example3.com/
b;http://qwerty.com/
b;http://qwerty3.com/
c;http://qwerty.com/
c;http://example5.com/
and want to get the following format out of this:
a;http://example.com/,http://example2.com/,http://example3.com/
b;http://qwerty.com/,http://qwerty3.com/
c;http://example2.com/,http://example5.com/,http://qwerty.com/
but I do not quite understand the algorithm for doing this. I have taken the following steps so far:
# sort the original list by the first main item;
output=$(printf "%s" "${output}" | sort -t\; -k1 -n | sort -u)
# split items into two parts
item1=$(printf "%s" "${output}" | cut -d\; -f 1)
item2=$(printf "%s" "${output}" | cut -d\; -f 2)
This gives two parts of the sorted list to work with, but I still do not quite understand how to build the rest of the logic. It seems that the next step is to write some kind of loop that works with item1 and item2.
Can somebody please point me to the next steps or give an example?
Solution 1:[1]
Maybe this:
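sort -u /tmp/test.list |
awk -F';' '
    {
        # append each URL to the entry for its key, comma-separated
        vect[$1] = vect[$1] "," $2
    }
    END {
        OFS = ";"
        for (idx in vect) {
            # substr(..., 2) drops the leading comma
            print idx, substr(vect[idx], 2)
        }
    }' | sort   # awk does not guarantee for-in order, so sort the result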
You don't want duplicates in the output, so sort -u removes them before awk groups the URLs. It also puts each key's URLs in sorted order.
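Because the input is already sorted at that point, the grouping can also be done as a stream, without an array, which is closer to the "loop over item1 and item2" idea from the question. The following is only a sketch under a few assumptions: group_urls is a made-up helper name, the lines arrive on stdin, no URL contains a semicolon, and LC_ALL=C is set only to make the sort order reproducible.

```shell
#!/bin/sh
# group_urls is a hypothetical helper name, not part of the original answer.
# It reads "key;url" lines on stdin and prints one "key;url1,url2,..." line
# per key. Assumption: URLs contain no ';' (only field 2 is kept).
group_urls() {
    # LC_ALL=C gives a byte-wise, locale-independent sort order.
    LC_ALL=C sort -u | awk -F';' '
        NR == 1 || $1 != key {          # first line, or the key changed
            if (NR > 1) print line      # flush the finished group
            key = $1
            line = $1 ";" $2
            next
        }
        { line = line "," $2 }          # same key: append the URL
        END { if (NR > 0) print line }  # flush the last group
    '
}
```

It could be called as group_urls < /tmp/test.list. Since each group is printed as soon as its key changes, the output follows the sorted input order and does not depend on how awk iterates over an array.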
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | MaxChinni |
