rigadicomando.org


How to list duplicate lines in a text file, with counts next to each unique line

Submitted by admin on Mon, 2006-05-08 07:40.
  • bash
  • scripts

from:

How to list duplicate lines in a text file, with counts next to each unique line - [spugbrap's random notes geek blog] [del.icio.us (bash)]

At some point last year (it's been in my 'toblog' file all this time), I needed to analyze the lines in a text file: remove the duplicate lines, count how many times each duplicated line occurred within the file, and sort the results from most common to least common.

For example, using a text file called 'dupetest.txt', containing:
foo bar baz
foo qux corge
spugbrap likes bacon
foo qux corge
spugbrap likes bacon
foo bar baz
oatmeal cookies are good
oatmeal cookies are good
foo bar baz
foo qux corge
foo bar baz

The output I want is:
4 foo bar baz
3 foo qux corge
2 spugbrap likes bacon
2 oatmeal cookies are good

I knew there had to be a simple way of doing this by just stringing together a few Unix commands (in Cygwin), but finding the right combination took me some effort. Here's what I came up with:

sort dupetest.txt | uniq -c -d | sort -n -r
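For anyone who wants to try it, here is a minimal sketch that wraps the same pipeline in a small function and runs it against the sample data above. The function name count_dupes and the use of a temporary file are my own assumptions for illustration, not part of the original note. The comments spell out what each stage contributes.

```shell
#!/bin/sh
# count_dupes is a hypothetical helper name; it wraps the pipeline from the post.
count_dupes() {
    # 1. sort       : groups identical lines together, because uniq only
    #                 compares adjacent lines
    # 2. uniq -c -d : -c prefixes each line with its count,
    #                 -d prints only lines that occur more than once
    # 3. sort -n -r : orders numerically by the leading count, highest first
    sort "$1" | uniq -c -d | sort -n -r
}

# Recreate the sample file from the post in a temporary location (an assumption)
tmpfile=$(mktemp)
printf '%s\n' \
    'foo bar baz' \
    'foo qux corge' \
    'spugbrap likes bacon' \
    'foo qux corge' \
    'spugbrap likes bacon' \
    'foo bar baz' \
    'oatmeal cookies are good' \
    'oatmeal cookies are good' \
    'foo bar baz' \
    'foo qux corge' \
    'foo bar baz' > "$tmpfile"

count_dupes "$tmpfile"
rm -f "$tmpfile"
```

Two things to keep in mind: uniq -c pads the counts with leading spaces, so the real output is indented compared with the listing above, and because of -d any line that appears only once is left out entirely. Dropping -d should list those lines too, with a count of 1.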

All content on this site is distributed under a Creative Commons License; each individual author is responsible for their own posts.