Monday, March 27, 2017

Counter in Python

Counter is a simple way to count how often each word occurs in a given text. You can also use it to build a tag cloud. Let's look at an example to see how this works.

Let's count the word occurrences on a web page at a given URL. The same code can process any plain text as well: just pass the text to Counter as a list of words.
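For the plain-text case, here is a minimal sketch. The sample sentence and the variable names are made up for illustration; any string split into words would do.

```python
from collections import Counter

# Hypothetical sample text, used only for illustration
sample_text = "the quick brown fox jumps over the lazy dog the fox"

# split() with no arguments breaks the text on any run of whitespace
word_counts = Counter(sample_text.split())

# The two most frequent words, as (word, count) pairs
top_two = word_counts.most_common(2)
print(top_two)  # [('the', 3), ('fox', 2)]
```

Counter accepts any iterable of hashable items, so the list returned by split() can be passed in directly.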

Now let's look at the code for the URL case:

>>> from urllib.request import urlopen  # Python 3; Python 2 used urllib.urlopen
>>> from collections import Counter

# Point to the website you want to fetch
>>> loc = urlopen("")

# Read the response body and decode it to text
>>> text = loc.read().decode("utf-8", errors="replace")

# Count the occurrences of each whitespace-separated word
>>> words_counter = Counter(text.split())

# Show the 10 most common words. You can pass any number as the parameter.
# Calling most_common() with no argument returns every count
>>> words_counter.most_common(10)

On a raw web page this will not show a meaningful result, because the HTML tags get counted along with the words. You can use a library such as BeautifulSoup or a regular expression to strip the HTML tags. You might also want to build a list of common words (stop words) to strip out before drawing any meaningful inference.
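As a rough sketch of that cleanup, the snippet below uses the regular-expression route and a tiny stop-word set. The HTML string, the stop-word list and the function name count_meaningful_words are all made up for illustration. A regex like this is crude, so a real HTML parser would be more robust on actual pages.

```python
import re
from collections import Counter

# Hypothetical HTML snippet and stop-word list, for illustration only
html = ("<html><body><h1>Counter in Python</h1>"
        "<p>The Counter is a simple tool. The tool counts words.</p>"
        "</body></html>")
stop_words = {"the", "is", "a", "in"}

def count_meaningful_words(html, stop_words):
    # Replace anything that looks like a tag with a space (crude approach)
    text = re.sub(r"<[^>]+>", " ", html)
    # Lowercase the text and keep runs of letters, so punctuation is dropped
    words = re.findall(r"[a-z']+", text.lower())
    # Count only the words that are not in the stop-word set
    return Counter(w for w in words if w not in stop_words)

counts = count_meaningful_words(html, stop_words)
print(counts.most_common(3))
```

With the tags and stop words gone, the top entries are words from the page content itself rather than markup.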
