Posts Tagged cal poly
Cal Poly Spring Dance Show 2010
Posted by Prentice Wongvibulsin in Photography on May 30, 2010
Highlights: Cal Poly Spring Dance Show – Images by Prentice Wongvibulsin
This year’s student-choreographed dance show was even better than last year’s! You can find all 250 photos in the gallery below (http://www.photoshelter.com/gallery/Spring-Dance-Show-2010/G0000X2lu0PuqW1U). Photos from last year’s show are also available.
Spring Dance Show 2010 – Images by Prentice Wongvibulsin
Debut Tournament
Posted by Prentice Wongvibulsin in Photography on October 29, 2009
The Cal Poly Beach Volleyball Club had their first tournament, the Debut Tournament, at Pismo Beach on October 25th.
Mixed Fours Division Champions

The complete set can be found here.
A more select set of photos can be found here.
Spring Dance Show – Unleashed
Posted by Prentice Wongvibulsin in Photography on May 30, 2009
Check out photos of the 2009 Spring Dance Show!
Python Talk
Posted by Prentice Wongvibulsin in Programming on May 20, 2009
I gave a talk on Python for CPLUG yesterday.
Check out the slides here.
CPLUG: http://www.cplug.org/
SLIDES: http://prenticew.com/talks/pytalk09
During the talk, I wrote a simple Python script to pull content off the web and parse the data… here’s the gist of what I did:
First, let’s pull a webpage off the internet:
import httplib  # Python 2 module (renamed http.client in Python 3)

def get_webpage():
    conn = httplib.HTTPConnection('en.wikipedia.org')
    conn.request("GET", "/wiki/Python_(programming_language)")
    rd = conn.getresponse()
    print rd.status, rd.reason
    return rd.read()
This function creates an HTTPConnection object for en.wikipedia.org and stores it in conn.
We then send a GET request for the Python wiki page. Calling getresponse() on the connection returns an HTTPResponse object that holds the result of the request.
The status can be accessed with .status and .reason, and the data can be read with .read().
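If you want to try the same request/response steps without hitting Wikipedia, here is a rough sketch in Python 3, where httplib is called http.client. It assumes a throwaway local server; HelloHandler, demo, and the host/port parameters are made up for this illustration and were not part of the talk. One difference worth knowing: in Python 3, read() returns bytes, so the sketch decodes them.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello from a local test server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example's output quiet


def get_webpage(host, port, page):
    """Same steps as the function above: connect, GET, read the response."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", page)
        rd = conn.getresponse()
        # read() returns bytes in Python 3, so decode to get a string.
        return rd.status, rd.reason, rd.read().decode("utf-8")
    finally:
        conn.close()


def demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return get_webpage("127.0.0.1", server.server_address[1], "/")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, reason, html = demo()
    print(status, reason)
    print(html)
```

The status and reason come back as an integer and a short string, and the body is whatever the server sent.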
Calling get_webpage() yields the raw HTML of the wiki page as a string. This is not very interesting or useful on its own, so let’s do something else with this data… let’s write a frequency counter:
def get_freqct(data):
    wordlist = data.split(' ')
    freqct = {}
    for s in wordlist:
        if s not in freqct:
            freqct[s] = 1
        else:
            freqct[s] += 1
    return freqct
We can pass the data (a string) we got from the first function to our get_freqct function. The function first uses the built-in string method split() to split the string on the space character, returning a list of words. We then iterate through the wordlist and build the frequency count using the dictionary data type. At this point we have something fairly interesting, but simply printing out this dictionary is fairly cluttered… let’s sort it!
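To see what the counter actually returns, here is a small sketch that runs it on a made-up sample string. The sep parameter is my own addition for illustration (the talk's version always splits on ' '). It shows one thing to watch for: split(' ') splits on single spaces only, so double spaces and newlines leave odd entries, while split() with no argument splits on any run of whitespace.

```python
def get_freqct(data, sep=' '):
    """Count how often each word occurs.

    sep is a hypothetical extra parameter: ' ' mimics split(' '),
    None mimics split() and splits on any run of whitespace.
    """
    freqct = {}
    for s in data.split(sep):
        freqct[s] = freqct.get(s, 0) + 1
    return freqct


sample = "spam eggs  spam\nham"  # note the double space and the newline

by_space = get_freqct(sample)             # behaves like split(' ')
by_whitespace = get_freqct(sample, None)  # behaves like split()

print(by_space)
print(by_whitespace)
```

With split(' '), the double space produces an empty-string entry and "spam\nham" is counted as a single word; with split(), "spam" is counted twice as you would expect.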
You can quickly sort the contents of this dictionary with the sorted function:
from operator import itemgetter
sol = sorted(d.items(), key=itemgetter(1))
This statement takes the items in d (the dictionary) as a list of (key, data) tuples and sorts them by the data field, which itemgetter(1) picks out of each tuple. So you’ll end up with a list of tuples ordered by count, from lowest to highest.
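As a quick illustration, here is the same sort on a tiny dictionary with made-up counts. The reverse=True variant is not something from the talk; it is just the standard way to get the most frequent words first.

```python
from operator import itemgetter

d = {"the": 3, "python": 2, "talk": 1}  # made-up counts for illustration

# itemgetter(1) returns the second element of each (word, count) tuple.
ascending = sorted(d.items(), key=itemgetter(1))
descending = sorted(d.items(), key=itemgetter(1), reverse=True)

print(ascending)
print(descending)
```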
Then we can print the list with the following for loop:
for word, count in sol:
    print word, ":", count
This for loop unpacks the contents of each of the tuples in the sorted list (sol) into the variables word and count. The variables are then printed with the print statement.
If you run this code… you’ll notice that a lot of HTML tags (or parts of HTML tags) get counted. This is not very desirable, so let’s filter them out using a regular expression!
import re
data = re.sub(r'<[^>]+>', '', data)
This regular expression takes the raw data (string) returned by the get_webpage function and replaces each occurrence of an HTML tag with an empty string.
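Here is a small sketch of that substitution on a made-up snippet of HTML. The strip_tags helper and the sample strings are mine, not from the talk. Keep in mind this is a quick filter rather than a real HTML parser: it only removes the tags themselves, so text inside script or style elements is left behind.

```python
import re

TAG_RE = re.compile(r'<[^>]+>')  # same pattern as above, compiled once


def strip_tags(html):
    """Replace every <...> run with an empty string."""
    return TAG_RE.sub('', html)


sample = '<p>Python is a <a href="/wiki/Language">language</a>.</p>'
print(strip_tags(sample))

# The tags go away, but the script's contents stay in the text.
leftover = strip_tags('<script>var x = 1;</script>')
print(leftover)
```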
Deconstructing the regular expression:
< – matches the ‘<’ symbol
[^>]+ – matches one or more of anything except the ‘>’ symbol (where + means one or more)
> – matches the ‘>’ symbol
…and put it all together:
#!/usr/bin/python
# Python 2 script: fetch a page, strip HTML tags, count word frequencies.
import httplib
from operator import itemgetter
import re

def get_webpage(site, page):
    # Fetch the page and return the response body as a string.
    conn = httplib.HTTPConnection(site)
    conn.request("GET", page)
    rd = conn.getresponse()
    print rd.status, rd.reason
    return rd.read()

def get_freqct(wordlist):
    # Map each word to the number of times it appears in wordlist.
    freqct = {}
    for s in wordlist:
        if s not in freqct:
            freqct[s] = 1
        else:
            freqct[s] += 1
    return freqct

def main():
    data = get_webpage('en.wikipedia.org', "/wiki/Python_(programming_language)")
    # Remove anything that looks like an HTML tag.
    data = re.sub(r'<[^>]+>', '', data)
    d = get_freqct(data.split(' '))
    # Sort the (word, count) pairs by count, lowest first.
    sol = sorted(d.items(), key=itemgetter(1))
    for word, count in sol:
        print word, ":", count

if __name__ == "__main__":
    main()
The following is a snippet of what the script would yield:
language : 24
code : 24
which : 24
by : 27
Retrieved : 32
with : 32
are : 33
as : 38
on : 50
for : 51
in : 64
is : 80
to : 92
a : 98
Python : 103
and : 122
of : 125
the : 144
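If you are on Python 3, the counting part of the script can be written a little more compactly. This is only a sketch of one possible port, not what I showed in the talk: word_counts is a made-up name, it takes the HTML as a string instead of fetching it, and it uses collections.Counter and split() (any whitespace) in place of the hand-written loop and split(' ').

```python
import re
from collections import Counter
from operator import itemgetter


def word_counts(html):
    """Return (word, count) pairs sorted by count, lowest first."""
    text = re.sub(r'<[^>]+>', '', html)  # drop anything that looks like a tag
    counts = Counter(text.split())       # split() handles any whitespace
    return sorted(counts.items(), key=itemgetter(1))


sample = "<p>the the the</p> <b>python python</b> talk"
for word, count in word_counts(sample):
    print(word, ":", count)
```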