Posts Tagged code
Erasing Iterators from STL Containers (STL Vector, etc.) in a Loop
Posted by Prentice Wongvibulsin in Programming on April 19, 2010
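// vector::erase() returns an iterator to the element after the erased one,
// so the increment step either erases the current element or advances by one.
for (vector<aType>::iterator it = aVec.begin();
     it != aVec.end();
     it = it->shouldDelete() ? aVec.erase(it) : it + 1) {
    // do stuff
}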
Using Variadic Functions in C
Posted by Prentice Wongvibulsin in Programming on June 29, 2009
I don’t really need to write about how to use va_list, va_start, va_end and va_arg because other tutorials and references already do a good job of explaining them… however, here are some notes on wrapping variadic functions.
Firstly, note the difference between:
void myFc( int arg1, ... )
and the va_list version:
void vmyFc( int arg1, va_list args)
The GNU C library wraps the va_list versions (vprintf, etc.) with variadic versions (printf, etc.) using the following pattern:
int vfunc(int arg1, va_list vargs){
    // do real work
}

int func(int arg1, ...){
    int retval;
    va_list vargs;
    va_start(vargs, arg1);
    retval = vfunc(arg1, vargs);
    va_end(vargs);
    return retval;
}
When wrapping va_list functions, keep in mind that a va_list is consumed by the function it is passed to: after that call its state is indeterminate. If you need to pass the same arguments to more than one function, make a copy of the va_list for each call instead of reusing the original.
GNU C doc for stdarg.h: note the __va_copy macro (C99 standardizes it as va_copy).
For example, wrapping the snprintf function:
// to find the length of the string, pass NULL and length 0 to the function:
len = vsnprintf(NULL, 0, fmt, vargs);
// do the allocation
str = (char*) malloc(len+1);
// and finally format the string:
vsnprintf(str, len+1, fmt, vargs);
Note that we use the va_list version of snprintf (vsnprintf).
The second vsnprintf call may (depending on the stdarg implementation) cause a segmentation fault, because vargs was already consumed by the first call. The correct/safe way of doing it is to copy vargs and use a fresh copy in each vsnprintf call:
va_list save;

#ifdef __va_copy
__va_copy(save, vargs);
#else
save = vargs;
#endif
len = vsnprintf(NULL, 0, fmt, save);
va_end(save); // each copy gets its own va_end

str = (char*) malloc(len+1);

#ifdef __va_copy
__va_copy(save, vargs);
#else
save = vargs;
#endif
vsnprintf(str, len+1, fmt, save);
va_end(save);
But as my awesome co-worker Geoff asserts, it’s always better to keep it simple, and perhaps there’s a way to accomplish what you’re trying to do without variadic functions. Check out my other post.
PST~ this was just a brain dump of what was on my mind as I was coding today… if you find this useful and/or find some info lacking OR just incorrect, leave a comment so I can fix it.
Python Talk
Posted by Prentice Wongvibulsin in Programming on May 20, 2009
I gave a talk on Python for cplug yesterday.
Check out the slides here.
CPLUG: http://www.cplug.org/
SLIDES: http://prenticew.com/talks/pytalk09
During the talk, I wrote a simple Python script to pull content off the web and parse the data… here’s the gist of what I did:
First, let’s pull a webpage off the internet:
import httplib

def get_webpage():
    conn = httplib.HTTPConnection('en.wikipedia.org')
    conn.request("GET","/wiki/Python_(programming_language)")
    rd = conn.getresponse()
    print rd.status, rd.reason
    return rd.read()
This function creates an HTTPConnection object for en.wikipedia.org and stores it in conn.
We then do a GET request for the Python wiki page. The result of the request is held by the connection, and we retrieve it by calling getresponse(), which returns an HTTPResponse object.
The status can be accessed with .status and .reason, and the data can be read with .read().
This yields the raw HTML of the wiki page as a string. This is not very interesting or useful, so let’s do something else with this data… let’s write a frequency counter:
def get_freqct(data):
    wordlist = data.split(' ')
    freqct = {}
    for s in wordlist:
        if s not in freqct:
            freqct[s]=1
        else:
            freqct[s]+=1
    return freqct
We can pass the data (a string) we got from the first function to our get_freqct function. The function first uses the built-in string method split(' ') to split the string on single spaces, returning a list of words. We then iterate through the wordlist and build the frequency count using the dictionary data type. At this point we have something fairly interesting, but simply printing out the dictionary is fairly cluttered… let’s sort it!
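As a side note, here is a minimal sketch of the same counting idea written for Python 3 with collections.Counter from the standard library. The helper name count_words and the sample string are made up for illustration; this is not what I wrote in the talk. It also shows how split(' ') differs from split() with no argument when the text contains runs of spaces.

```python
from collections import Counter

def count_words(data):
    """Hypothetical helper: count whitespace-separated words in a string."""
    # split() with no argument splits on any run of whitespace and drops
    # empty strings; split(' ') splits on every single space instead.
    return Counter(data.split())

sample = "the cat  and the hat"   # note the double space after "cat"
print(sample.split(' '))          # includes an empty string for the double space
print(sample.split())             # no empty strings
counts = count_words(sample)
print(counts["the"])
```

Counter is a dict subclass, so the sorting and printing steps work on it the same way as on a plain dictionary.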
You can quickly sort the contents of this dictionary with the sorted function:
from operator import itemgetter
sol = sorted(d.items(), key=itemgetter(1))
This statement takes the items in d (the dictionary) as a list of (key, data) tuples and sorts them by the data field, which itemgetter(1) picks out of each tuple. So you’ll end up with a list of tuples in ascending order of count.
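Here is a small sketch of that sort on a made-up dictionary (the words and counts are just sample data), written for Python 3. It also shows reverse=True, in case you would rather see the most frequent words first.

```python
from operator import itemgetter

# Sample data for illustration only
d = {"the": 3, "cat": 1, "hat": 2}

# itemgetter(1) returns element 1 of each (word, count) tuple: the count
sol = sorted(d.items(), key=itemgetter(1))
print(sol)

# Same sort, largest count first
top = sorted(d.items(), key=itemgetter(1), reverse=True)
print(top[0])
```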
Then we can print the list with the following for loop:
for word,count in sol:
    print word, ":", count
This for loop unpacks the contents of each of the tuples in the sorted list (sol) into the variables word and count. The variables are then printed with the print statement.
If you run this code… you’ll realize that a lot of HTML tags (or parts of HTML tags) get counted. This is not very desirable, so let’s filter them out using a regular expression!
data = re.sub(r'<[^>]+>','',data)
This regular expression takes the raw data (string) returned by the get_webpage function and replaces each occurrence of an HTML tag with an empty string.
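To see the substitution on a small input, here is a sketch in Python 3 using a made-up HTML fragment. The name TAG_RE is just for illustration. The second case shows a limitation of this simple pattern: a '>' inside an attribute value ends the match early, so this is fine for a quick word count but is not a real HTML parser.

```python
import re

# Same pattern as above, compiled once
TAG_RE = re.compile(r'<[^>]+>')

html = '<p>Python is a <a href="/wiki/Language">language</a>.</p>'
text = TAG_RE.sub('', html)
print(text)

# Limitation: the match stops at the first '>' even inside an attribute
tricky = '<img alt="a>b">x'
leftover = TAG_RE.sub('', tricky)
print(leftover)
```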
Deconstructing the regular expression:
< - matches the ‘<’ symbol
[^>]+ - matches one or more of anything except the ‘>’ symbol (where + means one or more)
> - matches the ‘>’ symbol
…and put it all together:
#!/usr/bin/python
import httplib
from operator import itemgetter
import re

def get_webpage(site,page):
    conn = httplib.HTTPConnection(site)
    conn.request("GET", page)
    rd = conn.getresponse()
    print rd.status, rd.reason
    return rd.read()

# parameter renamed from "list" so it does not shadow the built-in
def get_freqct(wordlist):
    freqct = {}
    for s in wordlist:
        if s not in freqct:
            freqct[s]=1
        else:
            freqct[s]+=1
    return freqct

def main():
    data = get_webpage('en.wikipedia.org',"/wiki/Python_(programming_language)")
    data = re.sub(r'<[^>]+>','',data)
    d = get_freqct(data.split(' '))
    sol = sorted(d.items(), key=itemgetter(1))
    for word,count in sol:
        print word, ":", count

if __name__ == "__main__":
    main()
The following is a snippet of what the script would yield:
language : 24
code : 24
which : 24
by : 27
Retrieved : 32
with : 32
are : 33
as : 38
on : 50
for : 51
in : 64
is : 80
to : 92
a : 98
Python : 103
and : 122
of : 125
the : 144
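The script above is Python 2 (httplib and the print statement). If you want to try it on Python 3, the following is a rough sketch of a port, not something I ran during the talk. It assumes the module is now http.client, that the site wants HTTPS, and that sending a User-Agent header helps; the header value, the timeout and the strip_tags helper name are my own placeholders. The sketch only defines the functions, so call main() yourself to hit the network.

```python
#!/usr/bin/env python3
import http.client
import re
from operator import itemgetter

def get_webpage(site, page):
    # Assumption: the server expects HTTPS and a User-Agent header
    conn = http.client.HTTPSConnection(site, timeout=10)
    try:
        conn.request("GET", page, headers={"User-Agent": "freqct-example/0.1"})
        rd = conn.getresponse()
        print(rd.status, rd.reason)
        # read() returns bytes in Python 3, so decode before using re/split
        return rd.read().decode("utf-8", errors="replace")
    finally:
        conn.close()

def strip_tags(data):
    # Hypothetical helper wrapping the tag-removing regular expression
    return re.sub(r'<[^>]+>', '', data)

def get_freqct(wordlist):
    freqct = {}
    for s in wordlist:
        # dict.get supplies 0 the first time a word is seen
        freqct[s] = freqct.get(s, 0) + 1
    return freqct

def main():
    data = get_webpage('en.wikipedia.org', "/wiki/Python_(programming_language)")
    d = get_freqct(strip_tags(data).split(' '))
    for word, count in sorted(d.items(), key=itemgetter(1)):
        print(word, ":", count)
```

The counts you get today will not match the snippet above, since the page has changed since the talk.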

Stripping C/C++ Comments
Posted by Prentice Wongvibulsin in Programming on June 22, 2009
Here’s some code to strip comments from a C/C++ file. The code is adapted from a posting at http://stackoverflow.com/questions/241327/python-snippet-to-remove-c-and-c-comments
import re

# adapted from: http://stackoverflow.com/questions/241327/python-snippet-to-remove-c-and-c-comments
# strips c/c++ comments
def strip_comment(text):
    rep = r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"'
    pattern = re.compile(rep, re.DOTALL | re.MULTILINE)
    return re.sub(pattern, lambda match:(match.group(0),"")[match.group(0).startswith('/')], text)
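To show why the pattern also matches string and character literals, here is a small usage sketch in Python 3. It repeats the function so the block runs on its own, with the tuple-indexing trick spelled out as a named callback (keep_or_drop is a name I made up). The C snippet is sample input. Literals are matched so they can be put back untouched, which keeps a "//" inside a string from being treated as a comment.

```python
import re

def strip_comment(text):
    # Alternatives, in order: // comment, /* comment */, 'char', "string"
    rep = r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"'
    pattern = re.compile(rep, re.DOTALL | re.MULTILINE)

    def keep_or_drop(match):
        s = match.group(0)
        # Comments start with '/', literals start with a quote
        return "" if s.startswith('/') else s

    return pattern.sub(keep_or_drop, text)

# Sample C source for illustration
src = 'int x = 1; // counter\nchar *s = "a // b"; /* block */\n'
out = strip_comment(src)
print(out)
```

Note that the text before each comment, including trailing spaces, is left as it was.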