Python Programming - Exercises: Finn Arup Nielsen
Finn Arup Nielsen DTU Compute Technical University of Denmark October 10, 2013
Installation
Install Python
Install Python and some libraries. Check that you can write:

    $ python
    >>> import simplejson
    >>> import feedparser
    >>> import cherrypy
    >>> import pymongo
    >>> import nltk
    >>> nltk.download()
    >>> from nltk.corpus import brown
    >>> brown.words()
    ['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', ...]
Install Python
Install IPython (e.g., with pip). Start it with:

    ipython --pylab

(older IPython versions used ipython -pylab). Once it is running, make sure you can write:

    In [1]: plot(sin(linspace(0,8,100)))
Extra task
After installing CherryPy, check that it works. Try to get bonus-sqlobject.py from the tutorial to work. Note that this requires the installation of an SQL database. One of the lines in the bonus-sqlobject.py file states:

    # configure your database connection here
    __connection__ = 'mysql://root:@localhost/test'

If you don't want to install MySQL, try installing the simpler SQLite and its Python support, and then change the connection line.
General Python
Dictionaries
Count the number of items in a list, with the result in a dictionary. List example:

    l = ['a', 'b', 'f', 'f', 'b', 'b']

Should give something like:

    c = {'a': 1, 'b': 3, 'f': 2}

What and where is defaultdict?
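One possible approach to the counting task, as a minimal sketch rather than the intended solution: defaultdict lives in the standard-library collections module and supplies a default value for missing keys. The function name count_items is a hypothetical name chosen for this illustration.

```python
from collections import Counter, defaultdict


def count_items(items):
    """Count occurrences of each item using a defaultdict."""
    counts = defaultdict(int)  # missing keys start at int() == 0
    for item in items:
        counts[item] += 1
    return dict(counts)


l = ['a', 'b', 'f', 'f', 'b', 'b']
c = count_items(l)
print(c)

# collections.Counter does the same bookkeeping in one call.
assert c == dict(Counter(l))
```

With a plain dict the loop would need an explicit check for keys not yet seen; the defaultdict removes that branch.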
Recursion
Implement a factorial function, n!, with recursion:

    >>> factorial(4)
    24

(4! = 1 · 2 · 3 · 4 = 24). See what happens with factorial(1000).
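A minimal recursive sketch, assuming a non-negative integer argument (the input check is an addition for this illustration, not part of the exercise text):

```python
import sys


def factorial(n):
    """Return n! for a non-negative integer n, computed recursively."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:                     # base case: 0! == 1! == 1
        return 1
    return n * factorial(n - 1)    # recursive step: n! = n * (n-1)!


print(factorial(4))                # 24
print(sys.getrecursionlimit())     # the depth limit that large n runs into
```

Each call adds a stack frame, so for large n the recursion depth may exceed the interpreter's limit and raise an error instead of returning a value. Whether factorial(1000) gets through depends on that limit and on how deep the call stack already is.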
Classes
Construct a module with a derived dictionary class with sorted keys:

    >>> s = SortedKeysDict({'a': 1, 'c': 2, 'b': 3, 'd': 4})
    >>> s.keys()
    ['a', 'b', 'c', 'd']
    >>> s.items()
    [('a', 1), ('b', 3), ('c', 2), ('d', 4)]

Also implement doctest for the class. Document it and extract the documentation with, e.g., pydoc.
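The following is one way such a class might look, as a sketch written for Python 3. It assumes that keys() and items() should return plain lists, as in the session shown in the exercise. The values() and __iter__ overrides are additions for consistency, not requirements from the exercise.

```python
import doctest


class SortedKeysDict(dict):
    """Dictionary whose keys(), values() and items() follow sorted key order.

    >>> s = SortedKeysDict({'a': 1, 'c': 2, 'b': 3, 'd': 4})
    >>> s.keys()
    ['a', 'b', 'c', 'd']
    >>> s.items()
    [('a', 1), ('b', 3), ('c', 2), ('d', 4)]
    """

    def keys(self):
        """Return the keys as a sorted list."""
        return sorted(super().keys())

    def values(self):
        """Return the values ordered by their sorted keys."""
        return [self[key] for key in self.keys()]

    def items(self):
        """Return (key, value) pairs ordered by key."""
        return [(key, self[key]) for key in self.keys()]

    def __iter__(self):
        """Iterate over the keys in sorted order."""
        return iter(self.keys())


# Run the examples in the class docstring; failures, if any, are printed.
doctest.run_docstring_examples(
    SortedKeysDict, {'SortedKeysDict': SortedKeysDict}, name='SortedKeysDict')
```

If the class is saved in a file with the hypothetical name sortedkeysdict.py, the documentation can be extracted with python -m pydoc sortedkeysdict.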
Project Euler
Project Euler is a website with mathematical problems that should/could be solved by computers. Go to the website http://projecteuler.net/ and solve some of the problems using Python. As an example, problem number 16 can be solved in one line of Python:

    >>> sum(map(int, list(str(2**1000))))
    1366
Encoding
UTF-8 encoding/UNICODE
In terms of UTF-8/Unicode, what is wrong with the following code: https://raw.github.com/gist/1035399 Hint: look at the word naïve. Make a correction. See also:
http://finnaarupnielsen.wordpress.com/2011/06/20/simplest-sentiment-analysis-in-python-with-af/
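The code in the gist is not reproduced here. As a general illustration of the kind of pitfall the hint points at, the Python 3 sketch below tokenizes the same text once as undecoded UTF-8 bytes and once as a decoded Unicode string. The sentence and the variable names are made up for this example.

```python
import re

text = "This is a na\u00efve approach"   # "naïve" written with an escape
raw = text.encode("utf-8")               # bytes, as read from a file in binary mode

# Tokenizing the undecoded bytes: \w only matches ASCII here, so the
# two-byte UTF-8 sequence for the letter i-diaeresis splits the word.
byte_tokens = re.findall(rb"\w+", raw)

# Decoding first gives a Unicode string, and the word stays whole.
unicode_tokens = re.findall(r"\w+", raw.decode("utf-8"))

print(byte_tokens)
print(unicode_tokens)
```

A word list lookup on the byte tokens would miss the word, which is why decoding the input to Unicode before matching is usually the safer order.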
UTF-8 encoding/UNICODE
Translate the AFINN sentiment word list, or perhaps just a part of it, with a language translation web service to a language you know, and see if it works with a couple of sentences.
Numerical python
Matrix rank
Compute the rank of the array:

    >>> from numpy import *
    >>> A = array([[1, 0], [0, 0]])
    >>> rank(A)
    2

Hmmmm ??? Not this one.
Find the matrix rank by computing the number of numerically non-zero singular values. Function header:

    def matrixrank(A, tol=None):
        """
        Computes the matrix rank
        >>> matrixrank(array([[1, 0], [0, 0]]))
        1
        """

Hint: use the svd function in numpy.linalg.
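A possible implementation of the function header above, as a sketch. The default tolerance is an assumption made for this illustration (largest singular value times largest matrix dimension times machine epsilon); the exercise does not specify one.

```python
import numpy as np


def matrixrank(A, tol=None):
    """Compute the matrix rank as the number of singular values above tol.

    >>> matrixrank(np.array([[1, 0], [0, 0]]))
    1
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    s = np.linalg.svd(A, compute_uv=False)   # singular values only
    if s.size == 0:
        return 0
    if tol is None:
        # Assumed default: scale machine epsilon by the largest singular
        # value and the largest matrix dimension.
        tol = s.max() * max(A.shape) * np.finfo(float).eps
    return int(np.sum(s > tol))


print(matrixrank(np.array([[1, 0], [0, 0]])))   # 1
print(matrixrank(np.eye(3)))                    # 3
```

Comparing against a tolerance rather than exactly zero matters because rounding errors typically leave tiny non-zero singular values for rank-deficient matrices.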
Statistical distributions
Generate 10000 sets with 10 Gaussian distributed samples, square each element and sum over the 10 samples. Plot the histogram of the 10000 sums together with the theoretical curve of the probability density function: the χ² PDF with 10 degrees of freedom, available from the pdf() function in the scipy.stats.chi2 class.
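The simulation can be sketched as follows, assuming the Gaussian samples are standard normal (zero mean, unit variance), which is what makes the sums chi-squared distributed with 10 degrees of freedom. The seed, the number of bins, and the helper name plot_comparison are arbitrary choices for this illustration.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)           # fixed seed, chosen arbitrarily
n_sets, n_samples = 10000, 10

samples = rng.standard_normal((n_sets, n_samples))
sums = (samples ** 2).sum(axis=1)        # one sum of squares per set

# Normalized histogram, so that it is comparable to a density.
density, edges = np.histogram(sums, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2
theoretical = chi2.pdf(centers, df=n_samples)

print(sums.mean())                       # should be close to 10
print(np.abs(density - theoretical).max())


def plot_comparison():
    """Plot histogram and theoretical curve (needs matplotlib)."""
    import matplotlib.pyplot as plt
    plt.hist(sums, bins=50, density=True, alpha=0.5, label="simulated sums")
    x = np.linspace(0, sums.max(), 500)
    plt.plot(x, chi2.pdf(x, df=n_samples), label="chi2 pdf, 10 d.o.f.")
    plt.legend()
    plt.show()
```

Calling plot_comparison() draws the histogram and the curve in one figure; the histogram has to be normalized (density=True) for the two to share a scale.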
Coauthors
Read coauthors.csv, a tab-separated file with a co-author matrix. Find the author with the most coauthoring. Plot the largest connected component of the network with NetworkX.
Text mining
Email mining
Change the feature set to fewer words or other words. Code available here: https://gist.github.com/1226214
Web serving
Pandas