Lecture 2 - CS50's Computer Science For Lawyers
Lecture 2
From Binary to Programming Languages
Machine Code
Assembly Code
Compilers and Interpreters
Virtual Machines
Python
Input and Printing
Conditionals
Functions
Loops
While Loops
For Loops
Mario
Types
Libraries
Memory
Imprecision
Integer Overflow
Machine Code
Computer manufacturers make CPUs, or Central Processing Units, which recognize certain patterns of
bits. These patterns are specific to the kind of CPU.
CPUs understand machine code. These are the zeroes and ones that tell the machine what to do.
Machine code might look like this: 01111111 01000101 01001100 01000110 00000010 00000001
00000001 00000000 .
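To make these bit patterns a little less opaque, here is a small sketch in Python (a language introduced later in these notes) that reads each group of eight bits as one number. The variable names are our own. As it happens, the second through fourth bytes are the ASCII codes for E, L, and F, which matches the way executable files in the ELF format begin; that is an observation about this particular sample, not something the CPU itself cares about.

```python
# The eight bit patterns quoted above, written as text.
bits = "01111111 01000101 01001100 01000110 00000010 00000001 00000001 00000000"

# Interpret each group of eight bits as one base-2 number (one byte).
values = [int(group, 2) for group in bits.split()]
raw = bytes(values)

print(values)   # the same bytes as ordinary decimal numbers
print(raw)      # the same bytes, with printable ones shown as characters
```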
Assembly Code
It’s quite difficult for us to write machine code directly, so assembly code was created.
Assembly code has a more English-like syntax. Assembly code is an example of source code.
Source code is code written in a more English-like syntax that can be translated to machine code.
Some sequences of characters in assembly code include these: movl , addq , popq , and callq ,
which we might be able to assign meaning to. For example, perhaps addq means to add or callq
means to call a function. What values are we doing these operations on? Well, registers!
The smallest unit of useful memory is called a register. Registers are tiny storage locations inside
the CPU, and they are the smallest units that we can do some operation on. These registers have
names, and we can find them in assembly code as well, such as %ecx , %eax , %rsp , and %rbp .
Languages with easier to understand syntax than assembly code were created. Below is a program
called hello.c that prints “hello, world” in the programming language C.
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}
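Compilers and Interpreters
To turn source code into machine code, we use a compiler. Here, cc compiles hello.c into a
program called hello , which we then run:
$ cc -o hello hello.c
$ ./hello
hello, world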
Some languages skip the compilation step and instead use interpreters. An interpreter takes in source
code and runs it directly, line by line, from top to bottom and left to right.
An interpreter is itself a program made of the zeroes and ones that the CPU understands. It is
written to recognize the keywords and functions in the source code and carry them out.
Python is an interpreted language. To say “hello, world” in Python, we write the following line in
hello.py .
print("hello, world")
To interpret this source code, at the terminal, we simply type python hello.py , where
python is the name of the interpreter.
The program python , in this case, opens up the file hello.py , reads it top to bottom,
recognizes the function print , and knows what to do: print “hello, world” on the screen
and quit.
A sample of the terminal window might look like this:
$ python hello.py
hello, world
Comparing compilers and interpreters, we might note that interpreters skip the step of producing a
compiled program before running it. This causes a performance penalty for interpreted languages,
since the interpreter has to translate the source code again each time the program runs.
To reduce this cost, Python first compiles source code into an intermediate form called bytecode
and can save the result in a file. When the program runs again, Python can reuse the pre-compiled
version instead of translating the source code again.
Bytecode looks something like this:
0 LOAD_GLOBAL 0 (print)
3 LOAD_CONST 1 ('hello, world')
6 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
9 POP_TOP
10 LOAD_CONST 0 (None)
13 RETURN_VALUE
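We can ask Python to show us its own bytecode. The sketch below uses the built-in compile function and the standard dis module; the variable names are our own. Instruction names and numbering differ between Python versions, so the output will probably not match the listing above exactly.

```python
import dis

source = 'print("hello, world")'

# compile() turns source text into a code object that holds bytecode.
code = compile(source, "hello.py", "exec")

# Collect the name of each bytecode instruction, in order.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)

# The string constant from the source is stored alongside the bytecode.
print(code.co_consts)
```

Calling dis.dis(code) instead prints a table in the same style as the listing above.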
Virtual Machines
What if we want to run these programs on different computers, with different CPUs?
Python
$ python hello1.py
What is your name? David
hello, David
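The source of hello1.py is not reproduced here. As a minimal sketch of a program that could produce the session above, we might write something like the following; the function name greeting is our own, and the name is hard-coded so that the example runs without waiting for anyone to type.

```python
def greeting(name):
    # Build the line that the session above prints.
    return f"hello, {name}"

# In a real hello1.py the name would likely come from the user, e.g.:
#     name = input("What is your name? ")
# A fixed value is used here so the sketch runs on its own.
print(greeting("David"))
```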
In arithmetic.py , we might try to add two numbers that the user types in:
x = input("x: ")
y = input("y: ")
print(x + y)
$ python arithmetic.py
x: 1
y: 2
12
We get 12 rather than 3. The input function returns a string, and the + operator
concatenates strings, so we get the string “1” joined to the string “2”.
To fix this issue, we can change the input value from a string to an int, or integer. The function to do
that is simply int .
Our code can then be written as…
x = int(input("x: "))
y = int(input("y: "))
print(x + y)
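To see the two behaviors side by side without typing any input, here is a small sketch. The two helper functions are our own names, and the strings "1" and "2" stand in for what input would return.

```python
def add_as_text(x, y):
    # input() hands back text, and + on text joins it end to end.
    return x + y

def add_as_numbers(x, y):
    # int() converts text such as "1" into the integer 1 before adding.
    return int(x) + int(y)

print(add_as_text("1", "2"))     # prints 12
print(add_as_numbers("1", "2"))  # prints 3
```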
Conditionals
Let us instead write a program that compares two numbers.
In conditions.py , we might write…
x = int(input("x: "))
y = int(input("y: "))
if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")
elif x == y:
    print("x equals y")
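If x is neither less than nor greater than y, the two must be equal, so the last condition can be
replaced with else:
x = int(input("x: "))
y = int(input("y: "))
if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")
else:
    print("x equals y")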
We can also check two conditions at once with or :
c = input("Answer: ")
if c == "Y" or c == "y":
    print("yes")
elif c == "N" or c == "n":
    print("no")
In this program, if the user inputs “Y”, c == "Y" will evaluate to true, and the program will print
“yes”. If the user inputs “y”, c == "y" will evaluate to true, and the program will also print “yes”.
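The same decision can be sketched as a function, which makes it easy to try several answers at once. The names interpret and interpret_lower are our own; the second version is one possible alternative that converts the answer to lowercase first, so only one comparison per branch is needed.

```python
def interpret(c):
    # Same logic as above, written with `or`.
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    # Anything else matches neither branch.
    return None

def interpret_lower(c):
    # Alternative: normalize the case once, then compare against one value.
    answer = c.lower()
    if answer == "y":
        return "yes"
    elif answer == "n":
        return "no"
    return None

for c in ["Y", "y", "N", "n", "maybe"]:
    print(c, interpret(c), interpret_lower(c))
```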
Functions
We might want to define our own function, such as square, where calling it returns the square of an
input.
In return.py , we might define our own function called square .
def main():
    x = int(input("x: "))
    print(square(x))

def square(n):
    return n * n

if __name__ == "__main__":
    main()
Note that we can’t call the function square before defining the function square since the
interpreter reads from top to bottom. To fix this, we can create a main function, and then call the
main function at the end of the file.
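When we call the main function, we normally wrap the call in if __name__ == "__main__": . This
condition is true only when the file is run directly, so the main function is not executed at the
wrong time, such as when the file is imported by another program.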
With the square function, we’ve abstracted away the multiplication, and now we can simply call
square .
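To illustrate why the order matters, the sketch below runs a tiny program in which square is called before it has been defined. The program is stored in a string and run with the built-in exec so that the error can be caught and shown rather than stopping the example; that wrapping is our own device, not part of return.py .

```python
source = """
print(square(3))   # square has not been defined yet at this point

def square(n):
    return n * n
"""

try:
    exec(source, {})
    error = None
except NameError as e:
    # Python reached the call before it ever saw the definition.
    error = e

print(error)
```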
Loops
While Loops
To write a program positive.py that will pester the human until the human inputs a positive
integer, we might write the following:
def main():
    i = get_positive_int("i: ")
    print(i)

def get_positive_int(prompt):
    while True:
        n = int(input(prompt))
        if n > 0:
            break
    return n

if __name__ == "__main__":
    main()
In the function get_positive_int , while True gives us an infinite loop. Python will then
execute the indented code again and again until it is told to stop.
Note that True and False are Boolean values.
The break keyword tells Python to exit the loop.
Once the loop has been broken, the function returns the value.
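The version above assumes the human types digits; if they type something like cat , int raises a ValueError and the program stops. As one possible variation, the sketch below keeps asking in that case too. The read parameter is a hypothetical addition of ours: it defaults to input , but lets us supply canned answers so the example runs without a keyboard.

```python
def get_positive_int(prompt, read=input):
    while True:
        text = read(prompt)
        try:
            n = int(text)
        except ValueError:
            # Not a number at all: ask again.
            continue
        if n > 0:
            # Returning from inside the loop also ends the loop.
            return n

# Canned answers standing in for what a human might type.
answers = iter(["-3", "cat", "0", "7"])
result = get_positive_int("i: ", read=lambda prompt: next(answers))
print(result)  # prints 7
```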
For Loops
To write a program score.py , where the user inputs a number and that many hashes are printed, we
might write the following:
n = int(input("n: "))
for i in range(n):
    print("#", end="")
print()
range is built into Python, and range(n) gives us the values from 0 to n - 1 inclusive.
The print function automatically prints a new line. In other words, it moves the cursor to the
next line after printing. To stop Python from printing each hash on a separate line, we specify
end="" as another argument to print , which tells Python to end the lines with nothing.
$ python score.py
n: 10
##########
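To see exactly which values range produces, we can turn it into a list. The sketch below uses n = 4 as an arbitrary example value; it also shows that multiplying a string by a number is another way to build the same row of hashes.

```python
n = 4
values = list(range(n))      # materialize the range so we can look at it
print(values)                # prints [0, 1, 2, 3]

# Building the row as one string gives the same output as the loop.
row = "#" * n
print(row)                   # prints ####
```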
Mario
In Super Mario Bros., a two dimensional world is created! Here’s one setting:
for i in range(4):
    print("?", end="")
print()
To print the block shown, we’ll need to print hashes on both rows and columns. We must first iterate
through the rows, and within each row, we then iterate through each column and print a hash.
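The rows-then-columns idea can be sketched with one loop nested inside another. The picture of the block is not reproduced here, so the 3 by 3 size is an assumption, and the function name block is our own. The function builds the text and returns it so that the result is easy to check.

```python
def block(rows, columns):
    lines = []
    for i in range(rows):            # outer loop: one pass per row
        line = ""
        for j in range(columns):     # inner loop: one hash per column
            line += "#"
        lines.append(line)
    return "\n".join(lines)

print(block(3, 3))
```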
Types
In Python, there are many data types.
bool : True/False
int : Integers (whole numbers)
str : Strings of text
float : Real numbers with decimal points and digits after
dict : Hash table
list : Any number of values back to back
range : Range of values
set : A set of values with no duplicates
tuple : A fixed group of values, such as x, y or latitude, longitude
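As a quick illustration, the sketch below creates one value of each type listed above and asks Python for the name of its type. The particular values, including the coordinates in the tuple, are arbitrary examples of ours.

```python
examples = [
    True,                 # bool
    42,                   # int
    "hello",              # str
    3.14,                 # float
    {"name": "David"},    # dict
    [1, 2, 3],            # list
    range(3),             # range
    {1, 2, 2, 3},         # set (the duplicate 2 is dropped)
    (42.37, -71.11),      # tuple, e.g. latitude, longitude
]

for value in examples:
    print(type(value).__name__, value)
```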
Libraries
In addition to the functions built into the core language, there are libraries and frameworks that
provide additional features. These have to be imported manually to be used.
For example, in Python, if we want to generate pseudorandom numbers, we have to import a function
randint from a library called random .
For example, to get a random integer between 1 and 10, we can write this:
from random import randint
print(randint(1, 10))
We can also just write import random without importing the specific function. In this case,
we’ll have to prefix the function with the library name using dot notation as shown below.
To create a game where the user guesses a random integer between 1 and 10, we can write
this:
import random
n = random.randint(1, 10)
guess = int(input("Guess: "))  # ask the user for a guess
if guess == n:
    print("Correct")
else:
    print("Incorrect")
Note that these numbers are pseudorandom because computers can’t pick a random number the way
humans do. They have to use algorithms, which are deterministic processes.
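One way to see that the process is deterministic is to start the generator from the same seed twice. In the sketch below, the function name rolls and the seed value 50 are arbitrary choices of ours; random.Random creates a separate generator that starts from the given seed.

```python
import random

def rolls(seed, count=5):
    rng = random.Random(seed)          # private generator with a known start
    return [rng.randint(1, 10) for _ in range(count)]

first = rolls(seed=50)
second = rolls(seed=50)
print(first)
print(second)
print(first == second)   # prints True
```

The same seed always leads to the same sequence, which is why these numbers are only pseudorandom.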
Memory
Inside a computer are hardware chips called RAM, or Random Access Memory.
Each of these chips holds some finite number of bytes, which are used to represent the values in our
programs.
Python, like most other languages, decides in advance how many bits to use to represent many kinds
of values in our programs.
Thus, if a value cannot be represented exactly in that many bits, the language will store an
approximation of that value instead.
Imprecision
Let’s take a look at a program called imprecision.py that divides two numbers and prints the
quotient.
x = int(input("x: "))
y = int(input("y: "))
z = x / y
print(f"x / y = {z:.30f}")
The syntax :.30f signifies that we’re printing z as a float to 30 decimal places.
We get…
$ python imprecision.py
x: 1
y: 10
x / y = 0.100000000000000005551115123126
This value isn’t what we expect! We don’t have enough bits to store the entire precise value, so the
computer approximates the quotient. This is called floating-point imprecision.
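The same effect shows up when comparing floats. The sketch below repeats the division with fixed values instead of user input and then compares 0.1 + 0.2 with 0.3; math.isclose from the standard library is one common way to compare floats while allowing for a tiny error.

```python
import math

z = 1 / 10
print(f"{z:.30f}")                    # the stored value is only close to 0.1

print(0.1 + 0.2 == 0.3)               # prints False
print(math.isclose(0.1 + 0.2, 0.3))   # prints True
```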
Integer Overflow
A similar problem occurs with integers.
Consider a number that has been allocated three digits.
We start by counting.
Suppose we count until 999. We carry, and we get 1000.
However, the computer has only allocated three digits, so our 1000 gets mistaken for 000.
This is an example of integer overflow, where our large number has wrapped to a small number.
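Python’s own integers grow as needed, so they do not wrap around like this. To imitate a counter with only three digits, the sketch below keeps just the remainder after dividing by 1000; the function name is our own.

```python
def three_digit_counter(value):
    # Keep only the last three decimal digits, as a 3-digit counter would.
    return value % 1000

print(three_digit_counter(998 + 1))   # prints 999
print(three_digit_counter(999 + 1))   # prints 0, i.e. 000 on the counter
```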
As December 31, 1999 approached, people began to get nervous, because many programs stored the
calendar year with only two digits. For 1999, the year was stored as 99. Once the year 2000 arrived,
the year would be stored as 00, leading to confusion between the years 1900 and 2000. This became
known as the Y2K problem.
Boeing 787 planes once counted elapsed time in hundredths of a second using a fixed-size counter.
If a plane stayed powered on continuously, that counter would overflow on the 248th day, at which
point the plane would go into fail-safe mode and the power would shut off.
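The figure of 248 days can be checked with a little arithmetic. The notes do not say how large the counter was; the sketch below assumes a signed 32-bit counter, whose largest value is 2**31 - 1, because that assumption reproduces the number of days.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TICKS_PER_SECOND = 100            # hundredths of a second

# Assumption (not stated in the notes): a signed 32-bit counter,
# whose largest value is 2**31 - 1.
max_ticks = 2**31 - 1

days = max_ticks / TICKS_PER_SECOND / SECONDS_PER_DAY
print(round(days, 2))             # a bit more than 248 days
```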