Learn to Program from Scratch

Intro

This is an (ongoing) workshop to learn programming and programming languages, plus skills in designing, testing and optimizing software, and is Part IA of the so-called AI Tripos. The way these notes work is that I do the material with you and point out anything that may trip up a beginner. The further we go, the more terse the notes get, because by then I can assume a background. The courses and books we will use expect you to try to figure things out for yourself, since that's the only proven way to learn this material, but their target audience is elite students who qualify to get into the Ivy League, so in the early lectures we'll have to backtrack as we go, filling in some holes in our education. Really the only difference between those students and us is organization and work ethic: they can handle a heavy course load of difficult topics because they're smarter with their time, which is how they got into elite schools.

If you know how to use a web browser you can start this curriculum; no expertise in software installation, programming or any other background is assumed. This means that even if you only own a phone or tablet, you can still complete this material. We'll do multiple things at the same time, dipping into other courses as we complete each one, because then you can see the patterns that connect them.

---- Lectures are currently offline; I'm uploading as torrents and to other media, currently 18.6G left of 38G of videos to upload ----

Solutions

Don't post your solutions to the assignments anywhere, such as keeping a public GitHub repo with every assignment completed. Schools will hide the assignments behind logins if too many solutions are posted; some of the AI courses we'll see later have already had their assignments restricted from the public. We'll work around this but it's annoying. In the beginning I'll post some partial solutions, breaking this rule, but it's better you figure it out yourself anyway since that's how you learn. There are no right or wrong answers to the exercises; this isn't typical school, it's self-directed research where you experiment your way through it at the beginning and then get better and better over time.

Study advice

Do 1% effort for 100 consecutive days, as per CMU prof Ryan O'Donnell, and see where you end up. It is better to take your time and redo a chapter numerous times (or go back and redo a few chapters), really fleshing out the exercises, than to speed ahead and end up confused.

Start here

We begin with Brown University's CS019 Accelerated Introduction to Computer Science, which is 2 semesters combined into one and as a result omits some tedious analytical details you would find in regular classes, which you can always learn on your own should you later need them. This is not a usual compsci class: things like logarithmic-complexity data structures, infinite streams, graphs and testing are explained completely in the lectures and book.

Look around the various course pages, like the readme and the notes about the unusual assignments, which respect the intelligence of the student. Look around the instructor's page too, like his manifesto on the current state of compsci education, and scroll through his papers. Skim the paper 'What Help Do Students Seek in TA Hours?' because it's from the exact course we are about to take. A 'TA' is a teaching assistant, usually a student who did well in the course before. There is a Quora post about the book we'll do with the course, from the writer/instructor: "A person completing the first half is ready for most upper-level computer science". We're doing the entire book.

There is a placement process for the course but we're going to skip it; CS019 assumes no background and we have the luxury of time to fill in any blanks as we go.

Materials needed to complete this

As stated in the intro, it's possible to complete this course with just a phone or a tablet: the interpreter you use is online and even has a static type checker.

  • the book Programming and Programming Languages (use the latest version, which as of this writing is 2020)
  • a web browser to view the lectures
    • the lectures cover extra material not in the book
  • a web browser to use the Pyret online interpreter
  • a google account to save your work if you wish
  • the 2018 copy of the assignments since we are watching the 2018 lectures
  • the labs at the bottom of this page (Big-O, Higher-Order Functions, Streams etc).

Archiving the assignments (optional)

If you have wget installed:

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://cs.brown.edu/courses/cs019/2018/assignments.html

You may also want to archive the pyret documentation or even the entire book using the same method. To archive the lectures try a browser plugin that can download video content and save them as you watch.

Lecture 1 CS019 Pyret Demo

We're starting with the lecture titled Wed 9/5/18. Go to code.pyret.org and click Open Editor as you follow along with the lecture. The book will cover everything he's doing (what a list is, what a function is); this is just a demonstration of the online interpreter. You can run Pyret offline too.

Notice he shuts down a lot of out of scope questions to not derail the lecture.

@34:20 is the private app Exemplar, which we don't have access to; it's usually an extra part of the homework. The idea is that the students are given a task, 'write a program that calculates the median of the data in a list'. They start by writing the tests first, before implementing the program, to get them thinking about the problem. The Exemplar app has implementations of the assignment that students can't see the source code for, some of them purposely buggy (the 'chaffs'), and your tests are judged by which potential bugs they catch. We will have to write a lot of tests ourselves, covering enough cases until we've fleshed out the problem enough that we can go about implementing it. The professor's university page has a paper on Exemplar detailing how many tests students typically write, which is 30+.

So how can we figure out the test cases a student would write for the median program? By looking up what a median is and the procedure to find it. Reading that article we see that the median is the value that separates a finite list of numbers in half once they're arranged from smallest to greatest. If there is an odd number of entries, the middle number that splits the list into two equal halves is picked; otherwise it is defined to be the mean of the two middle values (add them both and divide by 2), the idea being to get as close as possible to evenly splitting the data. That's why the test for [list: 1, 3, 2] is 2: your program would reorder the list to [list: 1, 2, 3] and choose 2 as the median. The Fermi problems he talks about must be released on Piazza, which requires campus logins; https://piazza.com/class is a kind of wiki/message forum for courses that encourages anonymous postings so students don't feel embarrassed to ask questions. A lot of universities now solely distribute their course materials through it, meaning the public doesn't have access anymore. You can find Fermi problems yourself online anyway by searching for them.
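
To make 'tests first' concrete, here's a hedged Pyret sketch for a hypothetical median function (the function name, the implementation and the test cases are my own, not the official assignment); the point is that the where: block gets written before the body:

fun median(l :: List<Number>) -> Number:
  doc: "Returns the median of a non-empty list of numbers"
  sorted = l.sort()
  n = sorted.length()
  half = num-floor(n / 2)
  if num-modulo(n, 2) == 1:
    sorted.get(half)
  else:
    (sorted.get(half - 1) + sorted.get(half)) / 2
  end
where:
  median([list: 1, 3, 2]) is 2          # odd length: middle of the sorted list
  median([list: 1, 2, 3, 4]) is 2.5     # even length: mean of the two middle values
  median([list: 5]) is 5                # single-element list
  median([list: 2, 2, 2]) is 2          # duplicates
end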

At the end he mentions the design recipe as a way to save TAs time: if you show up to office hours for help they will write a template with you, going through all the various stages, and not just give you an answer. He notes it is totally optional to use the design recipe, but it will help you with the hard assignments, and if you're going to office hours with a lot of questions clearly your methods aren't working, so why not try theirs. If you read the paper 'What Help Do Students Seek in TA Hours?' mentioned at the beginning of this workshop, you'll see statistics on the outcomes and final grades of students who used the design recipe vs those who didn't.

Templates are covered in the third lecture; they're a tool to extract 'code for free' from your data and will be useful for some of the assignments.

Reading PAPL

Let's work through the book starting at the beginning; I'm using whatever the latest version is (as of this writing, 2020). This book is a work in progress, so you'll see some minor errors like "REF", which is supposed to be a link to a reference in the text that isn't finished yet. The lectures we watch will fill in these blanks. You can get a crash course sufficient for the assignments from this single chapter from 2015, though some syntax may have changed; compare it to the Pyret documentation.

Read Chapter 2, Basic Data and Expressions, while you have the Pyret interpreter at code.pyret.org open. The examples with ">" in front of them indicate that you should enter them into the REPL, on the right hand side of the Pyret code editor, like he did in the lecture, pressing Enter to evaluate. It's running a read-eval-print loop (REPL), which means exactly what it sounds like: it reads input, evaluates it, prints the result to the console and then loops back to waiting for more input. You can increase the font size if you click the lambda skull logo on the top left side.

Reading documentation

Already we are introduced to built-in functions, such as num-expt. To understand these, let's read the documentation. If you have a small screen, you will need to turn your device to landscape or increase the browser window size in order to see the table of contents in the Pyret docs, which automatically hides depending on screen size (annoying UI).

Find the number functions documentation by hitting Ctrl-F in your browser or otherwise searching the page for 'num-expt'. Here is how you read it: the "::" syntax refers to the type expected, so num-expt(base :: Number, exponent :: Number) -> Number is an annotation telling you this built-in requires two inputs, a base and an exponent of type Number, and it outputs a Number. This will be explained in chapter 3. Repeat the process, looking in the documentation to understand what string-substring() does. Try changing the examples yourself in the REPL as you go.
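
For example, try these in the REPL (the outputs shown are what I'd expect from the documented contracts; double-check against your own REPL):

> num-expt(2, 10)
1024
> string-substring("overtime", 0, 4)
"over"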

Methods

In the documentation under lists you will see additional built-in functions called methods. These have a dot notation such as .length() or .map(). What this means is the object itself, a list object, has features you can access directly. If your list is named my-list then you can access these by my-list.length() or by chaining them together, such as my-list.sort().length() to both sort and return the length of a list. Try this yourself in the REPL.

Reading chapter 2

The last section on roughnums, where irrational numbers like the square root of two return approximate values, is about floating point approximation, which by default uses a 'round to even' rounding strategy; you could learn about it from this lecture if you wanted, but it assumes a technical background we don't have yet, namely that you've completed all the other lectures in that playlist and an introductory programming course, which we are doing right now. For the final exercise, Do roughnums obey the axioms of arithmetic operators like distributivity, associativity and commutativity?, you can tell from the examples in the book that roughnums only approximate the real numbers, so laws like associativity and distributivity can fail: adding the same roughnums grouped differently can give slightly different results. Think to yourself what would happen if you used floating point to encode money in financial software, with unpredictable results if some trade combination produced one of these off-by-a-tiny-bit roughnums.
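
You can see associativity break in the REPL yourself; the ~ prefix is how Pyret writes roughnums (the exact printed digits may differ slightly on your machine):

> (~0.1 + ~0.2) + ~0.3
~0.6000000000000001
> ~0.1 + (~0.2 + ~0.3)
~0.6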

Reading chapter 3

You spent all of chapter 2 entering commands into the REPL on the right hand side of the pyret online editor, now you'll use the left hand side to write functions and run them by either pressing Ctrl-Enter or the Run button. If you make a mistake like accidentally deleting all your code Ctrl-z will undo.

In 3.5 Defining Functions in Steps notice he changes the tests to not just state the expected value, but to spell out the expected computation, writing out in full what should be happening. The idea is to help you write the function. These aren't tests to prove it is correct; they are tests to help understand the full domain of the problem so we can write a model for it in software.

I'm assuming you're doing all the exercises. Let's walk through the 'has-overtime' raises exercise.

fun has-overtime(hours :: Number) -> Boolean:
  doc: "Returns true if hours > 40 or returns false. Hours must be a positive number or error is raised"
  if hours < 0:
    raise("hours must be greater than zero")
  else if hours > 40:
      true
  else:
    false
  end
where:
  has-overtime(40) is false
  has-overtime(40.0001) is true
  has-overtime(-1) raises "hours must be greater than zero"
end

Notice raise() is a built-in function. Where did I find how to use it? In the pyret language documentation.

Let's write some tests for the last exercise hours-to-wages-ot as an example of how writing tests helps to define the problem better, and helps write the program for you. We have to write a function that consumes hours, an hourly rate, a threshold for OT, and returns anything over that threshold at 1.5 times the rate, adding it to the total pay.

First, let's write the parameters:

fun hours-to-wages-ot(hours :: Number, rate :: Number, threshold :: Number) -> Number:
  doc: "Calculates total pay with all hours over threshold at 1.5 the rate"
where:
end

now let's write the tests:

where:
 hours-to-wages-ot(10, 1, 40) is 10 * 1
 hours-to-wages-ot(41, 1, 40) is (40 * 1) + (1 * (1 * 1.5))
 hours-to-wages-ot(50, 20, 48) is (48 * 20) + ((50 - 48) * (20 * 1.5))
end

Keep writing test cases where you don't write the final expected result, but how you would arrive at that result. Notice all I did here was copy the structure of the last test case and the first test case into the function body:

fun hours-to-wages-ot(hours :: Number, rate :: Number, threshold :: Number) -> Number:
  doc: "Calculates pay with all hours over threshold at 1.5 the rate"
  if hours > threshold:
    (threshold * rate) + ((hours - threshold) * (rate * 1.5))
  else:
    hours * rate
  end
where:
  hours-to-wages-ot(10, 1, 40) is 10 * 1
  hours-to-wages-ot(41, 1, 40) is (40 * 1) + (1 * (1 * 1.5))
  hours-to-wages-ot(50, 20, 48) is (48 * 20) + ((50 - 48) * (20 * 1.5))
end

Lecture 2 CS019 Rainfall Problem

Watch the Mon 9/10/18 second lecture. It's an interesting lecture about the rainfall problem and the benefits of writing a lot of examples in order to understand badly defined problems. Data science/machine learning comes up as an example of dirty data. Higher-order functions are discussed, like filter and map, which are in the Pyret documentation. We're going to write our own versions of map, filter and fold after we cover chapter 6, Processing Lists.

Reading chapter 4

This is an interesting and short chapter on how you would go about organizing data into tables of rows. There's nothing in here you can't already figure out now that you know how to read the Pyret documentation. The text also expects you to start reading documentation on your own: it doesn't tell you how to assign a table to a variable so you can run all those sieve expressions on it. Change the examples in the text as you go to see what they do:

my-table = table: name, age
  row: "Alice", 30
  row: "Bob", 40
  row: "Carol", 25
end

sieve my-table using name:
  name == 'Carol'
end

order my-table:
  name ascending
end

This is similar to SQL syntax, where you declare what you want done to rows and columns instead of how to do it. Congrats, you know some SQL now. In 4.2.7, the principle of orthogonality comment, note why they designed these table operations to be independent of each other, so that combining them into complex expressions is possible.

4.2.3 Exercise

Do the Do Now from 4.2.3. We are asked to think about how to write two questions; the first one is given to us, so I tried the second, which asks "Of the songs by a particular artist, which have we played the least often?", using only what we've seen so far:

my-playlist = 
  table: artist, song, playcount
    row: "Lustre", "The First Snow", 10
    row: "Lustre", "The First Beauty", 8
    row: "Lustre", "Welcome Winter", 1
  end

order my-playlist:
  playcount ascending
end

Second exercise: write the second example as a composition of keep and order operations on a playlist table.

my-playlist = 
  table: artist, song, playcount
    row: "Lustre", "The First Snow", 10
    row: "Lustre", "The First Beauty", 8
    row: "Lustre", "Welcome Winter", 1
  end

composition = order my-playlist:
  playcount ascending
end

sieve composition using playcount:
  playcount <= 1
end

Read ahead and play around with this, maybe you just want to return the song name, not the artist name and playcount, experiment with the documentation on your own:

my-playlist = 
  table: artist, song, playcount
    row: "Lustre", "The First Snow", 10
    row: "Lustre", "The First Beauty", 8
    row: "Lustre", "Welcome Winter", 2
  end

least-played = order my-playlist:
  playcount ascending
end

select song, playcount from least-played end

Reading chapter 5

In 5.3.1 the description of the properties of an 'anonymous value' abstraction by using numbers and asking what they mean is similar to Terence Tao's Linear Algebra text where he points out that a property of numbers that makes them useful is their ability to be interpreted differently depending on context.

A Do Now appears. This means we should actually do it: try to sieve our playlist to show the most played song, using built-in language functions from the math or statistics library:

include math

my-playlist = 
  table: artist, song, playcount
    row: "Lustre", "The First Snow", 10
    row: "Lustre", "The First Beauty", 8
    row: "Lustre", "Welcome Winter", 2
  end

play-count = extract playcount from my-playlist end
most-played-count = max(play-count)

sieve my-playlist using playcount:
  playcount == most-played-count
end

Lists are introduced, which is the data structure we will be using for a lot of the exercises because they frequently present themselves in nature.

Exercise in 5.4.2 Implement all the other statistical questions posed in Basic Statistical Questions. Let's look at them:

'keep-if' is likely supposed to be sieve, as per 4.2.1 Keeping, or another command you write to filter a table and keep results. Remember the warning from 1.3 The Structure of This Book: "We will include mistakes, not because we don’t know better, but because this is the best way for you to learn. Including mistakes makes it impossible for you to read passively: you must instead engage with the material, because you can never be sure of the veracity of what you’re reading", unless that is just a clever disclaimer to blanket-cover all unintentional errors.

Let's go through 5.1 Basic Statistical Questions, but using our my-playlist table as an example. These will all be functions on lists, because we use the extract built-in to pull this information out of a table, and it returns a list.

# notice these import statements so we can use these built-in functions
include math
include statistics
import lists as L

my-playlist = 
  table: artist, song, playcount
    row: "Lustre", "The First Snow", 10
    row: "Lustre", "Green Worlds", 1
    row: "Lustre", "Welcome Winter", 2
  end

# From 5.1 Basic Stat Questions:
# Question 1, 2, and 3 are the same: the maximum or largest value
# To see what these datastructures look like, type 'play-count' into REPL
play-count = extract playcount from my-playlist end
most-played-count = max(play-count)

sieve my-playlist using playcount:
  playcount == most-played-count
end

# Smallest value in a column 
least-played-count = min(play-count)

# Number of songs in a playlist
# Since we have extracted a list, count the length of the list using a built-in function
# using: play-count = extract playcount from my-playlist end
play-count.length()

# All the distinct entries in playcount column
# Another built-in from lists documentation
# Notice import lists as L, you can call this something else 'import lists as whatever' 
L.distinct(play-count)

# The number of distinct entries in a playcount column
# Asking us: count how many entries are distinct
# Note we are chaining built-ins together here:
L.distinct(play-count).length()

# The average, this is another built-in, from this exact chapter
avg-play-count = mean(play-count)

# Other statistics
# Replace x(play-count) with that built-in from the statistics library 
modes-play-count = modes(play-count)
median-play-count = median(play-count)
stdev-play-count = stdev(play-count)

Reading chapter 6

If during an assignment you get stuck, come back to this chapter and you will almost certainly find what you're looking for. I highly recommend doing all the exercises here and understanding everything; in fact I would re-read this chapter until you can effortlessly do all the examples and understand exactly how they work, because then you're at a point where the rest of the lectures will make sense without having to stop and figure out every step. There's a silly story about Gordon Ramsay as a junior chef practicing how to make lobster ravioli with a small frozen potato and pasta over 90 times after his shift until the restaurant he was at would let him try it on real lobster. He also had to practice slicing a mackerel a few hundred times until they would let him near the more expensive fish.

6.1 Making Lists and Taking Them Apart

In 6.1, type all those examples into the REPL to see what they do, like manually linking a list together with the constructor, link(1, link(2, empty)), which lets you build lists without using a built-in like .append() or the "+" operator. The book walks through each exercise in 6.2 in the rest of the chapter by constructing examples of the behavior of the function.

6.3 Structural Problems w/Scalar Answers

Here we are being taught how to write examples which help us later write the programs for my-len() and my-sum(). There's a small erratum in 6.3.3: in the cases expression you are matching on the variants of the List datatype, empty and link.
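
If you want to check your own attempt against something, here's roughly what my-len ends up looking like once the examples are generalized (a sketch; the book builds it up more carefully):

fun my-len(l :: List) -> Number:
  cases (List) l:
    | empty => 0
    | link(f, r) => 1 + my-len(r)
  end
where:
  my-len(empty) is 0
  my-len([list: "a", "b", "c"]) is 3
end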

6.4 Structural Problems w/List Answers

There's an exercise to add examples for lists that end with positive numbers, and lists with [link: 0], try it yourself.

fun my-pos-nums(l):
  cases (List) l:
    | empty => empty
    | link(f, r) =>
      if f > 0:
        link(f, my-pos-nums(r))
      else:
        my-pos-nums(r)
      end
  end
where:
  # here the expected value (the right-hand side of 'is') is another call to the program
  my-pos-nums([list: 1,  0]) is link(1, my-pos-nums([list:  ]))
  my-pos-nums([list: 1,  1]) is link(1, my-pos-nums([list: 1]))

  # direct test of expected output
  my-pos-nums([list: 1, 0]) is [list: 1]
  my-pos-nums([list: 1, 1]) is [list: 1, 1]
end

6.5 Structural Problems and Sub-Domains

Study my-alternating() and my-max(), and note how the empty-list case was handled in each. Here's how my-max([list: 1, 2, 3]) evaluates:

  • input: my-max([list: 1, 2, 3])
  • num-max(1, my-max([list: 2, 3]))
  • num-max(1, num-max(2, my-max([list: 3])))
  • num-max(1, num-max(2, 3)), since when r is empty we just return f
  • num-max(1, 3)
  • 3
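
The walkthrough above assumes an implementation along these lines (a sketch; compare it with the book's my-max):

fun my-max(l :: List<Number>) -> Number:
  cases (List) l:
    | empty => raise("my-max: an empty list has no maximum")
    | link(f, r) =>
      if is-empty(r):
        f
      else:
        num-max(f, my-max(r))
      end
  end
where:
  my-max([list: 1, 2, 3]) is 3
  my-max([list: 3, 1]) is 3
end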

6.7 Accumulators

Read in 6.6 how my-avg() was obtained. Here we are shown helper functions inside other functions, with the statement: Observe that we do not change my-running-sum itself to take extra arguments. There are multiple reasons for this. One reason is that you want simple documentation and intuitive, bug-free access to the function, just like a typical command shell utility: for change directory (cd) you wouldn't make users initialize your accumulators with cd(empty, 0, dir-name), you'd want a simple interface, with all internal state encapsulated, that takes a single input and produces a single output.
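
Here's a sketch of that encapsulation idea with a running-sum function: the accumulator lives in an internal helper, so callers only ever pass the list (the names are mine, not necessarily the book's):

fun running-sum(l :: List<Number>) -> List<Number>:
  doc: "Returns a list where each element is the sum of everything seen so far"
  fun helper(lst :: List<Number>, acc :: Number) -> List<Number>:
    cases (List) lst:
      | empty => empty
      | link(f, r) => link(f + acc, helper(r, f + acc))
    end
  end
  helper(l, 0)   # the accumulator is initialized here, not by the caller
where:
  running-sum([list: 1, 2, 3]) is [list: 1, 3, 6]
  running-sum(empty) is empty
end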

6.8 Reducing computation

In 6.8.4, to reduce the computation they store the list that uniq-rec(r) returns in a local name ur; if f is not a member of ur they link it on, otherwise they return ur as-is. This avoids calling uniq-rec(r) twice, and f is only compared against the already-deduplicated list, saving work.

6.9 Types

my-max<T>(l :: List<T>) -> List<T> means the list can be any type, but it's first element enforces what type the rest will be (string, number, boolean..) whereas List<Any> means any types can be mixed in the list.

Assignment 1: DocDiff

Let's try the first assignment. One way to slog through a long chapter on list functions is to actually implement something using lists, so we can refer back to the chapter as a reference. In other words it becomes a research project, reading we're more motivated to do because we're trying to complete a task. You could also watch Lecture 3 first before doing this assignment.

This assignment looks difficult on purpose. For the algebraic definition of the dot product, look at the example for [1, 3, -5] x [4, -2, -1]. Each element of the first list is multiplied by the corresponding element of the second list, then those products are added together. To read the sigma notation: a·b is the sum of \(a_ib_i\), starting at index 1 (the first element of the list) and going up to the length of the list (n):

If a = [1, 3, -5] and b = [4, -2, -1]:

\(a \cdot b = \sum_{i=1}^{3} a_i b_i = (1 \times 4) + (3 \times -2) + (-5 \times -1) = 4 - 6 + 5 = 3\)

Some more requirements are listed, such as ignoring the case (upper/lower) of a word when counting its number of occurrences, which translates to converting the list-of-string inputs with string-to-lower/string-to-upper. The exact equation for overlap is given: dot-product(doc-vec1, doc-vec2) / num-max(num-sqr(magnitude(doc1)), num-sqr(magnitude(doc2))). The magnitude is defined as num-sqrt(x * x), so the square root of a vector multiplied (dot product multiplication) by itself.

Example vectors:

                 test  zest
[list: "test"]     1     0
[list: "zest"]     0     1

Before we begin see the Pyret style guide.

Where to Start

One strategy is to begin with function skeletons of things you know you will need:

import lists as L

fun overlap(doc1 :: List<String>, doc2 :: List<String>) -> Number:
  ```Consumes two lists of strings, ignores case, and returns their overlap:
  dot-product(doc-vec1, doc-vec2) / num-max(num-sqr(magnitude(doc-vec1)), num-sqr(magnitude(doc-vec2)))```

  fun create-vec(l-a :: List<String>, l-b :: List<String>) -> List<Number>:
    doc: "Consumes two lists and returns a vector representation of the first input (l-a)"
  end

  fun ignore-case(l-c :: List<String>) -> List<String>:
    doc: "Converts a list of words to lower case"
  end

  fun dot-product(v-a :: List<Number>, v-b :: List<Number>) -> Number:
    doc: "Consumes two vector lists and returns their dot product"
  end

  fun magnitude(v-c :: List<Number>) -> Number:
    doc: "Consumes a vector and returns its magnitude: num-sqrt(dot-product(v-c, v-c))"
  end

end

This function documentation is temporary to help remember what we're writing. We will probably add more helper functions inside overlap() as we go. Let's start with create-vec() since we were given examples in the assignment writeup:

fun create-vec(l-a :: List<String>, l-b :: List<String>) -> List<Number>:
  doc: "Consumes two lists and returns a vector representation of the first input (l-a)"
where:
  create-vec([list: "a", "B", "c"], [list: "d", "d", "d", "b"]) is [list: 1, 1, 1, 0]
  create-vec([list: "d", "d", "D", "b"], [list: "a", "b", "c"]) is [list: 0, 1, 0, 3]
end

We need to combine input l-a and l-b into one sorted distinct list so from our two examples: [list: "a", "b", "c", "d"]. The list library built-ins are permitted:

fun create-vec(l-a :: List<String>, l-b :: List<String>) -> List<Number>:
  doc: "Consumes two lists of strings and returns a document vector representation of the first input l-a"
  distinct = L.distinct(l-a + l-b).sort()
where:
  create-vec([list: "a", "B", "c"], [list: "d", "d", "d", "b"]) is [list: 1, 1, 1, 0]
  create-vec([list: "d", "d", "D", "b"], [list: "a", "b", "c"]) is [list: 0, 1, 0, 3]
  # vector is ['another', 'list', 'of', 'words']
  create-vec([list: "list", "of", "words"], [list: "another", "list"]) is [list: 0, 1, 1, 1]
  create-vec([list: "another", "list"], [list: "list", "of", "words"]) is [list: 1, 1, 0, 0]
end

Note that in the examples, the vector returned has the same length as the distinct word list of the two documents combined, so we want to walk through that distinct list element by element, count how many occurrences of each word appear in the document, and link the counts together. We can write some more helper functions for this: the first deconstructs the distinct list into elements, each time asking something like 'if f is a member, then link(count(f), ...)', with count() being our other helper that takes an element as input and counts how many times it occurs in a list. This is known as 'wishing for functions', where you pretend you already have functions that do whatever you want and call them as you're writing another function. Each time you wish for one, write an empty template for it.

fun create-vec(l-a :: List<String>, l-b :: List<String>) -> List<Number>:
  doc: "Consumes two lists and returns a document vector representation of the first (l-a)"

  fun count(s :: String, l-e :: List<String>) -> Number:
    doc: "Counts the occurrences of a string in a document"
    cases (List) l-e:
      | empty => 0
      | link(f, r) =>
        if s == f:
          1 + count(s, r)
        else:
          count(s, r)
        end
    end
  where:
    count("d", [list: "d", "b","d"]) is 2
    count("e", [list: "f"]) is 0
    count("f", empty) is 0
  end

  fun helper(distinct-list :: List<String>, document :: List<String>) -> List<Number>:
    doc: "Helper to assemble the vector"
    cases (List) distinct-list:
      | empty => empty
      | link(f, r) => if document.member(f):
          link(count(f, document), helper(r, document))
        else:
          link(0, helper(r, document))
        end
    end
  end

  # normalize case before taking distinct, otherwise "B" and "b" count as different words
  distinct = L.distinct(ignore-case(l-a) + ignore-case(l-b)).sort()
  helper(distinct, ignore-case(l-a))
where:
  create-vec([list: "a", "B", "c"], [list: "d", "d", "d", "b"]) is [list: 1, 1, 1, 0] 
  create-vec([list: "d", "d", "D", "b"], [list: "a", "b", "c"]) is [list: 0, 1, 0, 3]
end

I wished for another function, ignore-case, which I assume goes through a list of strings and converts them all to the same case. Repeat this process for the rest of the assignment; see chapter 6 of PAPL to write dot-product(list1, list2) with cases on list1, nested cases on list2, returning (f * ff) + dot-product(r, rr), and 0 in the empty case so the products sum up. When you're done, click the arrow on the online interpreter and choose to type-check and run.
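
Since I wished for them, here's a hedged sketch of those two helpers (assuming the string-to-lower built-in and the parameter naming style used earlier; the assignment may want them structured differently):

fun ignore-case(l-c :: List<String>) -> List<String>:
  doc: "Converts every word in a list to lower case"
  l-c.map(string-to-lower)
end

fun dot-product(v-a :: List<Number>, v-b :: List<Number>) -> Number:
  doc: "Consumes two equal-length vectors and returns the sum of their pairwise products"
  cases (List) v-a:
    | empty => 0
    | link(f, r) =>
      cases (List) v-b:
        | empty => 0
        | link(ff, rr) => (f * ff) + dot-product(r, rr)
      end
  end
where:
  dot-product([list: 1, 3, -5], [list: 4, -2, -1]) is 3
  dot-product(empty, empty) is 0
end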

Lecture 3 CS019 Insertion Sort

We're watching the Wed 9/12/18 lecture on sorting. These videos can be downloaded in 1920x1080 if you want, and you can zoom if you can't see the writing on the whiteboard.

He walks through exactly what a template is and how to extract a template from your data, and spends the rest of the class writing insertion sort using only what we already know so far. You have to hand-step through it a little to understand what insertion sort is doing: the sort() function calls insert with insert(f, sort(r)), not insert(f, r).
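
To make the hand-stepping below easier to follow, here's roughly the pair of functions from the lecture (a sketch from memory, so compare it with what he writes on the board; I've called mine my-sort so it doesn't get confused with the .sort() method we used earlier):

fun insert(n :: Number, l :: List<Number>) -> List<Number>:
  doc: "Inserts n into an already-sorted list, keeping it sorted"
  cases (List) l:
    | empty => link(n, empty)
    | link(f, r) =>
      if n < f:
        link(n, l)
      else:
        link(f, insert(n, r))
      end
  end
end

fun my-sort(l :: List<Number>) -> List<Number>:
  doc: "Insertion sort: insert the first element into the already-sorted rest"
  cases (List) l:
    | empty => empty
    | link(f, r) => insert(f, my-sort(r))
  end
where:
  my-sort([list: 0, 1, 3, 2]) is [list: 0, 1, 2, 3]
end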

Let's step the evaluation by hand with [list: 0, 1, 3, 2] being input to sort(). The following calls to insert happen:

  • insert(0, insert(1, insert(3, insert(2, empty))))

It's like our example with my-len() with 1 + 1 + etc. Evaluation starts at the deepest inner bracket:

  • insert(2, empty)
    • the list is empty, so: link(2, empty)
    • return to sort(): [list: 2]
  • insert(3, [list: 2])
    • is 3 < 2? no
    • link(2, insert(3, empty))
    • return to sort(): [list: 2, 3]
  • insert(1, [list: 2, 3])
    • is 1 < 2? yes
    • then it must be the smallest in the entire list:
    • link(1, [list: 2, 3])
    • return to sort(): [list: 1, 2, 3]
  • insert(0, [list: 1, 2, 3])
    • is 0 < 1? yes
    • link(0, [list: 1, 2, 3])
    • return to sort(): [list: 0, 1, 2, 3]
  • no more delayed computations, sort() is finished

The returns to sort():

  • insert(0, insert(1, insert(3, [list: 2])))
  • insert(0, insert(1, [list: 2, 3]))
  • insert(0, [list: 1, 2, 3])
  • [list: 0, 1, 2, 3]

Every time a list is returned, the next call to insert can happen, because it is now a complete call of the form insert(n, list).

Lab: Higher-Order Functions

Let's do the lab on functions as data, and write our own map, filter and fold.

fun fun-plus-one(num :: Number, func :: (Number -> Number)) -> Number:
  func(num) + 1
end

>>fun-plus-one(16, num-sqr)
>>5

Try entering various built-ins, or one you wrote yourself, as the second parameter to fun-plus-one.

Map

We're given test cases and asked to implement f-to-c and goldilocks, and given the formula for f to c temperature unit conversion.

fun f-to-c(f-lst :: List<Number>) -> List<Number>:
  cases (List) f-lst:
    | empty => empty
    | link(f, r) => 
      link((f - 32) * (5/9), f-to-c(r))
  end
end

fun goldilocks(f-lst :: List<Number>) -> List<String>:
  cases (List) f-lst:
    | empty => empty
    | link(f, r) =>
      if f > 90:
        link("too hot", goldilocks(r))
      else if f < 70:
        link("too cold", goldilocks(r))
      else:
        link("just right", goldilocks(r))
      end
  end
end
check:
  f-to-c([list: 131, 77, 68]) is [list: 55, 25, 20]
  goldilocks([list: 131, 77, 68]) is [list: "too hot", "just right", "too cold"]
end

We're asked to rewrite goldilocks using map. The value handed to the function you give map is each element of the list in turn; in this case we know it's a Number, since the f-lst parameter is :: List<Number>.

fun goldilocks2(f-lst :: List<Number>) -> List<String>:
  f-lst.map({(x): if x > 90: "too hot" else if x < 70: "too cold" else: "just right" end})
end
check:
  goldilocks2([list: 131, 77, 68]) is [list: "too hot", "just right", "too cold"]
end

.map() is a built-in method on a list, where f-lst.map() means we are mapping over f-lst. I used the shorthand syntax for lambdas.

We could have also written it like the following example, where a separate named function is passed to map instead of writing our own lambda:

fun goldilocks(x :: Number) -> String:
  if x > 90:
    "too hot"
  else if x < 70:
    "too cold"
  else:
    "just right"
  end
end

fun goldilocks2(f-lst :: List<Number>) -> List<String>:
  map(goldilocks, f-lst)
end
check:
  goldilocks2([list: 131, 77, 68]) is [list: "too hot", "just right", "too cold"]
end

Now our task is to write our own version of map(). I used the documentation examples for map as check tests. Remember map consumes a function and a list, applies the function to each entry in the list, and returns a new list; try to do this yourself before looking at a solution (your solution may be better). For annotations I used the built-in name annotations, but if you read through the documentation you can also import annotations or create your own datatype. The type-variable notation is described at the end of the chapter on processing lists (6.9): writing <A, B> says that A and B are type parameters that can stand for any type, so if you give my-map a list of Numbers and a function that produces Strings, the result will be a List<String>, and so on.

fun my-map<T>(func :: (T -> T), l :: List<T>) -> List<T>:
  cases (List) l:
    | empty => empty
    | link(f, r) =>
      link(func(f), my-map(func, r))
  end
end
check:
  my-map(num-tostring, [list: 1, 2]) is [list: "1", "2"]
  my-map(lam(x): x + 1 end, [list: 1, 2]) is [list: 2, 3]
end

Filter

The first two tasks are straightforward, to get you used to using filter, except you have to use string-to-code-points() instead of any other built-in string functions.

fun tl-dr(lol :: List<String>, length-thresh :: Number) -> List<String>:
  filter(lam(element): string-to-code-points(element).length() <= length-thresh end, lol)
end
check:
  tl-dr([list: "dkfjdkj", "hi", "dkfjk"], 2) is [list: "hi"]
  tl-dr([list: "corner", "case", ""], 2) is [list: ""]
  tl-dr([list: "a", "b", "c"], 0) is [list: ]
end

# I got '101' from entering string-to-code-points("e") in the REPL
# See comments below, you would not want to write code this way
fun eliminate-e(words :: List<String>) -> List<String>:
  filter(lam(element): if string-to-code-points(element).member(101) or
    string-to-code-points(element).member(69): false else: true end end, words)
end
check:
  eliminate-e([list: "e"]) is [list: ]
  eliminate-e([list: "there's", "no", "letter", "e", "here"]) is [list: "no"]
  eliminate-e([list: "hello", "everybody"]) is [list: ]
  eliminate-e([list: "E", "101", "!!!"]) is [list: "101", "!!!"]
end

Lambda parameters can be a single letter: filter(lam(x): x > 1 end) as it's immediately evident what that parameter is, as the 'scope' of that function is only the filter.

Task: implement our own version of filter, which is similar to what we did for map:

fun my-filter<T>(func :: ( T -> Boolean), l :: List<T>)-> List<T>:
  cases (List) l:
    | empty => empty
    | link(f, r) =>
      if func(f):
        link(f, my-filter(func, r))
      else:
        my-filter(func, r)
      end
  end
end
check:
  fun length-is-one(s :: String) -> Boolean:
    string-length(s) == 1
  end
  my-filter(length-is-one, [list: "ab", "a", "", "c"]) is [list: "a", "c"]
  my-filter(is-link, [list: empty, link(1, empty), empty]) is [list: link(1, empty)]
end

Fold

Try the two tasks: list-product and list-max:


fun list-product(lon :: List<Number>) -> Number:
  fold(lam(acc, n): acc * n end, 1, lon)
end
check:
  list-product([list: 2, 2, 2]) is 8
  list-product([list: 0, 1, 2]) is 0
  list-product([list: -1, -1]) is 1
end

fun list-max(lon :: List<Number>) -> Number:
  # note: 0 only works as a starting value if the list has at least one non-negative
  # number; a more robust version would start the fold from the first element instead
  fold(lam(acc, n): if n > acc: n else: acc end end, 0, lon)
end
check:
  list-max([list: -1, 1, 2]) is 2
  list-max([list: 0, -100, 1]) is 1
  list-max([list: 1/2, 3/4]) is 3/4
end

Try implementing fold yourself before you look at my solution; use the check tests for fold in the Pyret documentation to test your solution. I used the function/arrow annotation syntax, (Ann1, Ann2, …, AnnN -> AnnR), where Ann1 through AnnN are annotations representing the arguments to a function and AnnR is an annotation representing its return type. Since the combining function takes two inputs (the accumulator so far and the current element), its arrow annotation has two argument annotations:

fun my-fold<T>(func :: (T, T-> T), acc :: T, l :: List<T>) -> T:
  cases (List) l:
    | empty => acc
    | link(f, r) =>
      my-fold(func, func(acc, f), r)
  end
end
check:
  my-fold((lam(acc, elt): acc + elt end), 0, [list: 3, 2, 1]) is 6
  my-fold((lam(acc, elt): acc + elt end), 10, [list: 3, 2, 1]) is 16
  fun combine(acc, elt) -> String:
    tostring(elt) + " - " + acc
  end
  my-fold(combine, "END", [list: 3, 2, 1]) is "1 - 2 - 3 - END"
  my-fold(combine, "END", empty) is "END"
end

Map2

Try the task of implementing who-passed; for the second task we're asked to write map2 ourselves.

# link(ff, rr) and link(f, r) can be called anything:
# ie: link(head-first, tail-first) or link(head-second, tail-second)

fun my-map2<T>(func :: (T, T -> T), list1 :: List<T>, list2 :: List<T>) -> List<T>:
  cases (List) list1:
    | empty => empty
    | link(f, r) =>
      cases (List) list2:
        | empty => empty
        | link(ff, rr) =>
          link(func(f, ff), my-map2(func, r, rr))
      end
  end
end
check:
  my-map2(string-append, [list: "mis", "mal"], [list: "fortune", "practice"])
    is [list: "misfortune", "malpractice"]
  my-map2(_ + _, [list: "mis", "mal"], [list: "fortune", "practice"])
    is [list: "misfortune", "malpractice"]
  my-map2(string-append, [list: "mis", "mal"], [list: "fortune"])
    is [list: "misfortune"]
  my-map2(string-append, [list: "mis", "mal"], empty)
    is empty
end

For the last task, best-price, look up on YouTube what a basic demand function is. Here's an example:

fun demand(price-increase :: Number) -> Number:
 doc: "1000 is approx demand of product at a fixed price"
  1000 - (60 * price-increase)
end

This means that as the price increases, the quantity demanded decreases by 60 units times the price increase, so if the price increase is 1, the demand is 1000 - 60. Your best-price function can take a list of price increases and the demand function, then map the demand function over the price list to get a new list of adjusted demand per price increase. Map over the adjusted demands together with the price list to produce the forecast revenue, which is price times quantity demanded. Then figure out how to use a combination of filter/fold to return the optimal price increase, or write a helper function inside the best-price function for the higher-order functions to use. The reason for using a helper function is that it gets access to the scope of the parent function, so any locally-lived variables in that parent function can be passed into the helper.

Map vs Filter vs Foldl vs Foldr

map and filter only consider one item at a time, whereas fold can accumulate a result over the entire list. You can implement both map and filter with fold (try it). As you may have noticed in the Pyret documentation, there are two folds: foldl and foldr.

  • foldl(lam(acc, x): acc + x end, 0, [list: a, b, c])
  • evaluates as ((0 + a) + b) + c
    • the base 0 combines with the leftmost element first, working left to right
  • foldr with the same combining function and base
  • evaluates as a + (b + (c + 0))
    • the base 0 combines with the rightmost element first, working right to left

Try subtracting over [list: -1, 1, 2] with foldl and foldr to see the difference. With fold you have to go through every element, whereas with our hand-written cases-on-lists functions you can stop looping early and return immediately once something is true/false, e.g. 'if f == 8: acc else: (keep looping)'.

To really understand these functions try scaffolding them together in the REPL for the assignments, map over a list, feed it into a fold, feed that into a filter all chained together.

15-122 - Imperative Programming

In the next lecture of CS019 we will cover how to estimate the complexity of a program, and then apply that to all future lectures and assignments. Here we will do the same: learn how reasoning with contracts can help us write better code, then apply it to all future code. We will also learn exactly how numbers are represented in a computer, which is everyday knowledge we'll want when we later have to work in Python for an AI course.

Register for free here at Carnegie Mellon University's Open Learning Initiative, and use the course key: imp2019f (this course can also be completed with only a phone).

Click on Enter Course and you should see Syllabus: 15-122 Imperative Programming Aug-Dec 2019. It consists of three units on integers (how two's complement works, bit shifting) and one unit on contracts, which are a method for ensuring correctness. One of the professors has a short talk here outlining the benefits of contracts.

Let's start with Unit 2, Module 5: Representing Integers.

The Hindu-Arabic numeral system is introduced, which you already know about if you've been doing the math foundations alongside this programming workshop. The summation notation for converting binary numbers to decimal is \(\sum_{i=0}^{n-1} b_i 2^i\): below the sigma (the Greek letter that looks like an E and means sum) is the starting index 0. Above the sigma is the upper limit, which is n - 1 since we begin at zero; if you wish to sum 4 binary digits then you want 0 through 3 (0, 1, 2, 3 is 4 digits). \(b_i 2^i\) is where you plug in each index; the variable \(b_i\) is the i-th bit, either 0 or 1 since this is binary:

\(b_{0}2^0 + b_{1}2^1 + b_{2}2^2 + \ldots\); for the binary number 0101 this is \((1 \times 2^0) + (0 \times 2^1) + (1 \times 2^2) + (0 \times 2^3) = 5\), as you start from the rightmost bit and sum towards the left. This will make sense when you do the 'do now' exercise.

With Horner's rule you evaluate starting from the innermost brackets and work outward, doing the multiplication first, then the addition. The example works out to \(((((1 \cdot 2 + 1) \cdot 2 + 0) \cdot 2 + 0) \cdot 2 + 0) \cdot 2 + 1\), i.e. (2 + 1) = 3, then 3 x 2 (6), x 2 (12), x 2 (24), x 2 (48), + 1 = 49.

Try the interactive assignment, then move on to Adding Binary Numbers. For the bit carries you start at the top and work down, so if a column has three 1's (including a carried bit), then 1 + 1 = 0 (carry a 1) and 0 + 1 = 1. Hexadecimal representation is next; the 'did I get this' questions are great for understanding how the 0x notation works, for example 0x10 is 00010000, or 2\(^4\).
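
Here's a small worked addition to see the carries in action (7 + 3 in 4 bits):

  carries:  1 1 1
             0 1 1 1   (7)
           + 0 0 1 1   (3)
           ---------
             1 0 1 0   (10)

Working right to left: bit 0 is 1 + 1 = 0 carry 1; bit 1 is 1 + 1 + 1 = 1 carry 1; bit 2 is 1 + 0 + 1 = 0 carry 1; bit 3 is 0 + 0 + 1 = 1.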

We're now on module 6, fixed-precision integers. C0 is pronounced 'c-naught' and is similar to C. Note that for these assignments, if you want more practice, click on reset to keep feeding yourself new questions. There's a typo in Division and Modulus: it should read 9/4 = 2 with remainder 9 % 4 = 1, so that (x / y) * y + (x % y) = x.

The rounding unit points out that Python uses the round-to-negative-infinity strategy for integer division. Try some of the examples yourself: (-7) / 3 = -2.333…, and rounding towards 0 truncates -2.333 to -2, while rounding towards negative infinity goes down to -3 instead. Let's explain to ourselves the difference in the mod operators too, where rounding towards 0 gives (-8) % 5 = -3 while rounding towards negative infinity gives 2. In the Java/C model (towards 0), -8 / 5 truncates to -1, -1 x 5 = -5, and the remainder going from -5 down to -8 is -3. In the Python/negative-infinity model you find the largest multiple of 5 that is at most -8, which is 5 x (-2) = -10; the modulus is then the distance from -10 up to -8, which is +2. The important takeaway is to always read the documentation/specs of whatever language you're using, to understand how its number representation works for integers, floats, etc.

Now we're on module 7, Bit Patterns. The AND operation against an all-ones mask such as 0xFF leaves those bits unchanged, since if you look at the truth table, a bit ANDed with 1 is still whatever it was (0 stays 0, 1 stays 1). For the boolean most-significant-bit example of 0x80000000, remember 8 in hex is 1000. Let's go through the Bitwise negation example to understand it, but with only 8 bits. Assume x's current state is 0000 0001, everything off except the first bit, using their home automation model where each bit represents a light switch. We want to toggle bit 5, the dining room light:

  • 0x20 = 0010 0000
  • & ~x = 1111 1110
    • dr = 0010 0000
  • 0xDF = 1101 1111
  • &x = 0000 0001
    • others = 0000 0001
  • dr | others = 0010 0001

In Module 7, Relating Arithmetic and Bitwise Operations, what they are describing is 'complement and increment' as a way to change any binary number into its negation in two's complement representation. For the left shift exercise, when you look at the answer notice they are shifting around the 1/3 values of blue, then using OR to put them all back together into 32 bits. In the right shifting unit there's a problem with the answer for 0xC0C0FFEE >> 20: it should be 0xFFFFFC0C but the system doesn't recognize it as correct. For the last one you have to manually write out the bits, arithmetically shift (padding with 1's) to the right 7 bits, then reassemble them from left to right into groups of 4, truncating the rightmost 7 bits to get the answer.

Correct Code

We're now on module 1, Preconditions - When Do Errors Show up?, which is the bulk of 15-122: analyzing imperative code and reasoning about its correctness. You use code comments to describe the behavior of a function, and these comments are called contracts. Contracts can be reasoned about statically, meaning without running the code, and checked dynamically when contract checking is compiled in. If you watched the talk ex-CMU prof Simmons gave about contracts that I posted earlier in these notes, you'll see that extensive testing covers less than 1% of the field of potential inputs. In that same talk he also points out that contracts have benefits even for functional languages with strong typing, and briefly gives an idea of dependent types as a replacement for contracts, something we'll also briefly cover later.

In When Do Errors Show Up?, for the example string attach(int n, char c, string str): int, char, and string are similar to Pyret type annotations, describing what inputs the function accepts. Some of the exercises assume syntax knowledge, like a char being written in single quotes, i.e. 'c' (char) vs "c" (string), but otherwise the syntax is easy enough to figure out just by looking at it, and if you make a simple syntax error the system still gives you the correct answer anyway. In the module What 'Safety' Means in C0, for the first learn-by-doing you're entering statements like n >= 0 && n <= 12. The equality boolean is n == 0 (double equals; a single = is assignment, much like = is a binding in Pyret).

Moving on to Module 2, Assertions. These are similar to Pyret check tests. If you're following the math workshop along with this programming workshop, you will have no problem understanding their examples of 'point-to reasoning'. It's similar to the proofs we've done in Tao's Analysis: you list reasons why something is true by pointing out which axiom or lemma gives it to you. The first learn-by-doing question has a bug; look at the actual assertion statement and not the 'The assertion x != 8 is valid' title, as sometimes the assertion is x != 20 if you reset the question.

In Module 2, Conditionals, the syntax [0, 14) is standard interval notation for inclusive/exclusive bounds. In the example of @assert x == y, note we are using ints and not floats or roughnums, so this assertion is true. If you get stuck on the nested conditionals question, notice the @requires x < 10, and that if 4 < x holds it makes the nested x <= 16 statement a waste of code. Let's continue on to the Assignments module: in the example of if (e > 0) { e = e - 1; ... note again these are integers, so the input can't be 0.5 or some 'half number' that would make e negative after 1 is subtracted. This is a little confusing because they mix assignment statements '=' with equality statements '==' when working out their point-to reasoning. The very last assignment took some word smithing: x' + y' > 2 * n. For the last question, to rewrite the assertion without primed variables, substitute what x' and y' are (in terms of x and y) into x' + y' > 2 * n, which gives (x * 2) + x > 2 * n.

Loops

In Module 2: Introduction to Loops, when you mutate data many things can go wrong, so learning how to reason about correct loops will be important should we go on to use Python or some other AI library later. For the first example, the while loop, you can hand-step the loop, adjusting the variables with each iteration, then compare them to the assert, such as @assert i >= n once the loop exits: if n = 3, then i should equal 3. Invariants are mentioned but not introduced yet; these will be critical to know. A loop invariant is a claim about the program state that stays true before the loop starts and after every iteration.
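
Here's a minimal C0 sketch of that kind of loop, with the invariant and the exit assert spelled out (my own example, not the course's):

int count_up(int n)
//@requires n >= 0;
//@ensures \result == n;
{
  int i = 0;
  while (i < n)
  //@loop_invariant 0 <= i && i <= n;
  {
    i = i + 1;
  }
  //@assert i >= n;   // once the loop exits, i < n is false
  return i;
}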

Module 3: Postconditions, or what can be shown to be true after a function returns. For the first learn-by-doing example note the braces that C0 uses to declare scope; you want your first assignment within the first brace:

int endless_subtraction(int x, int y)
//@requires x > 0;
//@ensures \result < y;
{
  int temp = y;
  while (temp > 0) {
    temp = temp - x;
  }
  return temp;
}

For the next assignment, an example of the @ensures syntax:

//@ensures \result > 0 && \result % 2 == 0
//@ensures \result == x || \result == -x

In Contract Exploits, I couldn't get the gcd() function exercise to accept my answer:

//@ensures \result % x == 0 && \result % y == 0;

However, it does accept as a correct answer:

//@ensures x % \result == 0 && y % \result == 0;

This makes sense: the gcd is a divisor of x and y (x % \result == 0), not a multiple of them (\result % x == 0), which is what my first attempt said.

Testing

For Module 3, Introduction to Testing, I had no problems answering all the assignments correctly the first time, a testament to the greatness of CS019/PAPL. For the edge cases I used gcd(1, 10) and gcd(10, 1); for the factorial function I used fact(0) is 1. There are numerous pow(base, exp) edge cases you can come up with, like pow(0, 2) is 0.

The Specification Functions module is interesting: writing an inefficient but obviously correct secondary function to test a much faster implementation, or a secondary function to quickly check the result. I used 100, 10 as inputs to the incorrect gcd. It uses a for loop, something the text didn't discuss before using it, but you can probably figure it out anyway (for loops are covered in Module 10). First i is declared and assigned 1; this is the index. Then a condition: i must be less than or equal to x and less than or equal to y for the loop to continue. On each iteration i is incremented with i++ until its value no longer passes the loop condition. That gcd is a completely incorrect implementation, for example purposes.
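
To sketch both ideas at once — the for-loop anatomy just described and a slow, obviously-correct specification function checking a fast one — here's a hypothetical C0 example (slow_mult and fast_mult are my own names, not from the course):

int slow_mult(int a, int b)
//@requires b >= 0;
{
  int result = 0;
  for (int i = 1; i <= b; i++)   // i starts at 1; loop runs while i <= b; i++ after each pass
  {
    result += a;                 // add a to the running total, b times in all
  }
  return result;
}

int fast_mult(int a, int b)
//@requires b >= 0;
//@ensures \result == slow_mult(a, b);   // the fast version is checked against the slow one
{
  return a * b;
}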

Proving Correctness

In Module 4, Proving Correctness, let's learn about invariants and correctness proofs. For the exercise in Reasoning about Loop Invariants for int test(int lo, int hi), my answers, if you're stuck, were lo <= i', lo <= i + 1, and i + 1 <= hi. I couldn't get the system to accept i' <= hi, which it should be if the condition on line 10 is i + 1 <= hi; it does however accept i <= hi (without the prime). We found another exercise bug.

Aside: it's sometimes beneficial to have a few bugs in these kinds of courses, because you then challenge your own answer and go through the material again to prove your answer is correct. You'll experience this if you ever use an international edition of a popular university textbook, which typically sells for only 10% of the regular cost. Almost always the end-of-chapter exercises will have been rewritten by some contractor to force students to buy the full-priced version, since the exercises are often assigned as homework. Since the publisher doesn't care at all about the international edition, it will likely be filled with errata/mistakes in those exercises. You will have to dig deep into the material in order to assure yourself that you have the correct answer and the text is wrong. It also keeps you awake while going through the material.

There's another mistake in the very last point-to reasoning question in Proving Correctness: the second loop invariant should be i <= hi instead of lo <= i. In Termination the answers are a little tricky: i + 5 doesn't really make sense as a bound but is still accepted since i is decreasing. Proving Functions Correct, the last module, is very short but enough to put all this material together. Congrats, you now know how to reason about a program that mutates its variables.

Arrays

Since we're on a roll, let's finish this entire course, starting with Module 8, C0 Memory Model for Simple Types. In Function Calls, note that in a C program main() is called first, which is usually at the bottom of the file. In the first learn-by-doing exercise you are only determining what fast pow is being called with (its parameters), so x and y. The second is hand-stepping the function for one iteration. You can see how difficult this can become, trying to keep the entire state of a program's variables in your head; in Pyret and other functional languages you don't have to do this, since variables are immutable. The recursion module should be easy to figure out since we've already done a lot of recursion with Pyret.

The first array exercise actually runs your code; this was mine. Note we are using booleans, so &&, || or !.

bool[] B = alloc_array(bool, 4);
B[0] = b;
B[1] = !B[0];
B[2] = B[0] && B[1];
B[3] = B[1] || B[0];

The second array exercise also runs your code; I copied the approach from the int[] examples:

char[][] C = alloc_array(char[], 3);
for (int i = 0; i < 3; i++) {
    C[i] = alloc_array(char, 2);
}
C[0][0] = 'a';
C[0][1] = 'b';
C[1][0] = 'c';
C[1][1] = 'd';
C[2][0] = 'e';
C[2][1] = 'f';

In Module 9, The C0 Memory Model, the first assignment asks you to open the C0 interpreter and find out what the default initialized value is for an array of type bool. I guessed false instead of using the interpreter, since false is the obvious design choice. You can try the online interpreter for C0 here. At the end of this module they show an example of a contract calling a helper function, with \length(A) used as an input to find the length of the array, along with a warning that this is a hacky method that won't work if contracts aren't enabled in production, since that assert statement won't be run.

Aliasing

Scope and aliasing can be difficult topics to learn; as per this paper on Prof Shriram's Brown University page, even upper-level students failed to understand aliasing. That paper is worth looking at, to see how confusing this can become in languages like Java, where one change affects two different locations because both point to the same thing. The exercises on this page can get tricky: remember that A = B being true means arrays A and B share the same memory location and are thus the same array, while A[1] = B[1] being true only means the values stored at those indexes are the same.
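
A small C0 sketch of the difference (my own example; put these statements inside a function body or try them in the interpreter):

int[] A = alloc_array(int, 3);
int[] B = A;            // alias: A and B name the same array
B[0] = 42;
//@assert A[0] == 42;   // true, the write through B is visible through A

int[] C = alloc_array(int, 3);
C[0] = A[0];            // only the value is copied; C is a separate array
C[0] = 7;
//@assert A[0] == 42;   // still true, changing C does not touch A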

Moving on to module 10, Coding with Arrays, we run into more aliasing. Here they formally introduce for loops, even though they were already used previously. The main difference between a while loop and a for loop is that you use a for loop when the range is known, and a while loop for an unknown number of iterations, such as waiting for a user to press a key or summing an arbitrary number of inputs. Since an array is a fixed range, you'd use a for loop for our array copy program. For the learn-by-doing on summing an array, all you need to do is declare int sum = 0, keep a running sum in the body of the for loop, and return it after the loop. Better code would have an if test to see if n == 0 and then return 0 (the sum of an empty array), but I didn't bother and it accepted my answer anyway.

int sum = 0; 
for(int i = 0; i < n; i++) {
   sum += A[i];
}
return sum;
// grader doesn't like sum = sum + A[i], preferring the += syntax instead


//the 'did I get this' exercise
int sum = 0;
for(int i = 0; i < n; i++) {
   sum += i;
}
return sum;
//note the 'between 0 and n' meaning exclusive of n

In Safety of Array Code you can see how difficult proving safety can become, such as needing to guarantee the loop counter never reaches INT_MAX. The very last did I get this question asks you to prove that access to B[i] is safe; there is only one answer, and it's what we just covered in this module (the loop invariant). The rest of the exercises get harder and harder, but you will eventually figure them out and get better at point-to reasoning. For example, the reasoning in array copy after the postcondition is added, how can we prove the postcondition is correct: if array B is created with length n, and returned, then the postcondition must hold. In Correctness of Array Code I couldn't get the system to accept my answer, which was originally to test A[i] == B[i] and return true, false otherwise. The accepted answer is to check whether they are not equal and return false, otherwise return true after the loop:

bool is_same_array(int[] A, int[] B, int n)
//@requires n == \length(A) && n == \length(B);
{
  for (int i = 0; i < n; i++)
  //@loop_invariant 0 <= i;
  {
    if (A[i] != B[i])
      return false;
  }

  return true;
}

For Module 10 Contract Exploits w/Array Code the learn by doing assignment wants you to write assert statements such as assert(A[0] == 0);, checking A's elements are not modified by array copy.

Searching and Sorting

The loop invariant in the first search function is suspect, as that invariant is broken on the first iteration. From the assignment questions this appears to be a typo; it should read 0 <= i instead. In Module 11 the typo is confirmed, as the same function is repeated with the fixed invariant. In the first search learn by doing there are a few typos you'll catch, like search(100, x, A, 0, 10) when our search function only takes 4 parameters, not 5. The last exercise in Searching in a Sorted Array is easy once you've done the others; only the 3rd line was a problem, which was to assert that A[i] < x. That is a kind of pointless assertion: if A[i] is not x, and A[i] is not bigger than x (remember it's a sorted array), then the current A[i] must be less than x. In Reasoning about the search function, for the first exercise any function calls in the requires/ensures/assert contracts need to be checked for safety violations, as does any statement accessing arrays in the @ensures postconditions, assert contracts, or body of the code. The feedback for the correct answer on the last exercise is wrong, but by now you should be pretty good at point-to reasoning. The postcondition in the last exercise is correct because the array has a precondition that it's already sorted, there's a loop invariant that the desired value is not in the array so far (a call to a boolean function with the current value of i), there are 2 assert statements, and an if block saying that if A[current loop index] is bigger than x, then clearly x isn't in the array since it's sorted.

At this point, believe it or not, you know all there is to know about basic imperative programming and can go through the docs of any mainstream language like Python 3.x for a future AI course. What we didn't cover was manual C pointers; if you want those, go through the slides of the full 15-122 course or try a systems course that shows how pointers work at the assembly level. Since we care about AI and will likely be using Python soon, this course was a great intro to reasoning about code that mutates memory.

Complexity

We made it to the last unit on complexity.

The first example: 3n + 4 \(\in\) O(2n - 1)

  • try n = 1
  • 3(1) + 4 = 7 \(\le\) c(2(1) - 1) = c(1), so c = 7 works; technically that answer is fine, since 3n + 4 is still in O(n)

Their assignment: 3n + 4 \(\in\) O(2\(n^2\) + 3n + 1)

  • Try n = 1:
  • 7 \(\le\) c(2(\(1^2\)) + 3(1) + 1)
  • 7 \(\le\) c(6)
  • You could claim c = 2 (since 7 \(\le\) 2(6)), or try a larger \(n_0\):
  • Try n = 2:
  • 3(2) + 4 \(\le\) c(2(\(2^2\)) + 3(2) + 1)
  • 10 \(\le\) c(15)
  • True, so \(n_0\) = 2 and c = 1, though c = 2 is still a valid upper bound since f(n) is in O(\(n^2\))
  • In fact any c \(\ge\) 1 works for a large enough \(n_0\), since 3n + 4 is eventually dominated by \(n^2\)

There's an optional assignment, to show 2\(n^2\) + 3n + 1 is not in O(3n + 4)

  • Try n = 1
  • 2(\(1^2\)) + 3(1) + 1 \(\le\) c(7), i.e. 6 \(\le\) 7c
  • Try n = 2
  • 2(\(2^2\)) + 3(2) + 1 \(\le\) c(10), i.e. 15 \(\le\) 10c
  • Try n = 3
  • 2(\(3^2\)) + 3(3) + 1 \(\le\) c(13), i.e. 28 \(\le\) 13c
  • We can see at this point that no matter what constant we pick for c, the \(n^2\) term on the left will eventually outgrow c times the right-hand side.
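One way to make that 'eventually' precise (not from the course, just one way to pin it down): for any constant c, choose n \(\ge\) 4c and n \(\ge\) 1. Then \(2n^2 \ge 8cn \ge c(3n + 4n) \ge c(3n + 4)\), so \(2n^2 + 3n + 1 > c(3n + 4)\), meaning no single c can work for all n.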

The did I get this assignment will be easy: whichever polynomial has the highest degree bounds the lower-degree polynomial, since n can't outgrow \(n^k\). In Simplest and Tightest Bound, what they are saying is that if your step count is 3n + 1, you don't want to pick some giant bound such as 3n + 1 \(\in\) O(\(n^4\)). You want the tightest bound, which is n, or whatever the maximum power of f(n) is. If there is no variable n then the tightest bound is O(1), constant time.

The Complexity Classes module assumes you remember that logarithms are the inverse of exponentiation. Let's try the example using the logarithmic properties from that AoPS page:

  • 3n(2 + log(2\(n^3\)))
  • 6n + 3n log(2\(n^3\))
    • From AoPS: log bc = log b + log c
  • 6n + 3n(log 2 + log \(n^3\))
    • From AoPS: log \(b^n\) = n log b
  • 6n + 3n(log 2 + 3 log n)
  • 6n + 3n log 2 + 9n log n
    • 6n is in the O(n) complexity class, obviously
    • 3n log 2 is in O(n) because log 2 is a constant
    • 9n log n is in O(n log n) since n is not constant
    • We pick the biggest class: O(n log n)

Try the exercises for determining complexity class:

  • log (3\(n^2\)) - 2
  • log 3 + log \(n^2\) - 2
  • log 3 + 2 log n - 2
    • complexity class: log n

Determining complexity of code

In this module don't worry about their fancy square matrix functions, only about the overall structure of the code (nested loops where functions are called, constant-time returns, etc). In Beyond n I couldn't get the last exercise about an n-sized collection of images, but I correctly guessed 'wh' since we're still making a copy of each image. It doesn't matter if you don't get these immediately; you will gain through experience how to estimate complexity.

In the last module, Experimentally Testing Complexity, the cost form an + b is what we've seen before, such as 3n + 7, and this chapter is about analyzing a running program to discover the multiplying constant (the a in an + b). To figure out whether it's n, \(n^2\), or \(n^3\), read the introduction: doubling the size of the input should roughly double the cost if it's O(n), quadruple it if it's O(\(n^2\)), and multiply it by eight if it's O(\(n^3\)). In the first activity, which is f(n) = a\(n^2\), notice the time goes 0.02 -> 0.08 -> 0.28 -> 1.05 etc. That's roughly 4x the cost each time the input doubles, so \(n^2\). The matrix mult activity has roughly 8x the cost once n > 400. There's an error in the assignments: I couldn't get my answers accepted unless I multiplied 'a' by 100,000,000 instead of 10,000,000. In fact if you check their example chart for vector-above-avg-better, they have clearly multiplied 'a' by 100 million, not 10 million.

In the last learn by doing assignment, 0.38 sec multiplied by 4 is 1.52, close enough to the time for the next doubled input, so this function must be \(n^2\). Computing f(n)/\(n^2\) you end up with roughly a = 0.6, and there's an interesting follow-up on how you can screw up your tests, since the initialization of the test is \(n^2\) but the function cost is only n.

Cryptopals challenges

Using only what we know so far (how to read Pyret docs, how to use higher order functions), try completing a few of the cryptopals challenges to learn how real-world crypto is attacked. Model the bytes however you want; I used strings because we haven't covered making our own datatypes yet for a binary type, and if you use numbers Pyret will truncate binary 0001 to 1. These challenges were designed so you write most of the modeling yourself.

Before you begin, you probably need to know the basics of bytes, boolean algebra and hex encoding. See 15-122 Imperative Programming.

Challenge 1 - Convert hex to base64: we're given two encoded strings, one encoded in hex and the other in base64, and we are asked to operate only on raw bytes. Since we're using a web editor, I decided to model this with strings and a string-dict, which is in the Pyret documentation. Wikipedia has all the info needed to generate hex or base64 encodings; base64 uses 6 bits per character instead of 4, and has a padding scheme which is detailed here. In that example, the character 'M' is 0100 1101 (8 bits), but base64 works in 6-bit groups, so it pads four zeros for a total of 12 bits, which encodes as 2 base64 characters. You can check for this with the built-in num-modulo. If you're wondering how num-modulo(-7, 3) is 2, see this short video on the mod operator: the largest multiple of 3 that is less than or equal to -7 is -9, and the result is the distance from -9 up to -7, which is 2.

include string-dict

hex-to-binary-dict =
  [string-dict:
    "0", "0000",
    "1", "0001",
    "2", "0010",
    "3", "0011",
    "4", "0100",
    "5", "0101",
    "6", "0110",
    "7", "0111",
    "8", "1000",
    "9", "1001",
    "a", "1010",
    "b", "1011",
    "c", "1100",
    "d", "1101",
    "e", "1110",
    "f", "1111"]

binary-to-hex-dict =
  [string-dict:
    "0000", "0",
    "0001", "1",
    "0010", "2",
    "0011", "3",
    "0100", "4",
    "0101", "5",
    "0110", "6",
    "0111", "7",
    "1000", "8",
    "1001", "9",
    "1010", "a",
    "1011", "b",
    "1100", "c",
    "1101", "d",
    "1110", "e",
    "1111", "f"]

binary-to-base64 =
  [string-dict:
    "000000", "A",
    "010000", "Q",	
    "100000", "g", 
    "110000", "w",
    "000001", "B",
    "010001", "R",
    "100001", "h",
    "110001", "x",
    "000010", "C",
    "010010", "S",
    "100010", "i",
    "110010", "y",
    "000011", "D",
    "010011", "T",
    "100011", "j",
    "110011", "z",
    "000100", "E",	
    "010100", "U",
    "100100", "k",
    "110100", "0",
    "000101", "F",	
    "010101", "V",	
    "100101", "l",
    "110101", "1",
    "000110", "G",
    "010110", "W",
    "100110", "m",
    "110110", "2",
    "000111", "H",
    "010111", "X",
    "100111", "n",
    "110111", "3",
    "001000", "I",
    "011000", "Y",
    "101000"," o",
    "111000", "4",
    "001001", "J",
    "011001", "Z",
    "101001", "p",
    "111001", "5",
    "001010", "K",
    "011010", "a",
    "101010", "q",
    "111010", "6",
    "001011", "L",
    "011011", "b", 
    "101011", "r",
    "111011", "7",
    "001100", "M",
    "011100", "c",
    "101100", "s",
    "111100", "8",
    "001101", "N",
    "011101", "d",
    "101101", "t",
    "111101", "9",
    "001110", "O",
    "011110", "e",
    "101110", "u",
    "111110", "+",
    "001111", "P",
    "011111", "f",
    "101111", "v",
    "111111", "/"]

Build a function that takes the hex string as input and outputs the binary representation, for example by using string-explode() to make a list of single characters and then mapping over the exploded list with hex-to-binary-dict.get-value(x) (see the string-dict documentation); a sketch follows. You could also write a function make-bytes() that turns an exploded list of single bits into groups of 4, 6 or 8 bits.
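Here's a minimal sketch of that idea, assuming the hex-to-binary-dict defined above (the fold in hex-to-binary-string just glues the 4-bit chunks back into one string):

fun hex-to-binary-list(hex :: String) -> List<String>:
  doc: "Map each hex character to its 4-bit binary string"
  map(lam(c): hex-to-binary-dict.get-value(c) end, string-explode(hex))
where:
  hex-to-binary-list("4d") is [list: "0100", "1101"]
end

fun hex-to-binary-string(hex :: String) -> String:
  doc: "Same thing, glued into one long binary string"
  fold(lam(acc, bits): acc + bits end, "", hex-to-binary-list(hex))
where:
  hex-to-binary-string("4d") is "01001101"
end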

hex-string = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d"

>>hex-to-binary-list(hex-string)
[list:
"0100", "1001", "0010", "0111", "0110", "1101", "0010", "0000", "0110", "1011", "0110", "1001", "0110", "1100", "0110", "1100", "0110", "1001", "0110", "1110", "0110", "0111", "0010", "0000", "0111", "1001", "0110", "1111", "0111", "0101", "0111", "0010", "0010", "0000", "0110", "0010", "0111", "0010", "0110", "0001", "0110", "1001", "0110", "1110", "0010", "0000", "0110", "1100", "0110", "1001", "0110", "1011", "0110", "0101", "0010", "0000", "0110", "0001", "0010", "0000", "0111", "0000", "0110", "1111", "0110", "1001", "0111", "0011", "0110", "1111", "0110", "1110", "0110", "1111", "0111", "0101", "0111", "0011", "0010", "0000", "0110", "1101", "0111", "0101", "0111", "0011", "0110", "1000", "0111", "0010", "0110", "1111", "0110", "1111", "0110", "1101"]

>>hex-to-binary-string(hex-string)

"010010010010011101101101001000000110101101101001011011000110110001101001011011100110011100100000011110010110111101110101011100100010000001100010011100100110000101101001011011100010000001101100011010010110101101100101001000000110000100100000011100000110111101101001011100110110111101101110011011110111010101110011001000000110110101110101011100110110100001110010011011110110111101101101"

In Challenge 3, trying to find out which character has been XOR'd against the given hex encoded string, you can see Norvig's word frequency post. Your program can take the 1 byte ASCII character (ie: 01010000 is "P") and only test XOR'ing the common words, then check whether the result is a member of the much larger original. That way you avoid having to XOR the entire original string on every try. Use the repeat() function for strings to extend a single character's byte representation to the same size as the word's byte representation, then map2 over them with your own xor function (a sketch follows).
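A minimal sketch of such an xor over binary strings (xor-bit and xor-strings are just illustrative names):

fun xor-bit(a :: String, b :: String) -> String:
  doc: "XOR two single-bit strings"
  if a == b: "0" else: "1" end
end

fun xor-strings(s1 :: String, s2 :: String) -> String:
  doc: "Bitwise XOR of two equal-length binary strings"
  fold(lam(acc, bit): acc + bit end, "",
    map2(xor-bit, string-explode(s1), string-explode(s2)))
where:
  xor-strings("0101", "0011") is "0110"
end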

You don't have to do these challenges but they're good extra practice, plus you get to learn about cryptography at the same time.

Lecture 4 CS019 Big-O

Watching the Fri 9/14/18 lecture on performance. The lambda notation is the same as the anonymous function notation we already used with map. The setup at the beginning of the lecture is to establish that the n in O(n) stands for a function mapping n -> steps to complete the computation. Another explanation is here in ambiguities in mathematics.

The constant mentioned at the end of the lecture in the formal notation is worked through further here. It's an estimation constant that bounds f, so that for any input k, the constant c times g(k) is greater than or equal to f(k). What times g(k) is bigger than or equal to f(k) -> 5k + 4:

  • f(k) -> 5k + 4:
    • f(1) -> 5(1) + 4 = 9
    • f(2) -> 5(2) + 4 = 14
    • f(3) -> 5(3) + 4 = 19
    • f(10) -> 5(10) + 4 = 54
  • g(k) -> k:
    • g(1) = 1
    • g(2) = 2
    • g(3) = 3

For the constant c, try 10 as suggested in the lecture: is f([k -> 5k + 4]) \(\le\) 10 * g([k -> k])?

  • f(1) is 9
  • g(1) -> 10 * g(1) is 10
    • f < g
  • f(2) is 14
  • g(2) -> 10 * g(2) is 20
    • f < g
  • f(10) is 54
  • g(10) -> 10 * g(10) is 100
    • f < g

Reading Chapter 14

We're reading Predicting Growth following the last cs019 lecture on complexity.

14.5 Structural recursion can be described like this: you built up a structure, say a list, by adding elements one at a time (n + 1), and structural recursion takes that structure apart the same way you built it up, recurring on (n - 1), like counting its length. Generative recursion instead rearranges a problem into a set of newly generated subproblems; basically, recursion that ignores the structure. Quicksort is an example of generative recursion: you break on a pivot and now have 2 lists plus the pivot, which is not how you assembled that list structurally.

The Tabular Method is not strictly necessary, as you will read in 14.8, since we can hand count these functions like we learned in the lecture. The 'question' refers to the conditionals in the cases statement, like empty?, and the 'answer' is what that datatype variant evaluates to, like empty, or 0, or the recursive call in link(f, r).

"Let's consider the first row. The question costs three units (one each to evaluate the implicit empty-ness predicate, l, and to apply the former to the latter)" which is a very difficult way of saying: calling the function is +1 cost, opening up l for cases is +1, considering empty variant is +1, so 3 in total and once more if the list is actually empty which you should remember from lecture 4 Fri/9/14/18. Second row cost calculation is different from the lecture (none of these tiny costs matter of course), We get to link(f, r) and count: link +1, f + 1, rest + 1, 2 more for the addition operation, and one more to call len() again, so total 6 for the 'answer' though the lecture we saw is +1 for link, +1 for addition, and plus (k - 1), so in the lecture len() is total 5k + 4, here it's 12k + 4. You can do either this tabular method or you can do the adhoc assign + 1 cost to anything that looks like an operations that we did in the lecture and you will end up with O([k -> k]) anyway since we don't care about constants.

14.7 We've seen this notation in lectures; these are lambda functions used to remove ambiguities in the math notation. 14.8, recall from lectures, reads in English: 'there exists some constant c such that for all inputs n in the set of Natural Numbers, f(n) is less than or equal to that same constant c multiplied by some other function g(n)', implying that f is bounded by g.

14.8 Let's find the smallest constant for 13k + 4 \(\le\) c\(k^2\). If k = 1 then c needs to be 17 or larger. If k = 2 then we have 30 \(\le\) 17(\(2^2\)), which is true, and it's also true for k = 3 and k = 4; if k = a trillion then 13(1 trillion) + 4 is still less than 17(1 trillion * 1 trillion). In fact for inputs bigger than 13 the constant c becomes nearly irrelevant. Big theta is then discussed briefly as the family of functions that grow at the same speed up to a constant. Reading PAPL, if you want to understand the math notation right click it (or press the notation on a phone); in most browsers you should get a MathJax menu to show the TeX code so you can see exactly what it is.

14.9 Combining Big-oh Without Woe: as we saw in 15-122, two functions in O(n) run consecutively (one after the other) are O(n + n) or O(2n), which is O(n) since constants don't matter, just like 5k + 4 is really O(k). A function f(n) that invokes z(n) in each of its steps is O(n x n) or O(\(n^2\)).

14.10 Solving recurrences

For the first recurrence example, T(k) = T(k - 1) + c, think of the lecture when we went through length(): a constant amount of 5 steps for the current element, plus the cost of everything that came before it (the remaining k - 1 elements), plus 4 for the empty case; that is the same shape as T(k - 1) + c. The notation T(k - 2), T(k - 3) refers to the next steps down the recursion tree towards the base case. Here \(c_0\) or \(c_1\) mark the base case at 0 or 1 (the empty/stop-recursion case). If you want, go through this set of YouTube examples on recurrence relations to see what a recursion tree looks like.

The next recurrence is T(k-1) plus k, not a constant. To understand the notation: \(T(0) + (k-(k-1)) + (k-(k-2)) + \cdots + (k-2) + (k-1) + k\) plug in a value for k:

  • T(4) = T(0) + (4-(4-1)) + (4-(4-2)) + (4-1) + 4
  • = 0 + 1 + 2 + 3 + 4 = 10
  • The closed form is n(n+1)/2, here 4(4+1)/2 = 10, and it is in O(\(n^2\)) since n(n + 1)/2 = (\(n^2\) + n)/2.

2T notation means there are two recursive calls. The last recurrence relation is detailed in this YouTube example.

There's an induction exercise which you can try if you've followed the math workshop with this one:

  • T(n) = n(n + 1)/2
  • Base Case:
    • T(1) = 1(1 + 1)/2 or 2/2
    • T(1) = 1
  • Hypothesis:
    • Suppose that T(n) = n(n + 1)/2 (assumption)
  • Show it holds for n + 1:
    • T(n) + (n + 1) = (n + 1)((n + 1) + 1)/2
    • T(n) + (n + 1) = (n + 1)(n + 2)/2
  • Left side is in the form of hypothesis T(n):
    • n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2
  • Let's get rid of the denominator 2 by multiplying both sides by 2/1
    • n(n + 1) + 2(n + 1) = (n + 1)(n + 2)
    • \(n^2\) + 3n + 2 = (n + 1)(n + 2)
  • Factor left side:
    • (n + 1)(n + 2) = (n + 1)(n + 2)

We will come back to this chapter after next lecture to figure out the [k -> log k] and other logarithmic time algorithms.

Lecture 5 CS019 Insertion Sort Recurrence

We're watching the Mon 9/17/18 lecture. The first 30 minutes or so of the recording has no video, but you can pause around 30:14, open the video again to hear the audio, and see everything on the board they are working on (sort and insert). He starts out talking about the invariant of insert: its output length should always be k + 1 because it is inserting a new value. The second topic is how the size of a list doesn't help you catch incorrect types: if you have [list: [list: empty]], that's still size = 1 even though it's empty inside.

For an if-statement you take the branch with the worst complexity, assuming worst-case analysis. The actual cost analysis of insert() starts around 20 minutes in, because you want to resolve the dependencies of sort() by figuring out the complexity class of insert() first. A link operation link(f, r) takes a constant amount of time regardless of the size of the input list, because it doesn't make a copy of the list. The if statement costs 1 + (worst case branch).

On the right hand of the board:

  • Tinsert(0) = \(c_0\) (constant steps for empty, or base case)
  • Tinsert(k) = c + Tinsert(k - 1)
  • = c + c + c + … + \(c_0\)
  • so Tinsert(k) = kc + \(c_0\)

Remember we're calling insert(f, sort(r)) on the rest of the list, so it's k - 1.

  • Tsort(0) = \(c_0\) or a constant amount of steps
  • Tsort(k) = Tsort(k - 1) + Tinsert(k - 1) + c, and Tinsert(k - 1) is roughly kc + \(c_0\)
  • = Tsort(k - 1) + kc
  • We've seen this already, T(k - 1) + k is the series 1 + 2 + 3 + 4+…+(k - 1) + k or n(n + 1)/2 or \(O(n^2)\)
  • Insertion Sort is \(O(n^2)\) in the worst case, and O(n) + O(1), i.e. O(n), in the best case: you still have to call sort(r) and do linear work walking down the n-sized list before insert can do its O(1) operation of linking onto the front of the list and returning.
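Since the board is hard to see in that recording, here's a minimal sketch of the sort/insert pair being analyzed, reconstructed from the insert(f, sort(r)) shape mentioned above (names and details are approximate, not the lecture's exact code):

fun insert(n :: Number, l :: List<Number>) -> List<Number>:
  doc: "Insert n into an already-sorted list; O(1) best case, O(k) worst case"
  cases (List) l:
    | empty => [list: n]
    | link(f, r) =>
      if n <= f:
        link(n, l)
      else:
        link(f, insert(n, r))
      end
  end
end

fun isort(l :: List<Number>) -> List<Number>:
  doc: "Insertion sort: sort the rest, then insert the first element"
  cases (List) l:
    | empty => empty
    | link(f, r) => insert(f, isort(r))
  end
end
check:
  isort([list: 3, 1, 2]) is [list: 1, 2, 3]
  isort(empty) is empty
end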

More Recurrences

Before we look at the big-O lab, let's learn some more about recurrences so we 'can feel them in our bones', as suggested by Prof K in the last lecture. I'm going to use Part V Recurrences from MIT's Mathematics for Computer Science (mcs) free book, starting on page 993 and reading only up to page 1001, as cs019 will cover the analysis (the Akra-Bazzi formula/asymptotic solutions) in later lectures. For now we're going to read the intro on the Tower of Hanoi and the steps they use to solve the recurrence for merge sort.

There are lectures for the recurrence chapter of mcs, the inductive proof in the recorded lecture for the Tower of Hanoi recurrence is slightly different than the pdf:

  • Our original recurrence: \(T_n\) = 2\(T_{n-1}\) + 1
  • Inductive Hypothesis(IH)/closed form solution: \(T_n\) = \(2^n\) - 1
  • Base Case: \(T_1\) = \(2^1\) - 1
  • Inductive step: \(T_{n+1}\) = 2\(T_{(n+1)-1}\) + 1, which is 2\(T_n\) + 1
  • "Suppose \(T_n\) = \(2^n\) - 1. Then the successor of \(T_n\) is \(T_{n+1}\) = \(2^{n+1}\) - 1"
  • Now via substitution plug in our inductive hypothesis, since we declared \(T_n\) = \(2^n\) - 1 wherever you see \(T_n\) insert that hypothesis:
    • \(T_{n+1}\) = 2\(T_n\) + 1
    • \(T_{n+1}\) = 2(\(2^n\) - 1) + 1 (substitution)
    • \(T_{n+1}\) = \(2^{n+1}\) - 2 + 1 (distributive law)
    • \(T_{n+1}\) = \(2^{n+1}\) - 1

In mcs (the pdf text), for their IH they assumed that \(T_{n-1}\) = \(2^{n-1}\) - 1, so instead of substituting for \(T_n\) they substituted \(T_{n-1}\):

  • Original recurrence: \(T_n\) = 2\(T_{n-1}\) + 1
  • Assume/IH: \(T_{n-1}\) = \(2^{n-1}\) - 1
  • Inductive step (looking at the left side): the successor of \(T_{n-1}\) is \(T_n\), and we already know its value: \(T_n\) = 2\(T_{n-1}\) + 1
  • By substitution replace all occurrences of \(T_{n-1}\) with IH:
    • \(T_n\) = 2(\(2^{n-1}\) - 1) + 1
    • \(T_n\) = \(2^n\) - 2 + 1
    • \(T_n\) = \(2^n\) - 1
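As a quick sanity check of that closed form, here's the recurrence written directly in Pyret (hanoi-moves is just an illustrative name):

fun hanoi-moves(n :: Number) -> Number:
  doc: "T(n) = 2T(n - 1) + 1 with T(0) = 0"
  if n == 0:
    0
  else:
    (2 * hanoi-moves(n - 1)) + 1
  end
where:
  hanoi-moves(1) is 1
  hanoi-moves(3) is 7
  # matches the closed form 2^n - 1
  hanoi-moves(10) is num-expt(2, 10) - 1
end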

In Merge Sort, \(T_n\) describes two recursive calls, with the total input divided in half between them, plus the comparisons needed to merge the two sorted halves back together (repeatedly producing the smaller value). The plug and chug method they use is self explanatory until Step 3, where they introduce k = log n in order to get their base case of \(T_1\) = 0. When you see \(2^k\) anywhere in the recurrence, replace it with \(2^{\log n}\), which is n because \(a^{\log_a b} = b\). What you should take away from step 3 is how they manipulate the recurrence in order to produce a known value of \(T_n\), such as \(T_0\) or \(T_1\) (here \(T_1\)).

Step 3, after k = log n has been substituted for all k:

  • \(2^{\log n}T_{n/2^{\log n}}\) + log n * n - \(2^{\log n}\) + 1
    • Remember \(2^{\log n}\) = n
  • n\(T_{n/n}\) + n * log n - n + 1
    • n\(T_{n/n}\) is n\(T_1\) and in step 3 \(T_1\) was already declared to be 0
  • n(0) + n * log n - n + 1
  • n log n - n + 1 or \(T_n \in O(n \log n)\)

Return to 14.10 Recurrences

Let's go back and read 14.10 Solving Recurrences again now that we know how logarithms are used to produce a known base case value like \(T_1\). The first, T(k) = T(k/2) + c, is the [k -> log k] recurrence: we are looking at binary search. For this binary search example, note the trick used to reach \(T_1\), and look up how logarithms are defined for \(\log_2 k\).

The next example, T(k) = T(k/2) + k, is [k -> k]; again they have used the \(2^{\log k}\) trick to produce k/k, and they have also used the distributive property to factor out k, producing k(1/k + … + 1/4 + 1/2 + 1), which ends up simplifying to a base-case constant + 2k. In the [k -> k log k] example they get rid of \(2^k\) by replacing it with \(2^{\log k}\), which is k, and this is multiplied by T(1), which is \(c_1\). The rest of the recurrence is k \(\log_2\) k, which is worked through in this article on the power rule: \(\log_2 k^k\) is k \(\cdot\) \(\log_2 k\).

Lab: Big-O 1

Let's go through the lab here.

First task: why can we disregard constants? We already know why. g(x) = 5\(x^2\) + x has the same shape as T(k) = T(k - 1) + k, which is O([k -> \(k^2\)]); since we only care (in this class, anyway) about finding an upper bound for the worst case, we don't care about the constant 5 or the lone x, as both are bounded by \(x^2\). We've seen how, when inputs grow large, \(x^2\) grows so much that x and any constants become insignificant (solely for worst-case analysis, of course).

Second task: in the chapter's Notation section, notice the brackets are needed when using the [k -> k] notation, which is why the first example of O(w -> \(w^3\)) is wrong; it should be f \(\in\) O([w -> \(w^3\)]). There also shouldn't be any constants in the notation, like [m -> 3m]. The example of f \(\in\) O(\(h^2\)) should likely be f \(\in\) O([h -> \(h^2\)]) or f(h) \(\in\) O(\(h^2\)).

In 3 The Tabular Method we finally get an explanation of what the terms 'question' and 'answer' mean; this should probably be in the book. The second row of the cases statement we already read in the book for the len function, and saw in lectures. It really doesn't matter how you count the constant units for worst-case analysis: in lectures we chose link(f, r) as +1 cost and the addition as +1 cost, added to 3k + 4, for a total of 5k + 4. If you carefully counted each operation in link and came up with 100k + 4, it would still be the same worst-case analysis: len() is in O([k -> k]). Looking back at chapter 14, the T(k) = T(k - 1) + c recurrence is the same as len().

4 Big-O with less math explains how to estimate complexity a little more clearly than the book does. Task:

  • 1: This could be [k -> k * log k] because for each element we make a call to another function so O(F X G). This also fits the explanation of f(x) = x log x where the recursive call is logarithmic but input stays the same.
  • 2: This could be [n -> n + m] which is [n -> n]
  • 3: This could be [k -> k * \(m^2\)] which is [k -> \(k^3\)]
  • 4: This is 'a function makes calls to another function at every one of its steps', so O(F X G), as we are linearly going through to-delete, and at each element f we call remove(f) on another list (remove(f) then needs to linearly go through all of l to remove f).

Let's figure out 5, a function that takes two lists as input and counts their lengths. This translates to two nested cases:

fun dbl-count(list1, list2):
  cases (List) list1:
    | empty => 0
    | link(f, r) =>
      cases (List) list2:
        | empty => 0
        # two recursive calls, each on an input one element smaller
        | link(ff, rr) => 1 + dbl-count(r, list2) + 1 + dbl-count(list1, rr)
      end
  end
end

Looking at chapter 14 in the book, this is T(k) = 2T(k - 1) + c: two recursive calls, each doing a constant amount of work. Let's confirm by watching T(n) = 2T(n-1) + 1. Indeed this is O([k -> \(2^k\)]).

  • 6: My guess is that this is O(F(k) X G(k) X H(\(k^2\))) or O([k -> \(k^4\)])
  • 7: Recall from lectures that you figure out the complexity cost of dependent functions from the dependencies up, so we want to find the cost of rem-helper() first. It takes a size-k input and an element from rem-dups. We have an if-branch, so we take the worst branch, and both branches involve going linearly through the entire list and looking at every element. We'll just naively count some costs: we go into the function, +1, we open the list, +1, we look at the empty case, +1; asking those questions across the whole list is roughly 3k + 1. If it's not empty, +1 for the link operation, +1 to compare f to c, and worst case I guess +1 to link f, then all the previous operations all over again until empty. So this is k times a constant, or O([k -> k]). Now look at rem-dups: it does linear work, but it calls rem-helper() on every single element. The definition of quadratic, 'appears when comparing each element of a data structure to all other elements in that structure', fits this situation, so O([k -> \(k^2\)]). Looking in the book, T(k) = T(k - 1) + k describes this program as well.

5.1 Recall \(c_0\) represents the base case T(0) which is some constant amount of work.

Another task, find the closed form of T(k-2) + 1 if k > 1. I will show the improper guess method where you input values for T(n) and hope an obvious pattern reveals itself, then link the correct method:

  • T(0) = constant
  • T(1) = constant
  • T(2) = T(0) + 1 or 1 + constant
  • T(3) = T(1) + 1 or 1 + constant
  • T(4) = T(2) + 1 or 1 + 1 + constant
  • T(5) = T(3) + 1 or 1 + 1 + constant
  • T(6) = T(4) + 1 or 1 + 1 + 1 + constant
  • T(7) = T(5) + 1 or 1 + 1 + 1 + constant
  • T(8) = T(6) + 1 or 1 + 1 + 1 + 1 + constant
  • Pattern: 1,1,2,2,3,3,4,4 etc.
    • Hint: Enter the pattern into OEIS and you find this
    • floor(n/2) (integer division, no remainder)
  • T(n) = floor(n/2) + constant for n > 1
  • This is [n -> n] (linear): each call removes only two elements, not half the input, and floor(n/2) grows linearly with n.
  • For a better explanation, watch this.
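A quick Pyret check of that guessed closed form (taking the base cases T(0) = T(1) = 0 here, so the '+ constant' drops out):

fun t(n :: Number) -> Number:
  doc: "T(n) = T(n - 2) + 1 for n > 1, with T(0) = T(1) = 0"
  if n <= 1:
    0
  else:
    t(n - 2) + 1
  end
where:
  t(8) is num-floor(8 / 2)
  t(9) is num-floor(9 / 2)
end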

Second recurrence relation: 2T(k-1) + 1 for k > 0 we already did; it's O([k -> \(2^k\)]). The extra challenge relations are all in the book, and lectures on solving them exist in this guy's playlist under 2.1.1 - 2.3.3.

Lecture 6 CS019 Quicksort

We're watching the lecture titled Wed 9/19/18. Reminder: these lectures can be viewed or downloaded at 1920x1080, so you can zoom with whatever video player if needed, but seeing the fine details isn't important; he is showcasing the structure of linear [k -> k] vs quadratic [k -> \(k^2\)]. Testing oracles are brought up again, which we saw in 15-122: an oracle is a separate program that is a much slower and obviously correct version of your more highly optimized program, used to test its correctness. Remember the postcondition check where the result had to equal the result of another program that ran far more slowly.

He remarks that this is really a filter, and we should try quicksort using a built-in. Let's try it:

fun combine(lt, p, gt):
  lt.append([list: p]).append(gt)
end

fun qsort(l):
  cases (List) l:
    | empty => empty
    | link(pivot, r) =>
      combine(
        qsort(r.filter(lam(n): n < pivot end)),
        pivot,
        qsort(r.filter(lam(n): n >= pivot end)))
  end
end
check:
  qsort([list: 2, 1, 3, 4, 5]) is [list: 1, 2, 3, 4, 5]
  qsort([list: 100, 0, 1, 343]) is [list: 0, 1, 100, 343]
  qsort([list: 0, -1, 100]) is [list: -1, 0, 100]
  qsort([list: ]) is ([list: ])
  qsort([list: 8, 1, 1, 9, -1]) is [list: -1, 1, 1, 8, 9]
end

He ends the lecture trying to convey a bigger idea: use the structure of the data to come up with a simple solution, confirm the inefficient solution is correct, optimize it according to the bound you want.

Lab: Big-O 2

Now we can do the second lab. The first task is a mental exercise, which you can figure out after reading the Wikipedia article on Quicksort implementation issues. Several optimizations are listed in the article; one is switching to insertion sort when the number of elements remaining to be sorted falls under some threshold. Since insertion sort's efficiency on small lists in the best/average case was repeatedly talked about in lectures, I'm going to guess this is the solution they expect you to come up with, because as we will see shortly every comparison-based sorting algorithm has the same worst-case complexity. Unless, of course, this whole task is just a trick to get you to see how hopeless it is to try to improve the worst case of quicksort.

The next task is to read some slides up to and including the lower bound slides, coincidentally from the same school I once went to, though I wish our classes were as good as CS019. The decision tree is somewhat confusing at first: a<b means the permutations abc, acb, cab, as in 'a' appears before 'b' in that permutation. On the lower bound slide there are n! leaves; here there are 6 leaves (nodes with no children), and since this tree is for n = 3, n! is 1x2x3 or 6. For the lower bound proof, look at the first example of logarithms here: if \(2^{h+1}\) \(\ge\) \(n!\) then \(h + 1\) \(\ge\) \(\log_2 n!\), because \(b^y = x\) means \(\log_b x = y\). The task is to explain how this relates to our previous problem of attempting to optimize quicksort: these slides prove that every comparison sort algorithm needs \(\Omega\)([n -> n log n]) comparisons in the worst case. However, I recall him telling us in lecture that insertion sort on a small list can help the average/best case (the worst case obviously stays the same).

1.1 Tasks: summing all integers in a list requires linear work to go through the list and get the elements, so this must be at minimum O([k -> k]), unless they were in some kind of sequential order where you could use a closed form solution like n(n - 1)/2. Looking at the recurrence relations chapter in PAPL, this does constant work at every step, so T(k) = T(k - 1) + c, where the c is the addition of elements. For the rest of the tasks, look at the chapter on Predicting Growth in PAPL and see if there is a corresponding recurrence. They did, however, make sure to point out we're using integers, which opens the door to tricks that work only on even or odd integers, such as fast exponentiation. But even with those tricks, the minimum runtime for going through an n-sized list and at each element doing work across an entire other list of size n won't change its recurrence of T(k) = T(k - 1) + k.

Formal analysis writeup, use it as a template to look at the other function.

The last optional task asks how you would implement silly-flipper in linear time. The original function calls reverse at every single element. Really all it's doing is linking first, last-rest, first-rest, last-rest - 1, and so on; so take two input lists, the original and a reversed version, link(f, ff), and on the empty case use .take() to truncate. Linear reverse is described in the above pdf on fast exponentiation.

Assignment 2: Nile

This is a great intro course: the assignments are similar to what you'd be doing outside of school, such as using a Python library to pull data from a website API, where it arrives as lists nested inside some other data structure and needs to be unpacked and cleaned, like dealing with the newline character "\n" in this assignment. Here we are restricted from using list built-ins, so things like member(), append(), or distinct() you'll have to write yourself; the book considers rewriting library functions a drill exercise, since programming is learned by doing (there's a sketch of one such drill below).
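As a warm-up for those drills, a hand-rolled member() might look like this (my-member is just an illustrative name, not part of the assignment):

fun my-member(l :: List, elt) -> Boolean:
  doc: "Hand-rolled version of the member() built-in"
  cases (List) l:
    | empty => false
    | link(f, r) => (f == elt) or my-member(r, elt)
  end
where:
  my-member([list: 1, 2, 3], 2) is true
  my-member([list: 1, 2, 3], 9) is false
  my-member(empty, 2) is false
end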

You'll probably want to use a Google account to save your work. We need to fill in recommend() and popular-pairs() according to the spec we're given. Don't use their 2018 starter template; there's some kind of bug where accessing the content of a recommendation doesn't work. Write your own or use this:

provide *
provide-types *

data Recommendation<A>:
  | recommendation(count :: Number, content :: List<A>)
end

data File:
  | file(name :: String, content :: String)
end

fun recommend(title :: String, book-records :: List<File>) -> Recommendation:
  doc: ```Takes in the title of a book and a list of files,
       and returns a recommendation of book(s) to be paired with title
       based on the files in book-records.```
  recommendation(0, empty)
end

fun popular-pairs(records :: List<File>) -> Recommendation:
  doc: ```Takes in a list of files and returns a recommendation of
       the most popular pair(s) of books in records.```
  recommendation(0, empty)
end

Examples to help understand what the assignment is asking for, you can run this to see if the tests are at least well formed:

fun recommend(title :: String, book-records :: List<File>) -> Recommendation:
  doc: ```Takes in the title of a book and a list of files,
       and returns a recommendation of book(s) to be paired with title
      based on the files in book-records.```
  recommendation(0, empty)
where:
  r1 = file("a", "1984\nCrime and Punishment\nHeaps are Lame\nLord of the Flies")
  r2 = file("b", "1984\nHeaps are Lame\nLord of the Flies")
  r3 = file("c", "1984\nCrime and Punishment\nHeaps are Lame\nCalculus")
  r4 = file("d", "Crime and Punishment\n1984\nLord of the Flies")
  input = [list: r1, r2, r3, r4]

  recommend("1984", input) is recommendation(3, [list: "Heaps are Lame", "Crime and Punishment", "Lord of the Flies"])
  recommend("Heaps are Lame", input) is recommendation(3, [list: "Crime and Punishment", "1984", "Lord of the Flies"])
  recommend("War", [list: file("q", "War\nPeace")]) is recommendation(1, [list: "Peace"])
  recommend("PAPL", input) is  recommendation(0, empty)
 

  # These are their tests from 2016 Nile writeup:
  f1=file("alist.txt","1984\nAnimal Farm\nHigurashi\nLife of Pi")
  f2=file("blist.txt","Animal Farm\nHigurashi\nLife of Pi")
  f3=file("clist.txt","1984\nHeart of Darkness")
  input2 = [list: f1, f2, f3]

  recommend("1925", input2) is recommendation(0,[list: ])
  recommend("1984", input2) is recommendation(1,[list: "Animal Farm","Higurashi","Life of Pi","Heart of Darkness"])
end

Start again by writing as if functions already exist for what you want to do, creating templates if you wish for a function. My strategy is map through the input, extract the file.content string of "Title\nTitle", and return a single list of all titles that have been recommended.

fun recommend(title :: String, book-records :: List<File>) -> Recommendation:
  list-of-titles = fold(lam(acc, elt): acc.append(elt) end, empty, map(lam(x): make-list(title, x) end, book-records))
  produce-rec(list-of-titles)
where:
  ...
end

I wished for a function called make-list(), fed into a fold to append the return into a single [list: "title", "title"]:

# note recommend(title, book-records) are in scope here, if make-list is inside recommend
# you can change parameters to make-list(f :: File) and freely use title inside body of make-list 

fun make-list(search-title :: String, f :: File) -> List<String>:
  doc: "Tests if title is recommended, returns filtered list of f.content minus searched for title"

  new-list = f-to-list(f.content)
  if new-list.member(search-title):
    filter(lam(x): if x == search-title: false else: true end end, new-list)
  else:
    empty      
  end
where:
  make-list("A", file("", "A\nB\nC")) is [list: "B", "C"]
end

Another wished for function, f-to-list() which I expect extracts the f.content string, converts to a list and returns so it can be tested for membership. Notice I left the input to f-to-list() a single input, and the helper inside deals with accumulators/appending the newline character.

fun f-to-list(s :: String) -> List<String>:
  doc:" Consumes string Title\nTitle and returns [list: Title, Title]"

  fun helper(acc, l-a):
    doc: "Breaks on newline character and links accumulator"
    cases (List) l-a:
      | empty => empty 
      | link(f, r) => 
        if f == 10:
          link(string-from-code-points(acc), helper(empty, r))
        else:
          helper(acc + [list: f], r)
        end
    end
  end
  helper(empty, string-to-code-points(s + "\n"))
where:
  f-to-list("1984\nHeart of Darkness") is [list: "1984", "Heart of Darkness"]
end

Now complete wished for function produce-rec() which receives a list of all recommended titles, duplicates meaning they were recommended more than once. This is the first draft, the second draft is seeing where you can redesign to minimize the complexity or completely scrap the entire program and rewrite it from scratch, which is normal as you now know a little more about the requirements needed and can plan a better program. We're just working with templates, trashing them all and rewriting is no problem. As you gain experience you will get better at this, even being able to design programs on paper to work out the logic before writing any code.

popular-pairs()

Using the 2019 Nile assignment spec, notice the new datatype for popular-pairs(): the result will be a recommendation of pairs instead of a list of titles, like recommendation(2, [list: pair(book1, book2), pair(book2, book3)]), and the return annotation is popular-pairs() -> Recommendation<BookPair>:

data BookPair:
    | pair(book1 :: String, book2 :: String)
end

From the examples you write, notice the pattern is f paired with r, fr paired with rr, etc.; there's a sketch of this pairing after the example below.

input = [list: file("alist.txt", "1984\nAnimal Farm\nHigurashi\nLife of Pi"),
               file("blist.txt", "Animal Farm\nHigurashi\nLife of Pi"),
               file("clist.txt", "1984\nHeart of Darkness")]

recommendation(2, [list: pair("Animal Farm","Higurashi"), pair("Animal Farm", "Life of Pi"), pair("Higurashi", "Life of Pi")])
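A minimal sketch of that pairing pattern, assuming the BookPair datatype above (all-pairs is a hypothetical helper; a real solution still has to count how often each pair shows up across files):

fun all-pairs(titles :: List<String>) -> List<BookPair>:
  doc: "Pair the first title with every later title, then recur on the rest"
  cases (List) titles:
    | empty => empty
    | link(f, r) => map(lam(x): pair(f, x) end, r).append(all-pairs(r))
  end
where:
  all-pairs([list: "a", "b", "c"]) is [list: pair("a", "b"), pair("a", "c"), pair("b", "c")]
end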

Assignment 3: Sortacle

Each assignment we'll try to build better software, with more tests. In this assignment we're building a testing oracle. Let's try this using some of the design recipe shilled in the lectures:

  • Identify what must be represented and how it is represented
  • State what kind of data the function consumes and produces
  • Work through examples that illustrate the function's purpose
  • Write an outline of the function
  • Fill in the gaps
  • Test it

The first function is generate-input(n :: Number) -> List<Person>: it consumes a number and produces a list of length n of type Person. Since this function is going to generate random person.name strings, we can't directly match the output in a test case, but we can still confirm it has the correct properties of [list: person(name, number)]:

import lists as L   # the test code below uses L.length, L.foldl, L.sort-by

data Person:
  | person(name :: String, age :: Number)
end

fun generate-input(n :: Number) -> List<Person>:
  doc:"Generate random n size list of person(name, age), name is length 4, age is 1 - 100"
  ...
where:
  # confirm size
  L.length(generate-input(5)) is L.length([list: 1, 2, 3, 4, 5])
  map(lam(x): (string-length(x.name) > 0) is true end, generate-input(5))  

  # make sure the age is within 1 - 100
  map(lam(x): ((x.age > 0) and (x.age <= 100)) is true end, generate-input(5))
 
 # confirm name is ASCII code points from 97 - 122 
 # string-to-code-points() returns a list, so nested map
  map(lam(x): map(lam(y): ((y >= 97) and (y <= 122)) is true end, string-to-code-points(x.name)) end, generate-input(3)) 
end

Pyret has a built-in function repeat() for generating a list, which we will need to write ourselves since this assignment doesn't allow using built-ins outside of the higher order functions and sort()/sort-by(), though I used some list built-ins here and in the test cases since we already rewrote them in the Nile assignment. One way we could architect this is to recurse on n, linking a list together. num-random() gives us something we can turn into a random-looking name (see this article); I kept code points between 97 and 122, 'a' to 'z', though you could just leave it fully random and fuzz test:

fun generate-input(n :: Number) -> List<Person>:
  doc:"Generate random n size list of person(name, age)"

  fun random-name(s :: Number, name-string :: String) -> String:
    doc: "Generate a random string between ASCII 97 - 122 of length s"
    if s < 1:
      name-string
    else:
      single = num-random(33) + 90
      if (single >= 97) and (single <= 122):
        random-name(s - 1, string-from-code-point(single) + name-string)
      else:
        random-name(s, name-string)
      end
    end
  where:
    string-length(random-name(5, "")) is 5
    string-length(random-name(1, "")) is 1
  end

  fun helper(num :: Number, acc :: List<Person>) -> List<Person>:
    doc: "Consumes n, init acc and returns [list: person(name, age)] n times"    
    if num < 1:
      acc
    else: 
      name-length = num-random(25) + 1
      helper(num - 1, link(person(random-name(name-length, ""), num-random(100) + 1), acc))
     # acc on first call is a placeholder here for empty
     # this will return ...link(item, link(item, orig-acc)) 
    end
  end

  # preconditions
  if n < 1:
    raise("Input must be greater than 0")
  else:
    helper(n, empty)
  end
where:
  # confirm size
  L.length(generate-input(5)) is L.length([list: 1, 2, 3, 4, 5])
  map(lam(x): (string-length(x.name) > 0) is true end, generate-input(2))  

  # make sure the age is within 1 - 100
  map(lam(x): ((x.age > 0) and (x.age <= 100)) is true end, generate-input(2))
  # confirm name is ASCII code points from 97 - 122 
  map(lam(x): map(lam(y): ((y >= 97) and (y <= 122)) is true end, string-to-code-points(x.name)) end, generate-input(2)) 

  # confirm input is > 0
  generate-input(0) raises "Input must be greater than 0"
end

Now we have what the assignment is looking for:

>>> generate-input(2)
[list: person("vwqukq", 3), person("rljhrsijqcwyuglwjtnhc", 48)]

Next is a sort validation function that consumes two sorted lists of type Person, and compares if they are both the same returning a boolean true or false. This is similar to 15-122 when we would call an inefficient and obviously correct version of the same program in order to satisfy the postcondition the output is the same. Start with basic tests as usual:

fun is-valid(test-input :: List<Person>, correct-input :: List<Person>) -> Boolean:
  ...
where:
  is-valid([list: person("a", 31), person("b", 30)],[list: person("b", 30), person("a", 31)]) is true
  is-valid([list: person("c", 10), person("d", 10)]) is true 
  is-valid(empty, empty) is true
  is-valid([list: person("a", 31)], empty) is false
end

To compare whether the first sorted input list is equal to the correct sorted list, we could map2 over both lists to see if each .age is equal, then fold over the list map2 returns, and-ing the booleans together. This is inefficient, since a cases statement could stop at the first false instead of continuing through the entire list:

fun is-valid(test-input :: List<Person>, correct-input :: List<Person>) -> Boolean:
  doc: "Consumes l1 a sorted list and compares to l2 a correct-sorter() list"

  fun length-test(test-input-a :: List<Person>, correct-input-a :: List<Person>) -> Boolean:
    doc: "Test if both lists are the same length"
    (L.length(test-input-a)) == (L.length(correct-input-a))
  end

  fun name-test(test-input-b :: List<Person>, correct-input-b :: List<Person>) -> Boolean:
    doc: "Test if any names have been altered"
    correct-sorted-names = correct-sorter(test-input-b)
    names = map2(lam(n1 :: Person, n2 :: Person): (n1.name == n2.name) end, correct-sorted-names, correct-input-b)
    L.foldl(lam(a :: Boolean, b :: Boolean): a and b end, true, names)
  end

  fun age-sort-test(test-input-c :: List<Person>, correct-input-c :: List<Person>) -> Boolean:
    doc: "Test if sorted by age correctly"
    age-test = map2(lam(n3 :: Person, n4 :: Person): (n3.age == n4.age) end, test-input-c, correct-input-c)
    L.foldl(lam(c :: Boolean, d :: Boolean): c and d end, true, age-test)
  end

  fun age-and-name-sort-test(test-input-d :: List<Person>, correct-input-d :: List<Person>) -> Boolean:
    doc: "See if the names and ages match, this will fail occasionally for qsort on duplicate ages"
    both-test = map2(lam(n5 :: Person, n6 :: Person): (n5.name == n6.name) and (n5.age == n6.age) end, test-input-d, correct-input-d)
    L.foldl(lam(e :: Boolean, f :: Boolean): e and f end, true, both-test)
  end

  # return the results
  if length-test(test-input, correct-input)
    and
    name-test(test-input, correct-input)
    and
    age-sort-test(test-input, correct-input)
    and
    age-and-name-sort-test(test-input, correct-input):
    true
  else:
    false  
  end
where:
  is-valid(empty, empty) is true
  is-valid([list: person("a", 31)], empty) is false
  is-valid([list: person("", 31)], [list: person("a", 31)]) is false 
end

Now we go about writing multiple buggy implementations of sorting algorithms and test these functions with oracle :: (List<Person> -> List<Person>) -> Boolean ie: oracle(insertion-sort)) or oracle(quick-sort):

fun combine(lt, p, gt):
  doc: "Used by quicksort"
  lt.append([list: p]).append(gt)
end

fun qsort(l :: List<Person>) -> List<Person>:
  doc: "Quicksort from lectures"
  cases (List) l:
    | empty => empty
    | link(pivot, r) =>
      combine(
        qsort(r.filter(lam(n): n.age < pivot.age end)),
        pivot,
        qsort(r.filter(lam(n): n.age >= pivot.age end)))
  end
end

fun incorrect-qsort(l :: List<Person>) -> List<Person>:
  doc: "Sabotaged quicksort"
  cases (List) l:
    | empty => empty
    | link(pivot, r) =>
      combine(
        incorrect-qsort(r.filter(lam(n): n.age < pivot.age end)),
        pivot,
        incorrect-qsort(r.filter(lam(n): n.age > pivot.age end)))
  end
end

# empty case test
fun empty-sort(p :: List<Person>) -> List<Person>: 
  empty 
end

# deletes the names
fun name-screwed(w :: List<Person>) -> List<Person>:
  correct = L.sort-by(w,
    lam(p1, p2): p1.age < p2.age end,
    lam(p1, p2): p1.age == p2.age end)
  (map(lam(x): person("", x.age) end, correct))
end

fun oracle(f :: (List<Person> -> List<Person>))-> Boolean:
  doc: "Testing oracle for sorting lists of type Person"

  random-input = generate-input(num-random(30) + 5)
  is-valid(f(random-input), correct-sorter(random-input))
where:
  oracle(qsort) is true
  oracle(incorrect-qsort) is false
  oracle(name-screwed) is false
  oracle(empty-sort) is false
end

These tests won't always pass, because sometimes incorrect-qsort() will produce the correct output; it's only on inputs with duplicate ages that it fails our oracle test case. Quicksort will also sometimes produce a different order than correct-sorter() for people with the same age, so the names can differ. You now know how to build your own automated testing program. I only covered a tiny number of test cases, and I didn't really give anything away by showing most of my (poor) code, because this assignment is about full testing coverage: you should completely redo my code and add your own tests to cover everything I missed.

If you were a student at Brown actually taking this course, you would have to write a much larger set of tests, since the grading TAs run your oracle against a suite of incorrect sorting algorithms. Some things you'd need to check: missing ages, fully random inputs rather than just lowercase ASCII names (fuzz testing), and empty names with the same or different ages. I sterilized the input to the oracle sorters, but they probably don't when grading.

Assignment 4: Data Scripting

If you ever accidentally delete something in the pyret online code editor, you can still recover with Ctrl-Z even if the undo options are greyed out in whatever browser.

This assignment is yet more practice that mirrors what a data scientist would have to do: unravelling data out of nested structures and writing quick, basic tasks.

is-palindrome() and sum-largest()

Straightforward to figure out using map/map2/fold. When you map over nested lists, the x in lam(x) is the entire nested list.

adding-machine()

Notice the two consecutive 0's in the test case, which will screw up your list/empty cases if you're linking on occurrences of 0, unless you filter the returned accumulator.

bmi-report()

Template is:

data PHR:
  | phr(name :: String,
      height :: Number,
      weight :: Number,
      heart-rate :: Number)
end

data Report:
  | bmi-summary(under :: List<String>,
      healthy :: List<String>,
      over :: List<String>,
      obese :: List<String>)
end

fun bmi-report(phrs :: List<PHR>) -> Report:
  ...
where:
  bmi-report([list: phr("eugene", 2, 60, 77),
      phr("matty", 1.55, 58.17, 56 ),
      phr("ray", 1.8, 55, 84),
      phr("mike", 1.5, 100, 64)]) is 
  bmi-summary([list: "eugene", "ray"], # under
    [list: "matty"],         # healthy
    [list: ],                # over
    [list: "mike"]           # obese
    )
end

My solution was to write a helper function bmi(p :: PHR) -> Number (sketched below) and then filter each list for under, healthy, over, and obese. Return bmi-summary by mapping over those filtered lists with lam(x): x.name end. Try adding more test cases, like the empty case or multiple identical names. The assignment claims we can assume no entry will be zero, so we don't need a test case for division by zero.
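A minimal sketch of that helper, using the standard BMI formula (weight in kilograms over height in metres squared); the exact category cutoffs should come from the assignment writeup:

fun bmi(p :: PHR) -> Number:
  doc: "Body mass index: weight divided by height squared"
  p.weight / (p.height * p.height)
where:
  bmi(phr("mike", 1.5, 100, 64)) is 100 / (1.5 * 1.5)
end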

data-smooth()

You'll want tests for empty, one element, two elements, and the given assignment example:

fun data-smooth(phrs :: List<PHR>) -> List<Number>:
  doc: "Smooth heart rate data"
...
where:
  # given example from assignment writeup
  data-smooth([list: phr("eugene", 2, 60, 95),
      phr("matty", 1.55, 58.17, 102),
      phr("ray", 1.8, 55, 98),
      phr("mike", 1.5, 100, 88), phr("a", 2, 2, 105)]) is [list: 95, 295/3, 96, 97, 105]

  # empty test
  data-smooth(empty) is empty

  # one element test
  data-smooth([list: phr("spock", 3, 2, 1)]) is [list: 1]

  # two element test
  data-smooth([list: phr("spock2", 1, 2, 90), phr("spock3", 3, 4, 91)]) is [list: 90, 91]
end

One way to try this assignment is to recurse on the list with (prior + l.first + l.rest.first) / 3, then, before your helper function returns, append the last element, which we want unchanged. Before your data-smooth function returns, link the first element onto .rest of what the helper returned, since we want the first element unchanged too.

frequent-words()

I wrote this template:

# optional
data Freq:
  | frequency(word :: String, count :: Number)
end

fun frequent-words(words :: List<String>) -> List<String>:
 ...
where:
  frequent-words([list: "silver", "james", "james", "silver",
      "howlett", "silver", "loganne", "james", "loganne"])
    is  [list: "james", "silver", "loganne"]
  frequent-words([list: "a", "a", "a", "a", "b", "b", "c"]) is [list: "a", "b"]
  # corner case, all words are only 1 frequency so return sorted
  frequent-words([list: "james", "so"]) is [list: "so", "james"  ]
  frequent-words(empty) is empty
end

The assignment wants a list of words returned in descending order of frequency count, sorted by smallest string length when there's a tie. The way I interpret it: if the highest frequency count is greater than 1, return those words; otherwise return the whole list sorted. I made a distinct list, mapped over it, and at each distinct entry filtered the total input list 'words' to get a list of identical words, storing x.word and the length of whatever list the filter returned, to get frequency("word", 3). I then obtained the highest frequency count and recursed on it, creating a new list where x.count == count and stopping when count < 2. Before appending to a helper function accumulator I used sort-by() from the documentation. If the highest frequency count was only 1, I sorted and returned the list. Maybe you can engineer a simpler way.

daily-max-for-month()

We're asked to take [list: 20151004, 150, 200, 175, 20151005, 0.002, 0.03, 20151007, 130, 0.54, 20151101, 78] and turn it into a daily maximum [list: max-hz(20151004, 200), max-hz(20151005, 0.03), max-hz(20151007, 130)] if the searched for month is October. Our template:

data Report:
  | max-hz(date :: Number, max-reading :: Number)
end

fun daily-max-for-month(sensor-data :: List<Number>, month :: Number) -> List<Report>:
  ...
where:
  input = [list: 20151004, 150, 200, 175, 20151005, 0.002, 0.03, 20151007, 130, 0.54, 20151101, 78]
  daily-max-for-month(input, 10) is [list: max-hz(20151004, 200), max-hz(20151005, 0.03), max-hz(20151007, 130)]
end

I completed this with 2 functions, maxx() and a helper(). I filtered sensor-data for the desired month to get a distinct list of all dates in that month, [list: 20151004, 20151005, 20151007]: if you multiply the month by 100 and add 20150000 you get a lower bound for dates in that month. Then I called helper(distinct-list, sensor-data), which tested whether f was a member of my distinct month list; if so I called maxx to find the highest reading that day: link(max-hz(f, maxx(r)), helper(distinct-list, r)). A sketch of the month filter is below.
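A minimal sketch of that month filter (in-month is a hypothetical helper name; it relies on dates like 20151004 being much larger than any sensor reading):

fun in-month(entry :: Number, month :: Number) -> Boolean:
  doc: "True if entry is a 2015 date that falls in the given month"
  base = 20150000 + (month * 100)
  (entry >= base) and (entry < (base + 100))
where:
  in-month(20151004, 10) is true
  in-month(20151101, 10) is false
  # sensor readings are far below the date range, so they get filtered out
  in-month(130, 10) is false
end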

Lecture 7 CS019 Trees

We've caught up to the lectures; we're watching the Fri 9/21/18 lecture on 2D tree-shaped data with guest lecturer Tim Nelson, an expert in formal methods (for example, proving a network security policy correct, or formally verifying software models, something we will do in the courses after this one). An interesting paper from his page, co-authored with Prof K, that you should read is The Human in Formal Methods. In it they write about porting the example-writing method we've been practicing to that domain. They also discuss a common learning mistake: students write programs before they have understood the problem, so they 'solve' the wrong problem and are misdirected from the learning goals. Since we have been writing examples and tests all along, we're a bit better at understanding the problem before writing the software, but this paper makes it clear we should probably write many more examples and tests, and even try finding invariants.

The mutually recursive datatypes are confusing even just to write examples for; to understand them, write out your own tree:

       a 
      /  \   
    b    none     
   /  \
  c    none
   \ 
    none

a has child b and none
b has child c and none
c has none
  • start with a:
  • person("a", No-children)
  • add in b:
  • person("a", child(person("b", No-children), No-children))
  • add in c:
  • e2 = person("a", child(person("b", child(person("c", No-children), No-children)), No-children))
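
For reference, these examples assume mutually recursive datatypes along these lines; the names are guessed from the examples above, not copied from the lecture code:

data Person:
  | person(name :: String, children :: Children)
end

data Children:
  | No-children
  | child(first :: Person, rest :: Children)
end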

Reading Chapter 7

This chapter talks about representations, which will come up in lecture 8 again. Cases are introduced here after we have used them multiple times, but now you know the meaning of the term data variant.

Reading Chapter 8

Notice the typo in the oldest-song() examples, using lvar instead of lver (always run the examples so they pass). In fact this chapter has a few errors, but of course they could be intentional, given that awesome disclaimer at the beginning of the book. You can rewrite oldest-song() with a single fold whose body is essentially: if (current-year - x.year) > (current-year - acc.year): x else: acc. Fold does the bookkeeping for you: whatever you return (such as x, the song) becomes the accumulator for the next element as it iterates over the list.
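
A sketch of that fold version, with a stand-in Song datatype (the chapter's real datatype may differ); comparing years directly is equivalent to comparing ages:

data Song:
  | song(name :: String, artist :: String, year :: Number)
end

fun oldest-song(songs :: List<Song>) -> Song:
  doc: "Folds over the list, keeping whichever song has the smaller year"
  fold(lam(acc, x): if x.year < acc.year: x else: acc end, songs.first, songs.rest)
where:
  oldest-song([list: song("a", "x", 1995), song("b", "y", 1973), song("c", "z", 2001)])
    is song("b", "y", 1973)
end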

Reading Chapter 9

This covers the recursive data definitions we saw in Lecture 7, and hand-steps through a recursive function; it seems like this should be in chapter 1. The design recipe template for recursive data definitions from the HtDP book is also shown. For the three exercises, cut and paste the recursive template and try filling it in to complete them; notice how easy it is, that's why they shill the design recipe.

data NumList:
  | nl-empty
  | nl-link(first :: Number, rest :: NumList)
end

#|
fun num-list-fun(nl :: NumList) -> ???:
  cases (NumList) nl:
    | nl-empty => ...
    | nl-link(first, rest) =>
      ... first ...
      ... num-list-fun(rest) ...
  end
end
|#

The HtDP method: first uncomment the template, then rename the function and the recursive call to whatever your function is called. Change the return annotation, such as -> Boolean. If you add a parameter, add it to the recursive call too. Now the function pretty much fills itself out. Try the exercises; I interpreted the very last one as:

data NumListList:
  | nll-empty
  | nll-link(first :: NumList, rest :: NumListList)
end

fun sum-of-lists(nl :: NumListList) -> NumList:
  doc: "We already wrote num-list-sum() as an exercise"
  cases (NumListList) nl:
    | nll-empty => nl-empty
    | nll-link(first, rest) =>
      nl-link(num-list-sum(first), sum-of-lists(rest))
  end
end
test = nll-link(nl-link(1, nl-link(3, nl-empty)), nll-link(nl-link(3, nl-link(4, nl-empty)), nll-empty))
sum-of-lists(test)

>>nl-link(4, nl-link(7, nl-empty))
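
For reference, the num-list-sum from the earlier exercise looks roughly like this:

fun num-list-sum(nl :: NumList) -> Number:
  doc: "Adds up every number in a NumList"
  cases (NumList) nl:
    | nl-empty => 0
    | nl-link(first, rest) => first + num-list-sum(rest)
  end
end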

Lecture 8 CS019 Sets

We're watching the Mon 9/24/18 lecture on sets. When he shortens insert() to equal link, try it in the Pyret REPL: insert = link, then insert(2, empty) produces [list: 2]. At the end of this lecture, discussing the big-O complexity of both representations, he points out that the 'inefficient' blue representation can be better depending on what you care about, like appending to a log you rarely look at. Note that some of the students had trouble figuring out the big-O; at this stage you're only expected to have a familiarity with costs, and true understanding will come later with more experience and during an algorithms book/class, which has analysis techniques we haven't been taught yet that make it much easier.

Reading Chapter 15

We haven't covered 15.2 yet so just reading 15.1, covers what we just saw in the lecture representing sets using lists. Again this is a good course because you have to manually implement everything yourself including built-ins such as all the set built-ins.

There's a link to the glossary, with interesting definitions in there such as links to a paper on coalgebras/coinduction. For the size() recurrence, remember the chapter Predicting Growth where common recurrences are listed: T(k) = T(k - 1) + k is \([k \rightarrow k^2]\) because its closed-form solution is \(\frac{k^2 + k}{2}\), hence T(d) = T(d - 1) + d is also \([d \rightarrow d^2]\).

The end of 15.1.3 notes that your program could keep running statistics about which representation of insert() to use and decide dynamically. The exercises here you can do in your head: think of types to give operations like subset(), or how remove is going to work. If the representation allows duplicates, remove must drop every copy of the element, which is easy to implement: when f equals the removed element, skip it and keep recurring on the rest. I'm not exactly sure about the last exercise; I assume they want us to consider that type LSet = List is not good enough, and to create an entirely new datatype along the lines of our nl-link, since the Set datatype is already built in.
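
A sketch of that remove, for the list-with-duplicates representation:

fun lset-remove(s :: List<Number>, elt :: Number) -> List<Number>:
  doc: "Drops every copy of elt by continuing to recur past each match"
  cases (List) s:
    | empty => empty
    | link(f, r) =>
      if f == elt:
        lset-remove(r, elt)
      else:
        link(f, lset-remove(r, elt))
      end
  end
where:
  lset-remove([list: 1, 2, 1, 3], 1) is [list: 2, 3]
  lset-remove(empty, 1) is empty
end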

Reading Chapter 12

This is a short, unfinished chapter on testing. In testing blocks we get some new binary and unary test operators, like expr1 is%(function) expr2 or expr1 satisfies function. See the examples in the documentation.

Assignment 5: Oracle

Let's build a testing oracle. There's a starter template we need to access the function they give us matchmaker() and is-hire()

# CSCI0190 (Fall 2018)
provide *
provide-types *

import shared-gdrive("oracle-support.arr",
  "11JjbUnU58ZJCXSphUEyIODD1YaDLAPLm") as O

##### PUT IMPORTS BELOW HERE ############


##### PUT IMPORTS ABOVE HERE ############

type Hire = O.Hire
hire = O.hire
is-hire = O.is-hire
matchmaker = O.matchmaker

# DO NOT CHANGE ANYTHING ABOVE THIS LINE

fun generate-input(num :: Number) -> List<List<Number>>:
  doc: 
  ```
  generates a list of candidates or companies for a group of size num
  ```
  empty
where:
  generate-input(0) is empty
end

fun is-valid(
    companies :: List<List<Number>>,
    candidates :: List<List<Number>>,
    hires :: Set<Hire>)
  -> Boolean:
  doc: 
  ```
  Tells if the set of hires is a good match for the given
  company and candidate preferences.
  ```
  true
where:
  is-valid(empty, empty, empty-set) is true
end

fun oracle(a-matchmaker :: (List<List<Number>>, List<List<Number>> 
      -> Set<Hire>))
  -> Boolean:
  doc: 
  ```
  Takes a purported matchmaking algorithm as input and outputs whether or
  not it always returns the correct response
  ```
  true
where:
  oracle(matchmaker) is true
end

Run the template then try to give matchmaker some inputs to see what it does:

companies =  [list: [list: 1, 0, 2], [list: 0, 2, 1], [list: 0, 2, 1]]
candidates = [list: [list: 1, 0, 2], [list: 1, 2, 0], [list: 0, 1, 2]]
# Note matchmaker writeup: it is 0 to n-1 indexed, so 3 lists means 0 - 2 

matchmaker(companies, candidates)
>>[list-set: hire(2, 1), hire(1, 0), hire(0, 2)]

First, let's understand the matching algorithm. We don't have to write one ourselves, but we do have to know how it works in order to test all the false matchmaker algorithms in our oracle. Wikipedia has a section on the Gale-Shapley algorithm, but as usual Wikipedia is incomplete or imprecise, so let's look at the actual paper. Scroll down to Theorem 1: There always exists a stable set of marriages, and look at the proof for the algorithm:

  • First round:
  • Companies all offer to their first preferred candidate
  • Candidates accept provisionally; if a more preferred offer arrives in a later round, they will reject this one
    • Rejected companies go back into a pool
  • Second round:
  • Companies left in the pool without a match all offer to their next preferred candidate
  • Candidates accept or reject according to their preferences
  • Later rounds:
  • Repeat, noting that no company ever offers to the same candidate twice

The 2019 assignment writeup contains a few hints, such as 'be cautious about using the provided matchmaker function in your oracle as this assumes that there is only one right answer to a given instance of the stable hiring problem'. I translate this to mean there can be many stable matchings for one instance of the problem, even though the Gale-Shapley algorithm produces the exact same matching every time when companies make the offers and candidates reject. What if the roles are reversed, with candidates making offers and companies rejecting: is the matching different then? We have two inputs into matchmaker, but it doesn't say which one makes the offers and which one rejects. Let's try swapping the inputs and see what happens:

companies =  [list: [list: 1, 0, 2], [list: 0, 2, 1], [list: 0, 2, 1]]
candidates = [list: [list: 1, 0, 2], [list: 1, 2, 0], [list: 0, 1, 2]]

>>matchmaker(candidates, companies)
[list-set: hire(2,2), hire(1,0), hire(0,1)]

>>matchmaker(companies, candidates)
[list-set: hire(2,1), hire(1,0), hire(0,2)]

They're different, so there is indeed more than one solution to the problem, depending on whether the bias is toward candidates or companies. So how are we going to test? Googling around I found an old book entirely on the Stable Marriage problem here. On page 8, Stability Checking, we are given a stability-checking algorithm, defined more precisely on page 7: a matching is unstable if some company C and candidate A prefer each other to the partners they were actually matched with, so check every company's preferences for such a pair. The assignment also wants us to repeatedly test matchmaker algorithms to ensure they still produce stable matchings over many random inputs.
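
Here's a sketch of that stability check. It assumes the hires have already been converted from a Set<Hire> to a list (e.g. with .to-list()), that preference lists go most-preferred first, and that the Hire fields are named company and candidate; those names are guesses, so check the real datatype in the oracle support file and adjust. It also assumes import lists as L in the template's import section.

fun rank(prefs :: List<Number>, who :: Number) -> Number:
  doc: "Position of who in a preference list; smaller means more preferred"
  cases (List) prefs:
    | empty => raise("not in preference list")
    | link(f, r) => if f == who: 0 else: 1 + rank(r, who) end
  end
end

fun is-stable(
    companies :: List<List<Number>>,
    candidates :: List<List<Number>>,
    hires :: List<Hire>) -> Boolean:
  doc: "No company and candidate both prefer each other over their assigned match"
  L.all(lam(h1):
      L.all(lam(h2):
          co = h1.company
          cand = h2.candidate
          # unstable if co prefers cand to its own match AND cand prefers co to its own match
          not((rank(companies.get(co), cand) < rank(companies.get(co), h1.candidate))
            and (rank(candidates.get(cand), co) < rank(candidates.get(cand), h2.company)))
        end, hires)
    end, hires)
end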

  • Some tests to implement:
    • check the hire pairs correspond with the input
      • no hire(7,1) if input is 3 companies
    • check n companies and n candidates are in n pairs
    • check for duplicate matchings
    • check they are actually type Hire
    • check they are stable as per the algorithm in that book
  • False matchmakers to try and fool your tests with:
    • shuffle the pairs randomly
    • return a duplicate matching
    • return empty/n-1 or n + 1 pairs
    • swap companies w/candidates
    • subtract one randomly from any pair or all pairs

This was my good-enough generate-input function for inputs up to about num = 25; beyond that it needs optimization, like using a dictionary or keeping a sorted accumulator to check for duplicates. (It assumes import lists as L among the template's imports.)

fun generate-input(num :: Number) -> List<List<Number>>:
  doc: 'generates a list of candidates or companies for a group of size num'
  fun test-for-dupes(input :: List<Number>) -> List<Number>:
    doc: 'Checks for duplicate prefs in the list'
    distinct-input = L.distinct(input)
    if (L.length(distinct-input) < num):
      test-for-dupes(distinct-input
          + map(lam(x): num-random(num) end, repeat(num - distinct-input.length(), 0)))
    else:
      input
    end
  end

  fun create-list(n :: Number) -> List<Number>:
    doc: 'Assembles list of distinct prefs'
    randomized = map(lam(x): num-random(n) end, repeat(n, 1))
    test-for-dupes(randomized)
  end

  if (num <= 0):
    # If empty, return empty
    empty
  else if num == 1:
    # a single participant's only possible preference is index 0
    [list: [list: 0]]
  else:
    # repeat gives a skeleton list to map over of size num
    map(lam(x): create-list(num) end, repeat(num, 1)) 
  end
where:
  generate-input(0) is empty
  generate-input(-1) is empty
  generate-input(1) is [list: [list: 0]]
  L.length(generate-input(3)) is 3
  # omitted a bunch of tests here
end

Lecture 9 CS019 Logarithmic Growth

We're watching the lecture titled Wed/9/26/18 here. This is a light lecture just explaining the intuition behind logarithmic complexity, to set up balanced binary search trees. At 7:10 he explains logarithms: "A (base-10) logarithm of a number is essentially how many digits a number has. That's it". He then explains it in terms of virtual machine costs: with a logarithmic algorithm, if the input grows to a million items you only pay about 6 units of cost, since the base-10 log of a million is 6.

Lecture 10 CS019 Balanced Binary Search Trees

This is lecture Fri/9/28/18. We discover the entire last lecture was just trolling: it doesn't actually give us logarithmic time, because you can only throw away half the data at each step if the tree is balanced. This whole lecture is about understanding how you can balance a tree. There are more lectures on trees coming up.

Reading Chapter 17.2

All of this was covered in the lecture, including most of the exercises. There's an interesting writeup about how hashing functions work.

Lab: Tree traversal

Let's try the CS019 lab for trees. The first part is just thought exercises then the lab offers us stencil code for the BT tree lab sections:

#  _____                _                                      _ 
# |_   _| __ ___  ___  | |_ _ __ __ ___   _____ _ __ ___  __ _| |
#   | || '__/ _ \/ _ \ | __| '__/ _` \ \ / / _ \ '__/ __|/ _` | |
#   | || | |  __/  __/ | |_| | | (_| |\ V /  __/ |  \__ \ (_| | |
#   |_||_|  \___|\___|  \__|_|  \__,_| \_/ \___|_|  |___/\__,_|_|

data BTree<T>:
  | mt
  | node(value :: T, left :: BTree<T>, right :: BTree<T>)
end

fun btree-in-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via an in-order traversal"
  ...
where:
  nothing
end

fun btree-pre-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via a pre-order traversal"
  ...
where:
  nothing
end

fun btree-post-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via a post-order traversal"
  ...
where:
  nothing
end

First make a tree, run it, and type 'a-tree' into the Pyret interpreter; click the output to see the right and left branches, or use a-tree.left and a-tree.right:

a-tree =
  node(5,
    node(4, 
      node(3, mt, mt), mt),
    node(6, node(7, mt, mt), node(8, mt, mt)))

>> a-tree.left
node(4, node(3, mt, mt), mt)

>> a-tree.right
node(6, node(7, mt, mt), node(8, mt, mt))

>> a-tree.value
5

The lab wants us to write various traversals. Here's what I wrote for btree-in-order, which is basically the other two orders as well: just move where the node's value goes relative to helper(l) and helper(r) inside the helper:

data BTree<T>:
  | mt
  | node(value :: T, left :: BTree<T>, right :: BTree<T>)
end

a-tree =
  node(5,
    node(4, 
      node(3, mt, mt), mt),
    node(6, node(7, mt, mt), node(8, mt, mt)))

# degenerate tree from PAPL chapter 17.2.2
b-tree = node(1, mt,
  node(2, mt,
    node(3, mt,
      node(4, mt, mt))))

fun btree-in-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via an in-order traversal"

  fun helper(n :: BTree<A>) -> List<A>:
    cases (BTree) n:
      | mt => empty
      | node(v, l, r) =>
        helper(l) + link(v, helper(r))
    end
  end

  cases (BTree) tree:
    | mt => empty
    | node(v, l, r) =>
      helper(l) + link(v, empty) + helper(r)
  end
where:
  btree-in-order(a-tree) is [list: 3, 4, 5, 7, 6, 8]
  btree-in-order(b-tree) is [list: 1, 2, 3, 4]
end

Another lab where we rewrite map/filter/fold. Below, when you get to Prof K's lecture on knowledge transfer, you'll understand why he makes us do this over and over. This is my first attempt at map; I wrote it the way he writes software on the board in class, thinking through the data definition:

data BTree<T>:
  | mt
  | node(value :: T, left :: BTree<T>, right :: BTree<T>)
end

a-tree =
  node(5,
    node(4, 
      node(3, mt, mt), mt),
    node(6, node(7, mt, mt), node(8, mt, mt)))

# degenerate tree from PAPL chapter 17.2.2
b-tree = node(1, mt,
  node(2, mt,
    node(3, mt,
      node(4, mt, mt))))

fun btree-map<A, B>(f :: (A -> B), tree :: BTree<A>) -> BTree<B>:
  doc: "Recursively applies f to the value of every node contained in tree"

  cases (BTree) tree:
    | mt => mt
    | node(v, l, r) =>
      node(f(v), btree-map(f, l), btree-map(f, r))
  end
where:
  btree-map(lam(x): x + 1 end, a-tree) is node(6, node(5, node(4, mt, mt), mt), 
    node(7, node(8, mt, mt), node(9, mt, mt)))
  btree-map(lam(x): x + 1 end, b-tree) is node(2, mt, node(3, mt, node(4, mt, node(5, mt, mt))))
end

>>btree-map(lam(x): x * x end, a-tree)
node(25, node(16, node(9, mt, mt), mt), node(36, node(49, mt, mt), node(64, mt, mt)))

I was actually surprised when it worked the first time; the magic of recursion. You can figure out filter and fold. Note that for btree-fold the lab wants you to convert the tree to a list, apply f to every element, and produce a single value just like regular fold does, with a warning about complete testing, meaning every behavior of regular fold should be replicated here.
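
The lab's exact signature isn't shown here, so this sketch assumes the usual (B, A -> B), B, BTree<A> -> B shape and folds the tree directly (left subtree, then node, then right subtree) instead of converting to a list first; a-tree and b-tree are the trees defined above:

fun btree-fold<A, B>(f :: (B, A -> B), base :: B, tree :: BTree<A>) -> B:
  doc: "Folds f over every value in tree, threading one accumulator through"
  cases (BTree) tree:
    | mt => base
    | node(v, l, r) =>
      btree-fold(f, f(btree-fold(f, base, l), v), r)
  end
where:
  btree-fold(lam(acc, x): acc + x end, 0, a-tree) is 33
  btree-fold(lam(acc, x): acc + x end, 0, b-tree) is 10
end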

Assignment 6: Filesystem

I interpret the assignment to mean this directory tree/model:

# this is missing from my template:
type Path = List<String>

# you could also omit the stored size field and compute it from the content instead, so c = file("name", "text") and c.size would give 4, the length of the content string "text".
filesystem = (dir("TS", [list:
        dir("Text", empty, [list: file("part1", 99, "content"), file("part2", 52, "content"), file("part3", 17, "content")]),   
        dir("Libs", [list: 
            dir("Code", empty, [list: file("hang", 8, "content"), file("draw", 2, "content")]), 
            dir("Docs", empty, [list: file("read!", 19, "content")])], empty)],
      [list: file("read!", 10, "content")]))

Recall that Pyret allows you to write methods for a datatype. Here's an example for du-dir(), where I wrote a .size() method you can call on any directory:

provide *
provide-types *

data File:
  | file(name :: String, size :: Number, content :: String)
end

data Dir:
  | dir(name :: String, ds :: List<Dir>, fs :: List<File>) with:
    method size(self :: Dir) -> Number:
      doc: "Returns combined file size, file count, and sub directory count of a Dir"
      size(self)
    end
end

fun size(d :: Dir) -> Number:
  doc: "Method .size()"

  fun fsize(fd :: Dir) -> Number:
    doc: "Returns file size, and number of files in a Dir"
    fd.fs.foldl(lam(f :: File, acc :: Number): 1 + f.size + acc end, 0)
  end

  fsize(d)
    + map(lam(x :: Dir): 1 + size(x) end, d.ds).foldl(lam(elt, acc): elt + acc end, 0)
end
check:
  a = dir("TS", [list:
      dir("Text", empty, [list: file("part1", 1, "content")])], empty)

  b = dir("Libs", [list: 
      dir("Code", empty, [list: file("hang", 1, "content"), file("draw", 1, "content")]), 
      dir("Docs", empty, [list: file("read!", 1, "content")])], empty)

  c = dir("TS", [list:
      dir("Text", empty, [list: file("part1", 1, "content"), file("part2", 1, "content"), file("part3", 1, "content")]),   
      dir("Libs", [list: 
          dir("Code", empty, [list: file("hang", 1, "content"), file("draw", 2, "content")]), 
          dir("Docs", empty, [list: file("read!", 1, "content")])], empty)],
    [list: file("read!", 1, "content")])

  d = dir("TS", [list:
      dir("Text", empty, [list: file("part1", 99, "content"), file("part2", 52, "content"), file("part3", 17, "content")]),   
      dir("Libs", [list: 
          dir("Code", empty, [list: file("hang", 8, "content"), file("draw", 2, "content")]), 
          dir("Docs", empty, [list: file("read!", 19, "content")])], empty)],
    [list: file("read!", 10, "content")])

  e = dir("TS", empty, [list: file("", 1, "")])

  a.size() is 3
  b.size() is 8
  c.size() is 19
  d.size() is 218
  e.size() is 2
end

This is all in the Pyret documentation, and I looked at the Pyret source on GitHub to see how to attach multiple methods to one datatype:

data Dir:
  | dir(name :: String, ds :: List<Dir>, fs :: List<File>) with:
    method size(self :: Dir) -> Number:
      doc: "Returns the size of all sub directories"
      size(self)
    end,
    method path(self :: Dir, s :: String) -> List<Path>:
      doc: "Takes a Dir, and a search string, returns the path"
      path(self, s)
    end
end

For the path method there are numerous ways you could architect this, like passing a list around between functions: if a file is found, link its Dir.name onto the path; or link every directory name as you descend and, if the file isn't found and that directory has no subdirs, remove its Dir.name from the path list before returning. Another way is to write a findf()-style search that reports the directory containing the file, then another function that crawls the subdir lists to assemble the path to that directory.
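
For what it's worth, here's a sketch of the depth-first version that returns just the first matching path as an Option (the real method wants every matching path, but the recursion has the same shape). It assumes import lists as L and the Dir/File datatypes above:

fun find-path(d :: Dir, s :: String) -> Option<List<String>>:
  doc: "some(list of dir names down to the dir holding a file named s), or none"
  if L.any(lam(f): f.name == s end, d.fs):
    some([list: d.name])
  else:
    # search each subdirectory; keep the first one that finds the file
    cases (Option) L.find(is-some, map(lam(sub): find-path(sub, s) end, d.ds)):
      | none => none
      | some(o) => some(link(d.name, o.value))
    end
  end
end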

Common mistakes/bugs

I do these exercises on a phone whenever I have spare time (commuting, lunch break, randomly get an idea somewhere) and often repeated the same annoying mistakes you may also run into:

  • Using [list: ] instead of 'empty' which breaks everything in Pyret
  • Not reading requirements carefully enough, we should probably do some materials on requirements later
  • Not writing type annotations for parameters. Try the code.pyret.org type check + run feature.

Lecture 11 CS019 Streams

Watching the lecture titled Mon/10/01/18. In the ones example, every time you invoke 'ones' you get lzlink(1, <thunk>), which is one, plus a way to get more ones if called again. We're asked to write our own map; you could also try it with cases:

fun maps<A, B>(f :: (A -> B), s :: Lz<A>) -> Lz<B>:
  cases (Lz<A>) s:
    | lzlink(e, r) => lzlink(f(e), {(): maps(f, rst(s))}) 
  end
end

fun maps2<A, B, C>(f :: (A, B -> C), s1 :: Lz<A>, s2 :: Lz<B>) -> Lz<C>:
  cases (Lz<A>) s1:
    | lzlink(e, r) => 
      cases (Lz<B>) s2:
        | lzlink(ee, rr) => lzlink(f(e, ee), {(): maps2(f, rst(s1), rst(s2))})
      end
  end
end
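
These examples lean on definitions from the lecture that aren't reproduced here. Roughly, and with names guessed from the board, they look like the following; paste it above maps and maps2 so everything resolves:

data Lz<T>:
  | lzlink(first :: T, rest :: ( -> Lz<T>))
end

fun rst<T>(s :: Lz<T>) -> Lz<T>:
  doc: "Force the rest thunk to get the next piece of the stream"
  s.rest()
end

fun take<T>(s :: Lz<T>, n :: Number) -> List<T>:
  doc: "First n elements of the stream, as a list"
  if n <= 0: empty
  else: link(s.first, take(rst(s), n - 1))
  end
end

fun nats-from(n :: Number) -> Lz<Number>:
  lzlink(n, {(): nats-from(n + 1)})
end
nats = nats-from(0)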

>>s2 = maps(lam(x): x - 1 end, nats)
>>take(s2, 5)
[list: -1, 0, 1, 2, 3]

>>s3 = maps2(lam(x, y): x + y end, nats, nats)
>>take(s3, 5)
[list: 0, 2, 4, 6, 8]

The type-variable annotations he used, like maps<A,B,C>(f :: (A -> B), ...) -> Stream<C>, are covered in the streams chapter of the text and in Chapter 27.

Reading Chapter 13

The chapter on streams explains some calculus notation by converting it into a Pyret function. The symbol \(\varepsilon\) is epsilon, a small positive quantity, and "functions of arity one" means functions with one parameter. There's a for loop used in the check block, which is an iterator. We also learn more about the ( -> ) annotation, meaning a function that takes no arguments.

For the fibonacci example, change the starting values around so you can see what's going on: you first lz-link 0 and 1, then those values are mapped over to produce a new value, which gets mapped over again, and so on.
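
Using the lecture-style names from above (lzlink, rst, take) plus the maps2 we already wrote, the same idea comes out roughly like this; a sketch, not the book's exact code:

rec fibs = lzlink(0, {(): lzlink(1, {(): maps2(lam(a, b): a + b end, fibs, rst(fibs))})})

>> take(fibs, 8)
[list: 0, 1, 1, 2, 3, 5, 8, 13]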

Lab: Streams

The streams lab is here, mostly adapted from the chapter we just read in PAPL. Everything here we've already seen. We already wrote map and map2, try to write your own filter/fold for streams:

fun lz-fold<A, B>(f :: (A, B -> A), base :: A, s :: Stream<B>) -> Stream<A>:
  lz-link(f(base, lz-first(s)), lam(): lz-fold(f, f(base, lz-first(s)), lz-rest(s)) end) 
end

fun lz-filter<A>(f :: (A -> Boolean), s :: Stream<A>) -> Stream<A>:
  if f(lz-first(s)):
    lz-link(lz-first(s), lam(): lz-filter(f, lz-rest(s)) end)
  else:
    lz-filter(f, lz-rest(s))
  end
end

>> take(10, lz-filter(lam(x): if x > 10: true else: false end end, nats))
[list: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

>> take(10, lz-fold(lam(x, y): x + y end, 0, nats))
[list: 0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

Lecture 12 CS019 Model View Controllers

This is lecture Wed 10/3/18. You can't see the entire whiteboard but it doesn't matter; this is a lecture about separation of concerns when building large systems, and how to embody the essence of a system in a model that is common to all parts of the system. He uses an animation demo; 'ask' is Pyret's cond form, covered in chapter 11, which is optional reading.

Lecture 13 CS019 Differential Systems

Lecture Fri 10/05/18. Using same reactor model to make a point about differential systems. Interesting examples about PID controllers/reactive systems, event loops, checkpoints of the state of the system.

Assignment 7: Updater

If you did the tree lab earlier, this assignment is easy to prototype and then gets harder as you optimize; like the writeup suggests, make a working prototype first, then optimize. Write your own datatype for Cursor<A>: for the variants, write a node index, the children/subtrees of that node, and another variant for everything else in the tree 'above' the cursor. The Lists library has helpful built-ins such as get(), set() and filter to change the list of subtrees in mytree.children. The analysis they want was already practiced in big-O lab 2: write up a formal analysis describing the run-time complexity of each operation. Can you describe how your own code works and its runtime complexity? That's all this is asking.

The Option datatype is explained in the Pyret documentation; it's used when a function may not return a value. Notice the cases on Option: some(a) carries the value 'a', and for none you return whatever value you want instead of nothing, like 0 or the string 'empty'.
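
A tiny example (first-even is a hypothetical helper, not from the assignment):

import lists as L

fun first-even(l :: List<Number>) -> Number:
  doc: "The first even number in l, or -1 if there isn't one"
  cases (Option) L.find(lam(x): num-modulo(x, 2) == 0 end, l):
    | some(a) => a
    | none => -1   # whatever stand-in value you want instead of nothing
  end
where:
  first-even([list: 3, 5, 8, 9]) is 8
  first-even([list: 3, 5]) is -1
end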

Lecture 14 CS019 Aliasing

Lecture Wed 10/10/18. A great lecture about memory aliasing; it introduces directed acyclic graphs, which is what git is.

Reading Chapter 17

Chapter on sharing and equality, covers what we just did in lecture.

Lecture 15 CS019 Monads, Sequential Computation

He's going over some BTree code handed out on Piazza (we don't have access), but we've seen all this code before; it's just a BTree. Note that check blocks can have strings, so when you run a test you know which one failed, such as check "if leaves are zero": …test.. end. We discover a design pattern that abstracts out this way of carrying information along a computation, and it's called monadic programming.

The end of the lecture he walks through the code of a pyret library he wrote, explaining all the complicated types and notation, you can use this to look up the source yourself when doing assignments and read the code.

Assignment 8: Continued Fractions

A stream assignment. We've already done most of it in the stream labs and the book/lectures. The big-O complexity analysis request is just asking you to understand what you built: can you explain its runtime complexity and whatever performance tradeoffs you made, just like we did in the Updater assignment? This is a skill you build over time through practice, learning how to write better software.

This is a highly useful exercise, you will probably come across streams like this for approximations in a lot of software.

Streams in Python

Try Numerical Methods w/Functional Programming in Python; it shows how to create the same kind of streams to refine a result to better and better accuracy. To understand the weird init syntax see this, and note the video button for a short YouTube tutorial.

Assignment 9: JoinLists

We find out what reduce is in this assignment (we will write map/reduce in future assignments), and get an introduction to writing code for parallel architectures. If you read the lecture notes on trees from COMP212, they have some insight into split/merge. Whenever you see 'elt' in assignments or the Pyret docs, it means element.

We are almost done with the assignments; only MapReduce, Tour Guide, MST and Fluid Images are left, as we don't have access to '24'. These assignments are ridiculously hard for an introductory course, but the point is that you spend a lot of time writing code and figuring it out yourself. This is how you get better at programming, through practice: you build something, and after you build it you have a much better idea of what you truly needed to do.

Try going back to some of the earlier assignments you wrote and redoing them. For example, in the upcoming MapReduce assignment you get to revisit Nile: my first implementation was so convoluted and unreadable that I rewrote Nile from scratch to work with the MapReduce assignment, and it was very easy the second time around. Building it the first time was really just fleshing out the problem; the second time produced the software I actually wanted.

The Art of Computer Programming

This method of writing a draft and throwing it out as you learn more about the program you want to write is explained in The Art of Computer Programming, Vol 1. Go on libgen and get any draft of the MMIX fascicle or a copy of Vol 1, and look at 1.4.1 (page 63 in the MMIX fascicle). Knuth talks about how you would go about writing 'a complex and lengthy program'. This is the Dijkstra method from Structured Programming in 1972. Nobody writes programs this way anymore, but it's still an interesting read. Note that he sketches everything on paper, figuring out all the program logic and splitting it into smaller and smaller pieces, or helper functions. After the strategic part is done he writes the easiest subroutine first and works up the chain, until enough is known about the problem that everything can be scrapped and redesigned.

This is how I do these assignments: pick the easiest piece of functionality in updater or nile or joinlists, write out examples (tests), sketch the logic from the examples, and work up from there. For me it's easier to plan the program out on paper first; it makes these assignments much faster to write. If you find yourself sitting in the Pyret interpreter for hours, confused and debugging a huge amount of code, consider writing a draft first, spending most of your time there and the least amount of time entering and running the code.

Lecture 16 CS019 Graphs

This is lecture Mon 10/15/18. We're going to do a month of graphs, which is great since graphical models are heavily used in AI and software; even a tree is really just a (degenerate) graph. We see that all the assignments we've done so far could have been encoded using a graph. In the last half of the lecture he hands out an anonymous quiz to test the students' abilities, then talks about how education research works, like how courses are revised to make assignments more effective. The lecture resumes at 49:05.

Our assignment is to read the chapter on graphs up to 20.2.4 but all this will be covered in the next few lectures.

Lecture 17 CS019 Graphs II

In lecture Wed 10/17/18 the graph node code is slightly cut off, but this is all in the book, and it's only there to demonstrate that you need a 'level of indirection' key, since the content itself can be a duplicate. He goes on a tangent about impersonation attacks on social media.

For some reason many students were confused about found() in the lecture, probably because it looks like a parameter to reach() on the board. They are two different functions: reach(g :: Graph, nl :: Node) -> Boolean, and found(node :: Node) -> Boolean is a separate helper that reach calls as it traverses the graph, feeding it each node. You'd write found() within the scope of reach so it can access reach's nl parameter and compare every node it's given against nl.
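
This isn't the lecture's code (the Graph type isn't shown here); it's just a minimal illustration of that scoping, with found() defined inside reach() so it can read nl:

fun reach(nodes :: List<String>, nl :: String) -> Boolean:
  fun found(node :: String) -> Boolean:
    node == nl   # nl is visible because found is defined inside reach's scope
  end
  cases (List) nodes:
    | empty => false
    | link(f, r) => found(f) or reach(r, nl)
  end
where:
  reach([list: "a", "b", "c"], "b") is true
  reach([list: "a"], "z") is false
end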

On complexity for graphs, the viewer is asked to think about what the complexity of graph operations should be.

Lecture 18 CS019 Graphs III

Lecture Fri 10/19/18; the video starts postage-stamp size and becomes regular size at 04:02. The first 20 minutes are students figuring out how to architect a reach function. The reach-3 function he walks through is in PAPL 20.4.4 and he answers a lot of questions about it. This whole lecture is about figuring out whether reach-3 is correct and will actually terminate.

How to teach yourself to transfer skills

Prof Shriram Krishnamurthi from CS019 gives a talk here about us, the students of these courses, and how bad all of us are at transferring the skills we've been taught. He uses an example from his CS1730 class (which uses PAPL): students are given an unsolvable problem they fail to recognize as such, even though they've all had prior classes on decidability. He gives advice on how to achieve transfer that you can try yourself, such as spiral curricula: learn the same things at different levels of complexity. For example, after completing CS019 you might take Cornell's CS3110, or the lambda calculus, or see programming done entirely in math notation, to transfer to higher abstractions; or take something like The Art of Computer Programming by Knuth or CMU's 15-213 course to transfer your CS019 knowledge down to a more concrete assembly-language level. In other words you iterate over your education numerous times, picking up things you may not really have gotten the first time, or seeing concepts transferred from abstract to concrete, like balancing a tree in assembly. Maybe you even take an entirely different subject, transferring these skills to computational biology.

Another method to achieve transfer is what we'll try below, competitive programming.

Competitive programming

While we finish CS019, we need to learn some C++ libraries to distribute a TensorFlow algorithm over a huge number of servers, we should probably learn some Java libraries, and we need to fill in gaps in our understanding of algorithms and complexity analysis. We also need to apply all of this concretely, over and over: read theory, apply, repeat.

I'm going to work through the first volume of Competitive Programming 4, which is used in this class, because it addresses all the concerns Prof K raised in the lecture above on transferring skills. In the book a concept is taught and then explicitly transferred by working through a lot of curated problems they recommend on popular competitive programming sites. You get to see the concept in many different representations, which makes it different from every other competitive programming resource. I had to buy it since there isn't a digital copy available yet, but version 1 is free on their site, and version 3 can be found on libgen. Kattis and the other online judges the book uses are free. We will use other resources, not just this book, so it doesn't really matter which version you have.

GeoHot, who says you shouldn't waste your time grinding leetcode, admits they still use HackerRank-style questions to filter candidates, and recommends competitive programming as the best way to learn, at least until you get so good at it that you start to waste your time.

Let's look at the homework assignments for CMU's 15-451 class