Accelerated Intro to CS

Table of Contents

Intro

This is a no-credit self-study of:

These 3 courses were originally designed by Shriram Krishnamurthi, who has a unique lecture style and builds curricula based on education research. See his talk here about curriculum design as an engineering problem and the failure of knowledge transfer, defined to mean a skill learned in one context being used successfully in another: "Even the students who have taken theory of computation (decidability) can't recognize the unsolvable problem". I mix some USACO training into CS19 because the course is advanced enough to handle many USACO Gold problems, so why not.

This is just for fun: it's a state-of-the-art algorithm design class to see how it's done and what the current fastest runtimes are. If you're interested in AI or algorithm design you can do this after CS19.

Accelerated Intro to CS

This is 2 semesters of introductory computer science compressed into a single course.

Materials

  • The book A Data-Centric Introduction to Computing
  • Lectures & recitations on YouTube, torrent file if needed
  • Pyret online interpreter (CPO) to run code in your browser
  • Assignments
  • Labs at the bottom of this page (Big-O, Streams, etc.)
    • A similar intro course here w/open lectures uses the same book (doesn't cover the advanced chapter)

Lecture 1 Intro

Watch the cs19 lecture titled Wed 9-5-18 or here.

Go to code.pyret.org (CPO) and click Open Editor as you follow along with the lecture; he teaches you the basics of the online IDE. If you make a mistake like accidentally deleting all your code, Ctrl-Z will undo in most browsers (⌘Z on macOS). Use a Google Drive account to save work. Anything you don't understand in this lecture is in the first few chapters of the book.

Try running this:

# sample function with tests
fun f(a, b):
  a + b
where:
  f(1, 2) is 3
  f("a", "b") is "ab"
  f([list: 1], [list: 2]) is [list: 1, 2]
end

# same function but with type annotations
fun g(a :: Number, b :: Number) -> Number:
  a + b
end

check "testing g w/types":
  g(1,2) is 3
  # this will fail the type check
  g("a", "b") is "ab"
end

We don't have access to Examplar but it gives you an idea how many quality examples he expects. You can search for the Fermi problems he talks about if you want practice approximating. An order of magnitude is a power of 10, and for seemingly impossible Fermi questions a good approximation is within one or two multiplications or divisions by 10 of the true answer.

DCIC Book

Let's read Basic Data (Chapter 3). Have the Pyret editor open as you read and play around with all the code.

If you see an exercise box, definitely try it, such as 3.1.6.1 Combining Images. When you are first starting out it's sometimes easier to write the entire expression as a single line:

# overlay(top-shape, floor)
overlay(circle(25, "solid", "yellow"), rectangle(50, 50, "solid", "blue"))
# above(top, bottom)
above(circle(25, "solid", "red"), rectangle(30, 50, "solid", "blue"))

The Target logo exercise

(circle(15, "solid", "red"))
(circle(25, "solid", "white"))
(circle(35, "solid", "red"))

overlay(circle(15, "solid", "red"),
  (overlay((circle(25, "solid", "white")), (circle(35, "solid", "red")))))

# using named values (chapter 3.2)
center = (circle(15, "solid", "red"))
middle = (circle(25, "solid", "white"))
edge = (circle(35, "solid", "red"))

# overlay(top-shape, floor)
overlay(center, (overlay(middle, edge)))

You want to read chapter 5 next, where lists, lambdas, and higher-order functions are introduced.

Pyret Documentation

Let's learn how to read documentation by going through all of Language Concepts. Try everything in the editor.

Primitives/literals

Starting with Names. If you've never seen a regular language before look up Finite Automata on YouTube.

^[_a-zA-Z][_a-zA-Z0-9]*(?:-+[_a-zA-Z0-9]+)*
  • ^ anchors the match at the beginning of the string, so the first character must match whatever is in [ ]
    • underscore or alpha char
  • [ ]* match whatever is in the bracket 0 or more times
  • + match the preceding character 1 or more times
  • [ ]+ match the characters in the bracket 1 or more times
  • (?: stuff)* groups a pattern and applies * to the whole group, meaning there can be 0 or more of whatever is in the (?: )* grouping.

Reading the entire regex, this says names must begin with a single underscore or letter (a-z or A-Z); everything after that is optional, except names can't end with a hyphen. The Pyret convention is that value names are kebab-case-names (because the words sit on a skewer) and type annotation names are CamelCase (because camels have 2 humps, ehhh).
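Some name examples to try in the editor (my own picks; the commented-out ones should fail to parse):

# valid names per the regex
x = 1
_tmp = 2
my-list-2 = 3
a--b = 4   # the -+ in the regex allows repeated hyphens mid-name

# invalid names, uncomment one to see the error
# 2nd = 5    # can't start with a digit
# -x = 6     # can't start with a hyphen
# ends- = 7  # can't end with a hyphen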

Type the string literals into the Pyret REPL to see how the escape char '\' works. It tells Pyret not to give the character after \ any special meaning. Try escaping the backslash char itself: 'escape \\ backslash' evals to the string escape \ backslash. Try print('escape \\ backslash').

Another regex in number literals:

^[-+]?[0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?
  • [ ]? match 0 or 1 time

A num-expr (number expression) may start with at most one sign (- or +), which must come at the very beginning. The [ ]+ we know means match 1 or more digits, and then there are two more optional groupings: a fractional part and an exponent.

(sign)(digit).(digit)e(- or +)(digit)

#move decimal right 2
1.01E2
1.01E+2
# move decimal left 2
1.01E-2
# Exactnum type
-1/33
-0/10
# Roughnum type
~-1
~+1

Comments

Reading comments. The Pyret Style Guide has suggestions on what to put in doc strings/comments.

Programs

The following means:

‹program›: ‹prelude› ‹block›
‹prelude›: [‹provide-stmt›] [‹provide-types-stmt›] (‹import-stmt›)*

A program consists of a prelude (an optional provide statement, an optional provide-types statement, and zero or more import statements; notice the *) and a block, which we'll soon learn is just a sequence of operations.

Reading import statements: they exist so we can bring in one of these libraries or our own, where a library is a bunch of related programs grouped into a module. In the Pyret editor type 'sum' in the REPL and run it; the name 'sum' is unbound. Let's import the math library:

include math
sum([list: 0, 1, 1])

# or import and bind to a name
import math as M
M.sum([list: 0, 1, 1])

Reading provide statements: these are for writing your own library and choosing which programs/types you wish to export. Try it yourself; first log in with any throwaway Google Drive account so you can save files.

# provide{external name : local name}
provide {add2 : add-two} end
fun add-two(n):
  n + 2 
end

Save this as something.arr then click publish, open a new file and import your program by cut+pasting the import statement given to you after publishing:

import shared-gdrive("add2.arr", "an ascii string") as A

Run what you pasted and type A.add2 in the REPL and it works; A.add-two gives the error 'The module A has no provided member add-two' because we exported it under the name add2.

In the Pyret editor there's greyed-out text at the top that says 'use context starter2024' or similar, depending on when you're reading this. This is explained here: it's a way to automatically import certain libraries into, or restrict others from, the default namespace by using provide statements. It seems to me the starter2024 context allows all the list functions, strings, numbers and the images library.

Bindings

Reading bindings. Check blocks are used here; run them to see how they work. A tuple is a data structure; try running this and play around with it in the REPL:

{{w; x} as wx; {y; z} as yz} as wxyz = {{~5; true}; {"hello"; 4}}
# {w;x} = wx
# {y;z} = yz
# bind both to wxyz
w
x
y
"wx"
wx
"yz"
yz
"wxyz"
wxyz

Blocks

Reading blocks: there is no return statement in Pyret, as the last statement in a block that evals to a value is the returned value.

# this top level is fine
a = "test"

# this throws warning
# 'blocks should end with an expression'
block:
  b = "test"
end

Declarations

Reading declarations. We will see the recursive let-binding in one of the lectures on streams. It means you can use the name of the let-binding inside the definition itself; otherwise you have to bind something first and only afterwards use its name. Turns out every function is a rec let-binding.
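A quick sketch of rec in action (fact is my own example name); a plain binding can't refer to itself, but a rec binding can:

# fails: fact is unbound inside its own definition
# fact = lam(n): if n <= 1: 1 else: n * fact(n - 1) end end

# works: rec lets the right-hand side refer to the name being bound
rec fact = lam(n): if n <= 1: 1 else: n * fact(n - 1) end end

check:
  fact(5) is 120
end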

Read scope. A where block is much like an 'expect test' in other languages: it exists so testing doesn't fall out of sync with the code, as where blocks are always run after the function declaration is executed by Pyret.

Data Declarations

Try running this in the editor:

data MyList<a>:
  | myempty
  | mylink(first :: a, rest :: MyList<a>)
end

t = mylink(1, myempty)
t
is-mylink(t)
t.first
t.rest

There isn't a program we wrote named 'mylink'; it is one of the constructors of that data type, along with myempty. It has an abstract type 'a' which is bound to whatever type it's given, so it can hold strings, lists, anything. Methods are introduced here: they are .function() style calls on a data type that consume the data type as input. Per this chapter on declarations we can write a method to push an element onto a MyList, and even a constructor for [mylist: 1, 2] syntax:

data MyList<a>:
  | myempty
  | mylink(first :: a, rest :: MyList<a>)
sharing:
  method my-push<b>(self, elt :: b) -> MyList<b>:
    doc: "Pushes an element to the front of a MyList"
    mylink(elt, self)
  end
end

# taken from expressions chapter
# creates [mylist: elt, elt..]
mylist = {
  make0: lam(): myempty end,
  make1: lam(a): mylink(a, myempty) end,
  make2: lam(a, b): mylink(a, mylink(b, myempty)) end,
  make3: lam(a, b, c): mylink(a, mylink(b, mylink(c, myempty))) end,
  make4: lam(a, b, c, d): mylink(a, mylink(b, mylink(c, mylink(d, myempty)))) end,
  make5: lam(a, b, c, d, e): mylink(a, mylink(b, mylink(c, mylink(d, mylink(e, myempty))))) end,
}
[mylist: "a", "b"].my-push("c")

Var assignables will come up much later in lecture, as will type declarations; you can skip these.

Expressions

Reading expressions: lightly skim this to know where to find things later. If you read 'chaining application' then you can now read the Pyret source code here, as they make frequent use of it. The ^ operator is another way to write function application: x ^ f means f(x), so f(g(x)) can be written x ^ g ^ f.
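A small sketch of chaining application (add-one and double are my own helper names):

fun add-one(n :: Number) -> Number: n + 1 end
fun double(n :: Number) -> Number: n * 2 end

check:
  5 ^ add-one is 6
  # f(g(x)) reads left to right as x ^ g ^ f
  5 ^ add-one ^ double is double(add-one(5))
  5 ^ add-one ^ double is 12
end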

The cases expression splits a data type up into its variants, the way mylist splits into myempty and mylink(first, rest); we will use it extensively in the course.

Testing

Read testing as the course goes; most of this will be explained in lecture.

Equality

The equality documentation is fully explained in lecture later.

Combining Ops

Make sure you know Pyret disallows mixing operators without using parens to define precedence.
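For example, try this in the editor:

# 1 + 2 * 3 is a syntax error, Pyret won't guess precedence for you
check:
  1 + (2 * 3) is 7
  (1 + 2) * 3 is 9
end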

Spies

Read spies: a way to debug by printing, if you want to see the current value of something annoying you with errors.
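A minimal sketch of a spy (hypotenuse is my own example):

fun hypotenuse(a :: Number, b :: Number) -> Number:
  # prints the values of a and b with a label each time the function runs
  spy "inputs": a, b end
  num-sqrt(num-sqr(a) + num-sqr(b))
end
hypotenuse(3, 4)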

Rewrite the lists library

During the summer break before the semester starts, students would do a 6-week placement using the prof's other book HtDP. We can mimic this by rewriting some of the List library functions found here in the Pyret Documentation 3.7.4 List Functions; many of the course assignments have us rewriting them anyway.

Read the book chapter on lists and how to process them. There's a recorded recitation on how recursion works, how the cases syntax in Pyret works, and how lambdas work but all of this is in the book too.

get()

Try writing get. Cut and paste the function signature and examples from the Pyret documentation. The three dots tell Pyret this is a template; you can still write tests and run them to see if at least the tests are well formed:

fun my-get<A>(lst :: List<A>, n :: Number) -> A:
  doc: "Returns the nth element of the given list, or raises an error if n is out of range"
  cases (List) lst:
    | empty => ...
    | link(f, r) => ...
  end
where:
  my-get([list: 1, 2, 3], 0) is 1
  my-get([list: 1, 2, 3], 2) is 3
  my-get([list: 1, 2], 2) raises "n too large"
  my-get(empty, 0) raises "n too large"
  my-get([list: 1, 2, 3], -1) raises "invalid argument: -1"

  # test against Pyret .get()
  my-get([list: 1, 2, 3], 2) is [list:1, 2, 3].get(2)
end

Always start with examples, and here they tell us what to do when the list is empty. In both empty examples we should raise "n too large":

fun my-get<A>(lst :: List<A>, n :: Number) -> A:
  doc: "Returns the nth element of the given list, or raises an error if n is out of range"
  cases (List) lst:
    | empty => raise("n too large")
    | link(f, r) => ...
  end
where:
  my-get([list: 1, 2, 3], 0) is 1
  my-get([list: 1, 2, 3], 2) is 3
  my-get([list: 1, 2], 2) raises "n too large"
  my-get(empty, 0) raises "n too large"
  my-get([list: 1, 2, 3], -1) raises "invalid argument: -1"

  # test against Pyret .get()
  my-get([list: 1, 2, 3], 2) is [list:1, 2, 3].get(2)
end

We need a check for n < 0 to raise "invalid argument: -1".

fun my-get<A>(lst :: List<A>, n :: Number) -> A:
  doc: "Returns the nth element of the given list, or raises an error if n is out of range"

  fun help(l, count):
    cases (List) l:
      | empty => raise("n too large")
      | link(f, r) => ...
    end
  end

  if n < 0:
    raise("invalid argument: " + to-string(n))
  else:
    help(lst, n)
  end
where:
  my-get([list: 1, 2, 3], 0) is 1
  my-get([list: 1, 2, 3], 2) is 3
  my-get([list: 1, 2], 2) raises "n too large"
  my-get(empty, 0) raises "n too large"
  my-get([list: 1, 2, 3], -1) raises "invalid argument: -1"

  # test against Pyret .get()
  my-get([list: 1, 2, 3], 2) is [list:1, 2, 3].get(2)
end

A different but inefficient way is to place a guard at the beginning, where it will be checked needlessly every time the function loops, though as we'll learn later compilers will probably optimize away that check:

fun my-get<A>(lst :: List<A>, n :: Number) -> A:
  doc: "Returns the nth element of the given list, or raises an error if n is out of range"

  # inefficient redundant checking
  if n < 0:
    raise("invalid argument: " + to-string(n))
  else:
    cases (List) lst:
      | empty => raise("n too large")
      | link(f, r) => ...
    end
  end
...

Look at the examples for regular operation:

# n is zero, return the first element
my-get([list: 1, 2, 3], 0) is 1
# n counts down 2, 1, 0 and returns the third element
my-get([list: 1, 3, 9], 2) is 9

Each loop we ask: is n == 0? If so, return the current element; if not, decrease n and try again with the next element in the list:

fun my-get<A>(lst :: List<A>, n :: Number) -> A:
  doc: "Returns the nth element of the given list, or raises an error if n is out of range"

  fun help(l, counter):
    cases (List) l:
      | empty => raise("n too large")
      | link(f, r) => 
        if counter == 0:
          f
        else:
          help(r, counter - 1)
        end
    end
  end

  if (n < 0):
    raise("invalid argument: " + to-string(n))
  else:
    help(lst, n)
  end
where:
  my-get([list: 1, 2, 3], 0) is 1
  my-get([list: 1, 2, 3], 2) is 3
  my-get([list: 1, 2], 2) raises "n too large"
  my-get(empty, 0) raises "n too large"
  my-get([list: 1, 2, 3], -1) raises "invalid argument: -1"

  # test against Pyret .get()
  my-get([list: 1, 2, 3], 2) is [list:1, 2, 3].get(2)
end

Another way to write this:

fun my-get<A>(lst :: List<A>, n :: Number) -> A:
  doc: "Returns the nth element of the given list, or raises an error if n is out of range"
  if (n < 0):
    raise("invalid argument: " + to-string(n))
  else if is-empty(lst):
    raise("n too large")
  else if n == 0:
    lst.first
  else:
    my-get(lst.rest, n - 1)
  end
end

set()

Next rewrite set. We already know what to do in the empty case from writing get:

fun my-set<A>(lst :: List<A>, n :: Number, v :: A) -> List<A>:
  doc: "Returns a new list with the same values as the given list but with the nth element set to the given value, or raises an error if n is out of range"

  fun help(l, count):
    cases (List) l:
      | empty => raise("n too large")
      | link(f, r) => ...
    end
  end

  if n < 0:
    raise("invalid argument: " + to-string(n))
  else: 
    help(lst, n)
  end
where:
  my-set([list: 1, 2, 3], 0, 5) is [list: 5, 2, 3]
  my-set([list: "a", "b"], 1, "c") is [list: "a", "c"] 
  # test against Pyret .set()
  my-set([list: 1, 2], 1, 4) is [list: 1, 2].set(1, 4)

  # errors
  my-set([list: "a"], 1, "b") raises "n too large"
  my-set(empty, -1, 5) raises "invalid argument: -1"  
end

If the counter is zero we want to link the element v onto the remainder of the list; if it's not zero, link the current element f and recur on the rest.

cases (List) l:
  | empty => raise("n too large")
  | link(f, r) =>
    if count == 0:
      link(v, r)
    else:
      link(f, help(r, count - 1))
    end
end

Recall from the book how link(elt, list) works: for example [list: 1, 2] is link(1, link(2, empty)). Hand step my-set to see how it builds up a chain of delayed computation until the recursion halts (note v = 99 is captured from the enclosing scope, so help only takes the list and the count):

  • my-set([list: 1, 2, 3, 4, 5], 4, 99)
  • link(1, help([list: 2, 3, 4, 5], 3))
  • link(1, link(2, help([list: 3, 4, 5], 2)))
  • link(1, link(2, link(3, help([list: 4, 5], 1))))
  • link(1, link(2, link(3, link(4, help([list: 5], 0)))))
  • link(1, link(2, link(3, link(4, link(99, empty)))))
  • link(1, link(2, link(3, link(4, [list: 99]))))
  • link(1, link(2, link(3, [list: 4, 99])))
  • … notice it starts eval at the most nested bracket

We can also write set this way:

fun my-set<A>(lst :: List<A>, n :: Number, v :: A) -> List<A>:
  doc: "Returns a new list with the same values as the given list but with the nth element set to the given value, or raises an error if n is out of range"

  if n < 0:
    raise("invalid argument: " + to-string(n))
  else if is-empty(lst):
    raise("n too large")
  else if n == 0:
    [list: v] + lst.rest
  else: 
    [list: lst.first] + my-set(lst.rest, n - 1, v)
  end
where:
  my-set([list: 1, 2, 3], 0, 5) is [list: 5, 2, 3]
  my-set([list: "a", "b"], 1, "c") is [list: "a", "c"] 
  # test against Pyret .set()
  my-set([list: 1, 2], 1, 4) is [list: 1, 2].set(1, 4)

  # errors
  my-set([list: "a"], 1, "b") raises "n too large"
  my-set(empty, -1, 5) raises "invalid argument: -1"  
end

join-str()

We're skipping sorting lists because it's a future lecture. Try join-str next; look in the string library for helpful builtins:

fun my-join-str<A>(lst :: List<A>, sep :: String) -> String:
  doc: "Converts list elements to a single string delimited by sep"
  cases (List) lst:
    | empty => ""
    | link(f, r) => ...
  end
where:
  my-join-str([list: 1, 2, 3], "; ") is "1; 2; 3"
  my-join-str([list: "a", true, ~5.3], " : ") is "a : true : ~5.3"
  my-join-str(empty, "nothing at all") is ""
end

From the examples we want to convert each element to a string and append the delimiter to every element except the last. This means we have to check whether the current element is the last; if so, do not add the delimiter. That can be accomplished by doing cases on the rest of the list (all of this is in the book):

fun my-join-str<A>(lst :: List<A>, sep :: String) -> String:
  doc: "Converts list elements to a single string delimited by sep"
  cases (List) lst:
    | empty => ""
    | link(f, r) => 
      cases (List) r:
        | empty => 
          # if the next element in r is empty do not append sep
          to-string(f) + my-join-str(r, sep)
        | link(_, _) =>
          to-string(f) + sep + my-join-str(r, sep)
      end
  end
where:
  my-join-str([list: 1, 2, 3], "; ") is "1; 2; 3"
  my-join-str([list: "a", true, ~5.3], " : ") is "a : true : ~5.3"
  my-join-str(empty, "nothing at all") is ""
  # test against Pyret builtin
  my-join-str([list: 1, 2, 3], "*") is join-str([list: 1, 2, 3], "*")
end

The underscores in link(_, _) mean I don't plan on using those variables, which tells the reader of the code to ignore them. The names inside link(,) => can be anything, like link(current, remainder) or link(fr, rr) (first of the rest, rest of the rest).

Hand step this:

  • my-join-str([list: 1, 2, 3], "; ")
  • "1" + "; " + my-join-str([list: 2, 3], "; ")
  • "1" + "; " + "2" + "; " + my-join-str([list: 3], "; ")
  • "1" + "; " + "2" + "; " + "3" + my-join-str(empty, "; ")
  • "1" + "; " + "2" + "; " + "3" + ""

my-join-str can be a simple if-expression too:

fun my-join-str<A>(lst :: List<A>, sep :: String) -> String:
  doc: "Converts list elements to a single string delimited by sep"
  if is-empty(lst):
    ""
  else if is-empty(lst.rest):
    to-string(lst.first) + my-join-str(lst.rest, sep)
  else:
    to-string(lst.first) + sep + my-join-str(lst.rest, sep)
  end
where:
  my-join-str([list: 1, 2, 3], "; ") is "1; 2; 3"
  my-join-str([list: "a", true, ~5.3], " : ") is "a : true : ~5.3"
  my-join-str(empty, "nothing at all") is ""
  # test against Pyret builtin
  my-join-str([list: 1, 2, 3], "*") is join-str([list: 1, 2, 3], "*")
end

range()

The more builtins you rewrite the easier this becomes; look at the docs and rewrite range:

fun my-range(start :: Number, stop :: Number) -> List<Number>:
  doc: "Creates a list of numbers, starting with start, ending with stop"
  # guard first, otherwise start > stop would recur forever
  if start > stop:
    raise("range: start greater than stop")
  else if start == stop:
    empty
  else:
    link(start, my-range(start + 1, stop))
  end
where:
  my-range(0, 0) is [list: ]
  my-range(0, 1) is [list: 0]
  my-range(-5, 5) is [list: -5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
  # test against Pyret builtin
  my-range(-5, 5) is range(-5, 5)
end

Next is range-by. I wrote this by making it work for one test case, running it to see the failing tests, and repeating:

fun my-range-by(start :: Number, stop :: Number, delta :: Number) -> List<Number>:
  doc: "Creates a list of numbers, starting with start, in intervals of delta, until reaching (but not including) stop"
  if delta == 0:
    raise("interval of 0")
  else if (start >= stop) and (delta > 0):
    empty
  else if (start <= stop) and (delta < 0):
    empty
  else:
    link(start, my-range-by(start + delta, stop, delta))
  end
where:
  my-range-by(1, 10, 4) is [list: 1, 5, 9]
  my-range-by(10, 1, -4) is [list: 10, 6, 2]
  my-range-by(3, 20, 9) is [list: 3, 12]
  my-range-by(20, 3, 9) is empty
  my-range-by(20, 3, -9) is [list: 20, 11]
  my-range-by(2, 3, 0) raises "interval of 0"
  # test against Pyret builtin
  my-range-by(1, 10, 4) is range-by(1, 10, 4)
end

repeat()

We don't have to do all the error handling yet, where we test for every possible bad input; just copy the examples in the documentation and make them work for now:

fun my-repeat<A>(n :: Number, e :: A) -> List<A>:
  doc: "Creates a list with n copies of e"
  if n <= 0:
    empty
  else:
    link(e, my-repeat(n - 1, e))
  end
where:
  my-repeat(0, 10) is empty
  my-repeat(3, -1) is [list: -1, -1, -1]
  my-repeat(1, "foo") is link("foo", empty)
  my-repeat(3, empty) is [list: [list: ], [list: ], [list: ]]
  # test against Pyret builtin
  my-repeat(3, -1) is repeat(3, -1)
end

split-at()

Skip filter, partition, and find as these are in a lab we do later. The split-at builtin uses a record of two lists in its return signature; see the documentation.

fun my-split-at<A>(n :: Number, lst :: List<A>) -> {prefix :: List<A>, suffix :: List<A>}:
  doc:"Splits the list into two lists, one containing the first n elements, and the other containing the rest"

  fun helper<T>(test :: String, count :: Number, input :: List<T>)-> List<T>:
    cases (List) input:
      | empty => 
        if (n >= count) and (test == "prefix"):
          raise("Index too large")
        else:
          empty
        end
      | link(f, r) =>
        if (test == "prefix") and (count <= n):
          link(f, helper(test, count + 1, r))
        else if (test == "suffix") and (count > n):
          link(f, helper(test, count + 1 , r))
        else:
          helper(test, count + 1, r)
        end
    end
  end

  if n == 0:
    {prefix: empty, suffix: lst}
  else if n < 0:
    raise("Invalid index")
  else:
    {prefix: helper("prefix", 1, lst), suffix: helper("suffix", 1, lst)} 
  end

where:
  my-split-at(2, [list: 'a', 'b', 'c', 'd']) is {prefix: [list: "a", "b"], suffix: [list: "c", "d"]}
  my-split-at(0, [list: 1, 2, 3, 4]) is {prefix: empty, suffix: [list: 1, 2, 3, 4]}
  my-split-at(4, [list: 1, 2, 3, 4]) is {prefix: [list: 1, 2, 3, 4], suffix: empty}
  my-split-at(2, [list: 1, 2, 3, 4]) is {prefix: [list: 1, 2], suffix: [list: 3, 4]}
  my-split-at(-1, [list: 1, 2, 3, 4]) raises "Invalid index"
  my-split-at(5, [list: 1, 2, 3, 4]) raises "Index too large"
  # test against Pyret builtin
  my-split-at(2, [list: 1, 2, 3, 4]) is split-at(2, [list: 1, 2, 3, 4])
end

last(), push(), append()

These should be simple for you to write by now.

fun my-last<A>(lst :: List<A>) -> A:
  doc:"Returns the last element in lst. Raises an error if the List is empty."
  cases (List) lst:
    | empty => raise("last of empty list")
    | link(f, r) =>
      cases (List) r:
        | empty => f
        | link(_, _) => my-last(r)
      end
  end
where:
  my-last([list: 1, 3, 5]) is 5
  my-last([list: 1]) is 1
  my-last([list: ]) raises "last of empty list"
  # test against Pyret builtin
  my-last([list: 1, 3]) is last([list: 1, 3])
end

fun my-push<A>(l :: List<A>, elt :: A) -> List<A>:
  doc:"Constructs a list with the given element prepended to the front of the given list."
  cases (List) l:
    | empty => link(elt, empty)
    | link(f, r) => link(elt, l)
  end
where:
  my-push(empty, "a") is link("a", empty)
  my-push(link("a", empty), "b") is link("b", link("a", empty))
  my-push([list: 1, 2], 3) is [list: 3, 1, 2]
  # test against Pyret builtin
  my-push([list: 1, 2], -1) is push([list: 1, 2], -1) 
end

fun my-append<A>(front :: List<A>, back :: List<A>) -> List<A>:
  doc: "Produce a new List with the elements of front followed by the elements of back."
  cases (List) front:
    | empty => back
    | link(f, r) => link(f, my-append(r, back))
  end
where:
  my-append([list: 1, 2, 3], [list: 4, 5, 6]) is [list: 1, 2, 3, 4, 5, 6]
  my-append([list:], [list:]) is [list:]
  my-append([list: 1], [list: 2]) is [list: 1, 2]
  # test against Pyret builtin
  my-append([list: 1, 2], [list: 3, 4]) is append([list: 1, 2], [list: 3, 4])
end

remove(), reverse(), member()

Reminder: the more you do this, the better you become.

fun my-remove<a>(lst :: List<a>, elt :: a) -> List<a>:
  doc: ```Returns the list without the element if found, or the whole list if it is not```
  if is-empty(lst):
    empty
  else:
    if elt == lst.first:
      my-remove(lst.rest, elt)
    else:
      link(lst.first, my-remove(lst.rest, elt))
    end
  end
where:
  my-remove(empty, 1) is empty 
  my-remove([list: 1, 1, 2], 1) is [list: 2]
  my-remove([list: 1, 2], 3) is [list: 1, 2]
  # test using Pyret builtin
  my-remove([list: 1, 1, 2], 2) is remove([list: 1, 1, 2], 2)
end

fun my-reverse<a>(lst :: List<a>) -> List<a>:
  doc: "Returns a new list containing the same elements as this list, in reverse order"

  fun help(input, accumulator):
    cases (List) input:
      | empty => accumulator
      | link(f, r) => help(r, link(f, accumulator))
    end
  end

  help(lst, empty)
where:
  my-reverse([list: ]) is [list: ]
  my-reverse([list: 1, 3]) is [list: 3, 1]
  my-reverse([list: "a", "b", "c"]) is [list: "c", "b", "a"]
  # test using Pyret builtin
  my-reverse([list: true, false]) is reverse([list: true, false])
end

fun my-member<a>(lst :: List<a>, elt :: a) -> Boolean:
  doc: ```Returns true if elt is in the list, false otherwise```
  if is-empty(lst):
    false
  else:
    if lst.first == elt:
      true
    else:
      my-member(lst.rest, elt)
    end
  end
where:
  my-member(empty, 1) is false 
  my-member([list: 1, 1, 2], 2) is true
  # test using Pyret builtin
  my-member([list: 1, 1, 2], 2) is member([list: 1, 1, 2], 2)
end

shuffle()

Rewriting this was equivalent to doing one of the course assignments, Sortacle. I used num-random from the number library.

fun my-shuffle<A>(lst :: List<A>) -> List<A>:
  doc:"Returns a new List with all the elements of the original List in random order."

  fun help(len :: Number, acc :: List<A>, prev :: List<Number>)-> List<A>:
    rand = num-random(len)
    if length(acc) == len:
      acc  
    else if prev.member(rand): # skip indices already taken
      help(len, acc, prev)
    else:
      permute = get(lst, rand) 
      help(len, push(acc, permute), push(prev, rand))
    end
  end

  help(length(lst), empty, empty)
end

check "Shuffle Properties Test":

  fun remove-1(lst, acc, v):
    doc: "Special remove function that only removes 1 match"
    if lst.first == v:
      acc + lst.rest
    else: remove-1(lst.rest, acc + [list: lst.first], v)
    end
  end

  fun elt-test(shuffled, original, result): 
    doc: "Test if elements the same in permuted and original lists"
    cases(List) shuffled:
      | empty => 
        # fold over the result list and'ing to find a false
        (result.foldl(lam(base, x): (base and (x == true)) end, true))
        and
        # make sure there's no extra elements leftover 
        (shuffled.length() == original.length())
      | link(f, r) => 
        if original.member(f):
          # delete found elt from original
          new-orig = remove-1(original, empty, f) 
          elt-test(r, new-orig, push(result, true))
        else:
          # for debugging keep everything in a list instead of halting w/false
          elt-test(r, original, push(result, false))
        end
    end
  end

  # test random lists
  fun quick-check(count):
    # QuickCheck is the Haskell library that generates random inputs and tests properties hold 
    if count == 0:
      "Finished"
    else:    
      test = range(1, (1 + num-random(30))).map(lam(x): num-random(x) end)
      block:
        # we can do this in a check block
        elt-test(my-shuffle(test), test, empty) is true
        # test against Pyret shuffle builtin
        elt-test(my-shuffle(test), shuffle(test), empty) is true
        quick-check(count - 1)
      end
    end
  end
  quick-check(30)
end

Lab: HOFs

Get the higher-order functions lab from here (cs19 2016 version). Reading the lab, most of this is in the book now.

Map

We're asked to write our own map. These type signature annotations are in the book: (A -> B) means a function taken as a parameter that consumes type A and returns type B. Type variables are similar to math variables, where x + y = 2 can be satisfied the same way (1 + 1 = 2) or differently (3 + (-1) = 2); they just have to be consistent:

  • (Number -> Number) -> Number
  • (Number -> String) -> String
  • (String -> String) -> String
    • these can all be (a -> b) -> b
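To see the type variable stay consistent, here's a sketch (apply-to-ten is my own name) where one (Number -> b) -> b signature is used at two different types:

fun apply-to-ten<b>(f :: (Number -> b)) -> b:
  f(10)
end

check:
  apply-to-ten(lam(x): x + 1 end) is 11
  apply-to-ten(num-to-string) is "10"
end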

Write your own map:

fun my-map<A,B>(func :: (A -> B), l :: List<A>) -> List<B>:
  cases (List) l:
    | empty => empty
    | link(f, r) => 
      link(func(f), my-map(func, r))
  end
where:
  my-map(num-to-string, [list: 1, 2]) is [list: "1", "2"]
  my-map(lam(x): x + 1 end, [list: 1, 2]) is [list: 2, 3]
end

Filter

Write your own filter:

fun my-filter<T>(func :: ( T -> Boolean), l :: List<T>)-> List<T>:
  cases (List) l:
    | empty => empty
    | link(f, r) =>
      if func(f):
        link(f, my-filter(func, r))
      else:
        my-filter(func, r)
      end
  end
end
check:
  # these are all tests from the Pyret docs for .filter
  fun length-is-one(s :: String) -> Boolean:
    string-length(s) == 1
  end
  my-filter(length-is-one, [list: "ab", "a", "", "c"]) is [list: "a", "c"]
  my-filter(is-link, [list: empty, link(1, empty), empty]) is [list: link(1, empty)]
  my-filter(lam(x): x > 0 end, [list: 0, 1, -1]) is [list: 1]
end

Fold

Write your own fold. func now takes two parameters, func(acc, f), where acc is an accumulator, like a running subtotal:

fun my-fold<A,B>(func :: (B, A -> B), acc :: B, l :: List<A>) -> B:
  cases (List) l:
    | empty => acc
    | link(f, r) =>
      my-fold(func, func(acc, f), r)
  end
end
check:
  my-fold((lam(acc, elt): acc + elt end), 0, [list: 3, 2, 1]) is 6

  fun combine(acc, elt) -> String:
    tostring(elt) + " - " + acc
  end

  my-fold(combine, "END", [list: 3, 2, 1]) is "1 - 2 - 3 - END"
  my-fold(combine, "END", empty) is "END"
end

Map2

Write map2:

fun my-map2<A,B>(func :: (A, A -> B), list1 :: List<A>, list2 :: List<A>) -> List<B>:
  cases (List) list1:
    | empty => empty 
    | link(f, r) =>
      cases (List) list2:
        | empty => empty
        | link(ff, rr) =>
          link(func(f, ff), my-map2(func, r, rr))
      end
  end
where:
  # documentation says it iterates over the shortest list and stops
  a = [list: 1, 2, 3]
  b = [list: 1, 2, 3, 4]
  my-map2(lam(x, y): x + y end, a, b) is [list: 2, 4, 6]
  my-map2(lam(x, y): x + y end, b, a) is [list: 2, 4, 6]

  # example tests from documentation
  my-map2(string-append, [list: "mis", "mal"], [list: "fortune", "practice"])
    is [list: "misfortune", "malpractice"]
  my-map2(_ + _, [list: "mis", "mal"], [list: "fortune", "practice"])
    is [list: "misfortune", "malpractice"]
  my-map2(string-append, [list: "mis", "mal"], [list: "fortune"])
    is [list: "misfortune"]
  my-map2(string-append, [list: "mis", "mal"], empty)
    is empty
end

The last task is best-price; look up on YouTube what a basic demand function is:

fun demand(price-increase :: Number) -> Number:
  doc: "1000 is approx demand of product at a fixed price"
  1000 - (60 * price-increase)
end

This means as the price increases, the quantity demanded will decrease by 60 units times the price increase.
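If you want to sanity-check the numbers, the same demand function is trivial to poke at in Python:

```python
# The demand function transliterated to Python for quick experiments.
def demand(price_increase):
    return 1000 - (60 * price_increase)

print(demand(0))   # 1000: base demand with no price increase
print(demand(5))   # 700: a 5-unit price increase loses 300 buyers
```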

Assignment 1

At this point you have caught up to all the students in the course who did the placement exercises over the summer. From here the course assignments come fast and furious; if you read the FAQ, that pace is deliberate so students can decide early whether to drop the course if they fall too far behind. We don't have a deadline to meet.

Assignment DocDiff was released the first lecture. Watch this starting @33:11 if you are curious to see what these document vectors mean in 3D space. The assignments in this course are purposely designed so you have to write examples in order to solve them; they aren't drill like the list-library rewrites we just completed.

The dot product definition, look at the example for [1,3,-5] · [4,-2,-1]. Each index in the first list is multiplied by its corresponding index in the second list, then those results are added together. To read the sigma notation: a · b is the sum of \(a_ib_i\) starting at index 1 (the first element of the list) up to the length of the list (n). If a = \([1_{1} ,3_{2}, -5_{3}]\) and b = \([4_{1}, -2_{2}, -1_{3}]\):

\(a \cdot b = \sum_{i=1}^3 a_ib_i = (1_{a1} \cdot 4_{b1}) + (3_{a2} \cdot -2_{b2}) + (-5_{a3} \cdot -1_{b3}) = 3\)
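The same arithmetic as a quick Python cross-check (dot is my own helper name):

```python
# Dot product: multiply element-wise, then sum the results.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

print(dot([1, 3, -5], [4, -2, -1]))  # (1*4) + (3*-2) + (-5*-1) = 4 - 6 + 5 = 3
```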

We are given the signature: the input is two lists of strings, and the output will be a value with square roots involved (irrational numbers), so we are dealing with Roughnums.

fun overlap(doc1 :: List<String>, doc2 :: List<String>) -> Number:

To vectorize both documents we need a distinct combined list of doc1 and doc2, converted to lower case and sorted, against which to count each document's word frequencies.

Example input to overlap:
doc1 = [list: "a", "B", "c"]
doc2 = [list: "d", "d", "D", "b"]

Distinct/lowercase and sorted combination of doc1/doc2
doc1 + doc2 [list: "a", "b", "c", "d"]

Total words:[list: "a", "b", "c", "d"]
doc1-vec =  [list:  1,   1,   1,   0]
doc2-vec =  [list:  0,   1,   0,   3]

It returns an overlap between 0 and 1: two documents of length 3 that have 2 words in common score 2/3, if all 3 are the same then 3/3, an overlap of 1, and no shared words is 0/3 or 0.
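Here is the whole pipeline sketched in Python on the example above, just to check the math; the helper names and list-comprehension style are mine, not what the assignment expects you to write:

```python
# Sketch of the overlap metric: vectorize both documents over the distinct
# lowercased word list, then dot(v1, v2) / max(|v1|^2, |v2|^2).
def overlap(doc1, doc2):
    a = [w.lower() for w in doc1]
    b = [w.lower() for w in doc2]
    all_words = sorted(set(a + b))           # distinct, lowercase, sorted
    v1 = [a.count(w) for w in all_words]     # word-frequency vector of doc1
    v2 = [b.count(w) for w in all_words]
    dot = lambda x, y: sum(p * q for p, q in zip(x, y))
    return dot(v1, v2) / max(dot(v1, v1), dot(v2, v2))

print(overlap(["a", "B", "c"], ["d", "d", "D", "b"]))  # 0.1, i.e. 1/10
```

With v1 = [1, 1, 1, 0] and v2 = [0, 1, 0, 3], the dot product is 1 and the larger squared length is 10, giving 1/10.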

fun overlap(doc1 :: List<String>, doc2 :: List<String>) -> Number:
  doc: ""
  lower-doc1 = map(string-to-lower, doc1)
  ...
where:
 # these examples taken from the Examplar paper
  overlap([list: "welcome", "to", "Walmart"], 
    [list: "WELCOME", "To", "walmart"]) is-roughly 3/3
  overlap([list: "1", "!", "A", "?", "b"], 
    [list: "1", "A", "b"]) is-roughly 3/5
  overlap([list: "alakazam", "abra"],
    [list: "abra", "kadabra", "alakazam", "abra"]) is-roughly 2/4
  overlap([list: "a", "b"], [list: "c"]) is 0/3

  # epsilon test for roughnums
  epsilon = 0.001
  a = [list: "alakazam", "abra"]
  b = [list: "abra", "kadabra", "alakazam", "abra"]

  num-abs(overlap(a, b) - 2/4) <= epsilon is true  
end

Look in the documentation for binary operators you can use, and for built-in libraries. The math equation just needs to be translated into a program using built-ins (note that num-sqr(num-sqrt(x)) evaluates back to roughly x; it's written this way to mirror the formula's vector norms). This is how I interpreted it:

(dot-product(vec1, vec2)) / num-max(
    num-sqr(num-sqrt(dot-product(vec1, vec1))), 
    num-sqr(num-sqrt(dot-product(vec2, vec2))))

Sample solution

This is a sample solution cut and pasted from a 2019 Brown student so you can see how many tests they are expected to write:

fun combine(a :: List<String>, b :: List<String>) -> List<String>:
  doc: ```Consumes two lists a,b to combine; 
       returns a single list with all elements from both```
  cases (List) a:
    | empty => b
    | link(f,r) => combine(r, link(f, b))
  end
where:
  combine([list: "a", "b"], [list: "c", "d"]) is [list: "b", "a", "c", "d"]
  combine(empty, [list: "a", "b"]) is [list: "a", "b"]
  combine([list: "a", "b"], empty) is [list: "b", "a"]
  combine(empty, empty) is empty
end

fun weed(junk :: List<String>, builder :: List<String>) -> List<String>:
  doc: ```Consumes a list to weed of duplicates and an accumulator
       of unique values seen so far; returns the sorted list with
       all duplicates removed```
  cases (List) junk:
    | empty => builder.sort()
    | link(f,r) => 
      if builder.member(f):
        weed(r, builder)
      else:
        weed(r, link(f, builder))
      end
  end
where:
  weed([list: "b", "a", "aa", "a"], empty) is [list: "a", "aa", "b"]
  weed(empty, empty) is empty
end

fun occurances(basis :: List<String>, key :: String) -> Number:
  doc: ```Consumes a list of strings and a key to search for;
       returns the number of occurances of the key```
  cases (List) basis:
    | empty => 0
    | link(f,r) => 
      if f == key:
        1 + occurances(r, key)
      else:
        occurances(r, key)
      end
  end
where:
  occurances([list: "a", "b", "c", "a", "a", "b"], "a") is 3
  occurances([list: "a", "boot", "a"], "c") is 0
  occurances(empty, 'A') is 0
end

fun val-vector(basis :: List<String>, masterlist :: List<String>,
    value :: List<Number>) -> List<Number>:
  doc: ```Consumes a list of the vector for which to determine the value,
       the master list to check against, and the current value vector;
       returns the final value vector counting 
       each item from the master list```
  cases (List) masterlist:
    | empty => value.reverse()
    | link(f,r) => val-vector(basis, r, (link(occurances(basis, f), value)))
  end
where:
  val-vector([list: "a", "b", "a"], [list: "a", "b", "c"], empty) 
    is [list: 2, 1, 0]
  val-vector(empty, empty, empty) is empty
  val-vector(empty, [list: "a", "b"], empty) is [list: 0, 0]
end

fun dot(v1 :: List<Number>, v2 :: List<Number>) -> Number:
  doc: ```Consumes two vectors, v1 and v2, of equal length;
       returns the standard dot product of the vectors```
  cases (List) v1:
    | empty => 0
    | link(f,r) => (f * v2.first) + dot(r, v2.rest)
  end
where:
  dot([list: 1, 2, 3], [list: 1, 2, 3]) is 14
  dot(empty, empty) is 0
end

fun overlap(doc1 :: List<String>, doc2 :: List<String>) -> Number:
  doc: ```Consumes two lists of strings;
       returns the overlap score of the two```
  a = map(string-tolower, doc1)
  b = map(string-tolower, doc2)
  masterlist = weed(combine(a,b), empty)
  v1 = val-vector(a, masterlist, empty)
  v2 = val-vector(b, masterlist, empty)
  (dot(v1, v2) / (num-max(dot(v1, v1), dot(v2, v2))))
where:
  overlap([list: "a", "b", "c"], [list: "d", "d", "d", "b"]) is (1 / 10)
  overlap([list: "a", "b"], [list: "a", "B"]) is 1
  overlap([list: "Cat", "dOg", "horse"], [list: "cat", "dog", "pig"]) is (2 / 3)
  overlap([list: "", "a", "b"], [list: "a", "b"]) is (2 / 3)
  overlap([list: "Hello World!", "boop", "heh"], [list: "HelloWorld!", "AA", "boop"]) is (1 / 3)
  overlap([list: "''", "'"], [list: "'"]) is (1 / 2)
  overlap([list: "CAT", "DOG", "HORSE"], [list: "cat", "dog", "horse", "pig"]) is (3 / 4)
  overlap([list: "3", "4", "3"], [list: "3", "5"]) is (4 / 10)
  overlap([list: "aaa"], [list: "aaa", "4"]) is (1 / 2)
  overlap([list: "*", "jk", "'"], [list: "*", "-"]) is (1 / 3)
  overlap([list: "boop", "bOop", "BooP"], [list: "booP", "boooop"]) is (1 / 3)
   overlap([list: "Last", "Friday", "in", "three", "week's", "time", "I", "saw", "a", "spotted", 
      "striped", "blue", "worm", "shake", "hands", "with", "a", "legless", "lizard"], [list: "I", 
      "am", "counting", "my", "calories,", "yet", "I", "really", "want", "dessert"]) is (2 / 21)
end

Here is my solution using map/filter/fold. I didn't test all the little functions inside overlap; the prof would probably fail me for this:

fun overlap(doc1 :: List<String>, doc2 :: List<String>)-> Number:
  doc: "Consumes two lists of strings returns the overlap score"

  fun vectorize(l1 :: List<String>, l2 :: List<String>)-> List<Number>:
    doc: "Returns document l1 as a vector of size l2"
    map(lam(x): filter(lam(y): x == y end, l1).length() end, l2)
  end

  fun dot-product(v1 :: List<Number>, v2 :: List<Number>)-> Number:
    doc: "Dot product of two vectors"
    map2(lam(x, y): x * y end, v1, v2).foldl(lam(acc, z): acc + z end, 0)
  end

  doc1-lower = map(string-tolower, doc1)
  doc2-lower = map(string-tolower, doc2)
  all-words = sort(distinct(doc1-lower + doc2-lower))
  vec1 = vectorize(doc1-lower, all-words)
  vec2 = vectorize(doc2-lower, all-words)
  (dot-product(vec1, vec2)
      / (num-max(dot-product(vec1, vec1), dot-product(vec2, vec2))))
where:
  # these examples taken from the Examplar paper
  overlap([list: "welcome", "to", "Walmart"], 
    [list: "WELCOME", "To", "walmart"]) is-roughly 3/3
  overlap([list: "1", "!", "A", "?", "b"], 
    [list: "1", "A", "b"]) is-roughly 3/5
  overlap([list: "alakazam", "abra"],
    [list: "abra", "kadabra", "alakazam", "abra"]) is-roughly 2/4
  overlap([list: "a", "b"], [list: "c"]) is 0/3
end

Lecture 2 Rainfall Problem

Watch the Mon9/10/18 second lecture. For the first task, try it in Pyret yourself and see if you made any mistakes, as later in the lecture he'll show all the errors students typically make doing this. The end introduces table-shaped data, i.e. databases, if you didn't read the table section from the book.

Assignment 2

If you watched lecture 2, the students talked about starting Nile already, so let's do it too. Here is the writeup. Try some examples:

data Recommendation:
  | recommendation(count :: Number, names :: List<String>)
end

data File:
  | file(name :: String, content :: String)
end

fun recommend(title :: String, book-records :: List<File>) -> Recommendation:
  ...
end

check:
  file-1 = file("", "Crime and Punishment\nHeaps are Lame\nLord of the Flies")
  file-2 = file("", "Crime and Punishment\nRandom Book\nLord of the Flies")
  file-3 = file("", "Test Book\nHeaps are Lame\nRandom Book")

  input = [list: file-1, file-2, file-3]

  recommend("Crime and Punishment", input) is recommendation(2, [list: "Lord of the Flies"])
  recommend("Test Book", input) is recommendation(1, [list: "Heaps are Lame", "Random Book"])
  recommend("No Recommendation", input) is recommendation(0, empty)
end

In later versions of the course they removed the file strings and changed popular-pairs to a BookPair datatype. The assignment wants us to rewrite any builtins ourselves, which we already did at the beginning of this course, except for the list concat + operator, so I rewrote it:

fun cat(l1, l2):
  doc: "Rewrite of the list + concat operator"

  if l1 == empty:
    l2
  else if l2 == empty:
    l1
  else:
    link(l1.first, cat(l1.rest, l2))
  end
where:
  cat([list: 1, 2], [list: 3, 4]) is [list: 1, 2] + [list: 3, 4]
  cat([list: 1], [list: ]) is [list: 1] + empty
  cat([list: ], [list: 1]) is empty + [list: 1]
end

Try to plan out everything that needs to be done then the assignment writes itself:

Recommend:

  • loop input of List<File>
  • convert file string to List<String>
    • is title a member? then append that list to all-recs
    • filter/remove searched for title from all-recs
  • filter or count the titles that appear the most
  • prepare recommendations as explained in assignment

Popular pairs:

  • loop input of List<File>
  • convert file string to List<String>
    • append list to all-recs
    • filter/remove duplicate titles
  • call recommend on every title in the distinct all-recs list
  • filter, sort or count the recommendations that appear the most
  • prepare popular pairs as explained in assignment

In the strings library you'll see a builtin function for exploding a string into a list of code points. Write a function that extracts the string, cleans the newline '\n', and returns a list of titles:

data File:
  | file(name :: String, content :: String)
end

fun file-to-list(fl :: File)-> List<String>:
  doc: "List<File>.content to List<String> in sorted order"

  fun newline-clean(l :: List<Number>, acc :: List<String>)-> List<String>:
    doc: "Converts code-points to string, removes the newline char"
    cases (List) l:
      | empty => empty
      | link(f, r) => if f == 10:
          link(string-from-code-points(acc), newline-clean(r, empty))
        else:
          newline-clean(r, acc + [list: f])
        end
    end
  end

  # add a newline char to the end of the file string as a sentinel
  code-points = string-to-code-points(fl.content) + [list: 10]
  newline-clean(code-points, empty).sort()
where:
  test1 = file("", "A\nH\nL")
  test2 = file("", "T\nA\nR")

  file-to-list(test1) is [list: "A", "H", "L"]
  file-to-list(test2) is [list: "A", "R", "T"] 
end

Continue through our list of TODO items:

fun all-recs(lf :: List<File>, title :: String)-> List<String>:
  doc: "All other titles in files that contain title"
  cases (List) lf:
    | empty => empty
    | link(f, r) => 
      titles = file-to-list(f)
      if titles.member(title):
        titles.remove(title) + all-recs(r, title)
      else:
        all-recs(r, title)
      end
  end
end

fun all-titles(lf :: List<File>)-> List<String>:
  doc: "Distinct list of all titles for popular-pairs"
  cases (List) lf:
    | empty => empty
    | link(f, r) =>
      distinct(file-to-list(f) + all-titles(r))
  end
end

check "all-recs and all-titles":
  test1 = file("", "Crime and Punishment\nHeaps are Lame\nLord of the Flies")
  test2 = file("", "Test Book\nHeaps are Lame\nRandom Book\nLord of the Flies")
  input = [list: test1, test2]

  all-recs(input, "Nothing") is empty
  all-recs(input, "Heaps are Lame") is [list: "Crime and Punishment", 
    "Lord of the Flies", "Lord of the Flies", "Random Book", "Test Book"]

  all-titles(input) is [list: "Crime and Punishment", "Heaps are Lame",
    "Lord of the Flies", "Random Book", "Test Book"]
end

If all-recs sends back an empty list then we know to return recommendation(0, empty). If all-recs sends back a list whose length is unchanged after applying distinct to it, then we know to return recommendation(1, all-recs). Otherwise we have dupes, meaning some title was recommended more than once.
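That classification can be sketched with a frequency counter in Python (classify is a hypothetical helper of mine, consuming what all-recs would produce):

```python
from collections import Counter

def classify(recs):
    """Mimic recommend's logic on the flat list of co-occurring titles."""
    if not recs:
        return (0, [])                       # no file contained the title
    counts = Counter(recs)
    best = max(counts.values())              # highest co-occurrence count
    if best == 1:
        return (1, sorted(recs))             # all distinct: every title once
    return (best, sorted(t for t, c in counts.items() if c == best))

print(classify(["B", "Z", "D", "E", "B", "Z", "F", "E"]))  # (2, ['B', 'E', 'Z'])
```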

fun recs(i :: List<String>, acc :: List<Recommendation>) -> List<Recommendation>:
  doc: "If duplicates exist count them and return as a list of recommendations" 
  cases (List) i:
    | empty => acc
    | link(f, r) =>
      if r.member(f):
        # frequency count
        new-count = r.foldl(lam(x, base): if x == f: base + 1 else: base end end, 1)
        new-list = r.remove(f)
        # add to list of recommendations
        new-acc = [list: recommendation(new-count, [list: f])] + acc
        recs(new-list, new-acc)
      else:
        recs(r, acc)
      end
  end
end

fun rec-comp(r1 :: Recommendation, r2 :: Recommendation)-> Boolean:
  doc: "Comparison sort-by for Recommendation"
  r1.count > r2.count
end

fun rec-eq(r1 :: Recommendation, r2 :: Recommendation)-> Boolean:
  doc: "Equality sort-by for Recommendation"
  r1.count == r2.count
end

fun recommend(title :: String, book-records :: List<File>) -> Recommendation:
  doc: "Search title for recommendations given by users"

  titles = all-recs(book-records, title)

  if is-empty(titles):
    recommendation(0, empty)
  else if titles.length() == distinct(titles).length():
    recommendation(1, titles)
  else:
    # sort all the recommendations by highest count
    all-dupes = sort-by(recs(titles, empty), rec-comp, rec-eq)
    # get highest count
    max-count = all-dupes.first.count
    # filter by highest-count
    max-rec-list = filter(lam(x): x.count >= max-count end, all-dupes)
    # combine all the title lists of max-rec-list
    max-title-list = distinct(fold(lam(acc, x): x.names + acc end, empty, max-rec-list).sort())
    recommendation(max-count, max-title-list) 
  end
where:
  file-1 = file("", "A\nB\nZ\nD\nE")
  file-2 = file("", "A\nB\nZ\nF\nE")
  file-3 = file("", "B\nK\nE")

  input = [list: file-1, file-2, file-3]

  recommend("N", input) is recommendation(0, empty)
  recommend("D", input) is recommendation(1, [list: "A", "B", "E", "Z"])
  recommend("A", input) is recommendation(2, [list: "B", "E", "Z"])
  recommend("B", input) is recommendation(3, [list: "E"])
end

Popular pairs will be similar to this, try the data type:

data BookPair:
  | pair(book1 :: String, book2 :: String)
end

Lecture 3 Sorting

We're watching Wed9/12/18 lecture on sorting.

This is the code on the board which you should have typed into CPO yourself:

fun insert(n :: Number, l :: List<Number>) -> List<Number>:
  cases (List) l:
    | empty => [list: n]
    | link(f, r) => if n <= f:
        link(n, l)
      else:
        link(f, insert(n, r))
      end
  end
where:
  # student given examples:
  insert(2,  [list: 3,4,5])   is [list: 2,3,4,5]
  insert(1,  [list: 0,1,1,2]) is [list: 0,1,1,1,2]
  insert(-1, [list: -2,0,3])  is [list: -2,-1,0,3]
  insert(3,  [list: ])        is [list: 3]
  insert(4,  [list: 1,2,3])   is [list: 1,2,3,4]
end

fun sort(l :: List<Number>) -> List<Number>:
  cases (List) l:
    | empty => empty
    | link(f, r) => insert(f, sort(r))
  end
where:
  sort([list: 3,2,1])        is [list: 1,2,3]
  sort([list: 1,2,3])        is [list: 1,2,3] 
  sort([list: 4,1,-1])       is [list: -1, 1, 4]
  sort([list: 0.1, 0, 0.11]) is [list: 0, 0.1, 0.11]
end

Let's step the evaluation by hand with [list: 0, 1, 3, 2] as input to sort(). It expands into the following delayed insert computations:

  • insert(0, (insert(1, (insert(3, (insert(2, empty)))))))

Then the results return to sort():

  • insert(0, (insert(1, (insert(3, [list: 2])))))
    • n > f, or 3 > 2, so we link f on front: link(2, insert(3, empty))
  • insert(0, (insert(1, link(2, insert(3, empty)))))
  • insert(0, (insert(1, [list: 2, 3])))
  • insert(0, [list: 1, 2, 3])
  • [list: 0, 1, 2, 3]

Every time a list is returned, the next call to insert can happen because it is now complete with insert(n, list).
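To convince yourself the hand trace is right, here's the same insert/sort pair transliterated to Python:

```python
# Recursive insertion sort, mirroring the Pyret version with plain Python lists.
def insert(n, l):
    if not l:
        return [n]
    f, r = l[0], l[1:]
    return [n] + l if n <= f else [f] + insert(n, r)

def insertion_sort(l):
    if not l:
        return []
    return insert(l[0], insertion_sort(l[1:]))

print(insertion_sort([0, 1, 3, 2]))  # [0, 1, 2, 3], matching the hand evaluation
```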

Assignment 3

This assignment we're building a testing oracle: a program that tests another program, very common in industry.

First function is generate-input(). Looking at the table here of printable ASCII characters, A thru Z is 65-90 and a thru z is 97-122. You could limit to those chars, but it doesn't really matter; we could test every Unicode code point up to 65535, fuzzing inputs with everything, as per the strings library documentation. Reminder that num-random(n) returns values from 0 up to n - 1, so pad it with (1 + num-random(n)) or whatever lower range you want.

This is a correct-sorter we are given as an example, but it's not the only one; we could also correctly sort by names instead of ages. Make examples of different names with the same age and see how they are sorted by correct-sorter. Then write another sort-by that uses p1.name < p2.name, create identical names with different ages, and see how it's sorted.

fun correct-sorter(people :: List<Person>) -> List<Person>:
  sort-by(people,
    lam(p1, p2): p1.age < p2.age end,
    lam(p1, p2): p1.age == p2.age end)
end

Oracle only consumes a function and replies whether it's correct or not, so we have to check that dupes are properly sorted, and do both a name test and an age test since we don't know which sorting function will be tested; it should pass at least one of them.

Write down some properties for is-valid:

  • both lengths are the same
  • every element is present in the original input and sorted output
  • there are no missing or new elements in the output
  • the sorted output is verified to be correctly sorted in the case of dupe ages/names
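Those properties are easy to state with a multiset comparison; here's a Python sketch of the idea (is_valid and key are my names, and real Pyret Persons aren't tuples):

```python
from collections import Counter

def is_valid(original, sorted_out, key):
    """Same multiset of elements, and non-decreasing under the sort key."""
    same_elements = Counter(original) == Counter(sorted_out)  # nothing missing, new, or duped
    in_order = all(key(a) <= key(b) for a, b in zip(sorted_out, sorted_out[1:]))
    return same_elements and in_order

people = [("bob", 30), ("amy", 25), ("cat", 25)]
by_age = lambda p: p[1]
print(is_valid(people, sorted(people, key=by_age), by_age))  # True
print(is_valid(people, people[:2], by_age))                  # False: an element went missing
```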

This is what I came up with:

data Person:
  | person(name :: String, age :: Number)
end

fun gen-name(len :: Number)-> String:
  doc: "Generate a string of length len from random ASCII letter code points"
  rand = 1 + num-random(121)
  if len == 0:
    ""
  else if ((rand >= 65) and (rand <= 90)) or ((rand >= 97) and (rand <= 122)):
    string-from-code-point(rand) + gen-name(len - 1)
  else:
    gen-name(len)
  end
end

fun generate-input(n :: Number) -> List<Person>:
  doc:"Generate random n size list of person(name, age)"
  if n > 0:
    # add 1 to num-random() as it can return 0
    link(person(gen-name(1 + num-random(10)), 1 + num-random(100)),
      generate-input(n - 1))
  else:
    empty
  end
where:
  generate-input(0) is empty
  # test size is correct
  length(generate-input(5)) is length([list: 1, 2, 3, 4, 5])
  # test the name isn't blank
  all(lam(x :: Person): string-length(x.name) > 0 end, generate-input(5)) is true
  # make sure the age > 0
  all(lam(x :: Person): x.age > 0 end, generate-input(5)) is true
end 

fun correct-sorter(people :: List<Person>) -> List<Person>:
  sort-by(people,
    lam(p1, p2): p1.age < p2.age end,
    lam(p1, p2): p1.age == p2.age end)
end

fun correct-sorter2(p):
  # sort by names instead of age
  sort-by(p,
    lam(p1, p2): p1.name < p2.name end,
    lam(p1, p2): p1.name == p2.name end)
end

fun incorrect-sort1(p):
  sort-by(p,
    # reverse sort
    lam(p1, p2): p1.age > p2.age end,
    lam(p1, p2): p1.age == p2.age end)
end

fun incorrect-sort2(p):
  sort-by(p, 
    # equality is borked
    lam(p1, p2): p1.age < p2.age end,
    lam(p1, p2): p1.age >= p2.age end)
end

fun incorrect-sort3(p):
  # null out the names
  nulled-names = map(lam(x): person("", x.age) end, p)
  sort-by(nulled-names,
    lam(p1, p2): p1.age < p2.age end,
    lam(p1, p2): p1.age == p2.age end)
end

fun incorrect-sort4(p):
  # returns list unchanged
  p
end

fun incorrect-sort5(p):
  # null out the ages
  nulled-ages = map(lam(x): person(x.name, 0) end, p)
  sort-by(nulled-ages,
    lam(p1, p2): p1.age < p2.age end,
    lam(p1, p2): p1.age == p2.age end)
end

fun incorrect-sort6(p):
  # change the ages
  changed = map(lam(x): person(x.name, 1 + x.age) end, p)
  sort-by(changed,
    lam(p1, p2): p1.age < p2.age end,
    lam(p1, p2): p1.age == p2.age end)
end

fun remove-1<T>(lst :: List<T>, acc :: List<T>, elt :: T)-> List<T>:
  doc: "Removes 1 dupe of elt"
  cases (List) lst:
    | empty => acc
    | link(f, r) =>
      if f == elt:
        acc + r
      else:
        remove-1(r, link(f, acc), elt)
      end
  end
where:
  remove-1([list: "b", "a", "a"], empty, "a") is [list: "b", "a"]
end

fun same-elts(lst1 :: List<Person>, lst2 :: List<Person>)-> List<Boolean>:
  doc: "Tests if lst1 has the same elements of lst2"
  cases(List) lst1:
    | empty => 
      # test if there's extra in lst2 
      if lst2.length() > 0:
        link(false, empty)
      else:
        empty
      end
    | link(f, r) =>
      if lst2.member(f): 
        # remove 1 dupe of f from lst2
        new-lst2 = remove-1(lst2, empty, f)
        link(true, same-elts(r, new-lst2))
      else:
        link(false, same-elts(r, lst2))
      end
  end
where:
  input = generate-input(2)
  more = input + generate-input(1)
  less = input.rest

  same-elts(empty, empty) is empty
  same-elts(input, input) is [list: true, true]
  same-elts(more, input) is [list: true, true, false]
  same-elts(input, more) is [list: true, true, false]
  same-elts(less, input) is [list: true, false]
end

fun age-sort-verify(lst :: List<Person>):
  doc: "Check if lst is correctly sorted by age, return a boolean or empty"
  cases (List) lst:
    | empty => empty
    | link(f, r) =>
      cases (List) r:
          # if no more elements after f it's sorted
        | empty => true
        | link(ff, rr) =>
          if f.age < ff.age:
            age-sort-verify(r)
          else if f.age == ff.age:
            # names can appear in any order when ages tie
            age-sort-verify(r)
          else:
            false
          end
      end
  end
where:
  age-sort-verify([list: person("c", 1), person("b", 1), person("a", 1)]) is true
  age-sort-verify(empty) is empty
end

fun name-sort-verify(lst :: List<Person>):
  doc: "Check if lst is correctly sorted by name, return a boolean or empty"
  cases (List) lst:
    | empty => empty
    | link(f, r) =>
      cases (List) r:
        | empty => true
        | link(ff, rr) =>
          if f.name < ff.name:
            name-sort-verify(r)
          else if (f.name == ff.name) and (f.age > ff.age):
            name-sort-verify(r)
          else:
            false
          end
      end
  end
where:
  name-sort-verify([list: person("ab", 31), person("ab", 30), person("ab", 14)]) is true
  name-sort-verify(empty) is empty
end

fun is-valid(original :: List<Person>, sorted :: List<Person>) -> Boolean:
  doc:"Consumes original input and sorted input to test if correctly sorted"

  # all the elements and length are the same
  elements-same = same-elts(original, sorted)
  elt-test = fold(lam(base, x): x and base end, true, elements-same)
  # correctly sorted 
  age-test = age-sort-verify(sorted)
  name-test = name-sort-verify(sorted)

  if is-empty(original) or is-empty(sorted):
    raise("No input")
  else:
    elt-test and (age-test or name-test)
  end
where:
  a = generate-input(10)
  b = correct-sorter(a)
  c = correct-sorter2(a)
  d = correct-sorter(b + generate-input(1)) # new element
  e = correct-sorter(a.rest) # missing element

  is-valid(a, b) is true
  is-valid(a, c) is true
  is-valid(a, d) is false
  is-valid(a, e) is false
  is-valid(empty, empty) raises "No input"
end

fun oracle(f :: (List<Person> -> List<Person>))-> Boolean:
  doc: "Test if f is a correct sorter with random inputs"
  test-input = generate-input(30)
  is-valid(test-input, f(test-input))
where:
  oracle(correct-sorter) is true
  oracle(correct-sorter2) is true
  oracle(incorrect-sort1) is false
  oracle(incorrect-sort2) is false
  oracle(incorrect-sort3) is false
  oracle(incorrect-sort4) is false
  oracle(incorrect-sort5) is false
  oracle(incorrect-sort6) is false
end

My two functions age-sort-verify and name-sort-verify are hacks since they are not type safe. The proper way would be to return only one type (a list), either empty or [list: bool], then use fold to 'and' over the elements and return true/false.

Staff solution

There is a paper on this assignment describing how this course encourages property-based testing, meaning testing for properties that are always true under random inputs instead of testing specific values. They extracted only the is-valid functions and ran them against their buggy sorters; they didn't assess students' random generators, reasoning that writing one was good drill anyway.

New elements being introduced is talked about in 5.1 What have we learned:

"Students suffered from a “directionality” problem: students’ property-based testing failed to notice new elements introduced in the output that were not in the input. This is reasonable from an operational perspective: while a buggy implementation could easily duplicate an input element in the output, it would be a rare implementation whose output contained entirely new elements that were not in the input. In a specification setting, however, this is a dangerous detail to overlook, because it would (for instance) give a synthesis algorithm freedom to generate an implementation that the specifier almost certainly did not want. Therefore, while this phenomenon is not widespread in this student population on these problems, its occurrence is still noteworthy, and the transition from property-based testing to full-on specification would need to address this."

Synthesis here, I think, means a code-generating tool; for example, with the Coq theorem prover you can generate code. The rest of this paper gives you hints for the future assignments like Oracle.

Lecture 4 Big-O

Watching the Fri 9/14/18 lecture on performance here.

@23:33 he is arbitrarily counting any operation as a single unit of time except the recursive loop; we'll see in a few minutes that none of these small constants really matter in worst-case analysis. Using length as an example: for the empty case (when the algorithm terminates) there are 4 steps, and for every other case there are 5 steps. Length loops k times, the size of the input. So we have 5k + 4 as our running time function, or [k -> 5k + 4].

@39:43 [k -> 5k + 4] is [k -> 5k] or [k -> k], suppressing unimportant details, because as he elaborates, once k (the input) gets huge its growth is really just k, or linear.

@46:00 there exists some constant c such that for all values of k: f(k) \(\le\) c * g(k). The f(k) on the left is a model of your running time, t(k) = 5k + 4; the c * g(k) on the right is an arbitrary function you choose from a list of existing math functions that closely bounds f(k).

Example:

  • f(n) = 5n + 4 and cg(n) = n
  • What if c = 10 and we have 100 inputs
    • 5(100) + 4 \(\le\) 10(100)
    • Right side cg(n) already has bounded f(n)
    • Our running time function is in O(g(n)) or O(n)
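You can check the bound numerically; c = 10 with g(n) = n is one workable choice, not the only one:

```python
# f(n) = 5n + 4 stays below c*g(n) = 10n for every n >= 1.
f = lambda n: 5 * n + 4
print(all(f(n) <= 10 * n for n in range(1, 10000)))  # True
```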

DCIC Predicting Growth

This chapter is one of the best in the book because it gives you everything for basic big-O analysis, but the course/book assumes you will go on to an algorithm analysis book later so it can't cover everything.

Reading DCIC. The arbitrary counting he did in the lecture, giving a unit of time to operations, is a little different in the book, where he adds some more constants to certain operations; of course none of these small constants matter in big-O analysis. The tabular method only works for programs that are purely conditionals, like our cases-on-list programs. The 'question' or predicate is the left-hand part of the => in cases on List; 'the answer' is the returned result on the right-hand side of =>.

The exercise in the tabular method asks why it is justified to ignore the recursive call: because really the entire program is the body of a loop, so whatever is in the body you multiply by how many times you loop/recurse.

The exercise in 13.8 is about big-theta function equality. Big-theta is defined as: cg(x) \(\le\) f(x) \(\le\) dg(x), where both c and d are arbitrary constants. This means if there exists a constant factor where g(x) is less than or equal to f(x), and a constant factor where g(x) is greater than or equal to f(x), then they grow at the same speed, i.e. they are equal. Reflexivity means something equals itself: taking c = d = 1 gives f(x) \(\le\) f(x) \(\le\) f(x), so f(x) = f(x).

In 13.9, O(max(F,G)) is also how you do analysis on sequential programs in imperative programming: you analyze blocks of code and take whatever the max block cost is. If there are loops then, as we learned in the lecture, you simply multiply the number of times you loop by whatever operations are in the body; if it's recursive code, the entire function is the loop body. If a loop calls another loop, you multiply their big-Os together.

13.10 Solving Recurrences: c0 or c1 is the notation for the constant in the base case or empty case. Try to make your own geometric interpretations for all the recurrences, like the one in the book for \(k^2\); for example, T(n/2) + c on an input of 8 has height 3, or log(8) (all logs are base 2).

All the recurrences in the book have an accompanying video somewhere on YouTube, such as this guy's playlist. I'm going to do it completely differently, exploiting asymptotics by filling in some values for k to find the big-O. Here I am doing what you should do: just entering values and playing around to see what happens. The value of T(k) is everything that came before it, T(k - 1), plus whatever it costs at k. When you do this formally you can just assign T(0), or whatever the base case is, a cost of 1 or 0.

  • T(k) = T(k - 1) + c
    • T(1) = T(1 - 1) + c, i.e. T(0) + c or c0 + c
    • T(2) = T(2 - 1) or T(1) + c
    • T(3) = T(3 - 1) or T(2) + c

T(3) is T(2) + c, which is T(1) + c + c, which is c0 + c + c + c, for a total of 3c + c0, i.e. kc or O(k). Note that T(k=3) is 3c or kc because k is whatever number is inside T(k). T(5) would be 5c, T(100) is 100c, T(n) is nc.

  • T(k) = T(k - 1) + k
    • T(1) = T(0) + k or c0 + k
    • T(2) = T(1) + k
    • T(3) = T(2) + k

T(3) total is k + k + k + c0 or 3k, which is 3 * 3 since T(k=3), so \(O(k^2)\). If k = 4 then T(4) = 4k or 4 * 4.

  • T(k) = T(k/2) + c
    • T(2) = c1 + c
    • T(4) = T(2) + c or 2c
    • T(8) = T(4) + c or 3c

T(k=4) is c(log 4) and T(k=8) is c(log 8) or O(log k).

  • T(k) = T(k/2) + k

Same as above except each level adds the input's own cost: T(k) = k + k/2 + k/4 + …, which is bounded by 2k, just a constant factor, so O(k)

  • T(k) = 2T(k/2) + k (here plug the actual input into the + k term at each level, with T(1) = 0)
    • T(2) = 2T(1) + 2 or 2
    • T(4) = 2T(2) + 4 or 8
    • T(8) = 2T(4) + 8 or 24
      • 24 = 8(log 8), 8 = 4(log 4), 2 = 2(log 2)
      • so T(k) is k(log k)
      • O(k * log k)

The 2 in 2T(k/2) doubles whatever the recursive call produces, i.e. two half-sized subproblems at every level. If you didn't get the book version it's the substitution method explained here.

  • T(k) = 2T(k - 1) + c
    • T(1) = 2T(0) + c, or c if we treat c0 as 0
    • T(2) = 2T(1) + c or 3c
    • T(3) = 2T(2) + c or 7c
    • T(4) = 2T(3) + c or 15c

T(4) is c(\(2^4 - 1\)); in general T(k) is c(\(2^k - 1\)) or O(\(2^k\)).
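The plug-in-values game above is easy to script. Here's a quick Python sketch (my own, not course code) that unrolls each recurrence with every constant set to 1 and T(0) = T(1) = 0, so you can confirm the growth rates numerically:

```python
def t1(k):  # T(k) = T(k - 1) + c   -> about k, O(k)
    return 0 if k <= 1 else t1(k - 1) + 1

def t2(k):  # T(k) = T(k - 1) + k   -> about k^2 / 2, O(k^2)
    return 0 if k <= 1 else t2(k - 1) + k

def t3(k):  # T(k) = T(k / 2) + c   -> floor(log2 k), O(log k)
    return 0 if k <= 1 else t3(k // 2) + 1

def t4(k):  # T(k) = T(k / 2) + k   -> about 2k, O(k)
    return 0 if k <= 1 else t4(k // 2) + k

def t5(k):  # T(k) = 2T(k / 2) + k  -> k * log2(k), O(k log k)
    return 0 if k <= 1 else (2 * t5(k // 2)) + k

def t6(k):  # T(k) = 2T(k - 1) + c  -> 2^(k - 1) - 1, O(2^k)
    return 0 if k <= 1 else (2 * t6(k - 1)) + 1
```

For example t5(1024) is 10240, exactly 1024 * log2(1024), while t6(20) is already 524287.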

Induction (Optional)

The sizes of algorithm inputs are natural numbers: an algorithm has 1 input or n inputs, never half an input or a negative-sized input. This means we can use induction to write proofs about algorithms.

Induction was explained by Aristotle in Posterior Analytics and Prior Analytics. In one of the books he says induction (epagoge) is a progression from particulars to a universal. This can mean either forming a concept by abstracting a specific case to a general case, or a process of propositional inference where each step infers the next. Aristotle then writes that induction can claim to have gone through all particular instances, not because it has actually enumerated every single one, but through the realization that all instances share the same observed property as an essential property, so it de facto enumerates all instances. To go back to his original definition of induction being a progression from particulars to a universal, he means the observation of essential properties that do not change in any instance, leading to a general concept (hypothesis), provides the foundation for a statement's universality, not the pile of inferences. Aristotle then in the Topics explains how you would identify such properties. Today the interpretation of induction is propositional inference that leads to a 'creative moment of illumination' where you perform an inductive hypothesis, but if you want to see it done without any kind of magic then read Aristotle.

Math induction

Mathematical induction is recursion. Prove some property holds for a base case of 1 input or empty, assume it holds for any arbitrary n input, prove it holds for n + 1 inputs.

Imagine you have tried append on pairs of lists of arbitrary length n and every time you have correctly deduced the length of the appended list. This is why in a proof by induction you can assume the property for n: you've effectively performed countless experiments/examples/tests where it worked for any n-sized input. To prove it works for n + 1 inputs, use list.first as length 1 and list.rest as length n; then the length of list.first linked onto list.rest is n + 1.

Theorem: length(append(x, y)) = length(x) + length(y)

For the empty or base case we fix y and 'induct' on x:

  • Left hand side of theorem:
    • length(append(empty, y))
    • = length(y)
    • = 0 + length(y)
  • Right hand side of theorem:
    • length(empty) + length(y)
    • = 0 + length(y)
    • = length(y)

Assume: length(append(x.rest, y)) = length(x.rest) + length(y)

Anytime in our proof we see something in the form above, we can substitute one side for the other, since we assumed they are equal. Now we prove the original theorem for the n + 1 case, which is the entire list x:

  • length(append(x, y)) = length(x) + length(y)
  • length(append(link(x.first, x.rest), y)) [expand x into its first and rest]
  • = length(link(x.first, append(x.rest, y))) [the same expression, by the definition of append]
  • = 1 + length(append(x.rest, y)) [linking one element onto a list adds 1 to its length]

In the last rewrite, linking x.first contributes exactly 1 to the length. Now we have the remaining term in the form of our assumption, so we can substitute:

  • 1 + (length(x.rest) + length(y)) [by assumption]
  • = length(link(x.first, x.rest)) + length(y) [fold the 1 back in as x.first]
    • this is the same as length(x) + length(y) and matches the right-hand side of the theorem
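In the spirit of examples-first, the theorem is also cheap to test on random inputs before (or after) proving it; a small Python sketch, since Pyret's append corresponds to Python's list +:

```python
import random

def check_append_length(trials=200):
    # length(append(x, y)) should equal length(x) + length(y) for any x, y
    for _ in range(trials):
        x = [random.randint(0, 9) for _ in range(random.randint(0, 20))]
        y = [random.randint(0, 9) for _ in range(random.randint(0, 20))]
        if len(x + y) != len(x) + len(y):
            return False
    return True
```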

Proof of bounds

In the lecture he said the class didn't need to know this yet as it would be taught in an algorithms analysis course but we can try some examples.

Prove T(k) = T(k - 1) + c is O(n)

  • Claim: For all k \(\ge\) 1, T(k - 1) + c \(\le\) ck
    • this is big-O definition f(n) \(\le\) cg(n)
  • Base case: k = 1 and T(1) = T(0) + c, a constant, which is \(\le\) ck for a big enough c
  • Assume: for arbitrary k, T(k) \(\le\) ck
  • Prove T(k + 1) \(\le\) c(k + 1)
    • T(k - 1 + 1) + c \(\le\) ck + c
    • T(k) + c \(\le\) ck + c
    • ck + c \(\le\) ck + c [rewrite by assumption]
    • O(k)

Prove T(k) = T(k - 1) + k is \(O(n^2)\)

  • Claim: For all k \(\ge\) 1, T(k - 1) + k is \(\le\) kk
  • Base case: k = 1 and T(1 - 1) = k \(\le\) kk
    • If k = 1 then 1 \(\le\) \(1^2\)
  • Assume: for arbitrary k, T(k) \(\le\) kk
  • Prove T(k + 1) \(\le\) (k + 1)(k + 1)
    • T(k - 1 + 1) + k \(\le\) \((k + 1)^2\)
    • T(k) + k \(\le\) \((k + 1)^2\)
    • kk + k \(\le\) \((k + 1)^2\) [rewrite by assumption]
    • kk + k \(\le\) kk + 2k + 1 [since k \(\le\) 2k + 1 the inequality still holds]
    • \((k + 1)^2\) \(\le\) \((k + 1)^2\)
    • \(O(k^2)\)
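The claim can also be checked numerically before trusting the algebra; a short Python sketch assuming T(0) = 0:

```python
def quadratic_bound_holds(n=1000):
    # T(k) = T(k - 1) + k with T(0) = 0 gives the triangular numbers;
    # the proof's claim is T(k) <= k * k for all k >= 1
    t = 0
    for k in range(1, n + 1):
        t = t + k
        if t > k * k:
            return False
    return True
```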

'Stronger' induction

There's a variant called complete or strong induction where you assume all previous inputs up to n hold the property, then this implies n + 1 also holds this property.

Complete induction algorithm:

  • base case not usually needed since we are assuming everything holds up to n but can help with establishing bounds.
  • for all n \(\ge\) base case, if x holds this property: for all (base case) \(\le\) x \(\lt\) n then n also holds this property.

Let's try proving T(k) = T(k/2) + c is O(log k).

To make this proof easier, assign the constant in T(k/2) + c a concrete number; since it's a constant it doesn't matter which (we can pick any number), so pick c = 4. Since T(1) is another constant we can pick T(1) = 1. We will derive the real constant later.

  • Claim: For all n \(\ge\) 2, T(n) \(\le\) clog(n)
  • Base case: (n = 2): T(2/2) = T(1) + 4 = 5
    • Temporarily pick 5 for c, so T(2) \(\le\) 5log(2); the exact value doesn't matter
    • 2 was chosen since log(0) is undefined and log(1) is 0 in log2
  • Assume: Let n \(\ge\) 3. For all 2 \(\le\) k \(\lt\) n assume T(k) \(\le\) clog(k)

Our strategy at this point is to rewrite the argument into the form xn/y where x/y is a rational constant less than 1. We want to arrive at the form clog(xn/y), so by log laws we get clog(n) + clog(x/y) + 4, and since x/y \(\lt\) 1 the clog(x/y) term is negative and can absorb the + 4. Since this is an algorithm you can also assume the inputs are increasing.

  • Recall let n \(\ge\) 3
  • T(n) = T(n/2) + 4 since n \(\ge\) 2 (base case)
  • \(\le\) clog(n/2) + 4
    • 3/2 is less than 2, from our assumption: 3/2 \(\le\) n/2 \(\lt\) n
  • \(\le\) clog(\(\frac{n + 1}{2}\)) + 4 since log is increasing and n/2 \(\le\) (n + 1)/2
  • \(\le\) clog(\(\frac{2n}{3}\)) + 4 since log is increasing and \(\frac{n + 1}{2}\) \(\le\) \(\frac{2n}{3}\) for n \(\ge\) 3
    • Works because (n + 1) \(\le\) 2n implies log(n + 1) \(\le\) log(2n)
    • We get the desired form and can use log laws
  • = clog(n) + clog(2/3) + 4
    • For the clog(2/3) + 4 term to vanish we need c \(\ge\) \(\frac{4}{-log_{2}(2/3)}\)

Then 4/(-log(2/3)) is ~7. We can go back and change 5log(n) in our base case to 7log(n) and the proof is done:

  • 7log(n) + 7log(2/3) + 4
  • = 7log(n) - 4.09 + 4
  • \(\le\) 7log(n) since -4.09 + 4 \(\lt\) 0
    • so T(k/2) + c is O(log n)

Although we changed the c in T(k/2) + c to be 4 (and at the base case temporarily 5), you can see now it doesn't really matter what it is: you can always scale the constant multiplying \(log_{2}(2/3)\) to erase the other constant.
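The finished proof is worth sanity-checking numerically; a Python sketch with the proof's choices c = 4, T(1) = 1, and the derived constant 7:

```python
import math

def t_log(n):
    # T(n) = T(n / 2) + 4 with T(1) = 1, the constants chosen above
    return 1 if n <= 1 else t_log(n // 2) + 4

def log_bound_holds(limit=2 ** 16):
    # the claim: T(n) <= 7 * log2(n) for every n >= 2
    return all(t_log(n) <= 7 * math.log2(n) for n in range(2, limit + 1))
```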

Lecture 5 iSort analysis

We're watching lecture 5, the Mon 9/17/18 lecture. The first 30 minutes or so of the recording has no video, but you can pause the video around 30:14 in, open the video again to hear the audio, and see everything on the board they are working on (sort and insert).

For the if-statement you take the worst-complexity branch, and for dependent programs, like sort calling insert, you do the analysis on insert first. A link operation link(f, r) takes a constant amount of time regardless of the size of the input list, because it doesn't make a copy of the list.

On the right hand of the board:

  • Tinsert(0) = \(c_0\) (constant steps for empty, or base case)
  • Tinsert(k) = c + Tinsert(k - 1)
  • = c + c + c + … \(c_0\)
  • = kc + \(c_0\)

Sort analysis:

  • Tsort(0) = \(c_0\) or a constant amount of steps
  • Tsort(k) = Tsort(k - 1) + Tinsert(k - 1) + \(c_0\)
  • = Tsort(k - 1) + kc
  • We've seen this already in the book T(k - 1) + k is the series 1 + 2 + 3 + …+(k - 1) + k or n(n + 1)/2 or \(O(n^2)\)
  • Insertion Sort is \(O(n^2)\) worst case, and O(n) + O(1) in the best case because you still have to call sort(r) and do linear work exploding that n sized list out before you can do the O(1) insert operations.
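Both cases are visible if you count the element shifts insert performs; a Python sketch of insertion sort (mine, not the course's Pyret version) with a shift counter:

```python
def insert(x, sorted_list, counter):
    # walk the already-sorted list until x fits; count each shifted element
    out = []
    i = 0
    while i < len(sorted_list) and sorted_list[i] < x:
        counter[0] = counter[0] + 1
        out.append(sorted_list[i])
        i = i + 1
    return out + [x] + sorted_list[i:]

def isort(lst):
    counter = [0]
    acc = []
    for x in lst:
        acc = insert(x, acc, counter)
    return acc, counter[0]
```

An already-sorted input is the worst case here (every insert walks to the end, 0 + 1 + … + (n - 1) shifts), while a reverse-sorted input needs no shifts at all.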

Lab: Big-O 1

As mentioned in lecture there are 2 Big-O labs. Let's try the first; see the bottom of the page here.

The first lab, task 1: We can disregard the 5 and the x in g(x) = \(5x^2 + x\) because the definition includes any constant multiplied by g(x), and we are bounding growth, so \(x^2\) bounds both x and \(x^2\).

Task 2: Fill in the table for link(_, r) => 1 + len(r). The underscore binding means you aren't using that selector, as the first of the list is never used in length; we only use r. It doesn't matter exactly how you score the running time of constant operations: the given answer is 12k + 4 total for the whole function, I get 9k + 4 total and don't know where the other 3 constant operations came from. Either way it's still O(k).

Task 3: Look at the table directly above these exercises and make your best deduction.

  • 1: Every loop we are calling O(log j), so k * log j or O(n log n)
  • 2: Sequentially running linear functions is O(n + n + …) or just O(n)
  • 3: k * m * m or \(O(n^3)\)
  • 4: Every loop we are calling remove, which is linear, so this is n * n or k * m
  • 5: Linear + Linear or O(n)
  • 6: Multiply the loop body by k, so k(k + k * k) or \(O(n^3)\)? Exponential time is a constant doubling of the amount of work, whereas this is like calling member and insertion sort on every element, so it's unlikely to be exponential, but I can't really tell from the wording.
  • 7: Every loop it calls a linear function, so k * k

Task 4: Find the closed form.

  • T(k) = T(k - 2) + 1
    • Anything less or equal to 1 is c0 or assume 0
  • T(2) = T(0) + 1 or 0 + 1
  • T(3) = T(1) + 1 or 0 + 1
  • T(4) = T(2) + 1 or 2
  • T(5) = T(3) + 1 or 2
  • T(6) = T(4) + 1 or 3
  • T(8) = T(6) + 1 or 4
  • 1, 1, 2, 2, 3, 3, 4, 4…
    • Use oeis.org: looks like it's floor(n/2), or integer division n // 2 in programming languages.
  • T(k) = 2T(k - 1) + 1
    • Video solution here
    • If you expand this it's \(2^{k-1}-1\)
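Both closed forms are quick to confirm numerically; a Python sketch assuming base cases T(0) = T(1) = 0:

```python
def t_skip(k):
    # T(k) = T(k - 2) + 1, which steps down by twos: floor(k / 2)
    return 0 if k <= 1 else t_skip(k - 2) + 1

def t_double(k):
    # T(k) = 2T(k - 1) + 1, which doubles each step: 2^(k - 1) - 1
    return 0 if k <= 1 else (2 * t_double(k - 1)) + 1
```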

Task 5: Find the recurrence/closed form for delete. First figure out the complexity of remove, which linearly goes through an entire list and removes an element if it exists, so worst case it's O(n) as it does n comparisons. Then you go through delete, and it seems to me it's T(k - 1) + k, where the + k is that remove function being called every single loop and not shrinking with the input size. This is of course the k -> \(k^2\) case from the book and the 5.1 Solving Recurrences example in this lab.

For the extra recurrences there are videos here, but you aren't expected to know all this yet, just enough to recognize certain patterns of program behavior and their common recurrences as described in the book.

Assignment 4

This assignment, data scripting, is more practice that mirrors the quick, basic tasks a typical developer has to write. If for any reason the assignments go down, use the Wayback Machine/archive.org here. This is due before the next lecture in the actual course.

There's a paper for this assignment too, how palindrome needs a data cleaning solution (remember the second lecture/Rainfall problem) and how many other problems could benefit from reshaping data. Here's the paper from sci-hub to see how they would implement most of these problems and what students had difficulties with.

Lecture 6 Quicksort

We're watching lecture 6 titled Wed9/19/18. The fine details of the program trace in the beginning aren't important; he is explaining the structure of linear k -> k vs quadratic k -> \(k^2\).

He ends the lecture trying to convey a bigger idea: use the structure of the data to come up with a simple solution, confirm the inefficient solution is correct, then optimize it according to the bound you want.

He also remarks that their quicksort implementation is really a filter, so let's try quicksort using filter:

fun combine(lt, p, gt):
  lt.append([list: p]).append(gt)
end

fun qsort(l):
  cases (List) l:
    | empty => empty
    | link(pivot, r) =>
      combine(
        qsort(r.filter(lam(n): n < pivot end)),
        pivot,
        qsort(r.filter(lam(n): n >= pivot end)))
  end
end
check:
  qsort([list: 2, 1, 3, 4, 5]) is [list: 1, 2, 3, 4, 5]
  qsort([list: 100, 0, 1, 343]) is [list: 0, 1, 100, 343]
  qsort([list: 0, -1, 100]) is [list: -1, 0, 100]
  qsort([list: ]) is ([list: ])
  qsort([list: 8, 1, 1, 9, -1]) is [list: -1, 1, 1, 8, 9]
end

Lab: Big-O 2

Second lab here. In the lab is another link to slides here about the lower bound on all comparison sort algorithms: you can never do better than O(n log n). There are non-comparative sorting algorithms like radix sort that can beat that time. It's surprising this is given in an intro course; you usually learn it in an algorithms course much later.

Task: Minimum possible runtimes; you're only supposed to think about why your answer is true. For example, to sum all elements in a list you have to look at every element no matter what, so it's going to be linear at minimum. The second task: sum the first list, then sum the second, and add both totals: O(n) + O(n), which is still O(n). The third task: you linearly go through one list, and at each step do linear work over the entire second list, so n * n. It would look like:

fun product(l1, l2):
  cases (List) l1:
    | empty => 0
    | link(f, r) =>
      l2.foldl(lam(x, acc): acc + (f * x) end, 0) + product(r, l2)
  end
where:
  product([list: 1, 2], [list: 1]) is 3
end

The task says \(a^b\): for [list: 1, 2] and [list: 1] that's \(1^1 + 2^1 = 3\). But try it with [list: 2, 3] and [list: 3]: \(2^3 + 3^3\) doesn't equal (2 * 3) + (3 * 3). Let's say there's a typo or I misinterpreted the problem; you'd still have to go through one entire list, and at each step go through another entire linear list, so n * n. Again, if you stick around we will find out the fastest possible runtimes of every graph/matrix multiplication etc. algorithm using all kinds of reduction techniques.

Task: undo-sol is k * (complexity of undo-helper). undo-helper appears to be linear: each loop you do constant work, then when finished you reverse the list, so linear + linear or O(n + n), which is O(n). So undo-sol is k * k.

Can we make sols-silly-flipper linear? Try some examples:

With reversing the list:
1, 2, 3, 4, 5, 6
1, 6, 5, 4, 3, 2
1, 6, 2, 3, 4, 5
1, 6, 2, 5, 4, 3
1, 6, 2, 5, 3, 4

Without reversing the list:
1, 2, 3, 4, 5, 6
1] 6, 2, 3, 4, 5 
1, 6] 2, 3, 4, 5 linked the elt that was moved
1, 6, 2] 3, 4, 5 linked next elt
1, 6, 2, 5] 3, 4 linked the elt that was moved
1, 6, 2, 5, 3, 4] linked next elt

You can use a combo of last, push and take to do this with another function passing the length to the linear flipper, which decrements it every time you recurse or do a take operation dropping elements. Then it's linear + linear, which is O(n).
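The 'without reversing' trace above is just alternating take-from-front / take-from-back. As a sanity check of the idea (not the lab's Pyret solution, which has to thread the length along since Pyret lists lack cheap back access), here's a Python sketch with a deque:

```python
from collections import deque

def flip(lst):
    # alternate taking from the front and the back; each pop is O(1),
    # so the whole pass is linear
    d = deque(lst)
    out = []
    from_front = True
    while d:
        out.append(d.popleft() if from_front else d.pop())
        from_front = not from_front
    return out
```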

The formal writeup section, try it yourself: my-reverse is linear. In unique there's a fold that tests whether the current element is in the accumulator (linear) and, if not, returns the accumulator and moves to the next element. The call to unique does linear work before giving my-reverse a duplicate-free input, then my-reverse performs linear work on the input, so this is O(n + n) or O(n).

Assignment 5

Let's build a testing oracle. There's a starter template we need in order to access the functions they give us, matchmaker() and is-hire():

provide *
provide-types *

import shared-gdrive("oracle-support.arr",
  "11JjbUnU58ZJCXSphUEyIODD1YaDLAPLm") as O

##### PUT IMPORTS BELOW HERE ############


##### PUT IMPORTS ABOVE HERE ############

type Hire = O.Hire
hire = O.hire
is-hire = O.is-hire
matchmaker = O.matchmaker

# DO NOT CHANGE ANYTHING ABOVE THIS LINE

fun generate-input(num :: Number) -> List<List<Number>>:
  doc: 
  ```
  generates a list of candidates or companies for a group of size num
  ```
  empty
where:
  generate-input(0) is empty
end

fun is-valid(
    companies :: List<List<Number>>,
    candidates :: List<List<Number>>,
    hires :: Set<Hire>)
  -> Boolean:
  doc: 
  ```
  Tells if the set of hires is a good match for the given
  company and candidate preferences.
  ```
  true
where:
  is-valid(empty, empty, empty-set) is true
end

fun oracle(a-matchmaker :: (List<List<Number>>, List<List<Number>> 
      -> Set<Hire>))
  -> Boolean:
  doc: 
  ```
  Takes a purported matchmaking algorithm as input and outputs whether or
  not it always returns the correct response
  ```
  true
where:
  oracle(matchmaker) is true
end

Run the template then try to give matchmaker some inputs to see what it does:

companies =  [list: [list: 1, 0, 2], [list: 0, 2, 1], [list: 0, 2, 1]]
candidates = [list: [list: 1, 0, 2], [list: 1, 2, 0], [list: 0, 1, 2]]
# Note matchmaker writeup: it is 0 to n-1 indexed

matchmaker(companies, candidates)
>>[list-set: hire(2, 1), hire(1, 0), hire(0, 2)]

The stable matching algorithm is described here from the Algorithm Design book lecture page. We are given this algorithm in the template (note the import shared drive) we don't have to write it, only test it.

  • First round:
  • Companies all select their first preferred candidate
  • Candidates accept provisionally, if a more preferred offer is presented in next rounds they will reject this one
    • Rejected companies are put into a pool
  • Second round:
  • Companies in the pool with no match then all select their second preferred candidate
  • Candidates accept or reject upon preferences
  • X rounds:
  • Repeat, noting that no company can offer more than once to a candidate

The 2019 assignment writeup contains a few hints, such as 'be cautious about using the provided matchmaker function in your oracle as this assumes that there is only one right answer to a given instance of the stable hiring problem'. This refers to the book using satisfies in tests where you compare the output of one program against another.

I translate this to mean there could be many permutations of the stable-hiring problem which are still correct; however, the Gale-Shapley algorithm produces the exact same matching every time if the companies are making offers and candidates are rejecting. What if the roles are reversed, with candidates making offers and companies rejecting? We have 2 inputs into matchmaker but it doesn't say which one is actually making the offers and which one is rejecting. We should try altering the inputs and see what happens:

companies =  [list: [list: 1, 0, 2], [list: 0, 2, 1], [list: 0, 2, 1]]
candidates = [list: [list: 1, 0, 2], [list: 1, 2, 0], [list: 0, 1, 2]]

>>matchmaker(candidates, companies)
[list-set: hire(2,2), hire(1,0), hire(0,1)]

>>matchmaker(companies, candidates)
[list-set: hire(2,1), hire(1,0), hire(0,2)]

They're different, so there are multiple possible matchings we have to test for. I found an old book entirely on the Stable Marriage problem here; on page 8 (of the book, not the pdf page#) the check is: for each company C, look at each candidate they prefer over their current match and verify that candidate doesn't also prefer company C over their own match. Think about the properties of a stable matching: there should be n companies matched with n candidates, and there should not be any double matchings or missing matchings. The goal here is to write a program that does enough property-based testing not to be fooled by any subtle introduction of bugs, like the Sortacle assignment.
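The 'no blocking pair' check from the book can be sketched directly. Here's my reading in Python, assuming hire(company, candidate) pairs and 0-indexed preference lists with most preferred first (my interpretation, not the official oracle):

```python
def is_stable(companies, candidates, hires):
    # companies[i] / candidates[i] are preference lists, most preferred first;
    # hires maps company index -> candidate index
    match_of = {cand: comp for comp, cand in hires.items()}
    for comp, cand in hires.items():
        for preferred in companies[comp]:
            if preferred == cand:
                break  # reached the current match; nothing above it blocks
            other = match_of[preferred]
            # blocking pair: 'preferred' also ranks comp above its own match
            if candidates[preferred].index(comp) < candidates[preferred].index(other):
                return False
    return True

# the example preferences from above
companies = [[1, 0, 2], [0, 2, 1], [0, 2, 1]]
candidates = [[1, 0, 2], [1, 2, 0], [0, 1, 2]]
```

A full is-valid would also check completeness: every company and every candidate matched exactly once.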

Solutions

The solutions or what kind of tests they expected to see are all in the earlier paper we read (see 4.2 Matcher). They point out the problem is a relation meaning many correct outputs whereas a function only has a single output, so the oracle (much like sortacle) tests properties not specific values.

Lecture 7 Trees

We're watching lecture 7, the Fri 9/21/18 lecture on 2D tree-shaped data, w/guest lecturer who teaches the Logic for Systems course at Brown that we will do after this. An interesting paper from his page, authored with Prof K, that you should read is The Human in Formal Methods. In it they write about porting the method of example writing we've been doing to this domain. They also discuss a common learning mistake: students write programs before they have understood the problem, and as a result 'solve' the wrong problem, which misdirects from the learning goals.

To understand the mutually recursive datatypes, write out your own tree:

       a 
      /  \   
    b    none     
   /  \
  c    none
   \ 
    none

a has child b and none
b has child c and none
c has none
  • start with a:
  • person("a", No-children)
  • add in b:
  • person("a", child(person("b", No-children), No-children))
  • add in c:
  • person("a", child(person("b", child(person("c", No-children), No-children)), No-children))

Lecture 8 Sets

We're watching lecture 8, the Mon9/24/18 lecture on sets. This is a design lecture: given something that has no notion of order, how do you design its data structure representation? The end of this lecture, discussing the big-O complexity of both representations, points out the inefficient representation is better depending on what you care about, like appending to a log you often don't look at.

Lecture 9 BSTs

We're watching lecture 9 Wed/9/26/18. This is a light lecture just explaining the intuition behind logarithmic complexity to set up balanced binary search trees. At 7:10 he explains logarithms: "A (base-10) logarithm of a number is essentially how many digits a number has". He then explains in terms of virtual machine costs, if you have a logarithmic algorithm and there is a million new runs of that algorithm you are only paying 6x the cost.
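That digit intuition is exact up to flooring: floor(log10(n)) + 1 is the decimal digit count. A quick Python check:

```python
import math

def digits(n):
    # number of decimal digits of a positive integer, via the base-10 log
    return math.floor(math.log10(n)) + 1
```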

Assignment 6

This is a standard assignment creating a filesystem directory tree/model.

I was curious and wanted to write a method, meaning a .function() style built-in like l.length(). Pyret allows you to write methods for datatypes; here's an example for du-dir() where I wrote a .size() method:

Click to see my example of size

provide *
provide-types *

data File:
  | file(name :: String, size :: Number, content :: String)
end

data Dir:
  | dir(name :: String, ds :: List<Dir>, fs :: List<File>) with:
    method size(self :: Dir) -> Number:
      doc: "Returns combined file size, file count, and sub directory count of a Dir"
      size(self)
    end
end

fun size(d :: Dir) -> Number:
  doc: "Method .size()"

  fun fsize(fd :: Dir) -> Number:
    doc: "Returns file size, and number of files in a Dir"
    fd.fs.foldl(lam(f :: File, acc :: Number): 1 + f.size + acc end, 0)
  end

  fsize(d)
    + map(lam(x :: Dir): 1 + size(x) end, d.ds).foldl(lam(elt, acc): elt + acc end, 0)
end
check:
  a = dir("TS", [list:
      dir("Text", empty, [list: file("part1", 1, "content")])], empty)

  b = dir("Libs", [list: 
      dir("Code", empty, [list: file("hang", 1, "content"), file("draw", 1, "content")]), 
      dir("Docs", empty, [list: file("read!", 1, "content")])], empty)

  c = dir("TS", [list:
      dir("Text", empty, [list: file("part1", 1, "content"), file("part2", 1, "content"), file("part3", 1, "content")]),   
      dir("Libs", [list: 
          dir("Code", empty, [list: file("hang", 1, "content"), file("draw", 2, "content")]), 
          dir("Docs", empty, [list: file("read!", 1, "content")])], empty)],
    [list: file("read!", 1, "content")])

  d = dir("TS", [list:
      dir("Text", empty, [list: file("part1", 99, "content"), file("part2", 52, "content"), file("part3", 17, "content")]),   
      dir("Libs", [list: 
          dir("Code", empty, [list: file("hang", 8, "content"), file("draw", 2, "content")]), 
          dir("Docs", empty, [list: file("read!", 19, "content")])], empty)],
    [list: file("read!", 10, "content")])

  e = dir("TS", empty, [list: file("", 1, "")])

  a.size() is 3
  b.size() is 8
  c.size() is 19
  d.size() is 218
  e.size() is 2
end

I looked at the Pyret source on github to see how to chain multiple methods:

data Dir:
  | dir(name :: String, ds :: List<Dir>, fs :: List<File>) with:
    method size(self :: Dir) -> Number:
      doc: "Returns the size of all sub directories"
      size(self)
    end,
    method path(self :: Dir, s :: String) -> List<Path>:
      doc: "Takes a Dir, and a search string, returns the path"
      path(self, s)
    end
end

Lecture 10 Balanced Binary Search Trees

Watching lecture 10 Fri/9/28/18. We discover the entire last lecture was just trolling: it would not give us logarithmic time, in terms of successfully being able to throw away half the data, because the tree isn't balanced. This whole lecture is about understanding how you can balance a tree.

Lab: Tree Traversal

Let's try the CS19 lab for trees. The first part is just thought exercises; then the lab offers us stencil code for the BTree lab sections:

#  _____                _                                      _ 
# |_   _| __ ___  ___  | |_ _ __ __ ___   _____ _ __ ___  __ _| |
#   | || '__/ _ \/ _ \ | __| '__/ _` \ \ / / _ \ '__/ __|/ _` | |
#   | || | |  __/  __/ | |_| | | (_| |\ V /  __/ |  \__ \ (_| | |
#   |_||_|  \___|\___|  \__|_|  \__,_| \_/ \___|_|  |___/\__,_|_|

data BTree<T>:
  | mt
  | node(value :: T, left :: BTree<T>, right :: BTree<T>)
end

fun btree-in-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via an in-order traversal"
  ...
where:
  nothing
end

fun btree-pre-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via a pre-order traversal"
  ...
where:
  nothing
end

fun btree-post-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via a post-order traversal"
  ...
where:
  nothing
end

First make a tree, run it, and type 'a-tree' into the Pyret interpreter; click the output to see the right and left branches, or use a-tree.left or a-tree.right.

a-tree =
  node(5,
    node(4, 
      node(3, mt, mt), mt),
    node(6, node(7, mt, mt), node(8, mt, mt)))

>> a-tree.left
node(4, node(3, mt, mt), mt)

>> a-tree.right
node(6, node(7, mt, mt), node(8, mt, mt))

>> a-tree.value
5

The lab wants us to write various traversals. Here's what I wrote for btree-in-order; the other two orders are the same shape, just move the node's value before (pre-order) or after (post-order) the recursive calls:

data BTree<T>:
  | mt
  | node(value :: T, left :: BTree<T>, right :: BTree<T>)
end

a-tree =
  node(5,
    node(4, 
      node(3, mt, mt), mt),
    node(6, node(7, mt, mt), node(8, mt, mt)))

# degenerate tree from DCIC book
b-tree = node(1, mt,
  node(2, mt,
    node(3, mt,
      node(4, mt, mt))))

fun btree-in-order<A>(tree :: BTree<A>) -> List<A>:
  doc: "Returns the elements of tree in a list via an in-order traversal"
  cases (BTree) tree:
    | mt => empty
    | node(v, l, r) =>
      btree-in-order(l) + link(v, btree-in-order(r))
  end
where:
  btree-in-order(a-tree) is [list: 3, 4, 5, 7, 6, 8]
  btree-in-order(b-tree) is [list: 1, 2, 3, 4]
end

Another lab where we rewrite map/filter/fold. When you see Prof K's lecture on knowledge transfer you'll understand why he makes us do this over and over. This is my first attempt at map; I wrote it like he writes software on the board in class, thinking through the data definition:

data BTree<T>:
  | mt
  | node(value :: T, left :: BTree<T>, right :: BTree<T>)
end

a-tree =
  node(5,
    node(4, 
      node(3, mt, mt), mt),
    node(6, node(7, mt, mt), node(8, mt, mt)))

# degenerate tree from DCIC book
b-tree = node(1, mt,
  node(2, mt,
    node(3, mt,
      node(4, mt, mt))))

fun btree-map<A, B>(f :: (A -> B), tree :: BTree<A>) -> BTree<B>:
  doc: "Recursively applies f to the value of every node contained in tree"

  cases (BTree) tree:
    | mt => mt
    | node(v, l, r) =>
      node(f(v), btree-map(f, l), btree-map(f, r))
  end
where:
  btree-map(lam(x): x + 1 end, a-tree) is node(6, node(5, node(4, mt, mt), mt), 
    node(7, node(8, mt, mt), node(9, mt, mt)))
  btree-map(lam(x): x + 1 end, b-tree) is node(2, mt, node(3, mt, node(4, mt, node(5, mt, mt))))
end

>>btree-map(lam(x): x * x end, a-tree)
node(25, node(16, node(9, mt, mt), mt), node(36, node(49, mt, mt), node(64, mt, mt)))

Note that for btree-fold the lab wants you to convert the tree to a list and apply f to every element to produce a value, just like regular fold does, with a warning about complete testing, meaning every behavior of regular fold should be replicated here.
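That flatten-then-fold description is easy to prototype outside Pyret; a Python analogue using (value, left, right) tuples for nodes and None for mt (my sketch, not the lab's stencil):

```python
def btree_fold(f, base, tree):
    # flatten the tree with an in-order walk, then fold f over the list
    def in_order(t):
        if t is None:
            return []
        v, l, r = t
        return in_order(l) + [v] + in_order(r)
    acc = base
    for v in in_order(tree):
        acc = f(acc, v)
    return acc

# the same a-tree as above, as nested tuples
a_tree = (5, (4, (3, None, None), None), (6, (7, None, None), (8, None, None)))
```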

Assignment 7

Reading the assignment for updater, the Cursor datatype can also contain another cursor which points to the node above:

data Cursor<A>:
  | mt-cursor
  | cursor(above :: Cursor<A>, index :: List<Number>, point :: Tree<A>)
end

The index represents the pointed-to node's position in the parent's child list, i.e. [list: 0]. This means you can fold through all the cursor.above indexes to build a chain of .get() or .set() calls:

import lists as L

data Tree<A>:
  | mt
  | node(value :: A, children :: List<Tree>)
end

data Cursor<A>:
  | mt-cursor
  | cursor(above :: Cursor<A>, index :: List<Number>, point :: Tree<A>)
end

# test tree:
#        1
#       / \
#      2   5
#     / \   \
#    4   3   6 
#         \   \
#         3.5  7

test-tree = 
  node(1, [list:
      node(2, 
        [list: node(4, empty), node(3, [list: node(3.5, empty)])]),
      node(5, 
        [list: node(6, [list: node(7, empty)])])])

# left side
test-tree.children.get(0)

# right side
test-tree.children.get(1)

# get node 3.5
test-tree.children.get(0).children.get(1).children.get(0)

# produces a list of indices from root to cur.point ex: [list: 0,1,0] 
fun get-index(cur :: Cursor):
  L.foldl(lam(_, x): get-index(cur.above) + [list: x] end, empty, cur.index)
end

# produces tree.children.get(n).tree.children.get(n)..
fun get-nodes(tree, l):
  L.fold(lam(t, x): t.children.get(x) end, tree, l)
end

check "map n produces [list: 0, 1]":
  get-nodes(test-tree, map_n(lam(n, _): n end, 0, test-tree.children)) 
    is node(3, [list: node(3.5, empty)])
end

Check the top node: if it matches then cursor(mt-cursor, empty, entire-tree), otherwise send the children list of the root node into a helper function that keeps track of the parent node. When pred(f.value) is true, find the position of your cursor point in the parent's child list, extract this number as an index, and build the cursor with:

cursor(find-cursor(entire-tree, lam(x): x == parent-node.value end),
  index-of-point-node, point-node)

The index can also just be a number you test to see if it's 0, meaning trying to move left should raise an error as per the assignment. For the depth-first search requirements you'll have to figure out a scheme to return the leftmost or topmost node when there are multiple matches; you can get-index all the matching cursors if you want, or design find-cursor to only search as the requirements specify. This program can be implemented in multiple different ways; I came up with a few before trying to figure out the most efficient one.

get-node-val returns type Option (see the DCIC book chapter on sets): an option wraps the response in some(v). Read the documentation for cases on Option and the some(a) => syntax. Including an underscore in cases means you don't plan on using that field:

# this returns some(v) which you extract with .value
fun get-node-val<A>(cur :: Cursor<A>) -> Option<A>: 
  cases(Cursor) cur:
    | cursor(_,_, point) => some(point.value) 
    | mt-cursor => none
  end
end

This is one of the best assignments: when you get it working you can zip along the tree in constant time, using option types as your protection against get-ing a non-existent value. Many languages, like OCaml, have option types.
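As a rough sketch of the same idea outside Pyret (the Some/NoOpt names here are made up for illustration, not any library's API), a lookup returns a wrapped value instead of raising on failure:

```python
# Hypothetical Some/NoOpt wrappers sketching the Option idea:
# a failed lookup returns NoOpt() instead of raising.
from dataclasses import dataclass
from typing import Any, List, Union

@dataclass
class Some:
    value: Any

@dataclass
class NoOpt:        # stands in for Pyret's `none`
    pass

def get_child(children: List[Any], i: int) -> Union[Some, NoOpt]:
    # like guarding children.get(i) with an Option
    if 0 <= i < len(children):
        return Some(children[i])
    return NoOpt()
```

The caller then has to decide what a missing child means, instead of crashing deep inside a tree walk.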

Week 6 Accelerated Intro to CS

Watching lecture 11 Mon/10/01/18 about streams. In the ones example, every time you invoke 'ones' you get lzlink(1, <thunk>), which is one plus a way to get more ones if called again. We're asked to write our own map; you could also try it with cases:

fun maps<A, B>(f :: (A -> B), s :: Lz<A>) -> Lz<B>:
  cases (Lz<A>) s:
    | lzlink(e, r) => lzlink(f(e), {(): maps(f, rst(s))}) 
  end
end

fun maps2<A, B, C>(f :: (A, B -> C), s1 :: Lz<A>, s2 :: Lz<B>) -> Lz<C>:
  cases (Lz<A>) s1:
    | lzlink(e, r) => 
      cases (Lz<B>) s2:
        | lzlink(ee, rr) => lzlink(f(e, ee), {(): maps2(f, rst(s1), rst(s2))})
      end
  end
end

>>s2 = maps(lam(x): x - 1 end, nats)
>>take(s2, 5)
[list: -1, 0, 1, 2, 3]

>>s3 = maps2(lam(x, y): x + y end, nats, nats)
>>take(s3, 5)
[list: 0, 2, 4, 6, 8]

Lab: Streams

The streams lab is here, mostly adapted from the DCIC book. We already wrote map and map2 in the lecture/book, try to write your own filter/fold for streams:

# This is all in the book/lecture on streams
data Stream<T>:
  | lz-link(h :: T, t :: ( -> Stream<T>))
end

rec ones = lz-link(1, lam(): ones end)

fun nats-from(n :: Number):
  lz-link(n, {(): nats-from(n + 1)})
end

nats = nats-from(0)

fun lz-first<T>(s :: Stream<T>) -> T: s.h end
fun lz-rest<T>(s :: Stream<T>) -> Stream<T>: s.t() end

fun take<T>(n :: Number, s :: Stream<T>) -> List<T>:
  if n == 0:
    empty
  else:
    link(lz-first(s), take(n - 1, lz-rest(s)))
  end
end

# Lab assignments
fun lz-fold<A, B>(f :: (A, B -> A), base :: A, s :: Stream<B>) -> Stream<A>:
  lz-link(f(base, lz-first(s)), lam(): lz-fold(f, f(base, lz-first(s)), lz-rest(s)) end) 
end

fun lz-filter<A>(f :: (A -> Boolean), s :: Stream<A>) -> Stream<A>:
  if f(lz-first(s)):
    lz-link(lz-first(s), lam(): lz-filter(f, lz-rest(s)) end)
  else:
    lz-filter(f, lz-rest(s))
  end
end

>> take(10, lz-filter(lam(x): if x > 10: true else: false end end, nats))
[list: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

>> take(10, lz-fold(lam(x, acc): acc + x end, 0, nats))
[list: 0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
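The same lab sketched in Python, to show there's nothing Pyret-specific going on: a stream is a head plus a thunk, and filter/fold force the rest only as needed. This is my own translation, not course code.

```python
# A stream is (head, thunk); calling the thunk yields the next stream.
def lz_link(h, t):
    return (h, t)

def lz_first(s):
    return s[0]

def lz_rest(s):
    return s[1]()

def nats_from(n):
    return lz_link(n, lambda: nats_from(n + 1))

def take(n, s):
    out = []
    while n > 0:
        out.append(lz_first(s))
        s = lz_rest(s)
        n -= 1
    return out

def lz_filter(pred, s):
    while not pred(lz_first(s)):      # skip heads that fail the predicate
        s = lz_rest(s)
    kept = s
    return lz_link(lz_first(kept), lambda: lz_filter(pred, lz_rest(kept)))

def lz_fold(f, base, s):
    acc = f(base, lz_first(s))        # stream of running accumulations
    return lz_link(acc, lambda: lz_fold(f, acc, lz_rest(s)))
```

take(10, lz_filter(lambda x: x > 10, nats_from(0))) gives [11, ..., 20], matching the Pyret output above.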

Assignment 8

The continued fractions assignment.

You only have to use repeating-stream for phi and e, though I wrote a decimal-to-continued-fraction algorithm I got from here on page 13 and my own inefficient e generator anyway. When you write your repeating-stream function for e it has to be property-based tested, since with a stream you can only take x values at a time.


fun c-frac(n :: Number) -> Stream<Option<Number>>:
  doc: "Consumes decimal, returns some(fraction integral) stream"
  integral = num-truncate(n)
  decimal = n - integral
  ask:
    | decimal > 0 then:
      new-decimal = 1 / decimal
      lz-link(some(integral), {(): c-frac(new-decimal)})
    | otherwise:
      lz-link(some(integral), {(): mt-stream})
  end
where:
  cf-e :: Stream<Option<Number>> = c-frac(generate-e(30))
  cf-phi :: Stream<Option<Number>> = c-frac((1 + num-sqrt(5)) / 2)

  take(cf-e, 6) is [list: some(2), some(1), some(2), some(1), some(1), some(4)]
  take(cf-phi, 3) is take(repeating-stream([list: some(1)]), 3)
  take(c-frac(2.875), 4) is [list: some(2), some(1), some(7), none]
  take(c-frac(PI), 5) is [list: some(3), some(7), some(15), some(1), some(292)]
end

# see wikipedia it is the sum of 1/k! where k starts at 0 and keeps going
# 1/0! + 1/1 + 1/(1*2) + 1/(1*2*3) ... where 1/0! is 1.
fun generate-e(n :: Number) -> Number:
  doc:"Generates e (inefficiently) to n decimal places"

  fun e-helper(current-count):
    factorial = take(nats-from(1), current-count)
      .foldr(lam(acc, x): x * acc end, 1)
    lz-link(1 / factorial, {(): e-helper(current-count + 1)})
  end

  take(e-helper(1), n).foldl(lam(acc,x): acc + x end, 1)
where:
  epsilon = 0.0000001
  num-abs((generate-e(10) - 2.7182818)) < epsilon is true

  wikipedia-value = 2.71828182845904523536028747135266249775724709369995
  silly-epsilon = 0.00000000000000000000000000000000000000001
  num-abs(generate-e(60) - wikipedia-value) < silly-epsilon is true
end

check "property-based-test-e":  
  # Property-based testing
  # Any frac stream of e, if integral > 1 it should be even:
  # ie [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1...]

  fun e-test(e-value, size-of-approx):
    take(c-frac(e-value), size-of-approx)
      .map(lam(x): if x.value > 1: num-modulo(x.value, 2) == 0 else: 0 end end)
      .filter(lam(y): not(y == 0) end)
      .foldl(lam(z, acc): z and acc end, true)
  end

  e-test(generate-e(10), 10) is true
  e-test(generate-e(20), 20) is true
  e-test(generate-e(30), 30) is true
  # This is false because we need a bigger e generated
  e-test(generate-e(20), 30) is false
end

The ask check decimal > 0 was to avoid division by zero, and to terminate finite rational streams with none. PI is a Pyret built-in constant.
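The same integral-part/reciprocal loop can be sketched in Python with exact rationals (my own version, for positive inputs only, using Fraction in place of Pyret's exact numbers):

```python
from fractions import Fraction

def c_frac(x, limit=10):
    # truncate to the integral part, recurse on 1/decimal, stop when exact
    terms = []
    for _ in range(limit):
        integral = x.numerator // x.denominator   # truncation for positive x
        terms.append(integral)
        decimal = x - integral
        if decimal == 0:                          # finite rational expansion
            break
        x = 1 / decimal
    return terms
```

c_frac(Fraction(23, 8)) gives [2, 1, 7], matching the 2.875 test above (minus the terminating none).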

The goal of this assignment is to produce a program that can make refined approximations:

Click to see my version of c-approx

# First attempt without Options

fun c-approx(c-list :: List<Number>) -> Number:
  doc: "Consumes list of continued fractions, returns rational approximation"

  cases (List) c-list.rest:
    | empty => c-list.first
    | link(f, r) => (1 / c-approx(c-list.rest)) + c-list.first
  end
where: 
  # phi test of approximations
  cf-phi :: Stream<Number> = c-frac((1 + num-sqrt(5)) / 2)

  c-approx(take(cf-phi, 3)) is 3/2
  c-approx(take(cf-phi, 4)) is 5/3
  c-approx(take(cf-phi, 5)) is 8/5
  c-approx(take(cf-phi, 6)) is 13/8

  # PI test of approximations
  c-approx(take(c-frac(PI), 1)) is 3
  c-approx(take(c-frac(PI), 2)) is 22/7
  c-approx(take(c-frac(PI), 3)) is 333/106
  c-approx(take(c-frac(PI), 4)) is 355/113 
end

The algorithm for this is just hand stepping the examples in the assignment writeup:

  • c-approx([list: 3, 7, 15])
    • (1 / c-approx([list: 7, 15])) + 3
    • (1 / ((1 / c-approx([list: 15])) + 7)) + 3
    • (1 / ((1 / 15) + 7)) + 3
    • (1 / (106/15)) + 3
    • (15/106) + 3
    • 333/106
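Those hand steps fold the term list from the right, which is easy to check in Python with exact rationals (my own sketch, not the assignment's code):

```python
from fractions import Fraction

def c_approx(terms):
    # a0 + 1/(a1 + 1/(a2 + ...)): fold the term list from the right
    acc = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        acc = a + 1 / acc
    return acc
```

c_approx([3, 7, 15]) gives Fraction(333, 106), the same convergent as the hand-stepped example.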

There are many other stream functions left for you to write, like fraction-stream, which takes a stream as a parameter so you can use take() inside the body of fraction-stream to keep grabbing more input:

Click to see my implementation of fraction-stream

# you can use a map over a stream here too

fun fraction-stream(n :: Stream<Number>) -> Stream<Number>:
  doc: "Consumes stream of continued fraction integrals, returns stream of rationals"

  fun helper(integrals, take-ctr):
    next-approx = take(integrals, take-ctr)
    lz-link(c-approx(next-approx), {(): helper(n, take-ctr + 1)})
  end
  helper(n, 1)
where:
  take(fraction-stream(repeating-stream([list: 1])), 3) is [list: 1, 2, 3/2]
end

The assignment has you redoing everything with option types, meaning doing cases on Option to safeguard against division by zero and the many other problems the assignment writeup notes. This is another very good assignment, showing how you can use a stream anywhere in a program by passing it around as a function.

Lecture 12 Model View Controllers

This is lecture 12 Wed10/3/18. You can't see the entire whiteboard in the beginning but it doesn't matter; this is a lecture about separation of concerns when building large systems, and how to embody the essence of a system in a model that is common to all parts of the system.

If you combine this lecture with the streams we just learned, you now know web programming.

Lecture 13 Differential Systems

Watching lecture 13 Fri10/05/18. Using same reactor model to make a point about differential systems. Interesting examples about PID controllers/reactive systems, event loops, checkpoints of the state of the system.

Week 7 Accelerated Intro to CS

Our next assignment is JoinLists, but first there's a talk by Guy L. Steele here from 2009 under Supplementary Material; you can watch it, or type the DOI into sci-hub to read the paper the JoinLists assignment was derived from. You can skip through the talk and watch whatever is interesting.

  • Big message
    • Effective parallelism uses trees
    • User-defined associative combining operators are good
    • MapReduce is good (we build MapReduce ourselves after this)
    • 'Accumulator' paradigm is obsolete, use divide-and-conquer
    • The last 50 years of programming tricks should be thrown out

Locality is mentioned @11:45 and explained here at 44:07: a program is likely to use data or instructions near memory locations it has recently used, like referencing array memory locations in succession (the stride-1 pattern).

In this Guy L Steele lecture, cons is the list constructor (we used it week 1 during placement), similar to link(first, rest); car means first and cdr means rest. His split function takes a lambda/functional as a parameter, like map or filter does. Whenever you hear the term 'catamorphism' (category theory) in this talk, think of a fold function on lists, or reducing some structure to a single value.

There's some slides using Fortress lang, the function signature syntax is similar to Pyret. Example reductionFilter():

Fortress:
reductionFilter[E](p: E -> Boolean, xs:List[E]):List[E] = 
 ... body
end

Pyret:
fun reductionFilter<A>(f :: (A -> Boolean), xs :: List<A>)-> List<A>:
 ... body
end

These are only examples for the larger subjects of the talk; for instance @34:29, notice the unshuffle algorithm is similar to our JoinLists assignment's datatype containing a number. He chopped up the data and mapped it into a richer dataspace, did all the computations there, then projected it back.

Associativity of these operators means that if you run (r, 2) concat (5), it's the same as (r) concat (2, 5). A 'monoid' is a term for an algebra satisfying the monoid laws: you have some type<A>, an associative operator like concat or + that combines two of them into a single type<A>, and an identity value. In math an identity for natural numbers is n times 1, since multiplying anything by one gives you back that same number. In Pyret every list ends in an empty list, and if you take [list: 1, 2] and link it to empty, you get back [list: 1, 2], another identity.
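A minimal check of the monoid laws, using Python lists with concatenation as the combining operator and the empty list as the identity:

```python
def concat(a, b):
    return a + b          # associative combining operator

identity = []             # the monoid's identity element

x, y, z = [1], [2], [3]
assoc_holds = concat(concat(x, y), z) == concat(x, concat(y, z))
identity_holds = concat([1, 2], identity) == [1, 2]
```

Both checks hold, which is exactly what lets a framework regroup the combines however it likes.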

He quickly goes through a huge amount of slides about string splitting algorithms, but the summary we are interested in is at 51:00 and at 53:49.

  • sums, loops and list constructors are abstractly the same thing:
    • generate a bunch of singletons
    • map over them applying some function
    • combine the results ie: generate and reduce

If you combine the results with associative operators then you can take advantage of parallelism. The rest of the talk brings up some interesting ideas: since function composition is associative, (f . g) . h = f . (g . h), you can create thousands of composed functions that transform your data in steps, and with parallelism the order in which this is done doesn't matter thanks to associativity. He talks about how arrays are unsuitable for most of this since they rely on address arithmetic, whereas trees are much more efficient. Accumulators of any kind are to be avoided if you are doing multithreaded/parallel algorithms.
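Associativity of composition is easy to demonstrate (note it is a different property from commutativity; swapping the order of the functions does change the result):

```python
def compose(f, g):
    return lambda x: f(g(x))

f = lambda x: x + 1
g = lambda x: x * 2
h = lambda x: x - 3

left = compose(compose(f, g), h)    # (f . g) . h
right = compose(f, compose(g, h))   # f . (g . h)
# both compute f(g(h(x))), so the grouping is irrelevant
```

compose(f, g) and compose(g, f) generally differ, which is why only the grouping, not the order, is up for grabs.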

Assignment 9

From the assignment writeup, use the template where join-list(), JoinList.join() and split() are already provided as libraries for us to use, along with some other testing functions. The writeup wants us to use split() for everything if possible. From Guy L Steele's talk, we want to generate a bunch of singletons and throw them at functions, avoiding any kind of accumulator if we can.

Make some test JoinLists to play around with:

size-1 = one(1)
size-2 = join-list(one(1), one(2), 2)
size-3 = join-list(join-list(one(3), one(2), 2), one(1), 2)
size-4 = join-list(one("a"), join-list(join-list(one("b"), one("c"), 2), one("d"), 2), 2)

# x gives the first element
# y gives the rest of the JoinList
split(size-3, lam(x, y): y end)
>>> one(1)

size-1.join(one("another"))
>>> join-list(one(1), one("another"), 2)

# identity operator
empty-join-list.join(one("test"))
>>> one("test")

Here's some of what I tried as an example of how to write parallel style:

fun j-first<A>(jl :: JoinList<A>%(is-non-empty-jl)) -> A:
  doc:"Returns the first element of a non-empty list"
  cases(JoinList) jl:
    | one(elt) => elt
    | join-list(_,_,_) => split(jl, lam(x, _): j-first(x) end)
  end
where:
  split(size-2, lam(x, _): j-first(x) end) is 1
  split(size-3, lam(x, _): j-first(x) end) is 3
end

# Uses the identity operator of empty-join-list
fun j-rest<A>(jl :: JoinList<A>%(is-non-empty-jl)) -> JoinList<A>:
  doc:"Returns a join list containing all elements but the first of a non-empty list"
  cases(JoinList) jl:
    | one(elt) => empty-join-list 
    | join-list(_,_,_) =>
      split(jl, lam(x, y): j-rest(x).join(y) end)
  end
where:
  j-rest(size-2) is one(2)
  j-rest(size-3) is join-list(one(2), one(1), 2)
  j-rest(size-4) is join-list(join-list(one("b"), one("c"), 2), one("d"), 2)
end

For j-nth, the logic was: return the first element automatically if n is 0, then fall into the join-list case. If the first element's length is less than or equal to the 0-based index n given as an argument to j-nth, what we're looking for isn't there, so skip the first element (which could be an entire embedded JoinList) and call j-nth again with n adjusted by subtracting x's length. This way there is no centralized accumulator/counter, and you could make 20 of these programs run concurrently to split up the work, as Guy L Steele recommended. Try hand stepping this:

fun j-nth<A>(jl :: JoinList<A>%(is-non-empty-jl), n :: Number) -> A:
  doc:"Returns the nth element (0-based index) of a JoinList"
  cases(JoinList) jl:
    | one(elt) => elt
    | join-list(_,_,_) => split(jl, lam(x, y): x-length = j-length(x)
          if (x-length <= n):
            j-nth(y, n - x-length)
          else: 
            j-nth(x, n)
          end
        end)
  end
end
check:
  size-4 = join-list(one("a"), join-list(join-list(one("b"), one("c"), 2), one("d"), 2), 2)
  j-nth(size-4, 0) is "a"
  j-nth(size-4, 2) is "c"
end

The rest of the assignment is typical exercises we've already done before, like implementing map/filter, but with reduce instead of fold this time. The writeup explains what reduce is: if you have a list [list: 1, 2, 3] then reduce with x - y would be 1 - 2 - 3.

The analysis writeup asks 'what property should the first argument to j-reduce demonstrate' and Guy L. Steele already gave us the answer: be associative. When you're manipulating a bunch of singletons and the split function is non-deterministic, it can return a different grouping of join-lists, which you will run into if you test j-reduce with the '-' (subtraction) operator.
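You can see why subtraction trips this up with a two-line check: regrouping changes the answer for '-' but not for an associative operator like '+':

```python
# '-' is not associative, so a non-deterministic split/regroup changes
# the result; '+' is safe under any grouping.
minus_left = (1 - 2) - 3      # how one split might group the work
minus_right = 1 - (2 - 3)     # another grouping of the same elements
plus_left = (1 + 2) + 3
plus_right = 1 + (2 + 3)
```

The two '-' groupings give -4 and 2, while both '+' groupings give 6.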

Assignment 10

This came out the same day as JoinLists. Before you read the assignment writeup, see an example of what MapReduce does. We are writing a model of a parallel framework. MapReduce takes two custom functions, a mapper and a reducer, plus the input. The framework splits the input to feed in parallel to many mapper functions to distribute the work (we don't have to write this). The mapper function returns (key, value) pairs; see this example from the 2016 assignment writeup for a word-count mapper:

check "simple map test":
  wc-map(tv("book1", "the words the words")) is [list: tv("the", 1), tv("words", 1), tv("the", 1), tv("words", 1)]
end

Your collector/filter would then sort these by keys and produce a list of their values: [list: tv("the", [list: 1, 1]), tv("words", [list: 1, 1])] and feed that to reduce which would return tv("the", 2) or tv("words", 2).
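The whole pipeline (map, group-by-key, reduce) fits in a few lines of Python; this is only a sequential model of the framework, with hypothetical wc_map/wc_reduce names mirroring the writeup's word counter:

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    pairs = []
    for item in inputs:
        pairs.extend(mapper(item))            # fan the input out to mappers
    groups = defaultdict(list)
    for key, value in pairs:                  # collector: group values by key
        groups[key].append(value)
    return {k: reducer(k, vs) for k, vs in groups.items()}

def wc_map(doc):
    filename, text = doc
    return [(word, 1) for word in text.split()]

def wc_reduce(key, counts):
    return sum(counts)
```

Running it on ("book1", "the words the words") gives counts of 2 for "the" and 2 for "words", matching the writeup's example.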

Here is my poorly done early draft, just to mimic the MapReduce framework and understand it better. We have no idea what the user-supplied reducer will return, so we will later have to create property-based tests:

Click to see my implementation

# CSCI0190 (Fall 2018)
provide *

import lists as L
import shared-gdrive("map-reduce-support.arr",
  "1F87DfS_i8fp3XeAyclYgW3sJPIf-WRuH") as support

type Tv-pair = support.Tv-pair
tv = support.tv
wc-map = support.wc-map
wc-reduce = support.wc-reduce

# DO NOT CHANGE ANYTHING ABOVE THIS LINE

## A. Your map-reduce definition

fun map-reduce<A,B,M,N,O>(input :: List<Tv-pair<A,B>>,
    mapper :: (Tv-pair<A,B> -> List<Tv-pair<M,N>>),
    reducer :: (Tv-pair<M,List<N>> -> Tv-pair<M,O>))-> List<Tv-pair<M,O>>:
  doc:""

  fun reduce-prepare<S,V>(distinct :: List<S>, pairs :: List<Tv-pair<S,V>>)-> List<Tv-pair<S,List<V>>>:
    doc: "Map over distinct, filter pairs for matching key, collect all their values"
    distinct.map(lam(key): tv(key, pairs.filter(lam(x): x.tag == key end).map(lam(y): y.value end)) end)
  where:
    reduce-prepare([list: "test", "this"], [list: tv("test", 1), tv("this", 1), tv("test", 1)]) 
      is [list: tv("test", [list: 1, 1]), tv("this", [list: 1])]
  end

  # fold the input feeding it to mapper and appending output into a new list
  key-pairs = input.foldl(lam(elt, acc): acc + mapper(elt) end, empty)
  # get a distinct list of keys
  distinct-keys = L.distinct(key-pairs.map(lam(x): x.tag end))
  # create list of tv(key, List<Value>) to pass to reduce
  reduce-ready = reduce-prepare(distinct-keys, key-pairs)
  # map over tv-pairs and apply reduce returning List<Tv-pair<M,O>>
  reduce-ready.map(lam(x): reducer(x) end)
end

check "map-reduce function test":
  input = [list: tv("filename", "the cars fox"), tv("filename2", "fox cars cars cars")]
  map-reduce(input, wc-map, wc-reduce) is [list: tv("the", 1), tv("fox", 2), tv("cars", 4)] 
end

I was surprised my poor implementation of MapReduce worked fine for my implementation of anagram-map/reduce:

Click to see my version of anagram-map

### B. Your anagram implementation  
  
fun anagram-map(input :: Tv-pair<String, String>) 
  -> List<Tv-pair<String, String>>:
  doc:"Returns Tv-pair list of token sorted key and that token"
  # split on spaces of tv-pair.value 
  tokens = string-split-all(input.value, " ")
  # map over the tokens, create key by sorting token
  map(lam(x): tv(string-from-code-points(string-to-code-points(x).sort()), x) end, tokens)
where:
  # tests taken from a student solution
  anagram-map(tv("words.txt", "star rats tarts")) is 
  [list: tv("arst", "star"), tv("arst", "rats"), tv("arstt", "tarts")]
end

fun anagram-reduce(input :: Tv-pair<String, List<String>>) 
  -> Tv-pair<String, List<String>>:
  doc:"Return key and filter for dupes in key value"
  tv(input.tag, L.distinct(input.value))
end

check "anagram test":
  map-reduce([list: tv("words.txt", "star rats tarts")], anagram-map, anagram-reduce) is 
  [list: tv("arst", [list: "star", "rats"]), tv("arstt", [list: "tarts"])]
  
  # will fail because order isn't exact, need to use lst-same-els built-in 
  map-reduce([list: tv("words.txt", "star rats tarts star")], anagram-map, anagram-reduce) is 
  [list: tv("arst", [list: "star", "rats"]), tv("arstt", [list: "tarts"])]
end

For the anagram mapper, since the assignment says having the same characters is what determines an anagram, we can make a key by converting each token to code points, sorting them, then converting back to a string. The nile mapper/reducer is the same; just remember all a mapper does is fire out tv(key, value) for every input, and reduce either filters for dupes, subtracts, or sums values in the form tv(key, [list: values]).
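The key trick is tiny; in Python it is just sorting the characters (my sketch of the same idea):

```python
def anagram_key(token):
    # anagrams share the same multiset of characters, so the sorted
    # characters form a canonical key
    return "".join(sorted(token))
```

"star" and "rats" both map to "arst", so they land in the same reducer group, while "tarts" maps to "arstt".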

Why would JoinLists be helpful? To split up the input to mapper like in the Wikipedia article where the idea is to split it evenly across hundreds of servers running mapper and feed them all to multiple reducers.

Lecture 14 Identical

Watching lecture Wed10/10/18. A great lecture about memory aliasing; it introduces directed acyclic graphs, directed meaning edges have one direction and acyclic meaning there are no cycles. The reading uses memory-location examples to make this concept clearer.

Lecture 15 DAG size, Monads

Watching lecture Fri10/12/18. It starts by going through a task they were given, writing examples for determining the size of a DAG we don't have access to; however, this is all in the DCIC book. If a branch is the same construction (recall identical from previous lectures/the book) then you don't count its size twice:

check "size of A node":
  A = node(1, leaf, leaf)
  size(node(2, A, A)) is 2
  size(node(1, A, A)) is 2
end
There's only 2 distinct nodes in each: node A and another wrapped around it. Check blocks can have strings, so when you run a test you know which one failed. The rest of this lecture is writing the DAG size function with the class, because you have to carry information about which nodes you've already seen. We discover this design pattern of carrying information is called monadic programming.

At the end of the lecture he walks through the code of a Pyret library he wrote, explaining the types and syntax; you can use this to look up the source yourself and read the code.

Week 8 Accelerated Intro to CS

Watching lecture 16 Mon10/15/18, a casual lecture introducing graphs and describing their properties. In fact this whole week is pretty casual, just watching lectures; he mentions the graphs lab comes later, and the queues lab is after lecture 20.

Lecture 17 Graphs II

Watching lecture 17 Wed10/17/18

His notation on the board for the student samples of implementing data Node is cut off from the recording but you can see yourself why this won't work:

# Node without key
data Node:
  | node(content :: String, lon :: List<Node>)
end

airport = node("EWR", [list: node("PBD", [list: airport])])
# doesn't work, undefined variable   

This is covered in the chapter for DAGs in the DCIC book. We want a level of indirection and that's why graph representations have node keys. This is the function signature for reach:

fun reach(g :: List<Node>, n1 :: Node, found :: (Node-> Boolean))-> Boolean:
    ...
end

The 'found' parameter is a function you supply, the same as with any higher-order function like filter, which consumes a function. For some reason this causes confusion in the class, and he explains found around the 30m mark. This is all in the DCIC book too.

Lecture 18 Graphs III

Watching lecture Fri10/19/18. The video grows to full size @4:02 you don't miss anything, the first 17:14 mins of the lecture is mainly the class working on the neighbors function contract. @28:53 the reach-3 function is in the book and he explains ormap function, which is also defined in the book. The rest of this lecture he completely walks through reach-3 answering all class questions.

Lecture 19 Queues, Weighted Graphs

Watching lecture Mon10/22/18 we're still on reach-3.

If you can't see the node numbers for the graph on the board:

   1
  /|\
 / | \
2  3  4
 \ | /
   5
 / | \
6  7  8

       (9) <-- disconnected node

He tells us every node is in one of 3 states: not yet visited (white), completely visited/cycled (black), and visited once but needing to be processed again (grey). He links map/fold/filter to traversal strategies, another reason why we did so many of those in labs. A recursive process, he explains, automatically gives you depth-first instead of breadth-first, and breadth-first gives you the shortest path to a node for free.
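A breadth-first sketch over the board's graph, in Python with a plain adjacency dict (my own example): the first time a node is reached, it was reached along a fewest-edges path, which depth-first search does not guarantee.

```python
from collections import deque

def shortest_distance(graph, start, goal):
    seen = {start}                        # white -> grey bookkeeping
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if node == goal:
            return dist
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return None                           # unreachable, like node 9

board = {1: [2, 3, 4], 2: [5], 3: [5], 4: [5], 5: [6, 7, 8]}
```

On the board's graph, node 7 is 3 edges from node 1 (through any of 2, 3, or 4 and then 5), and node 9 is unreachable.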

Week 9 Accelerated Intro to CS

Watching lecture 20 Wed10/24/18. Another casual lecture setting up the idea of a spanning tree, an auxiliary data structure you would build to traverse a graph while avoiding anything with a cycle. He's giving us many homework hints, reminding us of the Oracle assignment where there were multiple correct solutions. 'Test characteristics of the solution, not the exact solution'; in other words, property-based testing.

Prof K gives a good crash course here on peering/internet backbones and how it's similar to the idea of the lightest next available choice. Now you also know what a greedy algorithm strategy is.

Lab: Queues/Heaps

We don't have access to any of the templates in the queues lab anymore so we will make our own lab.

First read the book chapter Halloween Analysis where a queue implementation is shown.

Amortized analysis is a cost-analysis scheme in which the cost of operations over a data structure is specified under the assumption of continued sequential use. A queue element passes through 3 abstract stages: inserted, in transit, and removed, for 3n total abstract operations over an n-sized sequential run of enqueuing and dequeuing, a total charge of 3n or 3 per element on average. Suppose we assign charge 1 to enqueue and charge 2 to dequeue. Most dequeues carry credit (an overcharge), because we only need to reverse the enqueue list when the dequeue list runs out, and it's these accumulated overcharges that pay for the cost of reversing the list. The scheme works when the data structure is used over and over, not a single time.
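The two-list queue from the chapter, sketched in Python to make the charges concrete: enqueue does one cheap push, and dequeue is cheap except when the out-list is empty and the in-list must be reversed, a cost already 'paid for' by the earlier overcharged dequeues.

```python
class TwoListQueue:
    def __init__(self):
        self.ins = []            # newly enqueued items
        self.outs = []           # items staged for dequeue, front on top

    def enqueue(self, x):        # charge 1: a single push
        self.ins.append(x)

    def dequeue(self):           # charge 2: pop, plus credit for reversals
        if not self.outs:
            self.outs = list(reversed(self.ins))  # the occasional O(n) step
            self.ins = []
        return self.outs.pop()
```

Each element is pushed once, reversed once, and popped once: three touches, matching the 3-per-element charge.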

This is explained formally with induction, in an easy-to-understand manner, in Robert Harper's book Programming in SML, chapter 28.2 (along with how the scheme falls apart in the multi-threaded/parallel case unless you use a cache/memoization, which we will soon learn). The general idea is to show the charge function bounds the real cost T(n), so establishing an upper bound on the charge function establishes an upper bound on T(n).

There is another amortized scheme called the potential method. Potential is a function of the entire data structure: cheap operations build up potential, representing stored energy to pay for a future expensive operation, and once the dequeue list is empty and needs to be replaced by the reversed enqueue list, the potential resets, 'paying' out for the expensive operation. All of this is explained in this lecture from MIT's 6.046j, using a binary counter and 2-3 trees as examples of thinking about the state of the data structure and deciding the worst state, when the costliest operation is most likely to happen: a 2-3 tree of all 2-nodes, where an insert causes a chain reaction of splits up the tree.
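The binary counter from that lecture is the classic warm-up: one increment can flip many bits, but n increments flip fewer than 2n bits in total, so the amortized cost per increment is constant. A small sketch:

```python
def increment(bits):
    # bits is little-endian; returns how many bits this increment flipped
    flips = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0               # carrying clears stored 'potential' (1-bits)
        flips += 1
        i += 1
    if i == len(bits):
        bits.append(1)
    else:
        bits[i] = 1
    flips += 1
    return flips

bits = []
total_flips = sum(increment(bits) for _ in range(64))
```

Counting to 64 flips bit 0 sixty-four times, bit 1 thirty-two times, and so on: 64 + 32 + ... + 1 = 127 flips, under the 2n = 128 bound.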

To see another potential function/amortized analysis in action let's watch some of this lecture from Erik Demaine's Advanced Data Structures something you can easily complete if you finish CS19 from Brown.

Watching time travel. Up to 11:00 he is describing what Pyret gives you by default: you always have the old data structure (persistence), and any function on it returns a new version that doesn't alter the original data. @21:20 a back pointer is similar to what we did in the updater assignment, where we always had a 'pointer' to the parent. The 'mods' can just be a list where we push previous nodes, keeping a time counter somewhere so we can get() them in constant time with Option types, assuming Pyret returns .get() nodes in O(1). If it doesn't, we'll just keep his partial persistence description of only storing up to 2p back pointers in a single node, then creating a new node if there's another change and recursively updating all previous back pointers.

@32:08 this is what we came here for, to see how the amortized/potential method works. He explains it quite well: the potential function is defined as the number of modifications in the latest nodes, multiplied by some constant work. The potential charge maxes out at 2cp once you have a full node, and decreases by 2p * (constant work), or 2cp, when you generate a new node and recursively change all previous pointers. Thus the cost of those changes is 'paid for' (zero) and you are only left with c + c, constant work, over a series of operations run sequentially.

Priority Queues

Thanks to GitHub we have access to previous students' solutions, so let's look at a priority queue one of them wrote. This is from the upcoming MST assignment; cut and paste it into Pyret:

### Heaps Implementation Based on Heap Lab :)
data Edge:
  | edge(a :: String, b :: String, weight :: Number)
end

type Graph = List<Edge>

data Heap:
  | mt
  | node(value :: Edge, left :: Heap, right :: Heap)
end

test-heap = node(edge("a", "b", 1), node(edge("c", "d", 3), 
    node(edge("e", "f", 4), mt, mt), node(edge("g", "h", 5), mt, mt)),
  node(edge("c", "d", 6), node(edge("e", "f", 7), mt, mt), 
    node(edge("g", "h", 8), mt, mt)))

fun insert(el :: Edge, h :: Heap) -> Heap:
  doc: ```Takes in a PossiblePath elt and a proper Heap h produces
       a proper Heap that has all of the elements of h and elt.```
  cases (Heap) h:
    | mt => node(el, mt, mt)
    | node(v,l,r) =>
      if el.weight > v.weight:
        node(v, insert(el, r), l)
      else:
        node(el, insert(v, r), l)
      end
  end
where:
  insert(edge("a", "b", 1), mt) is node(edge("a", "b", 1), mt, mt)
  insert(edge("z", "x", 20), test-heap) is node(edge("a", "b", 1), 
    node(edge("c", "d", 6), node(edge("g", "h", 8), node(edge("z", "x", 20), 
          mt, mt), mt), node(edge("e", "f", 7), mt, mt)), 
    node(edge("c", "d", 3), node(edge("e", "f", 4), mt, mt), node(edge("g", "h",
          5), mt, mt)))
  insert(edge("A", "B", 0), node(edge("a", "b", 1), mt, mt)) is node(edge("A", 
      "B", 0), node(edge("a", "b", 1), mt, mt), mt)
end

fun remove-min(h :: Heap%(is-node)) -> Heap:
  doc: ```Given a proper, non-empty Heap h, removes its minimum element.```
  amputated = amputate-bottom-left(h)
  cases (Heap) amputated.heap:
    | mt => mt
    | node(v,l,r) =>
      reorder(rebalance(node(amputated.elt, l, r)))
  end
where:
  remove-min(test-heap) is node(edge("c", "d", 3), node(edge("c", "d", 6), 
      node(edge("e", "f", 7), mt, mt), node(edge("g", "h", 8), mt, mt)), 
    node(edge("e", "f", 4), node(edge("g", "h", 5), mt, mt), mt))
end

fun rebalance(h :: Heap) -> Heap:
  doc: ```Given a Heap h, switches all children along the leftmost path```
  cases (Heap) h:
    | mt => mt
    | node(v,l,r) => node(v,r,rebalance(l))
  end
where:
  rebalance(node(edge("c", "d", 3), node(edge("c", "d", 6), 
        node(edge("e", "f", 7), mt, mt), mt), mt))
    is node(edge("c", "d", 3), mt, node(edge("c", "d", 6), mt, 
      node(edge("e", "f", 7), mt, mt)))
  rebalance(node(edge("c", "d", 3), mt, node(edge("c", "d", 6), mt, 
        node(edge("e", "f", 7), mt, mt))))
    is node(edge("c", "d", 3), node(edge("c", "d", 6), mt, 
      node(edge("e", "f", 7), mt, mt)), mt)
end

fun get-min(h :: Heap%(is-node)) -> Edge:
  doc: ```Takes in a proper, non-empty Heap h and produces the
       minimum weight in h.```
  cases (Heap) h:
    | mt => raise("Invalid input: empty heap")
    | node(val,_,_) => val
  end
where:
  get-min(test-heap) is edge("a", "b", 1)
end

data Amputated:
  | elt-and-heap(elt :: Edge, heap :: Heap)
end

fun amputate-bottom-left(h :: Heap%(is-node)) -> Amputated:
  doc: ```Given a Heap h, produces an Amputated that contains the
       bottom-left element of h, and h with the bottom-left element removed.```
  cases (Heap) h:
    | mt => raise("Invalid input: empty heap")
    | node(value, left, right) =>
      cases (Heap) left:
        | mt => elt-and-heap(value, mt)
        | node(_, _, _) =>
          rec-amputated = amputate-bottom-left(left)
          elt-and-heap(rec-amputated.elt,
            node(value, rec-amputated.heap, right))
      end
  end
where:
  amputate-bottom-left(test-heap) is elt-and-heap(edge("e", "f", 4), 
    node(edge("a", "b", 1), node(edge("c", "d", 3), mt, node(edge("g", "h", 5), 
          mt, mt)), node(edge("c", "d", 6), node(edge("e", "f", 7), mt, mt), 
        node(edge("g", "h", 8), mt, mt))))
end

fun reorder(h :: Heap) -> Heap:
  doc: ```Given a Heap h, where only the top node is misplaced,
       produces a Heap with the same elements but in proper order.```
  cases(Heap) h:
    | mt => mt
    | node(val, lh, rh) =>
      cases(Heap) lh:
        | mt => node(val, mt, mt)
        | node(lval, llh, lrh) =>
          cases(Heap) rh:
            | mt =>
              if val.weight < lval.weight:
                node(val, node(lval, mt, mt), mt)
              else:
                node(lval, node(val, mt, mt), mt)
              end
            | node(rval, rlh, rrh) =>
              if lval.weight <= rval.weight:
                if val.weight <= lval.weight:
                  h
                else:
                  node(lval, reorder(node(val, llh, lrh)), rh)
                end
              else:
                if val.weight <= rval.weight:
                  h
                else:
                  node(rval, lh, reorder(node(val, rlh, rrh)))
                end
              end
          end
      end
  end
where:
  reorder(mt) is mt
  reorder(test-heap) is test-heap
  reorder(node(edge("a", "b", 20), node(edge("c", "d", 3), 
        node(edge("e", "f", 4), mt, mt), node(edge("g", "h", 5), mt, mt)),
      node(edge("c", "d", 6), node(edge("e", "f", 7), mt, mt), 
        node(edge("g", "h", 8), mt, mt)))) is node(edge("c", "d", 3), 
    node(edge("e", "f", 4), node(edge("a", "b", 20), mt, mt), 
      node(edge("g", "h", 5), mt, mt)), node(edge("c", "d", 6), 
      node(edge("e", "f", 7), mt, mt), node(edge("g", "h", 8), mt, mt)))
end

The term 'heap' here has nothing to do with the heap as a pool of memory a system allocates to implement dynamically sized arrays/objects; it just means a tree with an element at each node. A 'min-heap' as described here means the lowest value is at the root.
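To make that concrete, here is a small sketch in Python (the course itself uses Pyret; the tuple representation here is just for illustration) that checks the min-heap property: every parent is no bigger than its children.

```python
# A heap modeled as None (empty) or a (value, left, right) tuple.

def is_min_heap(h):
    """True when every node's value is <= the values of its children."""
    if h is None:
        return True
    v, left, right = h
    for child in (left, right):
        if child is not None and child[0] < v:
            return False
    return is_min_heap(left) and is_min_heap(right)

# The root holds the minimum; siblings need no order between them.
good = (1, (3, (4, None, None), (5, None, None)),
           (6, (7, None, None), (8, None, None)))
bad = (3, (1, None, None), None)  # a child smaller than its parent
```

The `good` tree mirrors the weights of test-heap above: the left and right children are unordered relative to each other, only parent-to-child order matters.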

Reading 4.2.2 insert, the task asks what single thing we can do to re-establish balance if, after inserting on the right, we end up with n + 1 on the right. My intuition is to just swap the sides of the tree: the right-hand side becomes the left-hand side and the balance condition is restored. The next task asks why we insert into the right subheap instead of the left. I assume it's because if the condition is left = n and right = n - 1, inserting on the left gives n + 1 on the left and n - 1 on the right, an imbalance we can't fix without linearly going through everything and reordering. Note the lab tells us we don't know which case we're in and are inserting blind every time, but we do know the balance condition, so we can guarantee every right insert plus branch swap preserves it.
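That argument can be checked mechanically. A sketch in Python (a model of the lab's insert, not the Pyret original): always insert into the right subtree, swap the sides, and assert the size invariant after every blind insert.

```python
# Tree = None | (value, left, right). Models the lab's insert:
# push into the right subtree, then swap left and right.

def size(h):
    return 0 if h is None else 1 + size(h[1]) + size(h[2])

def insert(x, h):
    if h is None:
        return (x, None, None)
    v, left, right = h
    if x > v:
        return (v, insert(x, right), left)  # keep v on top, x goes right
    return (x, insert(v, right), left)      # x takes the top, v goes right

def balanced(h):
    """size(left) equals size(right) or size(right) + 1, at every node."""
    if h is None:
        return True
    _, left, right = h
    return (size(left) in (size(right), size(right) + 1)
            and balanced(left) and balanced(right))

h = None
for x in [5, 2, 9, 1, 7, 3, 8]:
    h = insert(x, h)
    assert balanced(h)  # every right-insert-plus-swap keeps the invariant
```

The root also stays the minimum, since each insert keeps the smaller of the incoming element and the old root on top.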

Let's look at the student code:

fun insert(el :: Edge, h :: Heap) -> Heap:
  cases (Heap) h:
    | mt => node(el, mt, mt)
    | node(v,l,r) =>
      if el.weight > v.weight:
        node(v, insert(el, r), l)
      else:
        node(el, insert(v, r), l)
      end
  end
end

Indeed they swap branches: node(v, l, r) gets the insert on the right, then the sides are exchanged to give node(v, insert(el, r), l). Our next task is to understand remove-min: remove the bottom-left node and use it to replace the root, swap the children along the leftmost path to rebalance, then recursively reorder the top node.

Let's look at amputate heap:

data Amputated:
  | elt-and-heap(elt :: Edge, heap :: Heap)
end

fun amputate-bottom-left(h :: Heap%(is-node)) -> Amputated:
  cases (Heap) h:
    | mt => raise("Invalid input: empty heap")
    | node(value, left, right) =>
      cases (Heap) left:
        | mt => elt-and-heap(value, mt)
        | node(_, _, _) =>
          rec-amputated = amputate-bottom-left(left)
          elt-and-heap(rec-amputated.elt,
            node(value, rec-amputated.heap, right))
      end
  end
end

Do cases on the heap, then cases on its left subheap, and keep recurring on the left branch until the left is empty; at that point use the custom datatype elt-and-heap to return that node's value, since you must be at the last left node. Note the student also preserves the right sibling of each node visited via node(value, rec-amputated.heap, right); in the lab writeup, 14 and 21 sit beside each other.

Let's look at remove-min, reorder and rebalance operations:

fun remove-min(h :: Heap%(is-node)) -> Heap:
  amputated = amputate-bottom-left(h)
  cases (Heap) amputated.heap:
    | mt => mt
    | node(v,l,r) =>
      reorder(rebalance(node(amputated.elt, l, r)))
  end
end

Call rebalance on the heap with its root (the minimum) replaced by the amputated bottom-left element, then reorder whatever rebalance returns.

fun rebalance(h :: Heap) -> Heap:
  cases (Heap) h:
    | mt => mt
    | node(v,l,r) => node(v,r,rebalance(l))
  end
end

Rebalance swaps the two subtrees and recurses down what was the left side, so everything on the right moves to the left. At this point (per the lab writeup) we're left with 14 on top. Reorder is then called:

fun reorder(h :: Heap) -> Heap:
  cases(Heap) h:
    | mt => mt
    | node(val, lh, rh) =>
      cases(Heap) lh:
        | mt => node(val, mt, mt)
        | node(lval, llh, lrh) =>
          cases(Heap) rh:
            | mt =>
              if val.weight < lval.weight:
                node(val, node(lval, mt, mt), mt)
              else:
                node(lval, node(val, mt, mt), mt)
              end
            | node(rval, rlh, rrh) =>
              if lval.weight <= rval.weight:
                if val.weight <= lval.weight:
                  h
                else:
                  node(lval, reorder(node(val, llh, lrh)), rh)
                end
              else:
                if val.weight <= rval.weight:
                  h
                else:
                  node(rval, lh, reorder(node(val, rlh, rrh)))
                end
              end
          end
      end
  end
end

Do cases on the left and right branches from the root. If the right-hand branch is empty, compare the root with the left value and put the smaller of the two on top. If there is a right-hand branch: when the left value is <= the right value and the root is <= the left value, return the whole heap, it's already in order. Otherwise, if the root is bigger than the left value, lift the left value to the root and recursively reorder the left side with the old root pushed down. Symmetrically, if the right value is the smaller child and the root is bigger than it, lift the right value and reorder the right side.

Lecture 21 MSTs II

Watching lecture Fri 10/26/18. This class is entirely about MST/Dijkstra algorithm examples, property-based test cases you would want to use, and test cases that would be a waste of time. If you forgot any of these algorithms they're in the book in the Graphs chapter. The lesson here is that Dijkstra's shortest-path algorithm only goes off local information, so its greedy search will never work on a graph with negative weights (which violate the spec of the algorithm anyway) as it doesn't know about them (though as we'll see in the advanced algorithms course, a 2022 result handles negative weights).
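The failure is easy to reproduce with a few lines of Python (the graph here is made up for illustration). Dijkstra finalizes a node the moment it pops it; a negative edge discovered afterwards comes too late:

```python
import heapq

def dijkstra(graph, src):
    """Textbook Dijkstra over {node: [(neighbor, weight), ...]}.
    Assumes non-negative weights; nodes are final once popped."""
    dist = {src: 0}
    done = set()
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)  # u is finalized: a greedy, local decision
        for v, w in graph.get(u, []):
            if v in done:
                continue  # too late to improve a finalized node
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# True shortest distances: b = 1 (a->c->b), d = 2 (a->c->b->d).
g = {"a": [("b", 2), ("c", 5)], "c": [("b", -4)], "b": [("d", 1)]}
# dijkstra(g, "a") reports b = 2 and d = 3: "b" was finalized
# before "c" (and its negative edge) was ever explored.
```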

Assignment 12

This assignment comes out after lecture 19 and isn't due until lecture 25. He explains this algorithm in lecture 19 around 24 mins using a queue with breadth-first traversal, and in lecture 20 talks about Manhattan distance: movement along paths you can actually travel, not a direct line like Euclidean distance. The implementation template still works and we have access here. Run that template and try the following commands in the repl:

  • brown-university-landmarks
  • brown-university-landmarks.get("Blueno")
  • brown-university-landmarks.get("Blueno").distance(brown-university-landmarks.get("The-Rock"))
  • brown-university-tours

First task is Dijkstra's algorithm from the book and lectures. Given a name in the graph, construct a set of paths to every place in the graph that is the shortest total distance from the starting place. You could try a standard linear crawl of the entire graph, get the distance of every place from the starting place, and store them in a string-dict as per the assignment writeup. These weights then drive a normal greedy shortest-path search where we map/iterate over the string-dict keys and construct the shortest-path set to every location. So far we have linear crawl + linear map * the usual Dijkstra auxiliary min-heap, which is O(n + n log n), or just O(n log n).
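A sketch of that pipeline in Python (place names invented; the assignment's real template is Pyret): Dijkstra with a predecessor map, producing the full shortest path to every reachable place.

```python
import heapq

def shortest_paths(graph, start):
    """Returns {place: shortest path from start, as a list of names}."""
    dist = {start: 0}
    prev = {}
    done = set()
    pq = [(0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u  # remember the cheapest way in
                heapq.heappush(pq, (nd, v))

    def path_to(v):
        p = [v]
        while v != start:
            v = prev[v]
            p.append(v)
        return list(reversed(p))

    return {v: path_to(v) for v in dist}

# hypothetical campus graph; names and distances are made up
g = {"Blueno": [("The-Rock", 1), ("Ratty", 4)],
     "The-Rock": [("Ratty", 2)]}
```

Here the direct Blueno-to-Ratty edge (4) loses to the two-hop route through The-Rock (1 + 2 = 3), so the path recorded for Ratty goes through The-Rock.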

Second task: we feed the tours into the same Dijkstra's algorithm, but this time we start at a point, so we must use the distance function on Points. Feed all the tour places into the same algorithm, then begin at the place closest to the starting point. Then repeat what we did in the first algorithm: get a list of shortest distances and store them in a StringDict, or calculate as you go, visiting one point, getting the points of every closest node on the tour(s), and continually taking the shortest distance. If you remember the paper we read earlier about this course, 5.6 Other Examples of PBT notes 'students implement and test Dijkstra's algorithm; a graph may admit multiple equally-short paths between two vertices', and the assignment writeup notes there can be 2 or more places at the same point.
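The greedy order of stops can be sketched separately from the path-finding (Python, with a toy one-dimensional 'campus'; dist stands in for shortest-path distance):

```python
def greedy_tour(stops, start, dist):
    """Visit every stop, always walking to the nearest unvisited one."""
    order = [start]
    remaining = set(stops) - {start}
    current = start
    while remaining:
        nearest = min(remaining, key=lambda s: dist(current, s))
        order.append(nearest)
        remaining.remove(nearest)
        current = nearest
    return order

# toy example: places on a line, distance is the absolute difference
coord = {"a": 0, "b": 5, "c": 2, "d": 9}
tour = greedy_tour(["b", "c", "d"], "a",
                   lambda x, y: abs(coord[x] - coord[y]))
# from a(0): nearest is c(2), then b(5), then d(9)
```

Ties (two places at the same point) mean several tours can be equally valid, which is exactly why the assignment's oracle must accept any minimal choice rather than one fixed answer.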

Student solution I found online:


import string-dict as SD
import lists as L
type StringDict = SD.StringDict
string-dict = SD.string-dict


### Campus Tour Helper
fun campus-tour-helper(current :: Name, to-visit :: Set<Name>,
    dijk-res :: Traverse, graph :: Graph) -> Path:
  doc:```consumes the current node, the nodes to visit, the dijkstra results,
      and the graph; returns the path as defined in the problem statement```
  destinations = to-visit.to-list()
  cases (List) destinations:
    | empty => empty
    | link(_, _) =>
      nearest-dest = L.foldl(lam(acc, x): 
          if dijk-res.dist.get-value(x) < dijk-res.dist.get-value(acc):
            x
          else:
            acc
          end
        end, destinations.first, destinations)
      dist-to-nodes = dijkstra-helper(nearest-dest, graph, 
        [string-dict: nearest-dest, 0], 
        [string-dict: nearest-dest, [list: nearest-dest]], 
        insert(pp(nearest-dest, 0, [list: nearest-dest]), mt))

      extracted-path = dijk-res.paths.get-value(nearest-dest).reverse().rest
      extracted-path.append(campus-tour-helper(nearest-dest, 
          to-visit.remove(nearest-dest), dist-to-nodes, graph))
  end
  #|
where:
  campus-tour-helper("3", [set: "1", "4"], dijkstra-helper("3", g, 
      [string-dict: "3", 0], [string-dict: "3", [list: "3"]], 
      insert(pp("3", 0, [list: "3"]), mt)), g) is [list: "1", "2", "4"]

  campus-tour-helper("3", empty-set, dijkstra-helper("3", g, 
      [string-dict: "3", 0], [string-dict: "3", [list: "3"]], 
      insert(pp("3", 0, [list: "3"]), mt)), g) is empty |#
end

### Main functions

fun dijkstra(start :: Name, graph :: Graph) -> Set<Path>:
  doc:```consumes a start node and a graph; returns the shortest
      path to each point in the graph from the start```
  nodes = graph.names()
  if nodes.member(start):
    output = dijkstra-helper(start, graph, 
      [string-dict: start, 0], 
      [string-dict: start, [list: start]], 
      insert(pp(start, 0, [list: start]), mt))

    keys = output.paths.keys().to-list()
    list-to-set(map(output.paths.get-value(_), keys))
  else:
    empty-set
  end
end

fun campus-tour(tours :: Set<Tour>, start-position :: Point, 
    campus-data :: Graph) -> Path:
  doc:```consumes a set of tours, a start position, and campus data; returns
      the calculated campus tour as a path```
  all-stops = L.foldl(lam(acc, x): acc.union(x) end, empty-set, 
    map(_.stops, tours.to-list()))

  if all-stops == empty-set:
    empty
  else:
    graph-nodes = map(campus-data.get(_), campus-data.names().to-list())
    cases (List) graph-nodes:
      | empty => empty
      | link(_, _) =>
        start-node = L.foldl(lam(acc, x): if acc.{1} < x.{1}: acc else: 
          x end end, {graph-nodes.first.name; 
            start-position.distance(graph-nodes.first.position)}, 
          map({(x): {x.name; start-position.distance(x.position)}}, 
            graph-nodes)).{0}

        dist-to-nodes = dijkstra-helper(start-node, campus-data, 
          [string-dict: start-node, 0], 
          [string-dict: start-node, [list: start-node]], 
          insert(pp(start-node, 0, [list: start-node]), mt))
        
        if all-stops.member(start-node):
          updated-set = all-stops.remove(start-node)
          link(start-node, campus-tour-helper(start-node, updated-set, 
            dist-to-nodes, campus-data)).reverse()
        else:
          updated-set = all-stops
          link(start-node, campus-tour-helper(start-node, updated-set, 
            dist-to-nodes, campus-data)).reverse()
        end
    end
  end
end

And the helper/testing functions:


# testing graphs

q1 = place("1", point(0,0), [set: "2", "3"])
q2 = place("2", point(1,0), [set: "1", "3", "4"])
q3 = place("3", point(0,1), [set: "1", "2", "4"])
q4 = place("4", point(2,0), [set: "2", "3"])
g = to-graph([set: q1, q2, q3, q4])

r1 = place("1", point(0,0), [set: "2", "3", "5"])
r2 = place("2", point(1,1), [set: "1", "3", "4"])
r3 = place("3", point(1, -1), [set: "1", "2", "4"])
r4 = place("4", point(2, 0), [set: "2", "3", "5"])
r5 = place("5", point(3, 0), [set: "4", "1"])
g1 = to-graph([set: r1, r2, r3, r4, r5])

### dijkstra helper functions and testing functions

# return type for dijkstra-helper so that it may return both distances and paths
data Traverse:
  | trav(dist :: StringDict<Number>, paths :: StringDict<List<Name>>)
end

fun dijkstra-helper(current :: Name, graph :: Graph, 
    distances :: StringDict<Number>,
    paths :: StringDict<List<Name>>,
    queue :: Heap) -> Traverse:
  doc:```takes in the current node, graph, the distances of nodes added to
      solution, paths, and the queue of paths to check; returns the result of
      dijkstra with both distances and paths```
  cases (Heap) queue:
    | mt => trav(distances, paths)
    | node(_, _, _) =>
      curr-place = graph.get(current)
      neighbors = curr-place.neighbors.to-list()

      # add all neighbors not already in the solution to the queue
      updated-queue = L.foldl(lam(acc :: Heap, x :: Name):
          cases (Option) distances.get(x):
            | none => insert(pp(x, distances.get-value(current) + 
                  curr-place.distance(graph.get(x)), 
                  link(x, paths.get-value(current))), acc)
            | some(_) => acc
          end
        end, queue, neighbors)

      minimum = get-min(updated-queue)
      # if not in the solution add it
      cases (Option) distances.get(minimum.loc):
        | none => 
          dijkstra-helper(minimum.loc, graph, distances.set(minimum.loc, 
              minimum.distance), paths.set(minimum.loc, minimum.path), 
            remove-min(updated-queue))
        | some(_) =>
          dijkstra-helper(minimum.loc, graph, distances, paths, 
            remove-min(updated-queue))
      end
  end
where:
  dijkstra-helper("1", g, [string-dict: "1", 0], [string-dict: "1", 
      [list: "1"]], insert(pp("1", 0, [list: "1"]), mt)) is
  trav([string-dict: "1", 0, "3", 1, "2", 1, "4", 2], [string-dict: "1", 
      [list: "1"], "3", [list: "3", "1"], "2", [list: "2", "1"], "4", 
      [list: "4", "2", "1"]])

  dijkstra-helper("5", g, [string-dict: ], [string-dict: ], mt) 
    is trav([string-dict: ], [string-dict: ])
end

fun is-valid-dijk(proposed :: Set<Path>, start :: Name, graph :: Graph) 
  -> Boolean:
  doc:```consumes a proposed dijkstra solution, a start node, and a graph;
      returns whether each path is a valid solution```
  proposed-list = proposed.to-list()
  if (proposed-list.length() == 0) and 
    not(graph.names().to-list().member(start)):
    true
  else:
    if proposed-list.length() == graph.names().to-list().length():
      if not(proposed-list.length() == 0):
        #checks if all are valid paths
        all-valid = L.all(lam(x): valid-path(x, start, graph) end, 
          proposed-list)

        if all-valid:
          our-lengths = L.foldl(lam(acc, x):
              paths = all-paths(start, x, empty, graph)
              lengths = map(length-path(_, graph), paths)
              min = L.foldl(num-min, lengths.first, lengths)
              acc.set(x, min)
            end, [string-dict: ], graph.names().to-list())

          #checks all lengths
          L.all(lam(x): our-lengths.get-value(x.first) == length-path(x, 
              graph) end, proposed-list)
        else:
          false
        end
      else:
        true
      end
    else:
      false
    end
  end
where:
  is-valid-dijk([set: [list: "4", "2", "1"], [list: "3", "1"], [list: "2", "1"],
      [list: "1"]], "1", g) is true
  is-valid-dijk([set: [list: "4", "2", "1"], [list: "3", "1"], 
      [list: "2", "1"]], "1", g) is false
  is-valid-dijk([set: [list: "4", "2"], [list: "3", "1"], [list: "2", "1"],
      [list: "1"]], "1", g) is false
  is-valid-dijk([set: [list: "4", "2", "3", "1"], [list: "3", "1"], 
      [list: "2", "1"], [list: "1"]], "1", g) is false
end

fun valid-path(path :: Path, start :: Name, graph :: Graph) -> Boolean:
  doc:```given a path, a start, and a graph determines if it is a valid path```
  if path.length() == 0:
    false
  else:
    if path.length() == 1:
      path.first == start
    else:
      is-next-adj = graph.get(path.first).neighbors.member(path.get(1))
      is-next-adj and valid-path(path.rest, start, graph)
    end
  end
where:
  valid-path(empty, "1", g) is false
  valid-path([list: "1"], "1", g) is true
  valid-path([list: "2"], "1", g) is false
  valid-path([list: "4", "2", "3", "1"], "1", g) is true
  valid-path([list: "4", "1"], "1", g) is false
end

fun length-path(path :: Path, graph :: Graph) -> Number:
  doc:```consumes a path and a graph; determines the length of the path```
  cases (List) path.rest:
    | empty => 0
    | link(f, r) =>
      graph.get(path.first).distance(graph.get(f)) 
        + length-path(path.rest, graph)
  end
where:
  length-path([list: "1"], g) is 0
  length-path([list: "4", "2", "1"], g) is 2
end

fun all-paths(start :: Name, target :: Name, seen :: Path, graph :: Graph) 
  -> List<Path>:
  doc:```consumes a start node, a target node, the path seen so far, and
      the graph; returns a list of all Paths to the target from the start```
  if not(start == target):
    current-place = graph.get(start)
    neighbor-paths = map(lam(x): if seen.member(x): empty else: 
          map(link(start, _), all-paths(x, target, link(start, seen), graph)) 
      end end, current-place.neighbors.to-list())
    L.foldl(lam(acc, x): acc.append(x) end, empty, neighbor-paths)        
  else:
    [list: [list: target]]
  end
where:
  all-paths("1", "1", empty, g) is [list: [list: "1"]]
  all-paths("1", "4", empty, g) is [list: [list: "1", "2", "3", "4"], 
    [list: "1", "2", "4"], [list: "1", "3", "2", "4"], [list: "1", "3", "4"]]
end

### campus-tour helper and testing functions
fun is-valid-tour(proposed :: Path, tours :: Set<Tour>, 
    start-position :: Point, graph :: Graph) -> Boolean:
  doc:```consumes a proposed tour, and the input for campus-tour; returns 
      whether the result is a valid tour```
  all-stops = L.foldl(lam(acc, x): acc.union(x) end, empty-set, 
    map(_.stops, tours.to-list()))
  stops-left = L.distinct(L.filter({(x): all-stops.to-list().member(x)}, 
      proposed).reverse())

  if not(all-stops == empty-set) and not(proposed == empty):
    if valid-path(proposed, proposed.last(), graph):
      if all-stops.to-list().length() == stops-left.length():
        graph-nodes = map(graph.get(_), graph.names().to-list())
        distances = map({(x): {x.name; start-position.distance(x.position)}}, 
          graph-nodes)
        minimum-dist = L.foldl(lam(acc, x): if acc.{1} < x.{1}: acc else: 
          x end end, {graph-nodes.first.name; 
            start-position.distance(graph-nodes.first.position)}, distances).{1}

        all-minimal = map({(x): x.{0}}, L.filter({(y): y.{1} == minimum-dist}, 
            distances))
        start-valid = all-minimal.member(proposed.last())
        if start-valid:
          is-valid-tour-on-path(proposed.last(), L.filter(_ == proposed.last(), 
              stops-left), proposed.reverse(), graph)
        else:
          false
        end
      else:
        false
      end
    else:
      false
    end
  else:
    false
  end
where:
  tours = [set: tour("one", [set: "1"]), tour("two", [set: "4"])]
  is-valid-tour([list: "1", "2", "4", "2"], tours, point(0.6, 0.6), g)
    is true
  is-valid-tour([list: "4", "2", "1", "3"], tours, point(0.6, 0.6), g)
    is true
  is-valid-tour([list: "1", "2", "1", "3"], tours, point(0.6, 0.6), g)
    is false
end

fun is-valid-tour-on-path(current :: Name, stops-left :: Path,
    proposed :: Path, graph :: Graph) -> Boolean:
  doc:```consumes the current node, the stops left, the proposed path left
      and the graph; returns whether this is in line with the greedy algo```
  if stops-left.length() == 0:
    true
  else:
    dijk-res = dijkstra-helper(current, graph, [string-dict: current, 0], 
      [string-dict: current, [list: current]], insert(pp(current, 0, 
          [list: current]), mt))
    minimum-dist = L.foldl(lam(acc, x): 
        if dijk-res.dist.get-value(x) < acc:
          dijk-res.dist.get-value(x)
        else:
          acc
        end
      end, dijk-res.dist.get-value(stops-left.first), stops-left)
    all-minimal = L.filter({(x): dijk-res.dist.get-value(x) == minimum-dist},
      stops-left)

    if all-minimal.member(stops-left.first):
      proposed-to-next = list-up-to-elt(proposed, stops-left.first)
      if length-path(proposed-to-next, graph) == minimum-dist:
        is-valid-tour-on-path(stops-left.first, stops-left.rest, 
          proposed.drop(proposed-to-next.length() - 1), graph)
      else:
        false
      end
    else:
      false
    end
  end
where:
  is-valid-tour-on-path("3", [list: "1", "4"], [list: "3", "1", "2", "4"], g) 
    is true
  is-valid-tour-on-path("1", [list: "4"], [list: "1", "2", "4"], g) 
    is true
  is-valid-tour-on-path("1", [list: "4"], [list: "2", "4"], g) 
    is false
end


fun list-up-to-elt<A>(list-ext :: List<A>, elt :: A) -> List<A>:
  doc:```given a list and an elt; returns the list up to and including elt```
  cases (List) list-ext:
    | empty => empty
    | link(f, r) =>
      if f == elt:
        [list: elt]
      else:
        link(f, list-up-to-elt(r, elt))
      end
  end
where:
  list-up-to-elt(empty, 1) is empty
  list-up-to-elt([list: 1, 2, 3, 4, 5, 6], 4) is [list: 1, 2, 3, 4]
end

Week 10 Accelerated Intro to CS

Lecture 23 Detecting Cycles w/union-find

Watching lecture Wed 10/31/18. We skipped lecture 22, an internal Brown-specific lecture about which classes you could take after this one.

If you can't see the board @03:54, here's what he's written for the function contracts:

Sets
unions :: S x S ->
are-in-same-set :: elt x elt -> Boolean

Where x is defined as the cross product: the set of all possible ordered combinations of members from both sets, like (A, B) of type A x B. Prof K informs us this is how type inference works in many well-known compilers. @33:00 he's trying to point out that we have a graph data structure, and this data structure of sets is totally separate, an auxiliary one to help us find the minimum spanning tree using sets. @40:10 we are now learning mutation, as our programs need to have a notion of time. "I just made any parallel programs you write for the rest of your life hell".
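Those two contracts are exactly union-find. A minimal sketch in Python (a parent-pointer forest; the path-halving line is one of the optimizations the inverse-Ackermann bound comes from):

```python
class UnionFind:
    """union(a, b) merges two sets; are_in_same_set compares roots."""

    def __init__(self, elements):
        self.parent = {e: e for e in elements}  # everyone starts alone

    def find(self, e):
        while self.parent[e] != e:
            self.parent[e] = self.parent[self.parent[e]]  # path halving
            e = self.parent[e]
        return e

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

    def are_in_same_set(self, a, b):
        return self.find(a) == self.find(b)

uf = UnionFind("ABCD")
uf.union("A", "B")
uf.union("C", "D")
```

Note that union mutates the structure in place, which is exactly the point @40:10: the answer to are_in_same_set now depends on when you ask.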

Lecture 24 union-find II

Watching lecture Fri 11/2/18. Try the code yourself:

data Box: | box(ref v) end
b = box(0)
the-val = b!v 
the-ref = b.v
b!{v: 1}
# the-ref is now 1 but the-val is still 0
the-new-val = b!v
# the-new-val is now 1

The difference is that the-val and the-new-val went to a memory location and took the current value (a number), while the-ref took a kind of pointer to b.v; every time you update b.v it now updates what the-ref sees as well, since they point to the same thing. He explains a bit of jQuery in this lecture. For the last 20 mins or so we can't see the board, but he's talking about this chapter in Algorithms That Exploit State.
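The same aliasing story in Python (a loose analogy, since Python has no first-class refs; a mutable object plays the role of the ref cell):

```python
class Box:
    """A one-field mutable cell, standing in for Pyret's box with a ref."""
    def __init__(self, v):
        self.v = v

b = Box(0)
the_val = b.v    # copies the number out: a snapshot
the_ref = b      # aliases the box itself: a pointer-like handle
b.v = 1          # mutation is visible through every alias
the_new_val = b.v
# the_val is still 0; the_ref.v and the_new_val are both 1
```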

Ackermann's function is on Wikipedia and grows massively, like \(2^{2^{2^{2^{...}}}}\); the inverse Ackermann function and the paper Prof K suggested we read are covered in the first lecture of Advanced Algorithms, which we do after this.
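For a feel of the growth, here is the two-argument Ackermann-Peter function in Python (memoized so the small cases terminate quickly):

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(20000)  # recursion gets deep even for tiny inputs

@lru_cache(maxsize=None)
def ack(m, n):
    """Total and computable, but grows faster than any
    primitive-recursive function."""
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))

# ack(1, n) = n + 2, ack(2, n) = 2n + 3, ack(3, n) = 2^(n + 3) - 3,
# and ack(4, 2) already has 19,729 digits. The inverse function grows
# so slowly it is at most 4 for any input you will ever see.
```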

Assignment 13

MSTs assignment and template. Reminder from lectures 20/21: Kruskal's algorithm takes a list of edges sorted ascending by weight, chooses the lightest edge, tests that it doesn't create a cycle, adds it to our set if so, and repeats until there are no more edges to test. Prim's algorithm is similar to Dijkstra's: traverse the entire graph with a greedy strategy, always taking the lightest edge, keeping a priority queue and checking for cycles before adding that lightest edge. Detecting cycles was covered in lectures 23/24 and in the book as union-find.
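Kruskal's loop as just described, sketched in Python with union-find doing the cycle test (the (weight, a, b) edge format is my own, not the assignment's):

```python
def kruskal(nodes, edges):
    """edges: iterable of (weight, a, b) tuples, undirected.
    Returns MST edges chosen lightest-first, skipping cycle makers."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = []
    for w, a, b in sorted(edges):  # ascending by weight
        ra, rb = find(a), find(b)
        if ra != rb:               # endpoints not yet connected: no cycle
            parent[ra] = rb
            mst.append((w, a, b))
    return mst

es = [(1, "A", "B"), (4, "A", "C"), (2, "B", "C"), (5, "C", "D")]
tree = kruskal("ABCD", es)
# keeps (1, A, B) and (2, B, C); (4, A, C) would close a cycle;
# then keeps (5, C, D)
```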

In the problem statement these are undirected edges, so [list: edge("A", "B", 3), edge("B", "A", 3)] both denote the same edge. Our task, summarized: implement another testing oracle that consumes 2 MST algorithms, generates random graphs as inputs, and repeatedly tests the MST algorithms against each other's outputs as per the writeup. We can use the satisfies and violates test operators.
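Since MSTs need not be unique (ties in weight admit different trees), the oracle shouldn't compare edge sets directly. A property-based sketch of the check in Python (the (weight, a, b) edge format is invented for illustration): both outputs must be spanning trees of the same total weight.

```python
def is_valid_mst_pair(nodes, mst1, mst2):
    """Oracle check for two proposed MSTs over the same node set:
    each must be a spanning tree, and their total weights must match.
    Different edge sets are fine: MSTs are not unique under ties."""
    def is_spanning_tree(mst):
        if len(mst) != len(nodes) - 1:
            return False              # wrong edge count cannot span acyclically
        parent = {n: n for n in nodes}
        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x
        for _, a, b in mst:
            ra, rb = find(a), find(b)
            if ra == rb:
                return False          # this edge closes a cycle
            parent[ra] = rb
        return True  # n - 1 acyclic edges over n nodes: a spanning tree

    total = lambda m: sum(w for w, _, _ in m)
    return (is_spanning_tree(mst1) and is_spanning_tree(mst2)
            and total(mst1) == total(mst2))

m1 = [(1, "A", "B"), (2, "B", "C")]
m2 = [(1, "A", "B"), (2, "A", "C")]  # tie: a different, equally light tree
```

A full oracle would also confirm every proposed edge exists in the input graph; this sketch shows only the spanning-tree-plus-equal-weight property.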

Student solution:

provide *
provide-types *

include shared-gdrive("mst-support.arr", "14fGt9flbRDF3GKwiYWbIlet63fbwxczF")
include my-gdrive("mst-common.arr")

# DO NOT CHANGE ANYTHING ABOVE THIS LINE
#
# You may write implementation-specific tests (e.g., of helper functions)
# in this file.

import lists as L
import string-dict as SD

### Test Union Find
check "fynd":
  fynd(elt(0, some(elt(3, none)))) is=~ elt(3, none)
  fynd(elt(0, none)) is=~ elt(0, none)
end

check "union":
  elts = map(elt(_, none), range(0,4))
  union(elts.get(0), elts.get(3))
  elts is=~ [list: elt(0, some(elt(3, none))), elt(1, none), 
    elt(2, none), elt(3, none)]
  union(elts.get(0), elts.get(0)) 
  elts is=~ [list: elt(0, some(elt(3, none))), 
    elt(1, none), elt(2, none), elt(3, none)]
end

### Heaps Implementation Based on Heap Lab :)
data Heap:
  | mt
  | node(value :: Edge, left :: Heap, right :: Heap)
end

test-heap = node(edge("a", "b", 1), node(edge("c", "d", 3), 
    node(edge("e", "f", 4), mt, mt), node(edge("g", "h", 5), mt, mt)),
  node(edge("c", "d", 6), node(edge("e", "f", 7), mt, mt), 
    node(edge("g", "h", 8), mt, mt)))

fun insert(el :: Edge, h :: Heap) -> Heap:
  doc: ```Takes in an Edge el and a proper Heap h; produces
       a proper Heap that has all of the elements of h and el.```
  cases (Heap) h:
    | mt => node(el, mt, mt)
    | node(v,l,r) =>
      if el.weight > v.weight:
        node(v, insert(el, r), l)
      else:
        node(el, insert(v, r), l)
      end
  end
where:
  insert(edge("a", "b", 1), mt) is node(edge("a", "b", 1), mt, mt)
  insert(edge("z", "x", 20), test-heap) is node(edge("a", "b", 1), 
    node(edge("c", "d", 6), node(edge("g", "h", 8), node(edge("z", "x", 20), 
          mt, mt), mt), node(edge("e", "f", 7), mt, mt)), 
    node(edge("c", "d", 3), node(edge("e", "f", 4), mt, mt), node(edge("g", "h",
          5), mt, mt)))
  insert(edge("A", "B", 0), node(edge("a", "b", 1), mt, mt)) is node(edge("A", 
      "B", 0), node(edge("a", "b", 1), mt, mt), mt)
end

fun remove-min(h :: Heap%(is-node)) -> Heap:
  doc: ```Given a proper, non-empty Heap h, removes its minimum element.```
  amputated = amputate-bottom-left(h)
  cases (Heap) amputated.heap:
    | mt => mt
    | node(v,l,r) =>
      reorder(rebalance(node(amputated.elt, l, r)))
  end
where:
  remove-min(test-heap) is node(edge("c", "d", 3), node(edge("c", "d", 6), 
      node(edge("e", "f", 7), mt, mt), node(edge("g", "h", 8), mt, mt)), 
    node(edge("e", "f", 4), node(edge("g", "h", 5), mt, mt), mt))
end

fun rebalance(h :: Heap) -> Heap:
  doc: ```Given a Heap h, switches all children along the leftmost path```
  cases (Heap) h:
    | mt => mt
    | node(v,l,r) => node(v,r,rebalance(l))
  end
where:
  rebalance(node(edge("c", "d", 3), node(edge("c", "d", 6), 
        node(edge("e", "f", 7), mt, mt), mt), mt))
    is node(edge("c", "d", 3), mt, node(edge("c", "d", 6), mt, 
      node(edge("e", "f", 7), mt, mt)))
  rebalance(node(edge("c", "d", 3), mt, node(edge("c", "d", 6), mt, 
        node(edge("e", "f", 7), mt, mt))))
    is node(edge("c", "d", 3), node(edge("c", "d", 6), mt, 
      node(edge("e", "f", 7), mt, mt)), mt)
end

fun get-min(h :: Heap%(is-node)) -> Edge:
  doc: ```Takes in a proper, non-empty Heap h and produces the
       minimum-weight edge in h.```
  cases (Heap) h:
    | mt => raise("Invalid input: empty heap")
    | node(val,_,_) => val
  end
where:
  get-min(test-heap) is edge("a", "b", 1)
end

data Amputated:
  | elt-and-heap(elt :: Edge, heap :: Heap)
end

fun amputate-bottom-left(h :: Heap%(is-node)) -> Amputated:
  doc: ```Given a Heap h, produces an Amputated that contains the
       bottom-left element of h, and h with the bottom-left element removed.```
  cases (Heap) h:
    | mt => raise("Invalid input: empty heap")
    | node(value, left, right) =>
      cases (Heap) left:
        | mt => elt-and-heap(value, mt)
        | node(_, _, _) =>
          rec-amputated = amputate-bottom-left(left)
          elt-and-heap(rec-amputated.elt,
            node(value, rec-amputated.heap, right))
      end
  end
where:
  amputate-bottom-left(test-heap) is elt-and-heap(edge("e", "f", 4), 
    node(edge("a", "b", 1), node(edge("c", "d", 3), mt, node(edge("g", "h", 5), 
          mt, mt)), node(edge("c", "d", 6), node(edge("e", "f", 7), mt, mt), 
        node(edge("g", "h", 8), mt, mt))))
end

fun reorder(h :: Heap) -> Heap:
  doc: ```Given a Heap h, where only the top node is misplaced,
       produces a Heap with the same elements but in proper order.```
  cases(Heap) h:
    | mt => mt
    | node(val, lh, rh) =>
      cases(Heap) lh:
        | mt => node(val, mt, mt)
        | node(lval, llh, lrh) =>
          cases(Heap) rh:
            | mt =>
              if val.weight < lval.weight:
                node(val, node(lval, mt, mt), mt)
              else:
                node(lval, node(val, mt, mt), mt)
              end
            | node(rval, rlh, rrh) =>
              if lval.weight <= rval.weight:
                if val.weight <= lval.weight:
                  h
                else:
                  node(lval, reorder(node(val, llh, lrh)), rh)
                end
              else:
                if val.weight <= rval.weight:
                  h
                else:
                  node(rval, lh, reorder(node(val, rlh, rrh)))
                end
              end
          end
      end
  end
where:
  reorder(mt) is mt
  reorder(test-heap) is test-heap
  reorder(node(edge("a", "b", 20), node(edge("c", "d", 3), 
        node(edge("e", "f", 4), mt, mt), node(edge("g", "h", 5), mt, mt)),
      node(edge("c", "d", 6), node(edge("e", "f", 7), mt, mt), 
        node(edge("g", "h", 8), mt, mt)))) is node(edge("c", "d", 3), 
    node(edge("e", "f", 4), node(edge("a", "b", 20), mt, mt), 
      node(edge("g", "h", 5), mt, mt)), node(edge("c", "d", 6), 
      node(edge("e", "f", 7), mt, mt), node(edge("g", "h", 8), mt, mt)))
end

### Helper Functions ###
fun adj-list-build(graph :: Graph) -> StringDict<List<Edge>>:
  doc:```consumes a graph; returns a string dictionary with nodes as keys
      and an adjacency list as values```
  for fold(adj-list from mt-dict, curr-edge from graph):
    so-far-a = adj-list.get(curr-edge.a)
    so-far-b = adj-list.get(curr-edge.b)
    ask:
      | (so-far-a == none) and (so-far-b == none) then:
        adj-list.set(curr-edge.a, [list: curr-edge]).set(curr-edge.b, 
          [list: curr-edge])
      | so-far-a == none then:
        adj-list.set(curr-edge.a, [list: curr-edge]).set(curr-edge.b, 
          link(curr-edge, so-far-b.value))
      | so-far-b == none then:
        adj-list.set(curr-edge.b, [list: curr-edge]).set(curr-edge.a, 
          link(curr-edge, so-far-a.value))
      | otherwise:
        adj-list.set(curr-edge.a, link(curr-edge, so-far-a.value)).set(
          curr-edge.b, link(curr-edge, so-far-b.value))
    end
  end
where:
  adj-list-build(empty) is mt-dict
  adj-list-build([list: edge("a", "b", 10), edge("a", "b", 3), 
      edge("c", "a", 3)])
    is [string-dict: "a", [list: edge("c", "a", 3), edge("a", "b", 3), 
      edge("a", "b", 10)], "b", [list: edge("a", "b", 3), edge("a", "b", 10)], 
    "c", [list: edge("c", "a", 3)]]
  adj-list-build([list: edge("a", "b", 10), edge("a", "b", -2)]) 
    is [string-dict: "a", [list: edge("a", "b", -2), edge("a", "b", 10)], "b", 
    [list: edge("a", "b", -2), edge("a", "b", 10)]]
end

fun prim-helper(building :: List<Edge>, queue :: Heap, solution :: Set<String>,
    adj-list :: StringDict<List<Edge>>) -> List<Edge>:
  doc:```consumes the solution built so far, a queue, the set of reached
      nodes, and an adjacency list; returns the edges in the MST```
  cases (Heap) queue:
    | mt => building
    | node(_, _, _) =>
      next-edge = get-min(queue)
      if solution.member(next-edge.a) and solution.member(next-edge.b):
        prim-helper(building, remove-min(queue), solution, adj-list)
      else if solution.member(next-edge.a):
        new-queue = fold({(acc, x): insert(x, acc)}, remove-min(queue), 
          adj-list.get-value(next-edge.b))
        prim-helper(link(next-edge, building), new-queue, 
          solution.add(next-edge.b), adj-list)
      else:
        new-queue = fold({(acc, x): insert(x, acc)}, remove-min(queue), 
          adj-list.get-value(next-edge.a))
        prim-helper(link(next-edge, building), new-queue, 
          solution.add(next-edge.a), adj-list)
      end
  end
where:
  prim-helper(empty, node(edge("a", "b", 1), mt, mt), [set: "a"], 
    adj-list-build([list: edge("a", "b", 1)])) is [list: edge("a", "b", 1)]
  prim-helper(empty, node(edge("a", "b", -2), node(edge("a", "b", 10), mt, mt), 
      mt), [set: "a"], adj-list-build([list: edge("a", "b", 10), edge("a", 
          "b", -2)])) is [list: edge("a", "b", -2)]
end

fun graph-builder(to-add :: List<String>, solution :: List<String>) -> Graph:
  doc:```consumes nodes to add, and nodes already added; returns a random graph 
      with all nodes```
  cases (List) to-add:
    | empty => empty
    | link(f, r) =>
      sol-size = solution.length()
      edges = for fold(b from empty, idx from L.shuffle(range(0, num-random(
                sol-size) + 1))):
        for fold(build from b, idx2 from range(0, num-random(4) + 1)):
          link(edge(f, solution.get(idx), num-random(11) - 5), build)
        end
      end
      edges.append(graph-builder(r, link(f, solution)))
  end
where:
  graph-builder([list: "b", "c", "d"], [list: "a"]) 
    satisfies {(x): x.length() <= 18}
  graph-builder([list: "b", "c", "d"], [list: "a"]) 
    satisfies {(x): x.length() >= 3}
end

check "find-nodes":
  find-nodes(empty) is empty
  find-nodes([list: edge("a", "b", 1), edge("a", "b", 3), edge("b", "c", 4)])
    is [list: "c", "a", "b"]
end

### Main Functions ###

fun mst-kruskal(graph :: Graph) -> List<Edge>:
  doc:```consumes a graph; returns a minimal spanning tree using 
      Kruskal's algorithm```
  sorted-edges = graph.sort-by({(x, y): x.weight < y.weight}, 
    {(x, y): x.weight == y.weight})
  unique-nodes = find-nodes(sorted-edges)
  groups = fold({(acc, x): acc.set(x, elt(x, none))}, mt-dict, unique-nodes)
  for fold(edges from empty, curr-edge from sorted-edges):
    e1 = fynd(groups.get-value(curr-edge.a))
    e2 = fynd(groups.get-value(curr-edge.b))
    if not(e1 == e2):
      block:
        union(e1, e2)
        link(curr-edge, edges)
      end
    else:
      edges
    end
  end
end

fun mst-prim(graph :: Graph) -> List<Edge>:
  doc:```consumes a graph; returns a minimal spanning tree using
      Prim's algorithm```
  unique-nodes = L.distinct(fold({(acc, x): link(x.a, link(x.b, acc))}, empty,
      graph))
  adj-list = adj-list-build(graph)
  if graph == empty:
    empty
  else:
    queue = fold({(acc, x): insert(x, acc)}, mt, 
      adj-list.get-value(unique-nodes.first))
    prim-helper(empty, queue, [tree-set: unique-nodes.first], adj-list)
  end
end

fun generate-input(num-vertices :: Number) -> Graph:
  doc:```consumes a number; returns a random graph with that many vertices```
  if [set: 0, 1].member(num-vertices):
    empty
  else:
    nodes = map(string-from-code-point, range(97, 97 + num-vertices))
    graph-builder(nodes.rest, [list: nodes.first])
  end
end

fun mst-cmp(
    graph :: Graph,
    mst-alg-a :: (Graph -> List<Edge>),
    mst-alg-b :: (Graph -> List<Edge>))
  -> Boolean:
  doc:```consumes a graph and two mst algorithms; determines if they both
      produce a valid mst result```
  tree1 = mst-alg-a(graph)
  tree2 = mst-alg-b(graph)
  if ((tree1 == empty) and (tree2 == empty)) and (graph == empty):
    true
  else:
    graph-nodes = find-nodes(graph).sort()
    # check spanning: both solutions contain all of the graph's nodes
    spanning = (find-nodes(tree1).sort() == graph-nodes) and (find-nodes(
        tree2).sort() == graph-nodes)

    # check the tree property: the number of edges is one less than the number of nodes
    tree = (tree1.length() == (graph-nodes.length() - 1)) and (tree2.
      length() == (graph-nodes.length() - 1))

    # calculate weight by summing weights over all edges and outputting the result
    weight-calc = lam(x :: Graph): fold({(acc, y): acc + y.weight}, 0, x) end
    weight = (weight-calc(tree1) == weight-calc(tree2))

    # check all tree edges are members of the graph
    edges-cont = lam(x :: Graph): L.all(graph.member(_), x) end
    edges = (edges-cont(tree1) and edges-cont(tree2))

    edges and (weight and (tree and spanning))
  end
end
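The four properties mst-cmp checks translate almost line for line into other languages; here is a quick Python sketch of the same test (my own sketch, using (a, b, weight) tuples for edges rather than the assignment's edge data):

```python
# Sketch of the four properties mst-cmp tests, with edges as
# (a, b, weight) tuples. Names here are illustrative, not assignment code.
def nodes_of(edges):
    return {v for (a, b, _) in edges for v in (a, b)}

def mst_cmp(graph, alg_a, alg_b):
    t1, t2 = alg_a(graph), alg_b(graph)
    if not graph:
        return t1 == [] and t2 == []
    g_nodes = nodes_of(graph)
    # spanning: both trees touch every node of the graph
    spanning = nodes_of(t1) == g_nodes and nodes_of(t2) == g_nodes
    # tree: a spanning tree on n nodes has exactly n - 1 edges
    tree = len(t1) == len(g_nodes) - 1 and len(t2) == len(g_nodes) - 1
    # weight: both trees are equally cheap
    weight = sum(w for (_, _, w) in t1) == sum(w for (_, _, w) in t2)
    # inclusion: every tree edge actually comes from the graph
    edges_ok = all(e in graph for e in t1) and all(e in graph for e in t2)
    return spanning and tree and weight and edges_ok
```

Note that like the Pyret version, this compares only total weight, not the edge lists themselves, since a graph can have several equally valid MSTs.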

fun sort-o-cle(
    mst-alg-a :: (Graph -> List<Edge>),
    mst-alg-b :: (Graph -> List<Edge>))
  -> Boolean:
  doc:```consumes two mst algorithms and determines if they are valid
      by checking that their results over a large sample space agree```
  random-suite = map({(_): generate-input(num-random(15))}, range(0, 30))
  rand-res = L.all(mst-cmp(_, mst-alg-a, mst-alg-b), random-suite)
  
  edge-suite = [list: two-e-two-n, tree-graph, pos-sym-cycle, neg-sym-cycle, 
    pos-cycle, neg-cycle, disj-two-mult, disj-two-back, two-cycle-inter, 
    two-cycle-inter-amb, single-edge, two-e-two-n-same, empty]
  edge-res = L.all(mst-cmp(_, mst-alg-a, mst-alg-b), edge-suite)
  
  rand-res and edge-res
end

Their 'common code'

provide *
provide-types *

include shared-gdrive("mst-support.arr", "14fGt9flbRDF3GKwiYWbIlet63fbwxczF")

# DO NOT CHANGE ANYTHING ABOVE THIS LINE
#
# Write data bindings here that you'll need for tests in both mst-code.arr
# and mst-tests.arr

import lists as L
import string-dict as SD
type StringDict = SD.StringDict
string-dict = SD.string-dict

mt-dict = [string-dict: ]

### Union Find ###
data Elt: elt(v, ref parent :: Option<Elt>) end

fun fynd(e :: Elt) -> Elt:
  doc:```finds the source elt of an elt```
  cases (Option) e!parent:
    | none => e
    | some(p) =>
      block:
        new-parent = fynd(p)
        e!{parent: some(new-parent)}
        new-parent
      end
  end
end

fun union(e1 :: Elt, e2 :: Elt):
  doc:```unions two sets together in place```
  e1n = fynd(e1)
  e2n = fynd(e2)
  when not(e1n <=> e2n):
    e1n!{parent: some(e2n)}
  end
end
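fynd and union form the classic union-find structure: find with path compression, and union by pointing one root at the other. The same idea as a minimal Python sketch (dict-based; names are mine, not the assignment's):

```python
# Minimal union-find with path compression, mirroring fynd/union above.
# parent maps each element to itself when it is a root.
parent = {}

def find(x):
    parent.setdefault(x, x)
    if parent[x] != x:
        parent[x] = find(parent[x])   # path compression
    return parent[x]

def union(x, y):
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[rx] = ry               # point one root at the other
```

Kruskal's algorithm uses exactly this: two endpoints with the same root would form a cycle, so the edge is skipped.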

### find nodes helper

fun find-nodes(graph :: Graph) -> List<String>:
  doc:```consumes a graph; returns the unique nodes it has```
  L.distinct(fold({(acc, x): link(x.a, link(x.b, acc))}, empty,
      graph))
end

### TESTING GRAPHS

## lst-same-els implementation from map-reduce
# Tests removed as they break Examplar

# count
fun count<A>(target :: A, a :: List<A>, eq-checker :: (A, A -> Boolean))
  -> Number:
  doc: "counts quantity of target in a"
  fun el-checker(el, cnt):
    if eq-checker(el, target):
      cnt + 1
    else:
      cnt
    end
  end
  a.foldl(el-checker, 0)
end

# lst-same-els
fun lst-same-els<A>(a :: List<A>,b :: List<A>, eq-checker :: (A, A -> Boolean))
  -> Boolean:
  doc: "checks whether two lists have the same elements in the same quantity"
  fun same-count(el, acc):
    acc and (count(el, a, eq-checker) == count(el, b, eq-checker))
  end
  (a.length() == b.length()) and a.foldl(same-count, true)
end

## Special Graphs

# two edges two nodes
two-e-two-n = [list: edge("a", "b", 1), edge("a", "b", 3)]

# already a tree
tree-graph = [list: edge("a", "b", 1), edge("b", "c", 2), edge("a", "d", 1)]

# symmetric cycle
pos-sym-cycle = [list: edge("a", "b", 2), edge("b", "c", 2), edge("c", "d", 2), 
  edge("d", "a", 2)]

# symmetric neg cycle
neg-sym-cycle = [list: edge("a", "b", -2), edge("b", "c", -2), 
  edge("c", "d", -2), edge("d", "a", -2)]

# non-sym cycle
pos-cycle = [list: edge("a", "b", 2), edge("b", "c", 3), edge("c", "d", 2), 
  edge("d", "a", 2)]

# negative non-sym cycle
neg-cycle = [list: edge("a", "b", -2), edge("b", "c", -3), edge("c", "d", -4), 
  edge("d", "a", -5)]

# disjoint two, multiple possibilities
disj-two-mult = [list: edge("a", "b", 2), edge("a", "c", 2), edge("b", "d", 2),
  edge("c", "d", 2), edge("d", "e", 2), edge("d", "f", 2), edge("e", "g", 2),
  edge("f", "g", 2)]

# disjoint approach from back
disj-two-back = [list: edge("a", "b", 3), edge("a", "c", 3), edge("b", "d", 2),
  edge("c", "d", 2), edge("d", "e", 3), edge("d", "f", 3), edge("e", "g", 2),
  edge("f", "g", 2)]

# interpolate two cycles
two-cycle-inter = [list: edge("a", "b", 2), edge("a", "b", -1), 
  edge("b", "c", 2), edge("b", "c", 3), edge("c", "a", 4), edge("c", "a", 6)]

# interpolate two cycles ambiguous
two-cycle-inter-amb = [list: edge("a", "b", 2), edge("a", "b", -1), 
  edge("b", "c", 2), edge("b", "c", 3), edge("c", "a", 2), edge("c", "a", 2)]

# single edge
single-edge = [list: edge("a", "b", 0)]

# two edges two nodes same weight
two-e-two-n-same = [list: edge("a", "b", 1), edge("a", "b", 1)]


### BAD MST IMPLEMENTATIONS

# empty returner
empty-mst = {(x): empty}

# return first element
fun first-mst(graph :: Graph) -> List<Edge>:
  doc:```returns first edge if it exists```
  cases (List) graph:
    | empty => empty
    | link(f, r) => [list: f]
  end
end

# return random selection
fun random-mst(graph :: Graph) -> List<Edge>:
  doc:```returns random subset of edges```
  g-size = graph.length()
  L.shuffle(graph).take(num-random(g-size))
end

# return n-1 lowest edges
fun naive-kruskal-mst(graph :: Graph) -> List<Edge>:
  doc: ```returns the n-1 lowest edge weights```
  cases (List) graph:
    | empty => empty
    | link(f, r) =>
      g-size = graph.length()
      graph.sort-by({(x, y): x.weight < y.weight}, {(x, y): x.weight == 
          y.weight}).take(g-size - 1)
  end
end

# labeling flipped
fun flipped-kruskal-mst(graph :: Graph) -> List<Edge>:
  doc:```consumes a graph; returns a minimal spanning tree using 
      Kruskal's algorithm but reverses the edge labelling```
  sorted-edges = graph.sort-by({(x, y): x.weight < y.weight}, 
    {(x, y): x.weight == y.weight})
  unique-nodes = find-nodes(sorted-edges)
  groups = fold({(acc, x): acc.set(x, elt(x, none))}, mt-dict, unique-nodes)
  for fold(edges from empty, curr-edge from sorted-edges):
    e1 = fynd(groups.get-value(curr-edge.a))
    e2 = fynd(groups.get-value(curr-edge.b))
    if not(e1 == e2):
      block:
        union(e1, e2)
        link(edge(curr-edge.b, curr-edge.a, curr-edge.weight), edges)
      end
    else:
      edges
    end
  end
end

Their tests

include shared-gdrive("mst-support.arr", "14fGt9flbRDF3GKwiYWbIlet63fbwxczF")
include my-gdrive("mst-code.arr")
include my-gdrive("mst-common.arr")

# DO NOT CHANGE ANYTHING ABOVE THIS LINE
#
# Write your examples and tests in here. These should not be tests of
# implementation-specific details (e.g., helper functions).

import lists as L

# equality operator
eq = lam(x,y): x == y end

### MST KRUSKAL

check "mst-kruskal base case sensitivity":
  # empty test
  mst-kruskal(empty) is empty

  # singular edge
  mst-kruskal(single-edge) is single-edge
end

check "mst-kruskal multiple edge sensitivity":
  # two nodes two edges
  mst-kruskal(two-e-two-n) is [list: edge("a", "b", 1)]

  # two edges two nodes same weight
  mst-kruskal(two-e-two-n-same) is [list: edge("a", "b", 1)]

  # traverse interpolated cycles
  mst-kruskal(two-cycle-inter) satisfies lst-same-els(_, [list: edge("b", "c", 
        2), edge("a", "b", -1)], eq)
end

check "mst-kruskal properly traverses":
  # traverse a tree
  mst-kruskal(tree-graph) satisfies lst-same-els(_, tree-graph, eq)

  # traverse non-sym cycle
  mst-kruskal(pos-cycle) satisfies lst-same-els(_, [list: edge("a", "b", 2),
      edge("c", "d", 2), edge("d", "a", 2)], eq)

  # traverse non-sym negative cycle
  mst-kruskal(neg-cycle) satisfies lst-same-els(_, [list: edge("b", "c", -3), 
      edge("c", "d", -4), edge("d", "a", -5)], eq)
end

check "mst-kruskal ambiguous manually checked":
  # symmetric cycle
  mst-kruskal(pos-sym-cycle) satisfies {(x): L.any(lam(y): lst-same-els(x, y, 
        eq) end, [list: [list: edge("a", "b", 2), edge("b", "c", 2), 
          edge("c", "d", 2)],[list: edge("b", "c", 2), edge("c", "d", 2), 
          edge("d", "a", 2)]])}

  # symmetric neg cycle
  mst-kruskal(neg-sym-cycle) satisfies {(x): L.any(lam(y): lst-same-els(x, y, 
        eq) end, [list: [list: edge("a", "b", -2), edge("b", "c", -2), 
          edge("c", "d", -2)],[list: edge("b", "c", -2), edge("c", "d", -2), 
          edge("d", "a", -2)]])}

  # disjoint two possibilities approach back
  mst-kruskal(disj-two-back) satisfies {(x): L.any(lam(y): lst-same-els(x, y, 
        eq) end, [list: [list: edge("a", "b", 3), edge("b", "d", 2), 
          edge("c", "d", 2), edge("d", "e", 3), edge("e", "g", 2), 
          edge("f", "g", 2)], 
        [list: edge("a", "c", 3), edge("b", "d", 2), edge("c", "d", 2), 
          edge("d", "f", 3), edge("e", "g", 2), edge("f", "g", 2)],
        [list: edge("a", "b", 3), edge("b", "d", 2), edge("c", "d", 2),  
          edge("d", "f", 3), edge("e", "g", 2), edge("f", "g", 2)],
        [list: edge("a", "c", 3), edge("b", "d", 2), edge("c", "d", 2), 
          edge("d", "e", 3), edge("e", "g", 2), edge("f", "g", 2)]])}
end


### MST PRIM

check "mst-prim base case sensitivity":
  # empty test
  mst-prim(empty) is empty

  # singular edge
  mst-prim(single-edge) is single-edge
end

check "mst-prim multiple edge sensitivity":
  # two nodes two edges
  mst-prim(two-e-two-n) is [list: edge("a", "b", 1)]

  # two edges two nodes same weight
  mst-prim(two-e-two-n-same) is [list: edge("a", "b", 1)]

  # traverse interpolated cycles
  mst-prim(two-cycle-inter) satisfies lst-same-els(_, [list: edge("b", "c", 
        2), edge("a", "b", -1)], eq)
end

check "mst-prim properly traverses":
  # traverse a tree
  mst-prim(tree-graph) satisfies lst-same-els(_, tree-graph, eq)

  # traverse non-sym cycle
  mst-prim(pos-cycle) satisfies lst-same-els(_, [list: edge("a", "b", 2),
      edge("c", "d", 2), edge("d", "a", 2)], eq)

  # traverse non-sym negative cycle
  mst-prim(neg-cycle) satisfies lst-same-els(_, [list: edge("b", "c", -3), 
      edge("c", "d", -4), edge("d", "a", -5)], eq)
end

check "mst-prim ambiguous manually checked":
  # symmetric cycle
  mst-prim(pos-sym-cycle) satisfies {(x): L.any(lam(y): lst-same-els(x, y, 
        eq) end, [list: [list: edge("a", "b", 2), edge("b", "c", 2), 
          edge("c", "d", 2)],[list: edge("b", "c", 2), edge("c", "d", 2), 
          edge("d", "a", 2)]])}

  # symmetric neg cycle
  mst-prim(neg-sym-cycle) satisfies {(x): L.any(lam(y): lst-same-els(x, y, 
        eq) end, [list: [list: edge("a", "b", -2), edge("b", "c", -2), 
          edge("c", "d", -2)],[list: edge("b", "c", -2), edge("c", "d", -2), 
          edge("d", "a", -2)]])}

  # disjoint two possibilities approach back
  mst-prim(disj-two-back) satisfies {(x): L.any(lam(y): lst-same-els(x, y, 
        eq) end, [list: [list: edge("a", "b", 3), edge("b", "d", 2), 
          edge("c", "d", 2), edge("d", "e", 3), edge("e", "g", 2), 
          edge("f", "g", 2)], 
        [list: edge("a", "c", 3), edge("b", "d", 2), edge("c", "d", 2), 
          edge("d", "f", 3), edge("e", "g", 2), edge("f", "g", 2)],
        [list: edge("a", "b", 3), edge("b", "d", 2), edge("c", "d", 2),  
          edge("d", "f", 3), edge("e", "g", 2), edge("f", "g", 2)],
        [list: edge("a", "c", 3), edge("b", "d", 2), edge("c", "d", 2), 
          edge("d", "e", 3), edge("e", "g", 2), edge("f", "g", 2)]])}
end


### GENERATE INPUT

check "generate-input base case":
  # empty graph 0
  generate-input(0) is empty

  # empty graph 1
  generate-input(1) is empty
end

check "generate-input size":
  # check over large space
  fold({(acc1, y): link(L.distinct(fold({(acc, x): link(x.a, link(x.b, acc))}, 
            empty, generate-input(y))).length(), acc1)}, empty, range(2, 30))
    is range(2, 30).reverse()
end

check "generate-input connectedness":
  # the connected condition means that there must be >= n - 1 unique edges
  input-g = map({(_): generate-input(num-random(15) + 15)}, range(0, 10))
  dist-ed = map({(y): L.distinct(map({(x): {x.a; x.b}}, y))}, input-g)
  dist-ed satisfies L.all({(y): y.length() >= (L.distinct(fold({(acc, x): 
              link(x.{0}, link(x.{1}, acc))}, empty, y)).length() - 1)}, _)
end

check "generate-input no loops":
  # check no a -> a
  input-g = map({(_): generate-input(num-random(15) + 15)}, range(0, 10))
  input-g satisfies L.all({(y): L.all(lam(x): not(x.a == x.b) end, y)}, _)
end

check "generate-input unidirectional":
  # check that if a -> b no b -> a
  input-g = map({(_): generate-input(num-random(15) + 15)}, range(0, 10))
  dist-ed = map({(y): L.distinct(map({(x): {x.a; x.b}}, y))}, input-g)
  dist-ed satisfies L.all({(y): L.all(lam(x): not(y.member({x.{1}; x.{0}})) end,
        y)}, _)
end


### MST CMP

check "mst-cmp base case":
  # empty graph correct implementations 
  empty satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # empty graph, broken implementation but still correct here
  empty satisfies mst-cmp(_, mst-prim, empty-mst)
end

check "mst-cmp correct implementations deterministic graphs":
  # two nodes two edges
  two-e-two-n satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # two edges two nodes same weight
  two-e-two-n-same satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # traverse interpolated cycles
  two-cycle-inter satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # traverse a tree
  tree-graph satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # traverse non-sym cycle
  pos-cycle satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # traverse non-sym negative cycle
  neg-cycle satisfies mst-cmp(_, mst-prim, mst-kruskal)
end

check "mst-cmp correct implementations non-deterministic":
  # positive symmetric cycle
  pos-sym-cycle satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # negative symmetric cycle
  neg-sym-cycle satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # disjoint two parts multiple paths
  disj-two-mult satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # disjoint multiple paths with backtracking
  disj-two-back satisfies mst-cmp(_, mst-prim, mst-kruskal)

  # two cycles interpolated ambiguous
  two-cycle-inter-amb satisfies mst-cmp(_, mst-prim, mst-kruskal)
end

check "mst-cmp failing implementations but same output":
  # both return empty
  two-e-two-n violates mst-cmp(_, empty-mst, empty-mst)
  
  # both return one edge in the graph
  tree-graph violates {(x): mst-cmp(x, first-mst, first-mst)}
end

check "mst-cmp failing specific criteria":
  # makes sure mst-cmp can weed out random
  disj-two-mult violates {(x): L.all({(y): mst-cmp(x, random-mst, mst-prim)}, 
      range(0, 10))}
  
  # weights off
  pos-cycle violates mst-cmp(_, mst-kruskal, {(x): [list: edge("b", "c", 3), 
        edge("c", "d", 2), edge("d", "a", 2)]})
  
  # spanning off
  pos-sym-cycle violates mst-cmp(_, mst-kruskal, {(x): [list: edge("a", "b", 2),
        edge("b", "c", 2), edge("b", "c", 2)]})
  
  # tree off
  single-edge violates mst-cmp(_, mst-kruskal, {(x): [list: edge("a", "b", 0),
        edge("a", "b", 0)]})
  
  # inclusion off
  pos-cycle violates mst-cmp(_, mst-kruskal, {(x): [list: edge("b", "c", -5), 
        edge("c", "d", -5), edge("d", "a", 17)]})
end

### SORT-O-CLE

check "sort-o-cle both valid":
  # mst-prim and mst-kruskal should be correct
  sort-o-cle(mst-prim, mst-kruskal) is true
end

check "sort-o-cle one valid":
  # empty returner up against mst-prim
  empty-mst violates sort-o-cle(_, mst-prim)
  
  # first element
  first-mst violates sort-o-cle(_, mst-kruskal)
  
  # random
  random-mst violates sort-o-cle(_, mst-prim)
  
  # naive-kruskal
  naive-kruskal-mst violates sort-o-cle(_, mst-kruskal)
  
  # flipped edge labeling
  flipped-kruskal-mst violates sort-o-cle(_, mst-kruskal)
end

check "sort-o-cle input two wrong":
  sort-o-cle(empty-mst, first-mst) is false
  
  sort-o-cle(first-mst, random-mst) is false
  
  sort-o-cle(naive-kruskal-mst, flipped-kruskal-mst) is false
end

Week 11 Accelerated Intro to CS

Lecture 25 Closures

Watching lecture Mon 11/5/18. You can't see what he's writing off screen to show why 'equals-never' is impossible, but it's the subject of the entire next lecture. The rest of this lecture demonstrates how closures and language scoping rules work, and how they don't work the way you might expect in JavaScript/Python, so read your language's scoping rules instead of making assumptions. He talks about CS173, which we are doing after this course.
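An example of the kind of scoping surprise he means (my illustration, not from the lecture): Python closures capture variables, not values, so closures created in a loop all see the loop variable's final value unless you bind it explicitly:

```python
# Python closures capture the variable, not its value at creation time.
fns = [lambda: i for i in range(3)]
print([f() for f in fns])           # prints [2, 2, 2] - all see the final i

# Binding i as a default argument freezes the current value instead.
fns_fixed = [lambda i=i: i for i in range(3)]
print([f() for f in fns_fixed])     # prints [0, 1, 2]
```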

Lecture 26 Halting problem

Watching lecture Wed 11/7/18, an entertaining introduction to theoretical CS that expands on the previous lecture's 'can equals-never exist' question.

Lecture 31 P = NP

I moved this up since it also teaches theoretical CS.

Watch lecture Mon 11/19/18. This is a must-watch lecture that explains non-determinism and theoretical complexity theory better than anywhere else, using as examples all the oracles we wrote. If you like complexity theory, see this channel, or watch this lecture by Erik Demaine: it's a graduate class on practical applications of reductions where he shows that most decision problems (anything with a yes/no answer) are not solvable by algorithms. Complexity theory is wide open to discoveries.

If you've ever heard of regex or regular expressions, they are explained here with DFAs.
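A DFA is nothing more than a transition table plus a set of accepting states; a tiny Python sketch (my own example, not from the lecture: accept binary strings containing an even number of 1s):

```python
# DFA with two states: "even" (accepting) and "odd" count of 1s seen so far.
DELTA = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd",   ("odd", "1"): "even"}

def accepts(s, start="even", accepting=("even",)):
    state = start
    for ch in s:
        state = DELTA[(state, ch)]   # one table lookup per character
    return state in accepting
```

Running the machine is a single pass over the input, which is why regex engines built on DFAs match in linear time.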

Week 12 Accelerated Intro to CS

Lecture 27 & 28 Memoization

Watching lecture Fri 11/09/18 and lecture Mon 11/12/18, which introduce the typical assignment operators and var keywords you'd see in other languages. Probably the best intro to memoization you'll find: he describes it as a separation of concerns, explains closures and lambdas some more, and finally we learn that memoize() is really an identity function: given a function, it returns a function that computes the same thing.
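That identity-function view fits in a few lines of Python (my sketch, not course code): the returned function computes exactly what the original computes, it just remembers answers in a closed-over cache:

```python
# memoize consumes a function and returns a function computing the same
# thing, caching results in a dict captured by the closure.
def memoize(f):
    cache = {}
    def wrapped(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapped

@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Without the cache this recursion is exponential; with it, each n is computed once, so fib(30) returns instantly.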

Lecture 29 & 30 Dynamic Programming

Watching lecture Wed 11/14/18 and lecture Fri 11/16/18. Dynamic programming is the bottom-up approach to problem solving and memoization is the top-down approach, hilariously explained as 'people were afraid of recursion so invented DP'. The book covers all this in detail. If space is a concern, a memoized version is still useful, as it can be tested against your dynamic programming implementation.
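Both directions side by side in Python (my sketch): the DP version fills in values bottom-up keeping only the last two, so space is O(1), while a slow top-down version serves as the reference implementation to test it against:

```python
# Bottom-up DP for Fibonacci: iterate from the base cases, keeping
# only the last two values, so space is O(1) instead of O(n).
def fib_dp(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Slow top-down reference used only for cross-checking the DP version.
def fib_naive(n):
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)
```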

Final assignment

Fluid Images assignment here. Cut+paste the following to access the starter-code functions mentioned in the writeup, like image-data-to-image:

provide *

include shared-gdrive("fluid-images-definitions.arr", "1qYtcBi1vQUhOZSgCyqLGv8GTJaWxypaL")

import image-url,
image-width,
image-height,
image-to-color-list,
color-list-to-image
from image

import image-structs as I

fun run-on-url(url :: String, func :: (Image, Number -> Image), n :: Number):
  doc: ```Runs one of your Fluid Images implementations on an image
       downloaded from a url```
  fl-image :: Image =
    let raw-image = image-url(url),
      width       = image-width(raw-image),
      height      = image-height(raw-image),
      image-data  = image-to-color-list(raw-image):
      image-data.map({(c):color(c.red, c.green, c.blue)})
        ^ builtins.raw-array-from-list
        ^ image(width,height).make
    end
  liquified = func(fl-image, n)
  im-list = lists.fold({(acc, cur): acc.append(cur)}, empty, liquified.pixels)
  im-list-colors = im-list.map({(e): I.color(e.red, e.green, e.blue, 1)})
  color-list-to-image(im-list-colors, liquified.width, liquified.height, 0, 0)
end

Student solution to compare with your own. I include the real student solutions I found because by this point in the course they have been drilled in testing by Prof. K, so you may want to see what a thorough solution looks like rather than my own hacked-together one, as I was not as thorough as these students. Reminder: you have to use the above code to import the shared gdrive:

import string-dict as SD
type StringDict = SD.StringDict
string-dict = SD.string-dict

### DATA DEFINITIONS
data SearchRet: ret(weight :: Number, path :: List<Number>) end
data Mem: mem(in :: List<List<Number>>, out :: List<List<SearchRet>>) end

### HELPER FUNCTIONS
fun calculate-brightness(in :: Color) -> Number:
  doc:```computes the brightness of a color```
  in.red + in.green + in.blue
where:
  calculate-brightness(color(1, 2, 3)) is 6
  calculate-brightness(color(0, 0, 0)) is 0
  calculate-brightness(color(34, 24, 90)) is 148
end

fun image-brightness(im :: Image) -> List<List<Number>>:
  doc:```computes the brightness of an entire image```
  pixels = im.pixels
  height = im.height
  width = im.width
  for fold(x-arr from empty, row from range(0, height + 2)):
    link(
      for fold(y-arr from empty, col from range(0, width + 2)):
        ask:
          | (row == 0) or (row == (height + 1)) then:
            link(0, y-arr)
          | (col == 0) or (col == (width + 1)) then:
            link(0, y-arr)
          | otherwise:
            link(calculate-brightness(pixels.get(row - 1).get(col -  1)), 
              y-arr)
        end
      end.reverse(), x-arr)
  end.reverse()
where:
  image-brightness([image(3,2): color(0,2,0), color(0,0,0),  color(0,3,0), 
      color(1,0,0), color(0,3,0),  color(0,0,4)])
    is [list: [list: 0, 0, 0, 0, 0], [list: 0, 2, 0, 3, 0], [list: 0, 1, 3, 4, 
      0], [list: 0, 0, 0, 0, 0]]
  image-brightness([image(1,1): color(133, 1, 45)]) is [list: [list: 0, 0, 0], 
    [list: 0, 179, 0], [list: 0, 0, 0]]
  image-brightness(image6) is [list: [list: 0, 0, 0, 0, 0, 0], [list: 0, 1, 9, 
      3, 6, 0], [list: 0, 0, 0, 0, 0, 0]]
end

fun energy(im :: Image) -> List<List<Number>>:
  doc:```computes the energy of each element```
  bright = image-brightness(im)
  height = im.height
  width = im.width
  for fold(x-arr from empty, row from range(1, height + 1)):
    link(
      for fold(y-arr from empty, col from range(1, width + 1)):
        x-energy = (((((bright.get(row - 1).get(col - 1) + (2 * bright.get(row)
                      .get(col - 1))) + bright.get(row + 1).get(col - 1)) - 
              bright.get(row - 1).get(col + 1)) - (2 * bright.get(row).get(col 
                  + 1))) - bright.get(row + 1).get(col + 1))

        y-energy = (((((bright.get(row - 1).get(col - 1) + (2 * bright.get(row 
                        - 1).get(col))) + bright.get(row - 1).get(col + 1)) - 
              bright.get(row + 1).get(col - 1)) - (2 * bright.get(row + 1).
              get(col))) - bright.get(row + 1).get(col + 1))

        link(num-sqrt(num-sqr(x-energy) + num-sqr(y-energy)), y-arr)
      end.reverse(), x-arr)
  end.reverse()
where:
  energy([image(3,2): color(0,2,0), color(0,0,0),  color(0,3,0), 
      color(1,0,0), color(0,3,0),  color(0,0,4)]) 
    is%(within(0.001)) [list: [list: ~5.83095, ~12.08304, ~11.40175], [list: 
      ~7.21110, ~8.60232, ~8.48528]]
  energy([image(1,1): color(133, 1, 45)]) is%(within(0.001)) [list: [list: 0]]
  energy([image(2, 2): color(1,1,1), color(1,0,2), color(2,0,1), color(0,3,0)])
    is%(within(0.0001)) [list: [list: ~12.72792, ~12.72792], [list: ~12.72792, 
      ~12.72792]]
  energy(rand-image) 
    is%(within(0.0001)) [list: [list: ~1676.70450, ~2084.75082, ~2125.71493, 
        ~1962.29457], [list: ~2162.65068, ~704.82480, ~178.31432, ~1656.33631], 
      [list: ~1940.31801, ~1988.38879, ~2120.82436, ~1801.16073]]
end
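The nested .get chains in energy compute a Sobel-style dual gradient; the same arithmetic extracted into Python for readability (my sketch; b is the zero-padded brightness grid that image-brightness produces):

```python
import math

# Sobel-style gradient energy at (r, c) of a zero-padded brightness grid b,
# matching the x-energy/y-energy arithmetic in the Pyret energy function.
def energy_at(b, r, c):
    gx = (b[r - 1][c - 1] + 2 * b[r][c - 1] + b[r + 1][c - 1]) \
       - (b[r - 1][c + 1] + 2 * b[r][c + 1] + b[r + 1][c + 1])
    gy = (b[r - 1][c - 1] + 2 * b[r - 1][c] + b[r - 1][c + 1]) \
       - (b[r + 1][c - 1] + 2 * b[r + 1][c] + b[r + 1][c + 1])
    return math.sqrt(gx * gx + gy * gy)
```

On the padded grid from the first where-test above ([list: 0, 0, 0, 0, 0], [list: 0, 2, 0, 3, 0], ...), energy_at at row 1, col 1 gives sqrt(34) ≈ 5.83095, agreeing with the Pyret test.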

# cannot test this function
fun memoize(liquify :: (List<List<Number>> -> List<List<SearchRet>>)) 
  -> (List<List<Number>> -> List<List<SearchRet>>):
  doc:```consumes a liquifier and memoizes it```
  var memory = empty
  lam(compute :: List<List<Number>>) block:
    thresh = within(0.0001)
    re = find({(m :: Mem): thresh(m.in, compute)}, memory)
    cases (Option) re block:
      | some(v) => v.out
      | none =>
        actual = liquify(compute)
        memory := link(mem(compute, actual), memory)
        actual
    end
  end
end

rec seam :: (List<List<Number>> -> List<List<SearchRet>>) = 
  # consumes a list of lists of energies and returns the search results
  memoize(
    lam(compute :: List<List<Number>>):
      cases (List) compute:
        | empty => empty
        | link(f, r) =>
          before = seam(r)
          addition = for fold(acc from empty, elt from range(0, f.length())):
            ask:
              | before == empty then:
                link(ret(f.get(elt), [list: elt]), acc)
              | elt == 0 then:
                min-weight = fold({(acc1, x): if x.weight < acc1.weight: x 
                    else: acc1 end}, ret(100000000, empty), [list: before.first
                      .get(elt), before.first.get(elt + 1)])
                link(ret(f.get(elt) + min-weight.weight, link(elt, 
                      min-weight.path)), acc)
              | elt == (f.length() - 1) then:
                min-weight = fold({(acc1, x): if x.weight < acc1.weight: x 
                    else: acc1 end}, ret(100000000, empty), [list: before.first
                      .get(f.length() - 2), before.first.get(f.length() - 1)])
                link(ret(f.get(elt) + min-weight.weight, link(elt, 
                      min-weight.path)), acc)
              | otherwise:
                min-weight = fold({(acc1, x): if x.weight < acc1.weight: x 
                    else: acc1 end}, ret(100000000, empty), [list: before.first
                      .get(elt - 1), before.first.get(elt), before.first
                      .get(elt + 1)])
                link(ret(f.get(elt) + min-weight.weight, link(elt, 
                      min-weight.path)), acc)
            end
          end.reverse()
          link(addition, before)
      end
    end)

check "seam":
  seam(energy([image(3,2): color(0,2,0), color(0,0,0),  color(0,3,0), 
        color(1,0,0), color(0,3,0),  color(0,0,4)]))
    is%(within(0.0001)) [list: [list: ret(~13.04205, [list: 0, 0]), 
      ret(~19.29414, [list: 1, 0]), ret(~19.88703, [list: 2, 2])], [list: 
      ret(~7.21110, [list: 0]), ret(~8.60232, [list: 1]), ret(~8.48528, [list: 
          2])]]
  seam(energy([image(4, 3): color(100, 46, 1), color(148, 60, 109), color(210, 
          10, 153), color(114, 161, 217), color(115, 68, 96), color(235, 211, 
          143), color(249, 212, 51), color(172, 116, 209), color(180, 217, 96), 
        color(61, 243, 184), color(124, 38, 76), color(95, 83, 249)]))
    is%(within(0.0001)) [list: [list: ret(~4321.84732, [list: 0, 1, 0]), 
      ret(~4064.22589, [list: 1, 2, 3]), ret(~4105.19000, [list: 2, 2, 3]), 
      ret(~3941.76964, [list: 3, 2, 3])], [list: ret(~4102.96870, [list: 0, 0]),
      ret(~2645.14281, [list: 1, 0]), ret(~1979.47506, [list: 2, 3]), 
      ret(~3457.49705, [list: 3, 3])], [list: ret(~1940.31801, [list: 0]), 
      ret(~1988.38879, [list: 1]), ret(~2120.82436, [list: 2]), ret(~1801.16073,
        [list: 3])]]
  seam(energy([image(2, 2): color(1,1,1), color(1,0,2), color(2,0,1), 
        color(0,3,0)])) 
    is%(within(0.001)) [list: [list: ret(~25.45584, [list: 0, 0]), 
      ret(~25.45584, [list: 1, 0])], [list: ret(~12.72792, [list: 0]), 
      ret(~12.72792, [list: 1])]]
end

fun seam-DP(input :: Image) -> Image block:
  doc:```consumes an image and returns the lowest seam removed```
  height = input.height
  width = input.width
  paths = build-array({(_): array-of(none, width)}, height)
  energized = energy(input)

  for each(col from range(0, width)):
    paths.get-now(0).set-now(col, some(ret(energized.get(0).get(col), [list: 
            col])))
  end

  for each(row from range(1, height)):
    for each(col from range(0, width)):
      ask:
        | col == 0 then:
          min-weight = fold({(acc1, x): if x.weight < acc1.weight: x 
              else: acc1 end}, ret(100000000, empty), [list: paths.get-now(row 
                  - 1).get-now(col).value,  paths.get-now(row - 1).get-now(col 
                  + 1).value])
          paths.get-now(row).set-now(col, some(ret(min-weight.weight 
                  + energized.get(row).get(col), link(col, min-weight.path))))
        | col == (width - 1) then:
          min-weight = fold({(acc1, x): if x.weight < acc1.weight: x 
              else: acc1 end}, ret(100000000, empty), [list: paths.get-now(row 
                  - 1).get-now(col - 1).value, paths.get-now(row - 1)
                .get-now(col).value])
          paths.get-now(row).set-now(col, some(ret(min-weight.weight 
                  + energized.get(row).get(col), link(col, min-weight.path))))
        | otherwise:
          min-weight = fold({(acc1, x): if x.weight < acc1.weight: x 
              else: acc1 end}, ret(100000000, empty), [list: paths.get-now(row 
                  - 1).get-now(col - 1).value, paths.get-now(row - 1).get-now(
                col).value, paths.get-now(row - 1).get-now(col + 1).value])          
          paths.get-now(row).set-now(col, some(ret(min-weight.weight 
                  + energized.get(row).get(col), link(col, min-weight.path))))
      end
    end
  end

  last-row = paths.get-now(height - 1).to-list-now()
  min-path = fold({(acc1, x): 
      if x.value.weight < acc1.value.weight: 
        x 
      else if within(0.0001)(x.value.weight, acc1.value.weight):
        # if same stream value check start point
        if x.value.path.first < acc1.value.path.first:
          x
        else:
          acc1
        end
      else: 
        acc1 
      end}, last-row.first, last-row).value.path.reverse()
  pixels = input.pixels
  new-pixels = for map2(row from pixels, drop from min-path):
    row.take(drop).append(row.drop(drop + 1))
  end
  image-data-to-image(input.width - 1, input.height, 
    new-pixels)
where:
  seam-DP([image(3,2): color(0,2,0), color(0,0,0),  color(0,3,0), 
      color(1,0,0), color(0,3,0),  color(0,0,4)])
    is [image(2,2): color(0,0,0), color(0,3,0), color(0,3,0), color(0,0,4)]
  seam-DP([image(4, 3): color(100, 46, 1), color(148, 60, 109), color(210, 10, 
        153), color(114, 161, 217), color(115, 68, 96), color(235, 211, 143), 
      color(249, 212, 51), color(172, 116, 209), color(180, 217, 96), color(61,
        243, 184), color(124, 38, 76), color(95, 83, 249)])
    is [image(3,3): color(100, 46, 1), color(148, 60, 109), color(210, 10, 153),
    color(115, 68, 96), color(235, 211, 143), color(172, 116, 209), color(180, 
      217, 96), color(61, 243, 184), color(124, 38, 76)]
  seam-DP([image(2, 2): color(1,1,1), color(1,0,2), color(2,0,1), 
      color(0,3,0)]) is [image(1, 2): color(1, 0, 2), color(0, 3, 0)]
  # all streams same energy
  seam-DP(image2) is [image(1, 2): color(1, 0, 2), color(0, 3, 0)]

  # same energy streams start different place end one place
  seam-DP(image3)
    is [image(4, 3): color(0, 10, 0), color(0, 10, 0), color(0, 0, 0), 
    color(0, 10, 0), color(0, 100, 0), color(0, 10, 0), color(0, 10, 0), 
    color(0, 100, 0), color(0, 200, 0), color(0, 100, 0), color(0, 100, 0), 
    color(0, 200, 0)]

  # same energy streams start same end same diverge middle
  seam-DP(image4)
    is [image(4, 3): color(0, 20, 0), color(0, 0, 0), color(0, 0, 0), color(0, 
      20, 0), color(0, 250, 0), color(0, 100, 0), color(0, 10, 0), color(0, 250,
      0), color(0, 200, 0), color(0, 100, 0), color(0, 100, 0), color(0, 200, 
      0)]
  
  # multiple seams all disjoint
  seam-DP(image5)
    is [image(6, 3): color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0)]
end


### MAIN FUNCTIONS

fun liquify-memoization(input :: Image, n :: Number) -> Image:
  doc:```consumes an image and the number of seams to carve out; 
      returns the carved image using memoization```
  ask:
    | n == 0 then: input
    | otherwise:
      energized = energy(input)
      final-row = seam(energized).first
      min-path = fold({(acc1, x): 
          if x.weight < acc1.weight: 
            x 
          else if within(0.0001)(x.weight, acc1.weight):
            # if same stream value check start point
            if x.path.first < acc1.path.first:
              x
            else:
              acc1
            end
          else: 
            acc1 
          end}, final-row.first, final-row).path
      pixels = input.pixels
      new-pixels = for map2(row from pixels, drop from min-path):
        row.take(drop).append(row.drop(drop + 1))
      end
      liquify-memoization(image-data-to-image(input.width - 1, input.height, 
          new-pixels), n - 1)
  end
end

fun liquify-dynamic-programming(input :: Image, n :: Number) -> Image:
  doc:```consumes an image and the number of seams to carve out; 
      returns the carved image using DP```
  ask:
    | n == 0 then: input
    | otherwise: liquify-dynamic-programming(seam-DP(input), n - 1)
  end
end
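The two liquify functions above both rest on the same vertical-seam recurrence used in seam-DP: the cheapest seam ending at pixel (row, col) is that pixel's energy plus the cheapest of the three seams ending directly above it (up-left, up, up-right). Here is a minimal Python sketch of that recurrence over a plain 2D grid; the function name, grid values, and tie-breaking are my own (Python's `min` keeps the first/leftmost minimum, while the course code breaks near-equal weights by comparing starting columns):

```python
# Sketch of the seam-DP recurrence (not the course's Pyret code):
# cost[r][c] = energy[r][c] + min(cost[r-1][c-1], cost[r-1][c], cost[r-1][c+1])

def min_seam(energy):
    """Return (weight, columns) of the cheapest vertical seam, top to bottom."""
    height, width = len(energy), len(energy[0])
    # cost[c]: cheapest seam weight ending at column c of the current row
    # path[c]: the columns that seam passes through so far
    cost = list(energy[0])
    path = [[c] for c in range(width)]
    for r in range(1, height):
        new_cost, new_path = [], []
        for c in range(width):
            # candidate predecessors in the previous row, clipped at the edges
            lo, hi = max(0, c - 1), min(width - 1, c + 1)
            best = min(range(lo, hi + 1), key=lambda k: cost[k])
            new_cost.append(energy[r][c] + cost[best])
            new_path.append(path[best] + [c])
        cost, path = new_cost, new_path
    best_end = min(range(width), key=lambda c: cost[c])
    return cost[best_end], path[best_end]

grid = [[1, 4, 3],
        [5, 2, 9],
        [6, 1, 7]]
weight, seam = min_seam(grid)  # weight 4, seam columns [0, 1, 1]
```

Removing the seam then mirrors the Pyret `row.take(drop).append(row.drop(drop + 1))`: for each row, delete the pixel at that row's seam column.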

Their 'common code' library

# Write data bindings here that you'll need for tests in both
# fluid-images-code.arr and fluid-images-tests.arr

### Test Images

image1 = [image(3,2): color(0,2,0), color(0,0,0), color(0,3,0), 
  color(1,0,0), color(0,3,0),  color(0,0,4)]

image2 = [image(2, 2): color(1,1,1), color(1,0,2), color(2,0,1), 
  color(0,3,0)]

image3 = [image(5, 3): color(0,10,0), color(0,0,0), color(0,10,0), 
  color(0,0,0), color(0,10,0), color(0,100,0), color(0,10,0), color(0,100,0)
  , color(0,10,0), color(0,100,0), color(0,200,0), color(0,100,0), 
  color(0,10,0), color(0,100,0), color(0,200,0)]

image4 = [image(5, 3): color(0,20,0), color(0,0,0), color(0,10,0), 
  color(0,0,0), color(0,20,0), color(0,250,0), color(0,10,0), color(0,100,0)
  , color(0,10,0), color(0,250,0), color(0,200,0), color(0,100,0), 
  color(0,250,0), color(0,100,0), color(0,200,0)]

rand-image = [image(4, 3): color(100, 46, 1), color(148, 60, 109), color(210, 
    10, 153), color(114, 161, 217), color(115, 68, 96), color(235, 211, 
    143), color(249, 212, 51), color(172, 116, 209), color(180, 217, 96), 
  color(61, 243, 184), color(124, 38, 76), color(95, 83, 249)]

image5 = [image(7, 3): color(10, 0, 0), color(0,0,0), color(10, 0, 0), 
  color(0,0,0), color(10, 0, 0), color(0,0,0), color(10, 0, 0), color(10, 0, 0)
  , color(0,0,0), color(10, 0, 0), color(0,0,0), color(10, 0, 0), color(0,0,0), 
  color(10, 0, 0), color(10, 0, 0), color(0,0,0), color(10, 0, 0), color(0,0,0),
  color(10, 0, 0), color(0,0,0), color(10, 0, 0)]

image6 = [image(4, 1): color(1,0,0), color(9,0,0), color(3,0,0), color(6,0,0)]

image7 = [image(5, 6): color(0,20,0), color(0,0,0), color(0,100,0), 
    color(0,0,0), color(0,20,0), color(0,250,0), color(100,100,100), 
  color(0,100,25), color(100,100,100), color(0,250,0), color(0,200,0), 
  color(0,100,0), color(0,250,0), color(0,100,0), color(0,200,0), color(0,20,0),
  color(0,100,0), color(0,100,0), color(0,100,0), color(0,20,0), color(0,100,0),
  color(100,100,100), color(100,0,100), color(100,100,100), color(0,100,0), 
  color(0,100,0), color(200,200,200), color(100,0,0), color(200,200,200), 
  color(0,100,0)]

Their tests

# Write your examples and tests in here. These should not be tests of
# implementation-specific details (e.g., helper functions).

### Liquify Memoization

check "liquify-memoization base case":
  # remove no seam
  liquify-memoization(image1, 0) is image1

  # vertical image no removal
  liquify-memoization([image(1,1): color(0,0,0)], 0) is [image(1,1): 
    color(0,0,0)]

  # single row
  liquify-memoization(image6, 1)
    is [image(3, 1): color(1, 0, 0), color(3, 0, 0), color(6, 0, 0)]
end

check "liquify-memoization standard removal":
  # remove one seam
  liquify-memoization(image1, 1) is [image(2,2): color(0,0,0), color(0,3,0), 
    color(0,3,0),  color(0,0,4)]
  
  # remove one from random
  liquify-memoization(rand-image, 1)
    is [image(3, 3): color(100, 46, 1), color(148, 60, 109), color(210, 10, 153)
    , color(115, 68, 96), color(235, 211, 143), color(172, 116, 209), color(180,
      217, 96), color(61, 243, 184), color(124, 38, 76)]
end

check "liquify-memoization multiple seam removal sensitivity":
  # remove two seams
  liquify-memoization(image1, 2) is [image(1,2): color(0,0,0), color(0,3,0)]

  # remove two seams from random image
  liquify-memoization(rand-image, 2) 
    is [image(2, 3): color(148, 60, 109), color(210, 10, 153), color(115, 68, 
      96), color(172, 116, 209), color(61, 243, 184), color(124, 38, 76)]

  # remove left crossing seam then take out another
  liquify-memoization(image7, 2) 
    is [image(3, 6): color(0, 0, 0), color(0, 0, 0), color(0, 20, 0), color(0, 
      250, 0), color(100, 100, 100), color(0, 250, 0), color(0, 200, 0), 
    color(0, 100, 0), color(0, 200, 0),  color(0, 20, 0), color(0, 100, 0), 
    color(0, 20, 0), color(100, 100, 100), color(100, 100, 100), color(0, 100, 
      0), color(100, 0, 0), color(200, 200, 200), color(0, 100, 0)]

  # single row multiple
  liquify-memoization(image6, 2)
    is [image(2, 1): color(3, 0, 0), color(6, 0, 0)]

  # remove all but one
  liquify-memoization(image7, 4)
    is [image(1, 6): color(0, 0, 0), color(0, 250, 0), color(0, 200, 0), 
    color(0, 20, 0), color(0, 100, 0), color(100, 0, 0)]
  
  # remove two from constant
  liquify-memoization(image5, 2)
    is [image(5, 3): color(0, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0,
      0, 0), color(10, 0, 0), color(0, 0, 0), color(0, 0, 0), color(10, 0, 0), 
    color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), color(0, 0, 0), color(10, 
      0, 0), color(0, 0, 0), color(10, 0, 0)]
end

check "liquify-memoization same energy sensitivity":
  # all streams same energy
  liquify-memoization(image2, 1) is [image(1, 2): color(1, 0, 2), 
    color(0, 3, 0)]

  # same energy streams start different place end one place
  liquify-memoization(image3, 1)
    is [image(4, 3): color(0, 10, 0), color(0, 10, 0), color(0, 0, 0), 
    color(0, 10, 0), color(0, 100, 0), color(0, 10, 0), color(0, 10, 0), 
    color(0, 100, 0), color(0, 200, 0), color(0, 100, 0), color(0, 100, 0), 
    color(0, 200, 0)]

  # same energy streams start same end same diverge middle
  liquify-memoization(image4, 1)
    is [image(4, 3): color(0, 20, 0), color(0, 0, 0), color(0, 0, 0), color(0, 
      20, 0), color(0, 250, 0), color(0, 100, 0), color(0, 10, 0), color(0, 250,
      0), color(0, 200, 0), color(0, 100, 0), color(0, 100, 0), color(0, 200, 
      0)]

  # multiple seams all disjoint
  liquify-memoization(image5, 1)
    is [image(6, 3): color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0)]
end

### Liquify Dynamic Programming

check "liquify-dynamic-programming base case":
  # remove no seam
  liquify-dynamic-programming(image1, 0) is image1

  # vertical image no removal
  liquify-dynamic-programming([image(1,1): color(0,0,0)], 0) is [image(1,1): 
    color(0,0,0)]

  # single row
  liquify-dynamic-programming(image6, 1)
    is [image(3, 1): color(1, 0, 0), color(3, 0, 0), color(6, 0, 0)]
end

check "liquify-dynamic-programming standard removal":
  # remove one seam
  liquify-dynamic-programming(image1, 1) is [image(2,2): color(0,0,0), 
    color(0,3,0), color(0,3,0),  color(0,0,4)]
  
  # remove one from random
  liquify-dynamic-programming(rand-image, 1)
    is [image(3, 3): color(100, 46, 1), color(148, 60, 109), color(210, 10, 153)
    , color(115, 68, 96), color(235, 211, 143), color(172, 116, 209), color(180,
      217, 96), color(61, 243, 184), color(124, 38, 76)]
end

check "liquify-dynamic-programming multiple seam removal sensitivity":
  # remove two seams
  liquify-dynamic-programming(image1, 2) is [image(1,2): color(0,0,0), 
    color(0,3,0)]

  # remove two seams from random image
  liquify-dynamic-programming(rand-image, 2) 
    is [image(2, 3): color(148, 60, 109), color(210, 10, 153), color(115, 68, 
      96), color(172, 116, 209), color(61, 243, 184), color(124, 38, 76)]

  # remove left crossing seam then take out another
  liquify-dynamic-programming(image7, 2) 
    is [image(3, 6): color(0, 0, 0), color(0, 0, 0), color(0, 20, 0), color(0, 
      250, 0), color(100, 100, 100), color(0, 250, 0), color(0, 200, 0), 
    color(0, 100, 0), color(0, 200, 0),  color(0, 20, 0), color(0, 100, 0), 
    color(0, 20, 0), color(100, 100, 100), color(100, 100, 100), color(0, 100, 
      0), color(100, 0, 0), color(200, 200, 200), color(0, 100, 0)]

  # single row multiple
  liquify-dynamic-programming(image6, 2)
    is [image(2, 1): color(3, 0, 0), color(6, 0, 0)]

  # remove all but one
  liquify-dynamic-programming(image7, 4)
    is [image(1, 6): color(0, 0, 0), color(0, 250, 0), color(0, 200, 0), 
    color(0, 20, 0), color(0, 100, 0), color(100, 0, 0)] 
  
  # remove two from constant
  liquify-dynamic-programming(image5, 2)
    is [image(5, 3): color(0, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0,
      0, 0), color(10, 0, 0), color(0, 0, 0), color(0, 0, 0), color(10, 0, 0), 
    color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), color(0, 0, 0), color(10, 
      0, 0), color(0, 0, 0), color(10, 0, 0)]
end

check "liquify-dynamic-programming same energy sensitivity":
  # all streams same energy
  liquify-dynamic-programming(image2, 1) is [image(1, 2): color(1, 0, 2), 
    color(0, 3, 0)]

  # same energy streams start different place end one place
  liquify-dynamic-programming(image3, 1)
    is [image(4, 3): color(0, 10, 0), color(0, 10, 0), color(0, 0, 0), 
    color(0, 10, 0), color(0, 100, 0), color(0, 10, 0), color(0, 10, 0), 
    color(0, 100, 0), color(0, 200, 0), color(0, 100, 0), color(0, 100, 0), 
    color(0, 200, 0)]

  # same energy streams start same end same diverge middle
  liquify-dynamic-programming(image4, 1)
    is [image(4, 3): color(0, 20, 0), color(0, 0, 0), color(0, 0, 0), color(0, 
      20, 0), color(0, 250, 0), color(0, 100, 0), color(0, 10, 0), color(0, 250,
      0), color(0, 200, 0), color(0, 100, 0), color(0, 100, 0), color(0, 200, 
      0)]

  # multiple seams all disjoint
  liquify-dynamic-programming(image5, 1)
    is [image(6, 3): color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0), color(0, 0, 0), 
    color(10, 0, 0), color(0, 0, 0), color(10, 0, 0)]
end

Finished

The remainder of the lectures are optional; there are no more assignments, unless you want to try an older assignment that uses the unix shell. As he says at the end of lecture 36, we are officially developers now.

  • Lecture 32 to 34
    • Preview of his PL course CS173
    • Parsing, let statements, lazy eval
  • Lecture 35
    • Internet Networks explained
  • Lecture 36
    • Security, how basic programs bring in way too much access and node dependency problems

Option: CS173 Programming Languages

If you liked CS19 you'll like CS173, the sequel, taught by the same professor.

  • CS173 Programming Languages 2022
    • Lectures torrent
      • Each lecture is split into 2 videos you play at the same time: the video labelled 'B' is typically the room and board, and 'A' is his laptop. The laptop feed is always the smaller video (no sound).
    • Book is free at plai.org and SMoL tutors
    • Assignments (starter code available in previous semesters)
    • DrRacket which you already downloaded/used in the placement for CS19

TODO

Option: 15-850 Advanced Algorithms

I go through this here.

This course teaches the design methods behind algorithms and the fastest currently known runtimes. To me it is more interesting to see how these were designed than to be handed a recipe of code to memorize. This is a graduate course, so they expect you to figure everything out yourself; they have given us a list of resources to help learn the material, which we'll go through as we try to complete it. You learn this by reading the papers: they are extremely thorough, describing from scratch each aspect of every algorithm and how its bound is derived, in far more detail than any algorithms text. We will try to implement these ourselves, using competitive programming resources where possible and designing our own algorithms.

It won't make sense at first, but after you read enough of the papers and see how they derive bounds, it will.

2023 direct links to lectures:

Lecture 01: Introductions 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/5dd7a3c9-ef73-442e-9d64-af8e016fde50.mp4 
Lecture 02: MSTs & Min Cost Arborescence (MCA) 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/bd5580b5-b3fc-46ea-bd01-af90016e783a.mp4 
Lecture 03: LPs and MCAs 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/5d01f36a-0a38-44d8-90fa-af93016f55da.mp4 
Lecture 04: Shortest Paths 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/014eb7e5-6232-4259-a5e0-af95016d7c9f.mp4 
Lecture 05: Low Stretch Spanning Trees 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/77756715-0b4c-4781-b42c-af97017689f8.mp4 
Lecture 06: Shortest Paths cont. 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/0b2036b9-1256-49a8-a8e1-af9a0176e7bf.mp4 
Lecture 07: Matchings 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/5624f2eb-b44f-46ab-b951-af9c0172e43c.mp4 
Lecture 08: The Polynomial Method 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/528376aa-fd8e-4f2e-baca-af9e016d91ae.mp4 
Lecture 09: Weighted Matchings 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/bb9646fd-0aaf-48a5-8da1-afa1016ed694.mp4 
Lecture 10: Weighted Matchings cont. 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/f41d1fd1-3bd1-435a-b8e4-afa3016e33d3.mp4 
Lecture 11: Concentration of Measure 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/88122358-e5cb-4951-9ce4-afa5016fd37b.mp4 
Lecture 12: Concentrations 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/e9b87046-39a3-4456-8e8b-afa8016db9f8.mp4 
Lecture 13: Dimension Reduction 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/2a9678cb-89b3-40b7-aa12-afaa01709787.mp4 
Lecture 14: Streaming Algorithms 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/f612651a-ebaf-47ab-8943-afac016e530e.mp4 
Lecture 15: Experts Algorithm 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/8dd4c0dd-3a71-4a3c-b0e7-afaf016d5a66.mp4 
Lecture 16: Submodular Functions 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/5cddeccf-3d21-4033-8f8a-afb10176e3ea.mp4 
Lecture 17: Experts cont. 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/901216d6-d004-4c2c-8d26-afb301723dba.mp4 
Lecture 18: Experts for LPs 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/0136339b-d0f6-49a2-9662-afb6016fd460.mp4 
Lecture 19 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/d1f82e72-4d0d-435a-b44a-afb80170f9ab.mp4 
Lecture 20: Gradient Descent 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/733d7f7f-5399-471a-a110-afba017134d9.mp4 
Lecture 21: Ellipsoids, Centroids, etc. 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/5d9edc8a-ebdb-4614-b1c3-afc40160a2c5.mp4 
Lecture 22: Ellipsoids & Poly-Time LPs 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/c2a99b4a-2d07-41c6-a02e-afc60162134f.mp4 
Lecture 23: Interior Point Methods 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/b31db8e1-a889-4970-b5bb-afc8016293c6.mp4 
Lecture 24: Approximation Algorithms 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/7d979db1-eb4f-4abe-a32b-afcb0164b878.mp4 
Lecture 25: Approximations & SDPs 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/97018e86-afa1-4eec-9474-afcd01669016.mp4 
Lecture 26: Online Algorithms 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/04c0a530-cf48-4787-b15d-afcf01612487.mp4 
Lecture 27: The Sum of Squares Paradigm 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/0d4ca6c8-cedf-440d-bdab-afda00ec8cae.mp4 
Lecture 28: Martingales & Concentrations 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/7c6bc42f-8cb2-4aa6-9d7b-afdb016279f2.mp4 
Lecture 29: Expander Graphs 
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/831cce2b-fbe3-41ce-bd55-afe0016352b5.mp4 
Lecture 30: HDXs
https://scs.hosted.panopto.com/Panopto/Podcast/Syndication/5a43f3bf-e9e4-46fd-9833-afe2015bf488.mp4  
