Software Development

Start here

I assume you are teaching yourself alongside some kind of intro course you found on YouTube or a book somebody recommended, such as:

  • Accelerated Intro to CS
    • eventually everything here will be there too
  • Functional CS
    • the math model of programming (category/type theory)
  • SICP
    • w/Sussman's new book on programming adaptive systems similar to biology

During work socials I run into recruiters in finance who tell me a tale of woe about how much harder it has become to find good entry-level candidates from schools, so I asked them what a non-degree candidate would need.

Their optimal candidate:

  • knows about the CPU-arch execution pipeline and caches
  • knows how their chosen language's runtime works
    • memory allocation can happen without intentionally triggering it
    • using free can be an expensive operation
  • knows the standard algorithmic problem solving strategies
    • divide/conquer, dynamic programming
  • knows the basics of problem complexity taught in every undergrad program
  • knows or can learn probability for performance modeling
  • has a very high standard of code quality
  • has written at least one multi-threaded application where network programming was involved
  • knows what they don't know
    • usually caught in interviews when they catch the candidate bluffing and ask them to define on the spot any jargon being used

If you satisfy at least some of the above characteristics then you are considered a great candidate because you can be trained to learn everything else including whatever customized tooling/compilers or languages these firms use.

The work

There are many reasons why you'd want to work for trading firms, and most of them, like Valkyrie, will hire juniors.

Every day you are the race car engineer of software. Just as they rebuild an engine and retune the car after each race to squeeze out more performance, that's what you do at many of these firms. There's also work in building high performance models and fancy consumer dashboards.

Recruiters

Jove Intl, Oxford Knight, Reload Search, and ex-employees who have quit to become recruiters. They hire internationally for Chicago, NYC, Austin, London, Sydney, Hong Kong, Mumbai, Shanghai and Tokyo, and now Singapore seems to be the location where everyone is opening offices. LinkedIn is trash but works fine for harvesting recruiter offers.

Selection

I got to try their fancy recruiter tools and paid access to LinkedIn to see the minefield of bad candidates they have to filter.

They told me all resumes are reviewed manually; there's no automated ATS (applicant tracking system). USACO training or a competitive background in anything from sports to math stands out. So-called modern C++, meaning C++17/20/23, stands out but is not necessary as any firm will train you. You never sell yourself as a junior: companies will hire anyone who knows how to sell themselves and can pass the technical competence filters. You always pretend to be employed writing software, even if this means setting up a personal page declaring yourself a freelancer on Upwork actively looking for contract work.

Campus recruiting

Campus recruiting is getting more and more monopolized by big tech corps pressuring students with 2-week expiring offers (aka exploding offers) that are against campus hiring guidelines, but they do it anyway. Think about how much money is spent on campus engagement to find (highly paid) interns; they then have to hope that intern comes back after graduating, and many don't. This should tell you they really want to find junior candidates willing to stay to become senior staff, so if they find you as a rough diamond in a sea of mediocre trash it's an easy investment for them.

Phone screens/on-site interviews

I don't know if they still do this but one Chicago company was notorious for firing off 3 problems to solve on HackerRank seconds after you submitted an application. The first phone screen is usually a recruiter asking whether you are legal to work and, if not, what credentials you have. In some countries like the US they need a reason to issue you a work visa for a job that can't be given to a local, so a degree or equivalent work experience. If they want you they'll finesse these requirements; if not, try Asia.

The new fizzbuzz filter these days is to implement atoi, which tests how careless a developer you are, meaning you don't ask any questions about requirements and start writing code immediately. You may be asked something like this as a screen. Here is another typical remote style interview screen which is the first of many, many more interviews. If you pass these screens you may be invited on-site, meaning they will pay for you to come out there and job shadow for a week while being hammered with interviews every day if it's a big US company; however, I've noticed in Asia the hiring process is much shorter.
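To make the point about requirements concrete, here is a minimal sketch of an atoi-style function. The name parse_int, the bool/out-parameter signature and every rule listed in the comments are assumptions made for illustration; they are exactly the things you would ask the interviewer about before typing anything.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Assumed requirements (hypothetical, confirm them before coding):
//  - leading spaces are skipped, then one optional '+' or '-'
//  - at least one digit is required and nothing may follow the digits
//  - values that do not fit in an int are rejected, never wrapped around
// Returns true and writes the result to `out` only on success.
bool parse_int(const std::string& s, int& out) {
    std::size_t i = 0;
    while (i < s.size() && s[i] == ' ') ++i;

    bool negative = false;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        ++i;
    }
    if (i == s.size()) return false;  // empty input or a lone sign

    // Accumulate as a negative number: INT_MIN has a larger magnitude
    // than INT_MAX, so this handles "-2147483648" without overflow.
    int value = 0;
    for (; i < s.size(); ++i) {
        if (s[i] < '0' || s[i] > '9') return false;  // trailing junk
        const int digit = s[i] - '0';
        // Reject before value * 10 - digit could drop below INT_MIN.
        if (value < (INT_MIN + digit) / 10) return false;
        value = value * 10 - digit;
        assert(value <= 0);  // invariant: the accumulator never goes positive
    }

    if (!negative) {
        if (value == INT_MIN) return false;  // +2147483648 does not fit
        value = -value;
    }
    out = value;
    return true;
}
```

Under different answers to those questions (for example "stop at the first non-digit" or "saturate on overflow") the code changes, which is the whole point of the screen.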

As a personal anecdote, the on-site interviews after all the screens seem designed as a trap to once again see how careless a developer you are. The problems I've seen have an obvious area of optimization to take advantage of to squeeze out some pointless constant factors, but doing so will open a potential bug or performance sink elsewhere if you aren't taking your time thinking through the whole original problem space. The bad candidate will pounce on that and show off their meaningless optimization, while the good candidate will ask: what happens if I do X, is it a good tradeoff for the risks involved? The prime example they will try to get you with is cache invalidation, and we will learn all about caches here so you won't be the one who invalidates the cache on the last day of their on-site interview.

Hiring someone is an investment. It will typically take 3 months for them to be productive, and if you're being paid $200k/yr that's $50k they have spent hoping you will turn out to be a good hire, so now you know why there are all these hoops to jump through.

Find a mentor

Once you are in and have developed a routine, meaning you are no longer panicking, try to walk over to the department where all the money is being made and find a mentor to teach you by asking them to give you the work they don't like doing, because you want to learn it all from scratch. All you have to do is be the reliable person they turn to whenever some (tedious at first) work needs to be done. If this work becomes increasingly skilled over time then you have found the right mentor and you are set for life, because they will eventually want you to work with them once you have proven to be indispensable. If they get promoted you get promoted; if they go off and start a new company you are the first hire. This can be tricky because of signing multiple NDAs even between internal departments, but if you are serious about wanting to learn they will tell management to make it happen. I actually took all the above advice from a comedian many years ago, and my life would not be the same if I hadn't watched that throwaway clip at 3am because I always assumed there was no way I could ever do this kind of work.

If you're not interested in finance there are a few companies that mentor developers sometimes even direct from local high schools and build them to senior developers. One is in New Hampshire called Northwoods Software. If you live in Egypt, Georgia, North Macedonia, Albania, Latvia, or Bosnia you can work remotely for Scandiweb (Latvia). The pay is only 500-600 Euros per month but you don't know anything yet so are being paid for direct mentorship to become a senior developer. I've never talked to any developers outside of FAANG that weren't interested in hiring good entry-level juniors even though they didn't advertise these positions.

Curriculum

USACO training is excellent and it's completely free. This is where we learn algorithmic problem solving strategies. We're given all the tests so use any language you want.

Explains that almost all industry popular languages are virtually identical and demystifies all the OOP jargon. We use a web-based stacker to see exactly how programs execute.

  • Conceptual design book The Essence of Software

MIT course and book on how to design software people want to use by modeling their state.

If you write enough software eventually you discover that almost everything is some kind of queue pattern. This is another workbook that's Q&A style not a giant theory text.

Here we learn kernel bypass methods for low latency networking, multicore programming, hashing/join algorithms, search space complexity reduction, and modern distributed application architecture like microservices. If you look at showcase it's an example of what we'll be doing which is building out some critical part of the application either I/O, query optimizer, buffer pool manager, or whatever it is you want to build that is taught in the course. The prof Andy Pavlo teaches the engineering of these analytical systems and what features/tradeoffs you may want to consider then we write the code from scratch in your preferred language.

We will learn about profiling and performance engineering taking some material on cache architecture from ETH Zurich and some compiler lectures.

Software Design

MIT has a course, 6.1040 Software Design, and the professor of that course wrote a book, The Essence of Software. This is a way to design software on a piece of paper so that when it comes time to actually code the app it essentially writes itself.

See this YouTube video from an AI talk at MIT where the professor of this course further explains what a concept is and how you would use a bunch of state machines to get an LLM to develop an entire web app. In their experiment they had to use 'a few hundred' prompts which then becomes your version history and I can only imagine the nightmare of trying to refactor those prompts.

Audit through these slides first then come back to see them in detail when building something:

  • Diverge/Converge Brainstorm ideas then turn those ideas into concepts and work out the interface they will each need. In the tutorial for divergent design on the prof's personal site he used Chat-GPT to suggest additional features.

Page 8 check out his riffle.systems idea where a 'reactive database' is used to declaratively write entire websites even generating the DOM elements thus totally replacing how modern webdev is done.

  • Wireframing recitation. If you go on Upwork or other freelancing sites you will see many jobs looking for someone to take a figma design and turn it into a working app.
  • More Concepts and at the end of these slides you have a complete draft of an application that is now trivial to code into an MVP (minimum viable product). To watch more about concepts see here, skip the 'message of the day', scroll to the very bottom of the materials list where there are 6 lecture videos. The author goes through each topic like what an operational principle is on his tutorial page here and completely explains how these are state machines and how to define the state of concepts.
  • Design moves how to unify or loosen concepts with many examples here like Zoom's screwy interface or Gmail labels.
  • API design what a RESTful API is.
  • Data design there are recorded lectures here. NoSQL is introduced here which you never want to use if not writing a temporary prototype as doing updates and inserts as JSON is a convoluted mess.
  • Make things learnable anyone should be able to figure out your software without relying on documentation. Notable examples of software where you cannot do anything without extensive docs are Tailwind CSS and Git.
  • Good UIX design basics.
  • Design innovation what was novel about Zoom? Why did everyone use it instead of FaceTime, Meet, Microsoft Teams etc?

Brainstorm ideas, converge them into separate interfaces on a piece of paper like they are independent micro machines. Write out a dependency diagram by going back to the slides when designing your own app and looking at what they did. Go by the criteria: A 'uses' B only if A is simpler because it uses B and B is simpler because it doesn't need A. There is no useful subset containing A but not B. That means a legit dependency. Wait until you get paid to do this somewhere, everyone breaks the above advice and you get stuck fixing their countless needless dependencies that make your life hell.

If you want to try the Vue3 recitations and assignments they're all open on the course GitHub repository, and an example is written in TypeScript, which is taught in 6.102. I'm not going to bother since our interest here is learning conceptual design, after which you can use any language/framework you wish. The project writeups are extremely good, so if you want to build something try following them. You don't have to use Figma; you can use a piece of paper to design your UI.

Version control

The prof of this course created gitless.com with a grad student; see this research page on MIT's software design group. Read how they audited the conceptual design of Git and found that things like staging were confusing.

Many companies today use a customized Mercurial fork or rewritten clone instead of Git, like Meta's Rust-written EdenFS and Sapling SCM, because Git was originally developed to take advantage of Linux file system optimizations and so is conceptually attached to that file system, while Mercurial can easily be adapted to be stuffed in a database, distributed, or run on any OS. In a Mercurial style of version control you probably don't care about individual commits, so you don't even add comments to commits; you instead care about entire branch development.

For these reasons I don't think you should bother to use Git at all unless someone is forcing you. Look up how easy and simple the Mercurial guide is for a lone developer; why use anything else? If you must, then try MIT's version control lecture from its course The Missing Semester of Your CS Education or read 'Git for Hackers'. In open source development outside of FOSS you generally fork the project to your account on GitHub, commit changes there, then make a 'Pull Request' or PR to the original repo asking if they will merge your changes. The FOSS way is different. Using Savannah for example, here is a patch: it's all done in plain text using a traditional mailing list style of attaching files, and people vote on and review it.

Freelancing

Who needs software where you live? Restaurants, people cutting hair either out of their house or a storefront, people wanting to sell something to avoid Shopify fees, people wanting to rent their condo public spaces or guest suites, people wanting a dashboard of all sites they are selling on combined into one command center. Every service needs a booking or quote site such as painters, tailors or power washers or junk haulers. There's always something people want and the current offers for the service industry are largely pure shit.

My recommendation is to be the local Zapier style developer and stitch together software people actually use instead of inventing an inferior product yourself like everyone else is doing. People are already using business software they prefer, like QuickBooks for accounting, Zoho/Salesforce for CRM, or iCal/Google Calendar for bookings. You now sew that all together into a custom product for them.

Starting out

I tried a local experiment where the goal was to ask people about their business and what problems they were having to see if software could help. You just let others talk about themselves. I ended up with dozens of leads and if I was a software mercenary I would have picked up a lot of work. Unfortunately I'm not a software merc but kind of thinking I should be now that I've seen how easy it is. Everyone wants to work with someone they've personally met not a stranger on a website and almost every conversation ended with 'do you know anyone who can do this' and them giving me their contacts.

I also called people out of the blue from LinkedIn from various local companies and just told them hey I'm trying to find out about your work tell me everything about it and they talked to me too which was not what I expected.

Research

Here's what I found out by just walking into places and meeting people. Every one of them had a problem with existing software they wanted to modify and they had never met anyone locally who could independently write software who wasn't some SaaS sales guy pushing an existing product.

Salons

An example using a Brooklyn nail salon. Their website basically points to Instagram and solely consists of a form, which means they have to phone or text/email whoever books to correct any overbookings. This can all be automated easily and synced over some iCal protocol. Every one of these businesses rents out chairs to independent freelancers, so they needed a fast way to add/remove names on the booking calendar as these freelancers move around all the time. They also wanted deposits for reservations but nobody offers this.
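As a rough idea of what automating away overbookings could look like, here is a minimal sketch of a per-chair calendar that refuses overlapping slots. The ChairCalendar class, its method names and the minutes-since-midnight time model are all hypothetical; a real version would sit behind whatever calendar the salon already uses.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Hypothetical model: one calendar per rented chair, times are minutes
// since midnight, and each booking is a half-open interval [start, end).
class ChairCalendar {
public:
    // Returns false (and stores nothing) if the slot overlaps a booking.
    bool book(int start, int end, const std::string& client) {
        assert(start < end);
        // First existing booking that starts at or after `start`.
        auto next = bookings_.lower_bound(start);
        if (next != bookings_.end() && next->first < end) return false;
        // The booking just before it must have finished by `start`.
        if (next != bookings_.begin() && std::prev(next)->second.end > start)
            return false;
        bookings_[start] = Slot{end, client};
        return true;
    }

    std::size_t size() const { return bookings_.size(); }

private:
    struct Slot {
        int end;
        std::string client;
    };
    std::map<int, Slot> bookings_;  // keyed by start time, never overlapping
};
```

Because stored bookings never overlap, only the two neighbours of a new slot need checking, so each booking attempt costs O(log n) in the number of existing bookings. Adding or removing a freelancer is then just creating or dropping one ChairCalendar.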

Many of these salons are now mixed use meaning some girl doing tattoos will be working out of 'Clarissa's Nails' and while I was standing outside these places entering notes in my phone I had a few customers asking me if I've heard of 'Samantha's Tattoos' or whatever because it wasn't clear on any of their websites they were a shared business. There's another problem easily solved with custom software.

My own barber mainly took appointments by phone call or text message, then had to manually enter them into some shared iCal protocol scheme the shop maintained as centralized accounting, as all barbershops rent chairs to visiting freelance barbers too.

Restaurants/Bars

Restaurants I found use at least 3-4 different software vendors. Every one prefers to handwrite tickets and have the wait staff manually enter it into an inventory and restaurant management system that uses a proven industry touch screen for high volume that's not some consumer grade junk like iPads. Problem here was there was only one touch screen so they had to take turns doing it as adding another would cost thousands in 'set up fees' to have a tech come do it. I didn't have much time so got the make/model of the screen, cables and inventory vendor which all had open APIs so you could if you wanted set up another screen. These inventory apps had a total monopoly over the entire city and would be impossible to compete with as everyone uses them by default since they offer one-click bill splitting, labour and scheduling, inventory, menu management and accounting which I found out has to be certified as there's a lot of tax fraud in restaurants so some regulator has to approve your software. We don't have to compete with that just interface with it.

The second software they used was reservations which we can compete with. They absolutely hated every option because of insanity level fees like 10-20% 'service charge' being added on which customers mistakenly thought was the tip but it was the scam fee for the online booking service. Prepaid reservations of any amount reduced no shows and only 1 online service they knew of offered it but charged way too much. All trivial to setup using Stripe or whatever merchant processor they want.

The third software they used was online orders/main website, which usually just redirected to Uber or a similar gig service for online orders. The #1 request was to centralize all these 3rd party order sites into a single dashboard. Another request was some kind of dashboard for reputation management and a way to capture and control reviews, meaning if you had a bad review you get directed to management to correct it and if it was a good review you are redirected to Google reviews or elsewhere. This is trivial to do too: send a 'yes/no' email and if they click yes redirect to a public review site. I'm sure there are better techniques but I didn't look into it too much. Surprisingly, not a lot of services offer reputation management for small businesses just trying to survive some malicious campaign from an idiot determined to shut them down.

Not a single one I talked to had ever met a software guy in person, and they asked me if I could build something to take an online order directly and avoid paying massive fees, automate ticket sales for live bands, sell merch, adjust the lights in the restaurant at set times or other automation, interface x accounting software with y mandatory gov inventory software, and many other problems. This was a gold mine of potential work I didn't expect.

Retail

If you look at Huntsman suit tailors on Savile Row they have an interesting website where you can click Made to Order and customize almost everything. This was trivial to copy (make your own gfx to not rip them off and get sued); I simply used an AI to make outlines. I approached every tailor locally and every one said this is exactly what they wanted, plus some kind of Zoom/FaceTime way to book consultations.

Everyone else was either using Shopify or Etsy but having to pay some person they've never met to help them, which they didn't like. The best retail site I know of is https://www.mcmaster.com/. It's exactly what a mechanical engineer would want when ordering product, such as getting free CAD files to add to their project proposal and then pushing a button to mass order everything. If you look up their careers page they don't hire React devs; they hire anybody and say they will train them from scratch to work on their website.

Design examples

Look up whatever web design awards for 2024 and see what people are doing. It's always going to be Europe because you know why, but anyway let's review some examples. All of these use minor animation to show an incremental view as you scroll. Here is one site they nominated.

Rates

You will want to bill daily, not hourly. Keep increasing your rates too: if you have so many clients you can't accommodate them all then you're not charging enough of a daily rate.

Agencies

Search for agencies and you will find many of them. Agencies are often never openly hiring because they find people by approaching the best in the business and asking them to join the agency. This is how architect agencies, ad agencies, etc. work. There are probably agencies in your city, and they will be more receptive to meeting and working with you than some remote agency.

Software Marketplaces

Try software marketplaces like Shopify or G Suite and read all the reviews to find out what paying customers hate about a paid addon, then make a better one. Notice how bad the form-creating software is in the Shopify app marketplace; that seems like something you could start selling right away and compete. If you start making money make sure you have 24/7 service and that it's not a worthless chat app.

Startups

A real startup requires venture capital. Here's Peter Thiel giving a very good crash course in making a startup to Stanford students. Start small and monopolize something small. Here's an old article of Sam Altman of OpenAI (now a monopoly too) talking about how Y Combinator's success is due to it being … a monopoly.

I researched the Y Combinator startup school so you don't have to, and they tell you that as a founder every day you are only talking to customers. If you have 5 customers you talk to them every day, ask them about their needs and make the wagies implement those features. If you have 50 customers you still talk to them every day. You phone everyone you can in the business niche you are targeting and find out what they want; eventually they will start paying you. That is literally it, there's no other magic or anything: just talk to absolutely everybody and build growth by listening to them and making a product that others will want because you listened.

Who buys software companies? Salesforce. They seem the most aggressive company right now buying up everyone.

USACO

I'm going to do all the USACO contest problems using inquiry-based learning because that's exactly what you do in real life. If you took Accelerated Intro to CS you have already seen almost every topic in the USACO guide and if you didn't it doesn't matter that's why there exists a guide which we can look at while we solve a problem, so you learn the algorithm as you are using it.

Real competitive programming

Here is Gennady Korotkevich aka tourist practicing on Codeforces and explaining what he's doing. He didn't start with C++: he used to win the IOI with Pascal in high school, then switched later. The first thing to notice is he's using a basic text editor and a minimalist shell that looks like MinGW (a Windows port of GCC) with Far Manager, and he manually submits through the Codeforces website UI. He prepares individual directories named after the problem, then always makes a sol.cpp file in that directory so he doesn't have to change a macro he's written to compile the solution from the shell.

His C++ template:

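// include every C++ standard library header at once
// (a GCC extension, doesn't work on all platforms)
#include <bits/stdc++.h>

// use std namespace to avoid having to write std::cin or std::cout
// in other words bringing into scope all C++ std libraries
using namespace std;

int main() {
  // stop syncing the C++ streams with C stdio (printf/scanf)
  ios::sync_with_stdio(false);
  // untie cin from cout so cout is not flushed before every read
  cin.tie(0);

  // ... the solution for the problem goes here ...
  return 0;
}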

By default the C++ streams are synchronized with C stdio, and std::cin is tied to std::cout so that std::cout is flushed before every read from std::cin; to save some runtime he disables both. His compiler flags he explains in one of these recorded streams; most of them are to make compiler messages extra verbose for debugging. Note you would never write software outside of competitive programming like this because he has opened a gigantic scope.

You don't have to watch the entire 2hr vid, the last problem seems like the most informative. He begins most problems by writing examples as comments. He copies the problem input test cases into a text file 'in1' then pipes it into his compiled program "sol.exe < in1" from the shell as stdin (standard input). He writes multiple submissions to compare their runtime/memory usage. He looks for some kind of mathematical structure in the formulas given in the problem write-up. After a submission he reviews the solutions others did and compares them to his own but note he solved the problems first, he didn't give up and go straight to the solution.

When he gets a TLE (time limit exceeded) verdict he shows how he would optimize to get an accepted verdict, while also trolling the stream by renaming all the variables to single letters and claiming it will speed up the program. The judging server ran his program faster than his laptop, something to remember.

Big-O explained

The USACO guide has a good introduction to big-O. A resource where running time is actually explained is the beginning of Functional Data Structures by Prof Ragde at Waterloo; try reading the first few chapters up to 2.2.

The formal definition is:

\(f(x) \le c\cdot g(x)\) for all inputs \(x \ge x_0\), for some fixed constants \(c > 0\) and \(x_0\), as x (the input size) goes to infinity. If that inequality holds then f(x) is in O(g(x)), which represents a worst-case scenario upper bound. A 'best-case' (wishful thinking) scenario for insertion sort is that you don't have to do anything but attach the element to the already sorted list, and the worst case is that the entire list is in reverse order and every element is looked at and moved.
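As a worked instance of the definition (the particular cost function is made up for illustration), take \(f(x) = 3x^{2} + 5x\) and \(g(x) = x^{2}\). Choosing \(c = 4\) and \(x_0 = 5\) works because \(5x \le x^{2}\) whenever \(x \ge 5\):

\[
f(x) = 3x^{2} + 5x \le 3x^{2} + x^{2} = 4x^{2} = c\cdot g(x) \quad \text{for all } x \ge 5,
\]

so f(x) is in \(O(x^{2})\). The constants are not unique: any larger c or \(x_0\) satisfies the inequality too.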

The f() function is a cost function of time or resources derived from a cost model. It's where you count up every constant operation and loop cost by analyzing the source. This is not always easy: what costs do you assign when the code triggers garbage collection, or triggers memory allocation because an object has been created deep in some library code?
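One way to get a feel for a cost model is to count the operations directly. The sketch below instruments insertion sort under an assumed, very simple cost model where only element comparisons and element moves are charged; the Cost struct and the function name are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed cost model: charge 1 per comparison and 1 per element move,
// and treat everything else (loop counters, assignments) as free.
struct Cost {
    long long comparisons = 0;
    long long moves = 0;
};

// Sorts `a` in place with insertion sort and reports what it cost.
Cost insertion_sort_cost(std::vector<int>& a) {
    Cost cost;
    for (std::size_t i = 1; i < a.size(); ++i) {
        const int key = a[i];
        std::size_t j = i;
        while (j > 0) {
            ++cost.comparisons;
            if (a[j - 1] <= key) break;  // key is already in position
            a[j] = a[j - 1];             // shift the larger element right
            ++cost.moves;
            --j;
        }
        a[j] = key;
    }
    assert(std::is_sorted(a.begin(), a.end()));
    return cost;
}
```

On an already sorted input of n elements this charges n - 1 comparisons and no moves (the best case), while a reversed input charges n(n - 1)/2 of each (the worst case), which is where the quadratic bound comes from.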

The g() function is chosen from a list of math functions that represent a 'family of functions'. A family means all functions that are linear, all functions that are quadratic, all functions that are linearithmic etc. It actually is a specific list as we'll soon see.

Why are constants hidden in big-O? If you read SICP their example is 16 and 8 both differ by a factor of 2 and 8 but we don't care which factor. In real life the constant factors do matter if they are hiding a lot of operations. A classic example of this is when Knuth was writing the Stanford GraphBase (contents are in Vol 4A of TAOCP) and his real-world testing of popular graph algorithms found that 'Fibonacci heaps' that were analyzed using a technique called amortized analysis weren't any better than Kruskal's old algorithm because the analysis was hiding large constant operations. One way to reduce constants is to do consecutive rounds of a linear algorithm such as cleaning up a graph removing cycles because O(n + n) or O(2n) is still O(n).

Big-O doesn't represent growth: if \(f(x) = x^{2}\) it does not grow to be quadratic, f(x) already is quadratic, thus f() is in \(O(n^{2})\) or upper bounded by every quadratic function. Typically in algorithm analysis the inputs are supposed to be 'near infinite', which is why n is used instead of x in f(n) to signal the difference in inputs, but here in the USACO guide we can see that if the input is small then we can get away with some very bad runtimes, even exponential.

Problem complexity

Before we start trying to solve problems watch this Erik Demaine lecture and skip to 09:00. We want to know what polynomial time is, what NP means and what satisfiability means, and this is a good enough ~30m crash course in complexity as these concepts come up all the time in algorithm literature. He tells us most decision problems can't be solved by an algorithm, which ruins the AI superintelligence dystopia. When he translates Super Mario Bros to a 3SAT decision problem we can see that almost every algorithm you can think of can be transformed to a decision problem. You don't have to keep watching this but there's an interesting intro to constraint logic/graphs. Now you can explain to someone that P=NP means every problem solvable by a nondeterministic polynomial time algorithm, one that always 'lucky guesses' correctly or has a magic oracle, can also be solved by an algorithm in deterministic polynomial time, which seems impossible.
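To see the gap between checking a guess and finding one, here is a minimal satisfiability sketch. The encoding (a literal +k means variable k, -k means NOT variable k) and both function names are assumptions for this example, not anything from the lecture.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// A formula in CNF: an AND of clauses, each clause an OR of literals.
// Literal +k is variable k, literal -k is NOT variable k (k starts at 1).
using Clause = std::vector<int>;
using Formula = std::vector<Clause>;

// The verifier: checks one guessed assignment in time linear in the
// formula size. Bit (k - 1) of `assignment` holds the value of variable k.
bool satisfies(const Formula& f, std::uint32_t assignment) {
    for (const Clause& clause : f) {
        bool clause_true = false;
        for (int lit : clause) {
            const bool value = (assignment >> (std::abs(lit) - 1)) & 1u;
            if ((lit > 0) == value) { clause_true = true; break; }
        }
        if (!clause_true) return false;
    }
    return true;
}

// Without a lucky guess we fall back to trying all 2^n assignments.
// Returns the first satisfying assignment, or -1 if there is none.
long long find_assignment(const Formula& f, int num_vars) {
    assert(num_vars >= 0 && num_vars <= 24);  // keep 2^n small enough to wait for
    for (std::uint32_t a = 0; a < (1u << num_vars); ++a) {
        if (satisfies(f, a)) return a;
    }
    return -1;
}
```

satisfies is the polynomial part: a correct lucky guess can be confirmed quickly. find_assignment is the deterministic fallback and doubles its work with every extra variable, which also shows why exponential brute force is only tolerable when the input limit is small.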

The rest of this course I won't do here but the book he wrote that goes with the lectures looks excellent covering topics in economics too.

Languages

If you look at the archived contests we are given all the test inputs/outputs as a .zip and a fully worked out solution with a proof of the bounds. This means you can use any language you want however I am going to do it all in C++ with Knuth's literate programming tool cweb because it mimics exactly how you would build something in a collaborative setting and after using it for a few problems it's kind of awesome and I finally get why Knuth shills this style of programming so much.

TODO

