Blah blah, not done yet
The latest version of this document may be found at:
http://parseerror.com/pizza/doc/On-Code-as-Data.html

On Code as Data
by pizza

Welcome to the 21st century. For less than a day's salary I can buy an electronic machine capable of performing more simple operations in one second than I could perform in a year. With the wide availability of fast arithmetic workhorses it is within the capabilities of an intelligent human to perform tasks of polynomial complexity nearly instantaneously. And yet, for the most part, we can't and we don't. Why is that?

Our software is hopelessly bug-ridden; large projects comprise millions of parts and have thousands of bugs. Programmers spend hours chasing down obscure syntax errors, hard-to-find runtime errors and version incompatibility issues. The more we work on software the more complex and inaccessible it becomes; the exact opposite of most other engineering fields. Why is that?

The reason is so simple that most popular contemporary languages seem strangely to have overlooked it. The vast majority of popular, contemporary programming languages can easily digest, filter, sort and modify any data except for their own source. This is such a simple, fundamental concept that its importance cannot be overstated. The lack of this one seemingly esoteric feature is the cause of all the software industry's headaches. Why is that?

Overcomplexity

Like any machine, software is composed of simple individual parts, each straightforward in isolation. Software is much less limited by the physical world than its physical counterparts, however. For example, a Rube Goldberg device's charming overcomplexity is obvious because its structure and operation are clearly observable. Not so for software; indeed, software's overall structure and operation are not clearly observable. This opaqueness leads to a chronic lack of **** details, and that leads to uninformed decision-making, and that leads to all sorts of predictable problems. Why are software's overall structure and operation not easily observable?
It is a negative consequence of software's principal strength: its malleability. Software is valuable because it allows for modification within costs and timeframes that would be impossible in a physical machine. Contrast changing the color of your application with changing the color of a bridge. But with great power comes great responsibility; the perception of unlimited cheap and easy modification invariably leads to a reduction in planning ("we'll (fix it/figure it out) later").

How Software Grows

The problem of insufficient planning is compounded by another real-world factor. Like a mighty tree, software does not appear fully-formed overnight; a mature product takes years of incremental growth. This is because of cost; a project able to earn income while under development requires less up-front capital and ultimately costs less. Active development also allows for customer feedback. As software grows, features are overhauled as necessary to support new features or new performance requirements. This means that meta-level diagnostics and monitoring features are almost never included in software's original design [cite Linux, etc.].

Necessary Complexity

Indeed, software is necessarily more complex than its outputs. Put another way, in the software world the solutions are more complex than the problems they solve. Which raises the question: if the original problem is complex enough to require a software solution, then how does one manage the solution? How can I be sure it's correct? How can I be sure it will always work? How can I be sure it fails properly? How can I be sure it doesn't include huge amounts of superfluous logic? How can I be sure it even runs?

Human Overhead

The traditional method of addressing these questions is good old-fashioned human mental labor. Specifically: testing, documentation and peer review in various forms. However, these methods

Code as Data

The answer to managing software is the same as the answer to the original question: software. Just as we have programs to search, sift and classify documents, products, contacts and customers, we need software to perform those tasks on software. We must allow software itself to reap the benefit

That being said, software including meta-software capabilities has existed for decades.
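
As one small illustration of a program that searches and classifies source code the way other programs search documents, here is a minimal sketch using Python's standard-library `ast` module. The sample source and the helper names `function_names` and `calls_by_function` are hypothetical, made up for this example; this is one possible shape for such a query, not a definitive design.

```python
import ast

# Hypothetical sample source to be queried; any Python source text would do.
SOURCE = '''
def area(w, h):
    return w * h

def report(w, h):
    print("area:", area(w, h))
'''

def function_names(tree):
    """Return the sorted names of all functions defined anywhere in the tree."""
    return sorted(
        node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
    )

def calls_by_function(tree):
    """Map each function name to the sorted plain names it calls.

    Only calls of the form name(...) are counted; method calls such as
    obj.method(...) are ignored to keep the sketch short.
    """
    result = {}
    for fn in ast.walk(tree):
        if isinstance(fn, ast.FunctionDef):
            called = {
                node.func.id
                for node in ast.walk(fn)
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            }
            result[fn.name] = sorted(called)
    return result

# Parse the source text into a syntax tree, then query it like any other data.
tree = ast.parse(SOURCE)
print(function_names(tree))     # ['area', 'report']
print(calls_by_function(tree))  # {'area': [], 'report': ['area', 'print']}
```

The source here is handled as ordinary data: it is parsed once into a tree, and each question about it ("which functions exist?", "what does each one call?") is a short filter over that tree.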
Each of these tools addresses a facet of the problem of making

Challenge

I pose this challenge to the programming community: Implement a source-code querying framework for an existing programming language in the language itself. Some suggestions are C, Python, Ruby or Lua, but anything goes. The interface would include the ability to: