Perhaps it is consultants' logic at work. Perhaps it is the dominant view of the time, missing the forest for the trees, the overwhelming pervasiveness of the problem, or a problem that engineers are ill-suited to solve. I don't know.
Blaming defects on the engineers who wrote them is, although comforting and not necessarily incorrect, not the full story. The real defect lies in development and release methodologies that put complete trust in the unverifiable perfection of every single line of an encyclopedia-sized tome compiled over time by disparate, imperfect sources.
Unit testing is simply incomplete. Tests are the equivalent of blacklists: incomplete, unverifiable lists of known bad entities. The blacklist model is necessarily incomplete because it is reactive: new tests are added only when new bugs are found. Bugs that are not found are not tested for. Bugs that are not considered, tested and fixed by a human lie dormant in software.
Bugs in common codepaths are found soon; bugs in less-common codepaths may exist in widely-used, high-quality software for decades before they are found. The number of undiscovered bugs is impossible to determine.
I am convinced that the slow, methodical construction of software from provably defect-free components is the only sane way to develop software with predictable behavior; everything else is, at some level, cowboy coding.
The current problem with verifying things like operating systems is that they are not designed to be verified. Operating systems are written in languages like platform-specific assembly or C, instead of the abstract symbolic languages used by theorem provers. Those wishing to prove the correctness of an operating system must first translate it into a theorem-prover-friendly language before it can be proved.
I posit that this approach, though practical given the circumstances, is working in the wrong direction. Software should be designed and proven symbolically first, and then automatically translated into the target language.
This would be slow at first, but each subsequent instruction, data structure and algorithm would build upon the last, providing a solid foundation upon which further layers may be developed. Proposed modifications or additions to proven modules would be proven defect-free before they were tested and released, instead of being tested after defects are found.
So, in the meantime, I'm going to be learning how to read and write mathematical proofs.
Speculative execution is the (side-effect free) execution of code with the expectation that its results will probably be useful in the near future. That is, if we can guess with high probability what someone will do next, then we do stuff before we're asked. If we predict correctly we apply the results; if we predicted incorrectly we discard the results.
There is precedent for this; recall those automated combobox dropdowns popular in some desktop and ajax-friendly web applications that generally display popular searches using your current entry text as a prefix.
Why do this in an interactive shell? In short, because a significant amount of the time of a typical interactive shell command is spent waiting for the user to type it in(!).
Imagine the following command being typed:
grep foo /etc/* | grep -v bar
In the time it takes the user to type | grep -v bar the machine could have, in all likelihood, executed the
first half of the pipeline (or, if not, at least started the process of ensuring /etc/* was loaded in RAM).
grep foo /etc/* could be launched in parallel, waiting for the command to be finished.
If the user changes their mind before hitting Enter, the parallel process could be killed and discarded with no side effect.
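The launch-then-confirm-or-kill idea above can be sketched in a few lines of Python. This is only an illustration, not a shell feature that exists: the `SpeculativeRun` class and its `commit`/`discard` methods are names I made up, and the example command is a stand-in for `grep foo /etc/*`.

```python
import subprocess
import sys


class SpeculativeRun:
    """Start a command early; keep its output only if the guess is confirmed."""

    def __init__(self, argv):
        # Capture stdout so nothing reaches the terminal until commit().
        self.proc = subprocess.Popen(
            argv, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL
        )

    def commit(self):
        """The user pressed Enter on the predicted command: return its output."""
        out, _ = self.proc.communicate()
        return out

    def discard(self):
        """The user changed their mind: kill the process and drop its output."""
        self.proc.kill()
        self.proc.communicate()  # reap the child and close the pipe


if __name__ == "__main__":
    # Hypothetical stand-in for the first half of the pipeline.
    guess = SpeculativeRun([sys.executable, "-c", "print('foo=1')"])
    print(guess.commit().decode(), end="")
```

Because the output goes to a pipe, a command that produces a lot of it will simply block once the pipe buffer fills, which matches the "waiting for the command to be finished" behaviour. The sketch does nothing to check that the command is side-effect free; that would have to be decided before launching it.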
This does, however, open up some sticky issues. The key to speculative execution is that the results, until confirmed as being
useful, must not affect the system. Clearly we couldn't speculatively run rm, touch, etc. because
they are not side-effect free; same for arbitrary third-party applications. We could proceed sanely by limiting ourselves to a
subset of standard POSIX tools... (though that doesn't fix things like atime, so we might have to limit ourselves
to noatime mounts)
Also note that simply loading sets of file contents referenced in a pipeline command before it is run would likely greatly reduce latency.
This system would have its downsides, of course; being speculative by nature it would consume resources that may never actually be needed. Imagine needlessly loading some huge directory's worth of files over an NFS mount, for example. Naturally one would want to be able to disable the feature completely, to limit its maximum RAM usage, and to stop it from preloading from certain mountpoints or certain types of filesystems.
Despite the potential downsides, as the gap widens between the latencies of human input, disk and everything else, I think you are likely to see a shell providing speculative execution capabilities in the near future.
Perhaps the software industry is still by and large in an artisanal phase and not really an industrial one?
However, the market now also chooses the microkernel-esque hypervisors. Why? Because mainstream PCs can handle it: CPU frequencies are much higher than memory bus speeds, most applications are I/O bound, there are obvious benefits to the ability to treat an OS instance as data and there are many popular operating systems as application platforms.
In the end it is features that drive development; the overwhelming majority of the computer-user population wants the lowest common denominator: more, more, more, at any cost. Security and/or stability at the price of even minor overhead or hassle is not worth it to most consumers.
for i in range(...) iteration method
is superior to traditional iteration via indexes because there is less chance for error.
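A small Python sketch of the difference; both function names are made up for illustration. The hand-written index loop has three places to slip (start value, bound check, increment), while `range` handles all three.

```python
def total_with_manual_index(values):
    """Traditional loop: the bound check and the increment are written by hand."""
    total = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1  # forgetting this line loops forever
    return total


def total_with_range(values):
    """range() owns the start, stop and step, so there is less to get wrong."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total
```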
sudo rmmod psmouse # disable touchpad on Dell Inspiron.
git log --pretty=oneline --graph --abbrev-commit
sudo yum install ImageMagick
floor(log(n)/log(10.)) + 1 is a good method of calculating the number of decimal digits in a positive integer n.
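A hedged Python sketch of the digit count. The logarithm only gives the count after flooring and adding one, and it only works for positive integers. I use `math.log10` here instead of the `log(n)/log(10.)` quotient, and both function names are my own. Since floating point can misjudge values that sit right next to a power of ten (for example, `10**18 - 1` rounds to `1e18` when converted to a double, so the log-based count may come out one too high), an integer-only version is included for comparison.

```python
import math


def digits_via_log(n):
    """Digit count of a positive integer using floor(log10(n)) + 1."""
    return math.floor(math.log10(n)) + 1


def digits_exact(n):
    """Digit count using integer arithmetic only; treats 0 as one digit."""
    n = abs(n)
    count = 1
    while n >= 10:
        n //= 10
        count += 1
    return count
```

The log version is a single expression and fine for everyday values; the loop is the safer choice when the input can be very large or sit at a power-of-ten boundary.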
make buildkernel compiling with llvm/clang.
man make.conf
svn diff \
svn://svn.freebsd.org/base/head@196074 \
svn://svn.freebsd.org/base/projects/clangbsd@196074 \
| grep -v "FreeBSD=%H" | less
WRKDIRPREFIX=~/ports-work make
make in Linux the whole machine starts to freeze up after
exactly 5 minutes. If I kill the job the computer still shows ~100% CPU usage, but eventually settles down.
This is reproducible and disturbs me. I tried filing a bug report, but VMware has a large set of hoops to jump through
and I don't have time for it. I'm looking into other virtualization products. I am also in the process of getting
a new machine with VT-x; I wonder if it'll make a difference.
the llvm architecture is really nice! and just as importantly, the code is very approachable because it is only as complex as it needs to be.
everything is lib-ized with well-defined interfaces. you can mix and match parts of the project; so if you wanted to write a tool to analyze C, or generate documentation, or generate tests you wouldn't need to write a whole parser, you could just use theirs. this interests me because that's what I'm interested in doing. the code seems very simple and hackable, unlike GCC which is (more mature, but) monstrous and unapproachable. i've never been a fan of C++, but i think i'll pick it up for the sake of playing with llvm.
set print object on
(gdb) p TD->getSemanticDC()->FirstDecl
$30 = (clang::RecordDecl *) 0x9452dc0
(gdb) p (clang::RecordDecl *) 0x9452dc0
A syntax error in expression, near `) 0x9452dc0'.
(gdb) p ('clang::RecordDecl' *) 0x9452dc0
$31 = (clang::RecordDecl *) 0x9452dc0
FIXME spots in the code itself (this stuff won't show up in the bug tracker but may be causing some of the bugs that are there):
grep -HIrn "FIXME\|TODO\|XXX\|WTF" *
svngrep() {
    # recursively search, skipping subversion metadata directories
    grep -HIrn --exclude-dir=.svn "$1" *
}
pizza@imac:~/proj/llvm/tools/clang$ \
utils/test/MultiTestRunner.py \
-v test/Sema/pragma-weak.c
asm is still bubblesort.
mov, test, jmp and call.
You can't optimize these. You have to figure out how to make do with fewer of them.
And then I get to thinking that operating systems basically require C.
And then I get to thinking that I'd like to be able to design and implement algorithms in a HLL yet produce C. Scratch that, it would be cooler if I could parse existing C, transform it into an intermediate functional language and do things like modify it and export it back to C, or search for potential bugs in it.
Something like a traditional static analyzer, but not exactly. So let's get to it...
type 'a btree = Leaf of 'a
| Tree of 'a btree * 'a btree;;
A slightly different bent would be a program containing a "standard library" of clips; a service would store/transfer "songs", which would just be the instructions in order to combine the shared library clips to recreate the author's song.
You could then take that idea and "tag" each of the library clips, or any combination of them, with emotive/audio descriptors like "aggressive", "harsh", "sad". Then songs could be generated/selected based on the listener's current mood.
Damn.
2009-05-08 I commute to work via train and bike. At least, I try to. The Metro North railroad recently changed their bicycle policy and it's making life harder for bike commuters. I wrote an article about it, have a read:
If you think Metro-North should actually encourage bike commuting please contact the MTA and say so.
"a" + "b"
var_dump("a" + "b");
int(0)
+ operator is a common string concatenator in other languages.
2009-05-03 Spent some serious, quality time doing some bicycle maintenance this weekend. It was fun. I'm proud to say that everything I removed eventually made its way back on (except for the crappy plastic fenders and enormous amounts of grit and grease stuck under my fingernails, of course).
I've been watching the excellent bicycle tutor videos and referring to my bicycle maintenance guide and am getting more confident about doing simple bike maintenance/cleaning. I'd love to be more self-sufficient; it's definitely in keeping with getting rid of my car in the first place. It's funny; like everything else, bicycles are both simple and complex: fundamentally they're just a frame, wheels, handlebar, seat and drivetrain, but the details are so much more than that... the aforementioned components are just the required parts; modern bikes have lots of extra parts as well, and for every single component there are at least 3 major variants, each containing a whole spectrum of options. And of course, there are interdependencies and incompatibilities, and each one needs different tools.
![[]](images/netbook-lappy.gif)
The machine itself has a nice form factor, is light but seems relatively sturdy. The keyboard is (very) cramped, but I managed; and the machine made an acceptable desktop when augmented with external screen and keyboard.
New stuff usually has bugs that haven't been ironed out. Here's a crotchety old icon (
) to remind yourself of that. Use it!
acpi=off noapic nosmp nolapic clock=pit
it looks like shit in ie6, but i don't care.
__volatile__ is the key!
lsb_release -a
uname -a
cat /etc/*release
$ printf 'GET / HTTP/1.1\r\nHost: parseerror.com\r\n\r\n' \
| nc 64.34.176.17 80 | head -n 20
That bypasses DNS for parseerror.com and goes straight to the host's IP address; I used these results to ensure that when DNS does finally propagate the website will appear identically on the new host.
echo "" | gcc -dM -E -
pizza@debian:~/proj/mutant$ time mzscheme -f gen.scm
g dist best
0 873 (+ 5 (- (* a a) (* 7 (+ 9 b))))
1 796 (+ a (* a a))
2 247 (+ (* 4 a) (* a a))
3 30 (+ (+ (+ (+ a 2) a) (+ b b)) (+ a (* a a)))
4 0 (+ 3 (+ (+ (+ (+ a 2) a) (+ b b)) (+ a (* a a))))
real 0m4.903s
user 0m4.099s
sys 0m0.794s
As the generations increase so does our "fitness" (the "distance" decreases) as we find solutions that converge toward the ideal solution. Of course in the real world there isn't really a perfect solution to the "problem" of one's environment (although sharks come to mind).
Because the candidates are generated and mutated randomly (and also because for any ideal solution there are an infinite number of equivalent answers) the results of the program differ from run to run. Often the distance converges to zero within the first few generations, but sometimes it gets stuck in an evolutionary dead end and fails to find a solution within the maximum 100 generations. I'm going to try to add some logic to backtrack if we detect a potential dead end.
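To make the "distance" idea concrete, here is a rough Python sketch of just the scoring step; it is not the Scheme program, and it leaves out mutation and selection entirely. The nested-tuple representation, the `evaluate` and `distance` names, the target function and the sample points are all assumptions chosen for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Evaluate a nested-tuple expression such as ("+", "a", ("*", "a", "a"))."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(expr, str):
        return env[expr]  # variable lookup
    return expr  # numeric constant


def distance(expr, target, samples):
    """Sum of absolute errors over the sample inputs; 0 means a perfect fit."""
    return sum(
        abs(evaluate(expr, {"a": a, "b": b}) - target(a, b)) for a, b in samples
    )


# Hypothetical target and sample points, chosen only for illustration.
def target(a, b):
    return a * a + 3 * a + 2 * b + 5


SAMPLES = [(a, b) for a in range(3) for b in range(3)]

close = ("+", ("*", 3, "a"), ("*", "a", "a"))  # a^2 + 3a
exact = ("+", ("+", 5, ("*", 2, "b")), close)  # a^2 + 3a + 2b + 5

print(distance(close, target, SAMPLES))  # 63
print(distance(exact, target, SAMPLES))  # 0
```

A candidate that is structurally different from the target can still score 0, which is the "infinite number of equivalent answers" point: the score only looks at what an expression computes, not how it is written.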
Oh, and I had forgotten how minimal, elegant and beautiful a language Scheme is. Because it is a homoiconic language (its code and data are interchangeable), generating executable expressions on the fly turned out to be pretty easy.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

/* random number weighted via gaussian distribution */
static double gauss(void)
{
    /* shift x into (0, 1] so that log(x) is always finite */
    double x = ((double)random() + 1.0) / ((double)RAND_MAX + 1.0),
           y = (double)random() / RAND_MAX,
           z = sqrt(-2 * log(x)) * cos(2 * M_PI * y);
    return z;
}

/* aggregate 100k cycles and display */
int main(void)
{
    static long g[11];
    long i = 0, bucket;
    srandom(time(NULL));
    while (i++ < 100000) {
        bucket = (long)floor(gauss() + 0.5) + 5;
        if (bucket < 0) bucket = 0;   /* clamp rare outliers into the end buckets */
        if (bucket > 10) bucket = 10;
        g[bucket]++;
    }
    for (i = 0; i < 11; i++)
        printf("%2ld: %ld\n", i, g[i]);
    return 0;
}
$ gcc -o rand-normal-distribution \
rnd-normal-dist.c -lm
$ ./rand-normal-distribution
0: 0
1: 20
2: 624
3: 6103
4: 24179
5: 38314
6: 24112
7: 6047
8: 577
9: 23
10: 1
or in scheme...
; random number weighted via gaussian distribution
(define (gauss)
  (* (sqrt (* -2 (log (random))))
     (cos (* 2 (acos -1) (random)))))
; aggregate 100k cycles and display
(let ((v (make-vector 11 0)))
  (define (loop index cnt)
    (let ((r (inexact->exact (round (+ 5 (gauss))))))
      (vector-set! v r (+ (vector-ref v r) 1))
      (if (< index cnt)
          (loop (+ index 1) cnt) v)))
  (loop 0 1e5))
$ cat random-normal-distribution.scm | mzscheme -i
> > #(0 27 631 6030 23923 38360 24232 6171 603 24 0)