lookit!

Hand-painted covers! You can buy this one here.


This one has hand-painted covers AND endpapers. You can buy it here.
( And this one was a commission: )






Listen, strange women lying in ponds, distributing swords is no basis for a system of government... You can't expect to wield supreme executive power just because a watery tart threw a sword at you!
-- Monty Python

There's a dawning sense that extremely large databases of information, starting at the petabyte level, could change how we learn things. The traditional way of doing science entails constructing a hypothesis to match observed data or to solicit new data. Here's a bunch of observations; what theory explains the data sufficiently so that we can predict the next observation?
It may turn out that tremendously large volumes of data are sufficient to skip the theory part in order to make a predicted observation. Google was one of the first to notice this. For instance, take Google's spell checker. When you misspell a word when googling, Google suggests the proper spelling. How does it know this? How does it predict the correctly spelled word? It is not because it has a theory of good spelling, or has mastered spelling rules. In fact Google knows nothing about spelling rules at all.
Instead Google operates a very large dataset of observations which show that for any given spelling of a word, x number of people say "yes" when asked if they meant to spell word "y." Google's spelling engine consists entirely of these datapoints, rather than any notion of what correct English spelling is. That is why the same system can correct spelling in any language.
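To make the idea concrete, here is a toy sketch of a suggester that holds nothing but logged "did you mean?" answers. It is not Google's actual system: the `SuggestionLog` class, its method names, the `min_yes` threshold and the sample data are all invented for illustration.

```python
from collections import defaultdict


class SuggestionLog:
    """Toy spelling suggester driven only by logged user responses (hypothetical)."""

    def __init__(self):
        # typed spelling -> {offered spelling -> number of "yes" answers}
        self._yes_counts = defaultdict(lambda: defaultdict(int))

    def record(self, typed, offered, said_yes):
        """Log one 'Did you mean ...?' interaction."""
        if said_yes:
            self._yes_counts[typed][offered] += 1

    def suggest(self, typed, min_yes=2):
        """Return the most-confirmed spelling, or None if the evidence is thin."""
        offered = self._yes_counts.get(typed)
        if not offered:
            return None
        best, yes = max(offered.items(), key=lambda item: item[1])
        return best if yes >= min_yes else None


# Made-up observations; no spelling rule or dictionary appears anywhere.
log = SuggestionLog()
for typed, offered, said_yes in [
    ("recieve", "receive", True),
    ("recieve", "receive", True),
    ("recieve", "relieve", True),
    ("recieve", "receive", False),
    ("gracais", "gracias", True),  # a Spanish example, handled the same way
    ("gracais", "gracias", True),
]:
    log.record(typed, offered, said_yes)

print(log.suggest("recieve"))
print(log.suggest("gracais"))
print(log.suggest("xyzzy"))
```

Nothing in the sketch knows which language it is looking at, which is why the same counting approach carries over from English to Spanish without changes. It also stays silent on spellings it has never seen answered, so it is only as good as the observations behind it.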
Datasets alone, no matter how massive, aren't enough to produce and use hypotheses. Sure, a "God's-eye view" can be very helpful and allow synoptic insights not possible any other way. But far more important than quantity of data is quality of data. Big datasets are often Just There. More often than not, there aren't ways of drilling down on a case and seeing where and how it was collected. Sometimes the case isn't informative at all. Sometimes the case is biased because of instrumental or measurement error. Sometimes the case is informative, but knowing something about the measurement could make it more so.
The idea that you can just throw away bad cases because you have so many good ones presumes the ones that are "bad" are known. In experimental work, much new stuff is found by eliminating or ruling out or correcting out biases and effects one layer at a time, and looking at the residuals. It's not like getting a big set of points and plotting them, or doing a regression on them. It's more like having a conversation with them.
A good example of this can be found in the 13 June 2008 issue of Science: D. Purves and S. Pacala, "Predictive Models of Forest Dynamics". There are many other examples, from the LHC's detectors to geophysical prospecting and climate work.
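The "one layer at a time" approach can be shown with a small synthetic sketch. Everything in it is made up for illustration: the two instruments "A" and "B", the true relation y = 3x + 1, and the assumption that B reads 2 units high. A blind regression over all the points absorbs the bias; grouping the residuals by instrument exposes it, which is only possible because each point records how it was collected.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys, slope, intercept):
    """What is left of each observation after removing the fitted line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def mean(values):
    return sum(values) / len(values)


# Synthetic data (assumption): true relation y = 3x + 1, instrument B reads 2 high.
B_OFFSET = 2.0
xs = list(range(10))
instruments = ["A" if x % 2 == 0 else "B" for x in xs]
ys = [3.0 * x + 1.0 + (B_OFFSET if inst == "B" else 0.0)
      for x, inst in zip(xs, instruments)]

# Layer 1: fit everything blindly, then look at the residuals per instrument.
slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
mean_res = {
    inst: mean([r for r, i in zip(res, instruments) if i == inst])
    for inst in ("A", "B")
}

# Layer 2: correct the now-identified instrument bias and fit again.
corrected = [y - B_OFFSET if inst == "B" else y
             for y, inst in zip(ys, instruments)]
slope2, intercept2 = fit_line(xs, corrected)
res2 = residuals(xs, corrected, slope2, intercept2)

print("blind fit:", slope, intercept, mean_res)
print("after correction:", slope2, intercept2)
```

Without the `instruments` column, the first fit would simply look a bit noisy and there would be no way to tell which cases were the "bad" ones. Here the correction uses the known offset; in real work that offset would itself have to be estimated from calibration or from the residuals.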







First, America’s political leaders must recognise that, unless recent improvements in the country’s trade balance can be sustained and accelerated, and domestic savings rise sharply, the US will remain heavily dependent on foreign capital in the form of purchases of government or corporate bonds, stocks and direct investment. US policymakers will need to find ways to increase domestic savings, shrink the federal deficit, reduce the heavy reliance of American consumers on credit and curb oil imports. Without these measures, massive amounts of foreign capital will be needed for years to come. This is not a matter of politics, but of arithmetic. In such circumstances, the US must remain attractive to foreign capital or suffer adverse consequences. Sound finances, enabling the US to borrow on reasonable terms, will remain an important factor in national strength. Frequent financial crises, large trade imbalances, a series of outsized budget deficits and failure to put Social Security and Medicare on a more sound financial footing could undermine investor confidence. That would discourage overseas investments and the willingness of central banks to hold dollar reserves, causing a plunge in US financial markets and the dollar, thereby jeopardising America’s growth.
Second, with foreign competition intensifying, robust growth taking place in many parts of the world and large numbers of US jobs and corporate profits dependent on expanding exports, the US needs to boost its own competitiveness and further open foreign markets for its goods and services.
A robust response requires vastly improved training and education, especially in mathematics, engineering, physics and science. More than ever this must benefit minorities and immigrants, the fastest-growing portion of the US workforce. Critical also is acceleration of government and private-sector investment in research and development to create competitive new jobs, products and industries. This is especially true in energy, where a variety of new sources and new technologies are urgently needed: such measures can produce increased employment opportunities, reduce oil dependence and the attendant massive outflows of funds, and sharply curb greenhouse gas emissions. The financial system and tax code must encourage greater domestic savings and investment, and channel funds to the most productive sectors.