Thursday, April 24, 2008

On Google Books

Although I am a big fan of any initiative that makes books freely available (as witnessed by my vain attempt at tracking them), there are a few characteristics of Google Books that make no sense to me. Deeper issues aside (such as its not-as-open nature when compared to initiatives like the Internet Archive, and the restrictions imposed on downloading books from non-US IP addresses), I wonder why their scanning is often so sloppy. Oversized maps, for instance, are poorly scanned (or not scanned at all). I understand that their focus may be on text searchability, but if libraries (and readers) are to benefit from book digitization initiatives such as this, one would expect the integrity of the book's content (maps included) to be respected and maintained.

I understand that, as they state on the cover page of their scanned books, this is an expensive and time-consuming job. That's why I find it hard to understand what seems to be another, apparently costlier, inefficiency: duplicate copies of one and the same item. I am not referring to different editions of the same title, but to the exact same book being scanned more than once. For instance, there are at least two copies of the Arte, vocabulario y confesionario de la lengua de Chile, published by Platzmann in 1887 (one from Harvard University, the other from Stanford); there are also at least two copies of Karl von den Steinen's Unter den Naturvölkern Zentral-Brasiliens (here and here). Being the "search giant" it is, one would expect Google to be able to avoid such unnecessary repetition, in order to better accomplish its "mission [...] to organize the world's information and to make it universally accessible and useful." Maybe a more library-like cataloguing system would help (it would certainly help readers), with a way of uniquely identifying titles (OCLC numbers, etc.).
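
Just to make the suggestion concrete (this is a toy sketch, not a claim about how Google's catalogue actually works): if every scanned copy carried a stable identifier such as its OCLC number, flagging redundant scans would be a matter of keeping one record per number. The records and the OCLC number below are made up for illustration.

    # Purely illustrative: deduplicate scanned-book records by a
    # shared identifier (here, a hypothetical OCLC number).

    def dedupe_by_oclc(records):
        """Keep only the first scanned copy for each OCLC number."""
        seen = set()
        unique = []
        for record in records:
            oclc = record["oclc"]
            if oclc not in seen:
                seen.add(oclc)
                unique.append(record)
        return unique

    scans = [
        {"oclc": "12345678", "library": "Harvard University"},
        {"oclc": "12345678", "library": "Stanford"},  # same book, scanned twice
    ]

    print(dedupe_by_oclc(scans))  # only the Harvard copy remains

Of course, the hard part is assigning the identifiers consistently in the first place, which is precisely what library cataloguing already does.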