REFLECTIONS ON "GOOGLE BOOKS"


I have been busy working on a freelance writing assignment, and in this effort, I have tapped into the vast resources of "Google Books."

Google announced maybe a year ago that it would try to scan all the world's books into digital format. In Mindanao, the southernmost part of the Philippines, there are few libraries that have the kind of books that I need in my work. It occurred to me that I might find some of what I was looking for online.

A book that is typed into an "html" text format cannot be trusted to reflect the original, since it may contain errors by the typist. But original images of books in pdf format are as good as having the book itself in one's hands. There are now a vast number of works accessible free of charge through Google Books (http://books.google.com). From this vast reservoir, I have managed to gain access to at least some of the books I need in my work. This is why I am able to stay in the Philippines at this time. I do all my work on the computer. A visit to a library is not really necessary.

There are some significant problems with Google Books that I have noticed while working with it over the past month. While it is certainly better than having nothing, it cannot be said to have all the features of a library system. Many works are still protected by copyright, and Google cannot scan these in without permission and, possibly, some payment, an economic investment which they may not be willing to make. These are primarily works written after 1924, but, perhaps on the side of caution, Google has not done much scanning of complete works published after the 1860s--at least there is a major drop off after that point.

Furthermore, the scanning is often done in a haphazard way, so that there are missing pages, or sometimes one sees the hand of the individual doing the scanning. You can almost sense the corporate and detached manner in which this task was undertaken. A meeting is held deciding on the project. A huge number of temps are hired to do the heavy lifting--in this case, the actual scanning of the books. The object is to get as many books scanned in as quickly as possible. Little attention is paid to quality control. Thus, pages of many works are illegibly scanned, or pages are missing. In a book, where there is continuity of thought throughout, a missing page means that the entire sense of the work is affected.

For example, I'm reading "The System of the World" by the 18th-19th Century physicist Pierre-Simon Laplace in English translation. It's wonderful that this work is now so easily accessible. No need to go to a library, just download the entire work. This is light years ahead of what existed before, and is a revolutionary use of the internet. But we turn to page 303, and find that half the page is made up of illegibly smeared letters, probably because the book was improperly laid on the scanner. This particular page has Laplace's reasoning for his formulation of the second law of motion, a particularly important topic upon which the rest of the work depends. Fortunately, in this particular case, pages 303 and 304 were scanned in a second time, although it took me a while to figure that out. In other books I have looked at, pages are left out altogether. In some of those cases I was lucky enough to find another digitized edition of the book in which the missing page or pages were present. It takes a lot of extra time to find your missing page by looking at other editions, if they exist.

While one would like to praise Google for what has been accomplished, the company should have paid attention to the repercussions of a job poorly done. Not only are readers irked by the missing or blurred pages, but the valuable work that has been invested in the task of scanning is wasted. What are the chances that some of these works will ever be scanned in to the system again, once it is found that they have been improperly scanned in the first place? Furthermore, as some of the books must be rather old, a second scanning subjects them to further wear and tear.

Now we come to the various categories of accessibility to the scanned books that Google offers. Some are not available for viewing at all, so one only gains the knowledge that one's search term has made a hit in the text of a particular book. No clue is given as to its context.

The citation of the work may be confusing, and also often appears as if it was listed haphazardly. The feel is that the emphasis was on quantity, not quality. Some of these citations may be computer generated.

I have not detected a way to get easily back to a particular work I have looked at. Even when I write down the particulars, it is hard to find the work again.

For this task I had been using Yahoo bookmarks, which lets one record bookmarks independent of the PC being used, so that if you go to another machine, you can still recover the bookmarks. This is useful, except for the fact that some computers just won't let you save your Yahoo bookmark. Google probably has its own bookmarking facility. In case you don't know, bookmarking is a way of saving the location of a particular web page for future viewing.

Back to the various degrees of access. No preview means you only know that your search sequence is included somewhere in the work in question.

A "snippet" view will allow you to see a window with a couple of lines, that, if you are lucky, contains your search terms, usually highlighted in yellow.

What I found interesting about "no preview" and "snippet" is that some of the works so classified on Google Books can actually be found fully digitized and available to the public on other sites. Two such sites of fully scanned books and magazines are sponsored by Cornell University and Michigan State University. Why Google does not wish to make this information public, I do not know.

"Limited" access is often reserved for recently published works or reprints of old works. It appears that publishing companies have found it good policy to allow access to a smattering of the pages of the works they are producing. They must feel that it is good advertising.

I have used the limited view quite a bit in my work. However, the limited view, which generally consists of several consecutive accessible pages followed by a swath of inaccessible pages, often cuts off in the middle of the material I need.

The limited view, as useful as it has been to me, I find equally frustrating. I can only glean a certain amount of information on a particular topic. To supplement this, I try to find another "limited view" work with the rest of the information. This takes a lot of time.

One good feature of "limited view" is that one begins to see the wealth of literature that has been written in more recent times on many an interesting topic. In that sense it is, once again, a wonderful achievement.

Still, the "limited view" is another signpost that tells us the Google project is a way to commercialize publishing to the extent that you can really feel the bait and switch of these companies.

"Full view" is what everyone dreams of: a complete library of the world's books online. Full view would be nothing short of fantastic, were it not for the many copying errors that are incorporated in the product. The poor quality control makes what should be a wonderment of the 21st century into a poor reflection of our present era, its goals and direction, as contrasted with the excellent quality control of previous generations of publishers dating back many thousands of years. In that sense, Google Books is yet a blemish on the history of publishing.

In sum, Google Books is fantastic, yet falls short of what it really should be as a product of a civilized society. We would hope for less emphasis on the selfishness of economic return, more emphasis on quality control. One would think that quality sells. But perhaps that is not how Google Books sees it. Let's hope that the company, in undertaking a project as profoundly influential on our society as it is, will understand the public nature of its role, and step up to the task.

Posted July 1, 2007
