REFLECTIONS ON "GOOGLE BOOKS"
I have been busy working on a freelance writing assignment, and in this
effort, I have tapped into the vast resources of "Google Books."
Google announced maybe a year ago that it would try to scan all the
world's books into digital format. In Mindanao, the southernmost part of
the Philippines, there are few libraries that have the kind of books
that I need in my work. It occurred to me that I might find some of
what I was looking for online.
A book that is typed into an "html" text format cannot be trusted to
reflect the original, since it may contain errors by the typist. But
original images of books in pdf format are as good as having the book
itself in one's hands. There are now a vast number of works accessible
free of charge through Google Books (http://books.google.com). From
this vast reservoir, I have managed to gain access to at least some of
the books I need in my work. This is why I am able to stay in the
Philippines at this time. I do all my work on the computer. A visit to
a library is not really necessary.
There are some significant problems with Google Books that I have
noticed while working with it over the past month. While it is far
better than having nothing, it cannot be said to have all the
features of a library system. So many works are still protected by
copyright, and Google cannot scan these in without permission, and
possibly, some payment, an economic investment which they may not be
willing to make. These are primarily works written after 1924, but,
perhaps on the side of caution, Google has not done much scanning of
complete works published after the 1860s--at least there is a major
drop off after that point.
Furthermore, the scanning is often done in a haphazard way, so that
there are missing pages, or sometimes one sees the hand of the
individual doing the scanning. You can almost sense the corporate and
detached manner in which this task was undertaken. A meeting is held
deciding on the project. A huge number of temps are hired to do the
heavy lifting--in this case, the actual scanning of the books. The
object is to get as many books scanned in as quickly as possible.
Little attention is paid to quality control. Thus, pages of many works
are illegibly scanned, or pages are missing. In a book, where there is
continuity of thought throughout, a missing page means that the entire
sense of the work is affected.
For example, I'm reading "The System of the World" by the 18th-19th
Century physicist Pierre-Simon Laplace in English translation. It's
wonderful that this work is now so easily accessible. No need to go to
a library, just download the entire work. This is light years ahead of
what existed before, and is a revolutionary use of the internet. But we
turn to page 303, and find that half the page is made up of illegibly
smeared letters, probably because the book was improperly laid on the
scanner. This particular page has Laplace's reasoning for his
formulation of the second law of motion, a particularly important topic
upon which the rest of the work depends. Fortunately, in this
particular case, pages 303 and 304 were scanned in a second time,
although it took me a while to figure that out. There are examples of
other books I have looked at where pages are left out. I was lucky
enough to find another digitized edition of the book in which the
missing page or pages were present. It takes a lot of extra time to
find your missing page by looking at other editions if they exist.
While one would like to praise Google for what has been accomplished,
the company should have paid attention to the repercussions of a job
poorly done. Not only are readers irked by the missing or blurred
pages, but valuable work that has been invested in the task of scanning
is harmed. What are the chances that some of these works will ever be
scanned in to the system again, once it is found that they have been
improperly scanned in the first place? Furthermore, as some of the
books must be rather old, a second scanning subjects them to further
wear and tear.
Now we come to the various categories of accessibility to the scanned
books that Google offers. Some are not available for viewing at
all, so one only gains the knowledge that one's search term has made a
hit in the text of a particular book. No clue is given as to its
context.
The citation of the work may be confusing, and also often appears as if
it was listed haphazardly. The feel is that the emphasis was on
quantity, not quality. Some of these citations may be computer
generated.
I have not detected a way to get easily back to a particular work I
have looked at. Even when I write down the particulars, it is hard to
find the work again.
For this task I had been using Yahoo bookmarks, which lets one record
bookmarks independent of the PC being used, so that if you go to
another machine, you can still recover the bookmarks. This is useful,
except for the fact that some computers just won't let you save your
Yahoo bookmark. Google probably has its own bookmarking facility. In
case you don't know, bookmarking is a way of saving the location of a
particular web page for future viewing.
Back to the various degrees of access. No preview means you only know
that your search sequence is included somewhere in the work in question.
A "snippet" view will allow you to see a window with a couple of lines
that, if you are lucky, contain your search terms, usually highlighted
in yellow.
What I found interesting about "no preview" and "snippet" is that some
of the works so classified on Google Books can actually be found fully
digitized and available to the public on other sites. Two such sites
of fully scanned books and magazines are sponsored by Cornell
University and Michigan State University. Why Google does not wish to
make this information public, I do not know.
"Limited" access is often reserved for recently published works or
reprints of old works. It appears that publishing companies have found
it good policy to allow access to a smattering of the pages of the
works they are producing. They must feel that it is good advertising.
I have used the limited view quite a bit in my work. However, the
limited view, which generally consists of several consecutive
accessible pages followed by a swath of inaccessible pages, often
cuts off in the middle of the material I need.
The limited view, as useful as it has been to me, I find equally
frustrating. I can only glean a certain amount of information on a
particular topic. To supplement this, I try to find another "limited
view" work with the rest of the information. This takes a lot of time.
One good feature of "limited view" is that one begins to see the wealth
of literature that has been written in more recent times on many an
interesting topic. In that sense it is, once again, a wonderful
achievement.
Still, the "limited view" is another signpost that tells us the Google
project is a way to commercialize publishing to the extent that you can
really feel the bait and switch of these companies.
"Full view" is what everyone dreams of: a complete library of the
world's books online. Full view would be nothing short of fantastic,
were it not for the many copying errors that are incorporated in the
product. The poor quality control makes what should be a wonderment of
the 21st
century into a poor reflection of our present era, its goals and
direction, as contrasted with the excellent quality control of previous
generations of publishers dating back many thousands of years. In that
sense, Google Books is yet a blemish on the history of publishing.
In sum, Google Books is fantastic, yet falls short of what it really
should be as a product of a civilized society. We would hope for less
emphasis on the selfishness of economic return, more emphasis on
quality control. One would think that quality sells. But perhaps that
is not how Google Books sees it. Let's hope that the company, in
undertaking a project as profoundly influential on our society as it
is, will understand the public nature of its role, and step up to the
task.
Posted July 1, 2007